Use secondary boot disks to preload data or container images

Autopilot Standard

This page shows you how to improve workload startup latency by using secondary boot disks in Google Kubernetes Engine (GKE) to preload data or container images on new nodes. This enables workloads to achieve a fast cold start and to improve the overall utilization of provisioned resources.

This page assumes that you're familiar with Google Cloud, Kubernetes, containers, YAML, the containerd runtime, and the Google Cloud CLI.

Overview

Starting in GKE version 1.28.3-gke.1067000 in Standard clusters and in GKE version 1.30.1-gke.1329000 in Autopilot clusters, you can configure the node pool with secondary boot disks. You can tell GKE to provision the nodes and preload them with data, such as a machine learning model, or a container image. Using preloaded container images or data in a secondary disk has the following benefits for your workloads:

Reduced latency when pulling large container images, or downloading data
Faster autoscaling
Quicker recovery from disruptions like maintenance events and system errors

The following sections describe how to configure the secondary boot disk in GKE Autopilot and Standard clusters.

How secondary boot disks work

Your workload can start more quickly by using the preloaded container image or data on secondary boot disks. Secondary boot disks have the following characteristics:

Secondary boot disks are Persistent Disks that are backed by distributed block storage. If the disk image is already in use in the zone, subsequent disks created from the same disk image take less time to create.
The secondary boot disk type is the same as the node boot disk type.
The size of the secondary boot disk is determined by the disk image size.

Adding secondary boot disks to your node pools does not increase the node provisioning time. GKE provisions secondary boot disks from a disk image in parallel with the node provisioning process.

To support preloaded container images, GKE extends the containerd runtime with plugins that read the container images from secondary boot disks. Container images are reused at the layer level, so we recommend that you preload large base layers into the secondary boot disk and let the small upper layers be pulled from the container registry.

Before you begin

Before you start, make sure you have performed the following tasks:

Enable the Google Kubernetes Engine API.

If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
Note: For existing gcloud CLI installations, make sure to set the compute/region and compute/zone properties. By setting default locations, you can avoid errors in gcloud CLI like the following: One of [--zone, --region] must be supplied: Please specify location.

Enable the Container File System API.

Requirements

The following requirements apply to using secondary boot disks:

Your clusters are running GKE version 1.28.3-gke.1067000 or later in GKE Standard, or version 1.30.1-gke.1329000 or later in GKE Autopilot.
When you modify the disk image, create a new node pool. Updating the disk image on existing nodes is not supported.
Configure Image streaming to use the secondary boot disk feature.
Use the Container-Optimized OS with a containerd node image. Autopilot nodes use this node image by default.
Prepare the disk image with data ready during build time or with preloaded container images. Ensure that your cluster has access to the disk image to load in the nodes. We recommend automating the disk image in a CI/CD pipeline.
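The version requirement above can be checked from a script before you create node pools. The following is a minimal sketch, not an official tool: version_at_least is a hypothetical helper name, and it assumes a sort command that supports -V (version sort), as GNU coreutils does.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when version $1 is greater than or equal to $2.
# Assumes `sort -V` (version sort, as in GNU coreutils) is available.
version_at_least() {
  local current="$1" minimum="$2" lowest
  # The lower of the two versions sorts first; if that is the minimum,
  # the current version meets the requirement.
  lowest="$(printf '%s\n%s\n' "$current" "$minimum" | sort -V | head -n 1)"
  [ "$lowest" = "$minimum" ]
}
```

For example, you might read the control plane version with gcloud container clusters describe CLUSTER_NAME --location=LOCATION --format='value(currentMasterVersion)' and pass it as the first argument, with 1.28.3-gke.1067000 (Standard) or 1.30.1-gke.1329000 (Autopilot) as the second.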

Prepare the secondary boot disk

To prepare the secondary boot disk, choose the Images tab for preloading container images or choose the Data tab for preloading data, then complete the following instructions:

Images

GKE provides a tool called gke-disk-image-builder that creates a virtual machine (VM), pulls the container images onto a disk, and then creates a disk image from that disk.

To create a disk image with multiple preloaded container images, complete the following steps:

Create a Cloud Storage bucket to store the execution logs of gke-disk-image-builder.
Create a disk image with gke-disk-image-builder.

go run ./cli \
    --project-name=PROJECT_ID \
    --image-name=DISK_IMAGE_NAME \
    --zone=LOCATION \
    --gcs-path=gs://LOG_BUCKET_NAME \
    --disk-size-gb=10 \
    --container-image=docker.io/library/python:latest \
    --container-image=docker.io/library/nginx:latest

Replace the following:

PROJECT_ID: the name of your Google Cloud project.
DISK_IMAGE_NAME: the name of the image of the disk. For example, nginx-python-image.
LOCATION: the cluster location.
LOG_BUCKET_NAME: the name of the Cloud Storage bucket to store the execution logs. For example, gke-secondary-disk-image-logs/.

When you create a disk image with gke-disk-image-builder, Google Cloud creates multiple resources to complete the process (for example, a VM instance, a temporary disk, and a persistent disk). After its execution, the image builder cleans up all the resources except the disk image that you created.
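When the list of images to preload is long, the repeated --container-image flags can be generated instead of typed by hand. The following sketch assumes Bash; container_image_flags is a hypothetical helper name and is not part of gke-disk-image-builder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one --container-image flag per image reference.
container_image_flags() {
  local image
  for image in "$@"; do
    # `--` stops printf from treating the format string as an option.
    printf -- '--container-image=%s\n' "$image"
  done
}
```

You could then append $(container_image_flags docker.io/library/python:latest docker.io/library/nginx:latest) to the go run ./cli command in place of the individual flags. The unquoted command substitution splits the output into one argument per flag, which is safe here because image references contain no spaces.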

Data

Create a custom disk image as the data source by completing the following steps:

Configure the secondary boot disk

You can configure the secondary boot disk in a GKE Autopilot or Standard cluster. We recommend that you use an Autopilot cluster for a fully managed Kubernetes experience. To choose the GKE mode of operation that's the best fit for your workloads, see Choose a GKE mode of operation.

Use GKE Autopilot

In this section, you create a disk image allowlist to allow the disk image in an existing GKE Autopilot cluster. Then, you modify the Pod node selector to use a secondary boot disk.

Allow the disk images in your project

In this section, you create a GCPResourceAllowlist to allow GKE to create nodes with secondary boot disks from the disk images in your Google Cloud project.

Save the following manifest as allowlist-disk.yaml:

apiVersion: "node.gke.io/v1"
kind: GCPResourceAllowlist
metadata:
  name: gke-secondary-boot-disk-allowlist
spec:
  allowedResourcePatterns:
  - "projects/PROJECT_ID/global/images/.*"

Replace PROJECT_ID with the ID of the project that hosts the disk image.

Apply the manifest:
```
kubectl apply -f allowlist-disk.yaml
```
GKE can now create nodes with secondary boot disks from any disk image in the project.

Update the Pod node selector to use a secondary boot disk

In this section, you modify the Pod spec so that GKE creates the nodes with the secondary boot disk.

Add a nodeSelector to your Pod template:
```
nodeSelector:
    cloud.google.com.node-restriction.kubernetes.io/gke-secondary-boot-disk-DISK_IMAGE_NAME: CONTAINER_IMAGE_CACHE.PROJECT_ID
```
Replace the following:
- DISK_IMAGE_NAME: the name of your disk image.
- PROJECT_ID: the ID of the project that hosts the disk image.
Use the kubectl apply command to apply the Kubernetes specification with the Pod template.
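Because a nodeSelector only matches nodes that carry the same label, one way to see which nodes were created for a given disk image is to list nodes by that label. This is a sketch under that assumption; secondary_boot_disk_label is a hypothetical helper that only formats the label from the nodeSelector above as a kubectl label selector, and my-project is a placeholder project ID.

```shell
#!/usr/bin/env bash
# Hypothetical helper: format the secondary boot disk node label as a
# kubectl label selector (key=value) for a disk image name and project ID.
secondary_boot_disk_label() {
  local disk_image_name="$1" project_id="$2"
  printf '%s/gke-secondary-boot-disk-%s=CONTAINER_IMAGE_CACHE.%s\n' \
    'cloud.google.com.node-restriction.kubernetes.io' \
    "$disk_image_name" "$project_id"
}
```

For example: kubectl get nodes -l "$(secondary_boot_disk_label nginx-python-image my-project)".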

Confirm that the secondary boot disk cache is in use:

kubectl get events --all-namespaces

The output is similar to the following:

75s         Normal      SecondaryDiskCachin   node/gke-pd-cache-demo-default-pool-75e78709-zjfm   Image gcr.io/k8s-staging-jobsejt/pytorch-mnist:latest is backed by secondary disk cache
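On a busy cluster, the event list can be long. The following sketch filters it down to the cache messages; secondary_disk_cache_events is a hypothetical helper name, and the match string is taken from the sample output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only events whose message reports that an image
# is backed by the secondary disk cache. Reads event lines from stdin.
secondary_disk_cache_events() {
  # -F matches the text literally instead of as a regular expression.
  grep -F 'is backed by secondary disk cache'
}
```

For example: kubectl get events --all-namespaces | secondary_disk_cache_events. If nothing matches, the function prints nothing and returns a non-zero exit status.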

Check the image pull latency:

kubectl describe pod POD_NAME

Replace POD_NAME with the name of the Pod.

The output is similar to the following:

…
  Normal  Pulled     15m   kubelet            Successfully pulled image "docker.io/library/nginx:latest" in 0.879149587s
…

The expected image pull latency for the cached container image should be significantly reduced, regardless of image size.
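To compare pull times across Pods, you can extract just the image name and duration from the Pulled events. This is a minimal sketch that assumes the event message format shown in the sample output above; pull_durations is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl describe pod` output on stdin and print
# "<image> <duration>" for every "Successfully pulled image" event.
pull_durations() {
  sed -n 's/.*Successfully pulled image "\([^"]*\)" in \([^ ]*\).*/\1 \2/p'
}
```

For example: kubectl describe pod POD_NAME | pull_durations.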

Use GKE Standard

To create a GKE Standard cluster and a node pool, complete the following instructions, choosing the Images or Data tab based on whether you want to preload container images or preload data on the secondary boot disk:

Images

You can configure a secondary boot disk by using the Google Cloud CLI or Terraform:

gcloud

Create a GKE Standard cluster with image streaming enabled:
```
gcloud container clusters create CLUSTER_NAME \
    --location=LOCATION \
    --cluster-version=VERSION \
    --enable-image-streaming
```
Replace the following:
- CLUSTER_NAME: the name of your cluster.
- LOCATION: the cluster location.
- VERSION: the GKE version to use. The GKE version must be 1.28.3-gke.1067000 or later.
Create a node pool with a secondary boot disk in the same project:
```
gcloud container node-pools create NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --location=LOCATION \
    --enable-image-streaming \
    --secondary-boot-disk=disk-image=global/images/DISK_IMAGE_NAME,mode=CONTAINER_IMAGE_CACHE
```
Replace the following:
- NODE_POOL_NAME: the name of the node pool.
- CLUSTER_NAME: the name of the existing cluster.
- LOCATION: the compute zone of the cluster, or multiple zones separated by commas.
- DISK_IMAGE_NAME: the name of your disk image.
To create a node pool with a secondary boot disk from the disk image in a different project, complete the steps in Use a secondary boot disk in a different project.

Add a nodeSelector to your Pod template:

nodeSelector:
    cloud.google.com/gke-nodepool: NODE_POOL_NAME

Confirm that the secondary boot disk cache is in use:

kubectl get events --all-namespaces

The output is similar to the following:

75s       Normal      SecondaryDiskCachin   node/gke-pd-cache-demo-default-pool-75e78709-zjfm   Image gcr.io/k8s-staging-jobsejt/pytorch-mnist:latest is backed by secondary disk cache

Check the image pull latency by running the following command:

kubectl describe pod POD_NAME

Replace POD_NAME with the name of the Pod.

The output is similar to the following:

…
  Normal  Pulled     15m   kubelet            Successfully pulled image "docker.io/library/nginx:latest" in 0.879149587s
…

The expected image pull latency for the cached container image should be no more than a few seconds, regardless of image size.

Terraform

To create a cluster with the default node pool using Terraform, refer to the following example:

resource "google_container_cluster" "default" {
  name               = "default"
  location           = "us-central1-a"
  initial_node_count = 1
  # Set `min_master_version` because secondary_boot_disks require GKE 1.28.3-gke.1067000 or later.
  min_master_version = "1.28"
  # Setting `deletion_protection` to `true` would prevent
  # accidental deletion of this instance using Terraform.
  deletion_protection = false
}

Add a nodeSelector to your Pod template:

nodeSelector:
    cloud.google.com/gke-nodepool: NODE_POOL_NAME

Confirm that the secondary boot disk cache is in use:

kubectl get events --all-namespaces

The output is similar to the following:

75s       Normal      SecondaryDiskCachin   node/gke-pd-cache-demo-default-pool-75e78709-zjfm   Image gcr.io/k8s-staging-jobsejt/pytorch-mnist:latest is backed by secondary disk cache

Check the image pull latency by running the following command:

kubectl describe pod POD_NAME

Replace POD_NAME with the name of the Pod.

The output is similar to the following:

…
  Normal  Pulled     15m   kubelet            Successfully pulled image "docker.io/library/nginx:latest" in 0.879149587s
…

The expected image pull latency for the cached container image should be no more than a few seconds, regardless of image size.

To learn more about using Terraform, see Terraform support for GKE.

Data

You can configure a secondary boot disk and preload data by using the Google Cloud CLI or Terraform:

gcloud

Create a GKE Standard cluster with image streaming enabled:
```
gcloud container clusters create CLUSTER_NAME \
    --location=LOCATION \
    --cluster-version=VERSION \
    --enable-image-streaming
```
Replace the following:
- CLUSTER_NAME: the name of your cluster.
- LOCATION: the cluster location.
- VERSION: the GKE version to use. The GKE version must be 1.28.3-gke.1067000 or later.
Create a node pool with a secondary boot disk by using the --secondary-boot-disk flag:
```
gcloud container node-pools create NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --location=LOCATION \
    --enable-image-streaming \
    --secondary-boot-disk=disk-image=global/images/DISK_IMAGE_NAME
```
Replace the following:
- NODE_POOL_NAME: the name of the node pool.
- CLUSTER_NAME: the name of the existing cluster.
- LOCATION: the compute zone of the cluster, or multiple zones separated by commas.
- DISK_IMAGE_NAME: the name of your disk image.
To create a node pool with a secondary boot disk from the disk image in a different project, complete the steps in Use a secondary boot disk in a different project.

GKE creates a node pool where each node has a secondary disk with preloaded data. GKE attaches and mounts the secondary boot disk on the node.
Optionally, you can mount the secondary disk image in the Pod containers by using a hostPath volume mount. Use the following manifest to define a Pod resource that uses a hostPath volume to mount the preloaded data disk in its containers:
```
apiVersion: v1
kind: Pod
metadata:
  name: pod-name
spec:
  containers:
  ...
    volumeMounts:
    - mountPath: /usr/local/data_path_sbd
      # Volume names must be lowercase DNS labels, so use hyphens, not underscores.
      name: data-path-sbd
  ...
  volumes:
  - name: data-path-sbd
    hostPath:
      path: /mnt/disks/gke-secondary-disks/gke-DISK_IMAGE_NAME-disk
```
Replace DISK_IMAGE_NAME with the name of your disk image.
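For a complete manifest to experiment with, the following sketch renders a minimal Pod that mounts the preloaded data read-only and lists it. It is an illustration, not a recommended workload: render_data_pod is a hypothetical helper, and the Pod name, container name, and busybox image are placeholder assumptions. The volume name uses hyphens because Kubernetes volume names must be lowercase DNS labels, and the nodeSelector pins the Pod to the node pool that has the secondary boot disk.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a minimal Pod manifest that mounts the preloaded
# data disk. Arguments: node pool name, disk image name.
# The Pod name, container name, and busybox image are placeholders.
render_data_pod() {
  local node_pool_name="$1" disk_image_name="$2"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: sbd-data-reader
spec:
  nodeSelector:
    cloud.google.com/gke-nodepool: ${node_pool_name}
  containers:
  - name: reader
    image: busybox:stable
    command: ["sh", "-c", "ls /usr/local/data_path_sbd && sleep 3600"]
    volumeMounts:
    - name: data-path-sbd
      mountPath: /usr/local/data_path_sbd
      readOnly: true
  volumes:
  - name: data-path-sbd
    hostPath:
      path: /mnt/disks/gke-secondary-disks/gke-${disk_image_name}-disk
EOF
}
```

For example: render_data_pod NODE_POOL_NAME DISK_IMAGE_NAME | kubectl apply -f -, followed by kubectl logs sbd-data-reader to see the directory listing.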

Terraform

To create a cluster with the default node pool using Terraform, refer to the following example:

resource "google_container_cluster" "default" {
  name               = "default"
  location           = "us-central1-a"
  initial_node_count = 1
  # Set `min_master_version` because secondary_boot_disks require GKE 1.28.3-gke.1067000 or later.
  min_master_version = "1.28"
  # Setting `deletion_protection` to `true` would prevent
  # accidental deletion of this instance using Terraform.
  deletion_protection = false
}

To learn more about using Terraform, see Terraform support for GKE.

Optionally, you can mount the secondary disk image in the Pod containers by using a hostPath volume mount. Use the following manifest to define a Pod resource that uses a hostPath volume to mount the preloaded data disk in its containers:
```
apiVersion: v1
kind: Pod
metadata:
  name: pod-name
spec:
  containers:
  ...
    volumeMounts:
    - mountPath: /usr/local/data_path_sbd
      # Volume names must be lowercase DNS labels, so use hyphens, not underscores.
      name: data-path-sbd
  ...
  volumes:
  - name: data-path-sbd
    hostPath:
      path: /mnt/disks/gke-secondary-disks/gke-DISK_IMAGE_NAME-disk
```
Replace DISK_IMAGE_NAME with the name of your disk image.

Cluster autoscaling with secondary boot disks

You can create a node pool that has a secondary boot disk and cluster autoscaling enabled by using the Google Cloud CLI:

  gcloud container node-pools create NODE_POOL_NAME \
      --cluster=CLUSTER_NAME \
      --location LOCATION \
      --enable-image-streaming \
      --secondary-boot-disk=disk-image=global/images/DISK_IMAGE_NAME,mode=CONTAINER_IMAGE_CACHE \
      --enable-autoscaling \
      --num-nodes NUM_NODES \
      --min-nodes MIN_NODES \
      --max-nodes MAX_NODES

Replace the following:

NODE_POOL_NAME: the name of the node pool.
CLUSTER_NAME: the name of the existing cluster.
LOCATION: the compute zone of the cluster, or multiple zones separated by commas.
DISK_IMAGE_NAME: the name of your disk image.
NUM_NODES: the number of nodes in the node pool in each of the cluster's zones.
MIN_NODES: the minimum number of nodes to automatically scale for the specified node pool per zone. To specify the minimum number of nodes for the entire node pool in GKE versions 1.24 and later, use --total-min-nodes. The flags --total-min-nodes and --total-max-nodes are mutually exclusive with the flags --min-nodes and --max-nodes.
MAX_NODES: the maximum number of nodes to automatically scale for the specified node pool per zone. To specify the maximum number of nodes for the entire node pool in GKE versions 1.24 and later, use --total-max-nodes. The flags --total-min-nodes and --total-max-nodes are mutually exclusive with the flags --min-nodes and --max-nodes.

Node auto-provisioning with secondary boot disks

In GKE 1.30.1-gke.1329000 and later, you can configure node auto-provisioning to automatically create and delete node pools to meet the resource demands of your workloads.

Create a GCPResourceAllowlist custom resource that allows node auto-provisioning to use your disk images for secondary boot disks, similar to the following:
```
apiVersion: "node.gke.io/v1"
kind: GCPResourceAllowlist
metadata:
  name: gke-secondary-boot-disk-allowlist
spec:
  allowedResourcePatterns:
  - "projects/PROJECT_ID/global/images/.*"
```
Replace PROJECT_ID with the ID of the project that hosts the disk image.
To deploy the allowlist custom resource in the cluster, run the following command:
```
kubectl apply -f ALLOWLIST_FILE
```
Replace ALLOWLIST_FILE with the manifest filename.
Update the Pod node selector to use a secondary boot disk:
```
nodeSelector:
    cloud.google.com.node-restriction.kubernetes.io/gke-secondary-boot-disk-DISK_IMAGE_NAME: CONTAINER_IMAGE_CACHE.PROJECT_ID
```
Replace the following:
- DISK_IMAGE_NAME: the name of your disk image.
- PROJECT_ID: the ID of the project that hosts the disk image.

Use a secondary boot disk in a different project

When you create a node pool with a secondary boot disk, you can tell GKE to use a disk image from a different project by specifying the full image path in the --secondary-boot-disk flag. For example:
```
gcloud beta container node-pools create NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --location LOCATION \
    --enable-image-streaming \
    --secondary-boot-disk=disk-image=projects/IMAGE_PROJECT_ID/global/images/DISK_IMAGE_NAME,mode=CONTAINER_IMAGE_CACHE
```
Replace the following:
- DISK_IMAGE_NAME: the name of your disk image.
- IMAGE_PROJECT_ID: the ID of the project that the disk image belongs to.
GKE creates a node pool where each node has a secondary disk with preloaded data. GKE attaches and mounts the secondary boot disk on the node.
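The only difference between the same-project and cross-project forms is the image path in the flag value. The following sketch makes that explicit; secondary_boot_disk_value is a hypothetical helper that only formats the value, following the two command examples on this page.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the value for --secondary-boot-disk.
# With one argument, the image path is relative to the cluster's project;
# with a second argument, the path names that image project explicitly.
secondary_boot_disk_value() {
  local disk_image_name="$1" image_project_id="${2:-}"
  local image_path="global/images/${disk_image_name}"
  if [ -n "$image_project_id" ]; then
    image_path="projects/${image_project_id}/${image_path}"
  fi
  printf 'disk-image=%s,mode=CONTAINER_IMAGE_CACHE\n' "$image_path"
}
```

For example, --secondary-boot-disk="$(secondary_boot_disk_value DISK_IMAGE_NAME IMAGE_PROJECT_ID)" produces the cross-project form, and omitting the second argument produces the same-project form.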

Grant access to disk images that belong to a different project by granting the Compute Image User role (roles/compute.imageUser) to the following cluster service accounts:

Default compute service account: CLUSTER_PROJECT_NUMBER@cloudservices.gserviceaccount.com
GKE service account: service-CLUSTER_PROJECT_NUMBER@container-engine-robot.iam.gserviceaccount.com

gcloud projects add-iam-policy-binding IMAGE_PROJECT_ID \
    --member serviceAccount:CLUSTER_PROJECT_NUMBER@cloudservices.gserviceaccount.com \
    --role roles/compute.imageUser

gcloud projects add-iam-policy-binding IMAGE_PROJECT_ID \
    --member serviceAccount:service-CLUSTER_PROJECT_NUMBER@container-engine-robot.iam.gserviceaccount.com \
    --role roles/compute.imageUser
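The two bindings differ only in the member, so they can be applied in a loop. The following is a sketch under stated assumptions: image_user_members and grant_image_user are hypothetical helper names, and the project number is looked up with gcloud projects describe from a cluster project ID that you supply.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the two service account members listed above
# for a given cluster project number.
image_user_members() {
  local project_number="$1"
  printf 'serviceAccount:%s@cloudservices.gserviceaccount.com\n' "$project_number"
  printf 'serviceAccount:service-%s@container-engine-robot.iam.gserviceaccount.com\n' "$project_number"
}

# Hypothetical helper: grant roles/compute.imageUser on the image project to
# each member. Defined only; nothing runs until you call it yourself.
grant_image_user() {
  local image_project_id="$1" cluster_project_id="$2" project_number member
  project_number="$(gcloud projects describe "$cluster_project_id" \
    --format='value(projectNumber)')"
  # Member strings contain no spaces, so word splitting is safe here.
  for member in $(image_user_members "$project_number"); do
    gcloud projects add-iam-policy-binding "$image_project_id" \
      --member "$member" \
      --role roles/compute.imageUser
  done
}
```

For example: grant_image_user IMAGE_PROJECT_ID CLUSTER_PROJECT_ID, where CLUSTER_PROJECT_ID is a placeholder for the ID of the project that contains the cluster.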

What's next

Use Image streaming to pull container images by streaming the image data as your workloads need it.
See Improve workload efficiency using NCCL Fast Socket to learn how to use the NVIDIA Collective Communication Library (NCCL) Fast Socket plugin.