Unable to mount on initContainer #38

Closed
jackcjf opened this issue Jun 21, 2023 · 7 comments
Labels
enhancement New feature or request

Comments

jackcjf commented Jun 21, 2023

Using a statically provisioned PVC in an init container causes pod initialization to hang, and the pod eventually fails with Init:CreateContainerError.

pod init container spec

  initContainers:
  - image: ubuntu:latest
    command:
    - "echo"
    - "12345"
    imagePullPolicy: Always
    name: initer
    resources:
      requests:
        memory: "2Gi"
        cpu: "1000m"
        ephemeral-storage: "10Gi"
    volumeMounts:
    - name: gcs-fuse-csi-pvc
      mountPath: /data
      readOnly: true

k8s events

Events:
  Type     Reason       Age                  From                                   Message
  ----     ------       ----                 ----                                   -------
  Normal   Scheduled    2m31s                gke.io/optimize-utilization-scheduler  Successfully assigned test/my_pod to gk3-test-pool-2-3d6bf18a-xnl2
  Warning  FailedMount  28s                  kubelet                                Unable to attach or mount volumes: unmounted volumes=[gcs-fuse-csi-pvc], unattached volumes=[gcs-fuse-csi-pvc kube-api-access-8r69w gke-gcsfuse-tmp]: timed out waiting for the condition
  Warning  FailedMount  23s (x9 over 2m31s)  kubelet                                MountVolume.MountDevice failed for volume "gcs-fuse-csi-pv" : kubernetes.io/csi: attacher.MountDevice failed to create newCsiDriverClient: driver name gcsfuse.csi.storage.gke.io not found in the list of registered CSI drivers

Driver version
Running Google Cloud Storage FUSE CSI driver sidecar mounter version v0.1.3-gke.0

GKE
1.26.3-gke.1000

Since a sidecar is needed for normal containers to access the bucket, I believe this issue arises because the init container has no sidecar to support the mount.

@songjiaxun
Collaborator

Currently, the driver does not support init containers. However, we may have a potential solution:

The KEP for the sidecar container pattern https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/753-sidecar-containers is implemented and merged: kubernetes/kubernetes#116429

The new feature gate "SidecarContainers" is now available. This feature introduces sidecar containers, a new type of init container that starts before other containers but remains running for the full duration of the pod's lifecycle and will not block pod termination.

This change could be a good solution. Instead of injecting the sidecar container as a regular container, we can leverage the new SidecarContainers feature to inject it as an init container, so that other, non-sidecar init containers can also use the driver.

songjiaxun added the enhancement (New feature or request) label Jul 13, 2023
@everpeace

kubernetes/kubernetes#116429 is now merged 🎉 The sidecar feature will be released in Kubernetes v1.28 behind a feature gate.

So, the driver can inject the sidecar-mounter container as a sidecar container that runs for the whole pod lifecycle (as an initContainer with restartPolicy: Always). Normal init containers can then access the FUSE volume properly.
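
For readers unfamiliar with the pattern, here is a minimal sketch of a native sidecar in a Pod manifest. It is a generic illustration of the Kubernetes feature, not the spec the driver actually injects: the Pod name, container names, images, and commands are all placeholders chosen for this example. It assumes a cluster where the SidecarContainers feature gate is enabled.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: native-sidecar-sketch      # hypothetical name
spec:
  initContainers:
  # Native sidecar: an init container with restartPolicy: Always.
  # It is started first and keeps running for the whole Pod lifecycle.
  - name: stand-in-sidecar         # placeholder for a driver-injected sidecar
    image: ubuntu:latest
    command: ["sleep", "infinity"]
    restartPolicy: Always
  # Regular init container: starts only after the sidecar above has started,
  # and must run to completion before the app containers start.
  - name: initer
    image: ubuntu:latest
    command: ["echo", "12345"]
  containers:
  - name: app
    image: ubuntu:latest
    command: ["sleep", "infinity"]
```

The order of entries under initContainers matters: only init containers listed after the sidecar can rely on it already running.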

@msau42
Collaborator

msau42 commented Jul 28, 2023

FYI, the sidecar containers feature is alpha in Kubernetes 1.28, which means it will not be available on GKE. The Kubernetes feature needs to be at least beta before GKE can use it.

@songjiaxun
Collaborator

We are working on adopting the Kubernetes native sidecar container feature, and we are targeting mid-March for init container mount support. Note that it will only be supported on clusters running 1.29 or later.

@songjiaxun
Collaborator

songjiaxun commented Apr 7, 2024

The new GKE version rollout completed. Starting from GKE 1.29.3-gke.1093000, the CSI driver injects the GCSFuse sidecar container as an init container that supports mounting GCSFuse volumes in other init containers.

To try out the new feature, please upgrade your GKE cluster to 1.29.3-gke.1093000 or later, and make sure ALL your nodes are also upgraded to GKE version 1.29 or later, then re-deploy your workloads.
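
As a rough sketch of a workload to re-deploy after the upgrade, the manifest below mounts a statically provisioned PVC in an init container. The Pod name, service account, and claim name are hypothetical and need to match your own setup. The gke-gcsfuse/volumes annotation is the one the driver's documentation describes for requesting sidecar injection; please verify it against the driver's README for your version.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: init-mount-example             # hypothetical name
  annotations:
    gke-gcsfuse/volumes: "true"        # requests injection of the GCSFuse sidecar
spec:
  serviceAccountName: my-ksa           # hypothetical; must have access to the bucket
  initContainers:
  - name: initer
    image: ubuntu:latest
    command: ["ls", "/data"]           # reads the bucket during initialization
    volumeMounts:
    - name: gcs-fuse-csi-pvc
      mountPath: /data
      readOnly: true
  containers:
  - name: main
    image: ubuntu:latest
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: gcs-fuse-csi-pvc
      mountPath: /data
      readOnly: true
  volumes:
  - name: gcs-fuse-csi-pvc
    persistentVolumeClaim:
      claimName: gcs-fuse-csi-static-pvc   # hypothetical claim bound to your PV
```

If the init container still hangs, checking the Pod events for FailedMount messages like those in the original report is a reasonable first step.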

Closing this issue for now.

@vanushah

vanushah commented Jul 31, 2024

@songjiaxun Hey. I've just upgraded my control plane and the nodes in GKE to v1.29.6-gke.1038001 and I'm still experiencing this issue:

Events:
  Type     Reason     Age                     From               Message
  ----     ------     ----                    ----               -------
  Normal   Scheduled  7m47s                   default-scheduler  Successfully assigned influence-tracker/airflow-worker-0 to gke-production-main-prod-7-dd30ec0c-pkwg
  Normal   Pulled     7m39s                   kubelet            Container image "docker.bingo-boom.ru/digital_department/ru/application-layer/resources/dependencies/docker-images/airflow:1" already present on machine
  Normal   Created    7m39s                   kubelet            Created container install-pip-packages
  Normal   Started    7m39s                   kubelet            Started container install-pip-packages
  Warning  Failed     5m12s                   kubelet            Error: context deadline exceeded
  Warning  Failed     3m23s (x10 over 5m12s)  kubelet            Error: failed to reserve container name "check-db_airflow-worker-0_influence-tracker_1a400420-e015-44cb-b2ef-98d22d0cfc30_0": name "check-db_airflow-worker-0_influence-tracker_1a400420-e015-44cb-b2ef-98d22d0cfc30_0" is reserved for "2886f180b426a48824f724d99d18c28d4f7a44ec197551d8331acac7d0420319"
  Normal   Pulled     2m30s (x15 over 7m12s)  kubelet            Container image "docker.bingo-boom.ru/digital_department/ru/application-layer/resources/dependencies/docker-images/airflow:1" already present on machine

I'm trying to launch an Airflow worker here whose logs volume is the FUSE mount.

"make sure ALL your nodes are also upgraded to GKE version 1.29 or later"

BTW, why do we need this? I've only upgraded the nodes that run the workloads using the FUSE mount. Do I need to upgrade all nodes in the cluster, and if so, why?

@songjiaxun
Collaborator

Hi @vanushah, you will need to upgrade all of your nodes to 1.29 or later, because the native sidecar container feature is only enabled on GKE nodes running 1.29 or later. The CSI driver checks whether every node is on version 1.29 or above, and only then decides that the cluster can support the native sidecar container feature. This design avoids the case where a Pod with a native sidecar container gets scheduled onto a 1.28 node and is stuck there.
