Ondat Shared Filesystems


💡 This feature is available in release v2.3.0 or greater.

What Are Ondat Shared Filesystems?

Ondat provides support for ReadWriteMany (RWX) persistent volumes.

  • A RWX PVC can be used simultaneously by many Pods in the same Kubernetes namespace for read and write operations.
  • Ondat RWX persistent volumes are based on a shared filesystem; the protocol used for this feature’s backend is Network File System (NFS).
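
As an illustration, a minimal PVC requesting RWX access might look like the following sketch. The PVC name and the StorageClass name `storageos` are assumptions; substitute the Ondat StorageClass available in your cluster:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-pvc              # hypothetical PVC name
spec:
  accessModes:
    - ReadWriteMany             # RWX - uses Ondat's shared filesystem feature
  storageClassName: storageos   # assumed Ondat StorageClass name
  resources:
    requests:
      storage: 5Gi
```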

Architecture Of Ondat Shared Filesystems

For each RWX persistent volume, the following components are required:

  1. Ondat ReadWriteOnly (RWO) Volume
    • Ondat provisions a standard volume that provides a block device for the file system of the NFS server.
    • This means that every RWX Volume has its own RWO Volume - thus allowing RWX Volumes to leverage the synchronous replication and automatic failover functionality of Ondat, providing the NFS server with high availability.
  2. NFS-Ganesha Server
    • For each RWX volume, an NFS-Ganesha server is spawned by Ondat.
    • The NFS server runs in user space on the node containing the primary volume. Each NFS server uses its own RWO volume to store data, so the data of each volume is isolated.
    • Ondat binds an ephemeral port to the host network interface for each NFS-Ganesha server.
    • The NFS export is presented using NFS v4.2. Review the official prerequisites page for details on the port number range that must be available for Ondat RWX persistent volumes to run successfully.
  3. Ondat API Manager
    • The Ondat API Manager resource monitors Ondat RWX volumes to create and maintain a Kubernetes service that points towards each RWX volume’s NFS export endpoint.
    • The API Manager is responsible for updating the service endpoint when a RWX volume failover occurs.
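
As a sketch of the pattern described above, a Service without a selector paired with a manually managed Endpoints object can point cluster traffic at a host IP and port. All names, IPs, and ports below are illustrative assumptions, not the exact objects Ondat creates:

```yaml
# Illustrative only - the field values Ondat manages internally may differ.
apiVersion: v1
kind: Service
metadata:
  name: pvc-shared-nfs          # hypothetical name derived from the PVC
spec:
  ports:
    - port: 2049                # standard NFS port exposed to clients
      targetPort: 30555         # assumed ephemeral port bound by NFS-Ganesha
---
apiVersion: v1
kind: Endpoints
metadata:
  name: pvc-shared-nfs          # must match the Service name
subsets:
  - addresses:
      - ip: 10.0.0.12           # assumed host IP of the node holding the primary volume
    ports:
      - port: 30555
```

On failover, updating the Endpoints address is enough to redirect clients, which matches the API Manager behaviour described above.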

How are Ondat ReadWriteMany (RWX) PersistentVolumeClaims (PVCs) Provisioned?

The sequence in which a RWX PVC is provisioned and used is demonstrated in the steps below:

  1. A PersistentVolumeClaim (PVC) is created with ReadWriteMany (RWX) access mode using any Ondat StorageClass.
  2. Ondat dynamically provisions the PersistentVolume (PV).
  3. A new Ondat ReadWriteOnly (RWO) Volume is provisioned internally (not visible in Kubernetes).
  4. When the RWX PVC is consumed by a pod, an NFS-Ganesha server is instantiated on the same node as the primary volume.
  5. The NFS-Ganesha server then uses the RWO Ondat volume as its backend disk.
  6. The Ondat API Manager publishes the host IP and port for the NFS service endpoint, by creating a Kubernetes service that points to the NFS-Ganesha server export endpoint.
  7. Ondat issues a NFS mount on the Node where the Pod using the PVC is scheduled.
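
The steps above can be exercised with a simple Pod that mounts the RWX claim. The Pod and claim names are illustrative; `shared-pvc` stands in for whichever RWX PVC you created:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: rwx-consumer            # hypothetical Pod name
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: shared-data
          mountPath: /data      # the NFS mount appears here inside the container
  volumes:
    - name: shared-data
      persistentVolumeClaim:
        claimName: shared-pvc   # assumed name of the RWX PVC
```

Scheduling additional Pods with the same `claimName`, in the same namespace, gives each of them concurrent read/write access to the volume.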

For more information on how to get started with Ondat Shared Filesystems, review the How To Create ReadWriteMany (RWX) Volumes operations page.

High Availability For Ondat Shared Filesystems

Ondat RWX volumes failover in the same way as standard Ondat RWO volumes.

  • The replica volume is promoted upon detection of node failure and the NFS-Ganesha server is started on the node containing the promoted replica.
  • The Ondat API Manager updates the endpoint of the Volume’s NFS service, causing traffic to be routed to the new NFS-Ganesha server’s endpoint.
  • The NFS client in the application node (where the user’s pod is running) automatically reconnects.

Further Information

  • All Ondat Feature Labels that work on RWO volumes will also work on RWX volumes.
  • An Ondat RWX volume is matched one-to-one with a PVC. Therefore the Ondat RWX volume can only be accessed by pods in the same Kubernetes namespace.
  • Ondat RWX volumes support volume resize.
    • For more information on how to resize a volume, review the Volume Resize operations page.
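
As a sketch, a resize is requested by increasing the claim’s requested storage; Ondat then expands the backing RWO volume. The PVC name and sizes are illustrative:

```yaml
# Patch fragment for an existing RWX PVC (e.g. shared-pvc).
spec:
  resources:
    requests:
      storage: 10Gi   # increased from the original 5Gi request
```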
  • Because each RWX volume is backed by an NFS server instance, its resource consumption can grow.
    • This consumption scales linearly with the volume’s throughput.
    • If given insufficient resources, the NFS server’s IO can block and the server can fail.
    • The resources in question are the speed of the underlying disk and the CPU time of the machine hosting the volume’s primary replica.
    • The number of attachments itself is unlikely to cause any issue outside of NFS. We have happily tested up to 800 consumers for volumes hosted on small hosts, for very low-throughput applications.