CubeFS

License: Apache-2.0

CubeFS (originally ChubaoFS) is a distributed file system written in Go that exposes POSIX, HDFS, and S3-compatible APIs over the same backend. JD.com open-sourced it and runs it in production for large-scale AI training, container image storage, and database workloads. The CNCF accepted the project in December 2019, moved it to incubation in July 2022, and graduated it in December 2024.

Architecturally, CubeFS separates metadata from data and runs each as an independent service. The Master manages cluster topology. Metanodes hold file system metadata in memory, replicate it via Raft, and shard each filesystem (“volume”) into metadata partitions. Datanodes store the file content in data partitions, using multi-replica storage for small files and Reed-Solomon erasure coding for large, cold data, where it reduces storage overhead. Clients access a volume through a FUSE mount, an HDFS-compatible client, or the S3 gateway. Each volume can be tuned independently for consistency, replication factor, and erasure-coding policy.
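To make the storage-overhead trade-off between replication and erasure coding concrete, here is a minimal arithmetic sketch. It is not CubeFS code: the class names are hypothetical, and the parameters (3 replicas, Reed-Solomon with 6 data and 3 parity shards) are illustrative assumptions rather than documented CubeFS defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replication:
    """Hypothetical model: keep `copies` full copies of every byte."""
    copies: int

    def raw_bytes(self, logical_bytes: int) -> int:
        # Every logical byte is written once per copy.
        return logical_bytes * self.copies

    def tolerated_failures(self) -> int:
        # Data survives as long as one copy remains.
        return self.copies - 1


@dataclass(frozen=True)
class ReedSolomon:
    """Hypothetical model: split data into `data_shards` equal shards
    and add `parity_shards` parity shards of the same size."""
    data_shards: int
    parity_shards: int

    def raw_bytes(self, logical_bytes: int) -> int:
        # Ceiling division: the last data shard is padded to full size.
        shard_size = -(-logical_bytes // self.data_shards)
        return shard_size * (self.data_shards + self.parity_shards)

    def tolerated_failures(self) -> int:
        # Any `parity_shards` shards can be lost and rebuilt.
        return self.parity_shards


def overhead(policy, logical_bytes: int) -> float:
    """Raw bytes stored per logical byte."""
    return policy.raw_bytes(logical_bytes) / logical_bytes


if __name__ == "__main__":
    logical = 6 * 2**30  # 6 GiB, chosen so it divides evenly into 6 shards
    for policy in (Replication(copies=3), ReedSolomon(6, 3)):
        print(policy, overhead(policy, logical), policy.tolerated_failures())
```

Under these assumed parameters, 6 GiB of data occupies 18 GiB with three replicas and 9 GiB with the 6+3 erasure code, which is the kind of saving that makes erasure coding attractive for large, rarely read data. The cost is that reads and repairs must touch several shards, one reason replication tends to suit small, latency-sensitive files.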

CubeFS competes with JuiceFS, Alluxio, Ceph (CephFS), and Lustre. Against Ceph, its pitch is simpler operations and Go-based tooling. Against JuiceFS, its pitch is that it stores data in its own datanodes rather than delegating to object storage, which is intended to lower latency for small-file AI training workloads.

CNCF Project

Cloud Native Computing Foundation

Accepted: 2019-12-16
Incubating: 2022-07-03
Graduated: 2024-12-11
