What happened
Dragonfly added native Hugging Face and ModelScope support to dfget via hf:// and modelscope://.
For platform teams, this removes much of the custom model-sync and cache glue they otherwise have to build and maintain.
You can pull models directly with built-in auth, revision pinning, and recursive fetches.
Why it matters
On inference clusters, model pull is usually the cold-start bottleneck and one of the biggest avoidable egress costs.
The traffic math is straightforward: a 130 GB model pulled by 200 nodes is 26 TB of origin egress if every node fetches independently. With Dragonfly in front, origin should see roughly one full pull (~130 GB), a ~99.5% reduction. That is where the headline claim comes from.
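The arithmetic above can be sanity-checked in a few lines of shell (the 130 GB and 200-node figures are the example numbers from the text, not measurements):

```shell
# Sanity-check the origin-traffic math from the paragraph above.
model_gb=130    # example model size from the text, in GB
nodes=200       # example number of inference nodes

# Naive case: every node pulls the full model from origin.
naive_tb=$(( model_gb * nodes / 1000 ))
echo "naive origin egress: ${naive_tb} TB"

# With a P2P cache in front, origin serves roughly one full pull,
# so the reduction is 1 - 1/nodes. Integer math, one decimal place.
reduction=$(( (nodes - 1) * 1000 / nodes ))   # per-mille
echo "origin egress reduction: $(( reduction / 10 )).$(( reduction % 10 ))%"
```

With these inputs it prints 26 TB and 99.5%, matching the claim; plug in your own model size and node count before quoting numbers internally.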
It also reduces lock-in to internal mirror pipelines and custom fetch services that become permanent maintenance debt.
How it looks
dfget hf://deepseek-ai/DeepSeek-R1/model.safetensors -O /models/DeepSeek-R1/model.safetensors

dfget hf://owner/repo -O ./repo/ -r

ModelScope follows the same pattern with --ms-token and --ms-revision.
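For the ModelScope side, a sketch of what a pinned, authenticated pull could look like, using only the flags named above (`--ms-token`, `--ms-revision`, `-O`, `-r`). The repo name, revision, and secrets path are hypothetical placeholders:

```shell
# Hypothetical repo and revision. Reading the token from a file keeps it
# out of your shell history (it can still show up in `ps` output on
# shared hosts, which is part of the open token-handling question).
MS_TOKEN="$(cat /run/secrets/ms_token)"   # assumed secrets path

dfget "modelscope://owner/repo" -O ./repo/ -r \
  --ms-token "$MS_TOKEN" \
  --ms-revision "v1.0.0"
```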
The key behavior is piece-level sharing. Seed peers can upload chunks before the full model download completes, so pulls become parallel instead of serialized.
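A back-of-envelope sketch of why piece-level sharing matters, under assumed numbers (10 Gbps links, the 130 GB model from earlier, a 4-hop peer chain — none of these are published Dragonfly figures):

```shell
# One full-file hop: 130 GB over a 10 Gbps link, in seconds.
size_gb=130
link_gbps=10
hop_s=$(( size_gb * 8 / link_gbps ))   # ~104 s per full transfer

# Store-and-forward: a peer can only re-upload after finishing its own
# download, so a chain of hops serializes the transfers.
hops=4
serialized_s=$(( hop_s * hops ))

# Piece-level sharing: pieces flow through the chain as they arrive,
# so the chain pipelines to roughly one full transfer time plus a
# small per-hop piece delay (ignored here).
pipelined_s=$hop_s

echo "store-and-forward: ${serialized_s}s, pipelined: ${pipelined_s}s"
```

The gap grows linearly with chain depth, which is why chunk-before-complete uploads turn a serialized fan-out into a roughly constant-time one.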
Proven vs unproven
Proven:
- Native hf:// and modelscope:// support in dfget
- Cleaner integration path than custom mirror jobs
- Clear origin-traffic reduction math
Unproven (in public data so far):
- p50/p95 model-ready startup time at scale
- cross-AZ and cross-region behavior
- token handling guidance that avoids secrets in shell history
- failure behavior when origin is slow or rate-limited
- end-to-end security posture for private model pulls
What to do next
If you currently mirror Hugging Face or ModelScope into internal storage, run a controlled pilot and compare:
- cold-start time per pod
- origin egress
- east-west network load
- operational complexity (how much code/config you can delete)
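For the cold-start-time comparison, a minimal per-node harness sketch; `PULL_CMD` stands in for whichever fetch path you are testing (a dfget pull, your current mirror sync) and defaults to a no-op so the harness itself runs anywhere:

```shell
# Minimal pilot harness: time one cold pull and append it to a CSV.
# PULL_CMD is the fetch under test; the default is a placeholder.
PULL_CMD=${PULL_CMD:-"sleep 1"}
node=$(hostname 2>/dev/null || echo unknown-node)

start=$(date +%s)
sh -c "$PULL_CMD"
end=$(date +%s)

echo "${node},cold_pull_seconds,$(( end - start ))" >> pull_times.csv
```

Run it once per node per variant, then compare the distributions (p50/p95) rather than single runs, since that is exactly the data missing from the public claims.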
If the numbers hold, delete the mirror pipeline.