How I Split rclone and rsync When Moving Hugging Face Models from Cold to Hot Storage
A transfer procedure that uses rclone for blobs and rsync for snapshots and refs when moving Hugging Face models between cold and hot storage tiers.
Introduction
When I move a Hugging Face model from a cold archive path to a hot runtime path, a full directory copy is not the cleanest approach. The repository layout mixes large file objects with symlink-based revision structure, so using one tool for everything tends to blur an important distinction.
This note captures the split I settled on for a GLM-5-GGUF repository: rclone for blobs, and rsync for snapshots and refs. In my storage strategy, active data and slower archival storage are treated differently; this transfer pattern is the concrete way I move a model into the hot tier while keeping the Hugging Face directory structure usable.
Background and Motivation
A local Hugging Face hub directory is more structured than it first appears. Most of the bytes live under blobs, but actual revision resolution depends on the trees under snapshots and refs.
If I treat all of that as the same kind of data, I either lose performance on the large object copy or risk mangling the symlink structure that makes the repository usable. The simplest answer was to split the transfer by data type: move file objects with a tool that is good at parallel copying, and move the reference tree with a tool that preserves filesystem structure.
Prerequisites
I start by fixing the model-specific paths.
BASE="models--unsloth--GLM-5-GGUF"
SRC_BASE="/srv/archive/cold/hf/hub/$BASE"
DST_BASE="/srv/archive/hot/hf/hub/$BASE"
mkdir -p "$DST_BASE"
With this layout, /srv/archive/cold/hf/hub/ is the archive side and /srv/archive/hot/hf/hub/ is the active side. For another model, the main change is just BASE.
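Before copying anything, it can help to confirm that the source side actually has the three subdirectories the rest of the procedure relies on. The following is a minimal preflight sketch; the function name check_hf_layout is a hypothetical label chosen for illustration, not part of any tool.

```shell
# Hypothetical helper: verify that a hub repo directory has the expected layout.
check_hf_layout() {
  local repo="$1" missing=0 sub
  for sub in blobs snapshots refs; do
    if [ ! -d "$repo/$sub" ]; then
      echo "missing: $repo/$sub" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (SRC_BASE as defined above):
#   check_hf_layout "$SRC_BASE" || exit 1
```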
Copying blobs in Parallel with rclone
The first step is to move the actual file objects under blobs.
rclone copy "$SRC_BASE/blobs" "$DST_BASE/blobs" \
--exclude "*.incomplete" \
--transfers 16 --checkers 32 \
--local-no-check-updated \
-P
This is where rclone makes sense. The blobs directory is just a large object store, so parallel transfer is the main concern. I start at --transfers 16 --checkers 32, with room to push to --transfers 32 --checkers 64 on 10GbE plus SSD.
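The two tuning levels can be kept in one place so the copy command does not drift between runs. This is only a sketch: the profile names default and fast and the function name rclone_parallel_flags are hypothetical, and the numbers are simply the ones quoted in this note, not measured recommendations.

```shell
# Hypothetical helper: map a profile name to rclone parallelism flags.
rclone_parallel_flags() {
  case "${1:-default}" in
    default) echo "--transfers 16 --checkers 32" ;;
    fast)    echo "--transfers 32 --checkers 64" ;;  # 10GbE plus SSD
    *)       echo "unknown profile: $1" >&2; return 1 ;;
  esac
}

# Usage: the unquoted expansion is intentional so the flags split into words.
#   rclone copy "$SRC_BASE/blobs" "$DST_BASE/blobs" \
#     --exclude "*.incomplete" $(rclone_parallel_flags fast) \
#     --local-no-check-updated -P
```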
I also exclude .incomplete files from the start. If the destination is meant to be a clean hot-side copy, unfinished objects should not come along for the ride.
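To see what the exclusion will skip, the unfinished files can be listed on the source side first. A minimal sketch, with list_incomplete as a hypothetical helper name:

```shell
# Hypothetical helper: print unfinished download artifacts under a directory.
list_incomplete() {
  find "$1" -type f -name '*.incomplete' | sort
}

# Usage:
#   list_incomplete "$SRC_BASE/blobs"
```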
Copying snapshots While Keeping Symlinks Intact
Next, I copy snapshots.
mkdir -p "$DST_BASE/snapshots"
rsync -aH --info=progress2 \
--exclude="*.incomplete" \
"$SRC_BASE/snapshots/" "$DST_BASE/snapshots/"
Here I switch to rsync -aH because the goal is no longer raw throughput. The point is to preserve the symlink-based structure as-is: -a recreates symlinks as symlinks instead of following them, and -H keeps any hard links. In a Hugging Face hub layout, the entries under snapshots are symlinks that point to objects under blobs, so keeping those links intact matters more than forcing everything through the same copy tool.
--info=progress2 is also useful here because snapshot trees can still take time, and I want a single progress view while the copy is running.
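Because rsync -a copies each symlink's target string verbatim, a link with an absolute target would keep pointing at the cold path after the move. The sketch below lists any such links so they can be inspected before promotion. It assumes the usual case is relative links into blobs, and list_absolute_links is a hypothetical name.

```shell
# Hypothetical helper: print symlinks whose target is an absolute path.
list_absolute_links() {
  find "$1" -type l | while IFS= read -r link; do
    target=$(readlink "$link")
    case "$target" in
      /*) printf '%s -> %s\n' "$link" "$target" ;;
    esac
  done
}

# Usage:
#   list_absolute_links "$SRC_BASE/snapshots"
```

Empty output means every link is relative and should resolve on the hot side as long as blobs sits next to snapshots there.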
Copying refs for Repository Consistency
refs is small, but I still move it explicitly.
mkdir -p "$DST_BASE/refs"
rsync -aH --info=progress2 \
"$SRC_BASE/refs/" "$DST_BASE/refs/"
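This directory does not dominate transfer time, but it is part of the repository state: each file under refs maps a name such as main to a commit hash under snapshots. Leaving it behind would leave the hot-side copy with no way to resolve a revision by name, even if the large files and snapshot tree are already there.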
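A quick way to confirm that the copied refs line up with the copied snapshots is sketched below. It assumes each file under refs contains only a commit hash that names a directory under snapshots; check_refs is a hypothetical helper.

```shell
# Hypothetical helper: report refs whose commit hash has no snapshot directory.
check_refs() {
  local repo="$1"
  # The explicit subshell carries the failure flag out as its exit status.
  find "$repo/refs" -type f | (
    bad=0
    while IFS= read -r ref; do
      rev=$(tr -d '[:space:]' < "$ref")
      if [ -z "$rev" ] || [ ! -d "$repo/snapshots/$rev" ]; then
        echo "dangling ref: $ref -> $rev"
        bad=1
      fi
    done
    exit "$bad"
  )
}

# Usage:
#   check_refs "$DST_BASE" || echo "some refs point at snapshots that were not copied"
```

This matters most after a partial promotion, where a ref can legitimately name a revision that was left on the cold side.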
Running a Minimal Integrity Check
After the copy, I run one quick validation step.
find "$DST_BASE/snapshots" -type l ! -exec test -e {} \; -print | head
If the command prints nothing, no broken symlinks were found under snapshots. Because of the trailing head, at most the first ten broken links are listed, which is enough to show that something is wrong. It is not a full audit, but it is a practical first check, and it directly validates the part of the transfer that is easiest to break when mixing tools or copy modes.
For me, this one line is what turns the procedure from “copied some directories” into “copied the repository structure and confirmed it still resolves.”
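For use inside a script, the same find expression can be wrapped so that it returns a failing status when anything is broken, instead of relying on someone reading the output. A sketch, with broken_links as a hypothetical name:

```shell
# Hypothetical helper: list every broken symlink and fail if any exist.
broken_links() {
  local out
  out=$(find "$1" -type l ! -exec test -e {} \; -print)
  if [ -n "$out" ]; then
    printf '%s\n' "$out"
    return 1
  fi
}

# Usage:
#   broken_links "$DST_BASE/snapshots" || exit 1
```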
Copying Only a Specific Revision
Sometimes I do not want the whole snapshot set. In that case, I can narrow the copy to one revision and one quantization subtree.
REV="acc91597d28b7ebd3a8c20fd5331ceaf07a4ece1"
mkdir -p "$DST_BASE/snapshots/$REV"
rsync -aH --info=progress2 \
"$SRC_BASE/snapshots/$REV/IQ4_NL/" "$DST_BASE/snapshots/$REV/IQ4_NL/"
This is useful when I only need a specific variant such as IQ4_NL on the hot side. For larger repositories, that kind of narrowing is often the difference between a short operational copy and an unnecessary full promotion.
The tradeoff is operational clarity: once partial copies are allowed, I need a clear rule for which revisions and quantizations are expected to live on the hot tier.
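A partial snapshot copy only moves symlinks, so the blobs those links point to also need to be present on the hot side. If the full blobs copy was skipped, one option is to derive the needed blob names from the links and hand that list to rclone with --files-from. This is a sketch under the assumption that every link resolves to a file directly under blobs; referenced_blobs and the /tmp/blobs.txt path are hypothetical.

```shell
# Hypothetical helper: print the blob file names referenced by symlinks
# under a snapshot subtree, one per line, without duplicates.
referenced_blobs() {
  find "$1" -type l | while IFS= read -r link; do
    basename "$(readlink "$link")"
  done | sort -u
}

# Usage (REV and IQ4_NL as in the commands above):
#   referenced_blobs "$SRC_BASE/snapshots/$REV/IQ4_NL" > /tmp/blobs.txt
#   rclone copy "$SRC_BASE/blobs" "$DST_BASE/blobs" \
#     --files-from /tmp/blobs.txt --transfers 16 --checkers 32 -P
```

Running the broken-symlink check afterwards confirms whether the list covered everything the copied subtree needs.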
Reusing the Pattern for Other Models
The same structure can be reused for another model such as DeepSeek-V3.2-Speciale by changing BASE.
That is the part worth keeping as a template. The stable idea is not model-specific naming; it is the division of labor:
- Use rclone for object-heavy blobs
- Use rsync for symlink and reference trees under snapshots and refs
As long as that split stays intact, the Hugging Face repository layout remains usable after the move.
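The whole sequence can be folded into one function that takes the model directory name as its argument. The sketch below is one possible shape rather than a finished script: promote_model, run, and the DRY_RUN switch are hypothetical, and COLD_ROOT and HOT_ROOT default to the hub paths used earlier in this note.

```shell
# Hypothetical wrapper around the steps in this note.
# Set DRY_RUN=1 to print the commands instead of running them.
COLD_ROOT="${COLD_ROOT:-/srv/archive/cold/hf/hub}"
HOT_ROOT="${HOT_ROOT:-/srv/archive/hot/hf/hub}"

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

promote_model() {
  local base="$1"
  local src="$COLD_ROOT/$base"
  local dst="$HOT_ROOT/$base"
  run mkdir -p "$dst/snapshots" "$dst/refs" || return 1
  # Object-heavy part: parallel copy with rclone.
  run rclone copy "$src/blobs" "$dst/blobs" \
    --exclude "*.incomplete" --transfers 16 --checkers 32 \
    --local-no-check-updated -P || return 1
  # Structure-preserving part: symlink and reference trees with rsync.
  run rsync -aH --exclude="*.incomplete" "$src/snapshots/" "$dst/snapshots/" || return 1
  run rsync -aH "$src/refs/" "$dst/refs/" || return 1
}

# Usage:
#   DRY_RUN=1 promote_model "models--unsloth--GLM-5-GGUF"
```

Printing the commands first is a cheap way to confirm the paths before a long transfer starts.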
Results
The procedure clarifies a few things:
- blobs should be treated as parallel file transfer work
- snapshots and refs should be treated as structure-preservation work
- .incomplete files should be excluded from the hot-side copy
- a broken-symlink check should be part of the workflow
- partial promotion by revision or quantization is possible when needed
That gives me a transfer method that is fast enough for the large files and conservative enough for the repository structure.
Future Work
The next improvement is not another copy command. It is tightening the operational guardrails around this one.
- Add post-transfer file-count or size verification for blobs
- Record model-size-based presets for --transfers and --checkers
- Define a separate policy for which revisions belong on hot storage
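The blobs verification could start as a simple manifest comparison. The sketch below assumes blobs is a flat directory of files and compares names and byte sizes only, not content; blob_manifest and compare_blobs are hypothetical helpers. Running rclone check with --size-only is another option for the same purpose.

```shell
# Hypothetical helper: print "<bytes> <name>" for each finished blob.
# The parenthesized body runs in a subshell, so the cd does not leak.
blob_manifest() (
  cd "$1" || exit 1
  for f in *; do
    [ -f "$f" ] || continue
    case "$f" in *.incomplete) continue ;; esac
    printf '%s %s\n' "$(wc -c < "$f" | tr -d ' ')" "$f"
  done
)

# Succeeds only when both directories hold the same names and sizes.
compare_blobs() {
  local a b
  a=$(blob_manifest "$1") || return 2
  b=$(blob_manifest "$2") || return 2
  [ "$a" = "$b" ]
}

# Usage:
#   compare_blobs "$SRC_BASE/blobs" "$DST_BASE/blobs" || echo "blobs differ"
```

This only applies after a full blobs copy; after a partial promotion the hot side is expected to hold fewer blobs than the cold side.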
Once cold and hot storage have distinct roles, the procedure should define not only how I copy a model, but also what deserves promotion in the first place.
