Ask fedi: replicating large file collections over slow links
I have a server "primary-na" with 50TB of arbitrary content in /srv, mostly in millions of small files, many of them identical hardlinks. I have 3 other servers across the world (copy-na, copy-eu, copy-ap) where I want an exact replica of primary-na's /srv. These replicas may occasionally be unavailable for hours on end, or slow, or under high load. The content on them may also occasionally bitrot, which must be detected and healed.

I've researched this multiple times over the last few years, but I've still not found a solution that would beat "just run rsync over it when something changes on replica-na." It's simple and effective, but obviously super inefficient and IO-heavy on both ends.

Any suggestions on how you would do it?
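For reference, the baseline being compared against is something like the following. This is a sketch, not the exact invocation from the post (the flags aren't given there); copy-eu stands in for any of the replicas named above:

    # Baseline full-tree sync from primary-na to one replica (illustrative).
    # -a         preserve perms/times/owners/symlinks
    # -H         preserve hardlinks (many files here are identical hardlinks)
    # --delete   drop files on the replica that are gone on the source
    # Every run walks the metadata of the full ~50TB tree on both ends,
    # which is the IO cost in question even when almost nothing changed.
    rsync -aH --delete /srv/ copy-eu:/srv/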
re: Ask fedi: replicating large file collections over slow links
@monsieuricon In theory I manage my backups using BTRFS snapshot/send/recv. In practice I don't really manage them at all, so not sure how strong of a recommendation that is.
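For context, the btrfs approach replicates at the filesystem level: it ships only the delta between two read-only snapshots, so unchanged files are never re-scanned and renames and hardlinks come along for free. A rough sketch, assuming /srv is a btrfs subvolume; the snapshot paths and names are made up for illustration:

    # On primary-na: freeze the current state as a read-only snapshot.
    btrfs subvolume snapshot -r /srv /snapshots/srv-new
    # Send only the delta against the previous snapshot (-p parent),
    # piped to a replica that already holds /snapshots/srv-prev.
    btrfs send -p /snapshots/srv-prev /snapshots/srv-new \
        | ssh copy-eu btrfs receive /snapshots
    # On each replica: a periodic scrub verifies data checksums; with a
    # redundant profile (DUP/RAID1) it can also heal bitrot in place.
    btrfs scrub start /srv

This would also cover the bitrot requirement from the original question, since btrfs checksums all data and scrub reports (and, given redundancy, repairs) corrupted blocks.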
@asdil12 I'm aware of syncthing, but I'd need to see some evidence that it can scale up to 50TB and millions of files, properly recognize things like moves and hardlinks, etc. Unfortunately, I'm not in a position to easily experiment with it.
@amonakov Yes, I find it intriguing for some of its concepts, but it also has the major downside of needing an extra 50+ TB for storing the repository. Also, it is really written to solve a different problem -- backing up data as opposed to replicating it to multiple nodes efficiently.
re: Ask fedi: replicating large file collections over slow links
@mss Correct, where it says "replica-na" it should say "primary-na".

The question of temporary files is actually an important consideration. The content of primary-na is distro data that is copied onto the system via rsync with --delay-updates, so everything is written into ~tmp~ dirs and then moved into place at the end of a successful run. In theory, fs-based replication should handle this correctly.
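To make that pattern concrete (the actual upstream sync command isn't shown, so this invocation is hypothetical): with --delay-updates, rsync stages each updated file in a ".~tmp~" directory next to its destination and renames the whole batch into place only at the end of a successful run, so a filesystem snapshot taken mid-transfer sees either old complete files or staging dirs, never half-written content.

    # Hypothetical mirror pull onto primary-na; the upstream URL is made up.
    # --delay-updates stages files in per-directory ".~tmp~" dirs and
    # moves them all into place together at the end of the transfer.
    rsync -aH --delete --delay-updates rsync://mirror.example.org/distro/ /srv/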