#6420 Hypothetical maximum speed mirroring
Closed: Fixed 7 years ago Opened 7 years ago by tibbs.

Over the past few days I've put together something I call quick-fedora-mirror. It lives at https://pagure.io/quick-fedora-mirror and the documentation there should describe how it works, but here's an executive summary:

  • There's a server-generated file list.

  • It turns the "transferring file list" portion of a full rsync update of fedora-buffet0 from eleven hours to a few seconds unless there has been a large amount of updated content. In all cases except one (when the mirror has no content at all) it is faster.

  • I can poll every module in fedora-buffet0 every ten minutes and I don't think the servers even notice. Each poll requires one rsync call to fetch one file per rsync module, plus one more to transfer any changed content (see the sketch after this list). If there's no changed content, the poll takes about six seconds.

  • Yes, it correctly copies hardlinked content on the server as hardlinks on the client. This works even between modules, assuming the client actually mirrors those modules.

  • It works for mirrors mirroring from other mirrors.

  • It handles the bitflip, and it fetches files which are missing from or have been deleted on the client.

  • The client needs only rsync and (currently) zsh. I believe it is portable to bash 4 without too much trouble; bash 3 might be tough and would need an expert. But the point is that clients don't need any special software to do this.
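
For concreteness, a single poll amounts to something like the following minimal zsh sketch. The master hostname, module name, and file list name here are illustrative assumptions; the real client handles multiple modules, checksums, and state tracking.

    statedir=/srv/mirror/state

    # One small rsync call per module fetches just the server-generated file list.
    rsync -t rsync://dl.fedoraproject.org/fedora-enchilada/fullfiletimelist-fedora \
          "$statedir/fullfiletimelist-fedora.new"

    # If the list hasn't changed, the poll is done after a few seconds.
    if cmp -s "$statedir/fullfiletimelist-fedora.new" "$statedir/fullfiletimelist-fedora"; then
        exit 0
    fi

    # Otherwise, diff the old and new lists to find the changed paths and hand
    # them to a single content-transferring rsync call (details omitted here).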

I am looking at ways to leverage quick-fedora-mirror, fedmsg and perhaps mirrormanager to maximize the speed at which we get content out to the mirrors.

The first pass:

When an updated file list is generated on the master, emit a fedmsg message. I think this is easy to do. Tier 1 mirrors can just watch for this message, sleep $((RANDOM*60/32768)) or whatever, and fire off a transfer. They should all have updated Fedora updates content within, say, ten or fifteen minutes.
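
A Tier 1 consumer could be as simple as this rough sketch. The topic string and script path are hypothetical, and it assumes the fedmsg-tail utility from the fedmsg package is installed:

    # Watch the bus for file-list-updated messages; the topic and paths are made up.
    fedmsg-tail \
      | grep --line-buffered 'filelist.update' \
      | while read -r _msg; do
            sleep $((RANDOM * 60 / 32768))      # random 0-59 second stagger
            /usr/local/bin/quick-fedora-mirror  # fire off a transfer
        done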

The second pass:

The server can tell mirrormanager that updated content is ready, and mirrormanager can emit the message.

The third pass:

When a Tier 1 runs report_mirror, mirrormanager can emit a message to the bus indicating that that particular Tier 1 now has updated content. Tier 2 mirrors can listen for their upstream host (or even several of them) and start their runs. And so on.

I have a two-tiered system running right now. One mirror pulls from the Fedora masters (by polling, currently) and then tickles two slave mirrors to get the new content.

Questions:

  • Does anyone see any flaws in this plan?

  • Is this (i.e. fanning out new content quickly) something we would want?

  • Is it reasonable for mirrormanager to do this? I know that currently the master only informs it of new fedora/epel content; I don't know about the other modules. The master does know when they change (if only because a cron job runs), and I don't think it would be hard to let mirrormanager know. I don't know if mirrormanager can do anything with that information.

  • Is there any possibility of getting sufficient mirrors on board to make this useful? I tried to severely limit the dependencies on the client (a shell script, rsync, awk, and the usual utilities). Even report_mirror needs Python.


I completely forgot that I filed this. Here's a status update:

  • The file list generation code is functional and implemented on the master mirrors. There are minor issues that still need fixing but the file list format is stable and everything is suitable for consumption by a client.

  • The client is functional. It runs quickly and places essentially minimal load on the server. I can poll every ten minutes; a single poll takes about four seconds and makes one rsync connection (unless there are changes to download, of course).

  • I run this on three mirrors, with one pulling from the Fedora masters and the other two pulling from it. No problems so far.

  • Fedora runs the client on the Ibiblio mirror and it appears to be doing fine.

  • When there are changes to download, processing takes some client CPU (but not too much), and it does take a full client directory traversal for each changed module (so most of the time you can skip the archive module). Previously, rsync required one full traversal on the client and one on the server, which was very slow; now, from the server end, things are very efficient.

  • Nothing saves you from actually having to download the content, of course. This just saves both ends from having to actually look through their files to see what differs.

  • Hardlinks are copied as hardlinks, but there are server-side requirements for this which aren't currently always met: when you link between two separate rsync modules, you must regenerate the file lists for both modules at the same time.

  • There is an efficient hardlinker which runs on the client and fixes up hardlinks which exist on the server but not on the client (in case the server gets a full hardlink run, for example). It checks only those files which are already hardlinked on the server and does not have to do a complete tree traversal (see the sketch after this list).

  • The hardlinker can run in a mode where it doesn't trust the server to have found all possible hardlinks. It's slower because fedora-secondary has many files with the same names and sizes but different content, though I could optimize that. This mode is suitable for running on the master mirrors and should still be far faster than a full hardlink run.

  • I'm working on mirrormanager checkin. I'm just waiting for a way to actually do that without having to use both XML-RPC and JSON; doing just JSON will be fun enough. This will add a dependency on curl, but the whole thing will be optional and off by default.

  • We have a plan for having the master servers emit messages when updated file lists are available. I'll still need to figure out how to handle them. Right now, polling is easy and very cheap.
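
For the curious, the client-side hardlink fix-up amounts to something like this zsh sketch. It assumes the hardlink groups from the server's file list have already been extracted, one whitespace-separated group of paths per line; the input format and file name are made up for illustration:

    # Visit only the paths the server's file list marks as hardlinked together,
    # so no tree traversal is needed.
    while read -r first rest; do
        for dup in ${=rest}; do
            # If this path isn't already linked to the first file in the group,
            # replace it with a hard link to that file.
            if [[ "$(stat -c %i "$first")" != "$(stat -c %i "$dup")" ]]; then
                ln -f "$first" "$dup"
            fi
        done
    done < hardlink-groups.txt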

Again, this is at https://pagure.io/quick-fedora-mirror

Metadata Update from @tibbs:
- Issue set to the milestone: Fedora 24 Alpha
- Issue tagged with: meeting

7 years ago

AFAIK we have implemented everything we need to in order to support this, and we are all for it. I believe we have done all we can from our side, so I am closing this issue.

Metadata Update from @ausil:
- Issue untagged with: meeting
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

7 years ago
