Re: [rdiff-backup-users] rdiff-backup 4 windows (no cygwin needed!)


From: Josh Nisly
Subject: Re: [rdiff-backup-users] rdiff-backup 4 windows (no cygwin needed!)
Date: Fri, 13 Jun 2008 11:39:23 +0600
User-agent: Thunderbird 2.0.0.14 (X11/20080505)

Andrew Ferguson wrote:

> On Jun 11, 2008, at 4:51 AM, Josh Nisly wrote:
>> Attached is a patch that implements the make_file_dict as described above. Let me know what you think.
>
> The patch for cmodule.c looks about right.
>
> For rpath.py I have a concern: From earlier testing, I found try/except to be about 3-4x slower than if/else, with (almost) no penalty if the except branch is not taken. So, although your patch would cause (almost) no penalty for non-Windows, there is a slow down for Windows. Maybe the solution is to bind the setdata function when rdiff-backup is loaded or connections are made (based on os.name) -- an extra round-trip to the remote inside the function is certainly a no-no...
>
> Looking in the CVS logs, it looks like cmodule.c was introduced six years ago as an "up to 3x speed improvement". It would be interesting to see if that is still the case today. If it isn't, then, heck, maybe we can just scrap C.make_file_dict(). I think the old way breaks down like this:

I just tried this on my server, running a local->local backup with a little under 90K files. Here are the timings:

Initial run:

cmodule:
real    8m38.344s
user    0m48.100s
sys     7m49.830s

Python lstat():
real    8m46.321s
user    0m58.550s
sys     7m46.940s


Subsequent run:

cmodule:
real    0m36.551s
user    0m35.460s
sys     0m0.970s

Python lstat():
real    0m38.154s
user    0m37.180s
sys     0m0.800s


I think that the difference here (user CPU: about 10 seconds on the initial run and under 2 seconds on the subsequent run) is not enough to justify maintaining a C module. We simply have much faster CPUs than we did in 2002. My reasoning is that the worst-case CPU scenarios would be small NAS boxes, like the one mentioned a couple of days ago. But in those cases, I think this overhead is negligible compared to the overhead of SSH. What do you think?

I also realized that my previous patch (with the try/except) actually adds an extra roundtrip for Windows servers, since it would make one call for make_file_dict (which would always throw), and a second that would actually work. Since I have clients backing up many files over a 300ms latency link, this would be a real problem!
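To make the round-trip cost concrete, here is a rough sketch. It is not rdiff-backup code: fast_stat, portable_stat, choose_stat and extra_seconds are hypothetical names, the call counter stands in for remote round trips, and the estimate assumes one extra sequential round trip per file with nothing pipelined. The 90,000-file figure is borrowed from the local test above purely for illustration.

```python
# Hypothetical stand-ins; each call to either function counts as one round trip.
calls = {"n": 0}

def fast_stat(path):
    """Stand-in for the C-backed call, assumed to always fail on this platform."""
    calls["n"] += 1
    raise OSError("not supported on this platform")

def portable_stat(path):
    """Stand-in for the pure-Python call that works everywhere."""
    calls["n"] += 1
    return {"type": "reg"}

def stat_with_fallback(path):
    """try/except dispatch: pays for the failing call on every single file."""
    try:
        return fast_stat(path)
    except OSError:
        return portable_stat(path)

def choose_stat(platform_supports_fast):
    """Pick the implementation once, up front, instead of once per file."""
    return fast_stat if platform_supports_fast else portable_stat

def extra_seconds(num_files, extra_calls_per_file, rtt_ms):
    """Added wall-clock time from extra sequential round trips."""
    return num_files * extra_calls_per_file * rtt_ms / 1000

# Per-file fallback costs two calls; a function bound once costs one.
calls["n"] = 0
stat_with_fallback("some/file")
fallback_calls = calls["n"]

calls["n"] = 0
bound_stat = choose_stat(platform_supports_fast=False)
bound_stat("some/file")
bound_calls = calls["n"]

wasted = extra_seconds(90_000, fallback_calls - bound_calls, rtt_ms=300)
print(fallback_calls, bound_calls, wasted / 3600)
```

Under those assumptions the wasted call alone adds hours to a backup of that size, which is why choosing the function once (or dropping the C path entirely) matters far more over a slow link than the CPU difference measured above.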

I vote that we always use what is now make_file_dict_old. I'll submit a patch shortly that does this.
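For readers unfamiliar with the approach, a pure-Python stat wrapper of this kind might look roughly like the sketch below. This is not the actual make_file_dict_old; the function name make_file_dict_sketch and the dictionary keys are assumptions chosen for illustration. It uses only os.lstat() and the stat module from the standard library.

```python
import os
import stat

def make_file_dict_sketch(path):
    """Describe `path` using only os.lstat(); {'type': None} if it is missing."""
    try:
        st = os.lstat(path)  # lstat does not follow symlinks
    except OSError:
        return {"type": None}

    mode = st.st_mode
    if stat.S_ISREG(mode):
        ftype = "reg"
    elif stat.S_ISDIR(mode):
        ftype = "dir"
    elif stat.S_ISLNK(mode):
        ftype = "sym"
    else:
        ftype = "other"  # devices, fifos, sockets lumped together here

    info = {
        "type": ftype,
        "perms": stat.S_IMODE(mode),
        "uid": st.st_uid,
        "gid": st.st_gid,
        "mtime": int(st.st_mtime),
    }
    if ftype == "reg":
        info["size"] = st.st_size
    elif ftype == "sym":
        info["linkname"] = os.readlink(path)
    return info
```

Because it makes a single lstat() call per path and never raises for a missing file, the same function can be used on every platform without a per-call fallback.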

Looking forward to your feedback,

JoshN




