Re: [rdiff-backup-users] Restarting development

From: Scott Carpenter
Subject: Re: [rdiff-backup-users] Restarting development
Date: Tue, 06 Apr 2010 10:04:39 -0500
User-agent: Thunderbird (X11/20100317)

Josh Nisly spake thusly on 04/06/2010 09:27 AM:

I'm a little torn - I don't want to discourage you from starting from scratch at all, but I think that the current codebase has lots of value in it that makes it worth salvaging. I can understand where you're coming from to start a new version, yet I wonder if you might not be underestimating the amount of work required to bring it to parity with the current codebase. Supporting OS X resource forks and Windows ACLs, for example. A lot of value that I see in rdiff-backup is in its myriad of workarounds for handling all sorts of situations, from simple things like chmod'ing unreadable files temporarily to be able to back them up, to handling backups from a Unix (case-sensitive) file system to a Windows (case-insensitive) one.

Personally, I'd like to have your help developing tests for the current codebase. I really think that if we come up with good functionality tests, we can refactor the codebase to the point where we can start writing unit tests. However, that's certainly less enjoyable than starting from scratch(!), so I can't blame you for not getting excited about that.


This discussion and your comments here remind me of this old post from Joel Spolsky:

"Things you should never do, part I"


I'll include some excerpts below, but first I want to say I'm not necessarily advocating any particular approach to this. I'm a relatively new rdiff-backup user and I love it. It's so much nicer to use than my old approach of my own bash scripts + rsync + hard links. I'm just happy that people are talking about development and maintenance -- I want to see this project thrive. But in the role of a user, I don't want to vote for any development plans. (I will be happy to help with some testing for either version, however.)

But! I think Joel's points are valid and I always think of them when people talk about wanting to start over.

From the article (with the acknowledgement that he may be talking about larger programs, but it seems that rdiff-backup has grown to be quite comprehensive...):

We're programmers. Programmers are, in their hearts, architects, and the first thing they want to do when they get to a site is to bulldoze the place flat and build something grand. We're not excited by incremental renovation: tinkering, improving, planting flower beds.

There's a subtle reason that programmers always want to throw away the code and start over. The reason is that they think the old code is a mess. And here is the interesting observation: they are probably wrong. The reason that they think the old code is a mess is because of a cardinal, fundamental law of programming:

It’s harder to read code than to write it.

This is why code reuse is so hard. This is why everybody on your team has a different function they like to use for splitting strings into arrays of strings. They write their own function because it's easier and more fun than figuring out how the old function works.

As a corollary of this axiom, you can ask almost any programmer today about the code they are working on. "It's a big hairy mess," they will tell you. "I'd like nothing better than to throw it out and start over."

Why is it a mess?

"Well," they say, "look at this function. It is two pages long! None of this stuff belongs in there! I don't know what half of these API calls are for."


Back to that two page function. Yes, I know, it's just a simple function to display a window, but it has grown little hairs and stuff on it and nobody knows why. Well, I'll tell you why: those are bug fixes. One of them fixes that bug that Nancy had when she tried to install the thing on a computer that didn't have Internet Explorer. Another one fixes that bug that occurs in low memory conditions. Another one fixes that bug that occurred when the file is on a floppy disk and the user yanks out the disk in the middle. That LoadLibrary call is ugly but it makes the code work on old versions of Windows 95.

Each of these bugs took weeks of real-world usage before they were found. The programmer might have spent a couple of days reproducing the bug in the lab and fixing it. If it's like a lot of bugs, the fix might be one line of code, or it might even be a couple of characters, but a lot of work and time went into those two characters.

When you throw away code and start from scratch, you are throwing away all that knowledge. All those collected bug fixes. Years of programming work.


Is there an alternative? The consensus seems to be that the old Netscape code base was really bad. Well, it might have been bad, but, you know what? It worked pretty darn well on an awful lot of real world computer systems.

When programmers say that their code is a holy mess (as they always do), there are three kinds of things that are wrong with it.
