Re: [wdiff-bugs] [patch] html support

From: Denver Gingerich
Subject: Re: [wdiff-bugs] [patch] html support
Date: Tue, 29 Jan 2008 13:33:27 -0500

On Jan 10, 2008 1:10 PM, Denver Gingerich <address@hidden> wrote:
> On Jan 6, 2008 10:40 PM, Andrew Clausen <address@hidden> wrote:
> > Hi Denver,
> >
> > On Sun, Jan 06, 2008 at 01:58:18PM -0500, Denver Gingerich wrote:
> > > Thanks for the patch.  I'm not sure what the conventions are for GNU
> > > command-line tools with respect to outputting HTML.  It seems that it
> > > might be better to have an HTML post-processor that takes the normal
> > > output of wdiff and converts it to HTML.  That way wdiff only has to
> > > worry about one type of formatting (plain text).  This appears to be
> > > what the GNU diff people expect since GNU diff doesn't natively
> > > support HTML output.
> >
> > From a usability point of view, I think it's desirable to have a single
> > wdiff front-end command to handle everything.  (It's easier to find
> > what you want with a single front-end, and the options in all the various
> > output formats are likely to overlap substantially.)
> >
> > From a maintainability point of view, I don't see a big advantage from
> > having HTML output generated via a wdiff post-processor.  Most of the
> > code would be parsing wdiff's output rather than generating html.
>
> One should consider where HTML output would be used most.  In most
> cases, HTML-ized diff output is used in web-based version control
> viewing systems (such as ViewVC).  Since wdiff works better than diff
> for long lines (which are more common in written text than in source
> code), it might also be used in web-based document histories, such as
> those provided by MediaWiki.
>
> In both cases, the request is made via a web interface (i.e., by
> clicking a link that says "compare with previous revision") and the
> response is provided via a web interface (the HTML-ized diff output).
> As a result, implementing HTML-ized output in wdiff does not make
> sense because its input is from a command line and its output is to a
> command line.
>
> Now you could make the argument that wdiff could be run from within a
> web scripting language (e.g., in PHP: "$diff = `wdiff --html a.txt
> b.txt`").  However, this is generally considered to be a hack and for
> good reason.  First of all, it requires a significant amount of
> processing overhead in converting the input data to files and starting
> a new process on the web server.  Secondly, it makes dependencies
> difficult to trace because a web application using wdiff needs to
> specify that it requires wdiff to be installed on the web server.
> Generally web server administrators prefer installing plugins for the
> web server to installing command-line applications.  Additionally, not
> all web servers will support running command-line applications (i.e.,
> the above PHP command will not work) for security reasons.
>
> I believe it is best for this sort of thing to be done in a library or
> a dynamically-loadable web server module.  Examples of this are the
> use of Python's difflib in ViewVC [1] and MediaWiki's use of wikidiff2
> [2] (a dynamically-loadable module).  wikidiff2 does include a
> command-line version that prints HTML to standard output, but it is
> exclusively for testing purposes.
>
> So the best bet for getting this into wdiff is to abstract out the
> diffing part of wdiff into a library, make wdiff use that library, and
> then make a dynamically-loadable module that uses the library to
> produce HTML output.  Unfortunately, this is unlikely to work properly
> until wdiff uses diff as a library, because wdiff currently exec()s
> the diff command directly (which I consider a hack).  That makes using
> the current wdiff code as a dynamically-loaded module as ugly as
> using wdiff directly from the command line: your web server would
> simply depend on having the command-line "diff" tool installed instead
> of "wdiff".
>
> If you are interested in doing this (making wdiff use a diff library
> instead of calling diff directly), I would encourage you and could
> probably provide some help.  This is on my long-term todo list for
> wdiff anyway.
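
To make the library approach from the quoted message concrete, here is a
minimal sketch in Python, using difflib (the same standard-library module
ViewVC relies on) to produce an HTML word diff in-process, with no temporary
files and no child process.  The names tokenize and html_word_diff are
hypothetical, and the <del>/<ins> markup is an assumption for illustration;
it is not the output format of the submitted patch.

```python
import difflib
import html
import re


def tokenize(text):
    """Split text into words and whitespace runs so spacing survives the diff."""
    return re.findall(r"\s+|\S+", text)


def html_word_diff(old, new):
    """Return an HTML fragment with removed words in <del> and added words in <ins>.

    Hypothetical helper: the markup choice is an assumption, not wdiff's format.
    """
    a, b = tokenize(old), tokenize(new)
    # autojunk=False disables difflib's "popular element" heuristic so that
    # frequent tokens such as single spaces are still matched normally.
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.append(html.escape("".join(a[i1:i2])))
            continue
        if tag in ("delete", "replace"):
            out.append("<del>" + html.escape("".join(a[i1:i2])) + "</del>")
        if tag in ("insert", "replace"):
            out.append("<ins>" + html.escape("".join(b[j1:j2])) + "</ins>")
    return "".join(out)


if __name__ == "__main__":
    print(html_word_diff("the quick fox", "the slow fox"))
```

A web application would call html_word_diff() directly on the two revisions
it already holds in memory, which is the point of the library design: the
only dependency is the language's own standard library.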

If you're still interested in making a tool that produces HTML output,
I suggest looking at the diffseq module in gnulib [4], which I learned
about after inquiring as to how one might split diff into a library
and a command-line tool [5].  I'm not sure if diffseq will do
word-wise diffing, but it shouldn't be too hard to modify it so it does.

I will likely be moving wdiff to using diffseq after I finish all the
other cleanup that needs doing.  You may want to check back on the
diffseq module after that happens as it (or a similar module) will
definitely support word-wise diffing by then.
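
As a rough illustration of what word-wise diffing on top of a generic
sequence-comparison routine looks like, here is a minimal Python sketch.  It
uses difflib.SequenceMatcher as a stand-in; it is not diffseq's API, and the
function name word_diff is hypothetical.  The only word-specific step is
splitting the input into words before comparing; the [-...-] and {+...+}
markers follow wdiff's default output style.

```python
import difflib


def word_diff(old, new):
    """Diff two strings word by word, marking changes in wdiff's default style.

    Hypothetical helper for illustration; whitespace is normalised to single
    spaces, which the real wdiff does not do.
    """
    a, b = old.split(), new.split()
    # The matcher only sees two sequences of hashable items; feeding it words
    # instead of lines is what makes the comparison word-wise.
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            pieces.extend(a[i1:i2])
            continue
        if i2 > i1:  # words present only in the old text
            pieces.append("[-" + " ".join(a[i1:i2]) + "-]")
        if j2 > j1:  # words present only in the new text
            pieces.append("{+" + " ".join(b[j1:j2]) + "+}")
    return " ".join(pieces)


if __name__ == "__main__":
    print(word_diff("wdiff calls diff directly", "wdiff uses diff as a library"))
```

The same split between a generic comparison core and a thin tokenising and
formatting layer is what would let one library back both a command-line
tool and an HTML-producing module.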

Sorry I won't be adding your patch to wdiff.  I wish you all the best
in your work on free software projects.


> 1. http://viewvc.tigris.org/svn/viewvc/trunk/lib/idiff.py
> 2. http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/wikidiff2/
> 3. http://packages.debian.org/changelogs/pool/main/w/wdiff/wdiff_0.5-16/changelog
5. http://lists.gnu.org/archive/html/bug-gnu-utils/2008-01/msg00040.html
