bug-gettext

Re: [bug-gettext] msgmerge --diff-to-previous (feature request)


From: Chusslove Illich
Subject: Re: [bug-gettext] msgmerge --diff-to-previous (feature request)
Date: Sun, 30 Oct 2011 20:17:25 +0100
User-agent: KMail/1.9.9

> [: Ineiev :]
> When msgmerge is invoked with both --previous and --diff-to-previous, it
> will insert wdiff output after previous msgids, like
>
>   #, fuzzy
>   #| msgid "(previous msgid)"
>   #| msgid_plural "(previous msgid_plural, if available)"
>   #| msgid_diff "(diff to previous msgid)"
>   #| msgid_plural_diff "(diff to previous msgid_plural)"
>   msgid "(current msgid)"
> [...]
> [...] would it be better if the diffs went into messages prefixed with
> "#<some-other-character>"?

(The general concept of having diffs from previous to current text while
updating fuzzy messages, I have come to consider a necessity. It's great
that you are trying to add that functionality directly into Gettext.)

The idea of having another set of fields for diffs is interesting. In that
direction, I think the second solution would be better: have same-name
fields in a different comment type (e.g. #!). That would be consistent with
the current state, giving msgid, #| msgid, and #! msgid.

>> [: Alexander Potashev :]
>> I suggest adding "diffed-to-previous" flags to all messages that contain
>> wdiffs in previous msgid/msgctxts, or adding a per-file flag into the .po
>> header.
>
> [: Ineiev :]
> I may not understand all usage scenarios, but there is no similar
> "previous" flag, is it?

The idea here was that diffs go directly into existing #| fields, so a flag
(or some other indicator) would be necessary to tell tools that previous
fields now contain diffs. I would favor this solution, because the scenario
with both #| and #! fields is too much clutter, especially when editing the
PO file in a text editor. On the negative side, this would be less
backward compatible, e.g. with dedicated PO editors that automatically
perform diffing. But those should be easy to patch.

Either way -- using another set of fields or embedding into previous
fields -- it has to be done carefully with respect to subsequent merges.
If clean previous fields are not present but diffed previous fields are,
msgmerge should undiff them and use the result as the previous fields when
fuzzy matching.
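The fallback described above can be sketched as follows. This is only an
illustration: the entry is a plain dict, the keys prev_msgid and
prev_msgid_diff are hypothetical names, and the {-...-} / {+...+} markup is
an assumed diff syntax that ignores escaping of literal delimiters.

```python
import re


def undiff_old(diffed):
    """Recover the old text from an embedded diff (assumed markup)."""
    # Added segments do not belong to the old text: drop them.
    text = re.sub(r"\{\+.*?\+\}", "", diffed, flags=re.S)
    # Deleted segments do belong to the old text: unwrap them.
    return re.sub(r"\{-(.*?)-\}", r"\1", text, flags=re.S)


def previous_for_matching(entry):
    """Pick the previous msgid to use for fuzzy matching."""
    if entry.get("prev_msgid") is not None:
        return entry["prev_msgid"]          # clean previous field wins
    if entry.get("prev_msgid_diff") is not None:
        return undiff_old(entry["prev_msgid_diff"])
    return None                             # nothing to match against


entry = {"prev_msgid_diff": "Open {-file-}{+document+} now"}
print(previous_for_matching(entry))
```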

> [...] (in particular, wdiff) [...]

I think that the default wdiff delimiter for deleted text, [-, is not
quite suitable, because it is frequently found verbatim in text (e.g.
command line synopses). wdiff is also somewhat lax at diffing. These two
lines (the second one has a trailing space as well):

  Blah, blah, foo blah, blah:
  Blah, blah, bar blah, blah: 

will produce the diff:

  Blah, blah, [-foo-] {+bar+} blah, blah: 

which has one extra space after -] and no indication of the added
trailing space.
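The whitespace loss can be reproduced with any diff that tokenizes on
whitespace. The sketch below is not wdiff's actual algorithm; it is a
minimal stand-in using Python's difflib, with word_diff as a hypothetical
helper name, to show that the trailing space disappears before the
comparison even starts.

```python
import difflib

old = "Blah, blah, foo blah, blah:"
new = "Blah, blah, bar blah, blah: "  # note the trailing space


def word_diff(old, new):
    """Diff two strings word by word, splitting on whitespace."""
    a, b = old.split(), new.split()   # whitespace is discarded here
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [(tag, a[i1:i2], b[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


print(word_diff(old, new))        # only the foo/bar change is reported
print(word_diff(old, old + " "))  # a pure trailing-space change is invisible
```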

Therefore, rather than leaving the diff format up to an arbitrary command,
it should be fixed and well defined. This would be beneficial both for
translators and especially for tools (e.g. PO syntax highlighting in text
editors). Diffs in PO context should be such that it is possible to
unambiguously recover old and new text (as in the two cases above: a
dedicated PO editor and a subsequent merge). This implies that an escaping
mechanism for diff delimiters should be present as well. I have defined one
such diff format, documented here:
http://pology.nedohodnik.net/doc/user/en_US/ch-diffpatch.html#sec-dpfrmembstr
It has worked nicely so far.
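To illustrate why escaping makes recovery unambiguous, here is a minimal
sketch. It is not the format documented at the link above: the delimiters
{- -} and {+ +}, the backslash escape, and the function names are all
assumptions chosen for the example.

```python
import re


def escape(text):
    """Backslash-escape characters that could be read as markup."""
    return re.sub(r"([\\{}])", r"\\\1", text)


def embed(segments):
    """Build one diff string from (tag, text) pairs; tag is '=', '-' or '+'."""
    out = []
    for tag, text in segments:
        body = escape(text)
        out.append(body if tag == "=" else "{" + tag + body + tag + "}")
    return "".join(out)


def recover(diff):
    """Return (old, new) texts from an embedded diff string."""
    old, new, mode, i = [], [], "=", 0
    while i < len(diff):
        if diff[i] == "\\":                      # escaped literal character
            lit, i = diff[i + 1], i + 2
        elif mode == "=" and diff.startswith(("{-", "{+"), i):
            mode, i = diff[i + 1], i + 2         # open a deletion or addition
            continue
        elif mode != "=" and diff.startswith(mode + "}", i):
            mode, i = "=", i + 2                 # close the current segment
            continue
        else:
            lit, i = diff[i], i + 1
        if mode in "=-":
            old.append(lit)
        if mode in "=+":
            new.append(lit)
    return "".join(old), "".join(new)


segments = [("=", "tool [-"), ("-", "v"), ("+", "q"), ("=", "] {file}")]
diff = embed(segments)
print(diff)
print(recover(diff))
```

Because every literal brace is escaped, text such as a command line
synopsis containing [- or {file} cannot be mistaken for a delimiter.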

What could be variable is the diffing algorithm. For example, I like that
word and non-word segments are diffed differently, such that words are
atomic, but non-words are diffed by character. If an external command were
used, its output should be parsed internally into the canonical diff format.
(But note that calling an external command for each message would be
prohibitively expensive in large-scale merges.)
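One way to get words that are atomic and non-words that are diffed by
character is to tokenize before diffing. The sketch below is an assumption
about how this could be done with Python's difflib, not the algorithm of
any existing tool; tokenize and segment_diff are hypothetical names.

```python
import difflib
import re


def tokenize(text):
    """Split into whole words and single non-word characters."""
    return re.findall(r"\w+|\W", text)


def segment_diff(old, new):
    """Return (tag, text) segments; tag is '=', '-' or '+'."""
    a, b = tokenize(old), tokenize(new)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    segments = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            segments.append(("=", "".join(a[i1:i2])))
            continue
        if i2 > i1:                       # tokens removed from the old text
            segments.append(("-", "".join(a[i1:i2])))
        if j2 > j1:                       # tokens added in the new text
            segments.append(("+", "".join(b[j1:j2])))
    return segments


old = "Blah, blah, foo blah, blah:"
new = "Blah, blah, bar blah, blah: "
for segment in segment_diff(old, new):
    print(segment)
```

Since whitespace and punctuation are kept as tokens, the added trailing
space shows up as its own segment, and both texts can be rebuilt exactly by
joining the '=' segments with the '-' or the '+' segments respectively.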

-- 
Chusslove Illich (Часлав Илић)


