lilypond-user

Re: XML to .ly and Lilypond, again


From: David Wright
Subject: Re: XML to .ly and Lilypond, again
Date: Fri, 12 May 2017 15:28:48 -0500
User-agent: Mutt/1.5.21 (2010-09-15)

On Fri 12 May 2017 at 12:03:36 (+0200), David Kastrup wrote:
> Urs Liska <address@hidden> writes:
> 
> > On 12.05.2017 at 11:26, Jan-Peter Voigt wrote:
> >> Dear Leszek,
> >>
> >> when I look with hexdump, I see a lot of NUL-bytes inside the file.
> >> And it seems to me that the strings are in UTF-16 and the rest in
> >> latin-1 or the like. At least it seems like a mixture of encodings.
> >
> > I think this is exactly what was discussed in the mentioned recent thread.
> > So it seems this is an issue in musicxml2ly that has recently been
> > introduced. Does anyone know or can anyone find out when this happened
> > (by trying musicxml2ly from different LilyPond versions)?
> 
> My guess would rather be on some auto-encoding/decoding choice by
> Python, possibly triggered by badly (or unexpectedly?) encoded material
> in the MusicXML file.

My observations are:
. The downloaded XML file probably came from Windows, as it
  has CR-LF line endings.
. musicxml2ly 2.18.2 handles the file just fine.
. musicxml2ly 2.19.49 can't handle writing © to the output.
. With the © replaced by C in the XML file, 2.19.49 can process
  the file, but some of the output is encoded wrongly (UTF-16?).
. Strings that might sometimes be expected to contain non-US-ASCII
  characters are the ones mainly affected, even where the string
  here happens to be all US-ASCII (Vivace, Allegro).
. It's noticeable that the only other parts of the output encoded badly
  are \key commands and the final \midi and \tempo commands.
. None of the strings introduced by = is affected.
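
Observations like these can be checked roughly with a few lines of
Python. This is a diagnostic sketch only: the helper names can_encode
and find_nul_lines are hypothetical and not part of musicxml2ly, and
the sample bytes are made up to stand in for a damaged .ly file. It
assumes the badly encoded strings are UTF-16, as the hexdump suggested.

```python
# Diagnostic sketch only; these helpers are hypothetical and are
# not part of musicxml2ly.

def can_encode(text, encoding):
    """Return True if text can be written out in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def find_nul_lines(data):
    """Return 1-based numbers of the lines in data (bytes) containing NUL.

    NUL bytes in an otherwise 8-bit text file suggest that some
    strings were written as UTF-16.
    """
    return [n for n, line in enumerate(data.split(b"\n"), start=1)
            if b"\x00" in line]


# Made-up sample standing in for a damaged .ly file (assumption).
sample = (b'\\version "2.19.49"\n'
          + "Vivace".encode("utf-16-le") + b"\n"
          + b"c4 d e f\n")

print(can_encode("\u00a9", "ascii"))  # the copyright sign is not ASCII
print(find_nul_lines(sample))         # lines that contain NUL bytes
```

Running find_nul_lines over the real output would list the lines
worth inspecting with hexdump.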

My guess is that the wrong encoding comes from "original" strings
being passed from the input to the output as opposed to all the
normal strings that are generated by the conversion program. These
original strings would carry their encoding with them.
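
A toy model of that guess, not musicxml2ly's actual code: the function
write_mixed is hypothetical, and the choice of latin-1 for generated
text and UTF-16 (little-endian) for passed-through text is an
assumption taken from the hexdump description earlier in the thread.
It only shows how an all-ASCII string such as Vivace would still pick
up NUL bytes if it kept a different encoding from the text around it.

```python
# Toy model only; write_mixed is hypothetical and the encodings
# are assumptions, not taken from the musicxml2ly source.

def write_mixed(generated, passed_through):
    """Concatenate generated text (8-bit) with passed-through text (UTF-16)."""
    out = bytearray()
    out += generated.encode("latin-1")         # text produced by the converter
    out += passed_through.encode("utf-16-le")  # text carried over from the input
    return bytes(out)


mixed = write_mixed("\\tempo ", "Vivace")
print(mixed)

# The generated prefix is clean; the passed-through part has a NUL
# after every character even though it is pure US-ASCII.
prefix_len = len("\\tempo ")
print(mixed[:prefix_len])
print(mixed[prefix_len:].decode("utf-16-le"))
```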

Things that would fit into that model are the titles etc.
(I suspect the way the credit_dict dictionary is handled),
the names of the (musical) keys (perhaps surviving a change
from upper to lower case), and the midi tempo, "92".

Cheers,
David.


