qemu-devel

Re: [Qemu-devel] [PATCH v10] Support vhd type VHD_DIFFERENCING


From: Xiaodong Gong
Subject: Re: [Qemu-devel] [PATCH v10] Support vhd type VHD_DIFFERENCING
Date: Wed, 11 Mar 2015 14:22:59 +0800


On 8 March 2015 at 18:53, "Philipp Hahn" <address@hidden> wrote:
>
> Hello,
>
> On 08.03.2015 02:53, Xiaodong Gong wrote:
> > The encoding of the parent location must be UTF-8 or UTF-16LE, according
> > to the draft
>
> Yes, the SPEC for VPC/VHD specifies the character encoding to use, which
> is good for being portable.
>
> > ASCII is the encoding used to store the parent location string in
> > memory and to pass it to fopen()
>
> No: For the (Linux) kernel the filename is a sequence of 8 bit bytes,
> where only '\0'=end_of_string and '/'=path_separator are handled
> specially. All other bytes have no special meaning and are passed in and
> out as is.
>
> Only the applications are doing the character encoding. Normally this is
> not a problem as you setup your system once with one encoding (nowadays
> UTF-8) and use that consistently: If you enter ä on the keyboard,
> the kernel's input layer returns \u00E4 as the two-byte UTF-8 sequence
> > $ echo -n ä | xxd -g 1
> > 0000000: c3 a4
> Any application can either just pass the byte sequence around as a CLOB
> (or use any other encoding internally - but then it must know that the
> input-encoding is UTF-8), but when again doing any system call, they
> will again pass that same byte sequence as the file-name, which the
> kernel will store on disk.
> If you take that disk to another computer, which does NOT use Unicode,
> you have a problem: If, for example, that one is still using the old
> ISO-8859-1 encoding used in western Europe, your file will be named
> differently:
> > $ echo -n ä | iconv -f ISO-8859-1 -t UTF-8
> > Ã¤
>
> (The reverse is even more painful, as not every ISO-8859-1 character
> sequence is a valid UTF-8 byte sequence - several years back when I
> moved from my old ISO-8859-1 to a more modern UTF-8 setup, I had to
> rename lots of files to be readable again)
>
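The point above, that a byte sequence written under ISO-8859-1 need not be valid UTF-8, can be checked mechanically. The following is a minimal sketch in C; `is_valid_utf8` is a hypothetical helper written for illustration, not a function from QEMU or vhd-util, and it only performs a simplified structural check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified structural UTF-8 check (hypothetical helper).
 * Rejects stray continuation bytes, truncated sequences, the overlong
 * lead bytes 0xC0/0xC1 and lead bytes above 0xF4. It does NOT reject
 * overlong 3- or 4-byte forms or encoded surrogates. */
bool is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        size_t need; /* number of continuation bytes expected */

        if (c < 0x80) {
            need = 0;
        } else if (c >= 0xC2 && c <= 0xDF) {
            need = 1;
        } else if (c >= 0xE0 && c <= 0xEF) {
            need = 2;
        } else if (c >= 0xF0 && c <= 0xF4) {
            need = 3;
        } else {
            return false; /* continuation byte or invalid lead byte */
        }

        if (len - i - 1 < need) {
            return false; /* sequence is cut off at the end of the buffer */
        }
        for (size_t k = 1; k <= need; k++) {
            if ((buf[i + k] & 0xC0) != 0x80) {
                return false; /* expected a 10xxxxxx continuation byte */
            }
        }
        i += need + 1;
    }
    return true;
}
```

With this check, the UTF-8 bytes `c3 a4` for ä are accepted, while the single ISO-8859-1 byte `e4` for the same character is rejected because its two continuation bytes are missing.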
> You can even test that locally on one system by creating a file
> containing an umlaut in its name and then displaying it in a non-UTF-8
> terminal / environment:
> > $ touch ä
> > $ LANG=C ls -NQ
> > "\303\244"
>
> > ASCII needs to be translated to another encoding according to LANG when
> > showing the information of the vhd file using the qemu-info and so on
>
> No: your assumption that ASCII is used is IMHO wrong: ASCII is only 7
> bit, but the kernel interface is 8 bit. The terminal input and output
> layers nowadays use UTF-8, so as long as you're working on the console
> everything is fine. If you mix in GUIs and libraries doing their own
> encoding/decoding, things get more interesting.
>
> But when you do explicit character conversion like you do for VHD, you
> must honor the user-configured character encoding of the environment
> yourself, that is, use LC_CTYPE for any conversion of input and of
> output, which includes file names.
>
> I checked xen/tools/blktap2/vpc/lib/libvhd.c #
> vhd_initialize_header_parent_name()
> which also (wrongly) assumes ASCII. Because of that, creating a snapshot
> using vhd-utils is also broken:
>
> > $ /usr/bin/vhd-util create -n ä.vhd -s 1
> > $ /usr/bin/vhd-util snapshot -n snap.vhd -p ä.vhd ; echo $?
> > 84
>
> Next I checked
> <https://technet.microsoft.com/de-de/library/gg318052%28v=ws.10%29.aspx>
> to create a VHD using umlauts with Windows 7:
>
> > cmd # as Admin
> > diskpart
> > create vdisk file="C:\ä.vhd" maximum=2000 type=expandable
> > create vdisk file="C:\snap.vhd" parent="C:\ä.vhd"
>
> But vhd-utils from Xen is broken:
>
> > $ /usr/bin/vhd-util read -n snap.vhd -p
> > VHD Header Summary:
> ...
> > Parent name         : failed to read name
> ...
> > VHD Parent Locators:
> > --------------------
> > locator:            : 0
> ....
> > failed to read parent name
>
> With the attached patch it works:
>
> > VHD Header Summary:
> > -------------------
> ...
> > Parent name         : /ä.vhd
> ...
> > VHD Parent Locators:
> > --------------------
> > locator:            : 0
> >        code         : PLAT_CODE_W2KU
> ...
> >        decoded name : /ä.vhd
> >
> > locator:            : 1
> >        code         : PLAT_CODE_W2RU
> ...
> >        decoded name : ./ä.vhd
>
> Hope that clarified things.
>
> Philipp

First, your patch is very clear, a good example.

What I said before about storing ASCII in the kernel was a mistake. I meant that glibc functions such as fopen(path) need their arguments to be ASCII.

I think:

iconv_open(utf16le, ascii) in encode
iconv_open(ascii, utf16le) in decode
iconv_open(codeset, ascii) in show
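
For comparison, the decode step could be sketched as below. This is only an illustration under stated assumptions, not the patch itself: `utf16le_to_codeset` and `utf16le_to_locale` are hypothetical names, and the sketch converts straight to the codeset of the user's locale (as suggested in the quoted mail) instead of going through ASCII. Only standard calls are used: `iconv_open(tocode, fromcode)`, `iconv`, `iconv_close` and `nl_langinfo(CODESET)`.

```c
#include <errno.h>
#include <iconv.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/* Convert a UTF-16LE buffer into the encoding named by tocode and
 * NUL-terminate the result. Returns 0 on success, -1 on failure.
 * Hypothetical helper for illustration only. */
int utf16le_to_codeset(const char *tocode,
                       const char *in, size_t in_len,
                       char *out, size_t out_size)
{
    iconv_t cd;
    char *inp = (char *)in;   /* iconv() takes char **, input is not modified */
    char *outp = out;
    size_t inleft = in_len;
    size_t outleft;
    size_t rc;
    int saved_errno;

    if (out_size == 0) {
        errno = E2BIG;
        return -1;
    }
    outleft = out_size - 1;   /* keep one byte for the terminating NUL */

    /* Note the argument order: destination encoding first, source second. */
    cd = iconv_open(tocode, "UTF-16LE");
    if (cd == (iconv_t)-1) {
        return -1;
    }

    rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rc != (size_t)-1) {
        /* Flush any pending shift state for stateful target encodings. */
        rc = iconv(cd, NULL, NULL, &outp, &outleft);
    }
    saved_errno = errno;
    iconv_close(cd);

    if (rc == (size_t)-1) {
        errno = saved_errno;
        return -1;
    }
    *outp = '\0';
    return 0;
}

/* Decode for display or for fopen() using the user's locale codeset.
 * Assumes the program called setlocale(LC_CTYPE, "") at startup;
 * otherwise nl_langinfo(CODESET) reports the codeset of the "C" locale. */
int utf16le_to_locale(const char *in, size_t in_len,
                      char *out, size_t out_size)
{
    return utf16le_to_codeset(nl_langinfo(CODESET), in, in_len, out, out_size);
}
```

Called with "UTF-8" as the target, the UTF-16LE name ä.vhd comes back as the bytes c3 a4 followed by ".vhd". With "ASCII" as the target the same name cannot be represented (glibc reports EILSEQ), which is the failure the ä.vhd example in the quoted mail runs into.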

