Re: [igraph] Writing grapheme with weird characters
From: Tamas Nepusz
Date: Mon, 17 Oct 2011 16:34:33 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:7.0.1) Gecko/20110929 Thunderbird/7.0.1
Hi Victor,
Technically, the CDATA tag is not required -- if the attribute values
contain characters like "<", ">", "&", "'" or the quotation mark itself,
igraph will escape them using a standard &-based escape sequence. All other
characters should be encoded in UTF-8 and igraph will print them as usual.
E.g., in Python:
>>> from igraph import Graph
>>> g = Graph()
>>> g["name"] = u"\u1234 < > & '"
>>> g.write_graphml("test.graphml")
yields the following GraphML file:
<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns
http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
<!-- Created by igraph -->
<key id="name" for="graph" attr.name="name" attr.type="string"/>
<graph id="G" edgedefault="undirected">
<data key="name">ሴ &lt; &gt; &amp; &apos;</data>
</graph>
</graphml>
where the first character of the attribute value is the Unicode character
with code 1234 (in hexadecimal). According to an XML validator at
http://www.validome.org/xml/validate/, the generated file is perfectly valid
XML.
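The escaping described above can also be checked without igraph, using only the Python standard library. The following is a minimal sketch, not igraph's own code: `escape_attribute_value` is a hypothetical helper name, and its entity table is an assumption based on the five characters listed earlier in this message.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def escape_attribute_value(value):
    """Escape the five XML special characters and leave everything else alone."""
    # escape() handles &, < and > itself; the two quote characters are
    # supplied as extra entities (hypothetical table, mirroring the list above).
    return escape(value, {'"': "&quot;", "'": "&apos;"})


original = "\u1234 < > & ' \""
escaped = escape_attribute_value(original)
document = '<data key="name">' + escaped + "</data>"

# Any conforming XML parser turns the escape sequences back into the
# original characters, which is why a CDATA section is not needed.
parsed = ET.fromstring(document).text
```

Here `parsed` compares equal to `original`, and the non-ASCII character passes through untouched because only the five special characters are rewritten.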
Footnote: there is a catch when you load the GraphML file back into igraph
in Python 2. Since Python 2 has separate data types for Unicode strings and
"normal" (byte) strings, the "name" attribute will be a byte string
containing the original text in UTF-8 encoded form, and you must convert it
back to Unicode manually as follows:
>>> g["name"] = g["name"].decode("utf-8")
>>> g["name"]
u"\u1234 < > & '"
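The decoding step can be sketched without igraph as well. The snippet below assumes an interpreter where byte strings and text are distinct types (`bytes` and `str` in Python 3). `to_text` is a hypothetical helper, not part of igraph, and it makes no claim about which type a given igraph version returns.

```python
def to_text(value, encoding="utf-8"):
    """Return value as text, decoding it only if it is a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# The attribute value from the example, as UTF-8 encoded bytes.
raw = "\u1234 < > & '".encode("utf-8")
text = to_text(raw)
```

Calling `to_text` on a value that is already text returns it unchanged, so the helper is safe to apply regardless of which type comes back.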
Cheers,
Tamas
On 10/17/2011 03:59 PM, Víctor Pascual Cid wrote:
> Hi all,
>
> I need to generate a GraphML file whose nodes contain some weird characters. The
> way to deal with strange characters in XML is to use the tag CDATA. However,
> I haven't seen this possibility with write.graph(g, format="graphml").
> Any hint or workaround to solve this problem?
>
> Cheers,
>
> Víctor