help-gnu-emacs

From: Oleksandr Gavenko
Subject: Re: Look for data serialisation format to implement communication between Emacs and external program.
Date: Mon, 07 Jan 2013 15:53:57 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.2 (gnu/linux)

On 2013-01-07, Helmut Eller wrote:

> On Sun, Jan 06 2013, Oleksandr Gavenko wrote:
>
>> Is that right to use ASN.1 BER as serialisation data format for communication
>> between Emacs and external program?
>
> S-expressions is the only format that Emacs can write and parse quickly
> because the printer and reader are implemented in C.  This is likely 10
> times faster than any parser that you write in Emacs Lisp.  The downside
> is that the external program needs to be able to do the same.  Not such
> a bad tradeoff as S-expressions are fairly easy to parse.
>
> For communication with an external format I recommend a "framed" format:
> a frame is a fixed sized header followed by a variable length payload.
> The header describes the length of the frame.  The length should be in
> bytes (not characters as counting characters in UTF8 strings is
> unnecessarily complicated).  Knowing the length of the frame is very
> useful because that makes it easy to wait for a complete frame.  After
> you received a complete frame, parsing is simpler because you don't have
> to worry about incomplete input.
>
> I also recommend to limit the frame length to 24 bits (not 32 bit)
> because Emacs fixnums are limited to 29 bits on 32 bit machines.
>
> The payload can then be an S-expression printed with the Emacs prin1 and
> parsed back with the read function.  The encoding of the payload can be
> utf-8.  But use the Emacs 'binary coding system for communication with
> the external process and unibyte buffers for parsing.  For the
> binary-to-utf8 conversion of the payload use something like
> decode-coding-string (which is written in C and should be fast).
>
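To make the framing idea concrete, here is a minimal sketch of how I read it. The header layout (six ASCII hex digits, which caps the length at 24 bits) and the names my-frame-encode and my-frame-decode are my own assumptions, not something specified above:

```elisp
;; Hypothetical helpers.  The 6-hex-digit header is an assumption made
;; for illustration; any fixed-size header carrying the length would do.
(defun my-frame-encode (object)
  "Return OBJECT as a unibyte frame: 6 hex digits of length, then payload."
  (let* ((payload (encode-coding-string (prin1-to-string object) 'utf-8))
         (len (length payload)))        ; length in bytes, not characters
    (when (> len #xffffff)
      (error "Payload too large for a 24-bit length: %d bytes" len))
    (concat (format "%06x" len) payload)))

(defun my-frame-decode (bytes)
  "Decode one frame from the unibyte string BYTES.
Return (OBJECT . REST), or nil if BYTES holds no complete frame yet."
  (when (>= (length bytes) 6)
    (let* ((len (string-to-number (substring bytes 0 6) 16))
           (end (+ 6 len)))
      ;; Only parse once the whole payload has arrived.
      (when (>= (length bytes) end)
        (cons (car (read-from-string
                    (decode-coding-string (substring bytes 6 end) 'utf-8)))
              (substring bytes end))))))
```

A process filter could append incoming data to a unibyte accumulator and call my-frame-decode until it returns nil. This assumes the process uses the binary coding system in both directions, e.g. via (set-process-coding-system proc 'binary 'binary).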
This seems like a good solution in the case of Emacs:

  (assoc ':title
         (read "((:type blog-entry) (:title \"Hello\") (:article \"world!\"))"))
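Since assoc returns the whole (KEY VALUE) pair, getting at the value takes one more step. A small sketch, where my-entry-field is a hypothetical helper name and not part of Emacs:

```elisp
;; `my-entry-field' is a hypothetical helper name, not part of Emacs.
(defun my-entry-field (key entry)
  "Return the value stored under KEY in ENTRY, a list of (KEY VALUE) pairs."
  (cadr (assoc key entry)))

(let ((entry (read "((:type blog-entry) (:title \"Hello\") (:article \"world!\"))")))
  (list (my-entry-field :type entry)
        (my-entry-field :title entry)))
;; => (blog-entry "Hello")
```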

Data validation:

  (read ")")   ;; ==> invalid-read-syntax

or when assoc returns an unknown ":type", etc.
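Both checks can be folded into one function. This is only a sketch: the name my-parse-entry and the list of accepted :type values are assumptions made for illustration.

```elisp
(defun my-parse-entry (text)
  "Parse TEXT as a list of (KEY VALUE) pairs and return it.
Return nil if TEXT is malformed or its :type is not recognised.
Hypothetical helper: the accepted types below are an assumption."
  (condition-case nil
      (let ((entry (car (read-from-string text))))
        ;; `memq' is non-nil only for a known :type symbol.
        (and (memq (car-safe (cdr-safe (assq :type entry))) '(blog-entry))
             entry))
    ;; `invalid-read-syntax' (e.g. ")"), `end-of-file' (truncated input)
    ;; and `wrong-type-argument' (not a list) are all `error' conditions.
    (error nil)))
```

Returning nil for every failure keeps the caller simple. If the external program should be told what went wrong, the handler could return the error data instead.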

The only annoying thing is escaping (like &lt;div&gt;hello&lt;/div&gt; for
<div>hello</div> in XML, or as in PPP's HDLC-like framing, where 0x7e is
escaped as 0x7d 0x5e and the escape character 0x7d as 0x7d 0x5d).
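On the Emacs side, at least, prin1 and read do the escaping of quotes and backslashes themselves. A quick sketch to check this, with my-round-trip as a hypothetical helper name:

```elisp
(defun my-round-trip (object)
  "Print OBJECT with `prin1' and read it back (hypothetical helper)."
  (car (read-from-string (prin1-to-string object))))

;; Double quotes and backslashes in the string survive unchanged.
(equal (my-round-trip "<div class=\"a\">1 \\ 2</div>")
       "<div class=\"a\">1 \\ 2</div>")
;; => t
```

The external program still has to implement the same string escaping rules when it writes or parses S-expressions.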

> If you like, you can also use extra bits in the header to indicate the
> format of the payload.  E.g. it might be useful to have frames that
> contain only plain strings (not encoded as S-expr).
>
I started out with a custom TLV data format, but its parsing and validation
are hand-written, so I decided to ask for suggestions...

-- 
Best regards!



