From: Dmitry Antipov
Subject: Re: Using empty_string as the only "" string
Date: Thu, 26 Apr 2007 18:24:01 +0400
User-agent: Thunderbird 1.5.0.7 (X11/20061008)

Stefan Monnier wrote:
> PS: But if you're interested in such small optimizations, I have another
> one in my local Emacs where the Lisp_String data type is changed to:
>
>   struct Lisp_String
>   {
>     EMACS_INT size;
>     EMACS_INT size_byte : BITS_PER_EMACS_INT - 1;
>     unsigned inlined : 1; /* 0 -> ptr, 1 -> chars; in union below.  */
>     INTERVAL intervals;   /* text properties in this string */
>     union
>     {
>       unsigned char *ptr;
>       unsigned char chars[STRING_MAXINLINE];
>     } data;
>   };
>
> this way, on 32bit systems, strings of up to 3 bytes can be represented
> with just a Lisp_String without any `sdata'.  On 64bit systems, this can
> be used for strings up to 7 bytes long (i.e. almost 50% of all allocated
> strings, IIRC).  And it can also be used for all the strings in the pure
> space (no matter how long), so it saves about 50KB of pure space (can't
> remember the exact number, but IIRC it was more than 10KB and less than
> 100KB).
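For readers who want to try the quoted layout, here is a minimal sketch of how the `inlined` flag might be consumed. It is not Stefan's actual patch. The typedefs are stand-ins for the Emacs definitions, STRING_MAXINLINE is assumed to be the size of a pointer (which matches the 3-byte and 7-byte limits quoted above once the trailing '\0' is counted), and string_init and string_data are hypothetical helpers.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the Emacs definitions (assumptions for this sketch).  */
typedef ptrdiff_t EMACS_INT;
typedef void *INTERVAL;
#define BITS_PER_EMACS_INT (sizeof (EMACS_INT) * CHAR_BIT)
/* Assumed value: the inline buffer reuses the space of the pointer.  */
#define STRING_MAXINLINE (sizeof (unsigned char *))

struct Lisp_String
{
  EMACS_INT size;
  EMACS_INT size_byte : BITS_PER_EMACS_INT - 1;
  unsigned inlined : 1;   /* 0 -> ptr, 1 -> chars; in union below.  */
  INTERVAL intervals;     /* text properties in this string */
  union
  {
    unsigned char *ptr;
    unsigned char chars[STRING_MAXINLINE];
  } data;
};

/* Return the string's bytes, wherever they are stored.  */
const unsigned char *
string_data (const struct Lisp_String *s)
{
  return s->inlined ? s->data.chars : s->data.ptr;
}

/* Hypothetical helper: store TEXT inline when it fits together with its
   trailing '\0', otherwise copy it into the caller-supplied SDATA
   buffer and point at that.  */
void
string_init (struct Lisp_String *s, const char *text, unsigned char *sdata)
{
  size_t len = strlen (text);

  s->size = (EMACS_INT) len;
  s->size_byte = (EMACS_INT) len;
  s->intervals = NULL;
  if (len < STRING_MAXINLINE)
    {
      s->inlined = 1;
      memcpy (s->data.chars, text, len + 1);
    }
  else
    {
      s->inlined = 0;
      memcpy (sdata, text, len + 1);
      s->data.ptr = sdata;
    }
}
```

With these assumptions, a string such as "abc" is stored inside the Lisp_String itself on both 32-bit and 64-bit hosts, while anything of pointer size or longer still goes through the pointer.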
I'm interested in _any_ optimization. Here is a brain-damaged :-) Lisp_String
I'm thinking about:

  #define STRING_IMMEDIATE_SIZE (sizeof (EMACS_INT) * 3 - 2)

  struct Lisp_String
  {
    union
    {
      /* Immediate string.  */
      struct
      {
        unsigned immediate : 1;
        unsigned gcmarkbit : 1;
        unsigned size : BITS_PER_CHAR - 1;
        unsigned size_byte : BITS_PER_CHAR - 1;
        unsigned char data[STRING_IMMEDIATE_SIZE];
      } __attribute__ ((packed)) imm;
      /* Contains pointer to sdata.  */
      struct
      {
        unsigned immediate : 1;
        unsigned gcmarkbit : 1;
        unsigned size : BITS_PER_EMACS_INT - 1;
        unsigned size_byte : BITS_PER_EMACS_INT - 1;
        unsigned char *data;
      } __attribute__ ((packed)) dat;
    } u;
    INTERVAL intervals; /* text properties in this string */
  };

This gives a 9-byte "immediate" string on 32-bit and a 21-byte one on 64-bit
(excluding the trailing '\0'). This is not suitable for long pure strings, btw.

Strictly speaking, this is not an optimization - it saves space at the
(minimal ?) cost of speed, since most string operations involve at least
one extra conditional expression. For example,

  #define STRING_BYTES(STR) \
    ((STR)->size_byte < 0 ? (STR)->size : (STR)->size_byte)

becomes the (over?)complicated

  #define __IMM_P(STR) ((STR)->u.imm.immediate)

  #define __IMMSIZE(STR) \
    ((STR)->u.imm.size_byte < 0 ? (STR)->u.imm.size : (STR)->u.imm.size_byte)

  #define __DATSIZE(STR) \
    ((STR)->u.dat.size_byte < 0 ? (STR)->u.dat.size : (STR)->u.dat.size_byte)

  #define STRING_BYTES(STR) \
    (__IMM_P (STR) ? __IMMSIZE (STR) : __DATSIZE (STR))

Dmitry
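To check the size arithmetic and the extra branch, the layout above can be turned into a small compilable sketch. Treat it as an illustration under assumptions, not as Emacs code. It needs GCC or Clang for `__attribute__ ((packed))`, and the typedefs are stand-ins for the Emacs definitions. It also deviates from the declaration above in two places: size_byte is declared signed, because an unsigned bit-field can never compare below zero, and the wide bit-fields use EMACS_INT-sized types, because a plain `unsigned` is too narrow for 63 bits on LP64 hosts. string_set_unibyte is a hypothetical helper.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the Emacs definitions (assumptions for this sketch).  */
typedef ptrdiff_t EMACS_INT;
typedef size_t EMACS_UINT;
typedef void *INTERVAL;
#define BITS_PER_CHAR CHAR_BIT
#define BITS_PER_EMACS_INT (sizeof (EMACS_INT) * CHAR_BIT)
#define STRING_IMMEDIATE_SIZE (sizeof (EMACS_INT) * 3 - 2)

struct Lisp_String
{
  union
  {
    struct                      /* Immediate string.  */
    {
      unsigned immediate : 1;
      unsigned gcmarkbit : 1;
      unsigned size : BITS_PER_CHAR - 1;
      signed size_byte : BITS_PER_CHAR - 1;   /* signed: see lead-in */
      unsigned char data[STRING_IMMEDIATE_SIZE];
    } __attribute__ ((packed)) imm;
    struct                      /* Contains pointer to sdata.  */
    {
      unsigned immediate : 1;
      unsigned gcmarkbit : 1;
      EMACS_UINT size : BITS_PER_EMACS_INT - 1;
      EMACS_INT size_byte : BITS_PER_EMACS_INT - 1;
      unsigned char *data;
    } __attribute__ ((packed)) dat;
  } u;
  INTERVAL intervals;           /* text properties in this string */
};

/* Function form of STRING_BYTES: a negative size_byte marks a unibyte
   string, whose byte length equals its character count.  */
EMACS_INT
string_bytes (const struct Lisp_String *s)
{
  if (s->u.imm.immediate)
    return (s->u.imm.size_byte < 0
            ? (EMACS_INT) s->u.imm.size : (EMACS_INT) s->u.imm.size_byte);
  return (s->u.dat.size_byte < 0
          ? (EMACS_INT) s->u.dat.size : (EMACS_INT) s->u.dat.size_byte);
}

const unsigned char *
string_data (const struct Lisp_String *s)
{
  return s->u.imm.immediate ? s->u.imm.data : s->u.dat.data;
}

/* Hypothetical helper: store the unibyte string TEXT immediately when it
   fits with its trailing '\0', else in the caller-supplied SDATA.  */
void
string_set_unibyte (struct Lisp_String *s, const char *text,
                    unsigned char *sdata)
{
  size_t len = strlen (text);

  memset (s, 0, sizeof *s);
  if (len < STRING_IMMEDIATE_SIZE)
    {
      s->u.imm.immediate = 1;
      s->u.imm.size = (unsigned) len;
      s->u.imm.size_byte = -1;
      memcpy (s->u.imm.data, text, len + 1);
    }
  else
    {
      s->u.dat.immediate = 0;
      s->u.dat.size = len;
      s->u.dat.size_byte = -1;
      memcpy (sdata, text, len + 1);
      s->u.dat.data = sdata;
    }
}
```

Both union members begin with the same two one-bit flags, so reading `u.imm.immediate` is enough to decide which member is live. That test is the extra conditional every accessor has to pay for.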