Re: null terminated strings
From: Ken Anderson
Subject: Re: null terminated strings
Date: Mon, 19 Jan 2004 14:16:44 -0500
At 10:46 AM 1/19/2004 -0800, Per Bothner wrote:
>Ken Anderson wrote:
>
>> In Java, which does copy-on-write
>
>Strings (including substrings) are immutable, so they cannot be written.
>The implementation of the StringBuffer class does do copy-on-write, but
>that doesn't affect substrings.
>
>>i often find myself carefully copying the substrings so they don't share
>>structure.
>
>Why? The only reason I can think of is garbage collection: A shared
>substring prevents the base from being collected.
Yes. Say you do something like (this is JScheme):
> (define text "foo bar")
"foo bar"
> (define r (StringReader. text))
address@hidden
> (define b (BufferedReader. r))
address@hidden
> (define line (.readLine b))
"foo bar"
> (define a (.substring line 0 3))
"foo"
> (define b (.substring line 4))
"bar"
> (describe a)
foo
is an instance of java.lang.String
// from java.lang.String
value: address@hidden
offset: 0
count: 3
hash: 0
()
> (describe b)
bar
is an instance of java.lang.String
// from java.lang.String
value: address@hidden
offset: 4
count: 3
hash: 0
()
> (vector-length (.value$# a))
80
a and b share the same 80-element char[], so 6 characters of text keep 160
bytes of character data alive. (80 is the default expected line length that
BufferedReader.readLine uses to size its string buffer.)
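The careful copying mentioned above can be sketched in plain Java. This is a minimal illustration, not code from the thread: the class name SubstringCopy and the helper detachedSubstring are hypothetical. It assumes a JVM of this era, where substring shares the parent's char[] and new String(String) is understood to copy only the characters in use. On JVMs since Java 7u6, substring already copies, so the extra copy is redundant there.

```java
public class SubstringCopy {
    /**
     * Hypothetical helper: returns the substring as a fresh String so that
     * it need not keep the whole line's backing char[] reachable.
     */
    public static String detachedSubstring(String line, int begin, int end) {
        // Assumption: on the older JVMs discussed here, new String(String)
        // copies just the characters in use rather than sharing the array.
        return new String(line.substring(begin, end));
    }

    public static void main(String[] args) {
        String line = "foo bar";
        String a = detachedSubstring(line, 0, 3);
        String b = detachedSubstring(line, 4, line.length());
        System.out.println(a + " " + b); // prints "foo bar"
    }
}
```

The cost is one extra allocation per substring, which is usually worth it when the substrings outlive the line they came from.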
>>This is because of things like:
>>- i don't know how long the underlying string (char array actually) is.
>
>So?
So you don't know how much space your line is taking up.
>>Java only has one kind of string, which is fairly heavy weight. For example,
>>the string "" takes 36 bytes:
>>
>>>(describe "")
>> is an instance of java.lang.String
>> // from java.lang.String
>> value: address@hidden
>> offset: 0
>> count: 0
>> hash: 0
>
>This depends on the implementation, and the version of the
>implementation.
>
>GCJ uses for "":
> object header (4 bytes on 32-bit systems)
> private Object data; /* points to itself in this case */
> private int boffset; /* offset of first char within data */
> int count; /* number of characters */
> private int cachedHashCode;
> /* chars follow if data==this */
>(The data and boffset fields are only accessed by native C++ code.)
>
>Total 20 bytes.
Much better.
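
The 20-byte total is just five 4-byte words, which can be tallied as below. This is only a sketch of the arithmetic: the class GcjStringSize is hypothetical, and it assumes that the reference and each int occupy 4 bytes on the 32-bit system described, with no padding and no character data for "".

```java
public class GcjStringSize {
    // Assumption: header word, reference, and int are each 4 bytes (32-bit system).
    static final int WORD = 4;

    /** Bytes for the empty string under the GCJ field layout quoted above. */
    public static int emptyStringBytes() {
        int header = WORD;          // object header
        int data = WORD;            // reference to the char storage (the object itself here)
        int boffset = WORD;         // offset of the first char within data
        int count = WORD;           // number of characters
        int cachedHashCode = WORD;  // cached hash code
        int chars = 0;              // "" stores no characters
        return header + data + boffset + count + cachedHashCode + chars;
    }

    public static void main(String[] args) {
        System.out.println(emptyStringBytes()); // prints 20
    }
}
```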