gnustep-dev

Re: New ABI NSConstantString


From: David Chisnall
Subject: Re: New ABI NSConstantString
Date: Sun, 1 Apr 2018 12:21:02 +0100

On 1 Apr 2018, at 11:36, Fred Kiefer <address@hidden> wrote:
> 
> Wouldn’t the most useful structure be the one we already use for GSString?

That’s certainly a good starting point!

> 
> @interface GSString : NSString
> {
> @public
>  GSCharPtr _contents;
>  unsigned int _count;

Is this the number of bytes or the number of characters?  I imagine that both 
are useful.

>  struct {
>    unsigned int       wide: 1;        // 16-bit characters in string?
>    unsigned int       owned: 1;       // Set if the instance owns the
>                                       // _contents buffer

Owned is presumably redundant for constant strings.

>    unsigned int       unused: 2;
>    unsigned int       hash: 28;
>  } _flags;
> }
> @end
> 
> Of course constant strings won’t require the hidden reference count that 
> comes with all ObjC objects. But apart from that it seems to be a more useful 
> structure. Storing the length with the string should speed up some common 
> operations and 28 bits of hash should still be enough. There are even two 
> unused bits in the flags that could encode the specific hash function.

I’d like to have more than 2 bits spare for future expansion.  The current 
NXConstantString structure is now 30 years old, and I think there have been 
several times in the past when it would have been nice to add other things to 
it if we’d had a good way of maintaining compatibility.

This structure does have the advantage that it doesn’t need padding on any 32- 
or 64-bit architectures.
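
The no-padding claim can be checked with a plain C mirror of the layout. This is 
only a sketch: ConstStringSketch and const_string_sketch_size are hypothetical 
names, the isa field stands for the class pointer that every object starts with, 
and GSCharPtr is assumed to be pointer-sized.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */

/* Hypothetical plain-C mirror of the quoted ivars; not a GNUstep header. */
typedef struct {
  void        *isa;       /* class pointer at the start of every object */
  void        *contents;  /* stands in for GSCharPtr (assumed pointer-sized) */
  unsigned int count;
  struct {
    unsigned int wide   : 1;
    unsigned int owned  : 1;
    unsigned int unused : 2;
    unsigned int hash   : 28;
  } flags;
} ConstStringSketch;

/* The sketch assumes a 32-bit unsigned int, which holds on common ABIs. */
static_assert(sizeof(unsigned int) == 4, "sketch assumes 32-bit unsigned int");

/* Two pointers followed by two 32-bit words: 16 bytes with 4-byte
   pointers, 24 bytes with 8-byte pointers, and no padding in either case. */
static_assert(sizeof(ConstStringSketch)
              == 2 * sizeof(void *) + 2 * sizeof(unsigned int),
              "layout contains padding");
static_assert(offsetof(ConstStringSketch, count) == 2 * sizeof(void *),
              "count does not directly follow the pointers");

size_t const_string_sketch_size(void)
{
  return sizeof(ConstStringSketch);
}
```

If either assertion fails to compile on some target, that target needs padding 
for this layout.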

Do we have any measurements to tell us that 28 bits is enough for the hash?  
The -hash method returns an NSUInteger, which is 64 bits on most platforms, so 
we’re not using much of the available range.  At some point, I’d like to move 
the hash implementation for NSString to MurmurHash3, which should give better 
distribution and is very fast on modern hardware.
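
For a sense of what truncating to 28 bits would look like, here is a sketch of 
the 32-bit x86 variant of MurmurHash3 followed by a mask. The function names 
murmur3_32 and hash28 are made up for this example, the seed of 0 is arbitrary, 
and keeping the low 28 bits is an assumption, not something GNUstep does today.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t rotl32(uint32_t x, int r)
{
  return (x << r) | (x >> (32 - r));
}

/* MurmurHash3, x86 32-bit variant. Blocks are assembled little-endian
   explicitly so the result does not depend on host byte order. */
uint32_t murmur3_32(const void *key, size_t len, uint32_t seed)
{
  const uint8_t *data = key;
  const uint32_t c1 = 0xcc9e2d51u, c2 = 0x1b873593u;
  const size_t nblocks = len / 4;
  uint32_t h = seed;

  /* Body: mix each full 4-byte block into the state. */
  for (size_t i = 0; i < nblocks; i++) {
    const uint8_t *p = data + 4 * i;
    uint32_t k = (uint32_t)p[0] | (uint32_t)p[1] << 8
               | (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
    k *= c1; k = rotl32(k, 15); k *= c2;
    h ^= k; h = rotl32(h, 13); h = h * 5 + 0xe6546b64u;
  }

  /* Tail: the remaining 0 to 3 bytes. */
  const uint8_t *tail = data + 4 * nblocks;
  uint32_t k = 0;
  switch (len & 3) {
    case 3: k ^= (uint32_t)tail[2] << 16; /* fall through */
    case 2: k ^= (uint32_t)tail[1] << 8;  /* fall through */
    case 1: k ^= tail[0];
            k *= c1; k = rotl32(k, 15); k *= c2; h ^= k;
  }

  /* Finalisation: fold in the length and force avalanche. */
  h ^= (uint32_t)len;
  h ^= h >> 16; h *= 0x85ebca6bu;
  h ^= h >> 13; h *= 0xc2b2ae35u;
  h ^= h >> 16;
  return h;
}

/* Hypothetical helper: keep only as many bits as the 28-bit field holds. */
uint32_t hash28(const void *key, size_t len)
{
  return murmur3_32(key, len, 0) & 0x0FFFFFFFu;
}
```

Only 28 of the 32 computed bits survive here, and a 64-bit NSUInteger leaves 
even more of the range unused, which is the concern raised above.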

I’m also a bit nervous about using C bitfields in static data structures, 
because their layout is ABI dependent (and on some platforms can change between 
compiler versions).
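
One way to sidestep that is to store the flags as a single fixed-width word and 
use explicit masks and shifts, so the bit positions are spelled out in the 
source and not chosen by the compiler. The sketch below shows one possible 
assignment of bits; the positions and all the names are assumptions made for 
illustration, not a proposed ABI.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignment for a 32-bit flags word. */
#define STR_FLAG_WIDE   (UINT32_C(1) << 0)   /* 16-bit characters in string? */
#define STR_FLAG_OWNED  (UINT32_C(1) << 1)   /* instance owns the buffer */
/* bits 2 and 3 are left unused */
#define STR_HASH_SHIFT  4
#define STR_HASH_MASK   UINT32_C(0x0FFFFFFF) /* 28 bits of hash */

/* Build a flags word for a constant string (never owned). */
uint32_t str_make_flags(bool wide, uint32_t hash)
{
  uint32_t flags = (hash & STR_HASH_MASK) << STR_HASH_SHIFT;
  if (wide) {
    flags |= STR_FLAG_WIDE;
  }
  return flags;
}

bool str_flags_is_wide(uint32_t flags)
{
  return (flags & STR_FLAG_WIDE) != 0;
}

uint32_t str_flags_hash(uint32_t flags)
{
  return (flags >> STR_HASH_SHIFT) & STR_HASH_MASK;
}
```

A compiler emitting a constant string then only has to write one 32-bit integer 
whose value is fully determined by these definitions.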

I’m also tempted to teach the compiler about GSTinyString for 64-bit platforms, 
though so far that’s not been part of the ABI.  That gives us strings of up to 
eight 7-bit ASCII characters plus a 5-bit length.  The hash for them needs 
computing dynamically, but they fit into a 64-bit pointer directly.
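
The packing can be sketched in C as follows. Eight 7-bit characters take 56 
bits and the length takes 5, which leaves 3 bits for a pointer tag. The exact 
bit positions, the tag value and the function names below are assumptions for 
illustration and are not taken from the GNUstep sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
  TINY_TAG_BITS  = 3,  /* low bits assumed to mark a tagged pointer */
  TINY_TAG       = 4,  /* hypothetical tag value for tiny strings */
  TINY_LEN_MASK  = 0x1f,
  TINY_MAX_CHARS = 8
};

/* Pack up to eight 7-bit ASCII characters into one 64-bit word.
   Returns false if the string is too long or not 7-bit clean. */
bool tiny_pack(const char *s, size_t len, uint64_t *out)
{
  if (len > TINY_MAX_CHARS) {
    return false;
  }
  uint64_t word = (uint64_t)TINY_TAG | ((uint64_t)len << TINY_TAG_BITS);
  for (size_t i = 0; i < len; i++) {
    unsigned char c = (unsigned char)s[i];
    if (c > 0x7f) {
      return false;
    }
    /* Character 0 occupies the top 7 bits, character 7 bits 8 to 14. */
    word |= (uint64_t)c << (57 - 7 * i);
  }
  *out = word;
  return true;
}

size_t tiny_length(uint64_t word)
{
  return (size_t)((word >> TINY_TAG_BITS) & TINY_LEN_MASK);
}

char tiny_char_at(uint64_t word, size_t i)
{
  assert(i < tiny_length(word));
  return (char)((word >> (57 - 7 * i)) & 0x7f);
}
```

Nothing in the word is left over for a cached hash, which is why the hash has 
to be computed on demand for such strings.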

David



