Re: [Tinycc-devel] C99 token pasting

From: grischka
Subject: Re: [Tinycc-devel] C99 token pasting
Date: Sun, 13 Apr 2014 14:12:36 +0200
User-agent: Thunderbird (Windows/20090812)

Thomas Preud'homme wrote:
On April 12, 2014 9:53:51 PM GMT+08:00, grischka <address@hidden> wrote:
Good, however note that the mechanism to perform token pasting
     tcc_open_bf(tcc_state, ":paste:", cstr.size);
is extremely slow and per se has a good share in making current
tcc about twice as slow compiling itself compared to 0.9.25.

You mean that even without this patch tcc is already slower than 0.9.26?

No, I meant 0.9.25. ;)

However, looking more closely, the results for the current tcc are more
like ~135% of those with 0.9.24/25.  Most of that seems to be
due to changes/complications in the preprocessor, such as:


Your patch would add an additional 15-20%:

Also I just noticed it breaks the test case given in

There seems to be, btw, a similar patch at

Now I observe that (in self compilation) token pasting happens
3113 times, whereas the fix (which, as the comment suggests, is meant to
improve certain cases of token pasting) runs similar code an additional
22669 times.  This raises some questions.

o_O Strange indeed. I see two ways to reduce the cost of this patch.
The first is to rename next_nomacro1 to next_nomacro2, make it take a
char * pointer to the buffer to parse for tokens, and create a next_nomacro1
wrapper for compatibility. Then tcc_open_bf would not be necessary. It might
also allow removing another tcc_open_bf in the same function.
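
A minimal sketch of that first idea, assuming a much simplified tokenizer:
the scanner takes an explicit pointer into the buffer, and a thin wrapper
keeps the old "read from global state" calling convention. The names
scan_token, scan_token_global, cur_ptr and count_tokens are hypothetical
and are not tcc's internals; the real next_nomacro1 handles far more
token kinds than this.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for the tokenizer's global input position. */
const char *cur_ptr;

/* Scan one token starting at *pp, store its start in *start, advance *pp
   past it and return its length (0 at end of input).  Only identifiers,
   simplified numbers and one-character punctuators are recognised. */
size_t scan_token(const char **pp, const char **start)
{
    const char *p = *pp;

    while (*p == ' ' || *p == '\t')
        p++;
    *start = p;
    if (isalpha((unsigned char)*p) || *p == '_') {
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
    } else if (isdigit((unsigned char)*p)) {
        while (isalnum((unsigned char)*p) || *p == '.')
            p++;
    } else if (*p != '\0') {
        p++;
    }
    *pp = p;
    return (size_t)(p - *start);
}

/* Compatibility wrapper: same behaviour, but reads from the global. */
size_t scan_token_global(const char **start)
{
    return scan_token(&cur_ptr, start);
}

/* Re-lex a pasted string in place, without opening a pseudo file.
   A paste result is a single valid token iff this returns 1. */
int count_tokens(const char *buf)
{
    const char *p = buf, *s;
    int n = 0;

    while (scan_token(&p, &s) > 0)
        n++;
    return n;
}
```

With this shape, checking a paste result such as "foo42" is a direct call
on the string, and existing callers that rely on the global position can
keep using the wrapper unchanged.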

A second solution would be to create a new function capable of identifying all
the cases where a space needs to be added. That would duplicate part of what
next_nomacro1 already knows about tokens, but it should be quite a small
function and would be faster.
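
As a rough illustration of that second idea, here is a sketch of such a
check working on token spellings. The function name needs_space and the
list of cases are assumptions made for this example; the list is
deliberately conservative and not exhaustive (digraphs, for instance, are
ignored).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

static int is_ident_char(int c)
{
    return isalnum(c) || c == '_';
}

/* Return 1 if printing token `a` directly followed by token `b` could be
   re-read as different tokens, 0 otherwise.  Hypothetical helper. */
int needs_space(const char *a, const char *b)
{
    /* Two-character sequences that start a punctuator or a comment. */
    static const char *const pairs[] = {
        "++", "--", "<<", ">>", "<=", ">=", "==", "!=", "&&", "||",
        "+=", "-=", "*=", "/=", "%=", "&=", "|=", "^=", "->", "##",
        "//", "/*", "..", NULL
    };
    size_t la = strlen(a);
    unsigned char last, first;
    int i;

    if (la == 0 || b[0] == '\0')
        return 0;
    last = (unsigned char)a[la - 1];
    first = (unsigned char)b[0];

    /* "a" "b" would merge into "ab", "1" "x" into "1x". */
    if (is_ident_char(last) && is_ident_char(first))
        return 1;
    /* "1" ".5" or "." "5" would merge into one number. */
    if ((isdigit(last) && first == '.') || (last == '.' && isdigit(first)))
        return 1;
    /* "1e" "+" would merge: a sign may follow an exponent letter. */
    if ((isdigit((unsigned char)a[0]) || a[0] == '.')
        && strchr("eEpP", last) && (first == '+' || first == '-'))
        return 1;
    /* "L" followed by a string or character literal changes its type. */
    if (is_ident_char(last) && (first == '"' || first == '\''))
        return 1;
    for (i = 0; pairs[i]; i++)
        if (last == (unsigned char)pairs[i][0]
            && first == (unsigned char)pairs[i][1])
            return 1;
    return 0;
}
```

Only the last character of the first token and the first character of the
second are inspected, so the check stays cheap compared with re-lexing the
joined text.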

Maybe the first change should be done anyway, even if the second approach is
chosen, for the already existing call to tcc_open_bf in macro_twosharps.

And solution 3 (my favorite):  Just hope for someone to rewrite
those five tiny "macro_subst_stuff" functions from scratch altogether,

Until then, why not add that space regardless of what token follows
(even if the output doesn't look exactly like that of gcc -E)?  But then
only if a ## paste actually happened, and only in tcc -E mode.  Otherwise
there is no point in adding spaces, they're all removed anyway.
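
The condition described above can be sketched as follows. Both flags and
both function names are hypothetical and only model the decision; they
are not tcc's actual state or API. In normal compilation the tokens go
straight to the parser, so only the -E text output is modelled here.

```c
#include <stdio.h>

/* Return the separator to print after a token: a space only when a ##
   paste produced the token and we are writing preprocessed text (-E). */
const char *paste_separator(int paste_happened, int preprocess_only)
{
    return (paste_happened && preprocess_only) ? " " : "";
}

/* Write `tok`, the separator, and `next` into `out`; returns what
   snprintf returns (the length that was or would have been written). */
int emit_pair(char *out, size_t cap, const char *tok, const char *next,
              int paste_happened, int preprocess_only)
{
    return snprintf(out, cap, "%s%s%s", tok,
                    paste_separator(paste_happened, preprocess_only), next);
}
```

No look at the following token is needed, which is why this variant costs
next to nothing compared with re-lexing.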

--- grischka

Thanks for monitoring performance regressions, grischka.



