[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
bug#9086: ls --color: 30% speed-up and case-insensitive suffixes [Re: bu
From: |
Jim Meyering |
Subject: |
bug#9086: ls --color: 30% speed-up and case-insensitive suffixes [Re: bug#9086 |
Date: |
Tue, 09 Oct 2012 14:32:07 +0200 |
Pádraig Brady wrote:
> On 10/09/2012 12:52 PM, Jim Meyering wrote:> Eric Blake wrote:
>>> On 07/26/2011 02:33 PM, marcel partap wrote:
>>>> Here's a patch. Adds STRCASEEQ_LEN macro for case insensitive extension
>>>> matching.
>>>> #regards/marcel.
>>>
>>> Your patch would make the new behavior unconditional. But I like case
>>> sensitivity, and think that case insensitivity should be an opt-in
>>> process that I request, with coordination between dircolors to
>>> generate a new string for LS_COLORS to be honored by ls. Furthermore,
>>> the patch is lacking in NEWS, documentation, and testsuite coverage.
>>>
>>> Additionally, you should be aware that strncasecmp() has undefined
>>> behavior in non-C multibyte locales. It would probably be better to
>>> use c_strncasecmp(), so that you are guaranteed defined behavior
>>> regardless of the current locale.
>>>
>>> Would you care to tackle those additional issues? And are you set up
>>> for copyright assignment, since the patch will probably be non-trivial
>>> by that point in time?
>>
>> I saw this while going back through old bugs, so started poking around.
>> How about this: if a suffix is not matched, convert it to lower case
>> and check again. That way, anyone who cares to highlight .TaR or .TAR
>> differently from .tar can still easily do so, yet names ending with
>> uppercase .TAR, .JPG, .FLAC, etc. will now be highlighted by default.
>>
>> Does anyone think it's worth making this new fallback-to-case-insensitivity
>> an option?
>
> Not me anyway.
>
>> While looking at that, I realized that comparing each name in an ls'd
>> directory with each of nearly 100 suffixes should leave nontrivial room
>> for improvement. Sure enough, once I'd replaced that linear search
>> with a hash-table lookup, I see a measurable improvement.
>
>> and taking best-of-10 times, I see a 0.34 -> 0.23 (~30%) improvement.
>
> Very nice.
>
>> Of course, I have deliberately used the case that shows the most improvement:
>> - a directory with very many files
>> - no suffix match, so the old code searches all suffixes
>> - using tmpfs: minimal stat overhead
>
>> @@ -4343,21 +4412,11 @@ print_color_indicator (const struct fileinfo *f,
>> bool symlink_target)
>> }
>>
>> /* Check the file's suffix only if still classified as C_FILE. */
>> - ext = NULL;
>> - if (type == C_FILE)
>> - {
>> - /* Test if NAME has a recognized suffix. */
>> -
>> - len = strlen (name);
>> - name += len; /* Pointer to final \0. */
>> - for (ext = color_ext_list; ext != NULL; ext = ext->next)
>> - {
>> - if (ext->ext.len <= len
>> - && STREQ_LEN (name - ext->ext.len, ext->ext.string,
>> - ext->ext.len))
>> - break;
>> - }
>> - }
>> + char *suffix;
>> + struct sufco *ext; /* Color extension */
>> + ext = (type == C_FILE && (suffix = strrchr (name, '.'))
>> + ? suffix_color_lookup (suffix)
>> + : NULL);
>
> So do we now only support suffixes delimited by '.' ?
> Previously the delimiter was arbitrary or optional:
>
> touch star; LS_COLORS="*tar=01;31" /bin/ls --color *tar
Good catch. I realized that early on, but then forgot to mention it.
Yes, I would have to document that the "." is now required.
It seems like a reasonable restriction, but technically
it could be called a regression.
Also, with these changes, a multiple-"." suffix will no longer work.
I.e., before, if you wanted to give *.tar.xz files a color different
from plain *.xz files, you could.
Does anyone object to that?
- bug#9086: ls --color: 30% speed-up and case-insensitive suffixes [Re: bug#9086, Jim Meyering, 2012/10/09
- bug#9086: ls --color: 30% speed-up and case-insensitive suffixes [Re: bug#9086, Pádraig Brady, 2012/10/09
- bug#9086: ls --color: 30% speed-up and case-insensitive suffixes [Re: bug#9086,
Jim Meyering <=
- bug#9086: ls --color: 30% speed-up and case-insensitive suffixes [Re: bug#9086, Pádraig Brady, 2012/10/09
- bug#9086: ls --color: 30% speed-up and case-insensitive suffixes [Re: bug#9086, Jim Meyering, 2012/10/09
- bug#9086: ls --color: 30% speed-up and case-insensitive suffixes [Re: bug#9086, Pádraig Brady, 2012/10/09
- bug#9086: ls --color: 30% speed-up and case-insensitive suffixes [Re: bug#9086, Jim Meyering, 2012/10/13
- bug#9086: ls --color: 30% speed-up and case-insensitive suffixes [Re: bug#9086, Pádraig Brady, 2012/10/13
- bug#9086: ls --color: 30% speed-up and case-insensitive suffixes [Re: bug#9086, Jim Meyering, 2012/10/13