On 10/09/2012 12:52 PM, Jim Meyering wrote:> Eric Blake wrote:
On 07/26/2011 02:33 PM, marcel partap wrote:
Here's a patch. Adds STRCASEEQ_LEN macro for case insensitive extension
matching.
#regards/marcel.
Your patch would make the new behavior unconditional. But I like case
sensitivity, and think that case insensitivity should be an opt-in
process that I request, with coordination between dircolors to
generate a new string for LS_COLORS to be honored by ls. Furthermore,
the patch is lacking in NEWS, documentation, and testsuite coverage.
Additionally, you should be aware that strncasecmp() has undefined
behavior in non-C multibyte locales. It would probably be better to
use c_strncasecmp(), so that you are guaranteed defined behavior
regardless of the current locale.
Would you care to tackle those additional issues? And are you set up
for copyright assignment, since the patch will probably be non-trivial
by that point in time?
I saw this while going back through old bugs, so started poking around.
How about this: if a suffix is not matched, convert it to lower case
and check again. That way, anyone who cares to highlight .TaR or .TAR
differently from .tar can still easily do so, yet names ending with
uppercase .TAR, .JPG, .FLAC, etc. will now be highlighted by default.
Does anyone think it's worth making this new fallback-to-case-insensitivity
an option?
Not me anyway.
While looking at that, I realized that comparing each name in an ls'd
directory with each of nearly 100 suffixes should leave nontrivial room
for improvement. Sure enough, once I'd replaced that linear search
with a hash-table lookup, I see a measurable improvement.
and taking best-of-10 times, I see a 0.34 -> 0.23 (~30%) improvement.
Very nice.
Of course, I have deliberately used the case that shows the most improvement:
- a directory with very many files
- no suffix match, so the old code searches all suffixes
- using tmpfs: minimal stat overhead
@@ -4343,21 +4412,11 @@ print_color_indicator (const struct fileinfo *f, bool
symlink_target)
}
/* Check the file's suffix only if still classified as C_FILE. */
- ext = NULL;
- if (type == C_FILE)
- {
- /* Test if NAME has a recognized suffix. */
-
- len = strlen (name);
- name += len; /* Pointer to final \0. */
- for (ext = color_ext_list; ext != NULL; ext = ext->next)
- {
- if (ext->ext.len <= len
- && STREQ_LEN (name - ext->ext.len, ext->ext.string,
- ext->ext.len))
- break;
- }
- }
+ char *suffix;
+ struct sufco *ext; /* Color extension */
+ ext = (type == C_FILE && (suffix = strrchr (name, '.'))
+ ? suffix_color_lookup (suffix)
+ : NULL);
So do we now only support suffixes delimited by '.' ?
Previously the delimiter was arbitrary or optional:
touch star; LS_COLORS="*tar=01;31" /bin/ls --color *tar