Re: Scanner takes a long time to build.
From: Vern Paxson
Subject: Re: Scanner takes a long time to build.
Date: Tue, 30 Jul 2002 12:44:14 -0700
> Vern> My main current project these days is Bro, a network intrusion
> Vern> detection system. It's written in C++. For its regular
> Vern> expression matching, I took flex's generator and turned it into
> Vern> a set of classes that can be called at run-time.
>
> For my education, I'd like to know why you didn't consider dynamic
> regex compilers, such as Henry Spencer's, or pcre etc.
Depending on the package, for one or more of three reasons: (1) performance,
(2) copyright terms (Bro is BSD-licensed, like flex, not GPL), (3) anticipating
the need to extend the functionality (e.g., being able to match part of
a string, save the matcher state, and then continue matching later when
more of the string comes in).
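
As an illustration of point (3), here is a minimal sketch of a matcher whose state can be saved and resumed between chunks of input. It is not Bro's code: the class name `ResumableMatcher`, its `feed`/`save`/`restore` methods, and the hard-coded pattern "abc" are all hypothetical, chosen only to show the idea with a tiny hand-written DFA.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical resumable matcher (not Bro's API): reports whether "abc" has
// appeared anywhere in the stream fed so far. DFA states are 0..3, where the
// state number is the length of the pattern prefix matched; 3 is accepting.
class ResumableMatcher {
public:
    using State = int;

    // Consume one chunk of input, continuing from the current state.
    // Returns true once the pattern has been seen in the stream so far.
    bool feed(std::string_view chunk) {
        for (char c : chunk) state_ = step(state_, c);
        return matched();
    }

    bool matched() const { return state_ == kAccept; }

    // Snapshot the matcher so matching can be resumed later.
    State save() const { return state_; }

    // Resume from a previously saved snapshot.
    void restore(State s) {
        assert(s >= 0 && s <= kAccept);
        state_ = s;
    }

private:
    static constexpr State kAccept = 3;

    // One DFA transition.
    static State step(State s, char c) {
        if (s == kAccept) return kAccept;        // accepting state is absorbing
        if (s == 1 && c == 'b') return 2;        // "a"  -> "ab"
        if (s == 2 && c == 'c') return kAccept;  // "ab" -> "abc"
        return c == 'a' ? 1 : 0;                 // restart, possibly on a new 'a'
    }

    State state_ = 0;
};
```

Because the whole matcher state here is a single integer, a caller could keep one saved state per input stream and feed each stream's data as it arrives. A real generated scanner would carry more state than this, so treat the snapshot type as an assumption of the sketch.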
> I'm not sure I'm reading you correctly: are you saying that building
> NFA + making it DFA where it matters + running it < running a
> precompiled one?
Yep! (Just a bit)
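
For readers unfamiliar with the approach in the question (build the NFA, then make it a DFA only where it matters), here is a minimal sketch of lazy subset construction. It is an assumption-laden illustration, not the flex or Bro implementation: the NFA for (a|b)*abb is hard-coded, the names `nfa_move` and `LazyDfa` are hypothetical, and sets of NFA states are represented as bitmasks.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <unordered_map>

// Hypothetical epsilon-free NFA for (a|b)*abb; bit i stands for NFA state i.
constexpr int kNfaStates = 4;
constexpr std::uint32_t kStart = 1u << 0;
constexpr std::uint32_t kAccept = 1u << 3;

// Set of NFA states reachable from one NFA state on one input character.
inline std::uint32_t nfa_move(int state, char c) {
    switch (state) {
        case 0: return c == 'a' ? 0x3u : (c == 'b' ? 0x1u : 0u);  // {0,1} or {0}
        case 1: return c == 'b' ? 0x4u : 0u;                      // {2}
        case 2: return c == 'b' ? 0x8u : 0u;                      // {3}
        default: return 0u;
    }
}

// DFA whose transitions are computed on first use and then cached, so only
// the DFA states that the input actually reaches are ever constructed.
class LazyDfa {
public:
    // True if the whole input is matched by the NFA's pattern.
    bool matches(std::string_view input) {
        std::uint32_t current = kStart;
        for (char c : input) {
            current = step(current, c);
            if (current == 0) return false;  // empty set: no match possible
        }
        return (current & kAccept) != 0;
    }

    // Number of (state set, character) transitions built so far.
    std::size_t cached_transitions() const { return cache_.size(); }

private:
    std::uint32_t step(std::uint32_t set, char c) {
        const std::uint64_t key = (static_cast<std::uint64_t>(set) << 8) |
                                  static_cast<unsigned char>(c);
        auto it = cache_.find(key);
        if (it != cache_.end()) return it->second;  // already built

        std::uint32_t next = 0;  // subset construction for this one transition
        for (int s = 0; s < kNfaStates; ++s)
            if (set & (1u << s)) next |= nfa_move(s, c);
        cache_.emplace(key, next);
        return next;
    }

    std::unordered_map<std::uint64_t, std::uint32_t> cache_;
};
```

The design point is that the cache grows only with the transitions the traffic exercises, which keeps the working set small. Whether that beats a fully precompiled table in practice depends on the workload and the machine, as the discussion of locality in this thread suggests.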
> Yes, locality is certainly the main culprit, and disk access too.
I'm working with a student on comparing Bro's matching against setwise
Boyer-Moore for fixed strings, and we likewise run into a number of strange
effects that appear to also reflect locality, running quite differently
across similar-but-not-identical processors.
Vern