blech:
posted on 2007/02/14 09:50
via the Daring Fireball linked list, this seems to be to be stating the obvious.
A cursory read of the Jeffrey Friedl regex book - and I've only ever read a chapter or three, shamefully - shows that NFA regex are faster.
As the paper itself says, backreferencing in regex is NP-complete, so it's (currently believed to be) impossible to make them guaranteed to be fast.
jerakeen:
the perl regex engine is perfectly pluggable, you know.
So you can sub in a PCRE-based regex engine. And presumably this one could be ported to the same interface as well.
blech:
Quite why one of the first plugins should be to a library that claims to emulate exactly what Perl does anyway is a bit of a mystery to me.
jerakeen:
it was there?
blech:
Fair enough.
jerakeen:
In fact, I see a perl interface to the GNU regex engine here. And according to the paper, the GNU engine also has very good performance.
Now, it's not built into the language to the same degree, but most languages have clumier regex syntax than perl, so this isn't a great loss.
blech:
Maybe we should note that the pluggable Perl RE engine is only in pre-release versions of the language.
jerakeen:
Since 3rd December, in fact. So no, not mainstream. But perfectly buildable.
blech:
Speaking of demerphq, he posted a response to the paper in January on PerlMonks.
while DFA engines have a very good worst case match time, they dont actually have too many other redeeming features. Construction can be extremely slow, the memory footprint vast, all kinds of trickery is involved to do unicode or capturing properly and they aren't suitable for patterns with backreferences.
We hate you all. Yes, especially you. Sod off and DIE