Invisible <voi### [at] dev null> wrote:
> >> It's not the concept of a regular expression as such. It's the fact that
> >> all known implementations work by mixing up code and data in the same
> >> encrypted string.
> >
> > I don't understand what you mean by that.
> I'd much prefer to see a much bigger separation between what's a literal
> character and what's a command.
It would become a burden with simple regular expressions.
> > Regular expressions define less than 10 special characters.
> Depending on /which/ regular expressions you mean, of course. Is that
> POSIX Basic Regular Expressions? POSIX Extended Regular Expressions?
> Perl 5.0 Regular Expressions? Perl Compatible Regular Expressions?
> Something else?
> A quick inspection of Wikipedia suggests that POSIX ERE involves at
> least .[]^$()\*{}?+|:, which is 16, not 10. (Still, it's not the
> thousands it seemed like last time I tried to learn this stuff.)
Well, regular expressions do not have all of those special characters;
extended regular expressions do. Also, I conflated [ and ] into one "special
character" because they are counterparts of the same special meaning
(in other words, I was thinking more of "keywords" of sorts than of
individual characters). Likewise for () and {}. Also, "," is special only
inside {} and ":" is special only inside [], so I conflated them into the
same "special character".
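To illustrate the point about context, here is a minimal sketch. It uses
Python's re module as a stand-in, which is an assumption on my part: it is
not a POSIX ERE implementation and does not support the [:name:] classes,
so only the comma case is shown. The helper name matches() is made up for
this example.

```python
import re

def matches(pattern, text):
    """Return True if the whole of text matches pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

# Outside {} the comma is an ordinary literal character.
comma_is_literal = matches(r"a,b", "a,b")

# Inside {} the same comma separates the bounds of a repetition count,
# so this pattern means "two or three a's", not "a, comma, a".
comma_is_bound = matches(r"a{2,3}", "aaa")
comma_not_literal_here = not matches(r"a{2,3}", "a,a")

# [ and ] only make sense as a pair: together they form one construct.
bracket_pair = matches(r"[abc]", "b")
```

Counted that way, the pair [] is one "keyword" and the comma is just part
of the {} one.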
> > Regular expressions are nowhere near Turing strong. They are state
> > machines.
> I recall reading somewhere that Perl's "regular expressions" aren't
> actually regular, and so require exponential time for matching. Truly
> regular expressions apparently require only linear time.
I wasn't really talking about Perl. (Perl REs add things like multi-line
matching, which is a lot more complex.)
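As for plain regular expressions being state machines, a rough sketch of
what I mean: the expression ab*c can be written out by hand as a small
deterministic automaton that looks at each input character exactly once.
The state numbering and the names TRANSITIONS, ACCEPTING and dfa_match are
made up for this example; this is not how any particular regex library is
implemented.

```python
# Hand-built automaton for the regular expression ab*c.
# State 0: start, state 1: seen "a" and any number of "b", state 2: seen "c".
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 1,
    (1, "c"): 2,
}
ACCEPTING = {2}

def dfa_match(text):
    """Return True if the whole of text matches ab*c."""
    state = 0
    for ch in text:
        # One table lookup per character, so the work is linear in len(text).
        state = TRANSITIONS.get((state, ch))
        if state is None:
            # No transition defined for this character: reject immediately.
            return False
    return state in ACCEPTING
```

There is no backtracking and no memory beyond the current state, which is
why this kind of matching takes time proportional to the input length.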
> The other thing I dislike is that people seem to have a tendency to use
> regexs where they should be using a real parser. For example, I recently
> saw a Haskell example where they used a fistful of regexs to "parse" SVG
> input.
As Darren said, does the fact that regular expressions are abused for
purposes they are not really intended for make them detestable?
Regular expressions are not really a good way of tokenizing some
formatted input. There are other languages much better designed for
that purpose, such as BNF.
> In short, people tend to use regexs for quick and dirty hacks that kinda
> work, rather than doing the job properly with a full parser. And I'm
> really not fond of hacks.
Writing a parser, even with the aid of a specialized description language
such as BNF, is very laborious. Simple string matching can often be expressed
with very short regular expressions which you can write in a few seconds.
Writing a parser would be complete overkill.
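A small sketch of the kind of thing I mean, using Python's re module. The
task (pulling a YYYY-MM-DD date out of a line of text) and the names
DATE_RE and find_date are made up for illustration. Note that it checks
only the shape of the date, not whether the month or day is in range.

```python
import re

# Four digits, dash, two digits, dash, two digits.
DATE_RE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")

def find_date(line):
    """Return (year, month, day) as ints, or None if nothing date-like is found."""
    m = DATE_RE.search(line)
    if m is None:
        return None
    return tuple(int(part) for part in m.groups())
```

The pattern itself is one line. A grammar plus a generated parser for the
same job would be many times longer and gain nothing here.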
--
- Warp