  Other people dislike regexes too (Message 3 to 12 of 32)  
From: Warp
Subject: Re: Other people dislike regexes too
Date: 16 Jul 2012 07:48:14
Message: <5003ff7e@news.povray.org>
Invisible <voi### [at] devnull> wrote:
> I don't like regular expressions. Or rather, I don't like the mangled 
> ASCII pea-soup typically used to /represent/ regexes.

That complaint is mostly irrelevant. Regexes are excellent for their
most common use, which is to match extremely simple patterns.

"hello" is a regex. It might not look like it, but it is. And that's the
beauty of it. If you use that for example as a search pattern, you will
find all occurrences of those five consecutive characters.

The absolute beauty of it is that there's *no* extraneous syntax *at all*
to perform such a simple match. If you were to separate the syntax into
"commands and arguments", you would only be making the pattern needlessly
complicated and more laborious to write.

But how does that differ from a trivial matching that simply searches for
those consecutive characters and that's it? Well, you can refine the pattern
by adding a few additional key characters to it, to make it perform a more
elaborate search, and in the vast majority of cases it does not become
needlessly complicated or long.

For example, suppose you wanted to search for either "hello" or "Hello".
You would write the pattern as "[Hh]ello".
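
As a minimal sketch of the two patterns just mentioned, here they are in
Python's `re` module (an assumption: the post names no particular tool, and
the sample text is made up for illustration):

```python
import re

# Hypothetical sample text, not taken from the thread.
text = "hello world, Hello there, HELLO"

# A plain word is already a valid pattern: it matches itself.
plain = re.findall(r"hello", text)

# A single character class accepts either capitalisation of the first letter.
either = re.findall(r"[Hh]ello", text)

print(plain)
print(either)
```

The all-caps "HELLO" is matched by neither pattern, since only the first
letter was made flexible.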

You would have to be really pedantic to argue that's complicated and
difficult to understand. Now imagine separating that into "commands and
arguments", and how much more complicated and lengthy the pattern would
become.

You can argue against regexes by giving examples of really long and
complicated patterns, but then you would be complaining about extreme
fringe cases, not about *the most common* usage for them, for which they
are just superb.

-- 
                                                          - Warp



From: Invisible
Subject: Re: Other people dislike regexes too
Date: 16 Jul 2012 09:01:37
Message: <500410b1$1@news.povray.org>
On 16/07/2012 12:43 PM, Le_Forgeron wrote:

> Issue number 1: there are too many syntaxes for regexes.

Well, that's true enough.

> Trying to implement a BNF syntax with a regex is asking for trouble as
> soon as recursion or reordering is allowed. And an interesting BNF syntax
> always has recursion somewhere (just to allow more than one item...),
> and most are also relaxed about the required order of appearance of
> sub-items or properties.

I believe the correct phrase is "regular expressions can only recognise 
regular languages", and "most languages described by BNF are not 
regular". Followed shortly by "most regex implementations are not 
actually regular".
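
To illustrate the last point, here is a small sketch using Python's `re`
(chosen only as one example of a real-world engine). Backreferences let a
pattern recognise the language "n a's, one b, then the same n a's", which is
a textbook non-regular language. The names `same_run` and `is_mirrored` are
hypothetical, invented for this example.

```python
import re

# Some run of a's, a single b, then the *same* run of a's again.
# The backreference \1 is what takes this beyond regular languages.
same_run = re.compile(r"(a*)b\1")

def is_mirrored(s: str) -> bool:
    """Return True if s has the form a^n b a^n for some n >= 0."""
    return same_run.fullmatch(s) is not None

print(is_mirrored("aaabaaa"))
print(is_mirrored("aabaaa"))
```

A formally regular expression (one that compiles to a finite automaton)
cannot count the a's on both sides, so an engine that accepts this pattern
is doing more than the theory describes.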



From: Invisible
Subject: Re: Other people dislike regexes too
Date: 16 Jul 2012 09:26:39
Message: <5004168f@news.povray.org>
On 16/07/2012 12:48 PM, Warp wrote:
> You can argue against regexes by giving examples of really long and
> complicated patterns, but then you would be complaining about extreme
> fringe cases, not about *the most common* usage for them, for which they
> are just superb.

In my limited experience, people don't use regexes for simple pattern 
matches. They only use them for insanely complex cases - cases where it 
would be far, far better to spell out exactly what you're trying to do, 
rather than encode it into a tangle of punctuation.

I can see how if you're just trying to quickly search some document for 
a piece of information, being able to throw together a short text string 
and get nearly the right results might be useful. And hey, if you're 
only doing this once, who /cares/ that it's completely non-maintainable. 
It's a one-off task; you don't /need/ to maintain anything.

But for building large, complex applications, regexes seem like a 
stupendously bad idea.

Also, the usual formulation of regexes as text strings means that you 
can only match against text strings. Admittedly that's the most common 
case for wanting to do complicated matching. But if, say, you wanted to 
match against a binary file... sorry, you can't do that. As far as I can 
tell, there's no reason why a formal regular expression can't be matched 
against binary data; it's just that most real-world implementations 
don't allow this.
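
For what it's worth, at least one implementation does accept binary input:
Python's `re` works on `bytes` when the pattern is itself a `bytes` object.
The sketch below is illustrative only. The data is a hand-built fragment
(the standard 8-byte PNG signature plus the start of a chunk), not a real
file.

```python
import re

# The 8-byte PNG signature, followed by a 4-byte length and a chunk type.
data = b"\x89PNG\r\n\x1a\n" + b"\x00\x00\x00\rIHDR"

# A bytes pattern can name arbitrary byte values with \xNN escapes.
png_signature = re.compile(rb"\x89PNG\r\n\x1a\n")

# Any four bytes (the length field), then one of a few chunk type names.
chunk = re.compile(rb"[\x00-\xff]{4}(IHDR|IDAT|IEND)")

has_signature = png_signature.match(data) is not None
first_chunk = chunk.search(data)

print(has_signature)
print(first_chunk.group(1), first_chunk.start())
```

The class `[\x00-\xff]` is used instead of `.` because `.` skips the newline
byte unless the DOTALL flag is set.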



From: nemesis
Subject: Re: Other people dislike regexes too
Date: 16 Jul 2012 12:22:51
Message: <50043fdb$1@news.povray.org>
He doesn't dislike regexes at all.  He dislikes using the wrong tool for 
the job, as he wisely puts it:

"HTML is not a regular language and hence cannot be parsed by regular 
expressions."
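
A small sketch of why nesting defeats the approach, using Python's `re` and
two made-up HTML fragments (both are assumptions for illustration): neither
the lazy nor the greedy quantifier can pair each opening tag with its own
closing tag.

```python
import re

# Hypothetical fragments: one nested pair, one pair of siblings.
nested = "<div>a<div>b</div>c</div>"
siblings = "<div>a</div><div>b</div>"

# Lazy matching stops at the first closing tag, cutting the outer div short.
lazy = re.search(r"<div>(.*?)</div>", nested).group(1)

# Greedy matching runs to the last closing tag, merging two separate divs.
greedy = re.search(r"<div>(.*)</div>", siblings).group(1)

print(lazy)
print(greedy)
```

Pairing tags correctly needs a count of how many are currently open, which
is what a real parser keeps track of.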



From: Warp
Subject: Re: Other people dislike regexes too
Date: 17 Jul 2012 02:28:19
Message: <50050602@news.povray.org>
Invisible <voi### [at] devnull> wrote:
> In my limited experience, people don't use regexes for simple pattern 
> matches.

Yes, because you have decades of extensive experience of how e.g. Unix users
typically use regexes.

-- 
                                                          - Warp



From: Invisible
Subject: Re: Other people dislike regexes too
Date: 17 Jul 2012 04:00:28
Message: <50051b9c$1@news.povray.org>
On 16/07/2012 05:22 PM, nemesis wrote:
> He doesn't dislike regexes at all. He dislikes using the wrong tool for
> the job, as he wisely puts it:
>
> "HTML is not a regular language and hence cannot be parsed by regular
> expressions."

And yet, this seems to be how people almost always try to use regexes...



From: Warp
Subject: Re: Other people dislike regexes too
Date: 17 Jul 2012 04:21:30
Message: <5005208a@news.povray.org>
Invisible <voi### [at] devnull> wrote:
> And yet, this seems to be how people almost always try to use regexes...

  No, it isn't.

-- 
                                                          - Warp



From: Invisible
Subject: Re: Other people dislike regexes too
Date: 17 Jul 2012 04:36:50
Message: <50052422$1@news.povray.org>
On 17/07/2012 07:28 AM, Warp wrote:
> Invisible<voi### [at] devnull>  wrote:
>> In my limited experience, people don't use regexes for simple pattern
>> matches.
>
> Yes, because you have decades of extensive experience of how e.g. Unix users
> typically use regexes.

Perhaps you mean like

   grep -e "ntpd\[[[:digit:]]\+\]" /var/log/messages.4

which obviously searches for... wait, what does it search for exactly?
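
Read piece by piece, that grep pattern is the literal text `ntpd[`, then one
or more digits, then a literal `]`: in other words, syslog lines tagged with
the ntpd process and its PID. Below is a rough Python equivalent. Python's
`re` has no `[[:digit:]]` class, so `[0-9]` stands in for it, and the log
lines are invented for the example.

```python
import re

# Python spelling of: ntpd\[[[:digit:]]\+\]
ntpd_tag = re.compile(r"ntpd\[[0-9]+\]")

# Hypothetical log lines, not taken from any real system.
log_lines = [
    "Jul 17 04:30:01 host ntpd[1234]: time reset",
    "Jul 17 04:30:02 host sshd[99]: connection closed",
    "Jul 17 04:30:03 host ntpd: no pid shown here",
]

matching = [line for line in log_lines if ntpd_tag.search(line)]

for line in matching:
    print(line)
```

Only the first line is kept; the third mentions ntpd but has no bracketed
PID.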

So how about this?

   egrep '\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'

Yeah, that's pretty clear. If by "clear" you mean "it's going to take me 
five minutes to figure out exactly what this is supposed to do".
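
The pattern is evidently meant to match a dotted IPv4 address, and working
through it shows how easy such expressions are to get subtly wrong. As
written, the `\.` belongs only to the last alternative inside the first
group, so an octet in the range 200-255 can never be followed by a dot. The
sketch below checks this with Python's `re`. The `fixed` pattern is one
possible regrouping and an assumption about the intent, not something from
the post.

```python
import re

# The pattern exactly as posted.
posted = re.compile(
    r"\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?\.){3}"
    r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
)

# Assumed intent: group the octet alternatives first, then require the dot.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
fixed = re.compile(r"\b(?:" + OCTET + r"\.){3}" + OCTET + r"\b")

for address in ["192.168.1.1", "255.255.255.255", "256.1.1.1"]:
    print(address,
          posted.search(address) is not None,
          fixed.search(address) is not None)
```

Both patterns accept 192.168.1.1 and reject 256.1.1.1, but the posted one
also rejects 255.255.255.255.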

Or how about

   dmesg | egrep '(s|h)d[a-z]'

At least that one only takes a minute or two to figure out.
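
For reference, that one matches an "s" or an "h", then a "d", then any
lowercase letter, which picks out disk device names such as sda or hdb. A
quick check in Python's `re` follows. The test strings are made up, and
"Thursday" is included only to show that an unanchored pattern can also hit
ordinary words.

```python
import re

disk = re.compile(r"(s|h)d[a-z]")

# Hypothetical inputs: two device names, a too-short name, and a plain word.
candidates = ["sda", "hdb", "sd", "Thursday"]
hits = [name for name in candidates if disk.search(name)]

print(hits)
```

The shorter spelling `[sh]d[a-z]` accepts the same strings.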

And then we come to horrifying things such as

while(<STDIN>)
   {
   my($line) = $_;
   chomp($line);
   if($line !~ /<DIR>/)
     {
     if ($line =~ /.{28}(\d\d)-(\d\d)-(\d\d).{8}(.+)$/)
       {
       my($filename) = $4;
       my($yymmdd) = "$3$1$2";
       if($yymmdd lt "971222")
         {
         print "copy $filename \\oldie\n";
         }
       }
     }
   }

I don't even want to contemplate what the hell that does...
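
For anyone who does want to contemplate it: the script appears to read a
directory listing, skip the `<DIR>` entries, pull a two-digit date and a
file name out of fixed columns, and print a `copy` command for every file
dated before 22 December 1997. Below is a rough Python port under that
reading. The function name `copy_commands` and the sample listing are
hypothetical, and treating the three captured pairs as month, day and year
is inferred from the `$yymmdd` variable name.

```python
import re

# Same expression as the Perl: 28 characters, mm-dd-yy, 8 characters, name.
LISTING = re.compile(r".{28}(\d\d)-(\d\d)-(\d\d).{8}(.+)$")

def copy_commands(lines, cutoff="971222"):
    """Return a copy command for each listed file dated before cutoff."""
    commands = []
    for raw in lines:
        line = raw.rstrip("\n")      # roughly what chomp does
        if "<DIR>" in line:          # skip directory entries
            continue
        m = LISTING.search(line)
        if m is None:
            continue
        mm, dd, yy, filename = m.groups()
        yymmdd = yy + mm + dd        # reorder so string order is date order
        if yymmdd < cutoff:
            commands.append(f"copy {filename} \\oldie")
    return commands

# Made-up listing lines laid out to fit the fixed column widths.
sample = [
    "REPORT   TXT".ljust(28) + "12-01-97" + "  10:15a" + "report.txt",
    "NOTES    TXT".ljust(28) + "12-25-97" + "   9:00a" + "notes.txt",
    "OLD          <DIR>".ljust(28) + "01-05-96" + "   8:00a" + "old",
]

for command in copy_commands(sample):
    print(command)
```

Only report.txt qualifies: notes.txt is dated after the cutoff and the third
entry is a directory. Most of the logic is ordinary scripting; the regex
itself only does the column slicing.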

Still, I suppose the fact that you can do bad things with regexes 
doesn't automatically mean that regexes are bad.



From: Invisible
Subject: Re: Other people dislike regexes too
Date: 17 Jul 2012 04:40:35
Message: <50052503$1@news.povray.org>
On 17/07/2012 09:21 AM, Warp wrote:
> Invisible<voi### [at] devnull>  wrote:
>> And yet, this seems to be how people almost always try to use regexes...
>
>    No, it isn't.

That would explain why the question "how do I parse XHTML with a regex?" 
is so common on Stack Overflow that a guy actually lost his mind trying 
to answer it.

Actually, no it wouldn't.



From: Warp
Subject: Re: Other people dislike regexes too
Date: 17 Jul 2012 05:40:57
Message: <50053328@news.povray.org>
Invisible <voi### [at] devnull> wrote:
> On 17/07/2012 07:28 AM, Warp wrote:
> > Invisible<voi### [at] devnull>  wrote:
> >> In my limited experience, people don't use regexes for simple pattern
> >> matches.
> >
> > Yes, because you have decades of extensive experience of how e.g. Unix users
> > typically use regexes.

> Perhaps you mean like

You don't seem to grasp what "the most common usage" means. It does not
mean "the most prominent examples displayed on webpages" or "the most
prominent examples that I have seen". It means the forms that people
*most commonly* use, as in raw numbers. Count the times people use a
completely syntax-less regex, or one with just a simple wildcard, vs. the
times when someone has to write a really large and complex expression.
I'd estimate that the former wins about a million to one.

> Or how about

>    dmesg | egrep '(s|h)d[a-z]'

If you want to build your straw man, at least use examples that conform
to your straw man. That's a bad example because it can be understood in
about 2 seconds.

> while(<STDIN>)
>    {
>    my($line) = $_;
>    chomp($line);
>    if($line !~ /<DIR>/)
>      {
>      if ($line =~ /.{28}(\d\d)-(\d\d)-(\d\d).{8}(.+)$/)
>        {
>        my($filename) = $4;
>        my($yymmdd) = "$3$1$2";
>        if($yymmdd lt "971222")
>          {
>          print "copy $filename \\oldie\n";
>          }
>        }
>      }
>    }

Also, if you are building your straw man, at least try to use actual
regexes and not some unrelated scripting language.

-- 
                                                          - Warp




Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.