>> OK, that's one big ol' complex regex, right there.
>
> I see you never looked at perl code.
It's something I generally try to avoid, yeah. :-)
> do you really believe spelling out "option (char '+' <|> char '-')" is
> any worth as opposed to "(\+|-)"?
Um, *yes*. Why, do you think it doesn't have worth?
>> Enforcing that the exponent is less than or equal to 3 digits would be
>> slightly more wordy. The obvious way is
>>
>> xs <- many1 digit
>> if length xs > 3 then fail else return ()
>
> no, sorry. Truly overkill against "\d{3}".
So define a combinator for it:
manyN 0 _ = return []
manyN n p = do x <- p; xs <- manyN (n-1) p; return (x:xs)
And now, forever more, you can merely write
manyN 3 digit
Problem solved.
EDIT: Apparently this function is already predefined. I hadn't even
noticed. I could have just written "count 3 digit". In all the time I've
been writing parsers, I've never needed this.
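A minimal sketch of the two variants discussed here, assuming the combinators are Parsec's (`Text.Parsec`; the thread never names the library). `manyN` is the definition above with a type signature added. `countBetween` and `exponentDigits` are hypothetical helpers, not library functions, covering the "one to three digits" case that an exact count does not.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Exactly n repetitions of p: the thread's manyN, which behaves like
-- Parsec's predefined `count`.
manyN :: Int -> Parser a -> Parser [a]
manyN n p
  | n <= 0    = return []
  | otherwise = do
      x  <- p
      xs <- manyN (n - 1) p
      return (x : xs)

-- Hypothetical helper (not part of Parsec): between lo and hi repetitions,
-- the analogue of the regex quantifier {lo,hi}.
countBetween :: Int -> Int -> Parser a -> Parser [a]
countBetween lo hi p = do
  required <- count lo p
  extra    <- upTo (hi - lo)
  return (required ++ extra)
  where
    -- Take at most k further repetitions, stopping at the first failure.
    upTo k
      | k <= 0    = return []
      | otherwise = option [] ((:) <$> p <*> upTo (k - 1))

-- One to three digits, as in the exponent part of the spec.
exponentDigits :: Parser String
exponentDigits = countBetween 1 3 digit
```

`exponentDigits` stops after three digits but does not by itself reject a fourth; following it with `notFollowedBy digit` or `eof` would.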
>>> (That is, optional sign, one or more digits, optional decimal point
>>> followed by one or more digits, optional E followed by optional sign
>>> followed by one to three digits.)
>>
>> If I've understood the spec correctly, it's
>>
>> do
>> option (char '+' <|> char '-')
>> many1 digit
>> option (char '.')
>> many1 digit
>> option (do char 'E'; option (char '+' <|> char '-'); many1 digit)
>
> Bzzzt. Sorry. That requires at least a 2-digit number.
I don't follow. Where do you think my definition deviates from your spec?
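One reading of the objection, sketched under the assumption that the combinators are Parsec's: the second `many1 digit` sits outside the optional part, so fractional digits are demanded even when no decimal point is present. Parsec's `option` takes a default value, so the default-free `optional` is used below. `sign`, `asPosted`, `grouped`, `expo` and `accepts` are names made up for this sketch.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Optional leading sign.
sign :: Parser ()
sign = optional (oneOf "+-")

-- Optional exponent body: E, optional sign, one or more digits.
expo :: Parser ()
expo = do
  _ <- char 'E'
  sign
  _ <- many1 digit
  return ()

-- The sequence as posted: the fractional digits are outside the optional part.
asPosted :: Parser ()
asPosted = do
  sign
  _ <- many1 digit
  optional (char '.')
  _ <- many1 digit
  optional expo

-- Decimal point and fractional digits grouped into one optional unit.
grouped :: Parser ()
grouped = do
  sign
  _ <- many1 digit
  optional (char '.' >> many1 digit)
  optional expo

-- True when the parser consumes the whole input.
accepts :: Parser () -> String -> Bool
accepts p s = either (const False) (const True) (parse (p <* eof) "" s)
```

With these definitions `accepts asPosted "42"` is `False`, because Parsec's `many1` is greedy and does not backtrack: the first `many1 digit` takes both digits and nothing is left for the second. `accepts grouped "42"` and `accepts grouped "4.2"` are both `True`.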
>> Notice that since this is written in a /real/ programming language and
>> not a text string, we can do
>>
>> and save a little typing.
>
> You can do that with many regexp engines too. That's what I meant by
> "trivial macros".
I've never seen a system that can do that, but I guess that doesn't
prove a lot.
>> You can also factor the task into smaller pieces:
>
> Yeah, that's just a text substitution in the regexp expression.
Sure. Except without any guarantee of syntactic correctness.
>> With a regex, on the other hand, you cannot even statically guarantee
>> that a given string is even a syntactically valid regex.
>
> Nonsense. How'd you get that?
Not every string is a valid regex. You have to follow the syntax rules.
But this is not checked at compile-time (and usually /cannot/ be checked
at compile-time). So if you make a typo in your regex, you won't find
out until runtime.
If you start constructing regexes programmatically, your problems have just
multiplied.
On the other hand, with a /real/ parser library, both of these grave
problems immediately vanish into thin air.
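To illustrate the point about programmatic construction, a small sketch assuming Parsec: the alternatives are ordinary strings handed to `string`, so characters such as `*` or `.` need no escaping, and the assembled parser is type-checked like any other expression. `anyOf` and `operator` are hypothetical names, and the operator list is invented for the example.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Build one parser from a list of literal alternatives.
-- `try` lets a partially matched alternative backtrack so that the
-- next one can be attempted from the same position.
anyOf :: [String] -> Parser String
anyOf = choice . map (try . string)

-- Hypothetical operator table; longer literals come first so that
-- "**" is not cut short by "*".
operator :: Parser String
operator = anyOf ["**", "*", "+", "."]
```

Here `parse operator "" "**"` yields `Right "**"`, and none of the literals had to be quoted or escaped to keep them from being read as commands.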
On 10/11/2010 05:55 PM, scott wrote:
>>> (\+|-)[0-9]+(\.[0-9]+)?(E(\+|-)?[0-9]{1,3})?
>>
>> do
>> option (char '+' <|> char '-')
>> many1 digit
>> option (char '.')
>> many1 digit
>> option (do char 'E'; option (char '+' <|> char '-'); many1 digit)
>
> Hmm, now which one is more readable?
Exactly.
>> Yes, I've used SSH myself. It's quite neat that I can have an old 386
>> laptop sitting in a cupboard somewhere and operate it basically as if
>> I was sitting in front of it. On the other hand, would it be so hard
>> to do so with a less primitive interface?
>
> it's strange that a guy used to programming languages feels awkward
> about a text interface.
No, it feels awkward when you sit at a text terminal that's using ANSI
escape codes to generate colours and block graphics characters to "draw"
a "graphical" user interface. I mean, why not just have a /real/
graphical user interface?
> Notepad in all its glorious graphical user
> interface is much more featureless and rudimentary than vi on an old Unix
> terminal.
Sure. But that's because Notepad sucks beyond belief. ;-)
I'm not arguing that the CLI is useless. I'm arguing that ASCII-art
"graphical" UIs are silly.
On 10/11/2010 10:07 PM, Darren New wrote:
> Invisible wrote:
>> I've always thought that manpages are the ugliest, lamest, most
>> archaic thing ever, so I don't see that that's much of an advantage.
>
> Oh, wait. Aren't you the one that goes on about LaTeX and Postscript?
> :-) I'm not sure what you have against a text-based program for
> generating typeset output.
TeX, PostScript, HTML, it's all good.
I'm just saying that the finished product looks awful.
The reason I put up with TeX, in spite of its innumerable and obvious
limitations, is that it produces very, very nice output. troff doesn't
seem to compare to that.
> Plus, they didn't produce stuff on screen. This is hardcopy land.
> Screens were advanced stuff. :-)
Apparently that's also why all Unix commands fail to offer even the most
basic feedback as to what's happening, unless something actually goes
wrong. It's to save paper.
So really, Unix invented environmentalism, 60 years ahead of the rest of
Western civilisation. :-P
Invisible wrote:
> I'm not arguing that the CLI is useless. I'm arguing that ASCII-art
> "graphical" UIs are silly.
and I agree here. However, ssh, bash or vi don't even attempt such
silliness, so I wonder what your point is with ssh as an example of that...
--
a game sig: http://tinyurl.com/d3rxz9
Invisible wrote:
>> do you really believe spelling out "option (char '+' <|> char '-')" is
>> any worth as opposed to "(\+|-)"?
>
> Um, *yes*. Why, do you think it doesn't have worth?
it's far more verbose and, thus, hurts readability?
come on! do you even need those quotes around the symbols?
>>> xs <- many1 digit
>>> if length xs > 3 then fail else return ()
>>
>> no, sorry. Truly overkill against "\d{3}".
BTW, my mistake. Should be "\d{0,3}"
> So define a combinator for it:
>
> manyN 0 _ = return []
> manyN n p = do x <- p; xs <- manyN (n-1) p; return (x:xs)
>
> And now, forever more, you can merely write
>
> manyN 3 digit
>
> Problem solved.
then try this in your little language:
// A phone number with or without hyphens:
[2-9]\d{2}-?\d{3}-?\d{4}
It looks pretty much like a template for a phone number. I'm sure yours
will look like a little backwards forth script and will be much harder
to figure out.
I also wonder if it's the postfix nature of regexes that bothers you...
--
a game sig: http://tinyurl.com/d3rxz9
nemesis <nam### [at] gmail com> wrote:
> do you really believe spelling out "option (char '+' <|> char '-')" is
> any worth as opposed to "(\+|-)"?
Btw, "(\+|-)" is unnecessarily complex. Just use "[+-]". That's exactly
what the [] syntax is for.
--
- Warp
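The same simplification carries over to the combinator spelling, assuming the library is Parsec: `oneOf` plays the role of a character class. `signAlt` and `signSet` are illustrative names only.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Alternation, as spelled earlier in the thread: (\+|-)
signAlt :: Parser Char
signAlt = char '+' <|> char '-'

-- Character set, the counterpart of [+-]
signSet :: Parser Char
signSet = oneOf "+-"
```

Both parsers accept exactly one `+` or `-` and return the character they matched.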
Invisible <voi### [at] dev null> wrote:
> No, it feels awkward when you sit at a text terminal that's using ANSI
> escape codes to generate colours and block graphics characters to "draw"
> a "graphical" user interface. I mean, why not just have a /real/
> graphical user interface?
Do you understand the concept of bandwidth? And that it can be quite
limited?
The VT protocol was designed back when you were lucky if you could
transfer more than a few kilobytes per second, so they had to maximize what
you could do while at the same time minimizing the need to transfer data.
Today this might be more obsolete than it was 20 years ago, but there
are still benefits to that. The standard is quite well-established and
has been around forever, so almost every system in existence supports it.
Also, being text-based (and low-bandwidth) you can create or port terminal
emulators to almost any system with a screen and a keyboard, even very
exotic ones. For instance, it's pretty common for unix-savvy people to
have a terminal emulator on their cellphones so that they can contact
their computer or whatever remotely. A remote GUI would simply not do.
> I'm not arguing that the CLI is useless. I'm arguing that ASCII-art
> "graphical" UIs are silly.
Sometimes you have to resort to them to organize what is displayed.
--
- Warp
>> Um, *yes*. Why, do you think it doesn't have worth?
>
> it's far more verbose and, thus, hurts readability?
Far more verbose, and as a consequence it's almost self-explanatory what
it does. (Unlike a collection of symbols that have no widely-accepted
meaning outside of regex languages...)
> come on! do you even need those quotes around the symbols?
Yes. Because this is a real programming language, you could replace the
character name with a variable, for example. In this way you can make
the pattern parameterised.
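A small sketch of what such parameterisation might look like, assuming Parsec. The `decimal` function and the choice of a configurable decimal separator are invented for illustration.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical example: a decimal number whose separator character
-- is an ordinary function argument rather than part of a pattern string.
decimal :: Char -> Parser String
decimal sep = do
  whole <- many1 digit
  -- Optional fractional part: the separator followed by one or more digits.
  frac  <- option "" ((:) <$> char sep <*> many1 digit)
  return (whole ++ frac)
```

`parse (decimal '.') "" "3.14"` gives `Right "3.14"`, and `parse (decimal ',') "" "3,14"` gives `Right "3,14"`, with no escaping needed for either separator.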
> then try this in your little language:
>
> // A phone number with or without hyphens:
> [2-9]\d{2}-?\d{3}-?\d{4}
>
> It looks pretty much like a template for a phone number. I'm sure yours
> will look like a little backwards forth script and will be much harder
> to figure out.
I actually can't figure out what that does, so I can't implement it.
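For reference, a piece-by-piece reading of that regex as a combinator parser. This is only a sketch: it assumes Parsec, and it assumes `\d` is meant to match ASCII digits only. `phone` is a hypothetical name.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- [2-9]\d{2}-?\d{3}-?\d{4}
-- Returns the ten digits with any hyphens dropped.
phone :: Parser String
phone = do
  first <- oneOf ['2' .. '9']   -- [2-9]   first digit, 2 through 9
  area  <- count 2 digit        -- \d{2}   two more digits
  optional (char '-')           -- -?      optional hyphen
  mid   <- count 3 digit        -- \d{3}   three digits
  optional (char '-')           -- -?      optional hyphen
  end   <- count 4 digit        -- \d{4}   four digits
  return (first : area ++ mid ++ end)
```

Under these assumptions both `"555-123-4567"` and `"5551234567"` parse to `"5551234567"`, while `"155-123-4567"` is rejected because the first digit must be 2 or higher.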
> I also wonder if it's the postfix nature of regexes that bothers you...
No, mostly it comes down to these:
1. Commands have cryptic names like "*" or "+".
2. Literal characters aren't quoted, so it's hard to tell what's literal
and what's a command.
(And 3. since spaces are literal characters, you can't even use spacing
to make the structure of the expression clearer.)