  Re: The mysteries of Erlang  
From: Darren New
Date: 9 Mar 2011 13:22:03
Message: <4d77c54b$1@news.povray.org>
Having worked with Erlang, here are some comments.

Invisible wrote:
> - There are many functional programming languages. (Haskell, OCaml, 
> Clean, and for some reason people keep calling Lisp "functional" too.) 
> However, Erlang is *the only* one that could be considered "commercially 
> successful", as far as I can tell.

Erlang isn't functional. It's single-assignment. The "functional" bit in 
Erlang doesn't give you benefits like it does in Haskell because there's 
still a whole bunch of non-functional operations. However, since you can 
only assign to each variable once, it means you can't have loops and you 
can't do a sequence of operations without making up new meaningless names 
for the data at each step.
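
To make the single-assignment point concrete, here is a minimal sketch. The module and function names are made up for illustration. Iteration is written as recursion with an accumulator, and each step of a sequence gets its own freshly named variable.

```erlang
-module(single_assign).
-export([sum/1, normalise/1]).

%% There is no loop construct: iterate by recursion with an accumulator.
sum(List) -> sum(List, 0).

sum([], Acc) -> Acc;
sum([H | T], Acc) -> sum(T, Acc + H).

%% A sequence of steps needs a new name at each stage, because a
%% variable cannot be rebound once it has a value.
normalise(Text) ->
    Text1 = string:trim(Text),
    Text2 = string:lowercase(Text1),
    Text2.
```

Writing `Text = string:trim(Text)` instead would be a pattern match against the existing value, not a reassignment, which is why the numbered names appear.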

If it was *actually* functional, you could put your own functions in guards, 
for example.

> - Other languages attempt to be *concurrent*, but only Erlang is 
> *distributed*.

Yep.

> - The system is supposedly insanely reliable. People toss around "nine 
> 9s up-time" as if this is some sort of experimentally verified *fact*.

It is. They've been running a commercial switching network with it for 
something like 10 years, and one of the six data centers lost power or 
something and all the machines in it for 20 minutes once, so they had a 15% 
downtime for 20 minutes out of ten years.

> - The language claims to do all sorts of wacky, far-out stuff like 
> trivial concurrency and distribution, hot-swapping running code, 
> detecting and correcting run-time errors, and so forth. I want to see 
> how it's done.

It's pretty straightforward. The rest is in libraries. The collection of 
libraries that make it easy is called OTP. The documentation on OTP is 
either non-existent or blows mooses, depending on what you can find.

> Suffice it to say, from what little I could discover, I didn't like what 
> I was seeing. Like most commercially successful languages, Erlang is 
> obtuse, complex, ugly and kludgy. Much like C, Java or anything else 
> wildly popular. It's abundantly clear that Erlang is about as 
> "functional" as Lisp (i.e., not at all). Still, people claim that 
> Ericsson's entire business is based on it, and lots of people are using 
> it, so it must have got *something* right.

Yes. It got the reliability right.

> refusing to allow processes to share state. This immediately implies 
> that if you want to send data from place to place, you must copy it all. 

No, not true. Since the data is immutable, you don't actually need to copy 
it if sender and receiver are on the same processor. It *acts* like it got 
copied, but then "X = [5, Z]" acts like it copies Z into X also.

> Then again, if one process is operating on a completely different node, 
> you are *forced* to copy data from place to place anyway. It is 
> unavoidable. So in a way, forcing you to do it all the time just makes 
> it that little bit simpler to move from local to distributed coding.

Except you're not really forced to copy stuff. And big things (like code or 
large binaries) get allocated in their own heap anyway.

> One thing that rapidly becomes clear (and doesn't seem to be mentioned 
> anywhere else) is that for Erlang, processes are about more than just 
> concurrency or distribution. They are about fault isolation.

Indeed, that was the fundamental goal. Not distributed computing, but 
reliable computing, however that may be achieved.

> Another thing that becomes painfully obvious is that writing your 
> program in Erlang does not somehow magically make it reliable. 

There are a handful of primitives that are reliable. You have to build 
everything else on top of that. For example, the spawn_link function gives 
you the basis for reliability, because there's no race condition between 
spawning the new process and linking to its error messages.
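
A small sketch of that primitive, with a hypothetical module name: the parent traps exits, spawns and links in one step, and sees the worker's crash arrive as an ordinary message.

```erlang
-module(link_demo).
-export([run/0]).

%% Spawn a worker that crashes, and observe the crash as a message.
run() ->
    %% Turn exit signals from linked processes into messages instead
    %% of letting them kill this process as well.
    Old = process_flag(trap_exit, true),
    %% spawn_link/1 creates and links atomically, so the worker cannot
    %% die in a gap between "spawn" and "link".
    Pid = spawn_link(fun() -> exit(boom) end),
    Result = receive
                 {'EXIT', Pid, Reason} -> {worker_died, Reason}
             after 1000 ->
                 timeout
             end,
    %% Restore the caller's previous setting.
    process_flag(trap_exit, Old),
    Result.
```

What to do with the notification (restart, escalate, give up) is still up to the code that receives it.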

> less the same kinds of error handling that any other language provides - 
> catch/throw, etc. 

Huh. I didn't even know it provided catch/throw. I've never seen that used, 
actually.

> can make it so that related processes get auto-killed as well 
> (presumably because they can't sensibly complete without the dead 
> process), 

Yes.

> or you can make it send a notification to some monitor
> process. The *language* itself allows notifications to be sent, but 
> nothing more. The *libraries* allow you to do sophisticated error 
> recovery, but *you* have to implement this. It doesn't happen by magic, 
> as many seem to suggest.

OTP is the set of libraries that make it seem like magic. However, they're 
typical undocumented libraries that you learn by spending 5 years as an 
employee of Ericsson, and they're written in a language so modular you can't 
even figure out which pieces of code are part of the program, let alone read 
the code to see what it does. I've never quite figured it out beyond the 
very surface level.

> Very importantly, the system correctly handles things other than 
> software exceptions. Detecting division by zero is one thing. Detecting 
> that the machine in Australia that you were talking to just got hit by a 
> small pyroclastic flow is another. Obviously, no documentation actually 
> describes how this works.

There is documentation.

Each hardware box is running one interpreter, each of which is running one 
thread per CPU. Each interpreter keeps a TCP socket open to all the other 
interpreters it knows about, as well as sending periodic "are you there" 
messages. That TCP socket is also used to send application-level messages to 
remote processes. If the socket breaks, anything on the local system linked 
to that socket gets a notification that it broke.

There's also a separate process running on each local machine that's 
monitoring the local Erlang interpreter that will reboot the machine if the 
local Erlang interpreter stops working. (For some definition of "working" 
which isn't clear from the docs, but clearly includes answering keep-alive 
probes.)

> So how does Erlang actually work then? I mean, you can create processes. 
> OK, then what? Well, as best as I can determine, each process has a 
> "mailbox", and you can send asynchronous messages to any process that 
> you have the address of. (And these messages can contain process 
> addresses.) This works the same way both locally and remotely.

Yep.

> Sending is a non-blocking operation. Receiving is a blocking operation, 
> and it makes use of Haskell-style pattern matching (which seems like a 
> rather neat fit here). You can also time out waiting for a message - 
> something which would be really hard to implement yourself, and which is 
> probably extremely important in networking code.

Correct.

> Message sending has the curious property that message delivery is /not/ 
> guaranteed, but message ordering /is/ guaranteed. Like, WTF?

Message delivery isn't guaranteed because it's asynchronous, and the remote 
machine may crash between the time you send the message and the time the 
remote machine receives it. Ordering *is* guaranteed because if the remote 
machine crashes before it gets message #5, it won't get message #6 either.

> I'm not
> even sure how it's possible to tell that the messages are in order 
> without being able to tell if you got all of them, but whatever. 

You can't, but the *sender* doesn't know if you got them all. The receiver does.

> very, very random to me. The thesis claims that implementing delivery 
> checks yourself is easy, but implementing ordering checks is very hard. 
> So they put the hard thing into the language, and left the easy thing up 
> to you if you need it. Um, OK?

Yep. Knowing if something is out of order is hard in a single-assignment 
language. By the time you do the pattern match, it's too late to not get the 
message out of the queue. But the runtime can just look at the sequence 
number of messages from that other machine and pretend the message didn't 
arrive until the missing ones show up.

Given it all runs over TCP, tho, I don't think that's actually a whole lot 
of trouble.

Knowing the receiver got the message is as simple as adding code to send an 
ack each time you handle a message.
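
A sketch of such an ack protocol, under assumptions of my own: the message shapes `{From, Ref, Msg}` and `{ack, Ref}` and all the names are invented for this example. The sender tags each request with a unique reference and treats a missing ack as failure.

```erlang
-module(ack_demo).
-export([send_with_ack/3, echo_server/0, run/0]).

%% Send Msg to Pid and wait up to Timeout ms for an acknowledgement.
%% The unique reference ties the ack to this particular request.
send_with_ack(Pid, Msg, Timeout) ->
    Ref = make_ref(),
    Pid ! {self(), Ref, Msg},
    receive
        {ack, Ref} -> ok
    after Timeout ->
        {error, no_ack}
    end.

%% Receiver: acknowledge each message once it has been handled.
echo_server() ->
    receive
        {From, Ref, _Msg} ->
            From ! {ack, Ref},
            echo_server()
    end.

run() ->
    Server = spawn(fun echo_server/0),
    Live = send_with_ack(Server, hello, 1000),
    exit(Server, kill),
    %% Sending to a dead process succeeds silently; only the missing
    %% ack reveals that the message was never handled.
    Dead = send_with_ack(Server, hello, 100),
    {Live, Dead}.
```

The send itself never fails or blocks, so the timeout in `receive ... after` is the only place the sender learns anything.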

> One of the very trippy things about Erlang is that I can be sitting on 
> my PC in England, and I can just tell some Solaris server in Brazil to 
> spawn a new process. As expected, no word on how the hell this actually 
> works. I would have *thought* this means that the necessary code gets 
> beamed over the wire, but some documentation seems to suggest that you 
> have to install it yourself. And yet, you can apparently send arbitrary 
> functions as messages, so...?

The functions are merely references to installed code. The code itself has 
to already be in Brazil, and the function you send is "this module, this 
name, this version".
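
As an illustration of a function value being a reference rather than code, the sketch below inspects an external fun with `erlang:fun_info/2`. It only shows the module, name and arity of the `fun M:F/A` form; the module name `fun_ref_demo` is hypothetical, and the version detail mentioned above is not demonstrated here.

```erlang
-module(fun_ref_demo).
-export([describe/1, run/0]).

%% Report what an external fun actually carries around.
describe(F) when is_function(F) ->
    {module, M} = erlang:fun_info(F, module),
    {name, N} = erlang:fun_info(F, name),
    {arity, A} = erlang:fun_info(F, arity),
    {type, T} = erlang:fun_info(F, type),
    {T, M, N, A}.

run() ->
    describe(fun lists:reverse/1).
```

Calling `fun_ref_demo:run()` yields `{external, lists, reverse, 1}`: a name to be looked up on whichever node ends up calling it.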

There's a document out there somewhere that lists the binary formatting of 
the contents of all the messages.

Starting the process remotely works because the local interpreter has (or 
establishes) a TCP link to the remote machine. All the messages for all the 
processes on your PC go over the same link to the machine in Brazil. There 
isn't one connection per communicating process.

> There's also no word on authentication. I mean, how does the machine on 
> Brazil know that I'm authorised to be executing arbitrary code on it? 

Nope, you're fucked.

> best guess is that if you're a telecoms giant, you have a private data 
> network that nobody else can physically access, and you use that to 
> control your equipment. In other words, you're authorised if you can 
> actually access the system in the first place.

Yes, exactly. Or you proxy the connection between machines through a VPN 
with authentication. Basically, you secure it at the machine-to-machine 
level, which is handled at the Erlang interpreter level rather than in your 
Erlang source code.

> It's a similar story when we come to code hot-swapping. Surprise, 
> surprise, it turns out that you can't just magically hot-swap code, just 
> because it's written in Erlang. No, you have to do *actual work* to make 
> this possible.

Well, sure.

> The language itself allows two (and only two) copies of the same code to 
> be loaded at once, an "old" version and a "new" version. And it provides 
> a way to switch from one to the other for a specific process. And that's 
> *all* it gives you. Everything else has to be done, by hand, by you.

The two-versions limit is because the code itself isn't GCed. They document 
that they could have GCed the code, but that would have added lots of 
overhead for something relatively rare.

If you *really* need more than two versions, first move to a "new" version 
that talks to one of N versions, then start up the N versions as appropriate.

> In short, _you_ have to figure out how your new code is different from 
> your old code, _you_ have to write the code that converts any data 
> structures or establishes any new invariants, _you_ have to check that 
> this new code actually works properly, and _you_ have to design your 
> application in the first place so that you can tell it to switch to the 
> new code, running the conversion routine in the process.

Yep.

> And if you want to downgrade back to the old version? _You_ have to do 
> all the work for that too. Really, the only help that Erlang is giving 
> you is the ability to easily change from one chunk of code to another. 
> You could probably do the self-same thing in Java, if you built your 
> application so that every class has a version number, and there's some 
> way to tell the application to load a set of new class files and execute 
> a predetermined method on one of them. (I suspect Java's inflexibly type 
> system would probably whine though...)

You'd have to be able to pass the network connections around too, remember.

> Erlang really gives you no assistance at all beyond the minimal level of 
> "you can have two modules with the same name". And it's limited to just 
> two, by the way. If you try to load a third version, anything running 
> the first version is unceremoniously killed. And there's no mention of 
> any way to detect whether old code is still running. (Perhaps there is, 
> but I didn't see it mentioned.)

There is, but it's deep in the bowels of "system" stuff.
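
One candidate for that "system" stuff, as far as I can tell, is the BIF `erlang:check_process_code/2`, which reports whether a given process is still executing the old version of a module. The wrapper below is a hypothetical helper, not something OTP provides under this name.

```erlang
-module(old_code_demo).
-export([running_old/2]).

%% Return those of Pids that still execute the old version of Module.
running_old(Pids, Module) ->
    [Pid || Pid <- Pids,
            erlang:check_process_code(Pid, Module) =:= true].
```

For example, `old_code_demo:running_old([self()], lists)` returns `[]` on a node where `lists` has never been reloaded.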

The OTP system provides all the packaging and maintenance for the higher 
levels of management. Complaining that Erlang itself doesn't automate such 
stuff is like complaining that you need a Linux package manager because the 
file system only knows about files, not applications. :-)

> There was some talk of a packaging system, which sounds quite 
> interesting, but obviously no details are described. It talks about 
> being able to group modules into applications, and group applications 
> into releases which can have complex dependencies, build procedures and 
> installation processes. But... no details. So maybe it does 
> configuration management for you, or maybe it does very little to assist 
> you. I couldn't say.

It does a bunch of stuff, including giving you ways to roll things forward 
and back and such.

Realize that most of these "systems" you're talking about are actually 
implemented as Erlang code. So, for example, to compile code, you actually 
sit down at the REPL and invoke the compiler function. To package code, you 
sit down at the REPL and invoke the packager function. To debug code, you 
sit down at the REPL and invoke the debugging functions. Etc.

> The language itself has a few interesting features. Of course, being 
> designed as something easily parsable by Prolog, the syntax is utterly 
> horrid, it uses a kludgy preprocessor rather than actually supporting 
> named constants, and so forth. It's also dynamically typed, which some 
> people are presumably going to argue is somehow "necessary" because of 
> what it does. I can't help thinking that a powerful type system would 
> have made it /so much easier/ to make sure you got everything straight.

I agree about the syntax etc. It's quite possible a better type system would 
make it easier.

You should also check out Hermes, which is essentially the same thing with 
an astoundingly strict and high-level type system.  Indeed, Hermes is kind 
of a high-level Erlang, in that you (for example) let the compiler/runtime 
decide where to run the process and how many to create. You write a loop 
that says "read a message, process it, send an answer, loop" and it gets 
automatically parallelized. You write a process that says "read a message, 
process, send an answer to where this message came from, exit" and it gets 
turned into an in-line subroutine. Etc.

> Just reading the document, I saw endless examples of data structures 
> who's meaning is ambiguous due to the lack of types. For example, it is 
> apparently impossible to tell the difference between a process that died 
> because of an exception, and a process which merely sent you a message 
> that happens to be a tuple containing the word "EXIT". 

Well, it's possible unless you asked "turn all crashes from processes I'm 
watching into EXIT messages."

> The advice for 
> this presumably being "don't do that".

No, this is actually more like "We did that on purpose in case you want the 
supervisor to react as if you exited while you keep running." So, for 
example, you can get the supervisor to spawn a new handler process while the 
previous handler process rolls back the transaction that failed in *its* 
child process or something.
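
A sketch of the two cases side by side, with made-up names: once a process traps exits, a real crash and a hand-sent tuple of the same shape look identical in its mailbox, even though the second sender is still running.

```erlang
-module(exit_tuple_demo).
-export([run/0]).

run() ->
    Old = process_flag(trap_exit, true),
    Parent = self(),
    %% A real crash: the linked process exits with reason `crashed`.
    Crasher = spawn_link(fun() -> exit(crashed) end),
    Real = receive {'EXIT', Crasher, R1} -> R1 end,
    %% A simulated one: this process stays alive and merely sends a
    %% tuple of the same shape.
    Faker = spawn(fun() ->
                      Parent ! {'EXIT', self(), crashed},
                      receive stop -> ok end
                  end),
    Fake = receive {'EXIT', Faker, R2} -> R2 end,
    StillAlive = is_process_alive(Faker),
    Faker ! stop,
    process_flag(trap_exit, Old),
    {Real, Fake, StillAlive}.
```

The result is `{crashed, crashed, true}`: the watcher reacts the same way both times while the second process keeps running.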

> It's slightly bizarre. Erlang looks for all the world like a crude 
> scripting language with no safety at all and abysmal run-time 
> performance. And yet, the language is designed for running network 
> switches, possibly the most demanding high-performance hard-realtime 
> system imaginable, and people claim it has nine 9s up-time. The only 
> container types are linked lists and tuples, and yet the standard 
> libraries somehow include cryptography and complex network protocols. 
> Very odd...

It's possible to link C (or anything else) into Erlang. The GUI for Erlang 
launches a separate Tcl/Tk process and talks over a socket to it to do the 
drawing, for example.

In general, network switching isn't that hard real-time, because you usually 
have custom hardware to do the switching. A 5ESS (which can handle up to 
800,000 phone lines) runs on two 6800s, one of which is a hot spare. But of 
course there's a large lump of hardware that's routing the individual bytes 
here and there, so the 6800 only actually gets involved when you connect or 
disconnect, basically.

> Erlang uses the rather baffling convention that mere variables start 
> with an upper-case letter, while proper nouns such as function names or 
> data points start with a lower-case letter. This makes it surprisingly 
> hard to read Erlang code. (Haskell, not to mention English, obviously 
> uses the opposite convention.)

Erlang is the only language I've found whose syntax sucks worse than C++, 
yet it has way, way less built in than C++.

The primary problem, I think, is it started out as "let's investigate how we 
can write a million lines of code that doesn't crash."  Of course, once you 
have that million lines of code and it's running reliably, you're not going 
to go back and fix anything as trivial as the syntax.

> I already mentioned that message receipt is by pattern matching. This 
> seems like a rather powerful way to work, especially considering that a 
> message that fails all patterns stays queued in the mailbox. So you can 
> actually use patterns to decide what order you want to process incoming 
> messages in. That seems quite sophisticated. And I already mentioned how 
> you can add a time-out as well, which is obviously a very frequent 
> requirement for hard-realtime systems.

It's very handy, yes.
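
A minimal sketch of using patterns to choose processing order. The `urgent` and `normal` tags are invented for the example. Messages that don't match stay queued, and `after 0` means "don't wait if nothing matches right now".

```erlang
-module(selective_demo).
-export([run/0]).

%% Take every queued `urgent` message first, then the `normal` ones.
drain(Acc) ->
    receive
        {urgent, X} -> drain([{urgent, X} | Acc])
    after 0 ->
        receive
            {normal, Y} -> drain([{normal, Y} | Acc])
        after 0 ->
            lists:reverse(Acc)
        end
    end.

run() ->
    self() ! {normal, 1},
    self() ! {urgent, 2},
    self() ! {normal, 3},
    self() ! {urgent, 4},
    drain([]).
```

Although the messages were sent in the order 1, 2, 3, 4, `run()` returns `[{urgent,2},{urgent,4},{normal,1},{normal,3}]`.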

> In Haskell, general practise is to make all functions "total". In 
> particular, all pattern matching should cover every case if possible. By 
> contrast, apparently the general practise with Erlang is to /not/ cover 
> cases which aren't expected to occur, and just let the thing throw an 
> exception on a pattern-match failure. The rationale apparently being 
> that the automatically-generated exception string is just as descriptive 
> as anything that you could write by hand yourself, and it makes the code 
> less cluttered.

Well, that's not quite true for message pattern matching. For matching a 
pattern against a value, yes. But if you don't read messages, you leave them 
in the input buffer, which then grows and grows until you take out the 
entire interpreter. Normally you'd find the place in your code where you 
*think* you handled all the messages, and you'd put in an "if I match 
anything else, crash out" branch to the pattern match.
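
Such a catch-all branch might look like the sketch below. The module name and message shapes are hypothetical. The final clause matches anything, so unexpected messages are taken out of the mailbox and crash the process instead of accumulating.

```erlang
-module(strict_loop).
-export([start/0, loop/1, run/0]).

start() -> spawn(?MODULE, loop, [0]).

loop(Count) ->
    receive
        {add, N} when is_integer(N) ->
            loop(Count + N);
        {get, From} ->
            From ! {count, Count},
            loop(Count);
        Other ->
            %% Anything unexpected is removed from the mailbox and
            %% crashes the process, rather than piling up unread.
            exit({unexpected_message, Other})
    end.

%% Exercise the loop, then send garbage and watch it die.
run() ->
    Pid = start(),
    Ref = erlang:monitor(process, Pid),
    Pid ! {add, 2},
    Pid ! {get, self()},
    Count = receive {count, C} -> C end,
    Pid ! bogus,
    Reason = receive {'DOWN', Ref, process, Pid, R} -> R end,
    {Count, Reason}.
```

Here `run()` returns `{2, {unexpected_message, bogus}}`, and whoever is linked to or monitoring the process gets to decide what happens next.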

> - The body code has to go in its own special module.

A module is the granular unit of code update, so this makes sense.

> - The module has to export a specific set of functions with specific 
> names and specific numbers and types of arguments returning a specific 
> data structure as the result. (If only Erlang had type signatures, eh?)

There is that.

> - The skeleton passes an opaque "state" object between these functions, 
> which the functions can update.

Yep.

> What does that sound like to you? Yes, congratulations, you've just 
> invented object-oriented programming. But without dynamic binding or 
> inheritance. (Or static typing, for that matter.) A "module which 
> exports a special set of functions" is basically a *class*, and this 
> "threading opaque state from function to function" is basically a 
> stateful object.

Yep.  Except since it's actually a separate process, it's called an "actor", 
not an "object". :-)

> Of course, modules don't have inheritance. That means that you can't use 
> an abstract class to define what methods a class is supposed to have. It 
> means you can't write a half-implemented class that other concrete 
> classes can inherit from.

No. You'd instead write something that invokes functions in a module whose 
name you provided when you instantiated this module. There's actually even 
syntax for this. You don't "inherit" but you can have essentially "abstract 
modules."
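
The plain form of this, without any special syntax, is a call through a variable holding a module name. The sketch below is my own example, not OTP's mechanism: it leans on the fact that the standard `sets`, `ordsets` and `gb_sets` modules all export `new/0`, `add_element/2` and `to_list/1`, so the same fixed logic runs against whichever module is passed in.

```erlang
-module(callback_demo).
-export([unique/2, run/0]).

%% Fixed logic that defers the storage details to SetMod. Any module
%% exporting new/0, add_element/2 and to_list/1 will do.
unique(SetMod, Items) ->
    Set = lists:foldl(fun(I, S) -> SetMod:add_element(I, S) end,
                      SetMod:new(),
                      Items),
    lists:sort(SetMod:to_list(Set)).

run() ->
    [unique(Mod, [3, 1, 3, 2]) || Mod <- [sets, ordsets, gb_sets]].
```

Each of the three calls in `run()` gives `[1,2,3]`; nothing checks at compile time that the module really exports those functions.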

> This whole "module with special functions" monstrosity is all the more 
> weird because Erlang also has
> 
> 1. first-class function names
> 
> 2. lambda functions
> 
> Um, like, WTF? Why does the code have to go in its own module? Why can't 
> I just pass you a tuple of function names, or even lambda functions, 
> that define the callbacks? Hello??

Because then you can't update the module. One of the things OTP is providing 
for you is the whole infrastructure for updating your module. That's why you 
send the "state" back to OTP and why the code has to be in a separate 
module. When OTP gets the "update" message, it invokes the new module with 
the state.
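
For reference, a minimal callback module for OTP's `gen_server` behaviour looks roughly like this. The counter logic and module name are invented for the example. The generic server owns the process loop and hands the state to each callback, including `code_change/3` when a new version of the module is loaded as part of an upgrade.

```erlang
-module(counter_server).
-behaviour(gen_server).
-export([run/0]).
-export([init/1, handle_call/3, handle_cast/2, code_change/3]).

%% Callbacks: the generic server loop calls into this module and
%% threads State through every call.
init(Start) -> {ok, Start}.

handle_call(get, _From, State) -> {reply, State, State}.

handle_cast({add, N}, State) -> {noreply, State + N}.

%% Called during a code upgrade so the new module version can
%% convert the old state. Here the state format is unchanged.
code_change(_OldVsn, State, _Extra) -> {ok, State}.

%% Start a server, poke it, read the state back, and stop it.
run() ->
    {ok, Pid} = gen_server:start(?MODULE, 10, []),
    ok = gen_server:cast(Pid, {add, 5}),
    Value = gen_server:call(Pid, get),
    ok = gen_server:stop(Pid),
    Value.
```

`run()` returns 15. A tuple of lambdas could express the same callbacks, but a lambda has no module name to point at when the next version is loaded.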

> One really neat feature of the language, which I didn't look into *too* 
> deeply, is the binary operators. 

They are pretty cool, yes. Handy, but I've seen it done better elsewhere. :-)

> Not sure what happens if you need to write bytes backwards, as some 
> broken protocols and file formats require. Is there a directive for 
> that? Or do you have to do it by hand?

I think there is. Lots of stuff can come after the size, following a slash, 
to control that sort of thing.
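
In the bit syntax a segment is written `Value:Size/TypeSpecifiers`, and `big` or `little` among the specifiers sets the byte order. A short sketch, with a hypothetical module name and a made-up length-prefixed format:

```erlang
-module(endian_demo).
-export([encode/1, decode/1]).

%% Write a 16-bit value in both byte orders.
encode(N) ->
    {<<N:16/big>>, <<N:16/little>>}.

%% Read a 32-bit little-endian length followed by that many bytes.
decode(<<Len:32/little-unsigned, Payload:Len/binary, Rest/binary>>) ->
    {Payload, Rest}.
```

`endian_demo:encode(16#1234)` gives `{<<16#12,16#34>>, <<16#34,16#12>>}`, so the "backwards" case is one keyword rather than manual byte shuffling.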

> Many things about Erlang still remain unexplained. I still have 
> absolutely no clue WTF an "atom" is. 

An atom is an interned string. Same as a "symbol" in LISP or Smalltalk. The 
trick is that in a message, an atom is an index into the atom table, so it's 
short regardless of how long the atom is.

> Nor have I discovered what this 
> "OTP" thing that gets mentioned every 3 sentences is. 

It's the libraries that take the basic operations Erlang supplies (like 
"load new code" or "spawn link") and turn it into things like installable 
packages.

> I have utterly no 
> idea how the "registered process names" thing works. 

A process can say "My mailbox is the registered process on server ABC for 
XYZ".  Another process can say "Give me XYZ@ABC" and get that mailbox. It 
works by schlepping non-Erlang messages between the Erlang processes, just 
like the keep-alives do.
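
On a single node the basic operations are `register/2` and `whereis/1`. The sketch below uses an invented name, `demo_service`, and stays local; for a name registered on another node the send destination is the tuple `{Name, Node}` instead of the bare name.

```erlang
-module(registry_demo).
-export([run/0]).

run() ->
    Pid = spawn(fun() ->
                    receive {ping, From} -> From ! pong end
                end),
    %% Publish the process under a well-known name on this node.
    true = register(demo_service, Pid),
    %% Anyone on the node can now look the name up...
    Pid = whereis(demo_service),
    %% ...or send to the name directly without knowing the pid.
    demo_service ! {ping, self()},
    receive
        pong -> ok
    after 1000 ->
        timeout
    end.
```

The name is released automatically when the registered process exits, so `run()` can be called repeatedly.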

> Erlang is supposed
> to be for very high-performance systems, 

Not really.

> Apparently there's a distributed 
> database system written in Erlang, but I didn't find any mention of 

Mnesia. They were originally going to call it Amnesia, but someone pointed 
out that naming a database system after an illness characterized by 
forgetting things is probably a bad idea.

It's a cute little system, but it has its flaws due to being implemented on 
top of a single-assignment language and keeping everything in memory. For 
example, if the node crashes and you have a gigabyte of data in the table of 
that process, it has to go through reading all the logged updates and 
applying them to the tree in memory before it'll actually start answering 
queries. I guess if you can afford several machines running the same 
database entirely out of RAM, it's not much of a problem.

It's basically a distributed transactional system on top of the built-in 
database tables whose TLA name escapes me at the moment.
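
The built-in tables are presumably ETS (with DETS as the disk-based variant); that identification is mine rather than the post's. A minimal sketch of the raw table API that the transactional layer sits on, using a made-up module and table name:

```erlang
-module(ets_demo).
-export([run/0]).

run() ->
    %% An in-memory key/value table owned by the calling process.
    Tab = ets:new(demo_table, [set]),
    true = ets:insert(Tab, {alice, 30}),
    true = ets:insert(Tab, {bob, 25}),
    %% Lookups return a list of matching tuples (empty if none).
    Found = ets:lookup(Tab, alice),
    Missing = ets:lookup(Tab, carol),
    true = ets:delete(Tab),
    {Found, Missing}.
```

`run()` returns `{[{alice,30}], []}`. There are no transactions or distribution at this level, which is the part Mnesia adds.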

http://www.erlang.org/doc/man/mnesia.html

Note that all these things, OTP and Mnesia, along with stuff like the 
compiler, the debugger, the profiler, the package manager, etc etc etc, are 
all documented as "here's the functions you use to do it in Erlang." Almost 
none of these things are actually command-line tools.

-- 
Darren New, San Diego CA, USA (PST)
  "How did he die?"   "He got shot in the hand."
     "That was fatal?"
          "He was holding a live grenade at the time."

