You might have heard of BOINC. It powers SETI@Home, World Community
Grid, LHC@home, and many other "volunteer" distributed computing
projects. Even my little POV-Ray renderfarm :)
From the user (volunteer) side, here's how it works: you download the
BOINC client and install it. On first run, it pops up the project-attach
wizard. You enter the URL of the project (or choose from a bundled but
incomplete list), then an email and password to create an account.
And that's it, you're attached. BOINC will ask for workunits, download
the executable needed to process them and the input files specific to each
workunit, process them, and upload the results. You can attach to multiple
projects, and BOINC will get work from all of them.
Users also get "credits" for the CPU cycles they contribute. Since the
beginning of BOINC, credit fairness has been the biggest source of
flamewars :) But I'll spare you the details of that mess.
From the server (project owner) side: Umm, where do I start? It consists
of five daemons and two CGIs, all written in C++. The webpages are made
in PHP. There is an admin web interface, which doesn't really do too
much. Most is controlled with XML configuration and command-line tools.
There's way more to it but I wanted to keep the "introduction" brief.
Rant time.
Once you get to the implementation details... it's *so* ugly. XML used
in a thousand places, but *never* with a real parser (strstr is good
enough for their needs, apparently). I see a potential buffer overflow
in every other line of C++ (see http://ln-s.net/1jqo for a random
example). The PHP code was mostly written by undergrads.
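To show why strstr-style "parsing" bothers me, here is a minimal sketch in Python rather than C++. It is not BOINC's code; the document and the tag names are made up for illustration. It compares a substring search with a real parser on the same input.

```python
import xml.etree.ElementTree as ET

def naive_get(xml_text, tag):
    """strstr-style extraction: first '<tag>' up to the next '</tag>'."""
    open_tag, close_tag = f"<{tag}>", f"</{tag}>"
    start = xml_text.find(open_tag)
    if start == -1:
        return None
    start += len(open_tag)
    end = xml_text.find(close_tag, start)
    return None if end == -1 else xml_text[start:end]

def parsed_get(xml_text, tag):
    """Same lookup with a real parser: a direct child of the root element."""
    elem = ET.fromstring(xml_text).find(tag)
    return None if elem is None else elem.text

# Hypothetical document (not a real BOINC message): a commented-out
# element and an escaped ampersand are enough to confuse the naive version.
DOC = "<reply><!-- <name>old</name> --><name>real</name><n>a &amp; b</n></reply>"

print(naive_get(DOC, "name"), "|", parsed_get(DOC, "name"))
print(naive_get(DOC, "n"), "|", parsed_get(DOC, "n"))
```

The substring version happily returns the commented-out value and leaves the entity escaped, while the parser skips the comment and decodes the entity.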
The GUI-client communication uses XML too, but the protocol
(undocumented) is really a direct dump of internal structures. Have fun
reading the code and figuring out what <state>2</state> really means.
The client communicates with the worker apps using shared memory[1]. XML
over shared memory, to be more precise.
The client-server protocol (mostly undocumented) sometimes seems
*designed* so that alternate implementations of either side are
impossible, or very hard. The client sends its version number, and the
server can be configured to reject requests from old clients. This is in
case the project requires a specific feature that was added in a certain
version. But the comparison isn't made with a "protocol version" or
"runtime version" or anything like that. It's just the core client version.
Suppose BOINC version 4.1 added support for feature A, and version 4.2
added feature B. Projects needing A require 4.1; projects needing B
require 4.2. So far so good. If I write my own client, I'll need to
'fake' the version number I send, to match the features I implemented.
But if I implement B and not A, what version number do I send?
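The A/B problem can be sketched in a few lines of Python. The version check follows the description above; the capability-list check is only my assumption of what an alternative could look like, not something BOINC has. All the names are invented for the example.

```python
# Feature history from the example above: A arrived in 4.1, B in 4.2.
FEATURE_ADDED_IN = {"A": (4, 1), "B": (4, 2)}

def accepts_by_version(client_version, required_feature):
    """Check as described: compare only the core client version."""
    return client_version >= FEATURE_ADDED_IN[required_feature]

def accepts_by_capability(client_features, required_feature):
    """Hypothetical alternative: the client lists the features it supports."""
    return required_feature in client_features

# A third-party client that implements B but not A.
my_client = {"B"}

# Results are (accepted by project needing A, accepted by project needing B).
claims_4_2 = (accepts_by_version((4, 2), "A"), accepts_by_version((4, 2), "B"))
claims_4_0 = (accepts_by_version((4, 0), "A"), accepts_by_version((4, 0), "B"))
by_caps = (accepts_by_capability(my_client, "A"),
           accepts_by_capability(my_client, "B"))

print(claims_4_2, claims_4_0, by_caps)
```

Claiming 4.2 gets the client accepted by a project needing A, which it can't handle. Claiming 4.0 gets it rejected by a project needing B, which it can. Only the capability list gives the right answer for both.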
And... *while* I was writing this post, a message arrived on the
mailing list: an idea for fixing the problems with split cross-project
IDs. It involved adding a timestamp, and storing the CPID and the timestamp
together in the 'cpid' column of the user table, separated by a space.
Even *I* can see that's a bad idea (two values in the same DB field), and I
never took any database classes at a university. BOINC stands for
Berkeley Open Infrastructure for Network Computing. Do they have no
database classes at Berkeley or what??
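For contrast, here is a rough sketch of the boring alternative: one column per value. It uses Python's sqlite3 with an in-memory database. The 'cpid_time' column name, the table layout and the sample values are my own inventions, not BOINC's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: 'cpid_time' is an invented name for the timestamp.
conn.execute(
    "CREATE TABLE user (id INTEGER PRIMARY KEY, cpid TEXT, cpid_time INTEGER)"
)
rows = [("abc123", 1206600000), ("abc123", 1206700000), ("ffee99", 1206650000)]
conn.executemany("INSERT INTO user (cpid, cpid_time) VALUES (?, ?)", rows)

# With separate columns, finding accounts that share a CPID is an exact match...
same_cpid = conn.execute(
    "SELECT id FROM user WHERE cpid = ? ORDER BY id", ("abc123",)
).fetchall()

# ...and the timestamp sorts as a number, with no string splitting anywhere.
oldest = conn.execute(
    "SELECT id FROM user WHERE cpid = ? ORDER BY cpid_time LIMIT 1", ("abc123",)
).fetchone()

# The packed form forces every reader of the column to do this first:
packed_cpid, packed_time = "abc123 1206600000".split(" ")

print(same_cpid, oldest, packed_cpid, int(packed_time))
```

With the packed form, every query that only cares about the CPID has to split the field (or use a prefix match) before it can compare anything.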
*sigh*
Sorry for the long post. I'd need many many paragraphs to explain how
the huge system works. And about twice that to explain how it *doesn't*
work :)
[1]
<PovAddict> I asked about shared memory because core client and science
app talk to each other that way
<rtyler> instead of sockets or pipes?
<PovAddict> yeah
<rtyler> shared memory has to be right below "printing and scanning with
OCR" in terms of favorite means of IPC :P
> Once you get to the implementation details... it's *so* ugly.
Heh. And while I rant here about the ugly implementation details, yet
another flamewar is forming in the SETI@Home forum about the ugly general
stuff. Like credits, why interest drops, how developers don't listen to
users, how there isn't anything interesting in running BOINC, etc.
http://setiathome.berkeley.edu/forum_thread.php?id=46014
Damn long thread, sheesh.
"Nicolas Alvarez" <nic### [at] gmailisthebestcom> wrote in message
news:47ec26d9$1@news.povray.org...
> And... *While* I was writing this post, there was a message on the
> mailing list. An idea for fixing the problems with split cross-project
> ID. It involved adding a timestamp. Storing the CPID and the timestamp
> in the 'cpid' column on the user table, separated by a space.
>
> *I* notice how that's a bad idea (two values in the same DB field). I
> never took any database classes on a university. BOINC stands for
> Berkeley Open Infrastructure for Network Computing. Do they have no
> database classes at Berkeley or what??
Recently heard - "Normalisation's great in theory, but it doesn't work in
practice"
WT#$#$%%^$%^???????