It has been said that Haskell (and languages like it) will forever
remain an interesting curiosity until the day when large, real-world,
*useful* applications exist written in it. Only then will anybody
seriously take any notice.
Erlang has Wings 3D. You don't have to know or care about Erlang to use
Wings. It's impressive in its own right, not because it's Erlang. There
is something to be said for the argument that no programming language
can really be taken seriously until it has at least one real-world
application that people actually use for real stuff.
Somebody somewhere wrote that nobody really gave a damn about Ruby until
Ruby on Rails came along. Now of course *everybody* knows about Ruby.
And apparently every "web framework" must be compared to Rails, because
it is obviously the best framework that has ever existed in the history
of mankind.
I have literally no idea what a "web framework" is. I don't know
anything about Ruby, and I have no idea what Rails actually does. What I
*do* know is that there's been a lot of buzz in the Haskell community
lately about several Haskell web frameworks under aggressive development
right now. Apparently a few people are of the opinion that one of these
might finally be the Killer App to put Haskell on the map. So I thought
I'd take a look...
(As it turns out, since the connection between web browser and web
server is untyped, this becomes an interesting instance of Darren's
"static typing is pointless" problem.)
I started with the implausibly-named (and unpronounceable) "Yesod"
framework.
The very first thing I noticed is that it's *documented*. Understand
this: Most Haskell projects consist of half a dozen source code files.
When you look at the project page, you see a list of all the data
structures and functions that it exports, and nothing more. IF YOU'RE
LUCKY, there might be some human-written exposition saying what each
thing on the list does. Maybe even a tiny snippet of example code. And
that is all.
Part of the problem is that the Haskell build system has hard-wired
support for Haddock, an API reference document generator similar to
Doxygen, JavaDoc, and a bazillion others. Which means that every package
on Hackage.org has an API reference (regardless of whether the author
intended this). If you're lucky, the author will have written some
documentation to go with the API reference. But only if you're lucky.
An API reference - even a *good* one - is no substitute for a
comprehensive *explanation* of what the hell is actually going on. But
Haddock doesn't support that kind of thing, and Hackage has no way to
include such documentation even if you happen to have written it. The
best you can do is host your own website (at your own expense) and
include the URL in the package description.
Yesod *has* its own website, and some pretty extensive documentation.
It's not perfect, but it's vastly better than 99% of Hackage. So I set
about reading all the documentation they provide:
http://www.yesodweb.com/book/
The results of my investigation were... interesting.
First of all, this is no monolithic block of code. It's a vast
arrangement of many separate libraries, and clearly some effort has been
directed towards making it easy to use a different building block for
this function or that function if you so desire. Having said that,
clearly the "default" combination of components is going to be the one
that's most thoroughly tested and takes the least amount of work to use,
out of the box.
Briefly summarising what I've learned, it seems Yesod offers a medium
level of abstraction. There have been other projects that do much more
work for you, at the expense of forcing you to design your application
in a specific way. Yesod seems to be a somewhat lower-level offering,
giving you more flexible options as a consequence, but also making you
work a bit harder.
[For example, there was an experimental system where you wrote
user/computer interactions as transactions in a special monad, so it
looked just like a regular CLI program, but under the hood the framework
auto-generated HTML forms, session cookies, ran a non-relational
database engine to hold user state, handled the back-button by rolling
back transactions automatically, and all kinds of freaky stuff. It's
very impressive that such a thing can be done, but obviously it does
utterly force you to follow the framework designer's vision of The One
True Way that a web application should work.]
The way Yesod works is interesting. By default, it uses its own internal
web server named "Warp". It's written in 100% Haskell, and it's
supposedly extremely fast (hence the name). Like so many things, Warp
wasn't actually developed as part of Yesod, it was just what they chose
to have as their default component.
I gather from the blurb, though, that it's supposed to be possible to
run your application as a CGI binary instead. [Although the
documentation warns you to never, ever do this, because forking a new
process for every connection would be far too unperformant.] Fast-CGI is
also supported, and presumably much faster if you're already running
Apache or something. SCGI also works. You're even supposed to be able to
use WebKit to make "native desktop applications".
The next interesting thing is how content works. On a typical web
server, you have static pages and other resources, and then you have CGI
*scripts*, or possibly various kinds of templates that look like HTML,
but they go through a preprocessor which inserts external data into
certain places within the template.
Yesod does it differently. Templates are Haskell source code. They can
be in separate files or inline in your main program, but either way,
every template gets compiled into executable machine code. This means
that template processing is lightning fast. It also means that every
single damned time you change anything in any template anywhere, you
have to recompile that template, and relink the entire executable.
By default, the Warp web server is used. Since this is written in
Haskell, the end result is that the web server, application logic,
presentation templates and static files are all compiled into one giant
binary executable. When you run this, it opens a TCP port and starts
listening for HTTP requests.
It also means that any time anything in any template changes, that
template must be recompiled, and then the entire program relinked. And
then you have to shut down the running server and fire up the newly
compiled one.
All of which seems... a bit strange, to me. Then again, if you want to
use Apache, you can build your application for Fast-CGI instead. Then
only the application logic and presentation templates get linked into a
giant binary blob, and you don't have to restart the entire server to
update. It still seems like a rather heavyweight approach.
The templates make use of Haskell's "quasi-quoting" feature. This
basically allows you to embed arbitrary strings into Haskell source
code, provided you write a parser. But it also means that those strings
can refer to any Haskell things that are currently in-scope. For
example, you can write a Haskell loop, and the loop body can be a
template which refers to the loop variable.
One of the features the manual makes a big deal of is "typesafe URLs".
Essentially, instead of using strings to refer to things, you create
data structures which represent valid URLs. And I *don't* mean that you
have a data structure which represents the parsed pieces of a URL. I
mean you have a dummy data type called "HomePage", and then a data type
called "UserProfile" with a user ID field, and then a "Topic" with a
topic ID field. And then you tell Yesod how to convert "UserProfile 42"
into a URL, and how to convert a URL into a UserProfile 42 again.
That done, all your templates refer to UserProfile 42. The compiler is
therefore able to statically check, at compile-time, that every
generated URL is valid. You can never accidentally misspell a URL
(misspelling a data type is a compile-time error), you can never forget
a component of a pathname (there *are* no pathnames). If you want to
change the mapping from logical resources to physical URLs, you change
it in one single place. You don't have to search the whole codebase and
change dozens of URLs. About the worst thing that can happen is that you
say UserProfile 138, and there *isn't* a user number 138 at the moment.
(And in that case, you can return an HTTP 404 saying "no such user",
rather than just a generic "file not found".)
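
To make that a bit more concrete, here's a rough sketch of the idea in plain Haskell. This is *not* Yesod's actual API (Yesod generates this sort of thing for you); the names Route, renderRoute and parseRoute, and the "/user/42" URL layout, are all made up for illustration.

```haskell
import Data.List (intercalate)
import Text.Read (readMaybe)

-- Hypothetical route type: one constructor per logical resource.
data Route
  = HomePage
  | UserProfile Int
  | Topic Int
  deriving (Eq, Show)

-- The single place where logical resources are mapped to URL paths.
renderRoute :: Route -> String
renderRoute route = "/" ++ intercalate "/" (segments route)
  where
    segments HomePage        = []
    segments (UserProfile n) = ["user", show n]
    segments (Topic n)       = ["topic", show n]

-- Split a path on '/', dropping empty pieces.
splitPath :: String -> [String]
splitPath s = case break (== '/') s of
  (piece, [])       -> [piece | not (null piece)]
  (piece, _ : rest) -> [piece | not (null piece)] ++ splitPath rest

-- The inverse mapping. Nothing means "no such resource" (i.e., a 404).
parseRoute :: String -> Maybe Route
parseRoute path = case splitPath path of
  []           -> Just HomePage
  ["user", n]  -> UserProfile <$> readMaybe n
  ["topic", n] -> Topic <$> readMaybe n
  _            -> Nothing
```

With this, renderRoute (UserProfile 42) gives "/user/42", and writing UserProfle 42 by mistake is a compile-time error rather than a dead link. Changing the URL layout means editing renderRoute and parseRoute and nothing else.
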
In a similar way, the mapping from URLs to "handlers" (the code that
actually decides what data to return for a given URL) is declarative,
not imperative. You specify which URL maps to what data type, and what
handler. If URLs overlap, that is a compile-time error. Any URL you
don't provide a mapping for is HTTP 404.
Similar to all this, Yesod uses the type system to distinguish between
raw text which needs to be escaped before being inserted into HTML / CSS
/ JS (with the correct escaping rules for each one), and text which is
already in such a format (and must *not* be escaped, because that would
mangle it). So there's one syntax for inserting stuff into a web page,
and the type system detects whether it needs to be escaped first.
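
As far as I can tell, the trick boils down to something like the following. Again, this is a minimal sketch in plain Haskell and not Yesod's real types; RawText, Html, ToHtml and paragraph are names I've invented, and it only handles HTML escaping (not CSS or JS).

```haskell
-- Hypothetical wrappers: text that still needs escaping, and markup
-- that is already safe to emit as-is.
newtype RawText = RawText String
newtype Html    = Html String deriving (Eq, Show)

-- Anything that can be interpolated into a page.
class ToHtml a where
  toHtml :: a -> Html

-- Raw text is escaped on the way in.
instance ToHtml RawText where
  toHtml (RawText s) = Html (concatMap escapeChar s)

-- Existing markup passes through untouched.
instance ToHtml Html where
  toHtml = id

escapeChar :: Char -> String
escapeChar '<'  = "&lt;"
escapeChar '>'  = "&gt;"
escapeChar '&'  = "&amp;"
escapeChar '"'  = "&quot;"
escapeChar '\'' = "&#39;"
escapeChar c    = [c]

-- One "insert this here" function; the argument's type decides
-- whether escaping happens.
paragraph :: ToHtml a => a -> Html
paragraph x = Html ("<p>" ++ body ++ "</p>")
  where
    Html body = toHtml x
```

So paragraph (RawText "a < b") escapes the angle bracket, while paragraph (Html "<b>hi</b>") leaves the tags alone, and the call site looks the same in both cases.
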
(Unfortunately, there's a different syntax for inserting URLs. Plus
there's half a dozen slightly different templating languages, and the
differences are poorly explained. Still, "whamlet". I thought only XKCD
talked about WHAM! anymore...)
Amusingly, there's a system to run a template, take an MD5 hash of the
inputs to it, and save the output to disk so it can be cached on the
server side for faster delivery, and on the client side for reduced
server load.
Yesod provides other goodies. For example, it has a "widget" concept.
The idea appears to be that you can design, say, a date picker, and
make that a library. You can then reuse this wherever you want it. In
particular, you can have two date pickers on the same page.
Why can't you do that already? Well, problem #1: a widget like this
probably consists of an HTML form plus some CSS to style it plus some
JavaScript to run it. These three elements go in different parts of the
page. Yesod's widget concept handles taking lots of widgets and putting
all the CSS in one place, all the JS in another place, etc. Problem #2:
name collisions. Again, Yesod lets you generate unique IDs to get around
that. (E.g., for form names, variable names, etc.)
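
Here's my rough mental model of how that could work, as a sketch in plain Haskell. None of this is Yesod's actual widget API; Widget, datePicker, withFreshIds and renderPage are hypothetical names, and the initPicker JavaScript function is assumed to exist somewhere.

```haskell
-- Hypothetical widget: the three kinds of output are kept apart so
-- they can each end up in the right part of the page.
data Widget = Widget
  { wCss  :: [String]
  , wJs   :: [String]
  , wBody :: [String]
  }

-- Combining two widgets just concatenates each part separately.
instance Semigroup Widget where
  Widget c1 j1 b1 <> Widget c2 j2 b2 =
    Widget (c1 ++ c2) (j1 ++ j2) (b1 ++ b2)

instance Monoid Widget where
  mempty = Widget [] [] []

-- A date picker parameterised by its element id, so that two of them
-- can share a page without colliding.
datePicker :: String -> Widget
datePicker ident = Widget
  { wCss  = ["#" ++ ident ++ " { border: 1px solid; }"]
  , wJs   = ["initPicker('" ++ ident ++ "');"]
  , wBody = ["<input id=\"" ++ ident ++ "\" type=\"date\">"]
  }

-- Hand out the ids "w1", "w2", ... to a list of widget builders.
withFreshIds :: [String -> Widget] -> Widget
withFreshIds builders = mconcat (zipWith ($) builders ids)
  where
    ids = map (\n -> "w" ++ show n) [(1 :: Int) ..]

-- All the CSS in one place, all the JS in another, then the body.
renderPage :: Widget -> String
renderPage w = concat
  [ "<head><style>", concat (wCss w), "</style>"
  , "<script>", concat (wJs w), "</script></head>"
  , "<body>", concat (wBody w), "</body>"
  ]
```

Then renderPage (withFreshIds [datePicker, datePicker]) produces a page with two pickers, with ids w1 and w2.
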
Personally, I think Yesod's widgets are a nice idea, but a tad minimal.
They probably need more functionality to really be useful.
Widgets aren't just for forms, of course. You can use them for navbars
or news panels or anything you like. And then Yesod has a dedicated
forms module, which is meant to allow you to take existing forms and
transparently glue them together to make bigger forms. Again, I'm
unconvinced of how well it really works, but I admire the idea.
There's also a "session" feature which lets you store key/value pairs in
a cookie. The interesting part is that the cookie is encrypted with a
key only the server knows (so clients can't see what's in it), and
signed (so clients can't change the contents). It adds overhead, but I
don't think I've ever seen it this easy before...
Then there's the "persistent" framework, where you write some Haskell
data structure definitions, and Yesod generates SQL to create the
database, and boilerplate code for converting SQL data into Haskell data
and vice versa. [Actually, it supports "no SQL" databases too. And comes
with a default "in-memory database" for test purposes.] Now if you try
to put a string as the user ID value, you get a compile-time error,
because that field *should* be an integer.
The interesting question, of course, is "what happens if the schema
changes?" It seems Yesod is *assuming* that the schema will never change
unless Yesod changes it. (I presume I don't need to point out how
jaw-droppingly flawed such an assumption is.) Yesod provides a feature
to let you do a "migration". And by that, I mean that Yesod will load up
the current database schema, compare to what your compiled application
is expecting the schema to be, and then issue a bunch of ALTER TABLE
commands to make the database match what the application expects.
If you put a migration statement at the start of your application, then
each time your application starts up, it will change the database to
match the schema it expects. That means if you alter your application so
the schema changes, when you recompile and run the new version, the
database is automatically updated before the application starts running.
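
A toy version of that compare-and-fix step might look like this. To be clear, this is my own sketch of the general idea, not what Yesod really does under the hood; Schema, migrate and extraColumns are made-up names, schemas are just lists of (column name, SQL type) pairs for a single table, and the ALTER COLUMN ... TYPE syntax assumes something PostgreSQL-like.

```haskell
-- Hypothetical schema model: column name paired with SQL type.
type Column = (String, String)
type Schema = [Column]

-- Compare what the database has with what the application expects,
-- and emit the statements needed to add or retype columns.
migrate :: String -> Schema -> Schema -> [String]
migrate table actual expected = concatMap fix expected
  where
    fix (name, ty) = case lookup name actual of
      Nothing -> [alter ("ADD COLUMN " ++ name ++ " " ++ ty)]
      Just ty'
        | ty' == ty -> []
        | otherwise -> [alter ("ALTER COLUMN " ++ name ++ " TYPE " ++ ty)]
    alter clause = "ALTER TABLE " ++ table ++ " " ++ clause

-- Columns the database has but the application knows nothing about.
-- This sketch only reports them; it never drops anything.
extraColumns :: Schema -> Schema -> [String]
extraColumns actual expected =
  [name | (name, _) <- actual, name `notElem` map fst expected]
```

Note what happens if a column is renamed from "fullname" to "name": migrate just sees a missing "name" column, and extraColumns reports an unexplained "fullname". Nothing in the two schemas says they're the same field.
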
While this *does* take care of part of the problem, it's by no means a
perfect solution. Immediate problems include:
- There are situations where Yesod can't correct the database
automatically. (The main one being where a field is renamed. Yesod has
no idea what to rename, just by looking at it.) Still, at least the
application will halt at the migration step, rather than running for a
few days and *then* crashing in the middle of a rarely-used query.
- Just blindly munging the schema as you see fit is fine for small toy
applications. I wouldn't dare do such a thing in a production
environment, however. Perhaps the extra fields in the DB but not in
Haskell are used by somebody else? We really shouldn't just go and
delete them just like that...
- ...which brings us to the fundamental problem with the whole design
here. It fundamentally assumes that the database is *only* for this one
application. Now it seems to have somehow been lost to history, but I
have C. J. Date on my bookshelf, and he tells me that part of the
definition of a "database" is that it is a store of data accessed by
MORE THAN ONE APPLICATION. So no single application should just blindly
alter the schema as it sees fit. That's why SQL databases have things
like views; it lets different programs see different parts of the
database, without having to rejigger every application every time the
data changes, nor alter the data when any application changes. That's
one of the fundamental reasons to use a proper database rather than a
plain flat file!
Still, given the current craze for "no-SQL databases" (i.e., "we don't
need no stinking coherent theory of operation!"), I guess this point is
lost on most designers...
Given this assumption, it comes as no surprise that Yesod hasn't even
considered the possibility that a schema might change at run-time. But
then again, if it does, what can you actually do about it? Either the
tables and fields you need are present, or they aren't. If they aren't,
that's pretty much *got* to be a fatal run-time error.
About the only schema change that looks potentially survivable is if a
field changes type. If Haskell is only moving that data from A to B,
then arguably if Haskell *didn't* check its type, this change could be
survivable. Then again, I'm struggling to think of a scenario where the
type wouldn't matter and just calling it "text" or "raw binary" wouldn't
be OK.
Having digested all of that, I went and took a look at Happstack:
http://happstack.com/docs/crashcourse/index.html
(The "Haskell Application Server Stack".)
A surprising amount of the details were similar in concept or downright
*identical* to Yesod. (In some cases, it's actually using the same
components for various jobs.) For example, typesafe URLs make another
appearance, although now with imperative routing rather than declarative.
While Yesod defaults to using the Shakespeare series of template
languages, Happstack appears to be quite happy for you to use
Shakespeare, BlazeHtml, HString, Heist or several others.
Shakespeare uses fancy syntax resembling normal HTML / CSS / JS and
compiler trickery to convert this into regular Haskell code. BlazeHtml
uses regular Haskell code in the first place. Heist, on the other hand,
has templates which are just regular HTML files, and it searches them
for markers where dynamic content should be inserted. In other words,
templates are just regular HTML that any web designer would understand,
and they can be changed without recompiling (or restarting) your
application server.
[Incidentally, Heist itself is part of "Snap", yet *another* Haskell web
framework, which I have yet to properly research...]
Then there's JMacro, which allows you to embed JavaScript directly in
your Haskell source file, and have it syntax-checked at compile-time.
(I.e., the final JavaScript sent to the web browser will never have
syntax errors; these errors are caught at Haskell compile-time.) Plus of
course, it lets you splice chunks of JS together, generate unique
variable or function names, insert Haskell data into JS, and so on.
As an extra, JMacro allows Haskell-style anonymous function definitions.
Quite why the hell that's useful I'm not sure. Oh, and also ML-style
functions, which are much more wordy. WTF?
So there you have it. It seems there's quite a few web frameworks out
there now. And I just can't help noticing that *all* of the example code
I've looked at is really very, very long and difficult to follow,
considering the trivial amount of functionality it actually
implements... o_O