|
![](/i/fill.gif) |
OK, so here's an application program which, I presume, I don't need to
explain to you guys. ;-)
How do you do this in Java? Well, first you're going to do something like
public class Vector3
{
public final double X, Y, Z;
public Vector3(double x, double y, double z) {X = x; Y = y; Z = z;}
public Vector3 Add(Vector3 other) {...}
...
}
Now we have vector arithmetic. Next you'll probably do something similar
for RGB colour values as well. OK, so that's the basics covered. Next,
you probably want something like
public class Ray
{
public final Vector3 Start, Direction;
public Ray(Vector3 s, Vector3 d) {Start = s; Direction = d;}
public Vector3 Point(double t)
{
return Start.Add(Direction.Scale(t));
}
}
Now you'll probably sit down and do
public abstract class Shape
{
public abstract boolean TestIsect(Ray r);
public abstract double FirstIsect(Ray r);
public abstract double[] AllIsect(Ray r);
public abstract Vector3 Normal(Vector3 p);
public abstract boolean Inside(Vector3 p);
}
The most fundamental thing you need to do to a shape is ray intersection
tests. Most of the time, you just want the nearest intersection, so
that's FirstIsect(). However, to implement CSG, you need to obtain /all/
ray intersections, hence AllIsect(). You also need to know if the points
of intersection on one shape are inside or outside some other shape,
hence Inside(). Lighting calculations invariably require the surface
normal at a given point [approximately] on the surface, so Normal(). You
also sometimes use a Shape as a bounding volume, hence TestIsect(),
which tests for an intersection without bothering to compute where it
actually is.
So far, so good. We can now write out a whole bunch of Shape subclasses
- planes, spheres, cones, etc. I won't bore you with the details. Next,
let's have a camera:
public abstract class Camera
{
public abstract Ray Shoot(double x, double y);
}
Given a point on the image plane, generate a camera ray. Nice and easy.
We can now do things like
public class Perspective extends Camera
{
public final Vector3 V0, Vx, Vy, Vz;
public Camera(Vector3 v0, Vector3 x, Vector3 y, Vector3 z)
{V0 = v0; Vx = x; Vy = y; Vz = z;}
public Ray Shoot(double x, double y)
{
Vector3 v1 = Vx.Scale(x);
Vector3 v2 = Vy.Scale(y);
Vector3 v3 = Vx.Add(Vy).Add(vz).Normalise();
return new Ray(V0, v3);
}
}
OK, so far, we have shapes. In the ray tracer I designed way, way back
in 1998 or whenever it was, an "object" had a "shape" and a "texture",
and a texture maps every point in space to a "surface", which describes
the rendering properties of the object. So we have:
public abstract class Texture
{
public abstract Surface GetSurface(Vector3 p);
}
public abstract class Surface
{
public abstract Colour ProcessIsect(Scene s, Vector3 hit, Vector3 in);
}
What I then did was to define several Surface subclasses:
- AmbientSurface always returns a user-defined Colour.
- DiffuseSurface uses Lambert's cosine rule to scale a user-defined
Colour according to the cosien of the angle of light from all the light
sources in the Scene. (It also performs shadow tests.)
- ReflectSurface implements perfect specular reflection.
- SumSurface computes the sum of several surfaces. In this way, an
object can have both reflection /and/ diffuse terms.
When I got to textures, I really had a party. First, there is obviously
ConstantTexture, which always returns a user-defined Surface for every
point in space. But then I defined various kinds of "map" classes:
- PointScalar maps every point in space to a scalar.
- PointVector maps every point in space to a unital vector.
- ColourMap maps scalars to colours.
- PointColour combines a PointScalar and a ColourMap to map every point
in space to a colour.
I had basically planar and spherical maps. Then I had various classes
that transform the coordinates being input to a map. For example, if you
take a spherical map and "slice" it through the middle, it becomes a
conical map. I also had an elaborate system whereby you could use a
PointScalar to scale a PointVector and use that to displace the
coordinates being fed to some other mapping. In this way, you could take
a simple planar map and sine-modulate it on multiple axes to generate
complicated swirly patterns.
You then have things like DiffuseTexture, which uses a PointColour to
assign a colour to every point in space, and produce a DiffuseSurface
with that colour. (Ditto for AmbientTexture, ReflectTexture, etc.) And,
obviously, CheckerTexture, which takes two Texture objects and picks
which one to return for different points in space. Or a Texture that
linearly interpolates between several subordinary Textures based on a
ScalarMap. Or...
Implementing light sources is fiddly. It depends on whether you just
want point lights, or whether you want area lights. But let's keep it
simple and stick to point lights only.
public class Light
{
public final Vector3 Position;
public final Colour Colour;
public Light(Vector3 p, Colour c) {Position = p; Colour = c;}
}
Now a Scene is fairly simple; it holds lights, camera, and action. It
also seems like a reasonable place to stick things like the TraceRay()
function:
public class Scene
{
public final Camera Camera;
public final Object[] Objects;
public final Light[] Lights;
public Scene(...)
public Colour TraceRay(Ray r)
{
double[][] ts = new double[Objects.length];
for (int lp=0; lp<ts.length; lp++)
ts = Objects[lp].FirstIsect(ray);
Object o;
double t = 1e100;
for (int x=0; x<ts.length; x++)
for (int y=0; y<ts[x].length; y++)
if (ts[x][y] > 0 && ts[x][y] < t)
{t = ts[x][y]; o = Objects[x];}
Vector3 p = o.GetShape.Point(t);
Surface s = o.GetTexture().GetSurface(p);
return s.ProcessIsect(this, p, r.Direction);
}
}
Now just add some classes that implement mapping pixel coordinates to
image plane coordinates (taking care to invert Y), generating camera
rays, feeding them to TraceRay(), maybe antialiasing, and writing the
results to disk and/or screen. And there, that's your ray tracer.
My original ray tracer actually included slightly more code, because it
let you inspect the scene at run-time and get human-readable
descriptions for various things. (E.g., Shape had a GetName() function,
and ShapeSphere.GetName() returns "Sphere".) I never actually got as far
as /using/ any of this infrastructure, however. And actually, I didn't
build many scenes, because... well, look at it:
Scene out = new Scene();
Shape sh = new ShapeSphere(new Vector3(0, 0, 0), 1);
Surface su = new DiffuseSurface(Colour.RED);
Object ob = new Object(sh, new ConstantTexture(su));
out.AddObject(ob);
Shape sh = new ShapePlane(new Vector3(0, 1, 0), -2);
Surface su = new DiffuseSurface(Colour.GREEN);
Object ob = new Object(sh, new ConstantTexture(su));
out.AddObject(ob);
Shape sh = new ShapePlane(new Vector3(0, -1, 0), -2);
Surface su = new DiffuseSurface(Colour.BLUE);
Object ob = new Object(sh, new ConstantTexture(su));
out.AddObject(ob);
out.SetCamera(new PerspectiveCamera(Vector3(0, 0, -5), Vector3(1, 0,
0), Vector3(0, 1, 0), Vector3(0, 0, 2.5));
out.AddLight(new Vector3(0, 0, -5), Colour.WHITE);
This places a piffling three objects and one light source. YUCK! You can
see why I didn't write many test scenes...
So how does all this look in Haskell?
First of all, we need vectors.
data Vector3 = Vector3 !Double !Double !Double
class Num Vector3 where
(Vector3 x1 y1 z1) + (Vector3 x2 y2 z2) = Vector3 (x1+x2) (y1+y2)
(z1+z2)
...
Similarly, we need colours. Unlike Java, we can actually use real
grown-up operator names, so we don't have to write monstrosities like
return v1.Add(v2).Add(v3).Normalise();
Instead we can simply say
normalise (v1 + v2 + v3)
which is far easier to read. It also means stackloads of existing
numeric functions automatically work with vectors (e.g., the sum
function can sum a list of vectors - or a list a colours, which turns
out to be more useful).
Note, also, that we don't need to do this stupid
public Vector3(double x, double y, double z) {X = x; Y = y; Z = z;}
and then write
new Vector3(1, 2, 3)
every time we want a vector. Instead, you can simply do
Vector3 1 2 3
to create a vector.
Similarly, look back at the Java Ray class. See how much boilerplate
code there is for declaring that the fields are public and constant, and
for constructing a new object, and so forth. Now look at Haskell:
data Ray = Ray {ray_start, ray_direction :: Vector3}
ray_point :: Ray -> Double -> Vector3
ray_point (Ray s d) t = s + d |* t
Much more compact, and yet much more /readable/ at the same time.
These, of course, are mere trifles. Let us get to the real meat of the
problem. Now, we need to implement shapes somehow. The /obvious/ thing
to do is copy the Java version; define a class for shapes, and then
create various data structures representing concrete shapes:
class Shape s where
isect :: s -> Ray -> [Double]
normal :: s -> Vector3 -> Vector3
inside :: s -> Vector3 -> Bool
data Sphere = Sphere {center :: Vector3, radius :: Double}
instance Shape Sphere where
isect (Ray s d) (Sphere c r) = ...
data Plane = Plane {normal :: Vector3, offset :: Double}
instance Shape Plane where
isect (Ray s d) (Plane n o) = ...
Notice the isect function. In Java, we have /three/ functions for
efficiency; one function only bothers to compute /whether/ there are any
intersections, another that only bothers to compute the /first/
intersection, and another which computes /all/ intersections.
Haskell's lazy evaluation makes this quite unnecessary. All we need is
/one/ function which returns all intersections in a lazy list.
Inspecting whether the list is empty evaluates just enough of isect to
answer this question, and no more. And obviously, accessing only the
first solution does not cause any further solutions to be computed. Note
we're actually doing /better/ than Java: If our CSG happens, at
run-time, to need the first /three/ intersections but no further, then
only these are computed.
(In fairness, we could use an Iterator in Java to achieve the self-same
thing. The hasNext() function would enable intersection tests without
computing the actual location, and next() would compute one additional
intersection. It would be really quite a lot of work to set up this much
lazy evaluation manually, though. Haskell gives it to us for free!)
You can certainly do things this way. There is a snag, however. If you
stick to Haskell 2010 (the currently published official language
specification), you cannot easily make a list of shapes. Because all the
elements of a list have to be of identical types. And (for example)
Sphere is not the same type as Plane, even if they do both implement the
Shape class (and possibly other classes like Eq or Show). There is no
way to say "a list of everything that has these methods".
GHC adds such a way. It's called
{-# LANGUAGE ExistentialQuantification #-}
It could more accurately be called "existential /type/ quantification",
or better yet "hidden type variables". But whatever. It lets you write
data AnyShape = forall s. Shape s => Wrap s
With this incantation, Sphere is still a different type from Plane, but
if you take a Sphere object and apply Wrap to it (or any other type that
implements Shape), it becomes an AnyShape type. So you cannot write
[sphere 0 1, plane 1 0]
but you /can/ write
[Wrap (sphere 0 1), Wrap (plane 1 0)]
and this is valid. Notice, however, that it is impossible to write an
"unwrap" function. Because once you "wrap" something, its original type
information is lost forever, so there's no way in hell to know what type
to cast it back to.
(More precisely, all type information is lost at compile-time, just like
in C or Pascal. It's called "type erasure".)
You can't write an unwrap function, but what you /can/ do is make
AnyShape itself an instance of the Shape class:
instance Shape AnyShape where
isect (AnyShape s) r = isect s r
...
Now we still can't unwrap an AnyShape, but we don't need to. We can just
treat AnyShape like a normal Shape instance.
You can't do any of this in Haskell 2010. "forall" is not a valid
keyword there. The best you can do in Haskell 2010 is to define
data AnyShape =
AS_Sphere Sphere |
AS_Plane Plane |
...
Every time you define a new Shape instance, you need to update this
giant monolithic data declaration. Yuck!
But wait... Actually, there is a simpler way. What do we want to /do/
with our list of shapes? All we /actually/ want to do is execute shape
methods. But because functions are first-class in Haskell, we can
actually solve this in an unusual way. Rather than storing a data
structure representing the shape and providing a way to statically look
up the correct ray-intersection function, we can just STORE THE
INTERSECTION FUNCTION!
Well, actually we've got /three/ functions. But we can make a data
structure which stores all three:
data Shape =
Shape
{
shape_isect :: Ray -> [Double]
shape_normal :: Vector3 -> Vector3
shape_inside :: Vector3 -> Bool
}
sphere :: Vector3 -> Double -> Shape
sphere center radius =
Shape
{
shape_isect = (\ ray -> ...),
shape_normal = (\ point -> vnormalise (point - center)),
shape_inside = (\ point -> vmag (point - center) < radius)
}
plane :: Vector3 -> Double -> Shape
plane normal offset =
Shape
{
shape_isect = (\ ray -> ...),
shape_normal = (\ _ -> normal),
shape_inside = (\ point -> normal `vdot` point < offset)
}
Now a "shape" is simply an ordinary data structure, which contains some
function pointers. And each sort of shape - a sphere, a cone, whatever -
is an ordinary /function/ which fills out this data structure with the
right function pointers. In particular, every type of shape now has the
same type signature. There is no Sphere type, no Plane type, no Cone
type, there is only a Shape type. So now we can write
[sphere 0 1, plane 1 0]
and have it be well-typed. (It's [Shape].)
One thing about this transformation: It is now impossible to "look at" a
shape and know anything about it. Before, when each shape was
represented as a data structure, you could look at (say) a sphere and
see where its center is or what its radius is. Now you can't do that.
After the shape is created, it is impossible to do anything with it
except ray intersection tests.
That might be a problem if, for example, you wanted to construct a
bounding volume or something. OTOH, you could just add some more fields
containing function pointers for supporting such volume construction. In
general, whatever abilities you need shapes to have, you add more fields
to Shape to support that.
Notice that with our fancy AnyShape wrapping technique, you /also/ lose
the ability to do anything except what the Shape class provides. In
fact, GHC internally implements that particular trick by doing exactly
what we're manually doing here - storing a dictionary of function
pointers. But doing it by hand means you don't need any unofficial
language extensions.
OK, so we can solve the shape problem. Now how about a camera?
Well, this is where the fun /really/ starts. A camera simply maps an
image plane coordinate to a ray. So in Haskell, a camera is /just a
function/!
type Camera = Vector2 -> Ray
Similarly, a Texture becomes utterly trivial:
type Texture = Vector3 -> Surface
In general, and Java class that has "just one method" can be turned into
a plain vanilla Haskell function. That means that Surface is just a
function, although we need to think carefully about what arguments it
needs. As it turns out, Surface needs quite a few items of data, all of
similar types. Since function arguments are unnamed in Haskell, it's
probably a good idea to define a data structure. (Otherwise we'll
forever be supplying the arguments in the wrong order and causing weird
bugs that the compiler can't catch.)
data RayIsect =
RayIsect
{
isect_point :: Vector3,
isect_direction :: Vector3,
isect_normal :: Vector3
}
type Surface = Scene -> RayIsect -> Colour
Notice, again, that we can fill out /all/ fields, and then lazy
evaluation will skip any calculations we don't need. In particular,
Surface types that don't depend on the surface normal can skip the
surface normal calculation.
With the definitions above, a "constant texture" becomes so trivial it's
barely worth defining a name for it. You can just write "const s" and
you're done. Let's look at some simple instances:
cam_orthographic :: Vector3 -> Vector3 -> Vector3 -> Vector3 -> Camera
cam_orthographic v0 vx vy vz =
\ (Vector2 x y) = Ray {start = v0 + x *| vx + y *| vy, direction = vz}
cam_perspective :: Vector3 -> Vector3 -> Vector3 -> Camera
cam_perspective vx vy vz =
\ (Vector2 x y) = Ray {start = v0, direction = vnormalise (x *| vx
+ y *| vy + vz)}
sur_ambient :: Colour -> Surface
sur_ambient c = \ _ _ -> c
sur_diffuse :: Colour -> Surface
sur_diffuse c =
\ scene isect ->
let
fn light =
if shadow_test scene light isect
then cBlack
else c *| (light_point light `vdot` isect_point isect)
colours = map fn (scene_lights scene)
in sum colours
sur_reflect :: Colour -> Surface
sur_reflect c =
\ scene isect ->
let
v_in = isect_direction isect
v_norm = isect_normal isect
v_out = v_in - 2 * (v_in `vdot` v_norm) *| v_norm
in trace_ray scene (Ray {start = isect_point isect, direction =
v_out})
Recall that in Java, ever single one of these things would be an entire
class, with the "public class Woo extends Wah" and the field declaration
and the constructor declaration, and only THEN do we get to writing the
bit of code that actually does something useful.
In Haskell, we just write the useful bit. All of the stuff about is
useful code, and no filler. It's all stuff implementing actual maths,
not micromanaging object construction or field initialisation or whatever.
So if you can't access object internals, how can you implement spatial
transformations? Easy: You transform the coordinates before input.
transform_texture :: Transform -> Texture -> Texture
transform_texture transform texture =
\ point -> texture (transform point)
Now wasn't that easy? Compare Java:
public class TransformTexture extends Texture
{
public final Transform Trans;
public final Texture Text;
public TransformTexture(Transform t1, Texture t2)
{Trans = t1; Text = t2;}
public Surface GetSurface(Vector3 p)
{
return Text.GetSurface(Trans.Apply(p));
}
}
What a load of waffle! Especially when you consider that a typical
Haskeller would probably write
transform_texture :: Transform -> Texture -> Texture
transform_texture = flip (.)
and be done with it!
(This works because this convoluted-sounding concept of "apply a
coordinate transformation to the input before passing it to the texture"
is really nothing other than /function composition/, which is a
well-known concept in Haskell. A coordinate transformation maps a point
to a point, a texture maps a point to a Surface. Combining these steps
is clearly function composition. When you think about it like that...
suddenly it's /obvious/ that this is a trivial thing.)
When you start thinking of (say) a coordinate transformation as a
trivial function rather than as some complex active "object" which has
"behaviour", "state" and "identity" which needs "constructors" and
"destructors" and "gettters" and "setters" and "reflection" and... Well,
suddenly all your problems start to look a hell of a lot simpler.
Take a look back at all those "map" classes I had. In Haskell, each of
these is merely a function. It's hardly worth assigning names to them.
Indeed, in Java I had trouble picking suitably descriptive names for all
the combinations and variations I could come up with. In Haskell, I
don't need to bother. If I want to deal with functions that map points
to scalars, I can just /call/ them "functions that map points to
scalars". Which is actually less confusing then naming them ScalarMap or
whatever. It more succinctly conveys what they do.
What have I just said? The short summary is that Haskell, the finest
functional programming language in the land, is superior to Java, one of
the more sucky OOP languages. Not exactly a revelation, is it? I think
I'm going to go outside for a while...
Post a reply to this message
|
![](/i/fill.gif) |