POV-Ray: Newsgroups: povray.off-topic: Haskell vs Java: Building a ray tracer: Haskell vs Java: Building a ray tracer

POV-Ray : Newsgroups : povray.off-topic : Haskell vs Java: Building a ray tracer : Haskell vs Java: Building a ray tracer		Server Time 29 Jul 2024 02:29:38 EDT (-0400)
From: Invisible
Date: 28 Jun 2012 06:07:18
Message: <4fec2cd6@news.povray.org>
OK, so here's an application program which, I presume, I don't need to 
explain to you guys. ;-)

How do you do this in Java? Well, first you're going to do something like

   public class Vector3
   {
     public final double X, Y, Z;

     public Vector3(double x, double y, double z) {X = x; Y = y; Z = z;}

     public Vector3 Add(Vector3 other) {...}
     ...
   }

Now we have vector arithmetic. Next you'll probably do something similar 
for RGB colour values as well. OK, so that's the basics covered. Next, 
you probably want something like

   public class Ray
   {
     public final Vector3 Start, Direction;

     public Ray(Vector3 s, Vector3 d) {Start = s; Direction = d;}

     public Vector3 Point(double t)
     {
       return Start.Add(Direction.Scale(t));
     }
   }

Now you'll probably sit down and do

   public abstract class Shape
   {
     public abstract boolean TestIsect(Ray r);
     public abstract double FirstIsect(Ray r);
     public abstract double[] AllIsect(Ray r);
     public abstract Vector3 Normal(Vector3 p);
     public abstract boolean Inside(Vector3 p);
   }

The most fundamental thing you need to do to a shape is ray intersection 
tests. Most of the time, you just want the nearest intersection, so 
that's FirstIsect(). However, to implement CSG, you need to obtain /all/ 
ray intersections, hence AllIsect(). You also need to know if the points 
of intersection on one shape are inside or outside some other shape, 
hence Inside(). Lighting calculations invariably require the surface 
normal at a given point [approximately] on the surface, so Normal(). You 
also sometimes use a Shape as a bounding volume, hence TestIsect(), 
which tests for an intersection without bothering to compute where it 
actually is.

So far, so good. We can now write out a whole bunch of Shape subclasses 
- planes, spheres, cones, etc. I won't bore you with the details. Next, 
let's have a camera:

   public abstract class Camera
   {
     public abstract Ray Shoot(double x, double y);
   }

Given a point on the image plane, generate a camera ray. Nice and easy. 
We can now do things like

   public class Perspective extends Camera
   {
     public final Vector3 V0, Vx, Vy, Vz;

     public Camera(Vector3 v0, Vector3 x, Vector3 y, Vector3 z)
     {V0 = v0; Vx = x; Vy = y; Vz = z;}

     public Ray Shoot(double x, double y)
     {
       Vector3 v1 = Vx.Scale(x);
       Vector3 v2 = Vy.Scale(y);
       Vector3 v3 = Vx.Add(Vy).Add(vz).Normalise();
       return new Ray(V0, v3);
     }
   }

OK, so far, we have shapes. In the ray tracer I designed way, way back 
in 1998 or whenever it was, an "object" had a "shape" and a "texture", 
and a texture maps every point in space to a "surface", which describes 
the rendering properties of the object. So we have:

   public abstract class Texture
   {
     public abstract Surface GetSurface(Vector3 p);
   }

   public abstract class Surface
   {
     public abstract Colour ProcessIsect(Scene s, Vector3 hit, Vector3 in);
   }

What I then did was to define several Surface subclasses:

- AmbientSurface always returns a user-defined Colour.
- DiffuseSurface uses Lambert's cosine rule to scale a user-defined 
Colour according to the cosien of the angle of light from all the light 
sources in the Scene. (It also performs shadow tests.)
- ReflectSurface implements perfect specular reflection.
- SumSurface computes the sum of several surfaces. In this way, an 
object can have both reflection /and/ diffuse terms.

When I got to textures, I really had a party. First, there is obviously 
ConstantTexture, which always returns a user-defined Surface for every 
point in space. But then I defined various kinds of "map" classes:

- PointScalar maps every point in space to a scalar.
- PointVector maps every point in space to a unital vector.
- ColourMap maps scalars to colours.
- PointColour combines a PointScalar and a ColourMap to map every point 
in space to a colour.

I had basically planar and spherical maps. Then I had various classes 
that transform the coordinates being input to a map. For example, if you 
take a spherical map and "slice" it through the middle, it becomes a 
conical map. I also had an elaborate system whereby you could use a 
PointScalar to scale a PointVector and use that to displace the 
coordinates being fed to some other mapping. In this way, you could take 
a simple planar map and sine-modulate it on multiple axes to generate 
complicated swirly patterns.

You then have things like DiffuseTexture, which uses a PointColour to 
assign a colour to every point in space, and produce a DiffuseSurface 
with that colour. (Ditto for AmbientTexture, ReflectTexture, etc.) And, 
obviously, CheckerTexture, which takes two Texture objects and picks 
which one to return for different points in space. Or a Texture that 
linearly interpolates between several subordinary Textures based on a 
ScalarMap. Or...

Implementing light sources is fiddly. It depends on whether you just 
want point lights, or whether you want area lights. But let's keep it 
simple and stick to point lights only.

   public class Light
   {
     public final Vector3 Position;
     public final Colour  Colour;

     public Light(Vector3 p, Colour c) {Position = p; Colour = c;}
   }

Now a Scene is fairly simple; it holds lights, camera, and action. It 
also seems like a reasonable place to stick things like the TraceRay() 
function:

   public class Scene
   {
     public final Camera   Camera;
     public final Object[] Objects;
     public final Light[]  Lights;

     public Scene(...)

     public Colour TraceRay(Ray r)
     {
       double[][] ts = new double[Objects.length];
       for (int lp=0; lp<ts.length; lp++)
         ts = Objects[lp].FirstIsect(ray);

       Object o;
       double t = 1e100;
       for (int x=0; x<ts.length; x++)
         for (int y=0; y<ts[x].length; y++)
           if (ts[x][y] > 0 && ts[x][y] < t)
             {t = ts[x][y]; o = Objects[x];}

       Vector3 p = o.GetShape.Point(t);
       Surface s = o.GetTexture().GetSurface(p);
       return s.ProcessIsect(this, p, r.Direction);
     }
   }

Now just add some classes that implement mapping pixel coordinates to 
image plane coordinates (taking care to invert Y), generating camera 
rays, feeding them to TraceRay(), maybe antialiasing, and writing the 
results to disk and/or screen. And there, that's your ray tracer.

My original ray tracer actually included slightly more code, because it 
let you inspect the scene at run-time and get human-readable 
descriptions for various things. (E.g., Shape had a GetName() function, 
and ShapeSphere.GetName() returns "Sphere".) I never actually got as far 
as /using/ any of this infrastructure, however. And actually, I didn't 
build many scenes, because... well, look at it:

   Scene out = new Scene();

   Shape   sh = new ShapeSphere(new Vector3(0, 0, 0), 1);
   Surface su = new DiffuseSurface(Colour.RED);
   Object  ob = new Object(sh, new ConstantTexture(su));
   out.AddObject(ob);

   Shape   sh = new ShapePlane(new Vector3(0, 1, 0), -2);
   Surface su = new DiffuseSurface(Colour.GREEN);
   Object  ob = new Object(sh, new ConstantTexture(su));
   out.AddObject(ob);

   Shape   sh = new ShapePlane(new Vector3(0, -1, 0), -2);
   Surface su = new DiffuseSurface(Colour.BLUE);
   Object  ob = new Object(sh, new ConstantTexture(su));
   out.AddObject(ob);

   out.SetCamera(new PerspectiveCamera(Vector3(0, 0, -5), Vector3(1, 0, 
0), Vector3(0, 1, 0), Vector3(0, 0, 2.5));

   out.AddLight(new Vector3(0, 0, -5), Colour.WHITE);

This places a piffling three objects and one light source. YUCK! You can 
see why I didn't write many test scenes...



So how does all this look in Haskell?

First of all, we need vectors.

   data Vector3 = Vector3 !Double !Double !Double

   class Num Vector3 where
     (Vector3 x1 y1 z1) + (Vector3 x2 y2 z2) = Vector3 (x1+x2) (y1+y2) 
(z1+z2)
     ...

Similarly, we need colours. Unlike Java, we can actually use real 
grown-up operator names, so we don't have to write monstrosities like

   return v1.Add(v2).Add(v3).Normalise();

Instead we can simply say

   normalise (v1 + v2 + v3)

which is far easier to read. It also means stackloads of existing 
numeric functions automatically work with vectors (e.g., the sum 
function can sum a list of vectors - or a list a colours, which turns 
out to be more useful).

Note, also, that we don't need to do this stupid

   public Vector3(double x, double y, double z) {X = x; Y = y; Z = z;}

and then write

   new Vector3(1, 2, 3)

every time we want a vector. Instead, you can simply do

   Vector3 1 2 3

to create a vector.

Similarly, look back at the Java Ray class. See how much boilerplate 
code there is for declaring that the fields are public and constant, and 
for constructing a new object, and so forth. Now look at Haskell:

   data Ray = Ray {ray_start, ray_direction :: Vector3}

   ray_point :: Ray -> Double -> Vector3
   ray_point (Ray s d) t = s + d |* t

Much more compact, and yet much more /readable/ at the same time.

These, of course, are mere trifles. Let us get to the real meat of the 
problem. Now, we need to implement shapes somehow. The /obvious/ thing 
to do is copy the Java version; define a class for shapes, and then 
create various data structures representing concrete shapes:

   class Shape s where
     isect  :: s -> Ray     -> [Double]
     normal :: s -> Vector3 -> Vector3
     inside :: s -> Vector3 -> Bool

   data Sphere = Sphere {center :: Vector3, radius :: Double}

   instance Shape Sphere where
     isect (Ray s d) (Sphere c r) = ...

   data Plane = Plane {normal :: Vector3, offset :: Double}

   instance Shape Plane where
     isect (Ray s d) (Plane n o) = ...

Notice the isect function. In Java, we have /three/ functions for 
efficiency; one function only bothers to compute /whether/ there are any 
intersections, another that only bothers to compute the /first/ 
intersection, and another which computes /all/ intersections.

Haskell's lazy evaluation makes this quite unnecessary. All we need is 
/one/ function which returns all intersections in a lazy list. 
Inspecting whether the list is empty evaluates just enough of isect to 
answer this question, and no more. And obviously, accessing only the 
first solution does not cause any further solutions to be computed. Note 
we're actually doing /better/ than Java: If our CSG happens, at 
run-time, to need the first /three/ intersections but no further, then 
only these are computed.

(In fairness, we could use an Iterator in Java to achieve the self-same 
thing. The hasNext() function would enable intersection tests without 
computing the actual location, and next() would compute one additional 
intersection. It would be really quite a lot of work to set up this much 
lazy evaluation manually, though. Haskell gives it to us for free!)

You can certainly do things this way. There is a snag, however. If you 
stick to Haskell 2010 (the currently published official language 
specification), you cannot easily make a list of shapes. Because all the 
elements of a list have to be of identical types. And (for example) 
Sphere is not the same type as Plane, even if they do both implement the 
Shape class (and possibly other classes like Eq or Show). There is no 
way to say "a list of everything that has these methods".

GHC adds such a way. It's called

   {-# LANGUAGE ExistentialQuantification #-}

It could more accurately be called "existential /type/ quantification", 
or better yet "hidden type variables". But whatever. It lets you write

   data AnyShape = forall s. Shape s => Wrap s

With this incantation, Sphere is still a different type from Plane, but 
if you take a Sphere object and apply Wrap to it (or any other type that 
implements Shape), it becomes an AnyShape type. So you cannot write

   [sphere 0 1, plane 1 0]

but you /can/ write

   [Wrap (sphere 0 1), Wrap (plane 1 0)]

and this is valid. Notice, however, that it is impossible to write an 
"unwrap" function. Because once you "wrap" something, its original type 
information is lost forever, so there's no way in hell to know what type 
to cast it back to.

(More precisely, all type information is lost at compile-time, just like 
in C or Pascal. It's called "type erasure".)

You can't write an unwrap function, but what you /can/ do is make 
AnyShape itself an instance of the Shape class:

   instance Shape AnyShape where
     isect (AnyShape s) r = isect s r
     ...

Now we still can't unwrap an AnyShape, but we don't need to. We can just 
treat AnyShape like a normal Shape instance.

You can't do any of this in Haskell 2010. "forall" is not a valid 
keyword there. The best you can do in Haskell 2010 is to define

   data AnyShape =
     AS_Sphere Sphere |
     AS_Plane  Plane  |
     ...

Every time you define a new Shape instance, you need to update this 
giant monolithic data declaration. Yuck!

But wait... Actually, there is a simpler way. What do we want to /do/ 
with our list of shapes? All we /actually/ want to do is execute shape 
methods. But because functions are first-class in Haskell, we can 
actually solve this in an unusual way. Rather than storing a data 
structure representing the shape and providing a way to statically look 
up the correct ray-intersection function, we can just STORE THE 
INTERSECTION FUNCTION!

Well, actually we've got /three/ functions. But we can make a data 
structure which stores all three:

   data Shape =
     Shape
     {
       shape_isect  :: Ray -> [Double]
       shape_normal :: Vector3 -> Vector3
       shape_inside :: Vector3 -> Bool
     }

   sphere :: Vector3 -> Double -> Shape
   sphere center radius =
     Shape
     {
       shape_isect  = (\ ray   -> ...),
       shape_normal = (\ point -> vnormalise (point - center)),
       shape_inside = (\ point -> vmag (point - center) < radius)
     }

   plane :: Vector3 -> Double -> Shape
   plane normal offset =
     Shape
     {
       shape_isect  = (\ ray   -> ...),
       shape_normal = (\ _     -> normal),
       shape_inside = (\ point -> normal `vdot` point < offset)
     }

Now a "shape" is simply an ordinary data structure, which contains some 
function pointers. And each sort of shape - a sphere, a cone, whatever - 
is an ordinary /function/ which fills out this data structure with the 
right function pointers. In particular, every type of shape now has the 
same type signature. There is no Sphere type, no Plane type, no Cone 
type, there is only a Shape type. So now we can write

   [sphere 0 1, plane 1 0]

and have it be well-typed. (It's [Shape].)

One thing about this transformation: It is now impossible to "look at" a 
shape and know anything about it. Before, when each shape was 
represented as a data structure, you could look at (say) a sphere and 
see where its center is or what its radius is. Now you can't do that. 
After the shape is created, it is impossible to do anything with it 
except ray intersection tests.

That might be a problem if, for example, you wanted to construct a 
bounding volume or something. OTOH, you could just add some more fields 
containing function pointers for supporting such volume construction. In 
general, whatever abilities you need shapes to have, you add more fields 
to Shape to support that.

Notice that with our fancy AnyShape wrapping technique, you /also/ lose 
the ability to do anything except what the Shape class provides. In 
fact, GHC internally implements that particular trick by doing exactly 
what we're manually doing here - storing a dictionary of function 
pointers. But doing it by hand means you don't need any unofficial 
language extensions.

OK, so we can solve the shape problem. Now how about a camera?

Well, this is where the fun /really/ starts. A camera simply maps an 
image plane coordinate to a ray. So in Haskell, a camera is /just a 
function/!

   type Camera = Vector2 -> Ray

Similarly, a Texture becomes utterly trivial:

   type Texture = Vector3 -> Surface

In general, and Java class that has "just one method" can be turned into 
a plain vanilla Haskell function. That means that Surface is just a 
function, although we need to think carefully about what arguments it 
needs. As it turns out, Surface needs quite a few items of data, all of 
similar types. Since function arguments are unnamed in Haskell, it's 
probably a good idea to define a data structure. (Otherwise we'll 
forever be supplying the arguments in the wrong order and causing weird 
bugs that the compiler can't catch.)

   data RayIsect =
     RayIsect
     {
       isect_point     :: Vector3,
       isect_direction :: Vector3,
       isect_normal    :: Vector3
     }

   type Surface = Scene -> RayIsect -> Colour

Notice, again, that we can fill out /all/ fields, and then lazy 
evaluation will skip any calculations we don't need. In particular, 
Surface types that don't depend on the surface normal can skip the 
surface normal calculation.

With the definitions above, a "constant texture" becomes so trivial it's 
barely worth defining a name for it. You can just write "const s" and 
you're done. Let's look at some simple instances:

   cam_orthographic :: Vector3 -> Vector3 -> Vector3 -> Vector3 -> Camera
   cam_orthographic v0 vx vy vz =
      \ (Vector2 x y) = Ray {start = v0 + x *| vx + y *| vy, direction = vz}

   cam_perspective :: Vector3 -> Vector3 -> Vector3 -> Camera
   cam_perspective vx vy vz =
      \ (Vector2 x y) = Ray {start = v0, direction = vnormalise (x *| vx 
+ y *| vy + vz)}

   sur_ambient :: Colour -> Surface
   sur_ambient c = \ _ _ -> c

   sur_diffuse :: Colour -> Surface
   sur_diffuse c =
     \ scene isect ->
       let
         fn light =
           if shadow_test scene light isect
             then cBlack
             else c *| (light_point light `vdot` isect_point isect)
         colours = map fn (scene_lights scene)
       in sum colours

   sur_reflect :: Colour -> Surface
   sur_reflect c =
     \ scene isect ->
       let
         v_in   = isect_direction isect
         v_norm = isect_normal isect
         v_out  = v_in - 2 * (v_in `vdot` v_norm) *| v_norm
       in trace_ray scene (Ray {start = isect_point isect, direction = 
v_out})

Recall that in Java, ever single one of these things would be an entire 
class, with the "public class Woo extends Wah" and the field declaration 
and the constructor declaration, and only THEN do we get to writing the 
bit of code that actually does something useful.

In Haskell, we just write the useful bit. All of the stuff about is 
useful code, and no filler. It's all stuff implementing actual maths, 
not micromanaging object construction or field initialisation or whatever.

So if you can't access object internals, how can you implement spatial 
transformations? Easy: You transform the coordinates before input.

   transform_texture :: Transform -> Texture -> Texture
   transform_texture transform texture =
     \ point -> texture (transform point)

Now wasn't that easy? Compare Java:

   public class TransformTexture extends Texture
   {
     public final Transform Trans;
     public final Texture   Text;

     public TransformTexture(Transform t1, Texture t2)
     {Trans = t1; Text = t2;}

     public Surface GetSurface(Vector3 p)
     {
       return Text.GetSurface(Trans.Apply(p));
     }
   }

What a load of waffle! Especially when you consider that a typical 
Haskeller would probably write

   transform_texture :: Transform -> Texture -> Texture
   transform_texture = flip (.)

and be done with it!

(This works because this convoluted-sounding concept of "apply a 
coordinate transformation to the input before passing it to the texture" 
is really nothing other than /function composition/, which is a 
well-known concept in Haskell. A coordinate transformation maps a point 
to a point, a texture maps a point to a Surface. Combining these steps 
is clearly function composition. When you think about it like that... 
suddenly it's /obvious/ that this is a trivial thing.)

When you start thinking of (say) a coordinate transformation as a 
trivial function rather than as some complex active "object" which has 
"behaviour", "state" and "identity" which needs "constructors" and 
"destructors" and "gettters" and "setters" and "reflection" and... Well, 
suddenly all your problems start to look a hell of a lot simpler.

Take a look back at all those "map" classes I had. In Haskell, each of 
these is merely a function. It's hardly worth assigning names to them. 
Indeed, in Java I had trouble picking suitably descriptive names for all 
the combinations and variations I could come up with. In Haskell, I 
don't need to bother. If I want to deal with functions that map points 
to scalars, I can just /call/ them "functions that map points to 
scalars". Which is actually less confusing then naming them ScalarMap or 
whatever. It more succinctly conveys what they do.



What have I just said? The short summary is that Haskell, the finest 
functional programming language in the land, is superior to Java, one of 
the more sucky OOP languages. Not exactly a revelation, is it? I think 
I'm going to go outside for a while...
Post a reply to this message