  Re: More Haskell fanning  
From: Orchid XP v8
Date: 14 May 2011 10:53:07
Message: <4dce9753@news.povray.org>
On 13/05/2011 06:48 PM, Darren New wrote:
> On 5/13/2011 10:11, Orchid XP v8 wrote:
>> I'm fairly sure I posted one in the other group a while back.
>
> Yeah, I was just wondering if you had something convenient. I was
> curious what it looked like.

Quoting myself from 3 years ago:

{-# LANGUAGE ForeignFunctionInterface #-}

module Raw where

import System.Win32.Types
import Graphics.Win32.GDI.Types
import Graphics.Win32.Message

foreign import stdcall "windows.h PostMessageW"
   postMessage :: HWND -> WindowMessage -> WPARAM -> LPARAM -> IO LRESULT

This creates a Haskell function named "postMessage". When you call this 
function, it really calls PostMessageW() from the windows.h header. Note 
carefully that *you* are responsible for getting the type signature right!

How this actually finds the necessary code to execute, I have no idea. 
What I do know is that if you want to call your own C function (rather 
than something from the Win32 API), you also have to hand the compiler 
the object code so that it gets linked in.
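
For instance, calling one of your own C functions might look like this 
(my own sketch; the function and file names are hypothetical):

{-# LANGUAGE ForeignFunctionInterface #-}
module Mine where

import Foreign.C.Types

-- Corresponds to a C function defined in (say) mycode.c as:
--    double double_it(double x) { return 2.0 * x; }
foreign import ccall "double_it"
   doubleIt :: CDouble -> CDouble

You then compile mycode.c to an object file and hand that to GHC along 
with the Haskell source, something like "ghc Main.hs mycode.o".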

To call Haskell from C, you write a similar 1-liner to generate a C stub 
function which knows the compiler-specific details for calling the 
*real* function code.
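
The one-liner in question is a "foreign export". A minimal sketch:

{-# LANGUAGE ForeignFunctionInterface #-}
module Exports where

import Foreign.C.Types

triple :: CInt -> CInt
triple x = 3 * x

-- Tells GHC to generate a C-callable stub (and header) for triple().
foreign export ccall triple :: CInt -> CInt

Compiling this module makes GHC emit an Exports_stub.h header which the 
C side can #include in order to call triple().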

>> Actually calling a C function is trivial. Passing a Haskell function
>> pointer
>> to C is trivial. What is very *non*-trivial is getting data
>> marshalling and
>> memory management to work.
>
> Well, yeah, that's why I was curious. :-)

You cannot access Haskell data from C at all. You can access C data from 
Haskell if you're careful. For anything beyond primitive data (integers, 
floats, etc), it's probably easier to write functions in the native 
language to get/set the fields you're interested in.
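
As a sketch of that approach (the struct and accessor are invented for 
illustration): suppose the C side defines

   struct point { double x, y; };
   double point_get_x(struct point *p) { return p->x; }

Then the Haskell side only ever handles an opaque pointer:

{-# LANGUAGE ForeignFunctionInterface #-}
{-# LANGUAGE EmptyDataDecls #-}
module Point where

import Foreign.Ptr
import Foreign.C.Types

data CPoint   -- opaque; never inspected from Haskell

foreign import ccall "point_get_x"
   pointGetX :: Ptr CPoint -> IO CDouble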

Memory management is fun. (Principally because if you get it wrong... well, 
you know what happens.) There are Haskell libraries for allocating and 
freeing values that the GC doesn't move around (so-called "pinned 
memory"). You can attach finalisers to things to have the GC call the 
free routine for you when appropriate. (This of course takes only 
Haskell into account; it cannot know whether C is still using the object.)
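
A minimal sketch of both facilities (the libraries are standard, but 
the usage here is my own example):

module Pinned where

import Foreign.Marshal.Alloc (mallocBytes, finalizerFree)
import Foreign.ForeignPtr (newForeignPtr, withForeignPtr)
import Foreign.Ptr (Ptr)
import Data.Word (Word8)

demo :: IO ()
demo = do
   -- Allocated with C's malloc, outside the Haskell heap, so the
   -- GC will never move it.
   buf <- mallocBytes 4096 :: IO (Ptr Word8)
   -- Attach a finaliser: free() runs once the GC decides the
   -- ForeignPtr is dead. (As noted above, the GC has no idea
   -- whether C still holds the pointer!)
   fp <- newForeignPtr finalizerFree buf
   withForeignPtr fp $ \p ->
      putStrLn ("Buffer lives at " ++ show p)   -- hand p to C here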

You can also arrange for the main() function to be implemented in C 
rather than Haskell. (I believe main() then has to call the RTS startup 
and shutdown functions, hs_init() and hs_exit() from HsFFI.h.) In this 
way, you can build a Haskell 
program that calls C, or a C program that calls Haskell.

You can also build Haskell DLLs. I have no idea how that works...

Speaking of which, GHC now supports compiling each Haskell library as a 
DLL, rather than linking them all statically. Unfortunately, this is 
broken out-of-the-box. If you compile this way, you have to manually 
find all the DLLs your program requires and move them into the search 
path, because the GHC installer doesn't do this for you. Oops!

>> I'm sure lots of people are doing this. After all, the basic abstractions
>> are not complex.
>
> They actually are pretty complex if you actually want it to be reliable.
> Think this: You spawn a process, but before it starts up, it dies. Do
> you handle it? Lots of edge cases to get right.

The *abstraction* is simple. The *implementation* is not. The same is 
true of many, many things. ;-)

>> Perhaps. Perhaps the data is already in some distributed database, so
>> it's
>> already persistent regardless of whether a node dies or not. (Isn't the
>> system supposed to recover from node death anyway?)
>
> So restarting a node requires first reloading all the records, one at a
> time, into a balanced tree. Then replaying all the changes since the
> last time the node was saved out to disk. Then playing all the changes
> made since the node went down, as copied from some other node.
>
> On a gigabyte CSV file of a half dozen fields per line, on a 3GHz
> machine with 100MBps data transfer, this takes about an hour for the
> first step of just reloading the saved file. Say you have 50 machines to
> upgrade. You're now talking 2 wall days just to restart the new
> processors, ignoring however long it might take to dump the stuff out.

Wouldn't converting one gigabyte of data to a new format online take 
just as long as reloading it from disk?

>> With hot update, you *still* have to manually write a bunch of code to do
>> any data conversions if data structures have changed, or do invariant
>> checks
>> if the data's invariants have changed. Either way, you gotta write code.
>
> Sure. But you don't have to write code to cleanly shut down *other*
> processes when some unrelated process on the same machine gets changed.

I'm not seeing why you would need to do this with cold code update 
either. You only need to shut down the processes related to the thing 
you're actually changing.

> You can also convert things slowly, using the old data in the new code
> until all the old code is gone, then slowly work thru the data (perhaps
> updating it each time it's touched) to the new format.

This is an unavoidable requirement which *must* be met if you want 
distributed processing. Otherwise upgrading the system requires stopping 
the entire distributed network.
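
Sketched in Haskell (the record types are invented), the "convert when 
touched" idea looks something like this:

data OldRec = OldRec { oldName :: String }
data NewRec = NewRec { newName :: String, newEmail :: Maybe String }

-- Stored values carry a version tag.
data Stored = V1 OldRec | V2 NewRec

-- Every read goes through 'touch', so each record is migrated to the
-- new format the first time the new code looks at it.
touch :: Stored -> NewRec
touch (V1 (OldRec n)) = NewRec n Nothing
touch (V2 r)          = r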

>> Yeah. As I say, the current system is entirely predicated around
>> everybody
>> having the same code. As soon as that's not true, it's game over.
>
> That, I think, is the fundamental problem with systems like Haskell, and
> the fundamental reason why dynamic typing may very well win in this
> situation.

No, this is the fundamental problem with a small proof-of-concept 
implementation exploring whether it's even slightly possible to do this. 
Obviously the current implementation does not yet provide everything 
necessary for production use. Nobody is claiming it does.

>> It's already dealing with receiving messages of arbitrary type even in
>> the
>> presence of static typing. All that's needed is some way to compare types
>> beyond "do they have the same name?"
>
> Oh, OK.

Currently, every time you send a value, the implementation transmits the 
name of the value's type, followed by the value itself. The receiving 
end uses the type name to figure out which parser to deserialise with. 
Similarly, functions are sent by name (along with their free variables).
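
A minimal sketch of that wire format (my own illustration, not the 
library's actual code):

{-# LANGUAGE ScopedTypeVariables #-}
module WireSketch where

import Data.Binary (Binary, encode, decode)
import Data.Typeable (Typeable, typeOf)
import qualified Data.ByteString.Lazy as BL

-- Send side: prefix the payload with the name of its type.
sendValue :: (Binary a, Typeable a) => a -> (String, BL.ByteString)
sendValue x = (show (typeOf x), encode x)

-- Receive side: the type name selects the parser; refuse to decode
-- if it isn't the type we expect.
recvValue :: forall a. (Binary a, Typeable a)
          => (String, BL.ByteString) -> Maybe a
recvValue (name, bytes)
   | name == show (typeOf (undefined :: a)) = Just (decode bytes)
   | otherwise                              = Nothing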

This name-based scheme is fine for testing that your implementation 
works. It breaks horribly as soon as you have more than one version of 
your code. (Not to 
mention two unrelated applications both of which happen to define 
unrelated types having identical names...)

What is needed - and the documentation makes plain that the implementors 
are quite aware of this - is some way of verifying that the identifiers 
you send actually refer to the same thing. I don't actually know how 
Erlang manages to do this; looks like the Haskell guys are planning to 
use hashes of [some canonical representation of] the type definitions or 
something...
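
To make the hashing idea concrete (entirely my own sketch; producing 
the canonical rendering is the hard part, and is just a String here):

module Fingerprints where

import GHC.Fingerprint (Fingerprint, fingerprintString)

-- Given some canonical textual form of the complete type definition
-- (constructors, fields, and every type it mentions, recursively),
-- a stable fingerprint is the easy bit:
typeFingerprint :: String -> Fingerprint
typeFingerprint = fingerprintString

-- Two peers would then agree a type is "the same" only if the
-- fingerprints match, not merely the names.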

>> Yeah, passing channels around looks quite useful. It just seems a
>> pity to
>> have more or less duplicated functionality in two incompatible systems -
>> mailboxes and channels. Couldn't they have implemented a mailbox as a
>> channel of arbitrary type or something?
>
> Two people designing the two systems, maybe?

Looks more like they built the mailbox system, realised that there was a 
guarantee it can't provide, and then engineered a second, independent 
system which does.

-- 
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*

