  Haskell raving (Message 11 to 20 of 92)  
From: Tim Attwood
Subject: Re: Haskell raving
Date: 31 Oct 2007 05:21:10
Message: <47285716$1@news.povray.org>
>  The usual way of doing this is to read a bunch of data from the file
> (for example some megabytes), process it, discard it, read a new bunch
> of data from the file, etc.
>
>  What I'm wondering is what a language like haskell will do in this case,
> even though it has lazy evaluation. In fact, lazy evaluation is of no
> help here because you are, after all, reading the entire file. If you
> use the completely abstract way you could end up having the haskell
> program trying to read the entire file into memory, thus running out of 
> it.
>
>  You have to somehow be able to tell it that you are only going to need
> small bunches at a time, and after they have been processed, they can be
> discarded. I wonder if there's a really abstract way of saying this in
> haskell, or whether you need to go to ugly low-level specifics.

Because of lazy evaluation in Haskell, something like...

readFile "filename" >>= process

is read as a stream, in chunks of system-dependent size, into
available heap memory. The entire file isn't read into memory before
process is called; the two are effectively executing at the same time.
After blocks of data have been used by process, they're disposed of,
and the file closes itself once it has all been read. Processing a
large file will therefore exercise the GC.
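
For instance, a minimal sketch (process here is a hypothetical
consumer that counts words):

   -- 'process' is hypothetical; counting the words forces the file
   -- to stream through in chunks.
   process :: String -> IO ()
   process s = print (length (words s))

   main :: IO ()
   main = readFile "filename" >>= process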

In fact, if you are opening a large number of small files, this lazy
reading of files becomes a problem in itself: too many file handles are
left open at once. There are ways to force a file's contents to be read
in full, and thus close its handle promptly, if needed.
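
For example, a minimal sketch (evaluate is from Control.Exception;
forcing the whole string makes the read run to the end, which also
closes the underlying handle):

   import Control.Exception (evaluate)

   readFileStrictly :: FilePath -> IO String
   readFileStrictly path = do
     s <- readFile path        -- lazy: the handle stays open for now
     _ <- evaluate (length s)  -- demand every character: a full read
     return s                  -- the handle was closed at EOF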

There are also more typical IO actions: openFile, which returns a
handle; hClose; hSetBuffering, if you want to override the default
buffering; hFlush, to flush the buffer; and other functions for random
access.
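
A minimal sketch of that handle-based style (the file name and buffer
size are arbitrary; the functions are all from System.IO):

   import System.IO

   main :: IO ()
   main = do
     h <- openFile "log.txt" WriteMode
     hSetBuffering h (BlockBuffering (Just 65536))  -- 64 KiB buffer
     hPutStrLn h "hello"
     hFlush h  -- push the buffer out now
     hClose h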
http://haskell.cs.yale.edu/haskell-report/IO.html



From: Warp
Subject: Re: Haskell raving
Date: 31 Oct 2007 08:18:17
Message: <47288099@news.povray.org>
Tim Attwood <tim### [at] comcastnet> wrote:
> readFile "filename" >>= process

  Btw, how do you actually read the contents of the file in Haskell with
a command like that? What is the unit type? Can you read, for example,
one byte at a time? What if you wanted to parse an ASCII-formatted
input file, for example? How is it done exactly?

-- 
                                                          - Warp



From: nemesis
Subject: Re: Haskell raving
Date: 31 Oct 2007 13:25:00
Message: <web.4728c74c7b4224e7773c9a3e0@news.povray.org>
Warp <war### [at] tagpovrayorg> wrote:
> Tim Attwood <tim### [at] comcastnet> wrote:
> > readFile "filename" >>= process
>
>   Btw, how do you actually read the contents of the file in haskell with
> a command like that? What is the unit type? Can you read, for example,
> one byte at a time? What if you would want to parse an ascii-formatted
> input file, for example? How is it done exactly?

http://haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html#v%3AreadFile

readFile returns a String.  But don't despair: it won't fill the whole
memory with a giant string for large files!  Laziness means values are
only evaluated once they're needed.  If you just take the whole
contents, yes, you'll be sorry.  Otherwise, you can use a function like
"take 5 string" and it'll only take the first five characters (a string
is merely a list of characters in Haskell).  You may process it with
folds or other higher-order functions.

To illustrate laziness in a simple REPL example:
ones = 1:ones
take 5 ones
take 5 (map (*2) ones)
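
(At the prompt, these print [1,1,1,1,1] and [2,2,2,2,2] respectively.)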

ones is of course an infinite, recursively-defined list.  If you just
evaluate it at the prompt, the REPL will begin filling your screen with
ones, or worse (filling your memory).  The same goes for simply doing
map (*2) ones, which will try to double every 1 in ones.

But laziness means we can restrict the evaluated value to just a few
members of this infinite data structure, which is what we do by taking
only a few.

So, with laziness, size really doesn't matter.

Of course, there are other useful IO functions out there that offer more control
over opening/closing file handles etc.
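
A couple of minimal sketches (the file name "big.txt" is hypothetical):

   -- Only the first five characters of the file are ever forced.
   firstFive :: IO ()
   firstFive = readFile "big.txt" >>= putStrLn . take 5

   -- The whole file streams through, but chunks already counted become
   -- garbage, so this runs in roughly constant space.
   countLines :: IO ()
   countLines = readFile "big.txt" >>= print . length . lines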



From: Warp
Subject: Re: Haskell raving
Date: 31 Oct 2007 14:58:57
Message: <4728de81@news.povray.org>
That didn't really answer the question of whether it is able to drop
the parts which are no longer used.

-- 
                                                          - Warp



From: Orchid XP v7
Subject: Re: Haskell raving
Date: 31 Oct 2007 16:05:26
Message: <4728ee16@news.povray.org>
Tim Attwood wrote:
>>> When memory comes into play you might need destructive
>>> updates to arrays, etc.
>> Which is only a problem if your compiler isn't smart enough, or disallows 
>> providing the appropriate hints. :-)
> 
> The point is that there is no way for a compiler to decide.

...which is why you have the hints... ;-)



From: Orchid XP v7
Subject: Re: Haskell raving
Date: 31 Oct 2007 16:08:56
Message: <4728eee8$1@news.povray.org>
somebody wrote:

> The simple fact is any argument for Haskell and against other languages will
> be valid when - scratch that - *if*  people start writing killer
> applications, a killer OSs, APIs and killer games in Haskell.
> 
> Ah, I stand corrected, here's a killer game:
> 
> http://www.geocities.jp/takascience/haskell/monadius_en.html   <g>
> 
> from
> 
> http://www.haskell.org/haskellwiki/Applications (2nd on list)
> 
> I cannot imagine what Haskell will bring us in another 20 years. "A
> scientist's toy box" is aptly where Haskell belongs.

Hmm... You're not going to talk about Haskell Quake then?

http://www.haskell.org/haskellwiki/Frag

Hardly "killer" (unless you'll pardon the pun), but at least a little 
more exciting...



One could legitimately level the claim that the only real large-scale 
programs written in functional languages so far have been... compilers 
for more functional languages.

Still, I gather that Xilinx's current toolchain for FPGAs is based on 
Haskell technology. A few other blue-chip peeps are using it. Again, not 
really "killer" - for that, you must look to Erlang. *sigh*



From: Orchid XP v7
Subject: Re: Haskell raving
Date: 31 Oct 2007 16:15:52
Message: <4728f088$1@news.povray.org>
Warp wrote:

>   However, I'm wondering what happens with huge files (which won't fit
> in memory) and which you read thoroughly. It could be, for example, a
> video file (which could be, for example, 3 gigabytes big, while your
> computer has only eg. 1 gigabyte of RAM).
>   The usual way of doing this is to read a bunch of data from the file
> (for example some megabytes), process it, discard it, read a new bunch
> of data from the file, etc.
> 
>   What I'm wondering is what a language like haskell will do in this case,
> even though it has lazy evaluation. In fact, lazy evaluation is of no
> help here because you are, after all, reading the entire file. If you
> use the completely abstract way you could end up having the haskell
> program trying to read the entire file into memory, thus running out of it.

Not so.

>   You have to somehow be able to tell it that you are only going to need
> small bunches at a time, and after they have been processed, they can be
> discarded. I wonder if there's a really abstract way of saying this in
> haskell, or whether you need to go to ugly low-level specifics.

There is. It's called a garbage collector. ;-)

Simply let lazy evaluation read the file as required, and the GC can 
delete the data you've already processed transparently behind you.

In this way, linearly processing a file with constant RAM usage is 
pretty trivial in Haskell.
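
For instance, a minimal sketch of that streaming pattern (it reads
standard input rather than a named file; interact is from the Prelude):

   -- Counts the lines fed in on stdin; chunks that have already been
   -- counted are collected behind you, so space usage stays constant.
   main :: IO ()
   main = interact (show . length . lines)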

Now, *random* access... *that* presents a bit more of a problem. (For 
the love of God, don't accidentally hang on to pointers you don't need 
any more! You could end up loading the entire file into RAM - and, 
obviously, that would be "bad".)



From: Orchid XP v7
Subject: Re: Haskell raving
Date: 31 Oct 2007 16:17:28
Message: <4728f0e8@news.povray.org>
Warp wrote:
>   That didn't really answer the question of whether it is able to drop
> the parts which are no longer used.

Garbage collection.

In Haskell, a "string" is a linked-list of characters. Let go of the 
pointer to the start of the list and all elements up to the first one 
you've still got a pointer to will be collected and freed.
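
A minimal sketch (this loop is illustrative, not a library function):

   {-# LANGUAGE BangPatterns #-}

   -- 'go' only ever holds a pointer to the remaining tail of the list,
   -- so the cells it has already walked past become garbage at once.
   countChars :: String -> Int
   countChars = go 0
     where
       go !n []     = n
       go !n (_:cs) = go (n + 1) cs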



From: Orchid XP v7
Subject: Re: Haskell raving
Date: 31 Oct 2007 16:28:34
Message: <4728f382$1@news.povray.org>
Warp wrote:

>> readFile "filename" >>= process
> 
>   Btw, how do you actually read the contents of the file in haskell with
> a command like that? What is the unit type? Can you read, for example,
> one byte at a time? What if you would want to parse an ascii-formatted
> input file, for example? How is it done exactly?

First of all, reading binary data in Haskell is currently "messy". 
Haskell tends to assume that files contain plain ASCII. (If you want to 
be technical, it usually reads 8 bits at a time, and interprets these 
literally as a Unicode code-point.)

That aside, the readFile function gives you (conceptually) a giant 
string representing the entire contents of the file. In typical Haskell 
fashion, this doesn't actually get created until you "look at it".

If you wanted to parse a text file, you'd likely read the whole file 
into a string (conceptually) and then pass that over to your parser. The 
parser just knows how to parse strings; it doesn't have to care that 
*this* string is being transparently read from a file first.

Similarly, if you had a function that takes a string and returns the 
decrypted form of that string, you could apply that to the thing 
returned from readFile and then pass the answer to your parser. The 
result is that the decryption function only decrypts data as the parser 
tries to demand it (due to lazy evaluation). All that stuff that Java 
does with layering stream filters is just function calls in Haskell.
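
A sketch of that layering (decrypt and parse here are stand-ins I've
made up for illustration, not library functions):

   -- Hypothetical stand-ins:
   decrypt :: String -> String
   decrypt = map pred   -- pretend "decryption"

   parse :: String -> [String]
   parse = lines        -- pretend "parser"

   -- The parser pulls characters on demand; decryption, and the file
   -- read behind it, happen lazily as input is consumed.
   process :: FilePath -> IO [String]
   process path = do
     cipherText <- readFile path
     return (parse (decrypt cipherText))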



Now, the flip side to all this is that lazy file reading can bite you. 
For example, one guy wrote some program that would crash with an OS 
error complaining about too many open file handles. Letting go of a 
pointer to a file handle causes the file to be closed WHEN THE GC RUNS; 
however, exhausting the OS handle limit does not trigger the GC...

There is a certain faction of the Haskell community that considers 
readFile and its ilk to be "evil" and says you should do things the 
manual way. For example, check this out:

   import System.Directory (removeFile)

   main = do
     contents <- readFile "foo.txt"   -- lazy: nothing is read yet
     removeFile "foo.txt"             -- the file is still open here
     writeFile "bar.txt" contents     -- only now is the read forced

This doesn't work at all. If you run this, the behaviour is technically 
undefined, but generally you will get some kind of OS error. Put simply, 
the first line doesn't *really* read any data - but then you try to 
delete the file (which hasn't actually been read yet, only opened), and 
then write the data to another file (which is what finally causes it to 
be read). Deleting the file is usually where it bombs out - the OS will 
complain that you can't delete an open file...
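
One fix is to force the whole read before deleting - a sketch, again
using evaluate from Control.Exception:

   import Control.Exception (evaluate)
   import System.Directory (removeFile)

   main = do
     contents <- readFile "foo.txt"
     _ <- evaluate (length contents)  -- fully read; handle now closed
     removeFile "foo.txt"
     writeFile "bar.txt" contents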

Somebody did suggest a library that lazily reads directory structures, 
and allows you to lazily modify them. The general reply was "OMG, NO!!"



As you might expect, Haskell provides primitives for doing file I/O the 
"normal" way. In other words,

   openFile
   hClose
   hGetLine
   hPutStrLn
   hFlush
   etc.
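
For instance, a minimal sketch in that explicit style ("foo.txt" is an
arbitrary file name; the functions are all from System.IO):

   import System.IO

   main :: IO ()
   main = do
     h <- openFile "foo.txt" ReadMode
     line <- hGetLine h   -- the read happens exactly here
     putStrLn line
     hClose h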

For interactive stuff at a minimum, it's usually critical to control 
exactly *when* I/O happens. And the only way to do this is... to specify 
exactly when I/O happens. No way round it.



From: Darren New
Subject: Re: Haskell raving
Date: 31 Oct 2007 16:46:56
Message: <4728f7d0$1@news.povray.org>
Orchid XP v7 wrote:
> Tim Attwood wrote:
>>>> When memory comes into play you might need destructive
>>>> updates to arrays, etc.
>>> Which is only a problem if your compiler isn't smart enough, or 
>>> disallows providing the appropriate hints. :-)
>>
>> The point is that there is no way for a compiler to decide.
> 
> ....which is why you have the hints... ;-)

Or iterative optimizations, like big SQL servers do.  The program is 
run with profiling enabled a few times, and the profile data tells the 
linker/compiler which versions run faster, etc.

-- 
   Darren New / San Diego, CA, USA (PST)
     Remember the good old days, when we
     used to complain about cryptography
     being export-restricted?



