  Re: Haskell raving  
From: Orchid XP v7
Date: 31 Oct 2007 16:28:34
Message: <4728f382$1@news.povray.org>
Warp wrote:

>> readFile "filename" >>= process
> 
>   Btw, how do you actually read the contents of the file in haskell with
> a command like that? What is the unit type? Can you read, for example,
> one byte at a time? What if you would want to parse an ascii-formatted
> input file, for example? How is it done exactly?

First of all, reading binary data in Haskell is currently "messy". 
Haskell tends to assume that files contain plain ASCII. (If you want to 
be technical, it usually reads 8 bits at a time, and interprets these 
literally as a Unicode code-point.)

That aside, the readFile function gives you (conceptually) a giant 
string representing the entire contents of the file. In typical Haskell 
fashion, this doesn't actually get created until you "look at it".
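
To make that concrete, here is a minimal sketch. The helper names 
firstChars and countLines are made up for illustration; readFile, take, 
lines and length are standard Prelude functions. The "unit" you work 
with is Char, because a String is just a list of Char.

```haskell
-- readFile :: FilePath -> IO String, and String is a list of Char,
-- so ordinary list functions apply to the file contents.

-- Hypothetical helper: the first n characters of a file.
firstChars :: Int -> FilePath -> IO String
firstChars n path = do
  contents <- readFile path   -- lazy: the file is opened, little is read
  return (take n contents)    -- only the first n characters get demanded

-- Hypothetical helper: the number of lines in a file.
countLines :: FilePath -> IO Int
countLines path = fmap (length . lines) (readFile path)
```

Load it in GHCi and try something like firstChars 10 "foo.txt" on any 
text file you have lying around.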

If you wanted to parse a text file, you'd likely read the whole file 
into a string (conceptually) and then pass that over to your parser. The 
parser just knows how to parse strings; it doesn't have to care that 
*this* string is being transparently read from a file first.
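
A rough sketch of that split, assuming a toy format of 
whitespace-separated integers (the format and the names parseInts and 
sumFile are my own inventions for the example, not anything standard):

```haskell
-- A pure "parser": it only knows about strings, not about files.
-- Note that read will fail at run time on input that is not a number.
parseInts :: String -> [Int]
parseInts = map read . words

-- The same parser, fed a string that happens to come from a file.
sumFile :: FilePath -> IO Int
sumFile path = fmap (sum . parseInts) (readFile path)
```

You can test parseInts directly in GHCi with a string literal, no file 
required.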

Similarly, if you had a function that takes a string and returns the 
decrypted form of that string, you could apply that to the thing 
returned from readFile and then pass the answer to your parser. The 
result is that the decryption function only decrypts data as the parser 
tries to demand it (due to lazy evaluation). All that stuff that Java 
does with layering stream filters is just function calls in Haskell.
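
Here is a small sketch of that layering. ROT13 stands in for a real 
decryption function (that is purely an assumption for the example; any 
String -> String function would slot in the same way), and 
decryptedWords is a hypothetical name.

```haskell
import Data.Char (chr, isAsciiLower, isAsciiUpper, ord)

-- Stand-in "decryption": rotate ASCII letters by 13 places.
rot13 :: Char -> Char
rot13 c
  | isAsciiLower c = rotate 'a'
  | isAsciiUpper c = rotate 'A'
  | otherwise      = c
  where
    rotate base = chr (ord base + (ord c - ord base + 13) `mod` 26)

decrypt :: String -> String
decrypt = map rot13

-- Layering is plain function composition: file -> decrypt -> "parse".
decryptedWords :: FilePath -> IO [String]
decryptedWords path = fmap (words . decrypt) (readFile path)
```

Because decrypt is lazy, take 5 (decrypt (cycle "uryyb")) gives "hello" 
even though the input string is infinite. Only the five characters that 
were demanded ever get decrypted.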



Now, the flip side to all this is that lazy file reading can bite you. 
For example, one guy wrote some program that would crash with an OS 
error complaining about too many open file handles. Letting go of a 
pointer to a file handle causes the file to be closed only WHEN THE GC 
RUNS, but exhausting the OS handle limit does not trigger the GC...
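
One way around that, sketched below, is to read each file completely 
inside withFile so the handle is closed before you move on to the next 
one. The names readFileStrict and totalChars are hypothetical; withFile, 
hGetContents and evaluate are the standard library functions.

```haskell
import Control.Exception (evaluate)
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Read a whole file and close its handle before returning.
readFileStrict :: FilePath -> IO String
readFileStrict path =
  withFile path ReadMode $ \h -> do
    contents <- hGetContents h
    _ <- evaluate (length contents)  -- force the full read while h is open
    return contents

-- Total size of many files, with at most one handle open at a time.
totalChars :: [FilePath] -> IO Int
totalChars paths = fmap (sum . map length) (mapM readFileStrict paths)
```

The price is that each file is held in memory in full, so you trade the 
laziness for predictable handle usage.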

There is a certain faction of the Haskell community that considers 
readFile and its ilk to be "evil" and who say you should do things the 
manual way. For example, check this out:

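   import System.Directory (removeFile)

   main = do
     contents <- readFile "foo.txt"   -- lazy: opens the file, reads nothing yet
     removeFile "foo.txt"             -- the file is still open at this point
     writeFile "bar.txt" contents     -- only now is the data demanded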

This doesn't work at all. If you run this, the behaviour is technically 
undefined, but generally you will get some kind of OS error. Put simply, 
the readFile call doesn't *really* read any data - but then you try to 
delete the file (which hasn't actually been read yet, only opened), and 
then write the data to another file (which causes it to actually be 
read). Usually deleting the file is where it will bomb out - the OS will 
usually complain that you can't delete an open file...
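
For what it's worth, a common workaround is to force the whole string 
before touching the file again. This is only a sketch: moveContents is 
a made-up name, and it relies on the documented behaviour that a lazily 
read handle is closed once the entire contents have been read.

```haskell
import Control.Exception (evaluate)
import System.Directory (removeFile)

-- Hypothetical helper: write the contents of src to dst, then drop src.
moveContents :: FilePath -> FilePath -> IO ()
moveContents src dst = do
  contents <- readFile src
  _ <- evaluate (length contents)  -- forces the whole file to be read
  removeFile src                   -- the handle is closed by now
  writeFile dst contents
```

Computing the length walks the entire string, which is what drags all 
the data in from disk before the delete happens.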

Somebody did suggest a file that lazily reads directory structures, and 
allows you to lazily modify them. The general reply was "OMG, NO!!"



As you might expect, Haskell provides primitives for doing file I/O the 
"normal" way. In other words,

   openFile
   hClose
   hGetLine
   hPutStrLn
   hFlush
   etc.

For interactive stuff at a minimum, it's usually critical to control 
exactly *when* I/O happens. And the only way to do this is... to specify 
exactly when I/O happens. No way round it.
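
A minimal sketch of the manual style, using openFile, hGetLine, 
hPutStrLn, hFlush and hClose from System.IO. The helper names 
writeLines, readLines and ask are invented for the example.

```haskell
import System.IO

-- Write lines one at a time through an explicit handle.
writeLines :: FilePath -> [String] -> IO ()
writeLines path ls = do
  h <- openFile path WriteMode
  mapM_ (hPutStrLn h) ls
  hClose h                     -- the file is closed exactly here

-- Read lines one at a time until end of file.
readLines :: FilePath -> IO [String]
readLines path = do
  h <- openFile path ReadMode
  ls <- loop h
  hClose h
  return ls
  where
    loop h = do
      eof <- hIsEOF h
      if eof
        then return []
        else do
          l <- hGetLine h
          rest <- loop h
          return (l : rest)

-- Interactive prompt: flush so the question shows up before we block
-- waiting for input.
ask :: String -> IO String
ask question = do
  putStr question
  hFlush stdout
  getLine
```

Nothing here is lazy: every read and write happens at the point where 
the corresponding line of the do block runs.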

