> The usual way of doing this is to read a bunch of data from the file
> (for example some megabytes), process it, discard it, read a new bunch
> of data from the file, etc.
>
> What I'm wondering is what a language like Haskell will do in this
> case, even though it has lazy evaluation. In fact, lazy evaluation is
> of no help here because you are, after all, reading the entire file.
> If you use the completely abstract way, you could end up with the
> Haskell program trying to read the entire file into memory, thus
> running out of it.
>
> You have to somehow be able to tell it that you are only going to need
> small bunches at a time, and that after they have been processed they
> can be discarded. I wonder if there's a really abstract way of saying
> this in Haskell, or whether you need to go to ugly low-level specifics.
Because of lazy evaluation in Haskell, something like

    readFile "filename" >>= process

reads the file as a stream, in chunks of system-dependent size, into
available heap memory. The entire file isn't read into memory before
process is called; the two effectively run interleaved. Blocks of data
that process has finished with become garbage and are reclaimed, and the
file closes itself once it has all been read. A large file therefore
mostly costs garbage-collection work, not resident memory.
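For instance, here is a minimal sketch ("big.log" is just a placeholder
name of mine). Counting the characters of a large file this way runs in
roughly constant space, because length consumes the string about as fast
as readFile produces it:

    -- Prints the character count of a large file without ever
    -- holding the whole file in memory at once.
    main :: IO ()
    main = readFile "big.log" >>= print . length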
In fact, if you are opening a large number of small files, this lazy
reading becomes a problem: too many file handles are left open at once,
because each handle stays open until its contents have been fully
demanded. There are ways to force the whole contents to be read, and
thus close the handle promptly, if needed, as in the sketch below.
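One common trick is to force the string before returning it, for example
with seq. A minimal sketch (the name readFileStrict is my own choice,
not a library function):

    -- Forcing the length demands every character of the string,
    -- so the underlying handle is closed by the time this returns.
    readFileStrict :: FilePath -> IO String
    readFileStrict path = do
        contents <- readFile path
        length contents `seq` return contents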
There are also more conventional IO actions: openFile, which returns a
handle; hClose; hSetBuffering, if you want to override the default
buffering; hFlush, to flush the buffer; and other operations for random
access.
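A minimal sketch of that handle-based style (the file name, buffer size,
and line-counting body are placeholder choices of mine):

    import System.IO

    -- Count the lines of a file with explicit handle management,
    -- using a 64 KB block buffer instead of the default.
    main :: IO ()
    main = do
        h <- openFile "input.txt" ReadMode
        hSetBuffering h (BlockBuffering (Just 65536))
        n <- countLines h 0
        hClose h
        print n

    countLines :: Handle -> Int -> IO Int
    countLines h n = do
        eof <- hIsEOF h
        if eof
            then return n
            else hGetLine h >> countLines h (n + 1)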
http://haskell.cs.yale.edu/haskell-report/IO.html