POV-Ray: Newsgroups: povray.off-topic: Haskell raving

From: nemesis
Subject: Re: Haskell raving
Date: 31 Oct 2007 20:45:00
Message: <web.47292f687b4224e765e605d0@news.povray.org>

Warp <war### [at] tagpovrayorg> wrote:
> That didn't really answer the question of whether it is able to drop
> the parts which are no longer used.

No, I don't think readFile per se is a great means of working with large files.
It binds the whole contents as a string to a "variable", and even if you process
it by reading only a few chunks, the content already read is supposedly
accumulated in that binding.  And if you go back and read chunks from the
beginning of the string, I don't think the function actually goes back and
performs IO to read that position again.

Laziness is great for highly abstract algebraic datatypes, but not that great
for IO.  For that, you have buffered IO just like in other languages (except
for being monadic).
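
As a rough illustration of the buffered, monadic IO mentioned above, here is a minimal sketch that counts the lines of a file through a Handle instead of binding the whole contents with readFile. The names countLines and demo.txt are made up for this example, and the buffering mode is just one reasonable choice.

```haskell
import System.IO

-- Count the lines of a file one line at a time through a buffered Handle.
-- Only the current line is referenced at any moment, so lines that have
-- already been counted are free to be garbage-collected.
countLines :: FilePath -> IO Int
countLines path = withFile path ReadMode $ \h -> do
    hSetBuffering h (BlockBuffering Nothing)  -- block buffer of default size
    go h 0
  where
    go h n = do
        eof <- hIsEOF h
        if eof
            then return n
            else do
                _ <- hGetLine h
                go h $! n + 1   -- force the counter so thunks do not pile up

main :: IO ()
main = do
    -- demo.txt is a hypothetical sample file created only for this sketch.
    writeFile "demo.txt" "alpha\nbeta\ngamma\n"
    n <- countLines "demo.txt"
    print n
```

Because nothing keeps a reference to the lines already read, memory use should stay roughly constant regardless of file size, which is the property the readFile approach does not guarantee.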

From: Darren New
Subject: Re: Haskell raving
Date: 31 Oct 2007 20:59:00
Message: <472932e4$1@news.povray.org>

Warp wrote:
>   If a string is really internally a list of smaller strings, wouldn't GC
> become more difficult because you can actually have "references" pointing
> to the middle of such strings and not their base?

You'd have to implement that kind of reference as a base+offset sort of 
thing. Or buffer it and copy it into a list as needed, like building up 
something from multiple reads of C stdio.

>   I must admit, though, that I don't know how GC is implemented.

No, you're right. The kind of GC that can handle arbitrary pointers is 
called "conservative GC" if I recall correctly.  It works pretty poorly.

-- 
   Darren New / San Diego, CA, USA (PST)
     Remember the good old days, when we
     used to complain about cryptography
     being export-restricted?

From: Tim Attwood
Subject: Re: Haskell raving
Date: 1 Nov 2007 02:34:35
Message: <4729818b$1@news.povray.org>

>  But in this case the data is one big string?
>
>  Someone else said that strings are actually linked lists of characters
> where each character is garbage-collected. This must require humongous
> amounts of memory, especially considering that one character would only
> require 1 byte...

Yeah, it does take a lot, and it uses unicode for characters too,
I think I heard it takes about 12 bytes per character that way.
It's perty bad on big files.

You can use the ByteString module though, it uses a single byte
value for a character, and passes pointers and stuff around behind
the scenes to avoid duplicating stuff that doesn't need to be.
A ByteString is sort of a list of pointers to C style strings, but
with all the pointers hidden.
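
A small sketch of what using ByteString can look like, assuming the Data.ByteString.Char8 interface from the bytestring package. The helper names countNewlines and firstLine are invented for this example. As far as I know, the strict variant used here is a single packed buffer, while the lazy variant is the one built from a list of chunks.

```haskell
import qualified Data.ByteString.Char8 as BC

-- Count newline bytes in a strict ByteString (a packed byte array,
-- one byte per character).
countNewlines :: BC.ByteString -> Int
countNewlines = BC.count '\n'

-- Take everything up to the first newline. The result refers to the same
-- underlying buffer rather than copying the bytes.
firstLine :: BC.ByteString -> BC.ByteString
firstLine = BC.takeWhile (/= '\n')

main :: IO ()
main = do
    let text = BC.pack "alpha\nbeta\ngamma\n"
    print (BC.length text)        -- length in bytes
    print (countNewlines text)
    BC.putStrLn (firstLine text)
```

Note that the Char8 interface truncates characters to 8 bits, so this only fits data that is really byte-oriented.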

From: Warp
Subject: Re: Haskell raving
Date: 1 Nov 2007 07:57:55
Message: <4729cd53@news.povray.org>

Tim Attwood <tim### [at] comcastnet> wrote:
> Yeah, it does take a lot, and it uses unicode for characters too,
> I think I heard it takes about 12 bytes per character that way.
> It's perty bad on big files.

  But hey, that's the fad in modern programming: Memory usage and speed
are irrelevant. Processors are getting faster and memory amounts are
doubling every couple of years, so who cares if something like a string
takes 12 times as much memory as it could?

-- 
                                                          - Warp

From: Warp
Subject: Re: Haskell raving
Date: 1 Nov 2007 08:00:48
Message: <4729ce00@news.povray.org>

Orchid XP v7 <voi### [at] devnull> wrote:
> Warp wrote:
> >   That didn't really answer the question of whether it is able to drop
> > the parts which are no longer used.

> Garbage collection.

> In Haskell, a "string" is a linked-list of characters. Let go of the 
> pointer to the start of the list and all elements up to the first one 
> you've still got a pointer to will be collected and freed.

  Btw, what does it do if you jump to the beginning of the string after
it has been garbage-collected? (After all, from the programmer's point
of view it's one single string, so you should be able to read it from
whatever point you like, shouldn't you?)

-- 
                                                          - Warp

From: Alain
Subject: Re: Haskell raving
Date: 1 Nov 2007 11:00:49
Message: <4729f831$1@news.povray.org>

Tim Attwood enlightened us on 2007/11/01 03:34:
>>  But in this case the data is one big string?
>>
>>  Someone else said that strings are actually linked lists of characters
>> where each character is garbage-collected. This must require humongous
>> amounts of memory, especially considering that one character would only
>> require 1 byte...
> 
> Yeah, it does take a lot, and it uses unicode for characters too,
> I think I heard it takes about 12 bytes per character that way.
> It's perty bad on big files.
It's about 12 BITS per character, on average, not BYTES! That's using UTF8
encoding. About 16 BITS per character if using UTF16 encoding.
UTF8 is only 7 BITS per character if you stick to the standard ASCII
character set, but it gets bigger if you also use extended ASCII or characters
from foreign alphabets.
> 
> You can use the ByteString module though, it uses a single byte
> value for a character, and passes pointers and stuff around behind
> the scenes to avoid duplicating stuff that doesn't need to be.
> A ByteString is sort of a list of pointers to C style strings, but
> with all the pointers hidden.

-- 
Alain
-------------------------------------------------
You know you have been raytracing for too long when the animation you
render will be finished after yourself.
Urs Holzer

From: Darren New
Subject: Re: Haskell raving
Date: 1 Nov 2007 12:44:00
Message: <472a1060$1@news.povray.org>

Warp wrote:
>   Btw, what does it do if you jump to the beginning of the string after
> it has been garbage-collected? 

It won't get GCed if it's possible to jump to the beginning.

> (After all, from the programmer's point
> of view it's one single string, 

If it's that, and not a "linked list of characters" (from the 
programmer's point of view), then it won't get GCed until you're done 
with it.

Probably not a good way to process big files, tho, just as it's not a 
good idea in C. :-)
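
To make the reachability point concrete, here is a minimal sketch with two made-up functions, streamSum and sumThenLength. It assumes ordinary lazy lists and says nothing about what a particular compiler's optimiser might do beyond that.

```haskell
import Data.List (foldl')

-- Streams through the list: nothing refers to the front once it has been
-- consumed, so the collector is free to reclaim cells behind the traversal.
streamSum :: Int -> Int
streamSum n = foldl' (+) 0 [1 .. n]

-- Uses the same list twice: 'xs' stays reachable from its start while the
-- sum is computed, because 'length xs' still needs to walk it afterwards.
-- The whole list therefore has to stay in memory between the two passes.
sumThenLength :: Int -> (Int, Int)
sumThenLength n =
    let xs    = [1 .. n]
        total = foldl' (+) 0 xs
    in total `seq` (total, length xs)

main :: IO ()
main = do
    print (streamSum 1000000)
    print (sumThenLength 1000000)
```

Both functions return correct results. The difference only shows up in peak memory, since in the second one the beginning of the list can still be "jumped back to".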

-- 
   Darren New / San Diego, CA, USA (PST)
     Remember the good old days, when we
     used to complain about cryptography
     being export-restricted?

From: Darren New
Subject: Re: Haskell raving
Date: 1 Nov 2007 12:46:22
Message: <472a10ee$1@news.povray.org>

Warp wrote:
>   But hey, that's the fad in modern programming: Memory usage and speed
> are irrelevant. Processors are getting faster and memory amounts are
> doubling every couple of years, so who cares if something like a string
> takes 12 times as much memory as it could?

Depends on the size of the string, really. If my 100-byte string holding 
an email address turns into 1200 bytes, I'm not really going to worry 
about it much. :-)  Agreed, really big strings where you're actually 
starting to cause significant paging can be a problem, but you're really 
just hitting the wall earlier in Haskell than in other implementations.

-- 
   Darren New / San Diego, CA, USA (PST)
     Remember the good old days, when we
     used to complain about cryptography
     being export-restricted?

From: Warp
Subject: Re: Haskell raving
Date: 1 Nov 2007 13:40:34
Message: <472a1da2@news.povray.org>

Alain <ele### [at] netscapenet> wrote:
> It's about 12 BITS per character, on average, not BYTES! That's using UTF8
> encoding. About 16 BITS per character if using UTF16 encoding.
> UTF8 is only 7 BITS per character if you stick to the standard ASCII
> character set, but it gets bigger if you also use extended ASCII or characters
> from foreign alphabets.

  Except that if each single character is indeed garbage-collected, that
requires quite a lot of memory per character (compared to the size of the
character).
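
A back-of-the-envelope version of that overhead, as a sketch. The layout is an assumption about a typical GHC-style heap and is not confirmed anywhere in this thread: three machine words per list cell (header, pointer to the character, pointer to the rest of the list) and two words per boxed character, with the option that common character boxes are shared instead of allocated per cell. All names below are hypothetical.

```haskell
-- Assumed heap layout (not taken from the thread), in machine words.
consWords, charWords :: Int
consWords = 3   -- header + pointer to the character + pointer to the tail
charWords = 2   -- header + the character value itself

-- Bytes per character of a linked-list string for a given word size.
-- 'shared' models character boxes that are shared between cells, so that
-- only the list cell itself is paid for.
bytesPerChar :: Int -> Bool -> Int
bytesPerChar wordBytes shared =
    wordBytes * (consWords + (if shared then 0 else charWords))

main :: IO ()
main = mapM_ print
    [ bytesPerChar 4 True    -- 32-bit words, shared character boxes
    , bytesPerChar 4 False   -- 32-bit words, one box per character
    , bytesPerChar 8 True    -- 64-bit words, shared character boxes
    , bytesPerChar 8 False   -- 64-bit words, one box per character
    ]
```

Under these assumptions a 32-bit build with shared character boxes comes to 12 bytes per character, which lines up with the figure quoted earlier in the thread. The encoding of the character itself is a small part of that total.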

-- 
                                                          - Warp

From: Orchid XP v7
Subject: Re: Haskell raving
Date: 1 Nov 2007 13:57:08
Message: <472a2184$1@news.povray.org>

Warp wrote:
> Orchid XP v7 <voi### [at] devnull> wrote:
> 
>> In Haskell, a "string" is a linked-list of characters.
> 
>   I shudder thinking how much memory that must require...

It's fine for "small" strings. For "large" strings, it's really 
sub-optimal in space and time.

This is why the "ByteString" library was developed. It can present a 
string-style interface, yet every character consumes only 1 byte of 
storage (plus a little overhead for the array itself).

In the most efficient variant, the string is actually a linked list of 
array chunks. That means you can do fast concatenation (just link some 
chunks together), fast substring extraction (just point to the ends, no 
data is copied), and GC can free individual chunks of data. And yet, it 
all handles "as if" it were a naive linked list of individual characters...
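
A minimal sketch of the chunked variant, assuming the Data.ByteString.Lazy interface from the bytestring package. The helpers fromPieces, chunkCount and slice are invented for this example.

```haskell
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Char8 as BLC

-- Build a lazy ByteString from strict chunks; each chunk is a packed array.
fromPieces :: [String] -> BL.ByteString
fromPieces = BL.fromChunks . map BC.pack

-- Number of chunks that currently make up the string.
chunkCount :: BL.ByteString -> Int
chunkCount = length . BL.toChunks

-- Substring by offset and length. Chunk payloads are referenced by the
-- result, not copied.
slice :: Int -> Int -> BL.ByteString -> BL.ByteString
slice off len = BL.take (fromIntegral len) . BL.drop (fromIntegral off)

main :: IO ()
main = do
    let a = fromPieces ["Hello, ", "lazy "]
        b = fromPieces ["world"]
        s = BL.append a b   -- links the chunk lists together
    print (chunkCount s)
    BLC.putStrLn (slice 7 4 s)
```

The interface mirrors the ordinary list functions (take, drop, append), which is what lets it be handled "as if" it were a list of characters.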

So you see, Haskell *can* be made efficient. It's just not completely 
done yet.
