POV-Ray : Newsgroups : povray.off-topic : Haskell arrays Server Time
29 Sep 2024 21:21:29 EDT (-0400)
  Haskell arrays (Message 10 to 19 of 29)  
<<< Previous 9 Messages Goto Latest 10 Messages Next 10 Messages >>>
From: Warp
Subject: Re: Haskell arrays
Date: 10 Mar 2009 15:27:26
Message: <49b6bf1d@news.povray.org>
Invisible <voi### [at] devnull> wrote:
> First of all, in any of the above languages, if you want to store 
> several items of data, an array is usually the first data structure you 
> reach for. It's simple, it's built-in, it's always there, it's 
> moderately efficient, and it's "good enough" to start with.

  The advantages of an array are:

1) It's the most memory-efficient data container possible. Not one single
bit of ancillary data is used per element. All elements are packed in
memory as tightly as possible (well, at least if the size of the element
is a multiple of the natural (half-)word size of the target architecture,
or the element has the size of one byte).

  You can pack certain type of elements even more efficiently than that
when using special data containers, for example radix trees, but even
then you'll have to put the container data inside an array if you want
to beat the memory usage of the "raw" array. In other words, you will
still be using an array (rather than some other data structure), even if
interpreting the contents of the array is a bit more complicated.

2) Accessing an element of the array is fast, as you have random access.
You can access any array in O(1), and moreover the constant factor
involved is very small (in the order of 1 clock cycle if the element
happens to be in the fastest cache; and even if there's a cache miss,
it will still usually be faster than accessing elements in other non-array
data containers).

3) Elements in an array are in contiguous memory locations. This fact is
often dismissed in higher-level languages, but sometimes used to one's
advantage in lower-level languages, sometimes for great speedups.

  The reason why this property can sometimes be used for speed is that
small array elements can be handled in bigger bunches (usually word-sized)
because they are all in contiguous memory locations. For example doing
certain operations to a byte string can be done 4 or 8 bytes at a time
(getting almost that much speedup) with low-level tricks.

  There are many algorithms which are very efficient and fast to perform
on arrays, while inefficient or difficult to do for other types of data
containers (such as linked lists).

  Of course arrays have their disadvantages, which is the reason why other
types of data containers exist in the first place. For example, removing one
element from the middle of the array is a slow operation, while for example
for a linked list it's very fast (assuming you have a handle to the element
to be removed). Adding elements to an array while constantly keeping it sorted
is inefficient, while with other data containers (eg. a balanced binary tree)
it's efficient. Arrays are not very dynamic, and trying to make them dynamic
(while still maintaining efficiency) usually causes overallocation and/or
memory fragmentation.

> One nice touch is that an unboxed array of Bool actually uses 1 bit per 
> element. The library automatically handles all the bit twiddling 
> operations required to pack 8 Bools into one byte, and unpack them again 
> later. (I wonder if the C++ STL can do that?)

  std::vector<bool> uses one bit per element.

  OTOH this actually causes more trouble than it's worth, as it makes
std::vector behave in certain cases rather differently when the element
type is bool compared to any other element type. This is a common complaint.

  The suggested solution is to use a separate, efficient data container for
bit-sized booleans. boost::dynamic_bitset is such a container, and it has
often been suggested to be included in the next C++ standard, and the
std::vector<bool> converted to a 1-byte-per-element container.

  boost::dynamic_bitset is an incredibly efficient data container for
individual bits. For example it can count the number of set bits in the
entire container at something like 0.5 clock cycles per bit in my computer.

-- 
                                                          - Warp


Post a reply to this message

From: Orchid XP v8
Subject: Re: Haskell arrays
Date: 10 Mar 2009 16:07:58
Message: <49b6c89e$1@news.povray.org>
Warp wrote:

>   The advantages of an array are:
> 
> 1) It's the most memory-efficient data container possible.

Yup. In Haskell you'd have a tiny amount of overhead (like, a few 
pointers), but for large arrays, this is insignifigant.

> 2) Accessing an element of the array is fast, as you have random access.
> You can access any array in O(1), and moreover the constant factor
> involved is very small.

Indeed. And since an array is so small, and contiguous in memory, it's 
highly likely that big chunks of it will be in your cache hierachy. (Not 
so for, say, linked lists, which tend to be scattered around all over 
the place.) You can also "prefetch" arrays, and so forth.

> 3) Elements in an array are in contiguous memory locations. This fact is
> often dismissed in higher-level languages, but sometimes used to one's
> advantage in lower-level languages, sometimes for great speedups.
> 
>   The reason why this property can sometimes be used for speed is that
> small array elements can be handled in bigger bunches (usually word-sized)
> because they are all in contiguous memory locations. For example doing
> certain operations to a byte string can be done 4 or 8 bytes at a time
> (getting almost that much speedup) with low-level tricks.

I would think particularly copying an array or some subset of it could 
be done in big batches (IIRC, the IA32 arch even has special opcodes for 
this), regardless of the notional size of the element type.

>   Of course arrays have their disadvantages, which is the reason why other
> types of data containers exist in the first place.

Sure.

>> One nice touch is that an unboxed array of Bool actually uses 1 bit per 
>> element. The library automatically handles all the bit twiddling 
>> operations required to pack 8 Bools into one byte, and unpack them again 
>> later. (I wonder if the C++ STL can do that?)
> 
>   std::vector<bool> uses one bit per element.

OK.

>   OTOH this actually causes more trouble than it's worth, as it makes
> std::vector behave in certain cases rather differently when the element
> type is bool compared to any other element type. This is a common complaint.

...because C++ "bool" is actually some mannar of integer?

>   boost::dynamic_bitset is an incredibly efficient data container for
> individual bits. For example it can count the number of set bits in the
> entire container at something like 0.5 clock cycles per bit in my computer.

Haskell is seemingly improving all the time, but it's still got a way to 
go. The stuff people have targetted their efforts at are quite fast, but 
some less-interesting parts are unexpectedly slow. (E.g., the various 
rounding functions are rather slow. They work by turning a Double into 
an approximate propper fraction and then converting that to an integer - 
or something equally absurd...)

-- 
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*


Post a reply to this message

From: Fredrik Eriksson
Subject: Re: Haskell arrays
Date: 10 Mar 2009 16:26:43
Message: <op.uqle2znd7bxctx@e6600>
On Tue, 10 Mar 2009 21:08:05 +0100, Orchid XP v8 <voi### [at] devnull> wrote:
> Warp wrote:
>>   OTOH this actually causes more trouble than it's worth, as it makes
>> std::vector behave in certain cases rather differently when the element
>> type is bool compared to any other element type. This is a common  
>> complaint.
>
> ...because C++ "bool" is actually some mannar of integer?

No, because std::vector<bool> is specialised in such a way that:
  1 - It is not a container (not as the C++ standard defines them)
  2 - It does not actually contain values of type 'bool'

It is one of those things that seemed like a good idea at the time, but  
did not turn out so good in practice.


-- 
FE


Post a reply to this message

From: Warp
Subject: Re: Haskell arrays
Date: 10 Mar 2009 18:41:00
Message: <49b6ec7c@news.povray.org>
> On Tue, 10 Mar 2009 21:08:05 +0100, Orchid XP v8 <voi### [at] devnull> wrote:
> > ...because C++ "bool" is actually some mannar of integer?

  The main problem is that you can't have a pointer pointing to an
individual bit. The smallest element to which a pointer can point is
a byte.

  However, in a std::vector<bool> you have 8 bools per byte, and you can't
create a pointer (or reference) which will point to one of them. This
makes std::vector<bool> behave inconsistently compared to a std::vector
of any other type.

  You can index a std::vector<bool> in the regular way, and you can even
assign values to these indexed elements. In other words, you can do this:

std::vector<bool> boolVec(100, false);
boolVec[50] = true; // Works ok

  However, since operator[] cannot return a true reference to the indexed
bool (because there can't be a reference to an individual bit), what it
returns is a proxy object which behaves like if it was a reference to the
individual bit. Thus it almost looks like std::vector<bool> behaves like
any other vector.

  However, since operator[] for bool vectors does not return a reference
of type bool, you can't do something like this:

bool& boolRef = boolVec[50];

  That will simply fail. The same would work for any type other than bool.

  References and pointers to vector elements are used quite a lot in
practical C++, so this makes std::vector<bool> inconsistent and less
usable than it could be.

  Since it's rare in practice that one would need a huge std::vector<bool>
(if what you really need is a huge bitset, then boost::dynamic_bitset is
by far a better choice, as it's more versatile and efficient), it can be
argued that making std::vector<bool> special was a design mistake.

  It has been suggested that std::vector<bool> would instead allocate
one byte per element because bytes can be addressed, and this would make
it consistent. It would probably also make it more efficient speedwise.

Fredrik Eriksson <fe79}--at--{yahoo}--dot--{com> wrote:
> No, because std::vector<bool> is specialised in such a way that:
>   1 - It is not a container (not as the C++ standard defines them)

  I'm not really sure what you mean by that.

  std::vector<bool> works mostly like any other container. The only thing
it cannot do is give you an address to an individual element. Too bad that's
a cumbersome limitation in some cases.

-- 
                                                          - Warp


Post a reply to this message

From: Nicolas Alvarez
Subject: Re: Haskell arrays
Date: 10 Mar 2009 18:48:11
Message: <49b6ee2b@news.povray.org>
Invisible wrote:
> Kyle wrote:
>> Wrong newsgroup.  Please use p.ot.f.h.b.b.b  :P
> 
> Or, alternatively, just stop writing about Haskell. I mean, it's not
> like anybody is interested in my ramblings or anything... :-S

Blog.


Post a reply to this message

From: Fredrik Eriksson
Subject: Re: Haskell arrays
Date: 10 Mar 2009 19:03:35
Message: <op.uqlmcghr7bxctx@e6600>
On Tue, 10 Mar 2009 23:41:00 +0100, Warp <war### [at] tagpovrayorg> wrote:
> Fredrik Eriksson <fe79}--at--{yahoo}--dot--{com> wrote:
>> No, because std::vector<bool> is specialised in such a way that:
>>   1 - It is not a container (not as the C++ standard defines them)
>
>   I'm not really sure what you mean by that.
>
>   std::vector<bool> works mostly like any other container. The only thing
> it cannot do is give you an address to an individual element. Too bad  
> that's a cumbersome limitation in some cases.

You described the reasons yourself higher up in your post. The C++  
standard describes the requirements for "container" types (section 23.1),  
and std::vector<bool> does not satisfy them.


-- 
FE


Post a reply to this message

From: Invisible
Subject: Re: Haskell arrays
Date: 11 Mar 2009 05:08:19
Message: <49b77f83$1@news.povray.org>
>>> Wrong newsgroup.  Please use p.ot.f.h.b.b.b  :P
>> Or, alternatively, just stop writing about Haskell. I mean, it's not
>> like anybody is interested in my ramblings or anything... :-S
> 
> Blog.

Heh. Nobody reads that either...


Post a reply to this message

From: Invisible
Subject: Re: Haskell arrays
Date: 11 Mar 2009 05:09:31
Message: <49b77fcb$1@news.povray.org>
>>> ...because C++ "bool" is actually some mannar of integer?
> 
>   The main problem is that you can't have a pointer pointing to an
> individual bit. The smallest element to which a pointer can point is
> a byte.

Ah, I see.


Post a reply to this message

From: Tim Attwood
Subject: Re: Haskell arrays
Date: 11 Mar 2009 05:37:15
Message: <49b7864b@news.povray.org>
>>>> Wrong newsgroup.  Please use p.ot.f.h.b.b.b  :P
>>> Or, alternatively, just stop writing about Haskell. I mean, it's not
>>> like anybody is interested in my ramblings or anything... :-S
>> 
>> Blog.
> 
> Heh. Nobody reads that either...

Did you see that MS is paying for a full ride for a PHD
at Strathclyde in Glasgow researching Haskell numeric
indexing? Pay is the key word there. They intend the 
code to end up in GHC, perhaps as optimizations for
parallel threads. You should apply for this Andrew,
it has to be better than where you are working. If you
make them happy I bet you could end up working for MS.


Post a reply to this message

From: Invisible
Subject: Re: Haskell arrays
Date: 11 Mar 2009 05:49:19
Message: <49b7891f$1@news.povray.org>
Tim Attwood wrote:

> Did you see that MS is paying for a full ride for a PHD
> at Strathclyde in Glasgow researching Haskell numeric
> indexing?

http://personal.cis.strath.ac.uk/~conor/phds/

No, I missed the announcement.

> Pay is the key word there. They intend the code to end up in 
> GHC, perhaps as optimizations for
> parallel threads. You should apply for this Andrew,
> it has to be better than where you are working. If you
> make them happy I bet you could end up working for MS.

Heh, well it can't hurt to ask...

(Where the hell is "strathclyde" though? Or Glasgow for that matter... 
Time for Google Maps!)


Post a reply to this message

<<< Previous 9 Messages Goto Latest 10 Messages Next 10 Messages >>>

Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.