POV-Ray : Newsgroups : povray.off-topic : Nice data structure : Re: Nice data structure Server Time
5 Sep 2024 15:28:03 EDT (-0400)
  Re: Nice data structure  
From: Darren New
Date: 20 Jun 2009 17:26:18
Message: <4a3d53fa$1@news.povray.org>
Warp wrote:
> Darren New <dne### [at] sanrrcom> wrote:
>> I think the way the original paper worked it is to store the '\0' as part of 
>> the string in the tree. So your CAT vs CATS would be
>> -->C-->A-->T-->S-->\0
>>             |
>>             v
>>            \0
> 
>   But that's not a null pointer indicating the end of the string. That's
> a regular node containing a special character which indicates the end of
> the string. Implementing my automatically-balanced tree creation idea can
> use the exact same method for marking ends of strings.

Then we're in agreement. I was using \0 to indicate ASCII NUL, not C's NULL 
pointer.  You can either use a node whose character value is '\0', or you 
can use a separate bit, depending on your needs.

>   Btw, the node containing an end-of-string value idea works only as long
> as there's a special character value which never appears in a string. It
> can't be done like that if the range of values of each character in your
> "string" can hold is the full 0-255 range. Of course you could use values
> larger than 8 bits, but that's just basically the exact same thing as having
> an end-of-string bit at each node.

Yep. Or you could have another pointer that points to the data that is 
associated with that string (i.e., the "value" of the "key/value" pair), and 
if that's not NULL, then a string ends at that node.

> 16 bytes to store one single character sounds quite a lot... :)

And a fourth pointer for the value associated, yes. Of course, any tree 
structure is going to be close to the same size too.  If you have lots and 
lots of strings, and you're more worried about space than speed, a hash 
table works out better, I expect.

>   Of course strings with equal beginnings will share nodes, but if the
> strings are very diverse, they will still end up using quite a lot of
> memory.

If you're using up enough memory to be worried about it, chances are you 
have enough strings that the beginnings will overlap a lot? I mean, unless 
you're really memory-constrained.

>   In fact, since no character can be smaller than '\0', you could also
> have additional '\0' nodes on the left branch, each one pointing to
> associated data. That way you can have a multiset, ie. the same key
> pointing to multiple data, rather than just one.

Hmmmmm. Well, certainly the '\0' node could point to a different structure 
than the other nodes. You already have three pointers there to play with, 
and you could reasonably hang an entire tree underneath if you wanted.

-- 
   Darren New, San Diego CA, USA (PST)
   Insanity is a small city on the western
   border of the State of Mind.


Post a reply to this message

Copyright 2003-2023 Persistence of Vision Raytracer Pty. Ltd.