POV-Ray: Newsgroups: povray.off-topic: 10 things to use a Min Heap for

POV-Ray : Newsgroups : povray.off-topic : 10 things to use a Min Heap for		Server Time 12 Jul 2025 12:35:06 EDT (-0400)

Goto Latest 10 Messages

Next 10 Messages >>>

From: Invisible
Subject: 10 things to use a Min Heap for
Date: 18 Jun 2008 06:56:02
Message: <4858e9c2$1@news.povray.org>

...OK, actually I can only think of 2 things to use a Min Heap for:

1. A heap sort.

2. Building Huffman trees.

Can anybody else think of a good use for a Min Heap? (Or even a Max Heap?)

I can think of plenty of algorithms that require some data in sorted 
order, but any sorting algorithm will do for that. (And on current 
hardware, in-place algorithms are usually faster than a heap sort.) Can 
anybody think of an algorithm that benefits from an actual heap?



[For anybody not already familiar with such things...

A Min Heap is a kind of binary tree that stores data in such a way that 
retrieving the "minimum" element is an O(1) operation. Insertion and 
deletion are both O(log n), and I believe joining two heaps should be 
O(log n) as well.

A "heap sort" involves turning something into a heap (with O(n log n) 
complexity) and then repeatedly removing the minimum element (with O(n) 
complexity) to yield a fully sorted list.

To build a "Huffman tree", you start with a set of trivial 1-node trees. 
Each tree has a probability assigned to it. The algorithm is to simply 
select the two lowest-probability tries, remove them from the set, 
combine them into a larger tree (with a correspondingly higher 
probability) and insert this back into the set. Eventually you will have 
1 tree with Pr=1.

If this "set" is in fact a Min Heap sorted by tree porbability, then you 
end up with a nice fast algorithm.

Here ends the lesson on algorithms and data structures.]

-- 
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*

Post a reply to this message

From: scott
Subject: Re: 10 things to use a Min Heap for
Date: 18 Jun 2008 07:04:52
Message: <4858ebd4@news.povray.org>

Dude, you so should be doing something else than fixing computers for a job.

Post a reply to this message

From: Invisible
Subject: Re: 10 things to use a Min Heap for
Date: 18 Jun 2008 07:17:46
Message: <4858eeda$1@news.povray.org>

scott wrote:
> Dude, you so should be doing something else than fixing computers for a 
> job.

No, really??

Heh. Now tell me something I don't already know. ;-) [Such as how to 
*find* a better job... that'd be useful!]

While I'm not desputing the truth of your statement, I am curios to know 
what lead you to this conclusion. The fact that I know what a min heap 
is? [It's clearly described on Wikipedia. Doesn't take long to 
comprehend.] That I know about Huffman trees? [Again, not hard to look 
up.] What?

-- 
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*

Post a reply to this message

From: scott
Subject: Re: 10 things to use a Min Heap for
Date: 18 Jun 2008 07:33:02
Message: <4858f26e@news.povray.org>

> No, really??
>
> Heh. Now tell me something I don't already know. ;-) [Such as how to 
> *find* a better job... that'd be useful!]

Try hammering some random search terms into monster.co.uk, even "haskell" 
brings up a couple of results!  I can't believe there is no job on there for 
you.

> While I'm not desputing the truth of your statement, I am curios to know 
> what lead you to this conclusion. The fact that I know what a min heap is? 
> [It's clearly described on Wikipedia. Doesn't take long to comprehend.] 
> That I know about Huffman trees? [Again, not hard to look up.] What?

The fact that you always go on about quite complicated stuff that not many 
other people would understand, and I mean not many within the IT industry, 
not the entire population.

Post a reply to this message

From: Invisible
Subject: Re: 10 things to use a Min Heap for
Date: 18 Jun 2008 08:19:42
Message: <4858fd5e$1@news.povray.org>

scott wrote:
>> No, really??
>>
>> Heh. Now tell me something I don't already know. ;-) [Such as how to 
>> *find* a better job... that'd be useful!]
> 
> Try hammering some random search terms into monster.co.uk, even 
> "haskell" brings up a couple of results!

Yep. Already done that. And they have my CV, so every now and then they 
email me a job advert or 12. [Usually in Canada... WTF??] I even went to 
an interview a few weeks back...

They've sent me a couple of messages telling me I'm "ideal" to work for 
HMGCC. ("HMG" as in "Her Majesty's Government". Scary shit, right there!)

> I can't believe there is no job on there for you.

I'm fairly confident somebody somewhere wants to employ somebody like 
me. The part I'm worried about is *finding* this company...

>> While I'm not desputing the truth of your statement, I am curios to 
>> know what lead you to this conclusion.
> 
> The fact that you always go on about quite complicated stuff that not 
> many other people would understand, and I mean not many within the IT 
> industry, not the entire population.

Pfft. It's all stuff that's readily available on Wikipedia. And it's not 
exactly "complicated".

Assuming you already comprehend what a binary tree data structure is, a 
heap is just a special case: Every node holds data, and the data in each 
node must be less than or equal to the data in both its child nodes. 
[Notice there's no ordering constraint on siblings - so the tree need 
not be fully ordered, only partially.]

Finding the minimum element is O(1) because, by definition, the minimum 
element is always in the root node.

Inserting an element means comparing it to what's in the root node, 
putting the smaller of the two back into the root, and propogating the 
larger of the two down the tree recursively. [Notice I didn't say which 
child node. There are actually several possible algorithms here, each 
with different characteristics.]

Deleting an element likewise involves throwing away what's in the root 
and moving the smaller of the two child elements up into the root. This 
"moving" thereby deletes an element from one of the child nodes, so you 
must recurse.

Personally I think Haskell source code says it better, but that's just me.

A Huffman code is a similarly straight-forward idea. When you save a 
normal ASCII text file, each character is given an 8-bit code. [But if 
you want to be technical, ASCII is a 7-bit code, so the extra bit is 
always a 0.] The idea behind Huffman codes is that instead of every 
character having an 8-bit code, different characters can have codes of 
different lengths.

If you give the most common characters shorter codes [and therefore, by 
necessity, give the rarer characters longer codes], then simply 
re-encoding your text from ASCII to this new coding system will make 
your file smaller! :-D

But how to construct this new coding scheme? And if codes can be 
different lengths, how do you figure out where one code ends and the 
next begins?

The answer to the second question is that the code must be a "prefix 
code". That is, no full code can ever be the prefix of another full 
code. (It may be a suffic, but never a prefix.) So if 011 is a valid 
code, all on its own, then no other valid code can begin with 011 
(although it can END with 011). This means that if the decoder sees 011, 
it knows it just read a whole code, and the next bit is a seperate code.

An easy way to build a prefix code is to make each code be a set of 
directions from the root of a tree to the leaves. Assuming each leaf 
stores a character, each leaf's address will be a prefix code.

So, "all" we have to do is build a tree where the rare characters are 
deep in the tree, and the common characters are shallow. We then read 
off the tree address for each character and we will have our prefix code.

One obvious way is to take our set of characters and split it into two 
halves such that the total probability for each half is roughly 50%. 
Then all the characters in one set go on one side of our tree, and all 
the characters in the other half go on the other side of our tree. 
Recurse until you're done, and you have a nice tree.

This is called a Shannon tree, and it doesn't work very well. It's 
actually fairly hard to split a set into two equal-probability sets. (I 
have a vague recollection it might be NP-complete. Darren will know.)

The *other* thing you can do is work backwards. The rarest characters 
should end up at the bottom of the tree, so let's put those into trees 
*first*. And the common characters need to end up near the top, so let's 
add those *last*. In other words, pick the two lowest-probability trees, 
join them into a single tree, repeat until, like Hilander, there is Only 
One.

*This* is called a Huffman tree. And it works much better. So, to summarise:

- Assign a "probability" to each character. [Guess them, estimate them, 
count them up, whatever.]

- Build a set of trivial trees each consisting of a leaf containing a 
character and its probability.

- Select the two lowest-probability trees, remove them from the set, 
join them into a larger tree, insert this back into the set. Repeat 
until only 1 tree remains.

- For each ASCII code in your input, replace it with the tree address of 
that character in your tree.

- Assuming the probabilities you assigned were correct, the output file 
is *guaranteed* to be smaller than or, in the worst case (every 
character has equal probability), exactly the same size as the original. 
(But note that you probably need to write down what the tree was so the 
decoder can reconstruct it! This will make the file fractionally larger.)

Note that *any* prefix code will work. There's nothing "magical" about 
Huffman codes. The Huffman algorithm is just an efficient way to build 
such a code. If you know that, for example, low numbers are common in 
your input and high numbers are rare, you can use a "Fibonacci code", 
which is a prefix code with a predefined set of probability assumptions, 
so it takes no effort to compute. [It's based on Fibonacci numbers, 
hence the name.]

For something like a sound signal, where most samples are near to zero, 
this could be a win. (Also, look at Predictive Coding, where you make a 
"prediction" for each sample based on some property of the last few 
samples, and record the "error" between the prediction and the actual 
sample. Assuming your predictions are good, the errors will always be 
"small" numbers.)

Huffman codes will work on anything, and they work quite well. There 
*is*, however, a thing called "arithmetic coding", which does the same 
thing but better. You see, it assigns a *fractional* number of bits to 
each character.

That probably sounds completely incomprehensible. Put simply, the 
compressed file contains an integral number of bits. (Duh!) But if you 
encode 100 identical characters, the output bit length need not be 
divisible by 100. You could say the number of bits gets rounded up to 
the nearest whole number when you complete the file.

In practice, what this means is that the number of bits per character 
can be closer to the binary logarithm of the probability for that 
character - the "perfect" bit size. So you get a few percent more 
compression this way. OTOH, the algorithm is harder. And patented, I 
believe...?

Of course, comming up with better probability estimates will have a much 
more dramatic effect on compression. Running a Huffman coder on a normal 
text file gives typically 50% compression. But take into account letter 
*combination* frequencies and you can go much higher than that. 
Theoretically, PPM + arithmetic coding should give the best compression 
that money can buy!

You know what? I think I might have just disproven the point I was 
trying to make...? <:-,

-- 
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*

Post a reply to this message

From: scott
Subject: Re: 10 things to use a Min Heap for
Date: 18 Jun 2008 08:59:48
Message: <485906c4@news.povray.org>

> They've sent me a couple of messages telling me I'm "ideal" to work for 
> HMGCC. ("HMG" as in "Her Majesty's Government". Scary shit, right there!)

I would have thought so too, I had a couple of friends that seemed to posses 
the same skills and interests that you do, and they got jobs there.  The pay 
isn't that great, but it would be good experience.

> Pfft. It's all stuff that's readily available on Wikipedia. And it's not 
> exactly "complicated".

Not to you maybe, but it would be to lots of "IT" people I know.  Just 
believe me, it's not normal for someone working in IT (and I don't mean 
software development) to know about half the stuff you post about (and seem 
to understand).

Post a reply to this message

From: Mike the Elder
Subject: Re: 10 things to use a Min Heap for
Date: 18 Jun 2008 09:20:00
Message: <web.48590a7dc40c56bb5a8888d90@news.povray.org>

Invisible <voi### [at] devnull> wrote:

> Heh. Now tell me something I don't already know. ;-) [Such as how to
> *find* a better job... that'd be useful!]
>
Actually, with respect to finding a better job, the problem has less to do with
*knowing* what to do (which isn't all that complicated) than with resolving to
actually do it.  The work-a-day world can can easily leave one feeling drained
and thoroughly disinclined to take on as demanding a project as conducting a
proper job search.  Job hunting not only uses up lots of time and effort in the
physical sense, it is also emotionally taxing in that it requires one to face
the likelihood of multiple rejections before achieving any success.  It's
rather unlikely that the following list contains anything that you couldn't
have thought of yourself, but it might be useful to have these ideas in front
of you for consideration, so here goes:

1.  Don't wait until you NEED a new job to look for a new job.  That's how
people end up having to take yet another job that they don't get anything from
but a paycheck.

2.  Make a short list NOW of places where you would like to be working in two
years and establish communications with both the human resources department AND
the department(s) that you would like to work for.  Be very explicit that your
interest is specific, serious, focused and long-term and that you are NOT
merely inquiring in regard to current posted openings.

3.  Create and *maintain* a library of job search support documents such as
resumes in various formats, references, recommendations. college transcripts
and examples of your work so that you can take advantage of any opportunity
INSTANTLY using top quality documents.

4.  Always send important communications in BOTH electronic and paper format.
In any given company, it might be either one that will be retained and used to
contact you should an opportunity arise.  Always send your resume and cover
letter UNFOLDED and in a presentation cover in a flat 9' x 12' envelope. Making
the effort to introduce yourself with professional looking documents really DOES
matter.

5. Keep records of whom you have contacted and when.  Never allow more than
ninety days to go by without renewing a contact.  Only make contact more
frequently than this after you have established a relationship with a specific
individual who has expressed a positive interest.

Again, none of this is any great revelation. The real question is whether or not
one is actually DOING it.

Best Regards and Good Hunting,
Mike C.

Post a reply to this message

From: Invisible
Subject: Re: 10 things to use a Min Heap for
Date: 18 Jun 2008 09:23:47
Message: <48590c63$1@news.povray.org>

>> They've sent me a couple of messages telling me I'm "ideal" to work 
>> for HMGCC. ("HMG" as in "Her Majesty's Government". Scary shit, right 
>> there!)
> 
> I would have thought so too, I had a couple of friends that seemed to 
> posses the same skills and interests that you do, and they got jobs 
> there.  The pay isn't that great, but it would be good experience.

Heh. When I went to that job interview a few weeks back, the guy said "I 
don't think you belong in a commercial enterprise. I think you would be 
better as a research associate at a university somewhere." Yeah, cos 
research is what I do best...? Wait, WTF? o_O

I'm not too sure about HMGCC. The whole "you will be cleared to a high 
level of security clearance" think kinda freaks me out in that whole Big 
Brother type way. o_O

Plus, all their software development stuff clearly says all over it 
"prior experience of developing large-scale applications is an absolute 
requirement". So I guess I fail, right there.

>> Pfft. It's all stuff that's readily available on Wikipedia. And it's 
>> not exactly "complicated".
> 
> Not to you maybe, but it would be to lots of "IT" people I know.  Just 
> believe me, it's not normal for someone working in IT (and I don't mean 
> software development) to know about half the stuff you post about (and 
> seem to understand).

Hmm. Tell me something - all that stuff I just posted? Does it make any 
semblance of comprehensible sense?

To *me* it makes sense, but then *I* wrote it. And I'm aware that often 
an expert will try to explain something and completely forget some vital 
item of information that the audience doesn't have but is so "obvious" 
to the expert that they forget to mention it...

-- 
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*

Post a reply to this message

From: scott
Subject: Re: 10 things to use a Min Heap for
Date: 18 Jun 2008 09:35:09
Message: <48590f0d$1@news.povray.org>

> Actually, with respect to finding a better job, the problem has less to do 
> with
> *knowing* what to do (which isn't all that complicated) than with 
> resolving to
> actually do it.  The work-a-day world can can easily leave one feeling 
> drained
> and thoroughly disinclined to take on as demanding a project as conducting 
> a
> proper job search.

"Take time to sharpen your saw"

I forgot who told me that, it was probably quoted from some management fluff 
book, but it's true.  It's better to take a break from your routine and try 
to optimise it, it will be beneficial in the long run, even if it means you 
don't get as much work done that day.  And by optimise, it could just be 
writing a program in Haskell to speed up some repetitive task, reorganising 
your entire workload, or even looking for a new job.

I'm always quoting that to our Japanese colleagues because they seem to work 
such long hours so inefficiently, and often give the "we don't have time" 
excuse for not doing something.

> 2.  Make a short list NOW of places where you would like to be working in 
> two
> years and establish communications with both the human resources 
> department AND
> the department(s) that you would like to work for.

This link may help, if you can't think of enough places:

http://www.timesonline.co.uk/tol/life_and_style/career_and_jobs/best_100_companies/best_100_tables/

Post a reply to this message

From: Invisible
Subject: Re: 10 things to use a Min Heap for
Date: 18 Jun 2008 09:44:16
Message: <48591130$1@news.povray.org>

>> Heh. Now tell me something I don't already know. ;-) [Such as how to
>> *find* a better job... that'd be useful!]
>
> Actually, with respect to finding a better job, the problem has less to do with
> *knowing* what to do (which isn't all that complicated) than with resolving to
> actually do it.

Probably not complicated if you know what you're doing. I haven't got a 
clue. Sure, I've tried the obvious things, but those haven't worked. So 
now I'm kinda not sure what's next.

> The work-a-day world can can easily leave one feeling drained
> and thoroughly disinclined to take on as demanding a project as conducting a
> proper job search.

Yeah, that's true enough. Sometimes I struggle to find the motivation to 
do stuff I *like* doing, damnit! (It can be very hard to try to motivate 
yourself to do stuff when you know nobody is actually going to care one 
way or the other.)

Actually, I've been thinking about maybe taking a week off work so I can 
spend some time doing this "for real". Unfortunately every place of 
business shuts 20 minutes before I get home. [Quite maddening really!] 
If I had a week, I'd be able to go places and speak to people and stuff...

> 1.  Don't wait until you NEED a new job to look for a new job.

Heh. I'm tempted to leave this job right NOW, even though I have nowhere 
else to go! But clueless as this company is, they've been around for 20 
years, and I don't suppose they're going anywhere real soon. They'll 
just continue to blunder on, making spectacular financial losses, as 
they always have.

> 2.  Make a short list NOW of places where you would like to be working in two
> years and establish communications with both the human resources department AND
> the department(s) that you would like to work for.  Be very explicit that your
> interest is specific, serious, focused and long-term and that you are NOT
> merely inquiring in regard to current posted openings.

OK, I fail here. I cannot think of *any* company I'd actually want to 
work for.

I mean, theoretically working for somebody like Yamaha or Korg or even 
Nokia "sounds" realy neat. But actually? I have no clue who would really 
be worth working with. My dad works in a laboratory, which "sounds" 
really cool, but actually... isn't.

I did look into a list of big-name technology companies to see where 
they have bases of employment. IIRC, Nokia, HP and Erricson have big UK 
bases, but they're nowhere near where I live. Yamaha has a big place in 
my city, but I think that's manufacturing not R&D. Korg has a base here 
too, but I couldn't find any data without speaking to a human. That's 
about as far as I got.

> 3.  Create and *maintain* a library of job search support documents such as
> resumes in various formats, references, recommendations. college transcripts
> and examples of your work so that you can take advantage of any opportunity
> INSTANTLY using top quality documents.

I have an up-to-date CV. Spent lots of money on consultants to help me 
sharpen it. It's currently available online at Monster.co.uk. Every now 
and then some random agency contacts me, asks me a few questions, and 
then is never heard from again.

It says on my CV "references available on request", but it strikes me 
taht I don't actually *have* anybody I could point them to. Nobody knows 
me! All the management in my current place of work recently changed, so 
nobody in management here really knows anything about me. My university 
shut down when I graduated. I don't have any "friends" or anything...

As for examples of my work... it strikes me that I've almost never 
produced a "finished" program in my life. That's not a good thing. And 
obviously, nobody is going to employ me to program in Haskell; they'll 
want C, C++, Java or VB (or maybe some scripting language). I haven't 
touched any of those in years now.

And then there's the minor detail that although I know a lot of stuff 
about stuff... how many people actually need to know what a Huffman tree 
is? None. Nobody needs to know this. It serves no useful purpose. And 
Warp is probably right: I'm not a very good programmer.

> 4.  Always send important communications in BOTH electronic and paper format.

You can't do that without a postal address. (And stamps. Do you have any 
idea how hard it is to purchase stamps?!)

> 5. Keep records of whom you have contacted and when.  Never allow more than
> ninety days to go by without renewing a contact.  Only make contact more
> frequently than this after you have established a relationship with a specific
> individual who has expressed a positive interest.

Fair enuf...

-- 
http://blog.orphi.me.uk/
http://www.zazzle.com/MathematicalOrchid*

Post a reply to this message

Goto Latest 10 Messages

Next 10 Messages >>>