POV-Ray: Newsgroups: povray.off-topic: A monologue involving binary log

POV-Ray : Newsgroups : povray.off-topic : A monologue involving binary log		Server Time 8 Jul 2024 09:01:59 EDT (-0400)

Goto Latest 10 Messages

Next 10 Messages >>>

From: Orchid Win7 v1
Subject: A monologue involving binary log
Date: 14 Feb 2015 10:33:32
Message: <54df6acc@news.povray.org>

OK, so it's been a while since I posted a big wall of text. So here goes...



The name of the game is data compression. If you already know how that 
works, this is going to be a VERY boring post. If you don't know, it 
probably seems like magic; you take a file, throw it at WinZip, and 
suddenly the file gets 4x smaller.

Now MP3 does that to sound files, but it does that by DELETING DATA. The 
compressed file has lower fidelity than the original. It's not that 
surprising that you can make a file smaller by removing data from it. 
But LOSSLESS compression algorithms do something seemingly impossible: 
they take a file, somehow make it smaller, and yet the process is 
EXACTLY REVERSIBLE. When you decompress the file again, it is IDENTICAL 
to the original. How is that even possible?

Well, it ISN'T always possible. Sometimes when you zip a file it gets 
tiny, and sometimes it doesn't get that much smaller. You may not have 
experienced this, but occasionally a file will get LARGER. For example, 
zip a file - any file - and then go zip the zip file. It WILL NOT get 
any smaller. I promise. (Think about it; otherwise you could take an 
arbitrary file and zip it over and over until it gets arbitrarily small. 
But that really IS impossible!)

Rather than dwell on this further, I'm going to show you some 
compression algorithms now. We're going to start with Huffman coding.

As you probably know, the American Standard Code for Information 
Interchange ("ASCII") assigns an 8-bit code to every letter of the 
alphabet. (You may or may not know that technically ASCII is actually a 
7-bit code; any text encoding which actually *uses* the 8th bit for 
something technically isn't ASCII, it's an ASCII extension.)

You might also be aware that Unicode defines zillions of characters as 
24-bit codes. In UTF-32, every character is stored as 32 bits. In 
UTF-16, almost all characters are stored as 16 bits, but particularly 
unusual characters take 32 bits. And in UTF-8, many characters take only 
8 bits, a few take 16 bits, the really rare ones take 24 bits and a tiny 
few take 32 bits. (The extra bits are necessary to signal how many bits 
make up the character.)

The various UTF encodings are of course designed to make encoding and 
decoding as painless as possible, while not wasting absurd amounts of 
space. UTF-32 is particularly easy to deal with, but very space-hungry. 
UTF-8 is the most frugal, but quite awkward if there are lots of 
multi-byte characters.

But suppose we want to assign codes to characters, not so that 
processing is easy, but so that space is minimised. Ease of use is not a 
requirement; we just want to crunch the space down as much as possible.

It seems obvious that the fewer bits per character we have, the less 
space this is all going to take up. If you have a plain ASCII file, just 
dropping every 8th bit already gives us 1/8 compression ( = 12.5%).

We can go one step further: Scan the whole file, and see which 
characters are *actually used* in this file. Codes 0 to 31 in ASCII are 
all "control codes", and with the exception of the newline character(s), 
they're all pretty much obsolete now. By not assigning any code at all 
to characters we aren't actually using, we *might* be able to get away 
with 6 bits per character. Maybe.

But even if we do, that's still "only" 25% compression. And if there are 
64 characters used then 6 bits of character will do, but if there are 65 
characters used then we have to have 7 bits - which is annoying.

But now, suppose we assign different code lengths to different 
characters. By assigning shorter codes to common letters and longer 
codes to rare letters, we can potentially store the same data in very 
much reduced space.

Suppose, for example, that we assign a 2-bit code to the letter E, and 
7-bit codes to everything else. Then every single time the letter E 
appears, we're saving 6 bits (compared to the original ASCII file). 
Since E probably appears quite a bit, that could add up to substantial 
savings. (Hint: The character " " probably appears *way more* than ANY 
actual letter of the alphabet, but hey.)

It might seem tempting to assign the shortest possible codes to 
everything. However, for X number of bits, there are only 2^X possible 
codes. If you make one code shorter, you have to make other codes 
correspondingly longer. So if you take a character that isn't *that* 
common and give it a really short code, you'll have to make all the 
other codes longer to compensate. Because the character isn't *that* 
common, the savings from the short code don't make up for the losses in 
all the other longer codes.

In summary, for every character there is a mathematically "perfect" code 
length, which will give best possible compression. And that length is 
equal to [the negation of] the binary logarithm of the probability of 
that character appearing:

   bits(X) = -log2 Pr(X)

If X takes more bits than this, you're wasting space. If X takes fewer 
bits than this, you're wasting bits somewhere else. So this is the 
"perfect" length of X.

(Note that if we were trying to encode characters with decimal codes, it 
would be log10 rather than log2. The 2 is because computers are binary.)

You can work out the "probability of X appearing" by simply counting how 
many X characters there are in the entire file, divided by the *total* 
number of characters. So that's easy enough.

Now, how to assign codes to characters? And how to figure out where one 
character ends and the next begins? In a normal (ASCII) text file, you 
know that each byte is one character. But if different characters take 
different numbers of bits, how are we to know where the start and end of 
each character code is?

The answer is to use a binary tree. For example:

       +
      / \
     +   C
    / \
   A   B

Starting from the top, go left for 0 and go right for 1. If we start at 
the top and go right, we immediately reach C. So C = 1. If we go left 
and then left, we get to A. So A = 00. And B = 01. So from this tree, we 
have

   A = 00
   B = 01
   C = 1

As we read bits from the input file, we can walk down the tree, going 
left when we read a 0 and right when we read a 1. When we get to a leaf, 
we spit out that character, and go back to the top of the tree. Easy. 
[But very inefficient. There are tricks to speed this up, though.]

So to build a code which can be unambiguously decoded, we just need to 
build a suitable code tree. How can we do that?

One way is to build a "Shannon tree". We count the characters in the 
file, and end up with a table like

   Char  Prob
    A      7%
    B     12%
    C      3%
   ...    ...

Now we need to build a tree. What we do is to try to split the set of 
all [used] characters into two sets who's total probability is as close 
to 50% as possible. All the items in the left subset get codes starting 
with a 0, all the codes in the right subset get codes starting with a 1. 
Now recurse. When a subset contains only one item, we have determined 
its code.

Trouble is, splitting a set into two subsets with equal sum... That's 
the Partition Problem, and it's NP-complete. Which is a technical way of 
saying that this is REALLY FREAKING HARD to do well. Worse, there may be 
more than one partitioning that yields roughly 50%, and which one you 
pick affects how easy subsequent partitionings are. This looks really, 
really hard...

Worse still, it can be proven that Shannon trees are sometimes 
unavoidably sub-optimal. So this algorithm is really hard to problem, 
really slow to execute, and doesn't even give optimal results. Great. :-/

The thing is, you don't need to do any of this. Because there's a way, 
way easier algorithm: You can build a Huffman tree instead.

Instead of starting with the set of all characters and trying to 
recursively split it to build a tree, we start with single characters 
and recursively join them to build a tree. The algorithm is:

   1. Start with a set of trivial 1-element trees containing just a 
character.
   2. Pick the two lowest-probability trees and remove them from the set.
   3. Combine the two trees, and insert the new tree back into the set.
   4. If the set has more than one tree, to go step 2.

It's difficult to come up with an example big enough to see what's 
happening, but small enough to not take thirty pages of printout. Let's 
try this:

   Char  Prob
    A      2%
    B      2%
    C      5%
    D     20%
    E     68%
    F      3%

Our initial set of trees is therefore:

    2%    2%    3%    5%    20%    68%
    A     B     F     C      D      E

The two lowest-probability trees are A (2%) and B (2%). 2% + 2% = 4%. So 
now we get:

    3%    4%    5%    20%    68%
    F     +     C      D      E
         / \
        A   B
        0   1

Now we join F to AB:

    5%    7%    20%    68%
    C     +      D      E
         / \
        F   +
        0  / \
          A   B
         10   11

Now we join C to that tree:

     12%    20%    68%
      +      D      E
     / \
    C   +
    0  / \
      F   +
     10  / \
        A   B
      110   111

Add on D:

       32%    68%
        +      E
       / \
      +   D
     / \  1
    C   +
   00  / \
      F   +
    010  / \
        A   B
     0110   0111

And finally, our Huffman tree:

        100%
          +
         / \
        +   E
       / \  1
      +   D
     / \  01
    C   +
  000  / \
      F   +
   0010  / \
        A   B
    00110   00111

So we have

   Char   Prob   Code    Bits   Log2(Prob)
    A      2%    00110   5      -5.643
    B      2%    00111   5      -5.643
    C      5%    000     3      -4.321
    D     20%    01      2      -2.321
    E     68%    1       1      -0.556
    F      3%    0010    4      -5.058

As you can see, the actual number of bits is quite close to -Log2(Prob). 
(Only C is a bit off.) If E really does make up 68% of the characters in 
the file, then by encoding it with just 1 bit, we're saving a huge 
amount of space.

The trouble with all this, of course, is that in order to *decode* the 
data, you need a copy of the Huffman tree used to encode it. So you have 
to store that somewhere. Fortunately, if you pick the codes in a 
suitable systematic way, it's possible to reconstruct the codes just by 
knowing how many bits were assigned to each character.



Alright, so that's nice and everything. But look at that table again:

   Char   Prob   Code    Bits   Log2(Prob)
    A      2%    00110   5      -5.643
    B      2%    00111   5      -5.643
    C      5%    000     3      -4.321
    D     20%    01      2      -2.321
    E     68%    1       1      -0.556
    F      3%    0010    4      -5.058

Ideally, we want to assign half a bit to E. Well, that's *obviously* 
impossible, right? I mean, a bit is a 0 or a 1. What the heck would half 
a bit look like?

It turns out you *can* actually do this. (!) Suppose we want to use 
exactly 1.5 bits per character. If you ask the encoder to encode 1 
character, then 2 bits come out. But if you ask it to encode 2 
characters, only 3 bits come out. Not 4, just 3. And this is what it 
means; the number of bits output is always a whole number (obviously), 
but it's the fractional number of bits, rounded up. (It's a bit like 
buying a box of 7400s priced at 21.225p each. Obviously there's no such 
thing as 0.225p; if you buy one chip, they change you 22p. But if you 
buy 1,000 chips...)

Huffman encoding can't do fractional bits. And there are some 
limitations on the combinations of bit-lengths you can get. (E.g., C 
should be 4 bits or 5 bits, but we've given it 3 bits.) But there is an 
algorithm that lacks these limitations: range coding.

I'm going to demonstrate this in decimal, but the principle in binary is 
just the same. We start off with an "active range", which begins with 
0.0000 to 0.9999:

   Max = 0.9999
   Min = 0.0000

We then split that range into several subranges, one for each possible 
character. The width of that range equals the probability of that 
character appearing next:

         0.9999
    F    ||||||
         0.9700
         0.9699
    E    ||||||
         0.2900
         0.2199
    D    ||||||
         0.0900
         0.0899
    C    ||||||
         0.0400
         0.0399
    B    ||||||
         0.0200
         0.0199
    A    ||||||
         0.0000

Suppose the first character to be encoded happens to be a C. Then that 
becomes the current range:

   Max = 0.0899
   Min = 0.0400

and we split this new subrange up as before:

         0.089999
    F    ||||||||
         0.085800
         0.085799
    E    ||||||||
         0.054500
         0.054499
    D    ||||||||
         0.044500
         0.044499
    C    ||||||||
         0.042000
         0.041999
    B    ||||||||
         0.041000
         0.040999
    A    ||||||||
         0.040000

This process continues for each and every character. Finally, at the end 
of the calculation, we have a really, really tiny range. For example:

   Max = 0.04487515156722040876510168073524657069870453400
   Min = 0.04487515156722040876510168073524657069870453399

We then output one of the two values (it doesn't matter which), together 
with some indication of how many characters are supposed to be decoded. 
(E.g., if the input was just lots of As, the min value would stay at 
zero forever, and we couldn't tell how many As to decode.)

This is great and all, but it looks like we need to calculate with 
*ridiculous* precision. If I compress a 2GB file and it becomes a 200MB 
file, that's 200MB for *one number*. We have to do all calculations with 
200MB of precision?! That doesn't sound fun...

Fortunately, this is not the case. As the range gets smaller and 
smaller, soon the leading digits all become the same. Look at the table 
above: every single code begins 0.0xxx. So we can output the 0.0 part, 
and shift it out of our calculations. The result is that when we get to 
here:

   Max = 0.448737
   Min = 0.448611

we can output "0.448" and change the numbers to

   Max = 73799999
   Min = 61100000

We can then continue our calculation as before. So we only need to hold 
in memory the handful of digits that are *actually changing*. To the 
left are digits we've already output into the compressed file. To the 
right, the max is 999 recurring, and the min is 000 recurring.

(There is an issue called "underflow", where the min gets stuck at 9999 
and the max gets stuck at 0000 and the digits don't meet before you run 
out of precision. You can work around that by "remembering" how many 0s 
and 9s there are, and cancelling them out temporarily. When the digits 
do align, you just need to remember to reanimate the cancelled digits. 
It's fiddly, but not hard.)

Notice that the smaller a range is, the quicker the leading digits start 
to match, and the sooner we write digits out to the compressed file. In 
other words, narrow ranges take more precision to specify. To summarise, 
rare symbols take more digits than common ones - just like with Huffman 
codes.



Huffman coding and range coding are both examples of "entropy coding". 
In either case, how much compression you get depends on:

   1. How much difference there is between the probabilities of the symbols.
   2. How accurately you can predict the probabilities.

#1 is a property of the file itself; you can't change that. Some files 
just compress better than others. #2, however, depends on how you 
estimate your symbol probabilities.

The simplest thing to do is to count how many times each character 
appears in the whole file, and divide by the total character count to 
get a probability for each character. That *works*, but the decoder 
doesn't have access to the uncompressed file to compute the statistics, 
which means you have to *tell* the decompressor what to do.

Fundamentally, there are two kinds of compression:

* In "static" compression, you scan the entire file, decide how to 
compress it, and then do the actual compression.

* In "dynamic" compression, you decide how to compress the next item 
based only on the data you've already processed so far.

Static compression has the advantage that you can "see" all the data, so 
you can make the best compression decisions. But the decompressor can't 
see that data, so you have to "tell" the decompressor what you decided 
to do (e.g., tell it what Huffman table you built), or it won't be able 
to decompress. "Telling" the decompressor takes up disk space, making 
the file larger and partially negating the potentially superior 
compression ratio.

Dynamic compression means you have limited information to work with, so 
you can't make compression decisions quite as well, resulting in a lower 
compression rate. On the other hand, the decompressor can "see" exactly 
what the compressor saw, and by running the same algorithm it can come 
to the same conclusion. In other words, you don't have to "tell" the 
decompressor what you did; it can work it out automatically from the 
compressed data itself. So you get lower compression rates, but there's 
no extra data tables to send, so there's less file overhead.

In the end, it's swings and roundabouts. But I will say this: some 
algorithms work better static, and some work better dynamic.

You can do dynamic Huffman compression. You can start estimating all 
characters with equal frequency, and AFTER compressing each character, 
update your frequency estimates and rebuild the whole Huffman tree. The 
decompressor can do the same, and hence recover the Huffman tree without 
needing any tables. However, rebuilding the Huffman tree takes time; 
more likely you'll update it every dozen characters or something, 
otherwise speed will suffer too much.

But in general, people usually do static compression when they use 
Huffman coding. It's quicker to build the Huffman tree once, and then 
send it once.

You can do static range coding. You calculate all your probabilities, 
write them into the header of the file, and then encode everything. 
It'll probably take a bit of space (it's not just bit-counts like 
Huffman, it's limited-precision probability values for each character), 
but you can do it.

The key advantage of range coding, though, is that it *costs nothing* to 
completely change all your probability estimates for every single 
character. So range coders are usually dynamic.

Whether Huffman or range coder, static compression of an English text 
ASCII file usually gets about 50% compression. That's good, but not 
fantastic. I just tried some random text files on my PC, and by zipping 
them I got around 80% compression. So how do we do that?

A key thing you can do is to use a higher-order model. U is a fairly 
uncommon letter. There aren't too many words with U in them. So the 
probability of seeing a U might seem fairly low. But if the previous 
letter happens to be Q... surely the probability that the next letter is 
a U ought to be *a lot* higher now!

If you calculate a fixed probability for each character, that is a 
"1st-order Markov model". If you calculate a different set of 
probabilities for each letter depending on what the previous letter was, 
that's a "2nd-order Markov model". And it can give substantially 
improved compression.

Of course, coming up with probabilities for 256 different byte values is 
one thing. Coming up with 256 tables of 256 probabilities, that's 65,536 
probabilities we need to compute... If each one takes 1 byte to store, 
that's 64KB just to store the probability tables! If you're trying to 
compress a 5KB text file, adding 64KB of overhead is going to give you 
some serious negative compression! (I.e., the "compressed" file will 
actually be *bigger* than the original.)

Suffice it to say, nobody does static 2nd-order Markov. It eats too much 
space. But dynamic? Sure. And *that* probably means you're using a range 
coder.

But why base your probabilities only on the *last* character? Why not 
the last two? Or three? Or then?

In fact, if you do this, you start hitting the problem that if you've 
never seen a certain sequence of characters before, how do you estimate 
probabilities following it?

The usual solution is "partial predictive matching" (PPM). The 
probability tables [of which there are a hell of a lot!] are sparse. If 
you haven't seen a certain 10-character sequence before, you ignore the 
oldest character and look up the corresponding 9-character sequence. If 
you haven't seen that, you try 8-character, until you finally get a 
match. In this way, for each character you can make the best possible 
probability estimate. And better probability estimates = better 
compression. (And because this is all dynamic, there's no need to store 
huge tables in the compressed file. The decompressor will run the same 
algorithm and compute the same estimates.)

Compressors such as PAQ use range coding and PPM (amount other things) 
to top the compression ratio ratings. (At the expense of speed and 
memory usage, it must be said.)



Actually, PAQ is interesting, since it uses context mixing. That is, 
half a dozen different algorithms (not just PPM) estimate the 
probabilities for the next character. It then uses a left-learning 
artificial neural network (!!) to combine these probability estimates to 
get a final estimate.

It does all of this PER BIT!

The result is an excruciatingly slow compressor that gives astonishing 
compression ratios.

Post a reply to this message

From: Doctor John
Subject: Re: A monologue involving binary log
Date: 14 Feb 2015 16:10:21
Message: <54dfb9bd$1@news.povray.org>

On 14/02/15 15:33, Orchid Win7 v1 wrote:
> OK, so it's been a while since I posted a big wall of text. So here goes...
> 

Aha! The real Andrew is back.

An interesting read - have you ever thought of taking up teaching?

John
-- 
Protect the Earth
It was not given to you by your parents
You hold it in trust for your children

Post a reply to this message

From: Orchid Win7 v1
Subject: Re: A monologue involving binary log
Date: 14 Feb 2015 18:29:42
Message: <54dfda66$1@news.povray.org>

On 14/02/2015 09:10 PM, Doctor John wrote:
> Aha! The real Andrew is back.

Will the real Mathematical Orchid please stand up...

> An interesting read - have you ever thought of taking up teaching?

Erm... I've thought about it. But I've yet to see a class of people who 
actually *want* to learn stuff.

* Nobody at school wanted to learn a damn thing.
* The people at college were only there so they didn't have to get an 
actual job. None of them seemed that bothered if they get their 
qualification, never mind what grades they get.
* At university, everybody wanted the best grades, but nobody cared 
about the actual subject. They just wanted a nice piece of paper.
* I went on a Microsoft technology course. Everybody wanted the bit of 
paper, nobody really cared about the actual course content. Most of them 
were only there because their employer paid for it.

In short, I haven't come across a lesson where people are there because 
they actually *want* to learn something. Most of them are just seem to 
want the bit of paper that makes employers think they're awesome.

The fun part is when the *teacher* doesn't particularly want to teach. 
I'm sure I've told you about Mr Apathy, who I met at uni. I prefaced a 
lecture on binary by saying "of course you don't *really* need to know 
any of this. It's pointless really. I mean, the Java virtual machine 
handles all of this for you. But it's in the exam, so I'm supposed to 
teach it to you". Damn, I'm sold! Sign me up!!

Mind you, this was the same guy who spent ten minutes disinterestedly 
explaining how "pointless" the rebuilding of Colossus is. "I mean, it 
won't be the *actual* machine that they used. It's just a replica. And 
they don't make half the parts any more, so they'll have to change the 
parts and redesign the circuitry to compensate, and at the end of all 
that, it'll be so different from the original that there's just no point."

Not my favourite lecturer. :-P

Post a reply to this message

From: Doctor John
Subject: Re: A monologue involving binary log
Date: 14 Feb 2015 19:50:25
Message: <54dfed51$1@news.povray.org>

On 14/02/15 23:29, Orchid Win7 v1 wrote:
> On 14/02/2015 09:10 PM, Doctor John wrote:
>> Aha! The real Andrew is back.
> 
> Will the real Mathematical Orchid please stand up...
> 
: -)

> 
> Erm... I've thought about it. But I've yet to see a class of people who
> actually *want* to learn stuff.

<snip>


> In short, I haven't come across a lesson where people are there because
> they actually *want* to learn something. Most of them are just seem to
> want the bit of paper that makes employers think they're awesome.
> 

I was thinking about you going for a doctorate; you are as capable as
the Kat and I. All you need is an original project.

<snip>

> 
> Mind you, this was the same guy who spent ten minutes disinterestedly
> explaining how "pointless" the rebuilding of Colossus is. "I mean, it
> won't be the *actual* machine that they used. It's just a replica. And
> they don't make half the parts any more, so they'll have to change the
> parts and redesign the circuitry to compensate, and at the end of all
> that, it'll be so different from the original that there's just no point."
> 
> Not my favourite lecturer. :-P

Yeah. Come the revolution, he'll be the first against the wall. [I am
not joking]

John
-- 
Protect the Earth
It was not given to you by your parents
You hold it in trust for your children

Post a reply to this message

From: Thomas de Groot
Subject: Re: A monologue involving binary log
Date: 15 Feb 2015 03:34:53
Message: <54e05a2d$1@news.povray.org>

On 15-2-2015 0:29, Orchid Win7 v1 wrote:
> * Nobody at school wanted to learn a damn thing.
> * The people at college were only there so they didn't have to get an
> actual job. None of them seemed that bothered if they get their
> qualification, never mind what grades they get.
> * At university, everybody wanted the best grades, but nobody cared
> about the actual subject. They just wanted a nice piece of paper.
> * I went on a Microsoft technology course. Everybody wanted the bit of
> paper, nobody really cared about the actual course content. Most of them
> were only there because their employer paid for it.
>
> In short, I haven't come across a lesson where people are there because
> they actually *want* to learn something. Most of them are just seem to
> want the bit of paper that makes employers think they're awesome.
>


You are too pessimistic I believe (or the world has changed too much). 
Except for the last point (the Microsoft course) with which I agree, at 
all other levels I have been with people who genuinely wanted to /know/ 
and to /learn/. Not all of them of course but a substantial part, yes.

And I agree with John: you would be an excellent teacher.

-- 
Thomas

Post a reply to this message

From: Orchid Win7 v1
Subject: Re: A monologue involving binary log
Date: 15 Feb 2015 05:33:29
Message: <54e075f9$1@news.povray.org>

>> In short, I haven't come across a lesson where people are there because
>> they actually *want* to learn something. Most of them are just seem to
>> want the bit of paper that makes employers think they're awesome.
>
> I was thinking about you going for a doctorate; you are as capable as
> the Kat and I. All you need is an original project.

My mum wanted me to do a PhD. I have no idea why. As best as I can tell 
from reading Stack Overflow, if you have a PhD then you are basically 
over-qualified for 99% of all employment opportunities ever, and you'll 
never be able to work outside academia ever again.

>> Not my favourite lecturer. :-P
>
> Yeah. Come the revolution, he'll be the first against the wall. [I am
> not joking]

Dude, if you hate your job so much, WHY ARE YOU STILL DOING IT?!?

Still, at least he could speak English. Many of my lecturers couldn't 
(or at least not very well). Seriously, if your JOB is to teach people 
who only speak English, not being able to speak English is surely an 
instant disqualification. No?!?

Post a reply to this message

From: Orchid Win7 v1
Subject: Re: A monologue involving binary log
Date: 15 Feb 2015 05:44:16
Message: <54e07880@news.povray.org>

On 15/02/2015 08:34 AM, Thomas de Groot wrote:
> You are too pessimistic I believe (or the world has changed too much).
> Except for the last point (the Microsoft course) with which I agree, at
> all other levels I have been with people who genuinely wanted to /know/
> and to /learn/. Not all of them of course but a substantial part, yes.

I haven't met them. Everybody I've met regards having technical 
knowledge as only something "nerds" and "losers" have - including the 
people I work with, who are all professional computer programmers. They 
all know how to write C#, yet they seem to think that understanding 
relational algebra or knowing how floating-point arithmetic works is 
only for lamers.

> And I agree with John: you would be an excellent teacher.

Again, I don't know. I'm good at writing long monologues that nobody 
will ever read; interacting with a room full of live humans is quite 
different. It requires a different skill set.

My Dad tells me [so I don't know how true this is] that Bach was a 
virtuoso organ player, but a HORRIBLE teacher. He just couldn't 
understand why nobody else could play like him...

Post a reply to this message

From: Stephen
Subject: Re: A monologue involving binary log
Date: 15 Feb 2015 06:05:46
Message: <54e07d8a$1@news.povray.org>

On 15/02/2015 10:33, Orchid Win7 v1 wrote:
>> I was thinking about you going for a doctorate; you are as capable as
>> the Kat and I. All you need is an original project.
>
> My mum wanted me to do a PhD. I have no idea why. As best as I can tell
> from reading Stack Overflow, if you have a PhD then you are basically
> over-qualified for 99% of all employment opportunities ever, and you'll
> never be able to work outside academia ever again.
>

Maybe Mother knows best. :-)

Going from your mathematical ramblings. I think that you would be 
eminently suitable to do a PhD at a good university. Not one of those 
who were ennobled, just to take the overflow from schools. When 
politicians decided it would keep them off the dole for another three years.

>>> Not my favourite lecturer. :-P
>>
>> Yeah. Come the revolution, he'll be the first against the wall. [I am
>> not joking]
>
> Dude, if you hate your job so much, WHY ARE YOU STILL DOING IT?!?
>

Yeah! Go on and tell him. I'm all ears. :-P

And while you are at it. Which revolution or counter revolution, would 
that be? ;-)

> Still, at least he could speak English. Many of my lecturers couldn't
> (or at least not very well). Seriously, if your JOB is to teach people
> who only speak English, not being able to speak English is surely an
> instant disqualification. No?!?

It would be at a a school. Universities are different, they employ 
lecturers for what they know and publish. Their world and ours touch but 
occasionally.

-- 

Regards
     Stephen

Post a reply to this message

From: Stephen
Subject: Re: A monologue involving binary log
Date: 15 Feb 2015 06:16:10
Message: <54e07ffa$1@news.povray.org>

On 15/02/2015 10:44, Orchid Win7 v1 wrote:
> On 15/02/2015 08:34 AM, Thomas de Groot wrote:
>> You are too pessimistic I believe (or the world has changed too much).
>> Except for the last point (the Microsoft course) with which I agree, at
>> all other levels I have been with people who genuinely wanted to /know/
>> and to /learn/. Not all of them of course but a substantial part, yes.
>
> I haven't met them. Everybody I've met regards having technical
> knowledge as only something "nerds" and "losers" have - including the
> people I work with, who are all professional computer programmers. They
> all know how to write C#, yet they seem to think that understanding
> relational algebra or knowing how floating-point arithmetic works is
> only for lamers.
>

That is because you have academic interests. Most professional computer 
programmers, I work with. Do it because it is a job. Not for love.


>> And I agree with John: you would be an excellent teacher.
>
> Again, I don't know. I'm good at writing long monologues that nobody
> will ever read; interacting with a room full of live humans is quite
> different. It requires a different skill set.
>

Unless you can use your competitive dancing skills to be in front of an 
audience.

> My Dad tells me [so I don't know how true this is] that Bach was a
> virtuoso organ player, but a HORRIBLE teacher. He just couldn't
> understand why nobody else could play like him...

And his throw away scales became the Brandenburg Concertos.

-- 

Regards
     Stephen

Post a reply to this message

From: Orchid Win7 v1
Subject: Re: A monologue involving binary log
Date: 15 Feb 2015 06:16:55
Message: <54e08027$1@news.povray.org>

>> My mum wanted me to do a PhD. I have no idea why. As best as I can tell
>> from reading Stack Overflow, if you have a PhD then you are basically
>> over-qualified for 99% of all employment opportunities ever, and you'll
>> never be able to work outside academia ever again.
>
> Maybe Mother knows best. :-)

I doubt it. :-P

As I say, it seems to just preclude you ever getting employed, ever again.

> Going from your mathematical ramblings. I think that you would be
> eminently suitable to do a PhD at a good university. Not one of those
> who were ennobled, just to take the overflow from schools. When
> politicians decided it would keep them off the dole for another three
> years.

Fun fact: They stopped giving out university grants just a year or two 
before I started university. I AM STILL PAYING FOR MY TUITION! More than 
a decade later... *sigh*

>> Still, at least he could speak English. Many of my lecturers couldn't
>> (or at least not very well). Seriously, if your JOB is to teach people
>> who only speak English, not being able to speak English is surely an
>> instant disqualification. No?!?
>
> It would be at a a school. Universities are different, they employ
> lecturers for what they know and publish. Their world and ours touch but
> occasionally.

Favourite anecdote: So we're in a Network Systems lecture, and the 
lecturer writes the Shannon channel coding theorem on the board. [That's 
the same Shannon who invented Shannon codes, so we're drifting back 
on-topic now...] The formula states that the maximum data-rate over a 
noisy channel is given by

   C = B log2 (1 + S/N)

And everybody is like "ZOMG! WTF is a logarithm?!" And the lecturer is 
like "oh, really? I'm sorry, I just assumed that THIRD-YEAR 
UNDERGRADUATES would already know that".

So it turns out that only two people in the room actually know what a 
logarithm is. Obviously one of them is me. And the other one... is the 
crazy beer-drinking rugby thug from Manchester. o_O And I'm serious; he 
actually knows what he's talking about. First and last time I ever heard 
him say something sensible...

Post a reply to this message

Goto Latest 10 Messages

Next 10 Messages >>>