|
|
|
|
|
|
| |
| |
|
|
|
|
| |
| |
|
|
Orchid XP v7 <voi### [at] devnull> wrote:
> Well, I now have the full algorithm working. :-P
> Or at least, it seems to work for messages shorter than 52 bytes anyway.
Then it's not a full working algorithm, is it?-)
Testing that the algorithm works for all possible inputs would be a rather
laborious job. Just testing that it can produce any MD5 value would be a
rather impossible job, taking into account how many there are.
I wonder how they make sure that an implementation is correct...
--
- Warp
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
Warp wrote:
> Orchid XP v7 <voi### [at] devnull> wrote:
>> Well, I now have the full algorithm working. :-P
>
>> Or at least, it seems to work for messages shorter than 52 bytes anyway.
>
> Then it's not a full working algorithm, is it?-)
Well, it's damn near. :-P
> Testing that the algorithm works for all possible inputs would be a rather
> laborious job. Just testing that it can produce any MD5 value would be a
> rather impossible job, taking into account how many there are.
>
> I wonder how they make sure that an implementation is correct...
Test all code paths, I assume.
For example, I am pretty damn confident that the compression function in
my implementation is correct. I've tested it with dozens of different
input blocks, and it's always produced the right output. There's no
particular reason why it shouldn't for other input blocks. It processes
them all the same way...
Now, the padding algorithm, *that* changes for every message size. For
some sizes it seems to work, but not for all of them. I need to test it
with a large array of sizes to check it's working. (Or rather, to figure
out why currently it isn't.)
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
Warp wrote:
> Orchid XP v7 <voi### [at] devnull> wrote:
>> Well, I now have the full algorithm working. :-P
>
>> Or at least, it seems to work for messages shorter than 52 bytes anyway.
>
> Then it's not a full working algorithm, is it?-)
Ah-HAH! One more bug, annihilated. >:-)
In the midst of a complex padding algorithm, I forgot to add on the
running count of the number of bytes processed. The "size" figure
appended onto the end of the bitstream was always the size of the last
block, not the whole message. (IOW, message size mod 512.) Oops!
The program now appears to give correct results for every case I've
tried. (I.e., not very many. However, I threw a 2 MB file at it, and it
gave me the right answer. This indicates that it's not *completely*
broken...)
And yes, it's just a tad slower than C. ( = exaggeration.)
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
Warp wrote:
> I wonder how they make sure that an implementation is correct...
Generally, code coverage and intermediate results, along with a set of
test vectors designed to cover all paths through and often showing the
intermediate results.
It's somewhat easier when your specification is "here's the C code to do
it" rather than an ambiguous description of what you're supposed to do.
Especially since there isn't any "real world" right or wrong answer,
other than "match what the C code outputs."
Plus, of course, it's designed to go wildly arwy with the slightest
peturbation of the process, so even tiny bugs tend to produce completely
different results. Which is its own debugging problem.
--
Darren New / San Diego, CA, USA (PST)
Remember the good old days, when we
used to complain about cryptography
being export-restricted?
Post a reply to this message
|
|
| |
| |
|
|
From: Orchid XP v7
Subject: Re: The nebulous question of probability
Date: 18 Nov 2007 04:38:56
Message: <47400830@news.povray.org>
|
|
|
| |
| |
|
|
Darren New wrote:
> Plus, of course, it's designed to go wildly arwy with the slightest
> peturbation of the process, so even tiny bugs tend to produce completely
> different results. Which is its own debugging problem.
I got 2 bits wrong in one of the magic numbers. Totally messed up the
output... until I found and fixed the bug.
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
Invisible wrote:
> However, one might also conjecture that the probability of any two
> arbitrary messages having the same [MD5] hash code would be 2^(-128).
Random fact:
2^(-128) ~= 2.93 * 10^(-39)
Winning the National Lottery (assuming you only bought 1 ticket) has a
probability of roughly 7.15 * 10^(-8).
If my arithmetic is right, that means that winning the National Lottery
jackpot 5 times back-to-back is still more than 637 times *more* likely
than something with a probability of 2^(-128) happening.
Assuming the probability of a random file mutation leaving the MD5 hash
unchanged really *is* 2^(-128), it's a pretty unlikely event. You should
probably worry more about the building being burned to the ground...
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
>> However, one might also conjecture that the probability of any two
>> arbitrary messages having the same [MD5] hash code would be 2^(-128).
>
> Random fact:
>
> 2^(-128) ~= 2.93 * 10^(-39)
>
> Winning the National Lottery (assuming you only bought 1 ticket) has a
> probability of roughly 7.15 * 10^(-8).
>
> If my arithmetic is right, that means that winning the National Lottery
> jackpot 5 times back-to-back is still more than 637 times *more* likely
> than something with a probability of 2^(-128) happening.
>
> Assuming the probability of a random file mutation leaving the MD5 hash
> unchanged really *is* 2^(-128), it's a pretty unlikely event. You should
> probably worry more about the building being burned to the ground...
If you are only worried about one file per week and you don't backup off
site, then sure...
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
scott wrote:
>>> However, one might also conjecture that the probability of any two
>>> arbitrary messages having the same [MD5] hash code would be 2^(-128).
>>
>> Random fact:
>>
>> 2^(-128) ~= 2.93 * 10^(-39)
>>
>> Winning the National Lottery (assuming you only bought 1 ticket) has a
>> probability of roughly 7.15 * 10^(-8).
>>
>> If my arithmetic is right, that means that winning the National
>> Lottery jackpot 5 times back-to-back is still more than 637 times
>> *more* likely than something with a probability of 2^(-128) happening.
>>
>> Assuming the probability of a random file mutation leaving the MD5
>> hash unchanged really *is* 2^(-128), it's a pretty unlikely event. You
>> should probably worry more about the building being burned to the
>> ground...
>
> If you are only worried about one file per week and you don't backup off
> site, then sure...
1. We backup off-site, but - absurdly enough, when you think about it -
we don't *archive* anything off-site. So when a project is finished, it
gets put into a pair of CDs, deleted from the servers, and stored in our
own building. So actually, our "current" projects are protected better
than the finished ones... Hmm, actually, we should probably fix that! ._.
2. Let's look at the rules of probability:
Pr(A and B) = Pr(A) * Pr(A)
Pr(A or B) = Pr(A) + Pr(B)
[Subject to various assumptions.] In particular, we have
Pr(A and B) < Pr(A)
Pr(A or B) > Pr(A)
So ANDs decrease the probability, and ORs increase it. Thus, the
probability of winning the Lottery today AND next week AND the week
after AND... becomes quite small. However, the probability of file A
being corrupt OR file B being corrupt OR... increases as we add more
files. BUT NOT AS FAST!
If the probability of a single file being corrupted but not detected by
MD5 is 2.93 * 10^(-39) then the probability of any single file out of
1,000 files being so-corrupted is... 2.93 * 10^(-36). Which is still
pretty tiny. (It's about 2x more likely than 5 wins on the Lottery -
which is astronomically unlikely in the first place.)
With 1 million files per month, we get 2.96 * 10^(-33) as the
probability of one of them getting corrupted but us not noticing. That's
nearer to 4 consecutive Lottery wins... I don't think I'll worry too
much. ;-)
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
On 16 Nov 2007 17:08:14 -0500, Warp wrote:
> Nicolas Alvarez <nic### [at] gmailisthebestcom> wrote:
>> error("Not implemented yet");
>
> error("I'll implement this when I have the time.")
error("This feature is only available in the registered version.");
--
Joel Yliluoma - http://iki.fi/bisqwit/
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |
| |
|
|
Joel Yliluoma wrote:
>>> error("Not implemented yet");
>> error("I'll implement this when I have the time.")
> error("This feature is only available in the registered version.");
Joel wins.
Post a reply to this message
|
|
| |
| |
|
|
|
|
| |