Mike Raiford wrote:
> It uses "fingerprinting" technology to match the song with its entry on
> the database. What I found even more interesting is if I stripped a
> track of its name, tags, and anything other than the audio to identify
> it. It would correctly identify it. Even if that song came from a
> "greatest hits" album (e.g. the same exact song could be found elsewhere
> on a different album)
>
> Obviously there must be some sort of hashing involved ... but how? We're
> talking a file that has been encoded with a lossy algorithm, and while
> the data resembles the original, it is not the original ...
http://mtg.upf.edu/files/publications/MMSP-2002-pcano.pdf
Apparently you take your sound signal, compute its frequency spectrum,
and then take various statistical measurements from that. (Whatever you
think is "perceptually significant": the overall energy distribution,
the frequencies of the main spectral peaks, how fast the spectrum is
changing, whatever you think will work.) You eventually summarise all of
this data to yield a "fingerprint" code in such a way that things that
sound similar to the human ear are likely to yield similar fingerprint
codes. Then you just need to develop a fast search algorithm...
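To make that a bit more concrete, here is a toy sketch of the general idea. It is not what Gracenote or the linked paper actually does; the function names, the 64-sample frame size, the three features (spectral centroid, strongest bin, frame-to-frame change) and the quantisation steps are all my own assumptions, chosen only so that small changes to the signal tend to give the same code.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (slow, but needs no libraries)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def frame_features(spectrum, prev_spectrum):
    """Three hypothetical 'perceptual' measurements for one frame."""
    total = sum(spectrum) or 1.0
    # Energy-weighted mean bin: low means bassy, high means toppy.
    centroid = sum(k * m for k, m in enumerate(spectrum)) / total
    # Bin index of the strongest spectral peak.
    peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
    # How much the spectrum changed since the previous frame.
    flux = sum(abs(a - b) for a, b in zip(spectrum, prev_spectrum)) / total
    return centroid, peak_bin, flux

def fingerprint(samples, frame_size=64):
    """Return one coarse (centroid, peak, flux) code per non-overlapping frame."""
    codes = []
    prev = None
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        spec = magnitude_spectrum(samples[start:start + frame_size])
        if prev is None:
            prev = spec  # first frame: no change to measure yet
        centroid, peak_bin, flux = frame_features(spec, prev)
        # Coarse quantisation (assumed step sizes) so small differences vanish.
        codes.append((round(centroid / 4), peak_bin // 4, int(flux * 4)))
        prev = spec
    return codes
```

The point of the coarse rounding is that a slightly quieter copy with a little extra junk in a neighbouring bin (a crude stand-in for lossy encoding) should still land on the same codes, while a tone at a clearly different pitch should not.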
So how it works depends on *exactly* which system it uses. (WinAmp is
using MusicID from Gracenote - the people who invented CDDB, and who
therefore presumably have their hands on a huge catalogue of CD data!)
It's comparing sound signals based on their overall spectrum (how bassy
or toppy they are, possibly the pitches of the main signal spikes) and
how it evolves over time (slowly changing vs rapidly changing).
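For what the comparison step might look like, here is a minimal sketch, assuming each fingerprint is a list of small integer tuples, one per frame. The names `distance` and `best_match`, the mean-absolute-difference measure and the `max_distance` cutoff are all hypothetical, and this is a plain linear scan rather than the fast search a real service would need.

```python
def distance(fp_a, fp_b):
    """Mean per-frame absolute difference over the frames both fingerprints share."""
    pairs = list(zip(fp_a, fp_b))
    if not pairs:
        return float("inf")
    total = sum(abs(x - y) for fa, fb in pairs for x, y in zip(fa, fb))
    return total / len(pairs)

def best_match(query, database, max_distance=1.0):
    """Brute-force nearest neighbour; None when nothing is close enough."""
    if not database:
        return None
    title, dist = min(
        ((name, distance(query, fp)) for name, fp in database.items()),
        key=lambda pair: pair[1],
    )
    # The cutoff (an assumed value) is what lets the matcher say "not in the database".
    return title if dist <= max_distance else None

# Made-up fingerprints purely for illustration.
database = {
    "tune A": [(2, 2, 0), (2, 2, 0)],
    "tune B": [(5, 5, 0), (5, 5, 1)],
}
near_a = best_match([(2, 2, 0), (2, 3, 0)], database)
unknown = best_match([(9, 9, 3), (9, 9, 3)], database)
```

Without some cutoff like that, the nearest entry always wins, however far away it is.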
I would suggest that if you played two different tunes on the same
instrument in the same key at the same tempo, it might be rather hard
for the machine to tell them apart. But then, think about it: how well
would a human do this?
I almost want to rush out and grab WinAmp just so I can see how badly it
misclassifies material that isn't in the database... >:-D