Mike Raiford wrote:
> It uses "fingerprinting" technology to match the song with its entry in
> the database. What I found even more interesting is that if I stripped a
> track of its name, tags, and anything other than the audio, it would
> still correctly identify it, even if that song came from a
> "greatest hits" album (e.g. the same exact song could be found elsewhere
> on a different album).
>
> Obviously there must be some sort of hashing involved ... but how? We're
> talking a file that has been encoded with a lossy algorithm, and while
> the data resembles the original, it is not the original ...
http://mtg.upf.edu/files/publications/MMSP-2002-pcano.pdf
Apparently you take your sound signal, compute its frequency spectrum,
and then take various statistical measurements from that. (Whatever you
think is "perceptually significant"; the overall energy distribution,
the frequencies of the main spectral peaks, how fast the spectrum is
changing, whatever you think will work.) You eventually summarise all of
this data to yield a "fingerprint" code in such a way that things that
sound similar to the human ear are likely to yield similar fingerprint
codes. Then you just need to develop a fast search algorithm...
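To make that a bit more concrete, here is a toy sketch of the idea in Python. It is NOT what Gracenote or the linked paper actually does; the function names, the frame size and the "compare neighbouring bands" trick are all my own assumptions, picked only to show how a spectrum can be boiled down to a small code that survives volume changes and a bit of noise.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for the lower half of the bins (slow, but no dependencies)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2)
    ]

def band_energies(spectrum, bands):
    """Sum squared magnitudes into `bands` equal-width frequency bands."""
    width = len(spectrum) // bands
    return [sum(m * m for m in spectrum[b * width:(b + 1) * width]) for b in range(bands)]

def fingerprint(samples, frame_size=64, bands=8):
    """One small integer per frame: bit b is set when band b has more energy than band b+1."""
    codes = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energies = band_energies(magnitude_spectrum(frame), bands)
        code = 0
        for b in range(bands - 1):
            if energies[b] > energies[b + 1]:
                code |= 1 << b
        codes.append(code)
    return codes
```

Because each bit only records which of two neighbouring bands is louder, turning the volume down or adding a little noise should usually leave the code unchanged, which is roughly the property you want from something that has been through a lossy encoder.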
So how it works depends on *exactly* which system it uses. (WinAmp is
using MusicID from Gracenote - the people who invented CDDB, and who
therefore presumably have their hands on a huge catalogue of CD data!)
It's comparing sound signals based on their overall spectrum (how bassy
or toppy they are, possibly the positions of the main spectral peaks) and
how it evolves over time (slowly changing vs rapidly changing).
I would suggest that if you played two different tunes on the same
instrument in the same key at the same tempo, it might be rather hard
for the machine to tell them apart. But then, think about it: how well
would a human do this?
I almost want to rush out and grab WinAmp just so I can see how badly it
misclassifies material that isn't in the database... >:-D
scott wrote:
> Actually I have a phone book entry on my phone, which you can call, hold
> your phone up to some random speaker (be it in your car, on the TV, etc)
> for 10 seconds, then a few seconds later it texts you back the artist
> and song name. It's pretty neat and has worked every time I've tried it.
Heh. Well that's one way to find out what tune that is on that YouTube
video... ;-)
What does it do if you play something that isn't in the database? (I.e.,
you pick up an instrument and play something yourself.)
> Maybe it works in the frequency domain, so takes the fourier transform
> of the sample, then uses some fuzzy matching algorithm to see what it
> matches up with?
See my other reply. Statistical summaries of the frequency spectrum (not
forgetting temporal information too) plus fuzzy matching.
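For the fuzzy matching half, a minimal sketch (again my own assumption, not the method any real service is known to use): treat each fingerprint as a list of small integers, count the differing bits, and give up if nothing in the database is close enough. The names `hamming`, `best_match` and `max_distance` are made up for illustration.

```python
def hamming(a, b):
    """Number of differing bits between two equal-length fingerprint sequences."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def best_match(query, database, max_distance=4):
    """Return the closest title, or None when nothing is within max_distance bits."""
    best_title, best_dist = None, max_distance + 1
    for title, fp in database.items():
        if len(fp) != len(query):
            continue  # this toy version only compares fingerprints of the same length
        d = hamming(query, fp)
        if d < best_dist:
            best_title, best_dist = title, d
    return best_title

database = {"song_a": [21, 21, 21, 21], "song_b": [42, 42, 42, 42]}
print(best_match([21, 21, 20, 21], database))   # song_a (only one bit differs)
print(best_match([0, 127, 0, 127], database))   # None (nothing is close enough)
```

This is a plain linear scan, so it would be hopeless against millions of tracks; a real system would need some kind of index. The threshold is what makes it answer "don't know" instead of returning the nearest wrong song.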
>> (But then, so should identifying a CD by its serial number, and
>> apparently that is a "solved problem".)
>
> What do you mean? Isn't the whole idea of a *serial* number that you can
> identify which one it is?
Yes - but if you aren't the manufacturer, it's just a useless number to
you. The only reason this is usable is that somebody sat down and
somehow built a giant database containing all the serial numbers and the
matching metadata; AFAIK, the manufacturers didn't hand this data over,
some poor sod collected it.
Invisible wrote:
> Yes - but if you aren't the manufacturer, it's just a useless number to
> you. The only reason this is usable is that somebody sat down and
> somehow built a giant database containing all the serial numbers and the
> matching metadata; AFAIK, the manufacturers didn't hand this data over,
> some poor sod collected it.
And that's exactly how it happened. Except replace poor sod with
CD-Ripping community. They'd tag the resultant MP3s and send the data
to CDDB, which would then get sent out to anyone else who ripped the CD.
--
~Mike
Invisible wrote:
> I almost want to rush out and grab WinAmp just so I can see how badly it
> misclassifies material that isn't in the database... >:-D
From what I've seen on a few tracks that weren't really released music,
it would simply come back with nothing ...
--
~Mike
> What does it do if you play something that isn't in the database? (I.e.,
> you pick up an instrument and play something yourself.)
I tried whistling a song to it once, I just got a text back saying it
couldn't recognise the song :-( grrrrr
>> What do you mean? Isn't the whole idea of a *serial* number that you can
>> identify which one it is?
>
> Yes - but if you aren't the manufacturer, it's just a useless number to
> you. The only reason this is usable is that somebody sat down and somehow
> built a giant database containing all the serial numbers and the matching
> metadata; AFAIK, the manufacturers didn't hand this data over, some poor
> sod collected it.
Or rather, multiple poor sods collected it :-)
http://en.wikipedia.org/wiki/CDDB#History
Mike Raiford wrote:
>
> And that's exactly how it happened. Except replace poor sod with
> CD-Ripping community. They'd tag the resultant MP3s and send the data
> to CDDB, which would then get sent out to anyone else who ripped the CD.
>
And either someone is fucking the system up on his end or the same
serial can be used for multiple CDs. Sometimes CDDB and FreeDB give you
multiple choices, of which usually one or two are the actual album.
-Aero
> And either someone is fucking the system up on his end or the same
> serial can be used for multiple CDs. Sometimes CDDB and FreeDB give you
> multiple choices, of which usually one or two are the actual album.
They don't work from the serial number as such, but they make up some sort
of hash code based on the track lengths etc.
>> What does it do if you play something that isn't in the database?
>> (I.e., you pick up an instrument and play something yourself.)
>
> I tried whistling a song to it once, I just got a text back saying it
> couldn't recognise the song :-( grrrrr
I foresee this degenerating into something akin to the intros round from
Never Mind the Buzzcocks. ;-)
>> AFAIK, the manufacturers didn't hand this data
>> over, some poor sod collected it.
>
> Or rather, multiple poor sods collected it :-)
Heh. Sounds almost like the way Linux happened... ;-)
scott wrote:
> They don't work from the serial number as such, but they make up some
> sort of hash code based on the track lengths etc.
I don't know about that... One time, I had a CD sitting on my desk, and
I noticed that round the spindle hole there is some *tiny* writing
that's barely readable.
I typed the code number into Google. It instantly gave me the entire
track listing for the CD. (!) Just from the 200-digit code printed on
the CD's surface. I imagine CDDB reads this same code. (It's almost
certainly burned onto the data track as well as being printed on the label.)
The best thing is, the CD in question was "Best Trance Anthems EVER
volume #1,374" or something! :-D
> I don't know about that...
It works so that copies of audio CDs still give the correct result.
Actually, reading up about it, any CD that has exactly the same number of
tracks starting at the same points on the CD with the same total play time
will give the same freedb/CDDB result.
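As far as I can tell from the freedb documentation, the disc ID is built roughly like this. Treat it as a sketch from memory rather than a reference implementation: the function names are mine, and I'm assuming the track start times have already been converted to whole seconds.

```python
def digit_sum(n):
    """Sum of the decimal digits of n, e.g. 450 -> 4 + 5 + 0 = 9."""
    total = 0
    while n > 0:
        total += n % 10
        n //= 10
    return total

def cddb_disc_id(track_starts_sec, leadout_sec):
    """Pack a checksum, the total play time and the track count into one 32-bit ID.

    track_starts_sec: start of each track in whole seconds (assumed input format)
    leadout_sec:      start of the lead-out, i.e. the end of the last track
    """
    checksum = sum(digit_sum(s) for s in track_starts_sec) % 255
    total_len = leadout_sec - track_starts_sec[0]
    return (checksum << 24) | (total_len << 8) | len(track_starts_sec)

print(format(cddb_disc_id([2, 200, 450], 700), "08x"))  # 0d02ba03
print(format(cddb_disc_id([2, 110, 450], 700), "08x"))  # 0d02ba03 as well
```

Note the second disc has a different middle track but ends up with the same ID, because the checksum only looks at digit sums. That sort of collision would fit the multiple choices CDDB and FreeDB sometimes offer, as mentioned earlier in the thread.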
> One time, I had a CD sitting on my desk, and I noticed that round the
> spindle hole there is some *tiny* writing that's barely readable.
>
> I typed the code number into Google. It instantly gave me the entire track
> listing for the CD. (!) Just from the 200-digit code printed on the CD's
> surface.
Well yeh, I think that's more like a barcode: you know, the kind that most
shops have a database for, to convert barcodes to prices at the checkout. I
imagine at least some websites have CDs listed with their corresponding
serial numbers.
> The best thing is, the CD in question was "Best Trance Anthems EVER volume
> #1,374" or something! :-D
Yeh it's always cool when you're thinking "I bet nobody in the entire world
could possibly have ever even played this CD, let alone typed in all the
track names", and then wham you get the results back!