Automatic music rating based on listening habits

I've created a Winamp-like music player in Delphi. Not so complex, of course. Just a simple one.

But now I would like to add a more complex feature: Songs in the library should be automatically rated based on the user's listening habits.

This means: The application should "understand" if the user likes a song or not. And not only whether he/she likes it but also how much.

My approach so far (data which could be used):

Simply measure how often a song was played per time. Start counting time when the song was added to the library so that recent songs don't have any disadvantage.
Measure how long a song was played on average (minutes).
Starting a song but immediately switching to another one should have a negative influence on the ranking, since the user apparently didn't like the song.
...

Could you please help me with this problem? I would just like to have some ideas. I don't need the implementation in Delphi.

I would track all of your users' listening habits in a central database, so you can also make recommendations based on what other people like ("people who liked this song also liked these other songs").
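A minimal sketch of the "also liked" idea, assuming the central database can hand you, per user, the set of songs that user likes. The function name `also_liked` and the sample data are made up for illustration; it only counts co-occurrences, which is far simpler than what real recommendation systems do.

```python
from collections import Counter

def also_liked(likes_by_user, song, top_n=3):
    """Return the songs most often liked by users who also like `song`."""
    counts = Counter()
    for liked in likes_by_user.values():
        if song in liked:
            # Count every other song this user likes.
            counts.update(s for s in liked if s != song)
    return [s for s, _ in counts.most_common(top_n)]

# Hypothetical data: user id -> set of liked song titles.
likes_by_user = {
    "alice": {"Song A", "Song B", "Song C"},
    "bob": {"Song A", "Song B"},
    "carol": {"Song B", "Song D"},
}
print(also_liked(likes_by_user, "Song A"))
```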

some other metrics to consider:

proportion of times that the song was immediately replayed (e.g. this song was immediately replayed 12% of the times it was played)
did they turn on the "repeat this song" button during play?
times played per hour, day, week, month
proportion of times this song was skipped (e.g. this song was played, but immediately skipped 99% of the time)
proportion of song listened to (e.g. the user listened to 50% of this song on average, versus 100% of some other song)
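Several of the metrics above can be derived from a simple log of playback events. The sketch below assumes the player records, for every playback, how long the song was listened to and whether it was skipped or replayed; the `Play` record and its field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Play:
    listened_s: float       # seconds actually listened
    length_s: float         # total song length in seconds
    skipped: bool = False   # user jumped to another song right away
    replayed: bool = False  # user restarted the song right after it ended

def song_metrics(plays):
    """Summarize a list of Play events for one song."""
    n = len(plays)
    if n == 0:
        return {"skip_rate": 0.0, "replay_rate": 0.0, "avg_fraction": 0.0}
    return {
        "skip_rate": sum(p.skipped for p in plays) / n,
        "replay_rate": sum(p.replayed for p in plays) / n,
        # Fraction of the song listened to, capped at 100% per playback.
        "avg_fraction": sum(min(p.listened_s / p.length_s, 1.0) for p in plays) / n,
    }

plays = [
    Play(200, 200, replayed=True),
    Play(100, 200),
    Play(5, 200, skipped=True),
    Play(200, 200),
]
print(song_metrics(plays))
```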

also:

listen in on the user's microphone. do they sing along? :D

what volume do they play the song? do they crank it up?

Put in a "recommend this song to friends" button (that emails song title to friend or something). Songs they recommend, they probably like.

You might want to do some feature extraction on the audio stream, and find similar songs. This is hard, but you can read more about it here:

"Automatic Feature Extraction for Classifying Audio Data " Link

"Understandable models Of music collections based on exhaustive feature generation with temporal statistics" http://portal.acm.org/citation.cfm?id=1150523

"Collaborative Use of Features in a Distributed System for the Organization of Music Collections" http://www.idea-group.com/Bookstore/Chapter.aspx?TitleId=24432

Measure how long a song was played on average (minutes).

I don't think this is a good metric, because a long song would gain an unfair advantage over a short song. You should use a percentage instead:

avg. time played / total song length

Please let the rating decay over time. You probably like songs more if you have heard them often during the last n days, while older songs should only get an occasional mention, since you like them but have probably heard them too often.
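One way to let the rating decay is to weight every playback by its age. The sketch below uses exponential decay with a half-life; both the decay shape and the 30-day default are assumptions for illustration, not something the answer prescribes.

```python
def decayed_score(play_ages_days, half_life_days=30.0):
    """Sum of plays, each weighted by 0.5 ** (age / half_life)."""
    return sum(0.5 ** (age / half_life_days) for age in play_ages_days)

# Three plays this week outweigh five plays from months ago.
recent = decayed_score([1, 2, 3])
old = decayed_score([100, 200, 300, 400, 500])
print(recent > old)
```

A play from today counts fully, a play from one half-life ago counts half, and so on.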

Last but not least, you could add beat detection (and maybe frequency spectrum analysis) to find similar songs, which could give you more data than the user provides just by listening to songs.

I would also group songs that share the same ID3 tag (genre, for example), since this also hints at what the user is currently into. It would help with an autoplay function too: after hearing a great Goa song, switching to Punk is strange, even if I like songs from both worlds.

Concerning your additional metrics: Shouldn't one combine metric #4 and metric #5? If a song is immediately skipped, then the proportion listened to is just 1% or so, right? – marco92w May 21 at 15:08

These should be separate. Skipping should result in a negative rating for the song that was skipped. However, if the user closes the application when a song begins, you should not count that as a negative rating, even though only a small percentage of the song was played.
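The distinction can be made explicit by recording why a playback ended. In this sketch the `EndReason` values and the size of the rating changes are hypothetical; the only point is that a closed application is ignored while a skip is penalized.

```python
from enum import Enum

class EndReason(Enum):
    FINISHED = "finished"      # song played to the end
    SKIPPED = "skipped"        # user switched to another song
    APP_CLOSED = "app_closed"  # player was shut down mid-song

def rating_delta(reason, fraction_listened):
    """Return an assumed rating change for one playback."""
    if reason is EndReason.APP_CLOSED:
        return 0.0  # says nothing about the song, so ignore it
    if reason is EndReason.SKIPPED:
        # Earlier skips are penalized harder: -1.0 at 0%, 0.0 at 100%.
        return -(1.0 - fraction_listened)
    return 1.0  # FINISHED

print(rating_delta(EndReason.SKIPPED, 0.5))
print(rating_delta(EndReason.APP_CLOSED, 0.01))
```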

(ListenPartCount * (ListenFullCount ^ 2)) + (AverageTotalListenTime * ListenPartTimeAverage)
--------------------------------------------------------------------------------------------
               ((AverageTotalListenTime - ListenPartTimeAverage) + 0.0001f)

This formula produces a nice result: a user may really like just part of a song, and that should show in the score. If the user likes the full song, that should weigh more heavily, which is why ListenFullCount is squared.

You can tweak this formula in various ways, e.g. by including the user's listening tree: for example, whether the user listens to one song and then listens to another song a few times afterwards.
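A direct transcription of the formula above into Python. The answer does not define its variables, so the interpretation in the comments (counts of partial and full listens, average listen times in seconds) is an assumption.

```python
def listen_score(listen_part_count, listen_full_count,
                 average_total_listen_time, listen_part_time_average,
                 epsilon=0.0001):
    """Score as given in the formula; epsilon avoids division by zero."""
    numerator = (listen_part_count * listen_full_count ** 2
                 + average_total_listen_time * listen_part_time_average)
    denominator = (average_total_listen_time - listen_part_time_average) + epsilon
    return numerator / denominator

# Assumed example: 2 partial listens, 3 full listens,
# average total listen time 100 s, average partial listen time 40 s.
print(listen_score(2, 3, 100.0, 40.0))
```

Note that the denominator gets very small when the two averages are close, so scores can grow very large in that case.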

Use the date the song was added to the library as a starting point.

Measure how often the song/genre/artist/album is played (fully, or in part or skipped) - this will also allow you to measure how often a song/genre/artist/album is not played.

Come up with a weighting based on these parameters. When a song, its genre, artist or album has not been played frequently, it should rank poorly. When an artist is played every day, songs from this artist should get a boost; but if one of the artist's songs is never played, that song should still rank pretty low.
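One possible weighting, sketched under the assumption that you already have play rates (for example, plays per day since the song was added) for the song and for its artist, album and genre. The multiplicative form and the boost values are made up; they are chosen so that a never-played song stays at zero no matter how popular its artist is.

```python
def combined_score(song_rate, artist_rate, album_rate=0.0, genre_rate=0.0,
                   boosts=(0.5, 0.3, 0.2)):
    """Boost a song's own play rate by how often its context is played."""
    context = (boosts[0] * artist_rate
               + boosts[1] * album_rate
               + boosts[2] * genre_rate)
    # Multiplying keeps never-played songs at 0 even for a popular artist.
    return song_rate * (1.0 + context)

print(combined_score(0.5, 1.0))  # played song by a daily-played artist
print(combined_score(0.0, 1.0))  # never-played song by the same artist
```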

Simply measure how often a song was played per time.

Often, I go to play a particular song, and then just let my iPod run until the end of an album. So this method would give an unfair advantage to songs late in an album. Something you might want to compensate for if your music player works the same way.
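One way to compensate, assuming the player can tell whether a song was picked explicitly or merely started because the previous track ended: count automatic plays with a lower weight. The 0.3 weight is an arbitrary placeholder.

```python
def weighted_play_count(plays, auto_weight=0.3):
    """plays: list of booleans, True if the user picked the song explicitly."""
    return sum(1.0 if chosen else auto_weight for chosen in plays)

# One explicit play followed by two auto-advanced plays.
print(weighted_play_count([True, False, False]))
```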

What about applying artificial intelligence to this problem?

Well! Starting from scratch, it could be really fun to use a network of clients, each with its own "intelligence", and finally collect the client results in a central "intelligence".

Each client could produce its own "user ratings" based on the user's habits (as already said: average listening time, listening count, etc.).

Then a central "intelligent" collector could merge the individual ratings into "global ratings" showing trends, suggestions and any high-level rating you need.
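As a starting point, the merge step can be as plain as averaging. The sketch below is not an "intelligent" collector, just a stand-in that shows the data flow, assuming each client reports a dictionary of song ratings; the function name `merge_ratings` is hypothetical.

```python
from collections import defaultdict

def merge_ratings(client_ratings):
    """client_ratings: list of dicts mapping song -> rating from one client."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for ratings in client_ratings:
        for song, rating in ratings.items():
            totals[song] += rating
            counts[song] += 1
    # Global rating = mean of all client ratings for that song.
    return {song: totals[song] / counts[song] for song in totals}

print(merge_ratings([{"A": 4.0, "B": 2.0}, {"A": 2.0}]))
```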

Anyway, training such a "brain" means that you have to solve the problem analytically first, but it could really be fun to build such a cloud of small interconnected brains to produce a higher-level "intelligence".

As usual, since I don't know your skills: take a look at neural networks, genetic algorithms, fuzzy logic, pattern recognition and similar topics for a deeper understanding.

You can use a simple function like:

listened_time_of_song / (length_of_song + 15s)

or

listened_time_of_song / (length_of_song * 1.1)

This means that a song stopped within the first 15 seconds would be rated poorly. The second variant may be even better, because the length of the song then has no effect on the final score when the user listens to the whole song.
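Both functions are easy to compare side by side. In this sketch times are in seconds and the function names are made up; note that both ratios are never below zero, so a "bad" rating here means a value close to 0.

```python
def score_plus_15(listened_s, length_s):
    """listened_time / (length + 15 s)"""
    return listened_s / (length_s + 15.0)

def score_times_1_1(listened_s, length_s):
    """listened_time / (length * 1.1)"""
    return listened_s / (length_s * 1.1)

# Full listens of a 1-minute and a 10-minute song.
for length in (60.0, 600.0):
    print(length, score_plus_15(length, length), score_times_1_1(length, length))
```

With the first function a fully played short song scores lower than a fully played long one; with the second, every fully played song gets the same score.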

Another option may be to use neural networks, if you are familiar with the subject.