开发者

Related video algorithm MySql using genre

开发者 https://www.devze.com 2023-02-28 16:52 出处:网络
I\'ve looked around for info on an efficient \'related videos\' algorithm but i\'m struggling to get well ordered, accurate results

I've looked around for info on an efficient 'related videos' algorithm but i'm struggling to get well ordered, accurate results

I get given the 'genre' as a pipe-delimited string. eg: |Action|Sci-Fi|Thriller|

$genre = explode("|", $row['genre']);

if (count($genre) == 3) {
    $sql = "SELECT title FROM `movie` WHERE genre LIKE '%$genre[1]%' LIMIT 0,8";
} else {
    $sql = "SELECT title FROM `movie` WHERE (genre LIKE '%$genre[1]%' AND genre LIKE '%$genre[2]%') UNION SELECT title FROM `movie` WHERE (genre LIKE '%$genre[1]%' 开发者_StackOverflowOR genre LIKE '%$genre[2]%') LIMIT 0,10";
}
$related = mysql_query($sql);

Then I basically explode it and do a manual, inefficient search for genre matches depending on genre count. The results are poor and returns anything that is semi related.

This code makes me want to gag! It works but I hate it and I know its uber lame. Any tips to improve the SQL and getting richer results?


Move the mappings of genres to movies into a new table movie_genres with columns movie and genre.

This allows you to do this:

$genres = explode('|', trim($row['genre'], '|'));
$sql = "SELECT `movie`, COUNT(*) AS hits
        FROM `movie_genres`
        WHERE `genre` IN ('" . join("', '", $genres) . "')
        GROUP BY `movie`
        ORDER BY `hits` DESC
        LIMIT 8";

You have to make sure to prevent SQL injection, though.

The extra table is also a good idea, because your database schema is not normalized. Especially Chris Date's fourth condition of the first normal form is violated:

Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else).

0

精彩评论

暂无评论...
验证码 换一张
取 消