Projeto Recommender System: Movies

Esse projeto tem como finalidade explorar alguns métodos sobre sistema de recomendação. A proposta é criarmos um sistema de recomendação simples utilizando o algoritmo de Recomendação Colaborativa. O ponto de partida é o artigo Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions. Vamos implementar uma das propostas de recomendação colaborativa do artigo. Detalhes do método implementado é dado mais a frente.

Esse tutorial é dividido em:

Dataset

Para demostrar os algoritmos de recomendação vamos utilizar o dataset da MovieLens. O site possui varias versões do dataset cada qual com um número diferentes de filmes e usuários. Vamos utilizar a versão small deles que é descrita como segue:

MovieLens Latest Datasets

These datasets will change over time, and are not appropriate for reporting research results. We will keep the download links stable for automated downloads. We will not archive or make available previously released versions.

Small: 100,000 ratings and 1,300 tag applications applied to 9,000 movies by 700 users. Last updated 10/2016.

A primeira etapa é carregar a base de dados como vários DataFrames do Pandas.

Vamos carregar 4 arquivos:

  • links: possui referência do id de cada filme para o id na base do IMDb e na base do TheMovieDb. Essa informação será usada no final para exibir mais informações dos filmes recomendados utilizando as APIs disponibilizadas por estes sites.
  • movies: lista de filmes da base. Cada filme possui o título e uma lista de gêneros associada.
  • ratings: tabela de avaliação de filmes. Cada usuário avalia um filme com uma nota de 1 a 5. É armazenado também o timestamp de cada avaliação.
  • tags: termos associados a cada filme cadastrados pelos usuários.

Para este tutorial vamos utilizar somente as três primeiras tabelas.


In [13]:
# Import necessários para esta seção
import pandas as pd
idx = pd.IndexSlice

In [2]:
# Preparando o Dataset

links = pd.read_csv("../datasets/movielens/links.csv",  index_col=['movieId'])
movies = pd.read_csv("../datasets/movielens/movies.csv", sep=",", index_col=['movieId'])
ratings = pd.read_csv("../datasets/movielens/ratings.csv", index_col=['userId','movieId'])
tags = pd.read_csv("../datasets/movielens/tags.csv", index_col=['userId','movieId'])

In [3]:
ratings.head()


Out[3]:
rating timestamp
userId movieId
1 31 2.5 1260759144
1029 3.0 1260759179
1061 3.0 1260759182
1129 2.0 1260759185
1172 4.0 1260759205

Descrição do Método de Recomendação

Como dito anteriormente, vamos utilizar um método de recomendação apresentada no artigo Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions. Existem vários tipos de sistemas de recomendação, neste tutorial vamos utilizar a recomendação dita colaborativa. Esse titpo de recomendação utiliza a avaliação do usuário (ratings) para recomendar. Os usuários são comparados entre si, utilizando alguma métrica de similaridade, e a recomendação é proposta tomando como base os ratings dos usuários mais semelhantes. A recomendação é medida a partir da predição da nota que usuário daria a um determinado item (predict rating)

Duas métricas precisam ser definidas:

  • O cálculo da similaridade de usuários
  • O cálculo do predict rating

Vamos utilizar as seguintes equações propostas no artigo:

Cáculo da Similaridade

$ sim(x,y) = \frac{\sum_{ s \in S_{xy}} { (r_{x,s} - \bar{r_{x}}) (r_{y,s} - \bar{r_{y}}) } } { \sqrt{ \sum_{s \in S_{xy}}{ (r_{x,s} - \bar{r_{x}})^2 } \sum_{s \in S_{xy}}{ (r_{y,s} - \bar{r_{y}})^2 } } } $, onde:

  • $S_x$: itens avaliados pelo usuário $x$;
  • $S_y$: itens avaliados pelo usuário $y$;
  • $S_{xy}$: o conjunto de todos os itens que foram avaliados tanto por x quanto por y, em outras palavras, a interseção dos conjuntos $S_x$ e $S_y$;
  • $r_{x,s}$: rating do usuário $x$ para o item $s$;
  • $r_{y,s}$: rating do usuário $y$ para o item $s$;
  • $\bar{r_x}$: média do ratings dos filmes avaliados por $x$
  • $\bar{r_y}$: média do ratings dos filmes avaliados por $y$

O cálculo do predict rating

Para cada filme da base que o usuário não avaliou é calculado um valor de rating que o usuário daria ao filme. A proposta é calcular isto para todos os filmes da base e recomendar ao usuário os 10 filmes mais bem avaliados. Para este cálculo vamos utilizar a equação:

$ r_{c,s} = \bar{r_c} + k * \sum_{c' \in \hat{C}}{sim(c,c') \times (r_{c',s} - \bar{r_{c'}})}$, onde:

  • $c$ e $c'$: são usuários;
  • $s$: um item;
  • $k$: é um fator noramlizador dado por $k = \frac{1}{\sum_{c' \in \hat{C}}{|sim(c, c')|}}$
  • $sim(c, c'):$ a similaridade do usuário c com o usuário c' dada pela equação anterior;
  • $\hat{C}:$ o conjunto dos $N$ usuários mais similares a $c$ que avaliaram o item $s$.

as demais variáveis forma descritas na equação anterior, mudando apenas as letras utilizadas.

Métodos Auxiliares

Para facilitar a implementação das duas equações, vamos implementar uma série de métodos auxiliares que vão nos ajudar a extrair as informações da base de dados. O código a seguir implementa tais métodos. O que cada método faz está descrito no início de cada um.


In [15]:
def get_movies_by_user(id_user, rating_cut=0, list_=False):
    
    """Retorna a lista de filmes avaliados por um usuário

    Keyword arguments:
    id_user -- id do usuário
    rating_cut -- retorna só itens avaliados com rating maior que rating_cut (default: 0)
    list_ -- se True retorna somente os ids dos filmes, se False retorna os ids com o valor do rating (default: False)
    
    """

    
    return_dict = {}
    dict_ = ratings.loc[idx[id_user, :], 'rating'].T.to_dict()
    
    for d in dict_:
        if rating_cut != 0:
            if dict_[d] >= rating_cut:
                return_dict[d[1]] = dict_[d]
        else:
            return_dict[d[1]] = dict_[d]
    
    if list_:
        return list(return_dict.keys())

    return return_dict

def get_users_by_movie(id_movie, rating_cut=0, list_=False):
    
    """Retorna a lista de usuários que avaliaram determinado filme

    Keyword arguments:
    id_movie -- id do filme
    rating_cut -- retorna só usuários que avaliaram o filme com rating maior que rating_cut (default: 0)
    list_ -- se True retorna somente os ids dos usuários, se False retorna os ids com o valor do rating
    
    """
    
    return_dict = {}
    dict_ = ratings.loc[idx[:, id_movie],'rating'].T.to_dict()
    for d in dict_:
        if rating_cut != 0:
            if dict_[d] >= rating_cut:
                return_dict[d[0]] = dict_[d]
        else:
            return_dict[d[0]] = dict_[d]
        
    if list_:
        return list(return_dict.keys())
    
    return return_dict


def get_rating_by_user_movie(id_user, id_movie):
    
    """Retorna o rating que o usuário (id_user) deu para um filme (id_movie). Se não exister, retorna 0.0.

    Keyword arguments:
    id_user -- id do usuário
    id_movie -- id do filme
    
    """
    
    rating = 0.0;
    
    try:
        rating = ratings.loc[idx[id_user, id_movie], 'rating']
    except KeyError as e:
        rating = 0.0

    return rating

def get_all_users():
    
    """Retorna o id de todos os usuários.
    
    """
    
    return list(set([x[0] for x in ratings.index.values]))

def get_movie_title(id_movie):
    
    """Retorna o título de um filme.

    Keyword arguments:
    id_movie -- id do filme
    
    """
    
    info = movies.loc[idx[id_movie], :]
    return info['title']

Mesmo com esses métodos algumas operações podem ter um certo custo computacional já que serão chamadas várias vezes. Por exemplo, quando vamos calcular a similaridade de um usuário com todos da base, isso tem um certo custo. Por conta disso, algumas informações serão geradas antes e armazenadas em variáveis na memória. Tais informações serão geradas nas células a seguir. Essas variáveis só serão utilizadas nos métodos em que são utilizadas muitas vezes.


In [16]:
'''
    Neste trecho vamos armazenar em memória as informações de filmes avaliados pelos usuários. Isso evitar 
    fazermos muitos acesso a estrutura do DataFrame. 
'''

all_users = get_all_users()

movies_user_true = {}
movies_user_false = {}

for user in all_users:
    movies_user_true[user] = get_movies_by_user(user, list_=True)
    movies_user_false[user] = get_movies_by_user(user, list_=False)

Sistema De Recomendação Simples

Só para que a gente veja alguns desses métodos funcionando, vamos implementar um "Sistema de Recomendação" bem simples. A proposta é recomendar os filmes avaliados com nota 5 que foram assistidos por usuários que já assistiram os mesmos filmes com nota 4 e 5 do usuário em questão.


In [17]:
# Usuário para qual a recomendação será apresentada, chamarei de usuário A  
selected_user = 1

# Filmes com notas 4 e 5 avaliados por este usuário
my_movies = get_movies_by_user(selected_user, rating_cut=4, list_=True)


# Lista de todos os usuários que avaliaram os filmes assistidos por A
all_users = []

for movie in my_movies:
    all_users = all_users + get_users_by_movie(movie, rating_cut=5, list_=True)

# Para eliminar os usuários repetidos, transformamos a lista em um conjunto e depois convertemos novamente em uma lista
all_users = list(set(all_users))


# Neste passo, pegamos todos os filmes nota 5 avaliados pelos usuários de all_users
all_movies = []

for user in all_users:
    movies_ = get_movies_by_user(user, rating_cut=5, list_=True)
    all_movies = all_movies + movies_
    
# Removemos os filmes repetidos e aqueles já assistidos por A     
all_movies = list(set(all_movies) - set(my_movies))

# Apresentamos a lista dos filmes 
print("Foram encontrados: " + str(len(all_movies)) + " filmes")
for movie in all_movies:
    print("\t"+ get_movie_title(movie))


Foram encontrados: 1106 filmes
	Toy Story (1995)
	Hiroshima Mon Amour (1959)
	Heat (1995)
	Sabrina (1995)
	American President, The (1995)
	Fighter, The (2010)
	Casino (1995)
	Sense and Sensibility (1995)
	Get Shorty (1995)
	Leaving Las Vegas (1995)
	Persuasion (1995)
	City of Lost Children, The (Cité des enfants perdus, La) (1995)
	Twelve Monkeys (a.k.a. 12 Monkeys) (1995)
	Babe (1995)
	Carrington (1995)
	Dead Man Walking (1995)
	Clueless (1995)
	Richard III (1995)
	Seven (a.k.a. Se7en) (1995)
	Viridiana (1961)
	Usual Suspects, The (1995)
	Postman, The (Postino, Il) (1994)
	Confessional, The (Confessionnal, Le) (1995)
	Mr. Holland's Opus (1995)
	3 Women (Three Women) (1977)
	Grey Gardens (1975)
	White Balloon, The (Badkonake sefid) (1995)
	City Hall (1996)
	The Beatles: Eight Days a Week - The Touring Years (2016)
	Braveheart (1995)
	Taxi Driver (1976)
	Rumble in the Bronx (Hont faan kui) (1995)
	Anne Frank Remembered (1995)
	Chungking Express (Chung Hing sam lam) (1994)
	Flirting With Disaster (1996)
	Trip to the Moon, A (Voyage dans la lune, Le) (1902)
	Birdcage, The (1996)
	Brothers McMullen, The (1995)
	Apollo 13 (1995)
	Rob Roy (1995)
	Crumb (1994)
	Desperado (1995)
	Devil in a Blue Dress (1995)
	Jeffrey (1995)
	Living in Oblivion (1995)
	Safe (1995)
	Smoke (1995)
	Strange Days (1995)
	Unstrung Heroes (1995)
	Waterworld (1995)
	Burnt by the Sun (Utomlyonnye solntsem) (1994)
	Before Sunrise (1995)
	Clerks (1994)
	Don Juan DeMarco (1995)
	Eat Drink Man Woman (Yin shi nan nu) (1994)
	Ed Wood (1994)
	Hoop Dreams (1994)
	Heavenly Creatures (1994)
	Interview with the Vampire: The Vampire Chronicles (1994)
	Star Wars: Episode IV - A New Hope (1977)
	Little Women (1994)
	Like Water for Chocolate (Como agua para chocolate) (1992)
	Madness of King George, The (1994)
	Jetée, La (1962)
	Once Were Warriors (1994)
	Léon: The Professional (a.k.a. The Professional) (Léon) (1994)
	Pulp Fiction (1994)
	Quiz Show (1994)
	Picture Bride (Bijo photo) (1994)
	Three Colors: Red (Trois couleurs: Rouge) (1994)
	Three Colors: Blue (Trois couleurs: Bleu) (1993)
	Three Colors: White (Trzy kolory: Bialy) (1994)
	Secret of Roan Inish, The (1994)
	Shawshank Redemption, The (1994)
	Shallow Grave (1994)
	Swimming with Sharks (1995)
	In Bruges (2008)
	To Live (Huozhe) (1994)
	Star Trek: Generations (1994)
	What's Eating Gilbert Grape (1993)
	Muriel's Wedding (1994)
	Adventures of Priscilla, Queen of the Desert, The (1994)
	Bullets Over Broadway (1994)
	Clear and Present Danger (1994)
	Client, The (1994)
	Day at the Races, A (1937)
	Crooklyn (1994)
	Forrest Gump (1994)
	Four Weddings and a Funeral (1994)
	Peter Pan (1960)
	It Could Happen to You (1994)
	Wonderful, Horrible Life of Leni Riefenstahl, The (Macht der Bilder: Leni Riefenstahl, Die) (1993)
	Lion King, The (1994)
	Mask, The (1994)
	Red Rock West (1992)
	Speed (1994)
	True Lies (1994)
	Faster Pussycat! Kill! Kill! (1965)
	Crash (2004)
	Age of Innocence, The (1993)
	Barcelona (1994)
	American Hustle (2013)
	Blink (1994)
	Her (2013)
	Bronx Tale, A (1993)
	Dave (1993)
	Last Starfighter, The (1984)
	Demolition Man (1993)
	Spider-Man 2 (2004)
	Fearless (1993)
	Firm, The (1993)
	Fugitive, The (1993)
	Hudsucker Proxy, The (1994)
	Bourne Supremacy, The (2004)
	In the Line of Fire (1993)
	In the Name of the Father (1993)
	Jurassic Park (1993)
	King of the Hill (1993)
	Last Action Hero (1993)
	Much Ado About Nothing (1993)
	Perfect World, A (1993)
	Philadelphia (1993)
	Piano, The (1993)
	Ref, The (1994)
	Remains of the Day, The (1993)
	Romeo Is Bleeding (1993)
	Romper Stomper (1992)
	Schindler's List (1993)
	Searching for Bobby Fischer (1993)
	Secret Garden, The (1993)
	Shadowlands (1993)
	Short Cuts (1993)
	Sirens (1994)
	Six Degrees of Separation (1993)
	Sleepless in Seattle (1993)
	Blade Runner (1982)
	Thirty-Two Short Films About Glenn Gould (1993)
	Nightmare Before Christmas, The (1993)
	True Romance (1993)
	Welcome to the Dollhouse (1995)
	Celluloid Closet, The (1995)
	Aladdin (1992)
	Terminator 2: Judgment Day (1991)
	Dances with Wolves (1990)
	Batman (1989)
	Silence of the Lambs, The (1991)
	Snow White and the Seven Dwarfs (1937)
	Beauty and the Beast (1991)
	Pinocchio (1940)
	Pretty Woman (1990)
	Wild Bunch, The (1969)
	Fargo (1996)
	Heavy Metal (1981)
	Harold and Kumar Go to White Castle (2004)
	Aristocats, The (1970)
	Family Thing, A (1996)
	Dragonheart (1996)
	Neighbouring Sounds (O som ao redor) (2012)
	James and the Giant Peach (1996)
	Mystery Science Theater 3000: The Movie (1996)
	Shaun of the Dead (2004)
	Faces (1968)
	Mulholland Falls (1996)
	Wallace & Gromit: The Best of Aardman Animation (1996)
	Cold Comfort Farm (1995)
	Rock, The (1996)
	Cemetery Man (Dellamorte Dellamore) (1994)
	Ghost in the Shell (Kôkaku kidôtai) (1995)
	Wallace & Gromit: A Close Shave (1995)
	Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb (1964)
	Sideways (2004)
	Undertow (2004)
	Trainspotting (1996)
	Independence Day (a.k.a. ID4) (1996)
	Lone Star (1996)
	Django Unchained (2012)
	Emma (1996)
	Day the Sun Turned Cold, The (Tianguo niezi) (1994)
	Cyclo (Xich lo) (1995)
	Godfather, The (1972)
	Supercop 2 (Project S) (Chao ji ji hua) (1993)
	Twelfth Night (1996)
	Philadelphia Story, The (1940)
	Singin' in the Rain (1952)
	American in Paris, An (1951)
	Funny Face (1957)
	Breakfast at Tiffany's (1961)
	Vertigo (1958)
	Rear Window (1954)
	It Happened One Night (1934)
	Gay Divorcee, The (1934)
	North by Northwest (1959)
	Apartment, The (1960)
	Some Like It Hot (1959)
	Charade (1963)
	Casablanca (1942)
	Maltese Falcon, The (1941)
	My Fair Lady (1964)
	Sabrina (1954)
	Roman Holiday (1953)
	Meet Me in St. Louis (1944)
	Wizard of Oz, The (1939)
	Gone with the Wind (1939)
	My Favorite Year (1982)
	Sunset Blvd. (a.k.a. Sunset Boulevard) (1950)
	Citizen Kane (1941)
	2001: A Space Odyssey (1968)
	All About Eve (1950)
	Women, The (1939)
	Rebecca (1940)
	Notorious (1946)
	To Catch a Thief (1955)
	Band Wagon, The (1953)
	Love in the Afternoon (1957)
	Gigi (1958)
	Adventures of Robin Hood, The (1938)
	Laura (1944)
	Ghost and Mrs. Muir, The (1947)
	Top Hat (1935)
	To Be or Not to Be (1942)
	My Man Godfrey (1936)
	Giant (1956)
	East of Eden (1955)
	Thin Man, The (1934)
	His Girl Friday (1940)
	It's a Wonderful Life (1946)
	Mr. Smith Goes to Washington (1939)
	Bringing Up Baby (1938)
	Little Lord Fauntleroy (1936)
	39 Steps, The (1935)
	Night of the Living Dead (1968)
	African Queen, The (1951)
	Beat the Devil (1953)
	Cat on a Hot Tin Roof (1958)
	Meet John Doe (1941)
	Pompatus of Love, The (1996)
	Big Night (1996)
	Escape to Witch Mountain (1975)
	Old Yeller (1957)
	20,000 Leagues Under the Sea (1954)
	Cool Runnings (1993)
	Cinderella (1950)
	Winnie the Pooh and the Blustery Day (1968)
	Mary Poppins (1964)
	Dumbo (1941)
	Pete's Dragon (1977)
	Bedknobs and Broomsticks (1971)
	Alice in Wonderland (1951)
	Fox and the Hound, The (1981)
	Sound of Music, The (1965)
	Die Hard (1988)
	Lawnmower Man, The (1992)
	Looking for Richard (1996)
	Everyone Says I Love You (1996)
	Swingers (1996)
	Sleepers (1996)
	Am Ende eiens viel zu kurzen Tages (Death of a superhero) (2011)
	Shall We Dance (1937)
	Willy Wonka & the Chocolate Factory (1971)
	Sleeper (1973)
	Bananas (1971)
	Fish Called Wanda, A (1988)
	Monty Python's Life of Brian (1979)
	Victor/Victoria (1982)
	Candidate, The (1972)
	Bonnie and Clyde (1967)
	Dial M for Murder (1954)
	Dirty Dancing (1987)
	Reservoir Dogs (1992)
	Platoon (1986)
	Crying Game, The (1992)
	Sophie's Choice (1982)
	E.T. the Extra-Terrestrial (1982)
	Christmas Carol, A (1938)
	Top Gun (1986)
	Rebel Without a Cause (1955)
	Streetcar Named Desire, A (1951)
	People vs. Larry Flynt, The (1996)
	Perfect Candidate, A (1996)
	On Golden Pond (1981)
	Return of the Pink Panther, The (1975)
	Abyss, The (1989)
	Jean de Florette (1986)
	Manon of the Spring (Manon des sources) (1986)
	Edukators, The (Die Fetten Jahre sind vorbei) (2004)
	Monty Python and the Holy Grail (1975)
	When We Were Kings (1996)
	Wallace & Gromit: The Wrong Trousers (1993)
	Return of Martin Guerre, The (Retour de Martin Guerre, Le) (1982)
	Tin Drum, The (Blechtrommel, Die) (1979)
	Cook the Thief His Wife & Her Lover, The (1989)
	Sherlock Jr. (1924)
	Delicatessen (1991)
	Paths of Glory (1957)
	Grifters, The (1990)
	Hear My Song (1991)
	English Patient, The (1996)
	Mediterraneo (1991)
	My Left Foot (1989)
	Sex, Lies, and Videotape (1989)
	Strictly Ballroom (1992)
	Thin Blue Line, The (1988)
	Paris Is Burning (1990)
	One Flew Over the Cuckoo's Nest (1975)
	Cheech and Chong's Up in Smoke (1978)
	Andalusian Dog, An (Chien andalou, Un) (1929)
	Star Wars: Episode V - The Empire Strikes Back (1980)
	Princess Bride, The (1987)
	Raiders of the Lost Ark (Indiana Jones and the Raiders of the Lost Ark) (1981)
	Brazil (1985)
	Aliens (1986)
	Good, the Bad and the Ugly, The (Buono, il brutto, il cattivo, Il) (1966)
	Withnail & I (1987)
	12 Angry Men (1957)
	Lawrence of Arabia (1962)
	Clockwork Orange, A (1971)
	To Kill a Mockingbird (1962)
	Apocalypse Now (1979)
	Once Upon a Time in the West (C'era una volta il West) (1968)
	Star Wars: Episode VI - Return of the Jedi (1983)
	Wings of Desire (Himmel über Berlin, Der) (1987)
	Third Man, The (1949)
	Goodfellas (1990)
	Alien (1979)
	Army of Darkness (1993)
	Ran (1985)
	Killer, The (Die xue shuang xiong) (1989)
	Psycho (1960)
	Blues Brothers, The (1980)
	Godfather: Part II, The (1974)
	Full Metal Jacket (1987)
	Grand Day Out with Wallace and Gromit, A (1989)
	Henry V (1989)
	Amadeus (1984)
	Quiet Man, The (1952)
	Once Upon a Time in America (1984)
	Raging Bull (1980)
	Annie Hall (1977)
	Right Stuff, The (1983)
	Stalker (1979)
	Boot, Das (Boat, The) (1981)
	Sting, The (1973)
	Harold and Maude (1971)
	Seventh Seal, The (Sjunde inseglet, Det) (1957)
	Local Hero (1983)
	Terminator, The (1984)
	Glory (1989)
	Rosencrantz and Guildenstern Are Dead (1990)
	Manhattan (1979)
	Miller's Crossing (1990)
	Dead Poets Society (1989)
	Graduate, The (1967)
	Touch of Evil (1958)
	Femme Nikita, La (Nikita) (1990)
	Bridge on the River Kwai, The (1957)
	8 1/2 (8½) (1963)
	Chinatown (1974)
	Day the Earth Stood Still, The (1951)
	Treasure of the Sierra Madre, The (1948)
	Duck Soup (1933)
	Better Off Dead... (1985)
	Shining, The (1980)
	Stand by Me (1986)
	M (1931)
	Evil Dead II (Dead by Dawn) (1987)
	Seve (2014)
	Deer Hunter, The (1978)
	Diva (1981)
	Groundhog Day (1993)
	Great Escape, The (1963)
	Manchurian Candidate, The (1962)
	Unforgiven (1992)
	Arsenic and Old Lace (1944)
	Back to the Future (1985)
	Fried Green Tomatoes (1991)
	Patton (1970)
	Down by Law (1986)
	Highlander (1986)
	Cool Hand Luke (1967)
	Young Frankenstein (1974)
	Raise the Red Lantern (Da hong deng long gao gao gua) (1991)
	Great Dictator, The (1940)
	Fantasia (1940)
	High Noon (1952)
	Big Sleep, The (1946)
	Heathers (1989)
	Somewhere in Time (1980)
	Ben-Hur (1959)
	This Is Spinal Tap (1984)
	Koyaanisqatsi (a.k.a. Koyaanisqatsi: Life Out of Balance) (1983)
	Some Kind of Wonderful (1987)
	Indiana Jones and the Last Crusade (1989)
	Being There (1979)
	Gandhi (1982)
	Unbearable Lightness of Being, The (1988)
	Room with a View, A (1986)
	Pink Floyd: The Wall (1982)
	Killing Fields, The (1984)
	My Life as a Dog (Mitt liv som hund) (1985)
	Forbidden Planet (1956)
	Field of Dreams (1989)
	Man Who Would Be King, The (1975)
	Butch Cassidy and the Sundance Kid (1969)
	Paris, Texas (1984)
	Napoléon (1927)
	When Harry Met Sally... (1989)
	American Werewolf in London, An (1981)
	Birds, The (1963)
	Blob, The (1958)
	Bride of Frankenstein, The (Bride of Frankenstein) (1935)
	Cape Fear (1991)
	Cape Fear (1962)
	Carrie (1976)
	Nightmare on Elm Street, A (1984)
	Nosferatu (Nosferatu, eine Symphonie des Grauens) (1922)
	Omen, The (1976)
	Breaking the Waves (1996)
	Star Trek: First Contact (1996)
	Shine (1996)
	Sling Blade (1996)
	Paradise Lost: The Child Murders at Robin Hood Hills (1996)
	Ridicule (1996)
	Star Trek VI: The Undiscovered Country (1991)
	Star Trek II: The Wrath of Khan (1982)
	Star Trek III: The Search for Spock (1984)
	Star Trek IV: The Voyage Home (1986)
	Grease (1978)
	Jaws (1975)
	Jerry Maguire (1996)
	Raising Arizona (1987)
	Tin Men (1987)
	Sneakers (1992)
	Ghosts of Mississippi (1996)
	La Cérémonie (1995)
	Last of the Mohicans, The (1992)
	Hamlet (1996)
	Mother (1996)
	Walkabout (1971)
	Message to Love: The Isle of Wight Festival (1996)
	Angel Baby (1995)
	Kolya (Kolja) (1996)
	Waiting for Guffman (1996)
	Hotel de Love (1996)
	Lost Highway (1997)
	Private Parts (1997)
	Saint, The (1997)
	Grosse Pointe Blank (1997)
	Romy and Michele's High School Reunion (1997)
	Austin Powers: International Man of Mystery (1997)
	Eclisse, L' (Eclipse) (1962)
	Fifth Element, The (1997)
	Shall We Dance? (Shall We Dansu?) (1996)
	Lost World: Jurassic Park, The (1997)
	Batman & Robin (1997)
	Contempt (Mépris, Le) (1963)
	Gabbeh (1996)
	Batman (1966)
	Men in Black (a.k.a. MIB) (1997)
	Contact (1997)
	Conan the Barbarian (1982)
	Cop Land (1997)
	Conspiracy Theory (1997)
	Air Force One (1997)
	Hunt for Red October, The (1990)
	My Own Private Idaho (1991)
	L.A. Confidential (1997)
	Game, The (1997)
	Ice Storm, The (1997)
	Full Monty, The (1997)
	Mrs. Brown (a.k.a. Her Majesty, Mrs. Brown) (1997)
	House of Yes, The (1997)
	Gattaca (1997)
	Duel (1971)
	Boogie Nights (1997)
	Witness (1985)
	Starship Troopers (1997)
	Truman Show, The (1998)
	Wings of the Dove, The (1997)
	Amistad (1997)
	Apostle, The (1997)
	Butcher Boy, The (1997)
	Flubber (1997)
	Good Will Hunting (1997)
	Midnight in the Garden of Good and Evil (1997)
	Sweet Hereafter, The (1997)
	Titanic (1997)
	Jackie Brown (1997)
	Big Lebowski, The (1998)
	Wag the Dog (1997)
	Dark City (1998)
	Holy Mountain, The (Montaña sagrada, La) (1973)
	Fallen (1998)
	Wedding Singer, The (1998)
	As Good as It Gets (1997)
	Men with Guns (1997)
	Fireworks (Hana-bi) (1997)
	Big One, The (1997)
	Spanish Prisoner, The (1997)
	Zero Effect (1998)
	Nil By Mouth (1997)
	Kurt & Courtney (1998)
	Mr. Nice Guy (Yat goh ho yan) (1997)
	Taste of Cherry (Ta'm e guilass) (1997)
	Character (Karakter) (1997)
	Bulworth (1998)
	Fear and Loathing in Las Vegas (1998)
	Perfect Murder, A (1998)
	X-Files: Fight the Future, The (1998)
	Smoke Signals (1998)
	There's Something About Mary (1998)
	Plan 9 from Outer Space (1959)
	Choose Me (1984)
	Wings (1927)
	All Quiet on the Western Front (1930)
	Mutiny on the Bounty (1935)
	Life of Emile Zola, The (1937)
	You Can't Take It with You (1938)
	How Green Was My Valley (1941)
	Lost Weekend, The (1945)
	Best Years of Our Lives, The (1946)
	Gentleman's Agreement (1947)
	All the King's Men (1949)
	From Here to Eternity (1953)
	On the Waterfront (1954)
	Marty (1955)
	West Side Story (1961)
	Tom Jones (1963)
	Man for All Seasons, A (1966)
	In the Heat of the Night (1967)
	Midnight Cowboy (1969)
	Rocky (1976)
	Kramer vs. Kramer (1979)
	Ordinary People (1980)
	Chariots of Fire (1981)
	Terms of Endearment (1983)
	Out of Africa (1985)
	Last Emperor, The (1987)
	Rain Man (1988)
	Driving Miss Daisy (1989)
	Take the Money and Run (1969)
	Klute (1971)
	Repo Man (1984)
	Metropolitan (1990)
	Labyrinth (1986)
	Breakfast Club, The (1985)
	Friday the 13th (1980)
	Poltergeist (1982)
	Exorcist, The (1973)
	Lethal Weapon (1987)
	Goonies, The (1985)
	Mask of Zorro, The (1998)
	Metropolis (1927)
	Poseidon Adventure, The (1972)
	Bambi (1942)
	Seven Samurai (Shichinin no samurai) (1954)
	Dangerous Liaisons (1988)
	Dune (1984)
	Godfather: Part III, The (1990)
	Lolita (1997)
	Saving Private Ryan (1998)
	Condorman (1981)
	Flight of the Navigator (1986)
	Roger & Me (1989)
	Purple Rose of Cairo, The (1985)
	Out of the Past (1947)
	Doctor Zhivago (1965)
	Fanny and Alexander (Fanny och Alexander) (1982)
	Trip to Bountiful, The (1985)
	Tender Mercies (1983)
	Fandango (1985)
	Blue Velvet (1986)
	Jungle Book, The (1967)
	Lady and the Tramp (1955)
	Little Mermaid, The (1989)
	101 Dalmatians (One Hundred and One Dalmatians) (1961)
	One Magic Christmas (1985)
	Peter Pan (1953)
	Return from Witch Mountain (1978)
	Rocketeer, The (1991)
	Sleeping Beauty (1959)
	Song of the South (1946)
	Splash (1984)
	Steamboat Willie (1928)
	Swing Kids (1993)
	Jerk, The (1979)
	Dead Men Don't Wear Plaid (1982)
	Man with Two Brains, The (1983)
	Grand Canyon (1991)
	Outsiders, The (1983)
	Indiana Jones and the Temple of Doom (1984)
	1984 (Nineteen Eighty-Four) (1984)
	Atlantic City (1980)
	Who's Afraid of Virginia Woolf? (1966)
	Weird Science (1985)
	Charlotte's Web (1973)
	Dark Crystal, The (1982)
	Legend (1985)
	Sixteen Candles (1984)
	Pretty in Pink (1986)
	Gods Must Be Crazy, The (1980)
	Hearts of Darkness: A Filmmakers Apocalypse (1991)
	Rosemary's Baby (1968)
	NeverEnding Story, The (1984)
	Navigator: A Mediaeval Odyssey, The (1988)
	Beetlejuice (1988)
	Rope (1948)
	Strangers on a Train (1951)
	Willow (1988)
	Untouchables, The (1987)
	Shadow of a Doubt (1943)
	Lady Vanishes, The (1938)
	Wild Tales (2014)
	Visions of Light: The Art of Cinematography (1992)
	Cube (1997)
	Seven Beauties (Pasqualino Settebellezze) (1976)
	My Bodyguard (1980)
	Broadcast News (1987)
	Working Girl (1988)
	Say Anything... (1989)
	Hero (1992)
	One Crazy Summer (1986)
	Few Good Men, A (1992)
	One True Thing (1998)
	Ronin (1998)
	Sheltering Sky, The (1990)
	If.... (1968)
	Thing, The (1982)
	Player, The (1992)
	Edward Scissorhands (1990)
	Producers, The (1968)
	My Cousin Vinny (1992)
	Nashville (1975)
	Love Is the Devil (1998)
	Children of a Lesser God (1986)
	Elephant Man, The (1980)
	Happiness (1998)
	Pleasantville (1998)
	Life Is Beautiful (La Vita è bella) (1997)
	Hands on a Hard Body (1996)
	Living Out Loud (1998)
	Elizabeth (1998)
	Nights of Cabiria (Notti di Cabiria, Le) (1957)
	Big Chill, The (1983)
	Central Station (Central do Brasil) (1998)
	Waking Ned Devine (a.k.a. Waking Ned) (1998)
	Celebration, The (Festen) (1998)
	Pink Flamingos (1972)
	King Kong (1933)
	Babe: Pig in the City (1998)
	Simple Plan, A (1998)
	Rushmore (1998)
	Shakespeare in Love (1998)
	Miracle on 34th Street (1947)
	Jewel of the Nile, The (1985)
	Romancing the Stone (1984)
	Cocoon (1985)
	Karate Kid, The (1984)
	You've Got Mail (1998)
	Thin Red Line, The (1998)
	Boy Who Could Fly, The (1986)
	Fly, The (1958)
	Texas Chainsaw Massacre, The (1974)
	Name of the Rose, The (Name der Rose, Der) (1986)
	Peggy Sue Got Married (1986)
	Crocodile Dundee (1986)
	Color of Money, The (1986)
	Cowboy Bebop (1998)
	Peeping Tom (1960)
	Payback (1999)
	October Sky (1999)
	Towering Inferno, The (1974)
	Logan's Run (1976)
	Planet of the Apes (1968)
	Lock, Stock & Two Smoking Barrels (1998)
	Dead Ringers (1988)
	Village of the Damned (1960)
	Children of the Damned (1963)
	Walk on the Moon, A (1999)
	Matrix, The (1999)
	Lovers of the Arctic Circle, The (Los Amantes del Círculo Polar) (1998)
	Pushing Tin (1999)
	Election (1999)
	Mildred Pierce (1945)
	Dick Tracy (1990)
	Inglourious Basterds (2009)
	William Shakespeare's A Midsummer Night's Dream (1999)
	Endurance (1999)
	Star Wars: Episode I - The Phantom Menace (1999)
	Superman (1978)
	Superman II (1980)
	Dracula (1931)
	Frankenstein (1931)
	Rocky Horror Picture Show, The (1975)
	It Came from Hollywood (1982)
	Invasion of the Body Snatchers (1956)
	Notting Hill (1999)
	Eternity and a Day (Mia aoniotita kai mia mera) (1998)
	Limbo (1999)
	Austin Powers: The Spy Who Shagged Me (1999)
	Run Lola Run (Lola rennt) (1998)
	Lovers on the Bridge, The (Amants du Pont-Neuf, Les) (1991)
	Eyes Wide Shut (1999)
	Ghostbusters (a.k.a. Ghost Busters) (1984)
	Mystery Men (1999)
	Runaway Bride (1999)
	WALL·E (2008)
	Killing, The (1956)
	Spartacus (1960)
	Lolita (1962)
	400 Blows, The (Les quatre cents coups) (1959)
	Jules and Jim (Jules et Jim) (1961)
	Color Purple, The (1985)
	Mission, The (1986)
	Radio Days (1987)
	Frances (1982)
	Sixth Sense, The (1999)
	Thomas Crown Affair, The (1999)
	Thomas Crown Affair, The (1968)
	Heaven Can Wait (1978)
	Raven, The (1963)
	Monty Python's And Now for Something Completely Different (1971)
	Airplane! (1980)
	Big (1988)
	Oscar and Lucinda (a.k.a. Oscar & Lucinda) (1997)
	Tequila Sunrise (1988)
	Christmas Story, A (1983)
	Three Days of the Condor (3 Days of the Condor) (1975)
	On the Ropes (1999)
	Muse, The (1999)
	O Auto da Compadecida (Dog's Will, A) (2000)
	Yellow Submarine (1968)
	American Beauty (1999)
	Stop Making Sense (1984)
	Hard Day's Night, A (1964)
	Deliverance (1972)
	Excalibur (1981)
	Pajama Game, The (1957)
	Sommersby (1993)
	Tommy (1975)
	Armour of God (Long xiong hu di) (1987)
	Three Kings (1999)
	Risky Business (1983)
	Total Recall (1990)
	Body Heat (1981)
	Ferris Bueller's Day Off (1986)
	Year of Living Dangerously, The (1982)
	Children of Paradise (Les enfants du paradis) (1945)
	Drunken Master (Jui kuen) (1978)
	Conformist, The (Conformista, Il) (1970)
	Reds (1981)
	Days of Heaven (1978)
	Lady Eve, The (1941)
	Sullivan's Travels (1941)
	Man Facing Southeast (1986)
	Gilda (1946)
	South Pacific (1958)
	Dirty Dozen, The (1967)
	Help! (1965)
	Goldfinger (1964)
	Fistful of Dollars, A (Per un pugno di dollari) (1964)
	Fight Club (1999)
	Time Bandits (1981)
	Fitzcarraldo (1982)
	All That Jazz (1979)
	Crimes and Misdemeanors (1989)
	Brother, Can You Spare a Dime? (1975)
	On Any Sunday (1971)
	RoboCop (1987)
	Who Framed Roger Rabbit? (1988)
	Live and Let Die (1973)
	Being John Malkovich (1999)
	Princess Mononoke (Mononoke-hime) (1997)
	Bone Collector, The (1999)
	Insider, The (1999)
	American Movie (1999)
	Last Night (1998)
	Rosetta (1999)
	They Shoot Horses, Don't They? (1969)
	Falling Down (1993)
	General, The (1926)
	Yojimbo (1961)
	Omega Man, The (1971)
	Robin Hood (1973)
	Mister Roberts (1955)
	Little Big Man (1970)
	Face in the Crowd, A (1957)
	Trading Places (1983)
	Meatballs (1979)
	Dead Again (1991)
	Commitments, The (1991)
	Longest Day, The (1962)
	Tora! Tora! Tora! (1970)
	Women on the Verge of a Nervous Breakdown (Mujeres al borde de un ataque de nervios) (1988)
	Verdict, The (1982)
	Effect of Gamma Rays on Man-in-the-Moon Marigolds, The (1972)
	Adventures of Buckaroo Banzai Across the 8th Dimension, The (1984)
	Stand and Deliver (1988)
	Moonstruck (1987)
	All About My Mother (Todo sobre mi madre) (1999)
	Babes in Toyland (1934)
	Harvey (1950)
	Bicycle Thieves (a.k.a. The Bicycle Thief) (a.k.a. The Bicycle Thieves) (Ladri di biciclette) (1948)
	Matewan (1987)
	Kagemusha (1980)
	McCabe & Mrs. Miller (1971)
	Grapes of Wrath, The (1940)
	My Man Godfrey (1957)
	Shop Around the Corner, The (1940)
	Natural, The (1984)
	River Runs Through It, A (1992)
	Fatal Attraction (1987)
	Midnight Run (1988)
	Fisher King, The (1991)
	Places in the Heart (1984)
	Toy Story 2 (1999)
	Go West (1925)
	Grand Illusion (La grande illusion) (1937)
	Great Santini, The (1979)
	U2: Rattle and Hum (1988)
	Deuce Bigalow: Male Gigolo (1999)
	Last Picture Show, The (1971)
	7th Voyage of Sinbad, The (1958)
	Magnolia (1999)
	Boiling Point (1993)
	Carnal Knowledge (1971)
	Easy Rider (1969)
	Galaxy Quest (1999)
	Talented Mr. Ripley, The (1999)
	Hurricane, The (1999)
	Stalag 17 (1953)
	Old Boy (2003)
	Papillon (1973)
	Last Detail, The (1973)
	Five Easy Pieces (1970)
	Fast Times at Ridgemont High (1982)
	Poison (1991)
	Malcolm X (1992)
	Agnes of God (1985)
	Wayne's World (1992)
	League of Their Own, A (1992)
	Patriot Games (1992)
	Howards End (1992)
	Hard-Boiled (Lat sau san taam) (1992)
	Of Mice and Men (1992)
	Circus, The (1928)
	City Lights (1931)
	Kid, The (1921)
	Wonder Boys (2000)
	Splendor in the Grass (1961)
	Born Yesterday (1950)
	Birdy (1984)
	Never Cry Wolf (1983)
	Defending Your Life (1991)
	Breaking Away (1979)
	Bull Durham (1988)
	Dog Day Afternoon (1975)
	American Graffiti (1973)
	Asphalt Jungle, The (1950)
	Searchers, The (1956)
	Taking of Pelham One Two Three, The (1974)
	JFK (1991)
	Making a Murderer (2015)
	Night to Remember, A (1958)
	Captain Horatio Hornblower R.N. (1951)
	Crimson Pirate, The (1952)
	Thelma & Louise (1991)
	...And Justice for All (1979)
	Animal House (1978)
	Do the Right Thing (1989)
	Double Indemnity (1944)
	Good Earth, The (1937)
	Good Morning, Vietnam (1987)
	Guess Who's Coming to Dinner (1967)
	Romeo Must Die (2000)
	Modern Times (1936)
	Hud (1963)
	Hustler, The (1961)
	Inherit the Wind (1960)
	Dersu Uzala (1975)
	Close Encounters of the Third Kind (1977)
	Place in the Sun, A (1951)
	Ladyhawke (1985)
	High Fidelity (2000)
	True Grit (1969)
	Murphy's Romance (1985)
	Solaris (Solyaris) (1972)
	Network (1976)
	Odd Couple, The (1968)
	Outlaw Josey Wales, The (1976)
	Return to Me (2000)
	Arthur (1981)
	Predator (1987)
	Diner (1982)
	Cabaret (1972)
	What Ever Happened to Baby Jane? (1962)
	Death on the Staircase (Soupçons) (2004)
	Guys and Dolls (1955)
	The Hunger (1983)
	Marathon Man (1976)
	Idiots, The (Idioterne) (1998)
	Gladiator (2000)
	Lives of Others, The (Das leben der Anderen) (2006)
	Gypsy (1962)
	On the Town (1949)
	Shanghai Noon (2000)
	Carnival of Souls (1962)
	Gold Rush, The (1925)
	Moonraker (1979)
	Endless Summer, The (1966)
	Blow-Out (La grande bouffe) (1973)
	Blazing Saddles (1974)
	Eraserhead (1977)
	Baraka (1992)
	For a Few Dollars More (Per qualche dollaro in più) (1965)
	Blood Simple (1984)
	Fabulous Baker Boys, The (1989)
	Porky's (1982)
	American Pimp (1999)
	Coming Home (1978)
	Conversation, The (1974)
	Serpico (1973)
	Ace in the Hole (Big Carnival, The) (1951)
	Lonely Are the Brave (1962)
	Big Trouble in Little China (1986)
	Badlands (1973)
	Titan A.E. (2000)
	The Golden Voyage of Sinbad (1973)
	X-Men (2000)
	Love and Death (1975)
	Replacements, The (2000)
	Naked Gun: From the Files of Police Squad!, The (1988)
	Suddenly, Last Summer (1959)
	Almost Famous (2000)
	Dancer in the Dark (2000)
	Fantastic Voyage (1966)
	Bank Dick, The (1940)
	Cosmos (1980)
	Get Carter (1971)
	Requiem for a Dream (2000)
	Charlie's Angels (2000)
	Crouching Tiger, Hidden Dragon (Wo hu cang long) (2000)
	Wall Street (1987)
	O Brother, Where Art Thou? (2000)
	Traffic (2000)
	House of Games (1987)
	Beverly Hills Cop (1984)
	Innerspace (1987)
	Masters of the Universe (1987)
	Series 7: The Contenders (2001)
	Elmer Gantry (1960)
	Memento (2000)
	Spy Kids (2001)
	Bridget Jones's Diary (2001)
	Scarface (1983)
	Norma Rae (1979)
	Startup.com (2001)
	Shrek (2001)
	City Slickers (1991)
	Magnificent Seven, The (1960)
	Lumumba (2000)
	Cannonball Run, The (1981)
	Something Wild (1986)
	Cries and Whispers (Viskningar och rop) (1972)
	Garden of the Finzi-Continis, The (Giardino dei Finzi-Contini, Il) (1970)
	Misfits, The (1961)
	Sweet Smell of Success (1957)
	Last Dragon, The (1985)
	Adventures of Baron Munchausen, The (1988)
	Without a Clue (1988)
	American Ninja (1985)
	American Ninja 2: The Confrontation (1987)
	Mildred Pierce (2011)
	American Ninja 3: Blood Hunt (1989)
	Bill & Ted's Excellent Adventure (1989)
	Major League (1989)
	Sea of Love (1989)
	Zorro, the Gay Blade (1981)
	Theremin: An Electronic Odyssey (1993)
	Battle Creek Brawl (Big Brawl, The) (1980)
	O Lucky Man! (1973)
	Clerks II (2006)
	Toy Story 3 (2010)
	Endurance: Shackleton's Legendary Antarctic Expedition, The (2000)
	Phantom of the Paradise (1974)
	It's a Mad, Mad, Mad, Mad World (1963)
	Silkwood (1983)
	Iron Monkey (Siu nin Wong Fei-hung ji: Tit Ma Lau) (1993)
	Monsters, Inc. (2001)
	Harry Potter and the Sorcerer's Stone (a.k.a. Harry Potter and the Philosopher's Stone) (2001)
	Breathless (À bout de souffle) (1960)
	Flash Gordon (1980)
	Ocean's Eleven (2001)
	No Man's Land (2001)
	And Then There Were None (1945)
	Amelie (Fabuleux destin d'Amélie Poulain, Le) (2001)
	Bill & Ted's Bogus Journey (1991)
	Lord of the Rings: The Fellowship of the Ring, The (2001)
	Big Heat, The (1953)
	Enigma of Kaspar Hauser, The (a.k.a. Mystery of Kaspar Hauser, The) (Jeder für sich und Gott Gegen Alle) (1974)
	M*A*S*H (a.k.a. MASH) (1970)
	Storytelling (2001)
	Bourne Ultimatum, The (2007)
	Wild Strawberries (Smultronstället) (1957)
	Gloria (1980)
	Resident Evil (2002)
	And Your Mother Too (Y tu mamá también) (2001)
	Return of the Secaucus 7 (1980)
	Piano Teacher, The (La pianiste) (2001)
	Rashomon (Rashômon) (1950)
	Nine Queens (Nueve reinas) (2000)
	Lenny (1974)
	Spider-Man (2002)
	Last Waltz, The (1978)
	Bourne Identity, The (2002)
	Powerpuff Girls, The (2002)
	Z (1969)
	Nosferatu the Vampyre (Nosferatu: Phantom der Nacht) (1979)
	The Big Bus (1976)
	Rollerball (1975)
	White Ribbon, The (Das weiße Band) (2009)
	Son of the Bride (Hijo de la novia, El) (2001)
	Spirited Away (Sen to Chihiro no kamikakushi) (2001)
	Little Miss Sunshine (2006)
	Birdman: Or (The Unexpected Virtue of Ignorance) (2014)
	Professional, The (Le professionnel) (1981)
	Forsyte Saga, The (1967)
	Harry Potter and the Chamber of Secrets (2002)
	They All Laughed (1981)
	Talk to Her (Hable con Ella) (2002)
	Adaptation (2002)
	My Winnipeg (2007)
	Victory (a.k.a. Escape to Victory) (1981)
	Bleak House (2005)
	Catch Me If You Can (2002)
	Hours, The (2002)
	Pianist, The (2002)
	King of Comedy, The (1983)
	Confessions of a Dangerous Mind (2002)
	City of God (Cidade de Deus) (2002)
	American Friend, The (Amerikanische Freund, Der) (1977)
	Lost in La Mancha (2002)
	Willie & Phil (1980)
	Into the Wild (2007)
	Michael Clayton (2007)
	Million Dollar Baby (2004)
	Cria Cuervos (1976)
	Irreversible (Irréversible) (2002)
	Earth Entranced (Terra em Transe) (1967)
	Stevie (2002)
	Cowboy Bebop: The Movie (Cowboy Bebop: Tengoku no Tobira) (2001)
	Gigantic (A Tale of Two Johns) (2002)
	Finding Nemo (2003)
	Brokeback Mountain (2005)
	Unforgiven, The (1960)
	Samsara (2011)
	Shaolin Soccer (Siu lam juk kau) (2001)
	Pink Panther, The (1963)
	Discreet Charm of the Bourgeoisie, The (Charme discret de la bourgeoisie, Le) (1972)
	No Country for Old Men (2007)
	Hunt, The (Jagten) (2012)
	Holy Motors (2012)
	Sgt. Pepper's Lonely Hearts Club Band (1978)
	Red Balloon, The (Ballon rouge, Le) (1956)
	Kill Bill: Vol. 1 (2003)
	Unvanquished, The (Aparajito) (1957)
	Pervert's Guide to Cinema, The (2006)
	Master and Commander: The Far Side of the World (2003)
	WarGames (1983)
	Passion of Joan of Arc, The (Passion de Jeanne d'Arc, La) (1928)
	Cabinet of Dr. Caligari, The (Cabinet des Dr. Caligari., Das) (1920)
	Battle Royale (Batoru rowaiaru) (2000)
	Vivre sa vie: Film en douze tableaux (My Life to Live) (1962)
	Aguirre: The Wrath of God (Aguirre, der Zorn Gottes) (1972)
	Passage to India, A (1984)
	Diabolique (Les diaboliques) (1955)
	Side by Side (2012)
	Night at the Opera, A (1935)
	Stolen Kisses (Baisers volés) (1968)
	Captain Phillips (2013)
	Seventh Continent, The (Der siebente Kontinent) (1989)
	Juno (2007)
	Chitty Chitty Bang Bang (1968)
	Kung Fu Hustle (Gong fu) (2004)
	Good bye, Lenin! (2003)
	Persona (1966)
	Eternal Sunshine of the Spotless Mind (2004)
	Dogville (2003)
	Germany Year Zero (Germania anno zero) (Deutschland im Jahre Null) (1948)
	After Hours (1985)
	Pan's Labyrinth (Laberinto del fauno, El) (2006)
	Kill Bill: Vol. 2 (2004)
	Band of Brothers (2001)
	Departed, The (2006)
	Hands in the Air (2010)
	There Will Be Blood (2007)
	Pride and Prejudice (1995)
	Drained (O cheiro do Ralo) (2006)
	Hachiko: A Dog's Story (a.k.a. Hachi: A Dog's Tale) (2009)
	Zorba the Greek (Alexis Zorbas) (1964)
	Woodstock (1970)
	Legend, The (Legend of Fong Sai-Yuk, The) (Fong Sai Yuk) (1993)
	To the Left of the Father (Lavoura Arcaica) (2001)
	Helter Skelter (1976)
	Sammy and Rosie Get Laid (1987)
	Book Thief, The (2013)
	Dolce Vita, La (1960)

Da para perceber que esta não é uma abordagem muito boa. Como é retornado somente filmes com nota 5 não existe nenhum critério para ordená-los, fazendo com que seja exibido todos os filmes recomendados (1106). Na próxima seção, vamos apresentar uma proposta em que utilizamos a similaridade de usuários para melhorar essa recomendação.

Sistema de Recomendação Colaborativo

Uma forma melhor de recomendar é usar os ratings que os usuários deram aos filmes para classificar os usuários de acordo com uma similaridade. A recomendação é feita a partir dos usuários mais similares. Para isso, devemos calcular a distância do usuário em questão com todos os usuários da base e, por fim, a nota que o usuário daria. Como já foi dito, vamos utilizar as equações apresentadas anteriomente.


In [18]:
# Imports necessários para esta seção

import math
import numpy as np
import operator
from sklearn.externals import joblib

A primeira função que vamos implementar é a de similaridade de usuários:

$ sim(x,y) = \frac{\sum_{ s \in S_{xy}} { (r_{x,s} - \bar{r_{x}}) (r_{y,s} - \bar{r_{y}}) } } { \sqrt{ \sum_{s \in S_{xy}}{ (r_{x,s} - \bar{r_{x}})^2 } \sum_{s \in S_{xy}}{ (r_{y,s} - \bar{r_{y}})^2 } } } $, onde:

  • $S_x$: itens avaliados pelo usuário $x$;
  • $S_y$: itens avaliados pelo usuário $y$;
  • $S_{xy}$: o conjunto de todos os itens que foram avaliados tanto por x quanto por y, em outras palavras, a interseção dos conjuntos $S_x$ e $S_y$;
  • $r_{x,s}$: rating do usuário $x$ para o item $s$;
  • $r_{y,s}$: rating do usuário $y$ para o item $s$;
  • $\bar{r_x}$: média do ratings dos filmes avaliados por $x$
  • $\bar{r_y}$: média do ratings dos filmes avaliados por $y$

Para tal, precisamos definir uma função que dados dois usuários, retorne os itens avaliados por ambos.


In [19]:
def intersect_items(id_user_x, id_user_y):
    
    """Retorna duas listas de ratings. Os ratings correspondem aos itens avaliados por x e y.
        É retornada duas listas distintas já que os itens são os mesmo, mas as avaliações são distintas.
        Isso irá facilitar na hora de calcuar a similaridade dos usuários.

    Keyword arguments:
    id_user_x -- id do usuário x
    id_user_y -- id do usuário y
    
    """
    
    dict_x = movies_user_false[id_user_x]
    dict_y = movies_user_false[id_user_y]

    all_keys = set(list(dict_x.keys()) + list(dict_y.keys()))

    ratings_x = []
    ratings_y = []

    for key in all_keys:
        if key in dict_x and key in dict_y:
            ratings_x.append(dict_x[key])
            ratings_y.append(dict_y[key])
        
    ratings_x = np.array(ratings_x)
    ratings_y = np.array(ratings_y)
    
    return ratings_x, ratings_y

O passo seguinte é implementarmos a função que calcular a similaridade dos usuários x e y. Observer que estamos utilizando o numpy para realizar algumas operações. Isso evita a utilização de for para realizar algumas operações sobre os vetores.


In [20]:
def similarity(id_user_x, id_user_y):
    
    """Retorna a similaridade de dois usuários baseada nos ratings dos filmes

    Keyword arguments:
    id_user_x -- id do usuário x
    id_user_y -- id do usuário y
    
    """
    
    ratings_x, ratings_y = intersect_items(id_user_x, id_user_y)
    
    if(len(ratings_x) == 0):
        return 0.0
    
    mean_rating_x = np.mean(ratings_x)
    mean_rating_y = np.mean(ratings_y)
    
    numerador = (ratings_x - mean_rating_x)*(ratings_y - mean_rating_y)
    numerador = np.sum(numerador)
    
    den_x = np.sum(np.power(ratings_x - mean_rating_x, 2))
    den_y = np.sum(np.power(ratings_y - mean_rating_y, 2))
    
    similarity_value = numerador / np.sqrt((den_x * den_y))

    return similarity_value

Para o cálculo do predict rating é necessário pegar os Top $N$ usuários mais semelhantes ao usuário para qual desejamos prover a recomendação. Para encontrar essa lista, precisamos calcular a similaridade deste usuário com todos da base, ordenar a lista resultante pela similaridade e, por fim, retornar os TopN. Essa tarefa pode ser um tanto custosa para se fazer no processo de recomendação. Por isso, vamos pré-calcular esta similaridade e armazenar em um arquivo que será carregado quando necessário. A opção de salvar em arquivo é porque desta forma podemos utiliza-lo em outro momento quando a seção do Jupyter ou do Python for finalizada. Um outro motivo é que essa tarefa é um tanto custosa e, a depender do número de usuários, pode demorar.

O código a seguir calcular essa similaridade e persiste o objeto com todas as similaridades de todos os usuários da base no arquivo usersimilarity.pkl. Só é necessário rodar este código na primeira vez que estiver rodando esse tutorial ou, se por um acaso, a base de usuário mudar.


In [393]:
all_users = get_all_users()

map_similarity = {}

for i in all_users:
    for j in all_users:
        if(i < j):
            map_similarity[(i, j)] = similarity(i, j)

            
joblib.dump(map_similarity, "usersimilarity.pkl")


/Users/adolfoguimaraes/Desenvolvimento/ludiicos/tensorflow/tensorenv/lib/python3.6/site-packages/ipykernel/__main__.py:46: RuntimeWarning: invalid value encountered in double_scalars
Out[393]:
['usersimilarity.pkl']

Desta forma, podemos carregar a base de similaridades sempre que precisarmos. Se o arquivo já foi gerado, não é ncessário mais chamar o código anterior. Basta executar o código a seguir para carregar toda similaridade na memória.


In [21]:
user_similarity = joblib.load('usersimilarity.pkl')

Com as informações de similaridade pré-calculadas podemos calcular o predict rating.

$ r_{c,s} = \bar{r_c} + k * \sum_{c' \in \hat{C}}{sim(c,c') \times (r_{c',s} - \bar{r_{c'}})}$, onde:

  • $c$ e $c'$: são usuários;
  • $s$: um item;
  • $k$: é um fator noramlizador dado por $k = \frac{1}{\sum_{c' \in \hat{C}}{|sim(c, c')|}}$
  • $sim(c, c'):$ a similaridade do usuário c com o usuário c' dada pela equação anterior;
  • $\hat{C}:$ o conjunto dos $N$ usuários mais similares a $c$ que avaliaram o item $s$.

O cálculo mais crítico desta equação é encontrar o conjunto $\hat{C}$, a lista dos $N$ usuários mais similares a $c$ que avaliaram o item $s$ que desejamos recomendar a $c$. Essa lista é definida pela função a seguir.


In [22]:
def get_topneighbors_rateditem(id_user, id_movie, N):
    
    """Retorna os N usuários mais similares a id_user que avaliram o id_item

    Keyword arguments:
    id_user -- id do usuário
    id_item -- id do item
    N -- Número de usuários semelhantes retornados. 
    
    """
    
    all_users = get_all_users()
    similars = {}
    
    for user_ in all_users:
        items_user_ = movies_user_true[user_]

        if(id_user != user_):
            if id_movie in items_user_:
                similars[(id_user, user_)] = user_similarity[(id_user, user_)]
        
        
    sorted_ = sorted(similars.items(), key=operator.itemgetter(1), reverse=True)
    
    return sorted_[:N]

Um vez que a função foi dos TopN foi definida, podemos usá-la na implementação na função do predict rating:


In [25]:
def predict_rating(id_user, id_movie):
    
    N = 20
    
    items_user = movies_user_false[id_user]
    all_user_values = [items_user[x] for x in items_user]
    
    mean_user = np.mean(all_user_values)
    
    topN_users = get_topneighbors_rateditem(id_user, id_movie, N)
    
    sum_ = 0
    sum_k = 0
    
    for topuser in topN_users: 
        similarity = topuser[1]
        user_u = topuser[0][1]
        
        rating_user_u = get_rating_by_user_movie(user_u, id_movie)
        
        items_user_u = movies_user_false[user_u]
        
    
        all_user_u_values = [items_user_u[x] for x in items_user_u]
        
        mean_user_u = np.mean(all_user_u_values)
        
        sum_ += similarity * (rating_user_u - mean_user_u)
        
        sum_k += abs(similarity)
    
    if sum_k == 0:
        k = 0
    else:
        k = 1 / sum_k
    
    rating_final = mean_user + k * sum_
    
    return rating_final

In [26]:
print("Nota do usuário 1 para o filme 30:")
predict_rating(1, 30)


Nota do usuário 1 para o filme 30:
Out[26]:
2.9621472667004038

Com a função de predição de ratings implementada podemos realizar uma recomendação para um usuário da base. Vamos realizar a recomendação para o usuário 1. Para isso, vamos calcular o rating do usuário 1 para todos os filmes da base que ele ainda não avaliou. Feito isso, vamos retornar e exibir os top 10 filmes mais bem avaliados.

Essa tarefa demora em torno de 5 minutos


In [249]:
user = 1

all_movies = list(movies.index.values)
all_user_movies = get_movies_by_user(user, list_=True)

movies_to_predict = [x for x in all_movies if x not in all_user_movies]

predict = {}

for item in movies_to_predict:
    predict[item] = predict_rating(user, item)

Uma vez calculados os ratings, devemos ordenar e exibir os 10 mais bem avaliados.


In [250]:
sorted_items = sorted(predict.items(), key=operator.itemgetter(1), reverse=True)
sorted_items[:10]


Out[250]:
[(6772, 5.5966269841269831),
 (8387, 5.5739166351341156),
 (49013, 5.5677865612648221),
 (4453, 5.4296992481203006),
 (6298, 5.4242236024844717),
 (8811, 5.4242236024844717),
 (31290, 5.4242236024844717),
 (47815, 5.4242236024844717),
 (48591, 5.4242236024844717),
 (58146, 5.4242236024844717)]

Melhorando a forma de apresentação, temos:


In [334]:
count = 1
top10 = sorted_items[:10]
print("Filmes recomendados para o usuário %s:" % user)
for movie in top10:
    print("\t %.2d" % count, "[%.1f]" % movie[1], get_movie_title(movie[0]), )
    count += 1


Filmes recomendados para o usuário 1:
	 01 [5.6] To Be and to Have (Être et avoir) (2002)
	 02 [5.6] Police Academy: Mission to Moscow (1994)
	 03 [5.6] Santa Clause 3: The Escape Clause, The (2006)
	 04 [5.4] Michael Jordan to the Max (2000)
	 05 [5.4] Malibu's Most Wanted (2003)
	 06 [5.4] Yu-Gi-Oh! (2004)
	 07 [5.4] Beastmaster 2: Through the Portal of Time (1991)
	 08 [5.4] Crossover (2006)
	 09 [5.4] Grudge 2, The (2006)
	 10 [5.4] Witless Protection (2008)

Usando a API do IMDb

Podemos utilizar a API do IMDb para retornar algumas informações a mais sobre os filmes recomendados. O código a seguir importa a biblioteca e com as informações contidas na tabela links da Movielens acessa detalhes dos filmes na base do IMDb. Vamos utilizar a biblioteca imdbpie: https://github.com/richardasaurus/imdb-pie.


In [331]:
# Imports necessários

from imdbpie import Imdb
from IPython.display import Image, display

In [277]:
# Método que retorna o ID do IMDb dado o id do filme

def get_imdb_id(id_movie):
    
    imdbid = int(links.loc[idx[id_movie], 'imdbId'])
    
    imdbid = "tt%.7d" % imdbid
    
    return imdbid

In [289]:
imdbid = get_imdb_id(300)
print(imdbid)


tt0110932

In [ ]:
# Carregando a biblioteca do IMDb
imdb = Imdb()
imdb = Imdb(anonymize=True)

In [293]:
imdbid = get_imdb_id(1)
title = imdb.get_title_by_id(imdbid)
print(title)


<Title: 'Toy Story' - 'tt0114709'>

In [304]:
Image(title.cover_url)


Out[304]:

In [305]:
print(title.title)
print(title.year)
print(title.genres)
print(title.release_date)


Toy Story
1995
['Animation', 'Adventure', 'Comedy', 'Family', 'Fantasy']
1995-11-22

In [306]:
cast = title.cast_summary
for person in cast:
    print(person.name)


Tom Hanks
Tim Allen
Don Rickles
Jim Varney

In [324]:
reviews = imdb.get_title_reviews(imdbid, max_results=2)

In [326]:
for review in reviews:
    print("Review by %s" % review.username)
    if(review.rating is not None):
        print("Rating: %.1f" % review.rating)
    print("Review: %s" % review.text)
    print()


Review by Philip Van der Veken
Rating: 9.0
Review: I am a big fan of the animated movies coming from the Pixar Studios. They are always looking for the newest technological possibilities to use in their movies, creating movies that are more than just worth a watch, even when they were made a decade ago.

The movie is about toys that come to life when their owner is asleep or not in the same room. When the young boy's birthday is coming up, all the toys are nervous. They don't want to be ignored when the new one arrives. Woody the cowboy is their "leader" because he's the most popular one of them all. He's the only one that hasn't got to be afraid, but than a new favorite arrives ... Buzz Lightyear. He hates him and tries everything possible to get rid of him, but as the time passes by they learn to appreciate each other...

When you see Toy Story, you may think that the different human like characters (Woody the cowboy for instance) aren't always as perfect as we are used to see in todays animated movies. Perhaps that's true, but if you keep in mind that all this was done in 1995, when computers weren't yet as strong and the technology for creating such movies was almost unknown, than you can only have a lot of respect for what the creators did. I loved the story and liked the animations a lot. I give it an 8.5/10.

Review by Michael DeZubiria (wppispam2013@gmail.com)
Rating: 10.0
Review: 
Toy Story is not only the best Disney film because it has the best story and the best animation, but also because of the excellent actors chosen to provide the voices of the animals. The casting was perfect from top to bottom, and the movie provides an excellent adventure story about friendship and loyalty that keeps you engrossed until the nail-biting climax.


Tom Hanks and Tim Allen provided excellent voices for Woody and Buzz Lightyear -their performances alone are one of the biggest things that made this such a spectacular movie. Besides that, though, you have the excellent story that is not only noteworthy because it has never really been told from this perspective before, but also because it was just told so well. All of the characters in the film are very well developed and all have appropriate and effective actors chosen to provide their voices. 

And of course, who could forget the revolutionary animation! The computer animation used for this movie not only made it startlingly realistic but also opened up tons of possibilities, and thankfully the filmmakers chose to explore these possibilities. There are dozens of things that are hidden in the woodwork throughout the film, as well as in the songs – note, for example, the subtle playing of the Indiana Jones theme song in the scene where Woody knocks Buzz out the window with the desk lamp.


Toy Story is by far the best Disney film ever made, it's pretty much perfect. It's adventurous, it's exciting, it's entertaining, it's good for the whole family, it's got great characters, story, and plot, and above all, it's fun.


Basta acessar a página da API para ver todas as informações que podem ser retornadas a partir da API.

Para finalizar vamos melhorar a nossa apresentação da recomendação utilizando a API do IMDb.


In [333]:
count = 1
top10 = sorted_items[:10]
print("Filmes recomendados para o usuário %s:" % user)
for movie in top10:
    imdbid = get_imdb_id(movie[0])
    print("%.2d" % count, "[%.1f]" % movie[1], get_movie_title(movie[0]), )
    if imdb.title_exists(imdbid):
        title = imdb.get_title_by_id(imdbid)
        display(Image(title.cover_url))
    count += 1


Filmes recomendados para o usuário 1:
01 [5.6] To Be and to Have (Être et avoir) (2002)
02 [5.6] Police Academy: Mission to Moscow (1994)
03 [5.6] Santa Clause 3: The Escape Clause, The (2006)
04 [5.4] Michael Jordan to the Max (2000)
05 [5.4] Malibu's Most Wanted (2003)
06 [5.4] Yu-Gi-Oh! (2004)
07 [5.4] Beastmaster 2: Through the Portal of Time (1991)
08 [5.4] Crossover (2006)
09 [5.4] Grudge 2, The (2006)
10 [5.4] Witless Protection (2008)

Atividade do MiniProjeto

A atividade do MiniProjeto consiste em utilizar todo o conhecimento adquirido neste tutorial e propor um outro método de recomendação que de certa forma melhore ou traga algum ganho relevante ao método apresentado neste tutorial. Pesquise sobre a área e implemente tal solução.