Math still not the answer

I wrote a quick (but not very elegant) python script to retrieve and store locally enough data for pattern recognition purposes. The main goal is to help me decide how much I will enjoy a movie before watching it. I included the script at the end of the post, in case you want to try it yourself (and maybe improve it too!). It takes a while to complete, although it is quite entertaining to see its progress on screen. At the end, it provides two lists of the same length: critics, a list of str containing the names of the critics; and scoredMovies, a list of dict containing, at index k, the scores given to movies by the critic at index k in the previous list.

For example:

>>> critics[43]

'James White'
>>> scoredMovies[43]

{'hall-pass': 60, 'the-karate-kid': 60, 'the-losers': 60,
'the-avengers-2012': 80, 'the-other-guys': 60, 'shrek-forever-after': 80,
'the-lincoln-lawyer': 80, 'the-company-men': 60, 'jonah-hex': 40,
'arthur': 60, 'vampires-suck': 20, 'american-reunion': 40,
'footloose': 60, 'real-steel': 60}

The number of films scored varies by critic: some individuals gave their opinion on a few dozen movies, while others took the trouble to evaluate up to four thousand flicks! Note also that the names of the movies are the slugs of their corresponding web pages, where you can see what critics have to say about, for example, the “Karate Kid” and other relevant information. This also comes in very handy when there are several versions of a single title: which “Karate Kid” does this score refer to, the one from the eighties, or Jackie Chan’s?

Feel free to download a copy of the resulting data [here] (note it is a large file: 1.6MB).

Having the data stored locally allows us to retrieve any of this information with simple python commands, and to perform complex operations on it.

Let’s see, for example, which critics on the list scored the movie Juno, and how they liked it:

>>> [[critics[index], x['juno']] for index, x in enumerate(scoredMovies)
if 'juno' in x]

[['Roger Ebert', 100],
['Peter Travers', 88],
['Joe Morgenstern', 80],
['David Denby', 90],
['A.O. Scott', 90],
['James Berardinelli', 88],
['Michael Phillips', 88],
['Lou Lumenick', 100],
['Todd McCarthy', 80],
['Michael Sragow', 58],
['Claudia Puig', 100],
['J.R. Jones', 70],
['Scott Tobias', 83],
['Kirk Honeycutt', 80],
['Ty Burr', 0],
['Stephanie Zacharek', 80],
['Lawrence Toppman', 75],
['Richard Schickel', 80],
['Desson Thomson', 90],
['Liam Lacey', 75],
['Carrie Rickey', 88],
['Maitland McDonagh', 75]]

This is a particularly interesting example. Juno is one of those movies that I watch once, and it stays on my mind for a long time: the topic leads to rich discussions; the characters are deep, although stereotypical; I resided in that part of the country, and it is always enjoyable to see the old sights again and take that trip down Memory Lane. For these and many other reasons, I would grant the flick a solid B+ (85 on my scale). Note that no critic in this sample scored it exactly like me, although quite a few were close (those who gave it 83 or 88, for instance).

This is the starting point for my algorithm. I will assign a weight to each critic, updated after each movie I score. For example, critics not in the previous list get a weight of zero for Juno (since they did not even see the film). Critics in this list who gave Juno 83 points will have a weight close to 1, and the larger the difference between a critic’s score of Juno and mine, the smaller the weight. By picking movie after movie and scoring them, I update the weights for each critic (maybe by averaging all the positive values together, although there are better methods).

As for the individual weights for critics that evaluated Juno, we could follow this formula:

\displaystyle{ \omega_{85}(x)=e^{-(x-85)^2/5211}}

The number 85 is of course the score I gave Juno: every critic gets a weight according to the difference between their evaluation and mine. The number 5211 is approximately 85^2/\log(4), which is what you need to guarantee a minimum weight of 25% for any critic who watched the same movie. You can play with these parameters, of course.
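As a quick sanity check, here is that weight function in code. It is a direct transcription of the formula above; the only assumption is writing the constant exactly as 85²/log 4 rather than the rounded 5211:

```python
import math

def weight(x, myScore=85):
    """Weight of a critic who gave score x to a movie I scored myScore.

    The constant myScore**2 / log(4) guarantees a minimum weight of 25%
    when the two scores differ by the full myScore points.
    """
    return math.exp(-(x - myScore) ** 2 / (myScore ** 2 / math.log(4)))

weight(85)   # a critic who agrees with me exactly gets weight 1.0
weight(0)    # the largest possible disagreement still gets weight 0.25
weight(88)   # a close score gets a weight very near 1
```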

I can now put into practice the algorithm that I proposed in my previous post. Let me start by picking a small random set of movies that I have seen, starting of course with the Blair Witch Project. I use that training set to compute the weight of each critic, and then use those weights to assess different movies (some that I have seen, some that I haven’t):

mymovies=dict({'oldboy': 85, 
'juno': 85, 
'vicky-cristina-barcelona': 85,
'pans-labyrinth': 95, 
'indiana-jones-and-the-kingdom-of-the-crystal-skull': 85,
'planet-of-the-apes': 90, 
'8-women': 50, 
'being-john-malkovich': 90,
'pulp-fiction': 95, 
'munich': 95, 
'district-9': 85, 
'frequency': 75,
'the-adventures-of-tintin': 90, 
'sleepy-hollow': 80, 
'signs': 95,
'spider-man-3': 50, 
'space-cowboys': 60, 
'this-is-spinal-tap': 70, 
'proof': 70, 
'the-blair-witch-project': 40})

I can now compute the weights for all the critics by averaging the non-zero basic weights indicated above:


weighthelper = [[index, numpy.exp((-1)*numpy.log(4)*(mymovies[x] -
    yourmovies[x])**2 / float(mymovies[x]**2))] for index, yourmovies in
    enumerate(movieData.scoredMovies) for x in mymovies if x in yourmovies]

# Average the basic weights per critic; critics sharing no movie with me get 0
weights = [0.0]*len(movieData.scoredMovies)
counts = [0]*len(movieData.scoredMovies)
for datum in weighthelper:
    weights[datum[0]] += datum[1]
    counts[datum[0]] += 1
weights = [w/c if c > 0 else 0.0 for w, c in zip(weights, counts)]

def assessMovie(movieTag, scoredMovies, weights):
    helper = [scoreOf[movieTag]*weights[index] for index, scoreOf
              in enumerate(scoredMovies) if movieTag in scoreOf]
    relevantWeights = [weights[index] for index, scoreOf in
              enumerate(scoredMovies) if movieTag in scoreOf]
    return reduce(lambda x, y: x+y, helper)/float(numpy.sum(relevantWeights))
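To see the two pieces work together without the scraped data, here is a minimal self-contained run. The three critics and their scores are invented for illustration, and reduce is replaced by the built-in sum so the snippet stands on its own:

```python
import numpy

# Invented toy data: three critics and the movies each has scored
scoredMovies = [{'juno': 88, 'pi': 90},
                {'juno': 40, 'pi': 30},
                {'oldboy': 80}]
mymovies = {'juno': 85, 'oldboy': 85}

# Basic weights: one entry [criticIndex, weight] per movie we share
weighthelper = [[index, numpy.exp(-numpy.log(4)*(mymovies[x] - yourmovies[x])**2
                                  / float(mymovies[x]**2))]
                for index, yourmovies in enumerate(scoredMovies)
                for x in mymovies if x in yourmovies]

# Average the basic weights per critic; critics sharing no movie get 0
weights = [float(numpy.mean([w for i, w in weighthelper if i == index] or [0.0]))
           for index in range(len(scoredMovies))]

def assessMovie(movieTag, scoredMovies, weights):
    helper = [scoreOf[movieTag]*weights[index]
              for index, scoreOf in enumerate(scoredMovies) if movieTag in scoreOf]
    relevantWeights = [weights[index]
                       for index, scoreOf in enumerate(scoredMovies)
                       if movieTag in scoreOf]
    return sum(helper)/float(numpy.sum(relevantWeights))

# The prediction for 'pi' is pulled toward the critic whose taste matches mine
assessMovie('pi', scoredMovies, weights)
```

Critic 0 agrees with me on Juno (88 vs 85) and gets a weight near 1; critic 1 disagrees badly (40 vs 85) and gets a much smaller weight, so the assessment of ‘pi’ lands well above the plain average of their two scores.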

Let’s test it:

>>> weights

[0.9316894375373113, 0.9440456788129853, 0.8276943837371913,
0.8354699447724799, 0.9096233757112504, 0.9131781397354621,
0.0, 0.792063820900018, 0.0,
0.7905341294370402, 0.8911293907698545, 0.9591397211822409,
0.9122390115047787, 0.9094164789352943, 0.0,
0.8743220704979506, 0.8091610605956614, 0.9830308800891736,
0.9377611752511361, 0.0, 0.8588275926217391,
0.8881609250607919, 0.9503520095147542, 0.0,
0.9806218435940762, 0.8326500757874696, 0.8389774746326207,
0.8743683771430136, 0.8894078063468858, 0.0,
0.0, 0.7329668437056417, 0.916911984206937,
0.41179550863378656, 0.0, 0.9830308800891736,
0.0, 0.0, 0.8543417500931425,
0.7472733165814486, 0.959985337534043, 0.25,
0.0, 0.0, 0.9153304682482231,
0.9213229176351494, 0.7186166247610994, 0.9638336270354847,
0.5553963593823906, 0.9938576336658272, 0.9719197080733987,
0.7084015185939553, 0.9291760858845585, 0.8504427376265632,
0.9554366051158306, 0.9195026707321936, 0.7071067811865476,
0.9047116150472949, 0.9123122074919429, 0.0,
0.9374115172355054, 0.0, 0.0,
0.8695144502247731, 0.8896854138078707, 0.25,
0.817259795695882, 0.0, 1.0,
0.8778357293248331, 0.6359295154878736, 0.9247356530790823,
0.8374453617968352, 0.9830308800891736, 0.8317179210428074,
0.9961672133641973, 0.0, 0.0,
0.0, 0.7925811080710169, 0.909238787663484,
0.9529153979813423, 0.7615303747676634, 0.9916488411455211,
0.9565014890519766, 0.8675413259697287]
>>> assessMovie('the-aristocrats', scoredMovies, weights)

>>> assessMovie('the-bridesmaid', scoredMovies, weights)

>>> assessMovie('pi', scoredMovies, weights)

>>> assessMovie('the-artist', scoredMovies, weights)


Compare these weighted averages with the plain critic averages published online to realize the power of this scheme. And it only gets stronger the more movies I score and include in my training dictionary! But this leads again to the same question that I posed in my previous post: I cannot figure out how Mathematics will help me choose a minimal set of movies that guarantees the success of any of the proposed algorithms. What do you think?

In a series of follow-up posts, I will compare the results of this method with other data mining schemes. Ideally, I would like to devote a post to each different method. This is a great way of illustrating the ideas behind the beautiful field of pattern recognition.

Script to retrieve critics’ data

Apologies for the poorly commented (and written!) script. I will probably update it to something more readable and stylish soon.

import urllib, numpy

# The addresses below have to be filled in for the site being scraped:
criticsIndexURL = "..."  # page listing the most popular critics
criticBaseURL = "..."    # prefix of each critic's page of scores
pageSuffix = "..."       # query-string fragment selecting a results page

def retrieveInfo(page):
    # First, obtain the source code of the page
    source = urllib.urlopen(page).read()
    return source

# Retrieve the slugs of up to 100 popular critics
source = retrieveInfo(criticsIndexURL)
pages = []
while source.count("<h3><a href=\"/critic/") > 0:
    [a, b, source] = source.partition("<h3><a href=\"/critic/")
    [a, b, source] = source.partition("\"")
    print a
    if a.count("/") == 0:    # keep only plain critic slugs
        pages.append(a)

# For each of them, compute how many movies they have evaluated
for critic in pages:
    print "\n\n\n***************** "
    print "Gathering information for", critic
    # retrieve the first page of scores for this critic
    criticPage = retrieveInfo(criticBaseURL + critic)
    # remove the scores of trailers, if any
    [criticPage, b, c] = criticPage.partition("<div class=\"module list_trailers\">")
    [a, b, c] = criticPage.partition("<a href=\"" + critic + "\">")
    [a, b, c] = c.partition("</a>")
    numberOfPagesWithScores = int(numpy.ceil(int(a.replace(',', ''))/100.0))
    print "Movies evaluated:", a.replace(',', '')
    print "This person has", numberOfPagesWithScores, "pages with scores"
    while c.count("data critscore") > 0:
        [a, b, c] = c.partition("data critscore")
        [a, b, c] = c.partition(">")        # the score sits between the
        [score, b, c] = c.partition("<")    # next ">" and "<" in the markup
        [a, b, c] = c.partition("<a href=\"/movie/")
        [movieTitle, b, c] = c.partition("\"")
        print dict({movieTitle: int(score)})
    if numberOfPagesWithScores > 1:
        for pageNumber in range(1, numberOfPagesWithScores):
            print "\n***"
            print "Page (", pageNumber+1, "/", numberOfPagesWithScores, ") for", critic
            c = retrieveInfo(criticBaseURL + critic + pageSuffix + str(pageNumber))
            [c, b, a] = c.partition("<div class=\"module list_trailers\">")
            while c.count("data critscore") > 0:
                [a, b, c] = c.partition("data critscore")
                [a, b, c] = c.partition(">")
                [score, b, c] = c.partition("<")
                [a, b, c] = c.partition("<a href=\"/movie/")
                [movieTitle, b, c] = c.partition("\"")
                print dict({movieTitle: int(score)})


  1. Justin James
    May 17, 2012 at 1:26 am

    Part of your struggle is that you are not accounting for a lot of other factors which lend themselves to something like a Bayesian analysis or a vector search. For example, if you have young kids at home, the Transformers cartoon movie may be something you watch (rated G) but the Transformers movies don’t get watched (rated PG-13). Or perhaps you normally like certain films, but one of them contains a lot of things that insult your religion.

    Those are the kinds of things that are much harder to quantify and identify trends in, because many times there are only rare outliers that throw the whole thing off.


