Archive

Archive for May, 2012

Math still not the answer

May 16, 2012 1 comment

I wrote a quick (but not very elegant) python script to retrieve locally enough data from www.metacritic.com for pattern recognition purposes. The main goal is to help me decide how much I will enjoy a movie, before watching it. I included the script at the end of the post, in case you want to try it yourself (and maybe improve it too!). It takes a while to complete, although it is quite entertaining to see its progress on screen. At the end, it provides with two lists of the same length: critics—a list of str containing the names of the critics; and scoredMovies—a list of dict containing, at index k, the evaluation of all the movies scored by the critic at index k in the previous list.

For example:

>>> critics[43]

‘James White’
>>> scoredMovies[43]

{‘hall-pass’: 60, ‘the-karate-kid’: 60, ‘the-losers’: 60,
‘the-avengers-2012’: 80, ‘the-other-guys’: 60, ‘shrek-forever-after’: 80,
‘the-lincoln-lawyer’: 80, ‘the-company-men’: 60, ‘jonah-hex’: 40,
‘arthur’: 60, ‘vampires-suck’: 20, ‘american-reunion’: 40,
‘footloose’: 60, ‘real-steel’: 60}

The number of scored films by critic varies: there are individuals that gave their opinion on a few dozen movies, and others that took the trouble to evaluate up to four thousand flicks! Note also that the names of the movies correspond with their web pages in www.metacritic.com. For example, to see what critics have to say about the “Karate Kid” and other relevant information online, point your browser to www.metacritic.com/movie/the-karate-kid. It also comes in very handy if there are several versions of a single title: Which “Karate Kid” does this score refer to, the one in the eighties, or Jackie Chan’s?

Feel free to download a copy of the resulting data [here] (note it is a large file: 1.6MB).

But the fact that we have that data stored locally allows us to gather that information with simple python commands, and perform many complex operations on it.

Read more…

%d bloggers like this: