If a lot of people go to see a movie, does that mean it’s a good movie? Maybe. If a movie wins Best Picture, does that mean it is actually good, or that it performed well at the box office? Maybe. Let’s take a look at the data.
One of the fun parts of this project is just getting the data. There isn’t a nice “export” button on IMDb, so I scraped the pages with rvest and put all the data in a nice little table. Look at my GitHub for more information on my web scraping strategy. The major benefit of web scraping is easier reproducibility. Copy-pasting a table into Excel and cleaning it by hand takes time and invites human error; a scraping script avoids all of that, as long as the website stays in the same format.
Best Pictures (Web Scraping Script)
Top 1000 Box Office Movies (Web Scraping Script)
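For a rough idea of what the scraping step looks like, here is a minimal rvest sketch. It is not the script linked above: the HTML is a tiny hypothetical stand-in (the `#movies` id, column names, and titles are all made up), since IMDb's real markup is different and can change.

```r
library(rvest)

# Hypothetical stand-in for a downloaded page. With a live site you would
# call read_html() on the URL instead of building the page inline.
page <- minimal_html('
  <table id="movies">
    <tr><th>Title</th><th>Year</th><th>Gross</th></tr>
    <tr><td>Movie A</td><td>1999</td><td>$1,234,567</td></tr>
    <tr><td>Movie B</td><td>2005</td><td></td></tr>
  </table>')

# Select the table by its (assumed) id and turn it into a data frame.
table_node <- html_element(page, "#movies")
movies <- html_table(table_node)

# Strip "$" and "," so the gross becomes numeric; an empty cell becomes NA.
movies$Gross <- as.numeric(gsub("[$,]", "", as.character(movies$Gross)))

movies
```

Because every step is code, rerunning the script rebuilds the same table without any hand cleaning, which is the reproducibility benefit described above.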
Hover over the points with your mouse, or tap them, to view some info!
Please note there are 26 movies that are both Best Picture winners and in the Top 1000 Box Office Movies of All Time. Those include:
Also note that there are 11 movies without a lifetime gross listed on IMDb. I know a lot of these are listed on Wikipedia, but I’m only concentrating on data from IMDb. Feeling lazy right now.
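Both counts above (movies on both lists, and movies with no lifetime gross) are easy to get once the two tables exist. The sketch below uses toy data frames with hypothetical titles, column names, and values, and assumes movies are matched on title plus year; the actual matching rule used for this post is not shown here.

```r
# Toy stand-ins for the two scraped tables (hypothetical titles and values).
best_picture <- data.frame(
  title = c("Movie A", "Movie B", "Movie C"),
  year  = c(1999, 2005, 1950),
  gross = c(300e6, 40e6, NA)
)
top_box_office <- data.frame(
  title = c("Movie A", "Movie D"),
  year  = c(1999, 2012),
  gross = c(300e6, 900e6)
)

# Movies on both lists: match on title AND year so that remakes sharing a
# title are not counted as the same film.
both <- merge(best_picture, top_box_office[, c("title", "year")],
              by = c("title", "year"))
n_both <- nrow(both)                          # 1 in this toy data

# Movies with no lifetime gross listed (NA after scraping).
n_missing <- sum(is.na(best_picture$gross))   # 1 in this toy data
```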
Very obviously, Best Picture winners lag at the box office on average, grossing only about 39% as much as the Top 1000 (Best Picture: $175,656,143 vs Top Box Office: $448,088,300).
How fun! It looks like a manta ray!
Also notice that IMDb ratings are higher on average for Best Picture winners than for the Top 1000 Box Office Movies (Best Picture: 7.8 vs Top Box Office: 6.8).
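The group averages quoted above could be computed along these lines. This is a sketch on toy numbers, not the real data, and it assumes that movies without a lifetime gross are simply dropped from the gross average (`na.rm = TRUE`); the column and group names are hypothetical.

```r
# Toy data: one row per movie per list. A movie on both lists would
# appear once in each group.
movies <- data.frame(
  group  = c("Best Picture", "Best Picture", "Best Picture",
             "Top Box Office", "Top Box Office"),
  gross  = c(100e6, NA, 200e6, 400e6, 600e6),
  rating = c(8.0, 7.0, 9.0, 6.0, 7.0)
)

# Mean lifetime gross per group, ignoring movies with no gross listed.
avg_gross <- tapply(movies$gross, movies$group, mean, na.rm = TRUE)

# Mean IMDb rating per group.
avg_rating <- tapply(movies$rating, movies$group, mean)

avg_gross
avg_rating
```

Without `na.rm = TRUE`, a single missing gross would turn the whole group average into `NA`, so the missing values have to be handled explicitly one way or another.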