Given the recent buzz around this year’s 88th Academy Awards, our team decided to put HPCC Systems to the test. We used the platform to scan IMDb’s movie database¹ for the “Six Degrees of Kevin Bacon.” But we didn’t stop there. We wanted to find out which of this year’s best actor and best actress nominees is the most connected, and whether that person should be the new figurehead for this 22-year-old parlor game. You think Bacon has connections? Matt Damon has him beat.
We’re always trying to find new ways to process and analyze massive amounts of data. One could argue that’s exactly what three Albright College students – Brian Turtle, Craig Fass, and Mike Ginelli – were doing when a Kevin Bacon movie marathon aired one night. From “Footloose” to “The Air Up There,” they began to realize that Bacon didn’t just appear in a number of movies; he was one well-connected guy because of it. And oh, what a web those three students wove. They began to link even the most obscure celebrities to Bacon through the movies they had in common, in as few steps as possible, taking the well-known “six degrees of separation” theory to another level.
To celebrate the Oscars, we also took “Six Degrees” to another level – a much more efficient and reliable one. We set out to show how quickly we can pull and query the large amounts of data housed in the IMDb movie database file, so we ran it through a data processing analysis to determine a number of things, including how the best actor and best actress nominees compare to one another. Our findings show that Oscar nominee Matt Damon has surpassed Kevin Bacon in connections. Not only that, actress Charlotte Rampling isn’t doing so badly herself. Here’s a look at the number of first-degree connections for several of the nominees, with Bacon included for comparison:
# of Movies    # of Connections    Celebrity
86             5,825               Matt Damon
112            5,450               Charlotte Rampling
91             5,134               Kevin Bacon
73             4,918               Cate Blanchett
48             3,709               Leonardo DiCaprio
30             1,950               Jennifer Lawrence
17             1,412               Eddie Redmayne
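For the curious, here is a rough idea of how a count like this can be computed. This is a minimal Python sketch, not the HPCC Systems code behind the numbers above. It assumes a “first-degree connection” means a distinct performer who shares at least one movie with the celebrity, and the `CASTS` data and the `first_degree_connections` function are made up for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: movie title -> cast list (not real IMDb records).
CASTS = {
    "Movie A": ["Actor 1", "Actor 2", "Actor 3"],
    "Movie B": ["Actor 1", "Actor 4"],
    "Movie C": ["Actor 2", "Actor 3"],
}


def first_degree_connections(casts):
    """Return {actor: (number of movies, number of distinct co-stars)}.

    Assumption: a first-degree connection is any other performer who
    appears in at least one of the same movies.
    """
    movie_counts = defaultdict(int)
    costars = defaultdict(set)
    for cast in casts.values():
        members = set(cast)  # ignore duplicate credits within one movie
        for actor in members:
            movie_counts[actor] += 1
            costars[actor] |= members - {actor}
    return {a: (movie_counts[a], len(costars[a])) for a in movie_counts}


stats = first_degree_connections(CASTS)

# Print in the same column order as the table: movies, connections, name.
ranked = sorted(stats.items(), key=lambda item: item[1][1], reverse=True)
for actor, (n_movies, n_connections) in ranked:
    print(f"{n_movies:>4} {n_connections:>6,} {actor}")
```

Storing co-stars in a set means someone who shares several movies with the same celebrity is still counted only once, which is why the connection count can be much smaller than the sum of all cast sizes.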
While the Oscar went to Leo (finally), the real winner of the show is Damon, thanks to the HPCC Systems platform. Outside of Hollywood and in the real world, we’re giving businesses of all sizes an easy way to tap into the value of their data and build the customized queries they need to assess and improve their business. If that means finding the most connected celebrity in Hollywood (Bess Flowers), then so be it.
We make this type of big data analysis possible without the time-consuming and complicated coding that other solutions require. That’s because we can quickly handle large volumes of data in many formats, both structured and unstructured, and then visualize the results to surface insights such as these.
If you ever want to challenge someone to a game of “Six Degrees of [INSERT CELEBRITY HERE],” come to us. In the meantime, we would like to thank the Academy…and IMDb.
¹ The information used for this analysis can be found on www.imdb.com.