Big Data and Facebook – The Heavenly Pair that isn’t quite in Heaven


Facebook recently revealed some huge stats about Big Data, including system processes of 2.5 billion pieces of content and 500 terabytes of data per day. Imagine pulling 2.7 billion LIKE actions and 300 million photos per day, and scans 105 terabytes of data each half-hour.

Big data is all about insights, and if you are not taking advantage of it, then your enterprise is at a loss. Facebook is at an edge here. By processing data within minutes, Facebook can roll out the new products, understand the behavior of consumers, their reactions, and can modify designs in near real-time. The data is not only beneficial for Facebook but also advertisers and can be tracked for performance across platforms and dimensions, based on age, interests, gender, demography, etc.

Facebook doesn’t even need to push changes to see their impact now; they have the data in front of them to see their effect. Let us look at the amount of data Facebook deals with:

1. 375 billion active Facebook users

2. Over 1 billion are mobile-only users

3. Facebook adds 6 new profiles every second

4. Facebook generates 4 new petabytes of data per day

5. Uploads 350 million photos per day

6. Generates 4 million ‘Likes’ every day

And then there is ‘Project Prism.’ Facebook currently stores its entire live and evolving database in one single center. Others are used for miscellaneous and redundant data. When the existing storage gets too big for the constant incoming data, the entire chunk has to be shifted to another that has to be expanded to fit in. So Project Prism will enable this monolithic warehouse to be physically separate but also be viewed as one data. In layman’s terms, the live dataset can be split up and hosted across Facebook’s data centers in California, Virginia, Oregon, North Carolina, and Sweden.

Users can feel a little easy that Facebook employees cannot sneak peek into their personal data, but that’s how it is.

The Facebook context

We feed Facebook every day with beast mounds of data and information. Every 60 seconds, 136,000 photos are uploaded, 510,000 comments are posted, and 239,000 status updates are posted. This is a LOT of data.

Facebook is arguably the world’s most popular social media network, with more than 2 billion monthly active users worldwide. Therefore, Facebook is like a data wonderland. In a rough estimate, there will be more than 183 million Facebook users in the US alone in October 2019. Facebook is still considered one of the top 100 public companies in the world, with a market value of $475 billion approximately.

Figure 1. Social Media Preference Breakdown

At first, so much of information may not seem to mean very much, but Facebook knows who our friend is, what they do, what do they look like, like, dislike, and so much more. Some researchers will even go put on a leg ad claim that Facebook might know us better than our therapists!

Facebook is the only company apart from Google that possesses such a precise amount of customer information. The more users who log on to Facebook, there is more data amass. Facebook does not stop only in collecting, storing, and analyzing data; it also uses the same data to determine customer behavior:

1. Facial recognition: One of Facebook’s latest invention and investment is its image processing capabilities. Facebook can track its users from other Facebook profiles with image data through user sharing.

2. Tracking cookies: Facebook tracks its users across the web by tracking their cookies. For example, if a user is logged onto Facebook and is also simultaneously visiting or browsing other websites, Facebook can track those websites.

3. Analyzing the likes: A recent study conducted showed that it is viable to predict data accurately on a range of personal attributes that are highly sensitive just by analyzing Facebook likes. A study by the Cambridge University and Microsoft Research shows that merely analyzing the Likes of Facebook can say everything about the user starting from his/her sexual orientation, satisfaction with life, relationship status, drug use or abuse, religion, emotional stability, gender, age, race, political views, among many others.

4. Tag suggestions: Facebook, in the past few years, introduced the tag suggestions feature where they process the user photo through image processing and facial recognition.

Big data and Facebook camisole: the beauty or the beast?

It is not that Facebook is all beauty, and there is no beast, or it is all the good and no ‘bad and the ugly.’ Facebook is guilty of making 2 mistakes:

1. Relying too much on technology like Hadoop and its massive installation, which would mean a highly scalable open-source framework that uses bundles of low-cost servers that solve problems. Facebook’s in-house hardware is for this purpose.

2. Many big conglomerates like Facebook have been using Big Data to use meaningless questions where the underlying answer is obvious.

These are times where social media consolidation is increasing the scope of stored, processed, and monetized data, opening new horizons for the Internet of Things. Last year’s World Internet Day has made us realize that the world is a small place and who knows one day your ‘friend’ might be your coffee maker that you end up having conversations with.


Please enter your comment!
Please enter your name here