
Prediction and Analysis on Foursquare


What is the project about?

The aim of the project was to analyse the Foursquare network. Foursquare is a mobile app that gives users personalised recommendations of places to go near their location, based on their previous browsing history, purchases, and check-in history. Because the app is a location-based social network, the privacy of a user's location is at high risk. Our task was to predict users' home and office locations, and to analyse the results, using the data provided by Foursquare.

What is the methodology used to predict users' home / office?

We chose Python as our language for extracting the Foursquare data. The libraries used are listed at the bottom of the page, with the references.

The poster attached below describes the process -

What do you mean by comparing empirically?

We plotted the locations of each user's tips on a world map (using the Basemap toolkit for Matplotlib). There is no formula for predicting a user's home location; it can only be inferred empirically. That raises the problem of data accuracy, because there was no ground truth data. The only possible source of ground truth was the location a user gives in his/her bio. But one can't rely on it: we found many cases where the bio was not properly written, was left empty, or contained a person's name instead of a place name.

Then how can you prove the data you collected was accurate?

Here is a functional diagram of the home/office prediction for Kinjil Mathur (formerly VP of Marketing at Foursquare), which we showed in our poster presentation.

 1. Collected the addresses of Kinjil's tips.
 2. Plotted them on the map.
 3. Collected the home cities of her friends on Foursquare.
 The home cities of her friends and the addresses of her tips were enough to tell that she is from New York, because people predominantly make friends where they work and mostly eat at restaurants near their office or home. But the task wasn't over: we were determined to find the person's exact location.
 4. Collected the postal codes of the tips and sorted them by frequency. A postal code can give quite precise information about a person's location. We mapped the most frequent postal code; a snapshot of the result is given below.

                  Source -  Google Maps
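
Steps 3 and 4 above come down to counting how often each value occurs and taking the most frequent one. A minimal sketch in Python, assuming the friends' home cities and the tips' postal codes have already been extracted into plain lists. The helper name `most_common_value` and the sample values are hypothetical, chosen only for illustration; they are not the project's actual code or data.

```python
from collections import Counter


def most_common_value(values):
    """Return (value, count) for the most frequent non-empty value, or None."""
    # Drop missing or blank entries, which are common in scraped profile data.
    cleaned = [v.strip() for v in values if v and v.strip()]
    if not cleaned:
        return None
    return Counter(cleaned).most_common(1)[0]


# Hypothetical sample data (not real user data).
friend_home_cities = ["New York, NY", "New York, NY", "San Francisco, CA", "", "New York, NY"]
tip_postal_codes = ["10012", "10003", "10012", "10012", "10011", None]

likely_city = most_common_value(friend_home_cities)
likely_postal_code = most_common_value(tip_postal_codes)

print("Most frequent home city among friends:", likely_city)
print("Most frequent postal code among tips:", likely_postal_code)
```

The city gives the coarse estimate and the postal code narrows it down, which mirrors the order of the steps above.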

The location we traced was 1.4 miles away from the Foursquare office. We consider this prediction accurate because there was no other person named Kinjil, and she was the VP of Marketing at Foursquare when the check-ins were made, so she evidently worked at the Foursquare office and presumably lived nearby.
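
A distance like the one quoted above can be computed from two latitude/longitude pairs with the haversine formula. The sketch below is one way to do it, not necessarily how the figure in this post was obtained. The function name `haversine_miles` and the coordinates in the usage lines are hypothetical placeholders, not the actual predicted or office locations.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Placeholder coordinates for illustration only.
predicted_location = (40.7300, -73.9950)
reference_location = (40.7240, -73.9970)

distance = haversine_miles(*predicted_location, *reference_location)
print(f"Distance between the two points: {distance:.2f} miles")
```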

We treated the bio as accurate only for some profiles, namely those with the following characteristics -

1. The person has a decent public profile, and his/her data can also be found on other sites on the internet.
2. The location data of the profile matches the data on other sites such as Facebook and Twitter, since people with a decent public profile usually have verified profiles on such sites.
3. The profile has a large dataset, e.g. more than 100 friends and more than 20 tips.
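
The criteria above can be expressed as a simple filter. This is a sketch under assumptions: the `Profile` fields and the `is_trusted` helper are hypothetical names, and the check against other sites is reduced to a plain string comparison, whereas in the project that comparison was done by hand.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    name: str
    friend_count: int
    tip_count: int
    bio_location: str       # location written in the Foursquare bio
    external_location: str  # location found on other sites; empty if none found


def is_trusted(profile, min_friends=100, min_tips=20):
    """True if the bio location can be treated as approximate ground truth."""
    # Criterion 3: enough data (more than 100 friends and more than 20 tips).
    has_enough_data = profile.friend_count > min_friends and profile.tip_count > min_tips
    # Criteria 1 and 2: a location exists elsewhere and it matches the bio.
    bio = profile.bio_location.strip().lower()
    external = profile.external_location.strip().lower()
    locations_match = bool(bio) and bio == external
    return has_enough_data and locations_match


# Hypothetical profiles for illustration.
profiles = [
    Profile("user_a", 250, 40, "New York", "new york"),
    Profile("user_b", 30, 5, "Delhi", "Delhi"),
    Profile("user_c", 500, 90, "", "London"),
]
trusted = [p.name for p in profiles if is_trusted(p)]
print(trusted)
```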

But is that enough to prove the accuracy of the data, given that there is still no ground truth and everything still rests on assumptions?

No, one can't fully trust such assumptions, but one can treat them as accurate to a certain level. We got stuck at this stage of the project: we needed something to prove the accuracy of our data. So we started searching for the problem on Google, and found a research paper titled "I Know Where You Live", written by MIT-affiliated researchers (reference at the bottom). Below are two snapshots from their work on the accuracy of empirical analysis of a location-based social network.

Source - 

Source -
The first diagram shows the correct responses (empirical analysis of location using ground truth data of known people) that they obtained with datasets of different densities. It shows that the percentage of people giving correct responses is higher for the prediction of the office, mainly in the low- and high-density datasets.
The table above states that the mean accuracy for the prediction of workplace was 69%.
A point to note is that, as per the research, this is valid not just for one social network but for all LBSNs (location-based social networks).

Any problems faced in the project?

The main problem with Foursquare is that the location data per person is not large enough for a smooth and accurate analysis. This is to be expected: people rarely search for restaurants in their own locality, and when they do, many don't bother to post. Most people post only when they are on a trip, which yields data for just one or two locations.


From Right to Left - 
1. Sumeet Bhardwaj (Group Leader)
2. Nickey Kumar 
3. Aman Verma
4. Sanidhya Daeeyya
5. Azhar Tak


1. I Know Where You Live: Inferring Details of People’s Lives by Visualizing Publicly Shared Location Data
2. Libraries used:

