
Preventing and Detecting Molestation using Twitter

Crime against women has been increasing at an alarming rate, especially cases of molestation. We wanted to build something that actually makes people aware of where they are and of their surroundings. So we took one of the most popular social media platforms, Twitter, and extracted tweets about molestation. Domain: India.


The main purpose was to extract locations (location of incidence) out of the tweets that we collected and flag those locations into 3 categories.

1) Most prone to molestation: more than 800 cases per year.
2) Relatively less prone, but quite a few incidents have been reported.
3) Relatively very safe: very few or no incidents have been reported.
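The three flags above can be sketched as a simple threshold rule. This is only an illustration: the 800 cases per year cut-off comes from the list above, but the lower cut-off (`LOW_THRESHOLD`) and the names `Risk` and `flag_location` are assumptions made for the example, since the post does not give a number for the boundary between the second and third category.

```python
from enum import Enum


class Risk(Enum):
    MOST_PRONE = "most prone"
    LESS_PRONE = "relatively less prone"
    VERY_SAFE = "relatively very safe"


HIGH_THRESHOLD = 800  # cases per year, as stated in the post
LOW_THRESHOLD = 50    # hypothetical value, NOT from the post


def flag_location(cases_per_year, low=LOW_THRESHOLD, high=HIGH_THRESHOLD):
    """Map a yearly incident count for a location to one of the three flags."""
    if cases_per_year > high:
        return Risk.MOST_PRONE
    if cases_per_year > low:
        return Risk.LESS_PRONE
    return Risk.VERY_SAFE
```

Keeping both cut-offs as parameters makes it easy to tune the lower one once real counts per location are available.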

Methodology :

1) Extracted more than 10K tweets using hashtags.

#sexualabuse, etc.

Challenge: Most of the tweets are not geotagged.

We came up with a solution: explore the text and metadata of the tweets. When people use hashtags in a tweet, they sometimes mention the location as well.


We did some pre-processing on the tweet text and the metadata to extract locations out of it. The preprocessed strings that looked like locations were passed to the geocoder API, which returned their latitude and longitude. Still, our problem was not solved: this technique gave us locations for only about one fifth of the tweets. Hmmmm...
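A minimal sketch of what such a pre-processing step could look like is given below. It is not our exact pipeline: the tiny `GAZETTEER` set, the `user_location` metadata field, and the function names are all assumptions made for illustration, and `geocode` is a stand-in callable for whichever geocoder API is used (it is passed in so the sketch does not depend on any particular service).

```python
import re

# Hypothetical mini-gazetteer; a real one would hold many more Indian place names.
GAZETTEER = {"delhi", "mumbai", "kolkata", "bengaluru", "pune"}


def candidate_locations(tweet_text, user_location=""):
    """Return gazetteer places found in the tweet text, its hashtags, or metadata."""
    text = f"{tweet_text} {user_location}"
    text = re.sub(r"https?://\S+", " ", text)            # drop URLs
    text = re.sub(r"[@#]", " ", text)                    # keep hashtag words, drop symbols
    text = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", text)     # split CamelCase hashtags
    found = []
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in GAZETTEER and token not in found:
            found.append(token)
    return found


def geocode_all(places, geocode):
    """Call the supplied geocode(place) -> (lat, lon) or None, keeping only hits."""
    coords = {}
    for place in places:
        result = geocode(place)
        if result is not None:
            coords[place] = result
    return coords
```

For example, `candidate_locations("Another case reported #DelhiMolestation", "Pune, India")` picks up both the place hidden inside the hashtag and the one in the metadata field. Tweets that mention no known place name return an empty list, which is exactly the gap described above.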

Since most of the tweets contain images, news cuttings, etc., we used OCR (Optical Character Recognition) to extract text from those images and then extracted locations from that text.
The image is given below.

The red circles indicate the locations. We got news clippings from the timelines of police handles and news handles on Twitter.
Additionally, we used web scraping on news websites to extract the locations of certain incidents. We also used a Google add-on, Twitter Archiver, to extract tweets based on our hashtags and filtered location.

And here we are ---------------------------------

The 30 km radius circle shows that the area around you is quite unsafe. You can drag the purple marker to any place you want and see whether you are safe within the 30 km radius.

The purple marker denotes your current location.
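The radius check behind the marker can be sketched with the haversine great-circle distance. This is an illustrative version, not necessarily how the map in the screenshot computes it; the function names and the sample coordinates in the usage note are assumptions for the example, and only the 30 km radius comes from the post.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def incidents_within(marker, incidents, radius_km=30.0):
    """Return the incident (lat, lon) points lying within radius_km of the marker."""
    lat, lon = marker
    return [p for p in incidents if haversine_km(lat, lon, p[0], p[1]) <= radius_km]
```

With a marker at roughly (28.6139, 77.2090) and incidents at (28.7041, 77.1025) and (19.0760, 72.8777), only the first incident falls inside the 30 km circle. Dragging the marker simply means calling `incidents_within` again with the new coordinates.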

Improvements: Accuracy is low. Many of the locations we retrieved turn out not to be places where an incident actually occurred. Checking every post manually is impossible, so we need to automate this verification.

Secondly, this can be extended: if we can get precise street-level data, we can offer the user an alternative walking/driving route based on the crime rate of a location.

Thirdly, if we can find the time of day at which these incidents occur, the tool becomes more effective.

Poster :

Us :  


Ritaban Basu
Mayur Shingote
Saquib Mohd
Ronak Kumar


Images: Google.

