
iFROOSN: Incentivised Fake Reviews On OSNs with Yelp as the reference


Yelp is an online social network (OSN) used primarily to promote businesses and to review them. It can be an effective way to grow for new restaurants, spas, and salons, which are always looking for new customers.






Problem Statement


The main objective of this course project was to target fake or incentivised reviews on Yelp and to produce a credibility score that gives a new Yelp user an overall estimate of the restaurant they plan to visit. We developed an application that takes a Yelp business ID as input and outputs the credibility score, along with some inferred results in the form of graphs.

Dataset


The first requirement before starting the project was to collect a dataset of Yelp businesses, their reviews, and details about the users who posted those reviews. The dataset was obtained through the Yelp Dataset Challenge, which is available for academic use and result collection. The dataset has a predefined schema; data not available through the schema was web scraped or collected through the API.





Data Collection Details


The data available through the Yelp Dataset Challenge comprised over 15 million values. For fast retrieval and efficient processing, it was scaled down to 0.2 million values each for businesses, reviews, and user details.

Methodology


Our approach was a two-way strategy: checking user details and checking review text for plagiarism. The application reports a score composed of the normalized scores of various parameters and metrics.
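To picture how normalized metric scores could be rolled up into a single credibility score, here is a minimal sketch. The function name, the metric names, the equal default weights, and the 0 to 100 scale are all assumptions made for illustration; the actual formula used in the project is not described in this post.

```python
# Hypothetical sketch: metric names, weights, and the 0-100 scale are
# illustrative assumptions, not the project's actual formula.

def credibility_score(metric_scores, weights=None):
    """Combine per-metric scores in [0, 1] into one score in [0, 100].

    A metric score of 1.0 means "no sign of fakery" and 0.0 means
    "strongly suspicious".
    """
    if not metric_scores:
        raise ValueError("at least one metric score is required")
    if weights is None:
        # Assumption: every metric counts equally.
        weights = {name: 1.0 for name in metric_scores}
    total_weight = sum(weights[name] for name in metric_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = 0.0
    for name, score in metric_scores.items():
        clamped = min(1.0, max(0.0, score))  # keep each score in [0, 1]
        weighted += weights[name] * clamped
    return 100.0 * weighted / total_weight


# Made-up example values, one per metric.
scores = {
    "user_rating_deviation": 0.9,
    "business_rating_deviation": 0.8,
    "review_plagiarism": 0.4,
    "user_location": 1.0,
}
print(round(credibility_score(scores), 1))
```

A weighted average keeps the result easy to interpret: lowering any one metric score lowers the final score in proportion to that metric's weight.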




Broadly, these four metrics were considered for fake review detection.

1. User rating deviation: The normalised score was reduced for users whose rating deviated hugely from the average rating they usually give.

2. Business rating deviation: If the ratings for the business targeted for fake review detection showed a huge deviation from the overall business score, the overall normalised score of that business was reduced significantly.

3. User review plagiarism: For this metric, we checked whether the review the user had written was plagiarised, that is, whether a matching review already existed in our dataset.

Review plagiarism was checked based on these parameters:

i) Levenshtein distance: The minimum number of single-character edits (insertions, deletions, or substitutions) required to convert one string into another.

ii) Jaro-Winkler distance: A string similarity measure based on the number of matching characters and transpositions between two strings, with extra weight given to a shared prefix.

iii) NER: Named entity recognition was used to check whether entities such as names, locations, and references were the same across other reviews.

4. User location: The location of the user who wrote the review was compared with the location of the business. If the distance exceeded a threshold, the review was flagged.
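Three of the checks above can be sketched in a few lines of plain Python. This is a rough illustration, not the project's code: the function names are hypothetical, rescaling the Levenshtein distance by the longer string's length is one common convention, and the haversine formula is an assumed way to measure the user-to-business distance (the post does not say how distance was computed). Jaro-Winkler and NER are left out to keep the example short.

```python
import math


def rating_deviation(rating, past_ratings):
    """Absolute gap between one rating and the mean of earlier ratings."""
    if not past_ratings:
        return 0.0  # no history, so no measurable deviation
    mean = sum(past_ratings) / len(past_ratings)
    return abs(rating - mean)


def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def text_similarity(a, b):
    """Levenshtein distance rescaled to [0, 1]; 1.0 means identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))


# Assumed thresholds, chosen only for this example.
SIMILARITY_THRESHOLD = 0.9
DISTANCE_THRESHOLD_KM = 500.0


def is_near_duplicate(review, other_reviews, threshold=SIMILARITY_THRESHOLD):
    """True if the review is almost identical to any review in the list."""
    return any(text_similarity(review, other) >= threshold
               for other in other_reviews)


def is_far_from_business(user_latlon, business_latlon,
                         threshold_km=DISTANCE_THRESHOLD_KM):
    """True if the user's location is farther away than the threshold."""
    return haversine_km(*user_latlon, *business_latlon) > threshold_km
```

Comparing every review against every other review is quadratic in the number of reviews, so on a dataset of this size a real pipeline would likely need to narrow the candidates first, for example by business or by review length.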

Results


Based on these parameters, we were able to find some profiles and reviews that can be classified as fake with a high degree of accuracy.

        
        
             

Presentation and Team


Akash Kumar Gautam (2015011)
Mayank Kumar (2015055)
Sahil Babbar (2013082)
Shyam Agrawal (2015099)

       
  






References


  1. Yelp.com
  2. https://link.springer.com/chapter/10.1007/978-3-319-11119-3_1

