Approaches to Establishing the Veracity of Big Data

Speaker:  Vishnu S Pendyala – San Jose, CA, United States
Topic(s):  Information Systems, Search, Information Retrieval, Database Systems, Data Mining, Data Science


Despite their anthropomorphic role, technological inventions such as the Web, unlike human beings, do not have a conscience. Still, people often rely more on the big data emanating from them than on the spoken word. It is now well established that social media can change the trajectory of nations and the lives of billions. Recent news indicates that fraud on the Web has played a substantial role in elections, including presidential elections, particularly in developing countries in South America, and in public discourse in general. As with many inventions, such as atomic energy, the euphoria over the anticipated appropriate use of social media seems to have outweighed the caution needed to prevent its misuse. The result is substantial misuse of this powerful medium to spread falsehood. The talk will discuss ways to detect and possibly prevent falsehood on the Web using Machine Learning, statistical, and other techniques.

Depending on the time available, the topics covered will include approaches drawn from Machine Learning, Formal Methods, Blockchain, Statistical Modeling, and Information Retrieval. The talk will illustrate these approaches through specific use cases: injected attacks on microblog websites and the classification of microblogs. These solutions can be applied to problems such as the cognitive hacking fraud in presidential elections in Latin America, where Andrés Sepúlveda changed the course of the elections by manipulating the information on microblog websites. The material will be covered without assuming much mathematical or technical background, so it should be easy to follow for anyone interested in these topics. The talk will draw from the speaker's book, "Veracity of Big Data: Machine Learning and Other Approaches to Verifying Truthfulness", available at and major libraries, including those of MIT, Harvard, Stanford, and CMU, as well as libraries internationally.

About this Lecture

Number of Slides:  30 - 120
Duration:  45 - 240 minutes
Languages Available:  English, Hindi
