Using Scalable Data Mining for Predicting Flight Delays

Speaker:  Paolo Trunfio – Rende, Italy
Topic(s):  Information Systems, Search, Information Retrieval, Database Systems, Data Mining, Data Science

Abstract

Flight delays are frequent all over the world (about 20% of airline flights arrive more than 15 minutes late), and their annual cost is estimated at several tens of billions of dollars. This makes the prediction of flight delays a primary issue for airlines and travelers. This lecture describes the design, implementation, and evaluation of a predictor of the arrival delay of a scheduled flight due to weather conditions. The predicted arrival delay takes into account both flight information (origin airport, destination airport, scheduled departure and arrival times) and weather conditions at the origin and destination airports according to the flight timetable. Datasets of airline flights and weather observations were analyzed and mined using parallel algorithms implemented as MapReduce programs and executed on a Cloud platform. The results show high accuracy in predicting delays above a given threshold. For instance, with a delay threshold of 15 minutes the predictor achieves an accuracy of 74.2% and a recall of 71.8% on delayed flights, while with a threshold of 60 minutes the accuracy is 85.8% and the delay recall is 86.9%. Furthermore, the experimental results demonstrate the scalability that the predictor achieves by performing data preparation and mining tasks as MapReduce applications on the Cloud.
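To make the two reported metrics concrete, the following is a minimal sketch of how accuracy and delay recall could be computed once flights are labeled against a delay threshold. It is not the lecture's implementation: the function names, the use of "greater than or equal to" for the threshold test, and the five sample flights are all assumptions made for illustration.

```python
from typing import Iterable, Tuple


def label_delayed(arrival_delay_min: float, threshold_min: float) -> bool:
    """Label a flight as delayed (assumption: delay >= threshold counts as delayed)."""
    return arrival_delay_min >= threshold_min


def accuracy_and_delay_recall(pairs: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Compute (accuracy, recall on the delayed class) from (actual, predicted) labels."""
    tp = tn = fp = fn = 0
    for actual, predicted in pairs:
        if actual and predicted:
            tp += 1  # delayed flight correctly predicted as delayed
        elif actual and not predicted:
            fn += 1  # delayed flight missed by the predictor
        elif not actual and predicted:
            fp += 1  # on-time flight wrongly predicted as delayed
        else:
            tn += 1  # on-time flight correctly predicted as on time
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall


# Illustrative data only: observed arrival delays in minutes and a
# hypothetical predictor's output for the same five flights.
observed_delays = [5, 20, 70, 0, 16]
predicted_delayed = [False, True, True, True, False]

threshold = 15  # minutes, one of the thresholds mentioned in the abstract
actual_delayed = [label_delayed(d, threshold) for d in observed_delays]
acc, rec = accuracy_and_delay_recall(zip(actual_delayed, predicted_delayed))
print(f"accuracy={acc:.3f} delay_recall={rec:.3f}")
```

Accuracy counts correct predictions over all flights, while delay recall counts only how many of the truly delayed flights were caught, which is why the abstract reports both figures for each threshold.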

About this Lecture

Number of Slides:  60
Duration:  60 minutes
Languages Available:  English
Last Updated: 
