Why is My Flight Delayed – Index and Motivation

This notebook describes the motivation for our project and provides an index of the process notebooks that follow.

1 Motivation

1.1 Why is My Flight Delayed?

Flight delays are a major problem. Almost everybody has experienced a delayed or cancelled flight and knows how annoying it is to wait at an airport and perhaps even miss an important meeting at the destination. Delays are a problem not only for individual customers, but also for the airlines and the US economy in general: In 2010, researchers at the University of California, Berkeley found that flight delays lead to total costs of more than $32.9 billion!

However, customers usually do not know why their flight arrived late. We therefore want to understand what causes delays and whether we can estimate the expected delay for a given future flight.

A few of the main questions we are planning to investigate in our project are:

  • Is there a difference between airports? What are the airports that are most heavily affected by delays?
  • We would like to ask a similar question about the airlines: Which airlines usually arrive on time? Which airline is the worst?
  • Can we detect any seasonality in the data? Are there more delays in the winter (e.g. because of bad weather) or in the summer (e.g. because of summer holidays)? At what time of day should I schedule my flight to avoid delays?
  • Sometimes flights can't leave the airport because of last-minute repairs or other problems caused by the carrier. Does the age of the aircraft have any influence on these carrier delays?
  • Is it possible to build a model that estimates the delay for a given flight?

2 Index

To make the project more readable, we split it into four notebooks, one for each main section.

  1. This one (01_Index_and_Motivation.ipynb): Our motivation for the project.
  2. 02_DataAcquisition_and_Preparation.ipynb: Description of our data sources and data wrangling/preprocessing processes.
  3. 03_DataExploration.ipynb: Our exploratory data analysis.
  4. 04_PredictiveAnalytics.ipynb: All predictive models we built.