Python CSV Read: Airline Route Histogram

Problem Statement:

OpenFlights distributes databases for both airport locations and airline route details. You can download the routes database “routes.dat” from the OpenFlights data page. This database stores every unique flight route that OpenFlights knows about. Take a moment to look at the fields available in the routes data (listed on the OpenFlights page).

By using both data sources, we can calculate how far each route travels and then plot a histogram showing the distribution of distances flown.

This is a multi-stage problem:

• Read the airports file (airports.dat) and build a dictionary mapping the unique airport ID to the geographical coordinates (latitude and longitude). This allows you to look up the location of each airport by its ID.
• Read the routes file (routes.dat) and get the IDs of the source and destination airports. Look up the latitude and longitude based on the ID. Using those coordinates, calculate the length of the route and append it to a list of all route lengths.
• Plot a histogram based on the route lengths, to show the distribution of different flight distances.

We will first read the airports database from the text file available at https://raw.githubusercontent.com/jpatokal/openflights/master/data/airports.dat and create two dictionaries, one for latitudes and one for longitudes, each keyed by airport ID.
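A minimal sketch of this step, assuming the published airports.dat column layout (airport ID in column 0, latitude in column 6, longitude in column 7). The function name read_airport_coordinates and the sample rows are made up for illustration; they are not taken from the real file.

```python
import csv
import io


def read_airport_coordinates(lines):
    """Build latitude and longitude dictionaries keyed by airport ID."""
    latitudes = {}
    longitudes = {}
    for row in csv.reader(lines):
        airport_id = row[0]                      # column 0: airport ID (kept as a string)
        latitudes[airport_id] = float(row[6])    # column 6: latitude in degrees
        longitudes[airport_id] = float(row[7])   # column 7: longitude in degrees
    return latitudes, longitudes


# Made-up rows in the airports.dat column layout, for illustration only.
sample = io.StringIO(
    '1,"Airport A","City A","Country A","AAA","AAAA",10.5,20.25,100,0,"U","Etc/UTC","airport","Example"\n'
    '2,"Airport B","City B","Country B","BBB","BBBB",-33.0,151.0,20,10,"U","Etc/UTC","airport","Example"\n'
)
latitudes, longitudes = read_airport_coordinates(sample)
print(latitudes["1"], longitudes["1"])
```

With a downloaded copy of the file, the same function should accept an open file object, for example open("airports.dat", newline="", encoding="utf-8").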

Now, using the latitude and longitude of each airport, we can calculate the distance of each airline route.

To calculate geographic distances, we will use the function available at the link below.

http://opentechschool.github.io/python-data-intro/files/geo_distance.py
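If the linked file is not at hand, a great-circle distance can be sketched with the haversine formula. This is a stand-in written for illustration, not the contents of geo_distance.py; the name haversine_km and the mean Earth radius of 6371 km are assumptions of this sketch.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# A quarter of the way around the equator.
quarter_equator = haversine_km(0.0, 0.0, 0.0, 90.0)
print(round(quarter_equator, 1))
```

The haversine form treats the Earth as a sphere, which is usually accurate enough for a histogram of route lengths.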

Once the above function is working, we can write a program that reads all the airline routes from “routes.dat”, looks up the latitude and longitude of the source and destination airports, and builds a list of route distances.
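One way this program might look, assuming the published routes.dat layout (source airport ID in column 3, destination airport ID in column 5). The function names, the stand-in distance function, and the sample data are hypothetical. Routes whose airport IDs have no entry in the dictionaries (routes.dat marks missing values as \N) are skipped.

```python
import csv
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Stand-in great-circle distance (km); swap in the linked function if preferred."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    a = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(math.radians(lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(min(1.0, math.sqrt(a)))


def route_distances(route_lines, latitudes, longitudes, distance_fn=haversine_km):
    """Return a list with one distance per route whose airports are both known."""
    distances = []
    for row in csv.reader(route_lines):
        source_id, dest_id = row[3], row[5]
        if source_id not in latitudes or dest_id not in latitudes:
            continue  # unknown or missing airport ID, e.g. "\N"
        distances.append(distance_fn(latitudes[source_id], longitudes[source_id],
                                     latitudes[dest_id], longitudes[dest_id]))
    return distances


# Made-up airports and routes, for illustration only.
latitudes = {"1": 0.0, "2": 0.0}
longitudes = {"1": 0.0, "2": 90.0}
routes = [
    "XX,100,AAA,1,BBB,2,,0,738",
    r"XX,100,AAA,1,CCC,\N,,0,738",  # destination ID missing, so this route is skipped
]
distances = route_distances(routes, latitudes, longitudes)
print(len(distances))
```

Passing the distance function as a parameter keeps the route-reading logic independent of which distance formula is used.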

Now that our distances are ready, we can create a histogram to display the frequency of flights by distance.
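A possible sketch of the plotting step, assuming matplotlib is installed. The helper bin_counts is a hypothetical addition that tallies routes per 1000 km so the distribution can be checked as plain numbers before plotting; the bin count of 100 and the axis labels are arbitrary choices.

```python
def bin_counts(distances, bin_width_km=1000):
    """Count routes per distance bin; keys are the lower edge of each bin in km."""
    counts = {}
    for d in distances:
        start = int(d // bin_width_km) * bin_width_km
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))


def plot_histogram(distances, bins=100):
    """Draw the route-length histogram with matplotlib."""
    # Imported here so bin_counts can be used even where matplotlib is unavailable.
    import matplotlib.pyplot as plt

    plt.hist(distances, bins=bins)
    plt.xlabel("Route distance (km)")
    plt.ylabel("Number of routes")
    plt.title("Distribution of airline route distances")
    plt.show()


# Made-up distances, for illustration only.
example_distances = [500, 1500, 1700, 9999.9]
counts = bin_counts(example_distances)
print(counts)
```

Calling plot_histogram(distances) with the list built from routes.dat should then open a window showing the histogram.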
