Make SQL queries and use data

Author

Karl Gregory

Connect to the airlines database made available by Baumer, Kaplan, and Horton (2021) using the following R code:

library(mdsr) 
db <- dbConnect_scidb("airlines") # establish a connection to the database "airlines"
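
As a starting point for choosing an airport, the sketch below lists airports whose FAA code begins with a given letter. It is a minimal sketch under assumptions not stated above: that the connection behaves as a standard DBI connection, and that the database has an `airports` table with columns `faa`, `name`, `lat`, and `lon`. The helper name `airports_starting_with` is hypothetical.

```r
library(DBI)

# Hypothetical helper: list airports whose FAA code starts with `letter`.
# Assumption: an `airports` table with columns faa, name, lat, lon exists.
airports_starting_with <- function(con, letter, n = 20L) {
  sql <- sqlInterpolate(
    con,
    "SELECT faa, name, lat, lon
       FROM airports
      WHERE faa LIKE ?pattern
      ORDER BY faa
      LIMIT ?n",
    pattern = paste0(toupper(letter), "%"),  # e.g. 'G%'
    n = as.integer(n)
  )
  dbGetQuery(con, sql)  # returns a data frame
}
```

With the connection above, a call such as `airports_starting_with(db, "G")` would return candidate airports; `dbListTables(db)` and `dbListFields(db, "airports")` can be used to check the table and column names first.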

1. Destination map

Choose an airport whose FAA code begins with the first letter of your last name (if your last name begins with Q, you may choose any airport). Then find the top ten destinations of flights departing from this airport in 2013, that is, the ten destinations occurring with the greatest frequency among flights departing from your airport. Plot the locations of these airports on a map, overlaying geodesic curves connecting your airport to each destination and labeling the points with the airport codes. At each destination airport, position a circle whose size is proportional to the number of flights departing to that destination. The plot should look similar to the one below:
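
One possible way to get the data behind such a map is sketched below. It assumes a `flights` table with columns `year`, `origin`, and `dest`, and an `airports` table with columns `faa`, `lat`, and `lon`. These column names are assumptions and should be checked against the database. The helper names `top_destinations` and `geodesic_path` are hypothetical. The second helper relies on `gcIntermediate()` from the geosphere package, which is one of several ways to compute great-circle paths.

```r
library(DBI)

# Hypothetical helper: the k most frequent destinations from `origin` in `year`,
# together with their coordinates.
# Assumption: flights(year, origin, dest) and airports(faa, lat, lon).
# The inner join drops destinations that are missing from `airports`.
top_destinations <- function(con, origin, year = 2013L, k = 10L) {
  sql <- sqlInterpolate(
    con,
    "SELECT f.dest, a.lat, a.lon, COUNT(*) AS n_flights
       FROM flights AS f
       JOIN airports AS a ON f.dest = a.faa
      WHERE f.origin = ?origin AND f.year = ?year
      GROUP BY f.dest, a.lat, a.lon
      ORDER BY n_flights DESC
      LIMIT ?k",
    origin = origin, year = as.integer(year), k = as.integer(k)
  )
  dbGetQuery(con, sql)
}

# Hypothetical helper: points along the great-circle path between two
# (longitude, latitude) pairs. Requires the geosphere package to be installed.
geodesic_path <- function(lon1, lat1, lon2, lat2, n = 50) {
  pts <- geosphere::gcIntermediate(c(lon1, lat1), c(lon2, lat2),
                                   n = n, addStartEnd = TRUE)
  data.frame(lon = pts[, 1], lat = pts[, 2])
}
```

For example, `top_destinations(db, "GRR")` would give one row per destination. Calling `geodesic_path()` once per destination yields paths that could be drawn with a line layer, with point size mapped to `n_flights`.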

2. On-time departures by destination

Get the percentage of flights that departed on time to each of these ten destinations and report the results in a table. Your table should look like the one below:

Percentage of flights departing on time to top ten destinations from GRR
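| Destination | Number of flights | Percent departing on time |
|-------------|-------------------|---------------------------|
| DTW         | 2188              | 75.9                      |
| MSP         | 1606              | 73.2                      |
| ORD         | 1539              | 63.5                      |
| DFW         | 1339              | 45.7                      |
| ATL         | 1055              | 68.2                      |
| BWI         | 797               | 62.7                      |
| EWR         | 749               | 43.4                      |
| CVG         | 561               | 89.3                      |
| CLE         | 530               | 63.4                      |
| DEN         | 516               | 53.9                      |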
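
A table of this shape could be produced with a single aggregate query, as in the sketch below. Two assumptions are made here that the assignment does not state: that the `flights` table has a `dep_delay` column, and that "on time" means `dep_delay <= 0`. Flights with a missing `dep_delay` are counted in the denominator but not as on time. The helper name `on_time_by_destination` is hypothetical, and the sketch is not guaranteed to reproduce the numbers above.

```r
library(DBI)

# Hypothetical helper: number of flights and percent departing on time
# for the k most frequent destinations from `origin` in `year`.
# Assumption: flights(year, origin, dest, dep_delay);
# "on time" is taken to mean dep_delay <= 0.
on_time_by_destination <- function(con, origin, year = 2013L, k = 10L) {
  sql <- sqlInterpolate(
    con,
    "SELECT dest,
            COUNT(*) AS n_flights,
            ROUND(100.0 * SUM(CASE WHEN dep_delay <= 0 THEN 1 ELSE 0 END)
                  / COUNT(*), 1) AS pct_on_time
       FROM flights
      WHERE origin = ?origin AND year = ?year
      GROUP BY dest
      ORDER BY n_flights DESC
      LIMIT ?k",
    origin = origin, year = as.integer(year), k = as.integer(k)
  )
  dbGetQuery(con, sql)
}
```

A call such as `on_time_by_destination(db, "GRR")` would return a data frame that can be passed to a table formatter such as `knitr::kable()`.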

3. Bar plot of plane models/manufacturers

Make a plot like the one below summarizing, for your airport, the plane models that depart most often. You may group models with small numbers of departures into an “other” category, as below, so that the bar plot does not become too large. Also indicate, as below, the manufacturer of each plane model. Run the appropriate SQL queries to pull the information from the flights table and the planes table.
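
The join between the two tables might look like the sketch below. It assumes that `flights` and `planes` share a `tailnum` column and that `planes` has `manufacturer` and `model` columns. Restricting to 2013 is also an assumption, made only for consistency with the earlier parts. Both helper names, `model_counts` and `lump_other`, are hypothetical. Note that the year filter is written as `f.year` so that it refers to the flight year even if `planes` has a `year` column of its own.

```r
library(DBI)

# Hypothetical helper: departures from `origin` per plane model.
# Assumption: flights(year, origin, tailnum) and
# planes(tailnum, manufacturer, model).
model_counts <- function(con, origin, year = 2013L) {
  sql <- sqlInterpolate(
    con,
    "SELECT p.manufacturer, p.model, COUNT(*) AS n_flights
       FROM flights AS f
       JOIN planes AS p ON f.tailnum = p.tailnum
      WHERE f.origin = ?origin AND f.year = ?year
      GROUP BY p.manufacturer, p.model
      ORDER BY n_flights DESC",
    origin = origin, year = as.integer(year)
  )
  res <- dbGetQuery(con, sql)
  res$n_flights <- as.numeric(res$n_flights)  # plain numeric for plotting
  res
}

# Hypothetical helper: keep the `keep` most common models and
# collapse the remaining rows into a single "Other" row.
lump_other <- function(counts, keep = 8L) {
  counts <- counts[order(-counts$n_flights), ]
  if (nrow(counts) <= keep) return(counts)
  top <- counts[seq_len(keep), ]
  rest <- counts[-seq_len(keep), ]
  rbind(top, data.frame(manufacturer = "Other", model = "Other",
                        n_flights = sum(rest$n_flights)))
}
```

The result of `lump_other(model_counts(db, "GRR"))` has one row per bar, with `manufacturer` available for coloring the bars.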

References

Baumer, Benjamin S., Daniel T. Kaplan, and Nicholas J. Horton. 2021. Modern Data Science with R. 2nd ed. Boca Raton: Chapman and Hall/CRC Press. https://www.routledge.com/Modern-Data-Science-with-R/Baumer-Kaplan-Horton/p/book/9780367191498.