SQL in Python: Analyzing CIA Factbook Data
Learning to use the sqlite3 module in Python and practicing SQL
Garrett Mayock posted 2019-05-30 21:33:53 UTC
I'm currently earning the "Data Analyst – Python" certification through Dataquest.io and part of their process is including a guided project for each course section. I'm currently reviewing SQL for some interviews so I've jumped ahead in the Dataquest training to review their SQL sections. I do use MySQL in this website and have completed the official 40-hour Microsoft Certified "Querying Microsoft SQL Server" course, so it's more a refresher than anything.
This project is done to show proof that I can create and execute simple and nested SQL queries to use in Python. To show this, I perform some basic analysis on data from the CIA Factbook.
The lesson and this project cover:
- Connecting to .db files using sqlite3, and executing queries using pandas
- Basic queries using SELECT, FROM, and LIMIT clauses
- Using and nesting subqueries
- Filtering using WHERE with single or multiple criteria (AND/OR as well as nested AND/ORs)
- GROUP BY and HAVING clauses
- ORDER BY clauses and the DESC keyword
- AGGREGATE functions such as MAX, MIN, SUM, AVG, COUNT
- Selecting DISTINCT values and returning substrings using SUBSTR()
- Performing arithmetic in SQL queries
Find the data and IPython Notebook in my GitHub repository, linked below.
Source Data for factbook.db: