Run-Through Example
One of the data sources I worked on for the project was public documentation from the Oregon Department of Education (ODE) detailing how state funds are allocated to school districts by student type. The documents came as structured PDFs. The primary ‘Data Sciency’ task I undertook during the capstone was scraping 14 PDFs of 233 pages each.
In this section, I will walk through an example of scraping the data for one district. All of the functions used are in the code/ folder.
Let’s start by looking at one PDF page.
Let’s see the intermediate steps for transforming the data.
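To give a feel for what such a transformation step can look like, here is a minimal Python sketch. It is not the project's actual code from the code/ folder. It assumes the page text has already been extracted into a list of lines, and that each data row is a label followed by a single trailing number. The name `parse_page_lines`, the row layout, and the sample lines are all made up for illustration.

```python
import re

# Assumed (hypothetical) row layout: a text label, some whitespace,
# then one trailing number that may contain thousands separators.
ROW_PATTERN = re.compile(r"^(?P<label>.+?)\s+(?P<value>-?\d[\d,]*(?:\.\d+)?)$")


def parse_page_lines(lines):
    """Turn raw text lines from one page into a list of label/value records."""
    records = []
    for line in lines:
        match = ROW_PATTERN.match(line.strip())
        if match is None:
            # Skip headers, blank lines, and anything without a trailing number.
            continue
        value = float(match.group("value").replace(",", ""))
        records.append({"label": match.group("label"), "value": value})
    return records


# Made-up sample lines standing in for the text of one page.
sample_page = [
    "Example School District",
    "Student type A    1,234.50",
    "",
    "Student type B    87",
]
records = parse_page_lines(sample_page)
```

In this sketch, the header and the blank line are dropped and the two data rows become records with numeric values. A real page would likely need extra rules, for example for header lines that happen to end in a number.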
We then repeat this process across all districts and all years.
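The repetition itself can be sketched as a nested loop. The Python example below is an illustration only, not the project's code: `scrape_all`, `toy_parser`, and the `documents` dictionary are hypothetical, and the sketch assumes each document corresponds to one year and has already been split into pages of text lines.

```python
def scrape_all(documents, parse_page):
    """Apply parse_page to every page of every document and stack the results."""
    rows = []
    for year, pages in documents.items():
        for page_number, page in enumerate(pages, start=1):
            for record in parse_page(page):
                # Tag each record with where it came from.
                rows.append({"year": year, "page": page_number, **record})
    return rows


def toy_parser(page):
    """Stand-in for a real page parser: just counts the lines on the page."""
    return [{"n_lines": len(page)}]


# Made-up stand-in for extracted text: two "years", three pages in total.
documents = {
    "year_1": [["a", "b"], ["c"]],
    "year_2": [["d"]],
}
rows = scrape_all(documents, toy_parser)
```

Keeping the per-page parser as an argument means the single-page logic can be checked on one page first, then reused unchanged over every document.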