Today, we will do a 50-minute challenge quiz.
We examine a data set which shows populations for each state, broken
down by reported race as of the 2010 census. The original source data
can be found at census.gov;
we have saved the data for use in this lab as a comma-separated values
file:
raceByState.csv
This file uses the same format as the recent homework question, except that
an additional line has been added at the beginning of the file that
provides text labels for the columns.
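As a minimal sketch of handling that extra first line, you can read it separately before processing the state rows. The snippet below uses an in-memory stand-in for raceByState.csv; the column labels and the shortened three-number layout are assumptions made for illustration, not the real file contents.

```python
import csv
import io

# Hypothetical stand-in for raceByState.csv. The header labels and the
# reduced number of columns are assumptions for this illustration.
sample = io.StringIO(
    "State,Total,White,Black\n"
    "Alabama,4779736,3362877,1259224\n"
)

reader = csv.reader(sample)
labels = next(reader)  # first line holds the text labels, not data

# Remaining lines: keep the state name as text, convert counts to int.
rows = [[row[0]] + [int(x) for x in row[1:]] for row in reader]
```

With the real file, you would replace the `io.StringIO` object with `open("raceByState.csv", newline="")`.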
You are to write a script that reads the raw data from raceByState.csv, analyzes it, and then produces a file report.txt that summarizes the data in the following form:
State Total White Black Native Asian Pacific Multirace
Alabama 4779736 3362877 (70.4%) 1259224 (26.3%) 32903 ( 0.7%) 55240 ( 1.2%) 5208 ( 0.1%) 64284 ( 1.3%)
Alaska 710231 483873 (68.1%) 24441 ( 3.4%) 106268 (15.0%) 38882 ( 5.5%) 7662 ( 1.1%) 49105 ( 6.9%)
Arizona 6392017 5418483 (84.8%) 280905 ( 4.4%) 335278 ( 5.2%) 188456 ( 2.9%) 16112 ( 0.3%) 152783 ( 2.4%)
Arkansas 2915918 2342403 (80.3%) 454021 (15.6%) 26134 ( 0.9%) 37537 ( 1.3%) 6685 ( 0.2%) 49138 ( 1.7%)
...
The percentages shown are calculated relative to the total state population. Please note the following aspects about the report:
Take advantage of loops! Not just for the obvious processing of 50 states, but also because there are six racial categories and you want to do the same things for each.
Remember when writing to a file, you do not need to compose and write an entire line at the same time! You can call the write() function many times (for example, for each individual field of a line).
Remember that if you are using %-operator string formatting, you need to use the pattern %% to get an actual % sign in the string.
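The three hints above can be combined in one short sketch: a loop over the six categories, one write() call per field, and %% for the literal percent sign. The numbers are Alabama's figures from the sample report. The field widths (20 for the state name, 9 for each number) and the exact spacing between fields are assumptions, and an in-memory `io.StringIO` stands in for the real report.txt.

```python
import io

# Alabama's figures from the sample report: total, then six categories.
state = "Alabama"
total = 4779736
counts = [3362877, 1259224, 32903, 55240, 5208, 64284]

out = io.StringIO()  # stand-in for open("report.txt", "w")

out.write("%-20s" % state)  # state name, left-justified (assumed width 20)
out.write("%9d" % total)    # total, right-justified (assumed width 9)

# Same treatment for every category, so use a loop.
for count in counts:
    pct = 100.0 * count / total
    # %% produces a literal percent sign in the output.
    out.write("%9d (%4.1f%%)" % (count, pct))

out.write("\n")
line = out.getvalue()
```

The `%4.1f` pattern pads single-digit percentages with a leading space, which is how values such as `( 0.7%)` line up with `(70.4%)`.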
A summary of styles for left/right justification:
| | left | right |
|---|---|---|
| %-operator | %-9d | %9d |
| format method | {:<9} | {:>9} |
| f-string | {:<9} | {:>9} |
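To compare the three styles side by side, here is a small sketch using Alaska's name and total from the sample report. The trailing `|` is added only to make the padding visible. Note that a bare width such as `{:9}` left-justifies strings but right-justifies numbers, so spelling out `<` or `>` avoids surprises.

```python
name, n = "Alaska", 710231

# %-operator
left_pct = "%-9s|" % name
right_pct = "%9d|" % n

# format method
left_fmt = "{:<9}|".format(name)
right_fmt = "{:>9}|".format(n)

# f-string
left_f = f"{name:<9}|"
right_f = f"{n:>9}|"

# With no alignment character, the default depends on the value's type.
default_str = "{:9}|".format(name)  # strings default to left
default_int = "{:9}|".format(n)     # numbers default to right
```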
One member of the pair must submit your project electronically, placing it in the quiz14 folder of their git repository.
Submit source code titled makeReport.py.
Make sure that BOTH students' names are in comments at the top of the source code.
Submit a copy of the file report.txt that your script generates.
Still have time? We dictated the column widths of 20 and 9 by examining the actual data set to see what the longest state name was and the maximum width of a number (or column label). A better approach is to do a first pass through the data and determine the minimum width requirement for each individual column, and then do a second pass to generate the output using those widths.
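One possible shape for the two-pass idea is sketched below on a tiny table taken from the sample report. It is a simplified illustration under stated assumptions: the cells are already strings, the percentage fields are left out, and the names `rows`, `widths`, and `lines` are hypothetical.

```python
# A few cells from the sample report, kept as strings for simplicity.
rows = [
    ["State", "Total", "White"],
    ["Alabama", "4779736", "3362877"],
    ["Alaska", "710231", "483873"],
]

# First pass: find the widest entry in each column (labels included).
widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]

# Second pass: left-justify the state name, right-justify the numbers.
lines = []
for row in rows:
    cells = [row[0].ljust(widths[0])]
    cells += [cell.rjust(w) for cell, w in zip(row[1:], widths[1:])]
    lines.append(" ".join(cells))
```

Because the widths come from the data, the same code keeps the columns aligned if a longer state name or a larger number ever appears.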