Hands-on Day

Quiz: Files and String Formatting

Overview

Today, we will do a 50-minute challenge quiz.

We examine a data set which shows populations for each state, broken down by reported race as of the 2010 census. The original source data can be found at census.gov; we have saved the data for use in this lab as a comma-separated values file:

raceByState.csv

This file uses the same format as the recent homework question, except that an additional line has been added at the beginning of the file that provides text labels for the columns.
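
As a starting point, here is a minimal sketch of reading such a file while treating the label line separately. It assumes each data line holds the state name followed by integer counts; the exact column order comes from the homework format and is not restated here. The name read_rows is only illustrative.

```python
import csv

def read_rows(filename):
    """Return (labels, rows): the header labels and one list per state."""
    with open(filename, newline="") as source:
        reader = csv.reader(source)
        labels = next(reader)            # first line holds the column labels
        rows = []
        for record in reader:
            if not record:               # skip blank lines, if any
                continue
            name = record[0]
            # Assumption: every field after the name is an integer count.
            counts = [int(field) for field in record[1:]]
            rows.append([name] + counts)
    return labels, rows
```

Calling read_rows('raceByState.csv') would then give the labels and the per-state data as separate values, so the header never gets mistaken for a state.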


Your Goal

You are to write a script that reads the raw data from raceByState.csv, analyzes it, and then produces a file report.txt that summarizes the data in the following form:

State                     Total      White              Black             Native              Asian            Pacific          Multirace
Alabama                 4779736    3362877 (70.4%)    1259224 (26.3%)      32903 ( 0.7%)      55240 ( 1.2%)       5208 ( 0.1%)      64284 ( 1.3%)
Alaska                   710231     483873 (68.1%)      24441 ( 3.4%)     106268 (15.0%)      38882 ( 5.5%)       7662 ( 1.1%)      49105 ( 6.9%)
Arizona                 6392017    5418483 (84.8%)     280905 ( 4.4%)     335278 ( 5.2%)     188456 ( 2.9%)      16112 ( 0.3%)     152783 ( 2.4%)
Arkansas                2915918    2342403 (80.3%)     454021 (15.6%)      26134 ( 0.9%)      37537 ( 1.3%)       6685 ( 0.2%)      49138 ( 1.7%)
...

The percentages shown are calculated relative to the total state population. Please note the following aspects about the report:


Advice

  1. Take advantage of loops! Not just for the obvious processing of 50 states, but also because there are six racial categories and you want to do the same things for each.

  2. Remember that when writing to a file, you do not need to compose and write an entire line at once. You can call the file's write() method many times (for example, once for each individual field of a line).

  3. Remember that if you are using %-operator string formatting, you need to use the pattern %% to get an actual % sign in the string.

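  4. A summary of styles for left/right justification, using a field width of 9:
                      left      right
    %-operator        %-9d      %9d
    format method     {:<9}     {:>9}
    f-string          {:<9}     {:>9}
    (A bare {:9} left-justifies strings but right-justifies numbers, so an explicit < or > is the safer choice.)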
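
The advice above can be combined in one short sketch. It assumes a row is a list holding the state name, the total, and then the six category counts, and it uses the column widths of 20 and 9 chosen for this report. The name write_row and the in-memory buffer are illustrative only; your script would write to report.txt instead.

```python
import io

def write_row(out, row):
    """Write one report line: name, total, then 'count (pct%)' per category."""
    name, total = row[0], row[1]
    out.write('%-20s' % name)              # left-justified state name
    out.write('  %9d' % total)             # right-justified total
    for count in row[2:]:                  # same steps for every category
        pct = 100 * count / total          # share of the state total
        out.write('  %9d (%4.1f%%)' % (count, pct))   # %% gives a literal %
    out.write('\n')

buffer = io.StringIO()                     # stands in for open('report.txt', 'w')
write_row(buffer, ['Alabama', 4779736, 3362877, 1259224, 32903, 55240, 5208, 64284])
print(buffer.getvalue(), end='')
```

The same fields could be written with f-strings instead, for example f'{name:<20}' for the name and f'{count:>9} ({pct:4.1f}%)' for each category, where no doubling of the % sign is needed.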


Submission


Additional Challenges

Still have time? We dictated the column widths of 20 and 9 by examining the actual data set to see what the longest state name was and the maximum width of a number (or column label). A better approach is to make a first pass through the data to determine the minimum width required for each individual column, and then make a second pass to generate the output using those widths.
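
One possible shape for the two-pass idea is sketched below. It assumes the data is already in memory as a list of labels and a list of rows of the same length, and it leaves out the percentage fields to stay short. The name column_widths and the tiny sample data are illustrative, not part of the assignment.

```python
def column_widths(labels, rows):
    """First pass: width of the widest entry (label or value) per column."""
    widths = [len(label) for label in labels]
    for row in rows:
        for i, value in enumerate(row):
            widths[i] = max(widths[i], len(str(value)))
    return widths

labels = ['State', 'Total', 'White']
rows = [['Alabama', 4779736, 3362877], ['Alaska', 710231, 483873]]
widths = column_widths(labels, rows)

# Second pass: names left-justified, numbers right-justified, with the
# computed widths supplied through nested replacement fields.
for row in rows:
    cells = [f'{row[0]:<{widths[0]}}']
    cells += [f'{value:>{w}}' for value, w in zip(row[1:], widths[1:])]
    print('  '.join(cells))
```

Because the widths are computed from the data, the report would still line up if a longer state name or a larger count were added to the file.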


Michael Goldwasser
Last modified: Wednesday, 17 October 2018