Measurement: Unit II Summary and Major Assignments

Summary and Learning Objectives

Unit II has focused on the technical and interpretive skills needed to translate records into custom measures that describe the units of analysis referenced in the data, such as people, places, and things, and to communicate those measures both statistically and visually.

Technical Skills

  • Creating aggregate measures;
  • Merging data sets on shared key variables;
  • Importing, manipulating, and mapping spatial data in the form of shapefiles (.shp);
  • Creating various advanced visualizations for two or more variables.
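
As a minimal sketch of the first two skills, the example below aggregates record-level data into a neighborhood-level measure and merges it with a second table on a shared key. The data frames `records` and `neighborhoods`, their column names, and all values are invented for illustration. Base R's `aggregate()` and `merge()` stand in here for the sqldf queries used in the unit, so that the sketch runs without any additional packages.

```r
# Hypothetical record-level data: one row per service request.
records <- data.frame(
  request_id = 1:6,
  nbhd_id    = c("A", "A", "B", "B", "B", "C"),
  days_open  = c(2, 4, 1, 3, 5, 10)
)

# Hypothetical neighborhood-level table that shares the key nbhd_id.
neighborhoods <- data.frame(
  nbhd_id    = c("A", "B", "C"),
  population = c(1000, 3000, 500)
)

# Aggregate: number of requests and mean days open per neighborhood.
n_requests <- aggregate(request_id ~ nbhd_id, data = records, FUN = length)
names(n_requests)[2] <- "n_requests"
mean_days <- aggregate(days_open ~ nbhd_id, data = records, FUN = mean)
names(mean_days)[2] <- "mean_days_open"

# Merge the two aggregates, then attach them to the neighborhood table.
nbhd_measures <- merge(n_requests, mean_days, by = "nbhd_id")
nbhd_measures <- merge(neighborhoods, nbhd_measures, by = "nbhd_id", all.x = TRUE)

# A custom measure: requests per 1,000 residents.
nbhd_measures$requests_per_1000 <-
  1000 * nbhd_measures$n_requests / nbhd_measures$population

print(nbhd_measures)
```

Using `all.x = TRUE` in the final merge keeps neighborhoods that have no records, which then appear with missing values instead of silently dropping out of the aggregate table.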

The packages we have learned include:

  • sqldf for querying data frames and creating aggregations;
  • sf for working with spatial data as “simple features”;
  • ggmap for creating maps of spatial data with ggplot2;
  • gridExtra for creating visualizations that include two or more individual graphics;
  • streamgraph for creating graphs of changes in counts across categories over time;
  • ggcorrplot for creating correlograms of multiple correlations between variables;
  • gganimate and gifski for making data animations.

Interpretive Skills

We have learned how to address the “missing ingredients” of a naturally occurring data set, including:

  • Specifying the unit of analysis available from a schema;
  • Defining constructs of interest and isolating the information needed to measure them;
  • Identifying sources of bias and establishing validity;
  • Describing the structure and organization of spatial data;
  • Evaluating the best visualization technique for communicating a particular piece of information.

Unit-Level Assignments

Community Experience Assignment

The community exploration assignments in this book are designed to align the skills you have learned with real-world contexts. They are most useful in conjunction with the Exploratory Data Assignments at the end of each chapter, especially when you have been working through them with a single data set. They provide an opportunity to “ground truth,” that is, to evaluate the assumptions and objectives that have guided your analysis thus far. There will be one in each unit. These can also be combined with a service-learning or capstone-oriented course.

This second community exploration assignment will focus on what we can learn about a neighborhood through the lens of public media. Please:

  1. Select a neighborhood based on something notable about the measure(s) that you are developing, e.g., the highest or lowest value, an interesting combination of values, etc.
  2. Explore how this neighborhood is represented and portrayed through news articles, web sites, blogs, community organizations, social media, and other online resources.
  3. Write a 3-5 page virtual walk memo explaining why you chose to visit this place, what you discovered, and what this tells you about the interpretation of your data. The memo should include illustrative images from the media you used.

Post-Unit Assignment: Constructing a Novel Metric from Raw Data

This unit of the book has focused on using a theoretical concept to guide the construction of one or more novel metrics from our raw data sets, describing a particular aggregate unit of analysis (e.g., parcel, restaurant, neighborhood), and then communicating the distribution of those metrics. The assignment culminates in a paper organized as a series of short sections:

  • An overview of the measure and why it is interesting. About one to two paragraphs.
  • A textual description of how the measure was constructed, justifying any specific decisions that were made (for example, categorization of case types). This will include:
    • A summary of new variables at the record level (i.e., the original database) that were constructed first as part of the overall calculation of your aggregate measure(s).
    • A summary of the new aggregate measure(s) that you’ve calculated and how you have done so.
  • A short description of the new variable’s distribution and/or values. What is the mean? Where is it highest? Is there anything else fun or interesting? This should include at least one graphical visualization (e.g., a histogram) and a map, if appropriate.
  • An appendix with annotated R syntax documenting all steps required to create the measure(s) (this should be your portion of the R syntax, copied and pasted; see below).
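
The distribution description requested above might begin from a sketch like the following. The data frame `nbhd_measures`, its column `requests_per_1000`, and all values are invented placeholders; substitute the aggregate measure you have constructed. Base R graphics are used for brevity, and any of the visualization tools from this unit could take their place.

```r
# Hypothetical aggregate measure: one row per neighborhood.
nbhd_measures <- data.frame(
  nbhd_id           = c("A", "B", "C", "D", "E"),
  requests_per_1000 = c(2.0, 1.0, 4.5, 3.0, 2.5)
)

# Central tendency and spread of the measure.
measure_mean <- mean(nbhd_measures$requests_per_1000)
measure_sd   <- sd(nbhd_measures$requests_per_1000)
print(summary(nbhd_measures$requests_per_1000))

# Which neighborhoods have the highest and lowest values?
highest <- nbhd_measures$nbhd_id[which.max(nbhd_measures$requests_per_1000)]
lowest  <- nbhd_measures$nbhd_id[which.min(nbhd_measures$requests_per_1000)]

# Histogram of the distribution.
hist(nbhd_measures$requests_per_1000,
     main = "Requests per 1,000 residents",
     xlab = "Requests per 1,000 residents")
```

Comments like these, placed above each step, are one way to produce the annotated syntax expected in the appendix.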

Suggested Rubric (Total 10 pts.)

Measurement-Concept: 3 pts.

Measurement-Execution: 2.5 pts.

Visualization: 2.5 pts.

Details (Grammar, etc.): 2 pts.