Discovery: Unit III Summary and Major Assignments

Summary and Learning Objectives

Unit III has focused on the technical and interpretive skills associated with inferential statistics, including the ability to generate and evaluate evidence for hypotheses regarding the characteristics of individual variables and relationships between multiple variables. We focused especially on the application of these tools to questions of equity.

Technical Skills

  • Conduct and interpret multiple statistical tests, including:
    • Correlations, which evaluate the strength of association between two numerical variables;
    • t-tests, which compare the mean of a numerical variable between two groups;
    • ANOVA tests, which compare the mean of a numerical variable across three or more groups;
    • Regressions, which use one or more independent variables (numerical or categorical) to predict the values of a single, numerical dependent variable;
  • Create visualizations illustrating the results of these statistical tests, including:
    • Matrices representing the distributions of a set of variables and the correlations between them;
    • Bar charts that represent the differences between groups;
    • Dot plots with the linear relationship between the variables drawn as a fitted line.
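
As a quick refresher, the four tests listed above can be sketched in base R. This is a minimal illustration rather than a prescribed workflow: it assumes R's built-in mtcars data as a stand-in for your own data set, and the variables chosen (mpg, wt, am, cyl) are only placeholders.

```r
# Built-in mtcars data stands in for course data; variable choices are illustrative.
cars <- mtcars
cars$am  <- factor(cars$am, labels = c("automatic", "manual"))  # two groups
cars$cyl <- factor(cars$cyl)                                    # three groups

# Correlation: strength of association between two numerical variables
cor_res <- cor.test(cars$mpg, cars$wt)

# t-test: compares the mean of mpg between two groups
t_res <- t.test(mpg ~ am, data = cars)

# ANOVA: compares the mean of mpg across three groups
aov_res <- aov(mpg ~ cyl, data = cars)
aov_tab <- summary(aov_res)[[1]]

# Regression: numerical and categorical predictors of one numerical outcome
lm_res <- lm(mpg ~ wt + am, data = cars)

cor_res$estimate              # Pearson's r
cor_res$p.value               # significance of the correlation
t_res$estimate                # the two group means
aov_tab                       # F statistic and p-value
summary(lm_res)$coefficients  # estimates, standard errors, p-values
```

Each result object holds both the estimate and its significance, so the two can be reported together.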

The packages we have learned include:

  • Hmisc, which includes a sophisticated way to test multiple correlations simultaneously;
  • GGally for generating a matrix that visualizes the distributions of a set of variables and the correlations between them;
  • reshape2 for reorganizing data to realize a distinct structure required for a specific analysis or visualization;
  • QuantPsyc, which includes a function for calculating the standardized betas from linear models.
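
The sketch below shows one function from each of these packages. It assumes the packages are already installed and again uses the built-in mtcars data as a placeholder. The functions shown (rcorr, melt, lm.beta, ggpairs) are the ones most likely meant by the descriptions above, but treat that mapping as an assumption.

```r
# Assumes Hmisc, reshape2, and QuantPsyc are installed, e.g. via install.packages().
# The pkg::function form shows which package each function comes from.
vars <- mtcars[, c("mpg", "wt", "hp")]

# Hmisc::rcorr takes a numeric matrix and returns correlations (r) and p-values (P)
rc <- Hmisc::rcorr(as.matrix(vars))
rc$r
rc$P

# reshape2::melt reorganizes wide data into long form (one row per car and variable)
vars$car <- rownames(mtcars)
long <- reshape2::melt(vars, id.vars = "car")
head(long)

# QuantPsyc::lm.beta returns standardized betas from a fitted linear model
fit <- lm(mpg ~ wt + hp, data = mtcars)
betas <- QuantPsyc::lm.beta(fit)
betas

# GGally::ggpairs(mtcars[, c("mpg", "wt", "hp")]) would draw the matrix of
# distributions and correlations; it is left commented out because it opens a plot.
```

A standardized beta is the raw coefficient multiplied by the ratio of the predictor's standard deviation to the outcome's, which puts predictors measured in different units on a comparable scale.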

Interpretive Skills

  • Conduct and interpret inferential statistical tests, with attention to:
    • The logic of using samples to reach generalizable conclusions;
    • The reporting and interpretation of both effect size and significance;
  • Differentiate between inequalities and inequities, and design and interpret analyses intended to explain the mechanisms driving the latter;
  • Conceptualize independent and dependent variables and build the appropriate model with the appropriate statistical tool when generating and testing hypotheses.
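
To see why effect size and significance are reported together, consider the following sketch. It uses simulated data, not data from this book, and Cohen's d as one common effect-size measure, which is an illustrative choice. With a very large sample, a difference that is negligible in practical terms can still be highly significant.

```r
# Simulated data: two groups whose true means differ by only 0.05 standard deviations.
set.seed(1)
n <- 100000
group_a <- rnorm(n, mean = 0,    sd = 1)
group_b <- rnorm(n, mean = 0.05, sd = 1)

# Significance: with this many observations, the tiny difference is easily detected
res <- t.test(group_a, group_b)

# Effect size: Cohen's d, the mean difference divided by the pooled standard deviation
pooled_sd <- sqrt((var(group_a) + var(group_b)) / 2)
d <- (mean(group_b) - mean(group_a)) / pooled_sd

res$p.value  # very small: the difference is statistically significant
d            # close to 0.05: the difference is negligible in magnitude
```

The p-value indicates whether a difference is likely to generalize beyond the sample. The effect size indicates whether that difference is large enough to matter, which is often the more important question for equity.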

Unit-Level Assignments

Community Experience Assignment

The community exploration assignments in this book are designed to align the skills you have been learning with real-world contexts. They are most useful in conjunction with the Exploratory Data Assignments at the end of each chapter, especially when you have been working through them with a single data set. They provide an opportunity to “ground truth,” or test against real-world observation, the assumptions and objectives that have guided your analysis thus far. There is one in each unit. These can also be combined with a service-learning or capstone-oriented course.

This third community exploration assignment will return to the direct experience of a neighborhood. Please:

  1. Select a neighborhood based on something notable in your analyses regarding the direction you anticipate for your final project, with an eye towards providing “ground truth” relative to something that has intrigued or challenged you. You might also consider how this particular “ground truthing” exercise will enable you to evaluate the potential impact or public value of your analyses.
  2. Visit and explore this neighborhood either in person or through whatever virtual tools you find useful (including Google StreetView, BostonMap, public media, etc.), seeking to observe how the characteristics of the data and your analyses manifest themselves in the real world. (Note: A visit that lasts less than a half-hour is unlikely to generate enough observations to support a high-quality set of insights.)
  3. Write a 3-5 page memo describing the logic for why you visited this place, what you discovered, and what this tells you about the interpretation of your data. This last part should include or be followed by a broader discussion of how your perspective on the relevance of your work for communities has evolved as you have progressed through this book or when analyzing these data in general. The memo should include images from your exploration and maps with data describing the region.

Post-Unit Assignment: A Full Research Study

This unit of the book has focused on using inferential statistics to “discover” relationships between variables, especially focusing on how we identify questions and subsequent analyses that are meaningful and can have public impact. This builds upon our previous efforts to reveal basic information and to create custom measures of interest by enabling us to formally test the hypotheses that naturally follow. This paper will bring this work together in the format of a public report, consisting of:

  • A brief Executive Summary that details the main points of your analysis and findings. Think of the audience for this being someone who would benefit from the insights but might not have the time to read the whole paper or the desire to wade through methods.
  • An Introduction that describes the conceptual inspiration for your analysis and why it might be interesting, both conceptually and practically. You might also include hypotheses if appropriate. The Introduction should include a few citations to fully justify and motivate your analyses.
  • A brief Data & Methods section that describes the content of your data set, how you calculated any new measures, and other data sources you used. This is not your complete Methods section, but just enough for a reader to be able to understand the content that follows. It should reference the Appendix (see below).
  • A Results & Discussion section, broken up into one or more sub-sections, where you describe your analyses. This will include (1) descriptive statistics (e.g., what is the distribution of a critical measure across the city, what is the average, maximum, etc.), (2) inferential statistical tests, and (3) illustrative visualizations. This part of the paper should strive to tell one or more interesting stories.
    • It does not need to be titled Results & Discussion. The title of the section and sub-sections should capture what you discovered.
  • A Conclusion that briefly interprets what you’ve found and suggests implications for research and policy.
  • A Methodology Appendix that describes the content of your data set and then summarizes how you calculated any new measures and from where you accessed other measures. Keep in mind that this section should not include detail for detail’s sake but should provide the information necessary for an expert to (a) fully understand what you did and (b) replicate the work if they so desired.

Suggested Rubric (Total 12 pts.)

Executive Summary: 1.5 pts.

Introduction: 1.5 pts.

Data & Methods: 0.5 pts.

Results & Discussion: 3 pts.

Conclusion: 0.5 pts.

Methodology Appendix: 2 pts.

Visuals: 1.5 pts.

Details: 1.5 pts.