Module Description
The visualization module helps students understand the differences between correlational and experimental research, and it uncovers the possibilities and pitfalls of using self-collected and external datasets. On the technical side, it supports the implementation of simple functions, helps students recognize which visualization to choose and plot based on the relationships between variables, and uses visual channels to display variables. It provides conceptual foundations for how tables are structured and how to apply methods to filter, access, aggregate, and disaggregate information from datasets. The module also supports students in learning how to read errors and debug their own code. While using their computational skills, students will analyze the difference between validity and reliability and draw connections between theory and research. They will identify the drawbacks of conclusions drawn from observational data within social science contexts, explore dependencies between variables, and interpret data provenance, positionality, and intersectionality.
Throughout the module, students will learn the following content topics:
Introduction to Visualizations
Visualization types and their purposes
Variable Types
Displaying Categorical Variables
Data representation in Visualization
Skewness
Overplotting
Statistics
Percentiles
Quartiles
Measures of Spread
Interquartile Range
Range
Measures of Center
Median
Mean
Skewness
Variability
Simpson’s Paradox
Aggregation vs. Disaggregation
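The measures of center and spread listed above can be illustrated with a minimal sketch. It uses only Python's standard `statistics` module rather than the tools taught in the module, and the `hours` data is a made-up example. Quartile definitions vary between libraries, so the "inclusive" method used here is one convention among several and may not match the one the course uses.

```python
import statistics

# Hypothetical sample: hours of sleep reported by ten students.
hours = [5, 6, 6, 7, 7, 7, 8, 8, 9, 12]

# Measures of center.
mean = statistics.mean(hours)
median = statistics.median(hours)

# Quartiles: three cut points that split the sorted data into four parts.
# method="inclusive" treats the data as the whole population of interest.
q1, q2, q3 = statistics.quantiles(hours, n=4, method="inclusive")

# Measures of spread.
iqr = q3 - q1                          # interquartile range
data_range = max(hours) - min(hours)   # range

print("mean:", mean, "median:", median)
print("quartiles:", q1, q2, q3)
print("IQR:", iqr, "range:", data_range)

# A mean above the median hints at right skew:
# the single large value (12) pulls the mean upward.
print("mean > median:", mean > median)
```

Here the one unusually large value moves the mean and the range much more than it moves the median and the interquartile range, which is why the latter pair is often preferred for skewed data.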
Visualization Types and Methods
Scatter Plots
Line Plots
Horizontal Bar Chart
Histograms
Bin Sizes, Density and Area Principle
Box and Whisker Plots
Using .group(), .barh(), .hist(), .plot(), np.arange()
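The relationship between bin sizes, density, and the area principle can be worked out by hand. The following is a plain-Python sketch that does not call `.hist()`; the `times` data, the bin `edges`, and the helper `histogram_density` are all hypothetical names chosen for illustration. It assumes each bin includes its left edge and excludes its right edge.

```python
# Hypothetical data: commute times in minutes.
times = [3, 7, 8, 12, 14, 15, 18, 22, 35, 50]

# Bin edges with unequal widths; the last bin is deliberately wide.
edges = [0, 10, 20, 30, 60]


def histogram_density(values, edges):
    """Return (left, right, count, density) for each bin [left, right)."""
    n = len(values)
    rows = []
    for left, right in zip(edges, edges[1:]):
        count = sum(1 for v in values if left <= v < right)
        proportion = count / n
        # Density is the proportion per unit of bin width, so that
        # bar area (width * height) equals the proportion in the bin.
        density = proportion / (right - left)
        rows.append((left, right, count, density))
    return rows


rows = histogram_density(times, edges)
for left, right, count, density in rows:
    print(f"[{left}, {right}): count={count}, density={density:.4f}")

# Area principle: the bar areas add up to 1 (up to rounding).
total_area = sum((right - left) * density for left, right, _, density in rows)
print("total area:", round(total_area, 6))
```

The wide last bin holds more values than the bin before it, yet its bar is shorter, because under the area principle it is the area of a bar, not its height, that represents how much data falls in the bin.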
Table Interpretation and Manipulations
Extracting rows from tables with .where() and .take()
Filtering via .where()
Chaining .where()
Using .group()
with np.average
with aggregation and .select()
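The ideas behind filtering, chaining filters, and grouping with an average can be sketched in plain Python. This is not the table library's API: the helpers `where` and `group_average` are hypothetical stand-ins written for illustration, and the `rows` data is invented.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical table, stored as a list of row dictionaries.
rows = [
    {"dept": "A", "year": 1, "score": 80},
    {"dept": "A", "year": 2, "score": 90},
    {"dept": "B", "year": 1, "score": 60},
    {"dept": "B", "year": 2, "score": 70},
    {"dept": "B", "year": 2, "score": 74},
]


def where(rows, column, predicate):
    """Keep only the rows whose value in `column` satisfies `predicate`."""
    return [r for r in rows if predicate(r[column])]


def group_average(rows, key, value):
    """Group rows by `key` and average the `value` column in each group."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[key]].append(r[value])
    return {k: mean(v) for k, v in groups.items()}


# Filtering, then chaining a second filter on the result.
year2 = where(rows, "year", lambda y: y == 2)
dept_b_year2 = where(year2, "dept", lambda d: d == "B")

# Aggregated view: one average over every row.
overall = mean(r["score"] for r in rows)

# Disaggregated view: one average per department.
by_dept = group_average(rows, "dept", "score")

print("rows for dept B in year 2:", len(dept_b_year2))
print("overall average:", overall)
print("average by dept:", by_dept)
```

Comparing `overall` with `by_dept` shows why aggregation and disaggregation can tell different stories: the single overall average hides the gap between the two departments.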