Basic Quantitative Toolkit - SPARK: a centre for social research innovation

Spurious Correlations: correlation is not causation

A clear illustration of an important point: even when two variables change together, that alone does not show that one causes the other.

Data Wrangling in Stata: Introduction and Review

Learn the fundamentals of Stata syntax and apply them to data wrangling

Primary vs Secondary Data: 15 Key Differences

Compare primary and secondary data to learn what kinds of conclusions each allows you to draw.

Hypothesis Testing – Analysis of Variance (ANOVA)

This module explains the theory behind the ANOVA and how to conduct one by hand

Codecademy- Learn R

A slow and steady introduction to data analysis and visualisation in R

Stata for Students: Descriptive Statistics

Unsure how best to characterise your data in Stata? This resource is a definitive guide on several descriptive analyses relevant to social science research.

Merging Datasets in R

Joining multiple datasets is essential when working with data from multiple sources. This tutorial explains how to merge data in R in a way that preserves as much information as possible

R for Data Science- Wrangle

This is the definitive guide to data cleaning in R, with chapters organised by the different operations you can carry out on your data

What is tidy data?

How do you create meaning from a compilation of numbers and letters? This resource explains why tidying data is a necessary step towards doing so.

Creating a Data Frame from Vectors in R Programming (GeeksforGeeks)

Vectors are to dataframes what columns are to tables. This resource explains how to create a dataframe from separate vectors in R with helpful examples.

R for reproducible scientific analysis- Vectors & Data frames

Learn how to structure your data for powerful analyses in R. This resource provides functions that will save you time on data manipulation so you can spend more time analysing and thinking about what your data means.

Samples and Populations

Learn how a sample relates to a population, and various types of samples available to you as a researcher

Virtual Math Lab- College Algebra

This resource meets you where you’re at in your algebra journey, going over prerequisites such as scientific notation before walking through focused algebra modules.

The Correlation Coefficient (r)

Are you unsure how to interpret correlation coefficients or what kinds of data are suitable for these calculations? This resource introduces correlation coefficients and how to calculate one in R

Interpreting Regression Output

A simple breakdown of how to interpret the regression tables produced by statistics software like Stata.

Stats and R- Descriptive statistics in R

Descriptive statistics can summarise the properties of your data and show you what types of analyses would work best; this is crucial to know before you conduct a full analysis.

Data Wrangling with R

Learn how to manipulate data using two foundational packages from the tidyverse: dplyr and tidyr.

Internal vs. External Validity | Understanding Differences & Threats

This guide distinguishes between internal and external validity and the threats to each that may arise in your research

Posit Cheatsheets

These handy cheatsheets make it easy to keep up with and refer to all the functions included in important R packages.

Data Viz Checklist

This guide will help you choose the best way to visualise the data you have

Six ways to share your research findings

Knowledge translation is an important final step of your research process; this guide offers advice on how to be an effective research communicator.

Visualizations that Really Work

The most powerful data visualisations have a clear message; this guide provides thought-provoking questions and frameworks to help you decide what your message will be.

Welcome to the Tidyverse

Learn how to install the tidyverse package and how to use it

Introduction to Normal Distributions

The normal distribution is foundational to many statistical tests in social science research; this resource provides a basic introduction with intuitive examples.

The Binomial Distribution

When each of a fixed number of independent trials has two possible outcomes, the number of successes follows a binomial distribution; this resource explains important formulae related to the binomial distribution.
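For orientation, the central formula in this topic can be sketched as follows. This is the standard textbook form, not a quotation from the linked resource; the symbols n (number of independent trials), p (probability of success on each trial) and k (number of successes) are notation chosen here.

```latex
P(X = k) = \binom{n}{k} \, p^{k} (1 - p)^{n - k}, \qquad k = 0, 1, \dots, n
\\[4pt]
\mathrm{E}[X] = np, \qquad \mathrm{Var}(X) = np(1 - p)
```

For example, with n = 10 fair coin flips (p = 0.5), the expected number of heads is np = 5.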

Posit Recipes- Transform Tables

These short, informative lessons are targeted towards specific questions you may have about data tidying. Code snippets throughout make it clearer how to apply them to your own data

Relational data

An example-driven resource that explains the concept of relational data in R

Data Cleaning with R and the Tidyverse: Detecting Missing Values

Learn how to write code that will help you identify missing values in your dataset

Data Wrangling Ex 2: Dealing with missing values

There are a few different ways to handle missing data values in R; this blog post explains when and how through an exercise.

UCLA Stata Guide – Stata Learning Modules

Learn how to manage and organise your data using Stata, or review familiar concepts using focused articles

Swirl- Learn R, in R.

The swirl package in R features short, self-paced and interactive tutorials that make learning R an active process.

The idea of significance tests

How likely is it that an outcome has happened by chance? Significance testing gets at this idea, and this tutorial explains it using an intuitive example

Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations

Even well-meaning scientists and students can fall prey to misinterpretations of statistical tests; this paper provides guidelines for avoiding common pitfalls

Validity

This chapter explains why validity is an important factor to consider when measuring constructs in social science.

z-Test for a Mean

When should we use a z-test and what conclusions can we make after we conduct one? This textbook chapter explains how to calculate a z statistic with an example.
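As a quick reference, the z statistic for a single mean is usually written as below. This is the standard textbook form under the assumption that the population standard deviation is known; the symbols (sample mean, hypothesised mean, population standard deviation, sample size) are notation chosen here and may differ from the chapter's.

```latex
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}
```

Here \bar{x} is the sample mean, \mu_0 the mean under the null hypothesis, \sigma the population standard deviation and n the sample size. When \sigma is unknown and has to be estimated from the sample, a t statistic is typically used instead.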

Disseminating your findings

Planning and thoughtfulness shouldn’t end with data analysis; this guide advocates for a thoughtful approach to research dissemination and offers practical advice on how to go about it, depending on your audience and your medium.

Data Collection: Primary Vs. Secondary

Learn about the different data collection methods available in social science research.

To Explain or to Predict

This article provides a thoughtful discussion of the difference between explanatory and predictive modelling, clarifying the distinct implications of each.

Binomial Distribution

Practice calculating probabilities under the binomial distribution with this example-focused guide.

Normal Distribution

Are you unsure how to use the concept of the normal distribution to calculate the probability of a given event? This resource explains how areas under the normal distribution curve relate to probabilities.

Populations and samples

When is it necessary to sample a subset of a population rather than collect data from the entire population? This chapter provides a thoughtful discussion on how to choose.

Hypothesis testing

This resource presents hypothesis testing as a formal way to determine which hypothesis your data supports.

Correlation Doesn’t Equal Causation: Crash Course Statistics #8

When two variables in your data vary together, there could be several reasons why. This video explains this idea using everyday examples.

ANOVA: Crash Course Statistics

This video introduces the ANOVA as a type of model which allows you to compare multiple groups

Regression: Crash Course Statistics

Learn how to build a regression model using an intuitive example

Dealing with Missing Values in R

Data collection is often an imperfect process; this guide describes how to make the most of the data you have and account for any missing values

R tutorial: Descriptive Statistics

Yet another handy capability of the dplyr package is its summarisation and grouping functions. With these tools, you can explore your data at a glance and gather descriptive statistics.

The Normal Distribution: Crash Course Statistics #19

This video introduces important terminology around normal distributions and how we can use them to compare parameters like the mean

What is a sampling distribution?

This video is an engaging and intuitive explanation of the sampling distribution and its usefulness for making inferences.

The central limit theorem | explained with a simple example

This video explains a fundamental concept in statistics: the sampling distribution of the mean becomes approximately normal as the sample size grows, even when the underlying population is not normally distributed

Carrying out a test for a population mean

A primer on when to use a z versus a t statistic in significance testing

How to know which statistical test to use for hypothesis testing

This tutorial describes the hypothesis tests available to you, and how to choose which to use based on your data.

Z-Statistics vs. T-Statistics EXPLAINED in 4 Minutes

Learn how to distinguish between using Z statistics and T statistics, based on the data you have available to you.

Normal Distribution Explained with Examples

Learn how to solve problems related to the Normal Distribution

Basic Theoretical Probability

These videos and exercises provide a solid foundation in probability theory, which might help you understand the results of statistical tests better.

Confidence Intervals Using the z-Distribution

Learn how to calculate confidence intervals using the normal distribution curve
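A minimal sketch of the calculation, assuming a known population standard deviation and a sample mean that is approximately normally distributed; the notation is chosen here and is not taken from the linked resource.

```latex
\bar{x} \;\pm\; z_{\alpha/2} \, \frac{\sigma}{\sqrt{n}}
```

Here z_{\alpha/2} is the critical value of the standard normal distribution for the chosen confidence level, for example approximately 1.96 for a 95% interval.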

Central Tendency: Mean, Median, Mode

How do conceptual understandings of the mean, median and mode translate to mathematical notations? This tutorial provides an explanation of the mathematical notations needed to calculate the mean, median and mode.

Dispersion: Variance and Standard Deviation

This textbook chapter will push your conceptual knowledge of measures of spread towards an understanding of their mathematical notations and calculations.

Calculating the variance and standard deviation

This lesson clarifies how the variance and standard deviation are different and what they can tell us about the spread of our data.
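For reference, the sample versions of the two measures are usually written as below. This is the standard textbook form, with notation chosen here; the population versions divide by N rather than n - 1.

```latex
s^{2} = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^{2}, \qquad s = \sqrt{s^{2}}
```

The variance is expressed in squared units of the data, while the standard deviation is in the original units, which is why the latter is often easier to interpret.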

Video tutorial on measures of central tendency

You may have an intuitive sense of what an average is, but may be unsure how different measures of central tendency help you arrive at that value. This video clarifies how to calculate the mean, median and mode.

Measures of Spread: Crash Course Statistics

How well does your mean represent the data? Calculating a measure of spread will help to answer this question and this video shows you how.

Introduction to confidence intervals

Learn how confidence intervals can move us from a calculation of a sample mean towards an estimate of the true population mean

Confidence Intervals: Crash Course Statistics

Learn how confidence intervals let you estimate a population parameter from your sample with a pre-defined level of confidence.

Measuring center in quantitative data

How do you distinguish between all the various measures of central tendency? Which of them is best for your data? This tutorial describes how they differ and how to calculate them all.
