This repository contains reproducible code accompanying the book chapter “Multilevel analysis for quantitative communication scientists”. The scripts demonstrate a complete multilevel modeling workflow using simulated data to analyze cyberbullying risk among high school students (inspired by Festl et al., 2015).
The example examines how individual characteristics (e.g., school lunch benefit status, permissiveness toward cyberbullying) and grade-level attributes (e.g., grade level, grade composition) relate to cyberbullying risk among high school students (grades 9-12) across multiple schools.
|> support)The script example2_simulation_models_R.R uses
pacman for automatic package installation and loading.
Required packages:
- lmerTest # Multilevel modeling with Satterthwaite degrees of freedom (DDF)
- performance # ICC calculations and model diagnostics
- ordinal # Ordinal regression for categorical predictors
All packages will be automatically installed on first run if not already present.
supplemental_materials/
│
├── README.Rmd # This file
├── example2_simulation_models_R.R # R analysis script (generates data + fits models)
├── example2_models_STATA.do # STATA analysis script (fits models only)
└── Example2_simulationData.csv # Simulated dataset (generated by R script)
source("example2_simulation_models_R.R")
The script will: - Install required packages (if needed) - Generate simulated data - Fit all models progressively - Display results in the console
If you have downloaded the Example2_simulationData.csv
file from our repository, then you just need to:
# Load pre-generated data
sim_data <- read.csv("Example2_simulationData.csv")
Prerequisites: You must first generate the simulated
data using the R script (or downloading the
Example2_simulationData.csv file).
Example2_simulationData.csv (or download it directly)do "example2_models_STATA.do"
The STATA script will: - Import the CSV data - Create all necessary variables - Fit all models progressively - Display results with proper F-tests and pseudo-R² calculations
For pedagogical purposes, we recommend running the scripts section by section:
# Open the script `example2_simulation_models_R.R` in RStudio
# Run each section using Ctrl+Enter (Windows) or Cmd+Enter (Mac)
# Or use the "Run" menu to execute code chunks
* Open the .do file
* Highlight sections and run using Ctrl+D (Windows) or Cmd+Shift+D (Mac)
Recommended workflow:
The scripts produce organized output with clear headers:
========== DATA GENERATION COMPLETE ==========
Sample size: XXX students
Number of schools: 10
Number of grades: 40
Average students per grade: XX
========== STEP 1: UNCONDITIONAL MODELS ==========
[Variance decomposition results]
========== STEP 2: BUILDING MULTILEVEL MODELS ==========
--- MODEL 1: Grade Level Effects ---
[Model results]
--- MODEL 2: Add Level 1 Within-Grade Effects ---
[Model results]
...
For each model, pay attention to:
The scripts use CMC to avoid smushed effects (conflated within- and between-cluster effects):
# Level 1: Center on grade mean
L1_predictor_CMC = L1_predictor - L2_grade_mean
# Level 2: Center on school mean
L2_grade_mean_CMC = L2_grade_mean - L3_school_mean
Why this matters: Ensures level-1 effects represent pure within-grade relationships and level-2 effects represent pure between-grade relationships.
Although data has three levels (students > grades > schools), we use a hybrid approach:
Rationale: School differences not of substantive interest (L3 was considered incidental, as discussed in McNeish & Wentzel, 2016), but must be controlled.
Variance explained is calculated by comparing variance components:
Pseudo_R² = (Variance_baseline - Variance_current) / Variance_baseline
Important: For random slope models, compare to the model before adding the random slope to avoid overestimating explained variance—Our understanding of pseudo-R² measures in multilevel analysis differs from that presented by other authors, such as Rights & Sterba (2023).
Student-Level (Level 1): - StudentID:
Student identifier - y_CMC_Model: Cyberbullying risk score
(15-45, higher = more risk), labeled as CB -
L1_x1_Comp_Cat: School lunch benefit (1=Paid,
2=Reduced-price, 3=Free) - L1_x2_Comp_Round: Permissiveness
toward cyberbullying (10-40)
Grade-Level (Level 2): - GradeID: Grade
identifier - L2_x3_GradeLevel: Grade level (9, 10, 11,
12)
School-Level (Level 3): - SchoolID:
School identifier (1-10)
lmer() from lmerTest packageVarCorr()mixed command with || gradeid:
notationsmall option for F-tests with
Satterthwaite DF (not chi-squared)estat icc for intraclass correlationsestat recovariance for random effects
correlationsmixed y x || group:, reml dfmethod(satterthwaite) # Fit model
estat icc # Compute ICC
testparm x1 x2 x3, small # F-test for multiple predictors
test x1 x2, small # F-test for specific predictors
lrtest model1 model2 # Likelihood ratio test
Important: Always use the small option
with test and testparm to get F-tests instead
of chi-squared tests. This is consistent with Satterthwaite degrees of
freedom.
Modify the data generation parameters in the R script:
sim_data <- data_gen(
nSchools = 10, # Number of schools
min_StuPerGrade = 25, # Minimum students per grade
max_StuPerGrade = 30 # Maximum students per grade
)
For different random realizations:
set.seed(123) # Change from default 777
Capture all output to a file:
# In R
sink("example2_output.txt")
source("example2_simulation_models_R.R")
sink()
# In STATA
log using "example2_output.log", replace
do "example2_models_STATA.do"
log close
Problem: Convergence warnings in R
boundary (singular) fit: see help('isSingular')
Solution: This may occur with small samples. For the simulation, this is acceptable. In real data, consider:
Problem: Package installation fails in R
Error in install.packages...
Solution: Install packages manually:
install.packages(c("lmerTest", "performance", "ordinal"))
If you are working in RStudio, you can look in the toolbar for the
Tools → Install Packages. In the window that
appears, type the package you need (for example, lmerTest)
→ Install.
Problem: STATA convergence warnings
convergence not achieved
Solution: Try simplifying the random effects structure or checking predictor scaling.
Problem: Script runs very slowly
Solution: Check system resources, reduce nSchools parameter
Problem: File path errors
file not found
Solution: - R: Use
setwd("path/to/folder") or RStudio Projects - STATA: Use
cd "path\to\folder" with backslashes on Windows
If you encounter issues:
R.version.string
(must be ≥4.1.0)update.packages(ask = FALSE) in Rgetwd() in R,
pwd in STATAWithin-grade effect (Level 1, CMC predictor):
Between-grade effect (Level 2, grade mean):
Contextual effect:
The interaction term shows moderation:
L1_x2_CMC:L2_x2_Mean_CMC
Interpretation: “The within-grade effect of individual permissiveness is stronger (or weaker) in grades with higher average permissiveness”
Random slope variance (τ²U₁):
Insert new predictors in the data generation section:
# Add new level-1 predictor
sim_data2$new_predictor <- rnorm(nrow(sim_data2), mean = 0, sd = 1)
# Include in model
L1_predictor + new_predictor
Compare CMC vs. grand-mean centering (GMC):
# Grand-mean center at constant
sim_data$L1_x2_GMC <- sim_data$L1_x2_Comp_Round - 25
Save model results for later use:
# In R: Save final model
saveRDS(model5_full, "final_model.rds")
# Load later
model5_full <- readRDS("final_model.rds")
# In STATA: Save estimates
estimates save "Model5_Final", replace
# Load later
estimates use "Model5_Final"
The scripts automatically print session info at the end:
In R:
sessionInfo()
In STATA:
display "STATA version: " c(stata_version)
display "Date: " c(current_date)
This includes:
Fixed at set.seed(777) in R for reproducibility. Same
seed = same results.
If you use this code in your research, please cite:
Padilla, B., Dorman, E., & Hoffman, L. (2026). Multilevel analysis for quantitative
communication scientists. In Data Analysis. De Gruyter.
Books:
Online Resources:
For questions about this code or the methodology:
Primary Contact: Bladimir Padilla
Email: geraldobladimir@gmail.com
Bug Reports: Please contact me directly with the details of the issue
This code is provided for educational and research purposes. You are free to use, modify, and distribute with appropriate citation.
Version 1.2 (March 13, 2026)
Version 1.1 (March 2, 2026)
Personally, I am grateful to Professor Lesa Hoffman for her tireless work and helpful comments on this and many other papers/projects. I believe her website (https://www.lesahoffman.com/index.html) is an invaluable resource for those interested in learning about multilevel analysis and other statistical techniques.
Last Updated: 2026-03-13
Script Version: 1.2
Compatible R Versions: ≥4.1.0
Compatible STATA Versions: ≥16.0