11.4 Exercises

Exercise 1: Preparation and data exploration

Load the semopy package and the HolzingerSwineford1939 dataset. If you want to keep your dataset small and organized, you can use .drop() to remove the columns x1, x4 and x7.
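As a hint for the dropping step, here is a minimal sketch. It assumes the data arrive as a pandas DataFrame. The helper names `load_holzinger` and `drop_columns` are made up for illustration, and the import path `semopy.examples.holzinger39` is an assumption about where semopy bundles the dataset; check it against your installed version. The loader is only defined, not called, so the snippet also runs without semopy.

```python
import pandas as pd


def load_holzinger():
    """Load the Holzinger-Swineford data bundled with semopy.

    Assumption: semopy is installed and exposes the dataset as
    semopy.examples.holzinger39 (not stated in the exercise text).
    """
    from semopy.examples import holzinger39
    return holzinger39.get_data()


def drop_columns(df, columns=("x1", "x4", "x7")):
    """Return a new DataFrame without the listed columns (df is not modified)."""
    return df.drop(columns=list(columns))


# Toy stand-in with the same indicator names, so the snippet runs without semopy.
toy = pd.DataFrame({f"x{i}": [0.1 * i, 0.2 * i] for i in range(1, 10)})
small = drop_columns(toy)
print(list(small.columns))
```

Because `.drop()` returns a new DataFrame by default, remember to assign the result to a variable; otherwise the original data stay unchanged.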

# Uncomment the following line to run in Google Colab
# !pip install semopy
import semopy

# Import your data here

# Drop the unnecessary columns/variables

Exercise 2: Fit a CFA model

Fit a CFA model with 3 latent variables: x2 and x3 should load onto visual, x5 and x6 should load onto text, and x8 and x9 should load onto speed. Assume that the latent factors are uncorrelated with each other. After specifying the model, fit it and inspect the model estimates as well as the model fit measures.
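If the model syntax is the sticking point, the following sketch may help. It builds a lavaan-style description in which `=~` defines loadings and `~~ 0*` fixes a covariance to zero. The helpers `build_cfa_description` and `fit_and_report` are hypothetical names introduced here, and the `0*` fixing syntax should be verified against the semopy version you use. The fitting helper is only defined, not called, so the snippet runs without semopy or data.

```python
from itertools import combinations


def build_cfa_description(loadings, orthogonal=True):
    """Build a lavaan-style model string from a {factor: [indicators]} dict."""
    lines = [f"{factor} =~ {' + '.join(items)}" for factor, items in loadings.items()]
    if orthogonal:
        # Fix every pairwise factor covariance to zero.
        for a, b in combinations(loadings, 2):
            lines.append(f"{a} ~~ 0*{b}")
    return "\n".join(lines)


def fit_and_report(desc, data):
    """Fit the model and return (model, estimates, fit statistics).

    Assumption: semopy is installed; data is a DataFrame with the indicators.
    """
    import semopy
    model = semopy.Model(desc)
    model.fit(data)
    return model, model.inspect(), semopy.calc_stats(model)


loadings = {"visual": ["x2", "x3"], "text": ["x5", "x6"], "speed": ["x8", "x9"]}
cfa_desc = build_cfa_description(loadings)
print(cfa_desc)
```

For the plot, `semopy.semplot` accepts a `plot_covs` argument that controls whether covariances are drawn, which is one way to make the fixed covariances visible.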

# Specify the model

# Fit the model

# Get the estimates

# Get the fit measures

# Visualize the model, including the zero correlations between factors

Exercise 3: Fit a SEM model

Adapt your model from above to include a structural part, meaning a unidirectional association on the level of latent variables. Print the model estimates and the model fit statistics. Does the CFA or the SEM model provide better fit? Provide an explanation for your conclusion.

# Specify the model

# Fit the model

# Get the estimates

# Get the fit measures

# Visualize the model

Voluntary exercise 1: Higher-level factors

Go back to your CFA model and add a higher-level factor onto which all latent variables load. Name it intelligence. Does the higher-level factor improve model fit?

# Specify the model

# Fit the model

# Print the model estimates

# Print the model fit measures

# Visualize the model

Voluntary exercise 2: Advanced models I

Now go back to your SEM model and modify it so that the variance of the speed factor is fixed to 1. How does that affect the interpretation of the loading associated with that factor?

# Specify the model

# Fit the model

# Print the model estimates

# Get the fit measures

# Visualize the model

Voluntary exercise 3: Advanced models II

Reload the dataset, this time without dropping any variables. Specify and evaluate a model that tests the following hypotheses:

  • x1, x2 and x3 should load onto visual; x4, x5 and x6 should load onto text; x7, x8 and x9 should load onto speed.

  • visual and text load onto a higher-level factor called intelligence.

  • intelligence explains 100% of the covariance between visual and text.

  • intelligence predicts speed.
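One possible way to translate the hypotheses above into model syntax is sketched below. It is a starting point under stated assumptions, not a reference solution. In particular, reading "explains 100% of the covariance" as a residual covariance between visual and text fixed to zero is an interpretation, and the `0*` fixing syntax should be checked against your semopy version. The variable names `hypotheses` and `description` are illustrative.

```python
# Each hypothesis is paired with one candidate line of lavaan-style syntax.
hypotheses = [
    ("x1, x2, x3 load onto visual", "visual =~ x1 + x2 + x3"),
    ("x4, x5, x6 load onto text", "text =~ x4 + x5 + x6"),
    ("x7, x8, x9 load onto speed", "speed =~ x7 + x8 + x9"),
    ("visual and text load onto intelligence", "intelligence =~ visual + text"),
    # Assumption: no covariance between visual and text is left unexplained.
    ("intelligence explains all visual-text covariance", "visual ~~ 0*text"),
    ("intelligence predicts speed", "speed ~ intelligence"),
]

# Join the syntax lines into one description that could be passed to semopy.Model.
description = "\n".join(syntax for _, syntax in hypotheses)

# Print each hypothesis next to its syntax line for checking.
for text, syntax in hypotheses:
    print(f"{syntax:<32} <- {text}")
```

Keeping the hypotheses and their syntax side by side makes it easier to verify that every bullet point is represented by exactly one line of the model description.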

# Load the entire dataset

# Specify the model

# Fit the model

# Get model estimates

# Get fit statistics

# Visualize the model