80% of Analysis Is Data Preparation
Experienced researchers know that valid results depend on clean data. Rushing through data entry and preparation is the most common source of avoidable errors in statistical analysis. This guide walks you through each preparation step in SPSS.
Defining Variables (Variable View)
Open SPSS and switch to the Variable View tab. Fill in the following columns for each variable:
- Name: Short, no spaces (e.g., age, gender, q1)
- Type: Numeric or String
- Measure: Scale (continuous), Ordinal, or Nominal
- Label: Full variable name (e.g., "Participant Age")
- Values: Category codes for nominal variables (e.g., 1=Male, 2=Female)
- Missing: Define a missing value code that cannot occur as a real value for that variable (e.g., 99 or -1)
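To make the role of the Values and Missing columns concrete, here is a minimal Python sketch (not SPSS syntax) of a codebook that mirrors Variable View. The CODEBOOK structure, the decode function, the "Participant Gender" label, and the missing codes 999 and 99 are all hypothetical choices made for illustration.

```python
# Hypothetical codebook mirroring the Variable View columns (illustration only).
CODEBOOK = {
    "age": {
        "label": "Participant Age",
        "measure": "scale",
        "values": {},        # no category codes for a continuous variable
        "missing": {999},    # assumed code; chosen so it cannot be a real age
    },
    "gender": {
        "label": "Participant Gender",
        "measure": "nominal",
        "values": {1: "Male", 2: "Female"},
        "missing": {99},     # assumed code
    },
}


def decode(record):
    """Turn missing codes into None and category codes into their labels."""
    decoded = {}
    for name, raw in record.items():
        spec = CODEBOOK[name]
        if raw in spec["missing"]:
            decoded[name] = None
        else:
            # Fall back to the raw value when no value label is defined.
            decoded[name] = spec["values"].get(raw, raw)
    return decoded


print(decode({"age": 34, "gender": 2}))
print(decode({"age": 999, "gender": 1}))
```

The second record shows why the missing code matters: without it, 999 would be treated as a real age and distort every statistic computed on that variable.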
Data Entry Tips
For large datasets, enter data in Excel and import into SPSS (File → Import Data). For paper-based surveys, consider double data entry to detect input errors: enter data twice independently, then compare for discrepancies.
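The comparison step of double data entry can be sketched in a few lines of Python. This is an illustrative sketch, not an SPSS feature: it assumes both entry passes have been exported as lists of rows in the same order, and the function name compare_entries and the sample values are hypothetical.

```python
def compare_entries(first, second):
    """Return (row_index, column, first_value, second_value) for every mismatch."""
    if len(first) != len(second):
        raise ValueError("Both entry passes must contain the same number of rows")
    mismatches = []
    for i, (row_a, row_b) in enumerate(zip(first, second)):
        for col in row_a:
            if row_a[col] != row_b.get(col):
                mismatches.append((i, col, row_a[col], row_b.get(col)))
    return mismatches


# Two independent passes over the same paper forms (made-up values).
entry_1 = [{"id": 1, "age": 34, "q1": 4}, {"id": 2, "age": 27, "q1": 2}]
entry_2 = [{"id": 1, "age": 34, "q1": 4}, {"id": 2, "age": 72, "q1": 2}]

discrepancies = compare_entries(entry_1, entry_2)
print(discrepancies)  # [(1, 'age', 27, 72)]
```

Each reported mismatch should be resolved by checking the original paper form, not by guessing which pass was correct.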
Missing Data Management
Use Analyze → Missing Value Analysis to examine missing data patterns.
- <5% missing → Listwise or pairwise deletion is acceptable.
- 5–20% missing → Consider multiple imputation; mean imputation is simpler but shrinks the variable's variance.
- >20% missing → Consider excluding the variable or proceed with caution.
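The thresholds above can be applied per variable with a small helper. The following Python sketch is illustrative only: the function names, the use of None for missing cells, and the sample column are assumptions, and the cutoffs are simply the rules of thumb listed above.

```python
def missing_percent(values, missing_codes=(None,)):
    """Percentage of entries in a column that count as missing."""
    if not values:
        raise ValueError("Column is empty")
    n_missing = sum(1 for v in values if v in missing_codes)
    return 100.0 * n_missing / len(values)


def suggest_strategy(pct):
    """Map a missing percentage onto the rule-of-thumb bands above."""
    if pct < 5:
        return "listwise or pairwise deletion"
    if pct <= 20:
        return "imputation"
    return "consider excluding the variable"


# Made-up column where 99 is the user-defined missing code.
q3 = [4, 5, 99, 3, 2, 4, 5, 1, 3, 4]
pct = missing_percent(q3, missing_codes=(None, 99))
print(pct, suggest_strategy(pct))
```

The percentage alone does not tell you whether data are missing at random, so treat the suggested strategy as a starting point.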
Outlier Detection
- Boxplot: Analyze → Descriptive Statistics → Explore → Boxplot. SPSS automatically flags outliers (o) and extremes (*).
- Z-scores: |z| > 3.29 (p < .001, two-tailed) is a common cutoff for flagging a univariate outlier.
- Mahalanobis distance: For detecting multivariate outliers in regression.
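The z-score rule from the list above can be sketched in Python as follows. This is an illustration, not SPSS output: the function name and the sample scores are hypothetical, and the sample standard deviation (n - 1 denominator) is assumed. Only the univariate rule is covered here; Mahalanobis distance is not.

```python
from statistics import mean, stdev


def find_univariate_outliers(values, cutoff=3.29):
    """Return (index, value) for every case whose |z| exceeds the cutoff."""
    m = mean(values)
    sd = stdev(values)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        return []       # no spread, so nothing can be flagged
    return [(i, v) for i, v in enumerate(values) if abs((v - m) / sd) > cutoff]


# Thirty ordinary scores plus one implausible value (made-up data).
scores = [48, 50, 52] * 10 + [120]
print(find_univariate_outliers(scores))  # [(30, 120)]
```

In very small samples no case can reach |z| > 3.29, because the outlier itself inflates the standard deviation, so the boxplot check is the more useful of the two there.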
Reverse Coding
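Negatively worded items must be reverse-coded before reliability or factor analysis: Transform → Recode into Different Variables. For a 5-point scale: recode 1→5, 2→4, 3→3, 4→2, 5→1. Include the midpoint (or choose "Copy old value(s)" for all other values), because any value you leave out becomes system-missing in the new variable.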
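The recode table is equivalent to a single formula: reversed = (scale minimum + scale maximum) - original. A minimal Python sketch of that formula, with a hypothetical function name and None standing in for a missing response:

```python
def reverse_code(value, scale_min=1, scale_max=5):
    """Reverse a Likert response: on a 1-5 scale, 1 becomes 5 and 3 stays 3."""
    if value is None:
        return None  # missing responses stay missing
    if not scale_min <= value <= scale_max:
        raise ValueError(f"{value} is outside the {scale_min}-{scale_max} scale")
    return scale_min + scale_max - value


print([reverse_code(v) for v in [1, 2, 3, 4, 5]])  # [5, 4, 3, 2, 1]
```

The range check is there to catch data entry errors or unconverted missing codes (such as 99) before they are silently turned into nonsense values.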
Computing Subscale Scores
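Use Transform → Compute Variable to create subscale means, e.g., MEAN(q1, q2, q3, q4). MEAN averages only the valid (non-missing) values, so it returns a score even when some items are missing. To require a minimum number of valid items, use the MEAN.n form, e.g., MEAN.3(q1, q2, q3, q4), which returns missing unless at least three items are valid.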
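The effect of averaging over valid items only, with an optional minimum number of valid items, can be illustrated in Python. This sketch approximates the behavior for teaching purposes and is not SPSS code: the function name, the min_valid parameter, and the use of None for missing values are assumptions.

```python
def subscale_mean(items, min_valid=1):
    """Mean of the non-missing items, or None if too few items are valid."""
    valid = [v for v in items if v is not None]
    if not valid or len(valid) < min_valid:
        return None
    return sum(valid) / len(valid)


# One respondent skipped q3: the mean is taken over the three valid items.
print(subscale_mean([4, 5, None, 3]))                    # 4.0
# With a three-item minimum, a single valid item yields no score.
print(subscale_mean([4, None, None, None], min_valid=3))  # None
```

Setting a minimum keeps a subscale score from resting on one or two answered items, which would make scores across respondents hard to compare.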
