Statistical Notes

Methods Articles

The following are methodological articles that we have found useful.

Title	Short reference	Description
A Conceptual and Empirical Examination of Justifications for Dichotomization	DeCoster, Iselin, & Gallucci (2009)	Despite multiple articles describing the negative effects of dichotomization, researchers continue to use the practice. The authors of this article contacted researchers who had published articles using dichotomized variables and asked for their justifications for dichotomization. They then examined these justifications logically and through Monte Carlo simulations. In the large majority of circumstances, the original continuous measures performed better than the artificially dichotomized measures.
An Education Researcher's Guide to ChatGPT: How it Works and How to Use it	DeCoster, Bailey, & Keys (2024)	This article provides a gentle introduction to using ChatGPT to facilitate research using examples drawn from the field of education. A significant focus is placed on “prompt engineering”, providing concrete strategies for crafting effective inputs to optimize AI responses. This guide aims to equip education researchers with the knowledge needed to effectively use ChatGPT to make their work more robust and efficient. It also reviews the issues and concerns associated with using generative AI to ensure researchers receive appropriate and ethical outputs.
Archiving for Psychologists: Suggestions for Organizing, Documenting, Preserving, and Protecting Computer Files	DeCoster, O'Mally, & Iselin (2011)	A file archive is the organizational and procedural structure that guides the way computer files related to a research project are named, organized, documented, and saved in a directory structure. This article reviews the abstract demands that the file archive for a research project must meet and then provides a concrete example of a file archiving structure that would meet these demands.
Caution Regarding the Use of Pilot Studies to Guide Power Calculations for Study Proposals	Kraemer et al. (2006)	This study argues that effect sizes obtained from pilot studies should not be used as the basis for power analyses. The limited sample size of pilot studies means that the effect size estimates they produce are very imprecise with standard errors that can be notably larger than the effect size itself.
Contemporary Issues in the Analysis of Data: A Survey of 551 Psychologists	Zuckerman, Hodgins, Zuckerman, & Rosenthal (1993)	The authors asked published psychologists to answer five yes/no questions about common statistical issues. The respondents had an average accuracy of 59%. The questions dealt with how reliability and sample size affect power, how interaction effects should be interpreted, testing contrasts, and using effect sizes, all issues that regularly come up when analyzing data in the social sciences.
Effect sizes and p values: What should be reported and what should be replicated?	Greenwald et al. (1996)	As part of a defense of null-hypothesis statistical tests, the authors note that a finding with p = .05 has only a 50% chance of being significant in a replication. They propose changing our significance level to .005 for single studies, which would give an 80% chance of producing p < .05 in a replication.
On the Unnecessary Ubiquity of Hierarchical Linear Modeling	McNeish et al. (2017)	Argues for the use of cluster-robust standard errors and generalized estimating equations instead of hierarchical linear modeling.
In Defense of External Invalidity	Mook (1983)	Research conducted in the laboratory is often criticized for the artificiality of its setting. This article argues that generalization to the real world is often not the purpose of laboratory research. Instead, its purpose is to test predictions that theories make about what should happen in the lab. In such cases, laboratory research may accurately test theories without any consideration of external validity.
Opportunistic Biases: Their Origins, Effects, and an Integrated Solution	DeCoster & Sparkses (2015)	When researchers explore their data in multiple ways before deciding what to present, it introduces an "opportunistic bias" that artificially increases their chances of obtaining large or interesting effects. This article explains how some common practices lead to opportunistic biases, reviews their negative effects, and proposes an integrated solution to reduce their influence on scientific research.
Scientific Utopia: I. Opening Scientific Communication	Nosek & Bar-Anan (2012)	Suggests changes to academia focused on open access and open review of scientific articles.
Scientific Utopia: II. Restructuring Incentives and Practices to Promote Truth Over Publishability	Nosek, Spies, & Motyl (2012)	Suggests changes to academia focused on fair distribution of scientific credit and improving the replicability of research.
Systematic Data Validation: Improving Education Research by Improving Data Quality	DeCoster, Francis, Lee, & Rubinstein (2024)	Valid data is necessary to derive valid inferences from research and can only be guaranteed through data validation. Despite the importance of this process, a survey of education researchers indicates data validation is used inconsistently. To facilitate wider adoption of data validation, this article identifies key principles of data validation, identifies specific aspects of data sets that should be checked, and makes recommendations for how to address identified issues. Additionally, it provides practical strategies for tracking and managing data validation and discusses how these practices should be reported.
The Abuse of Power: The Pervasive Fallacy of Power Calculations for Data Analysis	Hoenig & Heisey (2001)	The authors explain that post-hoc power analyses lack value because they provide no more information than what is available in a p-value. In fact, observed power has a one-to-one relationship with the p-value.
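
The replication and observed-power entries above (Greenwald et al., 1996; Hoenig & Heisey, 2001) can be illustrated with a short calculation. The sketch below is not taken from either article. It assumes a two-sided z-test and assumes that the true effect is exactly equal to the effect that was observed; the function name observed_power is our own.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
_STD_NORMAL = NormalDist()


def observed_power(p_value: float, alpha: float = 0.05) -> float:
    """Observed power of a two-sided z-test, computed from the p-value alone.

    Assumption: the true standardized effect equals the observed z statistic.
    """
    # |z| statistic implied by the two-sided p-value.
    z_obs = _STD_NORMAL.inv_cdf(1 - p_value / 2)
    # Critical value for the chosen significance level.
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Probability that a new z statistic, centered on z_obs,
    # falls in the upper or the lower rejection region.
    upper = 1 - _STD_NORMAL.cdf(z_crit - z_obs)
    lower = _STD_NORMAL.cdf(-z_crit - z_obs)
    return upper + lower


if __name__ == "__main__":
    for p in (0.5, 0.05, 0.005):
        print(f"p = {p:<5}  observed power = {observed_power(p):.3f}")
```

Because the function takes nothing but the p-value (and alpha), observed power cannot add information beyond the p-value. Under these assumptions, p = .05 gives an observed power of about .50 and p = .005 gives about .80, matching the replication figures quoted above.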

Statistical Notes

Our staff has put together the notes below.

Title	Last Revision	Comments
Introductory Statistics	8/1/98	Based on Moore's The Active Practice of Statistics
Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures	1/11/06	Explains the different methods of testing group differences. Contains information on between-subjects, within-subjects, and mixed ANOVA, as well as their nonparametric equivalents. Includes sample SPSS code for all analyses.
Applied Linear Regression set 1	11/13/07	Based on Cohen, Cohen, West, and Aiken's Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Contains more theoretical detail and includes sample SPSS code.
Applied Linear Regression set 2	8/1/98	Based on Neter, Kutner, Nachtsheim, & Wasserman's Applied Linear Statistical Models. Contains more mathematical detail and includes sample SAS code.
Transforming and Restructuring Data (also available: SPSS examples)	5/14/01	Explains efficient ways of transforming data (including tips on getting normal distributions), as well as information about how to change the unit of analysis of a data set. Includes an overview of programming with arrays and loops in SPSS and SAS.
Meta-Analysis	7/31/09	Describes procedures for quantitatively summarizing the results from multiple studies. Focuses on d and r effect sizes.
Psychological Research Methods	5/9/01	Based on a class taught by Jamie DeCoster at the Free University Amsterdam.
Scale Construction	3/16/10	Detailed notes on how to build and test a scale, including sections on validity and reliability analysis.
Using ANOVA to Examine Data from Groups and Dyads (also available: HLM overheads)	4/12/02	Uses a flowchart to explain how to determine analysis for data from groups and dyads. Includes a section on how to calculate the intraclass correlation coefficient.
Mediation Bibliography	4/12/06	Describes several important references related to testing mediation. Includes links to three mediation websites.
Factor Analysis (also available: CFA fit statistics)	8/1/98	A basic theoretical introduction to exploratory and confirmatory factor analysis.
Data Analysis in SPSS	2/21/04	Explains how to perform and interpret the output of a number of different analyses in SPSS, including ANOVA, MANOVA, regression, logistic regression, and factor analysis.
Data Preparation in SPSS	8/15/12	Explains how to use the menus and syntax to perform basic data preparation in SPSS.
Excel 2003 for Researchers	12/28/05	An overview of Excel 2003 features from the perspective of a researcher. Includes sections on formulas, importing and exporting files, and the Analysis Toolpak.
Excel 2007/2010 for Researchers	9/07/10	An overview of Excel 2007 and 2010 features from the perspective of a researcher. Includes sections on formulas, importing and exporting files, and the Analysis Toolpak.
Restructuring Data from Computerized Experiments	12/29/05	Explains how to use SPSS's Data Restructure procedure to easily transform data in univariate format (where each line corresponds to a trial) to multivariate format (where each line corresponds to a subject).
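
The univariate-to-multivariate restructuring described in the last entry can be sketched outside SPSS as well. The Python example below is only an illustration and is not drawn from the notes themselves; the data values, the column names (subject, trial, rt), and the helper to_wide are all hypothetical.

```python
from collections import defaultdict

# Hypothetical trial-level ("univariate") data: one row per trial.
long_rows = [
    {"subject": 1, "trial": 1, "rt": 512},
    {"subject": 1, "trial": 2, "rt": 478},
    {"subject": 2, "trial": 1, "rt": 630},
    {"subject": 2, "trial": 2, "rt": 598},
]


def to_wide(rows, id_key="subject", index_key="trial", value_key="rt"):
    """Collapse one-row-per-trial records into one-row-per-subject records."""
    wide = defaultdict(dict)
    for row in rows:
        # Build a column name such as "rt_1" or "rt_2" from the trial number.
        column = f"{value_key}_{row[index_key]}"
        wide[row[id_key]][column] = row[value_key]
    # One dictionary per subject, ordered by subject id.
    return [{id_key: subject, **columns} for subject, columns in sorted(wide.items())]


# Subject-level ("multivariate") data: one row per subject.
wide_rows = to_wide(long_rows)
```

Each subject ends up on a single line with one column per trial, which is the layout most repeated-measures procedures expect.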

All of the above notes are in PDF format and can be read using Adobe Acrobat. A free copy of Acrobat Reader can be downloaded from Adobe's website.

Feel free to distribute copies of the notes to anyone you think might find them useful. Please contact us before using them for any academic (e.g., teaching a class) or professional purposes.

