Psychology of Scientific Integrity: Graduate Syllabus

A broad yet technical introduction to scientific integrity crises

Posted Aug 18, 2016

THE PSYCHOLOGY OF SCIENTIFIC INTEGRITY: THE GRADUATE SYLLABUS

Required Readings

NOTE: All readings below are links, so if you are interested, you can read most of the articles in the original.  Occasionally, because of journal copyright issues, you can only see the abstract; or, you might need a subscription to a mainstream news outlet.  In most cases, however, if you are affiliated with a college, you can access the source through their library system.

PART I: INTRODUCTION

What's the Problem?

Source: Magician Will Fern**

Freedman (November, 2010). Lies, damned lies, and medical science.  The Atlantic.

Neuroskeptic (2012).  The nine circles of scientific Hell. 

Lehrer, J. (December 13, 2010).  The truth wears off.  The New Yorker.

What is Science Supposed to be?

https://en.wikipedia.org/wiki/Science

https://en.wikipedia.org/wiki/Scientific_method

https://en.wikipedia.org/wiki/Data_sharing

PART II: REPLICATION AND ITS DISCONTENTS

Is Psychological "Science" Irreplicable?

Open Science Collaboration (2015). Estimating the reproducibility of psychological science.  Science, 349, doi: 10.1126/science.aac4716.

Jussim (2012).  Unicorns of social psychology. Psych Today.

Jussim (2012). Social psychological unicorns: Do failed replications dispel scientific myths? Psych Today.

Henrich et al (2010).  The weirdest people in the world? Behavioral and Brain Sciences, 33, 61-135. 

Funder (2012). The perilous plight of the non-replicator. Funderstorms.

Dreber et al (2015). Using prediction markets to estimate the reproducibility of scientific research.  PNAS, 112, 15343-15347.

Are We Sociopaths, Incompetent, or Both?

Simmons et al (2011).  False positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22, 1359-1366.

Source: Lee Jussim. You calling me a sociopath?

John et al (2012).  Measuring the prevalence of questionable research practices with incentives for truth telling.  Psychological Science, 23, 524-532.

Schimmack, U. (2012).  The ironic effect of significant results on the credibility of multiple study articles.  Psychological Methods, 17, 551-566. 

Vul et al (2009).  Puzzlingly high correlations in fMRI studies of emotion, personality, and social cognition. Perspectives on Psychological Science, 4, 274-290.

Wicherts et al (2011).  Willingness to provide data is related to the strength of the evidence and the quality of reporting statistical results.  PLoS One, 6(11): e26828. doi:10.1371/journal.pone.0026828
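
For students who have not yet seen the Simmons et al. argument in action, here is a minimal simulation of my own (not from any assigned reading; all design choices are made-up but typical) showing how just two common "researcher degrees of freedom" -- a choice between two outcome measures and peeking at the data before collecting more -- inflate the false-positive rate well beyond the nominal 5%:

```python
# Illustrative simulation of the Simmons et al. (2011) point: no true effect
# anywhere, yet flexible analysis yields "significant" results far more than
# 5% of the time.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims, alpha = 10_000, 0.05
false_positives = 0

for _ in range(n_sims):
    # Two groups, two outcome measures, all drawn from the same null distribution.
    g1 = rng.normal(size=(20, 2))
    g2 = rng.normal(size=(20, 2))
    sig = False
    for _ in range(2):  # analyze at n = 20 per group, then again at n = 30
        for dv in range(2):  # try each outcome measure separately
            if stats.ttest_ind(g1[:, dv], g2[:, dv]).pvalue < alpha:
                sig = True
        g1 = np.vstack([g1, rng.normal(size=(10, 2))])
        g2 = np.vstack([g2, rng.normal(size=(10, 2))])
    false_positives += sig

# Expect roughly two to three times the nominal 5% rate.
print(f"False-positive rate: {false_positives / n_sims:.3f}")
```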

Wait: I'm a Scientist, Everything is Just Fine 

Gilbert et al (2016).  Comment on "Estimating the reproducibility of psychological science."  Science, 351, 1037.

Source: Lee Jussim. Just because I am waist deep in snow, at 11,500 feet, on a remote Colorado mountain, it does not mean everything is not fine.

Fiedler, K., & Schwarz, N. (2015). Questionable research practices revisited.  Social Psychological and Personality Science, 7, 45-52.

Between Panic and Complacency...

Jussim (2016).  Are most published social psychology findings false? Psych Today.

Inzlicht (2016). Reckoning with the past. Getting Better.

Simonsohn (2016).  Evaluating replications: 40% full is not 60% empty.  Data Colada.

PART III: BEYOND REPLICATION

Science Is Self-Correcting, Right?

Ioannidis, J. (2012).  Why science is not necessarily self-correcting.  Perspectives on Psychological Science, 7, 645-654.

Jussim, L. (2015). Slow & nonexistent scientific self-correction in psychology. Psych Today.

Jussim (2016). Is it offensive to declare a psychological claim wrong? Psych Today.

It's Not Just Psychology

Orenstein (April 25, 2013). Our feel-good war on breast cancer. NYTimes.

NYTimes (2013).  Common knee surgery does very little for some.

Loeb, A. (2014).  Benefits of diversity.  Nature Physics, 10, 616-617.

O'Boyle et al (2014).  The chrysalis effect: How ugly initial results metamorphosize into beautiful articles.  Journal of Management.

Outsiders by design, Freakonomics podcast.

At Least Statistics are Rigorous, Clear and Objective, Right?

Gelman & Loken (2014).  The statistical crisis in science.  American Scientist. 

Source: Lee Jussim. That's not a rocket, it's a French press coffee maker. Its design and presence in a Utah desert are extraordinarily statistically unlikely.

Fraley & Vazire (2014).  The N-Pact factor: Evaluating the quality of empirical journals with respect to sample size and statistical power.  PLoS One, 9: e109019. doi:10.1371/journal.pone.0109019.

Nuzzo (2014).  Scientific method: Statistical errors.  Nature, 506, 150-152.

Westfall et al (2014).  Statistical power and optimal design in experiments in which samples of participants respond to samples of stimuli.  Journal of Experimental Psychology: General, 143, 2020-2045.
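
To make the power problem concrete, here is a small simulation of my own (illustrative only, not drawn from the readings): with 20 participants per group and a true effect of d = 0.4, most studies miss the effect, and the studies that do reach significance roughly double its apparent size:

```python
# Illustrative power simulation: n = 20 per group, true effect d = 0.4.
# The power and effect-size figures noted below are approximate.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, d, n_sims = 20, 0.4, 10_000
sig_effects = []

for _ in range(n_sims):
    g1 = rng.normal(0.0, 1.0, n)   # control
    g2 = rng.normal(d, 1.0, n)     # treatment, true standardized effect = d
    result = stats.ttest_ind(g2, g1)
    if result.pvalue < 0.05:
        pooled_sd = np.sqrt((g1.var(ddof=1) + g2.var(ddof=1)) / 2)
        sig_effects.append((g2.mean() - g1.mean()) / pooled_sd)

print(f"Power: {len(sig_effects) / n_sims:.2f}")  # roughly 0.2-0.25
print(f"Mean |d| among significant studies: {np.mean(np.abs(sig_effects)):.2f}")
# The significant studies report an effect nearly twice the true d = 0.4.
```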

How to Tell Scientific Fact from Fiction: P-Curving and Other Methods

Masicampo & Lalande (2012).  A peculiar prevalence of p-values just below .05.  Quarterly Journal of Experimental Psychology, 65, 2271-2279.

Simonsohn et al (2014).  P-curve: A key to the file-drawer.  Journal of Experimental Psychology: General, 143, 534-547.

Simonsohn et al (2014).  P-curve and effect size: Correcting for publication bias using only significant results. Perspectives on Psychological Science, 9, 666-681.

Franco et al (2014).  Publication bias in the social sciences: Unlocking the file drawer.  Science, 345, 1502-1505.

Simmons et al (2012).  A 21 word solution.  Dialogues.

Bakker et al (2012).  The rules of the game called psychological science.  Perspectives on Psychological Science, 7, 543-554.
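
For those new to p-curving, the following is a bare-bones sketch of the core logic in the Simonsohn et al. papers (my own illustration; the published p-curve method involves more than this): when a real effect is being studied, significant p-values bunch up well below .05, whereas under a true null they spread evenly across the (0, .05) interval:

```python
# Bare-bones p-curve: compare the distribution of significant p-values when
# there is a real effect (d = 0.5) versus no effect (d = 0). Sample sizes
# and effect sizes are arbitrary choices for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def significant_pvalues(d, n=30, n_sims=20_000):
    """Collect p-values below .05 from simulated two-group t-tests."""
    ps = []
    for _ in range(n_sims):
        p = stats.ttest_ind(rng.normal(d, 1, n), rng.normal(0, 1, n)).pvalue
        if p < 0.05:
            ps.append(p)
    return np.array(ps)

bins = [0, 0.01, 0.02, 0.03, 0.04, 0.05]
for d in (0.0, 0.5):
    counts, _ = np.histogram(significant_pvalues(d), bins=bins)
    print(f"d = {d}: proportion per .01 bin:", np.round(counts / counts.sum(), 2))
# Expected pattern: roughly flat for d = 0; right-skewed (piled below .01)
# for d = 0.5 -- the signature of evidential value.
```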

The Problem of Interpretation: Bad Conclusions Based on Good Data

Source: Shihong Khor

Vazire (2014).  The simpleminded and the muddleheaded.  Sometimes I'm Wrong.

Abramowitz et al (1975).  Publish or politic: Referee bias in manuscript review. Journal of Applied Social Psychology, 5, 187-200.

Eagly, A. (1995).  The science and politics of comparing men and women.  American Psychologist, 50, 145-158.

Inbar & Lammers (2012).  Political diversity in social and personality psychology.  Perspectives on Psychological Science, 7, 496-503.

Jussim (in press). Précis of Social perception and social reality: Why accuracy dominates bias and self-fulfilling prophecy.  Behavioral and Brain Sciences.

Jussim, Crawford, Stevens, & Anglin (2016).  The politics of social psychological science: Distortions in the social psychology of intergroup relations.  In P. Valdesolo and J. Graham (eds), Social Psychology of Political Polarization. 

Steele & Aronson (1995).  Stereotype threat and the intellectual test performance of African Americans.  Journal of Personality and Social Psychology, 69, 797-811. [Presented as a case study in how to misinterpret data.]

Jussim (2016). What explains demographic gaps? Psych Today.

--------------

This entry, like the prior one, was inspired by this blog post by Sanjay Srivastava. Sanjay has been at the forefront of improving psychological science, and his post was a faux syllabus for a course titled Everything is F****d: The Syllabus. Sanjay defines scientific practices as f*****d when they present "... hard conceptual challenges to which implementable, real-world solutions for working scientists are either not available or routinely ignored in practice."

Week after week, Sanjay's faux course presents readings arguing that some aspect of what we took for granted as "good" science in psychology is f*****d. This includes experiments, reviews, statistics, meta-analysis, replication, and more.

The thing is, I have been teaching two actual courses on basically the same topic since 2014.  The one presented here is my graduate course.  My prior post, presenting my undergraduate syllabus, is a good general overview for the interested lay reader who has a basic college education or who is widely read and reasonably numerate (you do not actually need a college education for either).

The grad course includes some general sources, but many of the readings are technical.  There is nothing holy about them; they are simply the ones I have assigned.  I taught this course in Spring 2015 and, since then, have discovered all sorts of interesting sources (a whole bucketful on Sanjay's site).  Many will probably be incorporated the next time I teach the course.

Overview

This graduate methods class will focus on the rapidly evolving understanding of what constitutes best practices in the conduct of scientific research.  By "scientific research," I mean every aspect of the practices involved in the production of scientific knowledge.

I use the term “scientific integrity” to refer to two related but separate ideas: 1. The personal honesty of individual scientists in the conduct and reporting of their research; and 2. The development of robust bodies of conclusions that are valid and unimpaired.  Obviously, dishonest or misleading practices can impair science.  Although personal dishonesty can explain problems such as data fraud, such instances are extremely rare and are not the focus of this course.

It is the second meaning of the term scientific integrity that will be the focus of this course.

Even when researchers suffer no lack of personal integrity, conventional practices common in their field may produce findings that are misleading or invalid. In this sense, scientific integrity closely corresponds to the conventional understanding of the term "validity," though the focus of this course differs from the traditional review of types of validity (face, internal, external, ecological, etc.); such forms, presumably, are covered by our regular methods course.

Science is about "getting it right" (SPSP Task Force, see first week's readings).  It is about generating claims and conclusions that are true.  This includes: 1. The generation of valid new knowledge; and 2. Development of the tools needed to determine, from existing research findings, which conclusions are true and which are not.

Just because some claim is published does not make it "true."  Once this is recognized, a natural question becomes, "How can we distinguish what is true from what is not true?"  That is a job for new methods and statistics, new uses for old methods and statistics, and conceptual tools for identifying systematic errors and biases within a scientific literature.

A Brief but Important Tangent

Many social psychologists balk at declaring some claims “wrong,” often because doing so is perceived as a personal attack on the claimant.  See Is it Offensive to Declare a Psychological Claim Wrong? for more details.

Nonetheless, this course will emphasize the role of Popperian falsification in the creation and evaluation of scientific knowledge.  In this context, the only way forward involves identifying the ways in which data falsify scientific claims.  Furthermore, getting it right inherently includes the skill to recognize when we have gotten it wrong.  Consider the following, from Bruce Alberts, the former President of the National Academy of Sciences (quoted in The Economist, 2013):

“And scientists themselves need to develop a value system where simply moving on from one’s mistakes without publicly acknowledging them severely damages, rather than protects, a scientific reputation.”

Thus, developing the scientific, scholarly, intellectual, and methodological tools to distinguish true from untrue scientific inferences, claims, and conclusions will be a central focus of this course.

Course Goals

This course has three main goals: 1. To understand the sources and manifestations of suboptimal practices in scientific research (irreproducible results, irreplicable studies, unobtainable data, invalid or unarticulated statistical analyses and methodological procedures, misleading or overstated conclusions, etc.); 2. To review reform attempts currently in progress, and to critically evaluate which threats to scientific integrity they target, how successful they are likely to be, how to empirically evaluate their success, and which threats they are unlikely to address; and 3. To provide an introduction to the conceptual and empirical tools currently available for determining which claims, emerging from which existing bodies of evidence, are valid.

Class Structure

Most classes will involve discussion of that week's readings (see Grading, below).  However, when new methodological or statistical techniques are introduced (e.g., tests for excess significance, funnel plots, p-curving, incredibility indexing, N-Pact factor assessment, confidence intervals [not new in the grand scheme of things, but newly required and, perhaps, unfamiliar to many students]), I may lecture for part or all of a class.
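
As a taste of what those lectures cover, here is a back-of-the-envelope version of the incredibility logic from the Schimmack (2012) reading in Part II (my own simplification; the study count and power value are made up for illustration): if each of 10 studies in a paper had roughly 50% power, the chance that all 10 came out significant is about 1 in 1,000:

```python
# Back-of-the-envelope incredibility check (simplified from Schimmack, 2012;
# the numbers are hypothetical). The binomial tail gives the general case.
from scipy.stats import binom

k_studies, k_significant, power = 10, 10, 0.5
p_all = binom.sf(k_significant - 1, k_studies, power)  # P(X >= k_significant)
print(f"P(at least {k_significant} of {k_studies} studies significant "
      f"given power = {power}): {p_all:.4f}")  # about 0.001
```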

Grading

Discussion Leading: 20%

There will be several readings assigned each week.  1-2 students will be required to: 1. Generate a set of discussion questions; 2. Circulate them at least 3 days before class; 3. Initiate, lead, and focus the discussion of that week's readings. 

Participation: 20%

Like all my other classes, except 522 (2nd grad stat class), participation is a required and integral part of the class. 

Summaries: 10%

Primarily to ensure that students actually read the required articles, each student will be required to provide a short summary of each article.

Major Paper: 50%

The major paper will involve using one of the new techniques for assessing research validity (e.g., constructing a funnel plot or p-curve for the studies in a meta-analysis); using a simple technique for evaluating research quality (e.g., identifying errors or omissions in an area of research); and/or obtaining data from a published source and evaluating whether the results can be reproduced.  (Note: I use the term "reproduce" to refer to the answer to the question: After obtaining published data, if one performs the exact analysis the authors describe, does one obtain the exact same result?)
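
For students unsure what such a reproduction check looks like in practice, here is a hedged sketch (the data, variable names, and reported statistics are all hypothetical placeholders; for an actual paper you would load the authors' posted data instead):

```python
# Sketch of an exact-reproduction check: re-run the authors' stated analysis
# and compare to the reported result. Everything below is a hypothetical
# stand-in for a real published dataset.
import numpy as np
import pandas as pd
from scipy import stats

def check_reproduction(df, reported_t, reported_p, tol=0.01):
    """Re-run the stated two-group t-test and compare to the reported result."""
    t = stats.ttest_ind(df.loc[df.condition == "treatment", "score"],
                        df.loc[df.condition == "control", "score"])
    print(f"Recomputed: t = {t.statistic:.2f}, p = {t.pvalue:.3f}")
    print(f"Reported:   t = {reported_t:.2f}, p = {reported_p:.3f}")
    return abs(t.statistic - reported_t) < tol

# Stand-in for the authors' posted data, e.g. pd.read_csv("study1.csv"):
rng = np.random.default_rng(3)
df = pd.DataFrame({
    "condition": ["treatment"] * 40 + ["control"] * 40,
    "score": np.concatenate([rng.normal(0.5, 1, 40), rng.normal(0, 1, 40)]),
})
print("Exact reproduction:", check_reproduction(df, reported_t=2.31, reported_p=0.024))
```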

** Some might argue that, in contrast to some of what passes for "scientific" psychology, at least Magician Will Fern admits that he creates illusions.