Post by Ben Seipel, University of Wisconsin-River Falls/California State University, Chico; with Gina Biancarosa, University of Oregon; Sarah E. Carlson, Georgia State University; and Mark L. Davison, University of Minnesota.

Reading is an amazingly simple, yet complex construct with a modest goal: understanding.

For most students, the ability to read and understand text and symbols demonstrates the seamless integration of several subskills. These fundamental skills include the abilities to recognize letters, map sounds to those letters, piece written words together fluently, understand the meaning of individual words and phrases, utilize background knowledge, and generate inferences. When any one of these reading subskills falters, so does comprehension. Consequently, for students who struggle with reading, educators and parents need to know why comprehension is faltering. The purpose of this blog post is to review ongoing issues with traditional reading comprehension assessments, describe the development of research-based tools that address those issues, and introduce a new diagnostic assessment that utilizes advancements in identifying reading comprehension struggles: MOCCA (Multiple-Choice Online Causal Comprehension Assessment).

For decades, there have been reliable measures and interventions to help students who struggle with the lower-level components of reading. For example, students who struggle with mapping word sounds to written words can be identified with assessments such as the Woodcock-Johnson IV, the Dynamic Indicators of Basic Early Literacy Skills (DIBELS) phoneme segmentation fluency measure, or even teacher-designed assessments. After taking such assessments, students can receive tailored phonemic awareness or phonics instruction to improve their abilities. Similarly, students identified by a vocabulary test (e.g., Peabody Picture Vocabulary Test or Expressive Vocabulary Test) as lacking sufficient vocabulary can receive vocabulary instruction. For higher-level components of reading—specifically reading comprehension—there are many measures such as the Nelson-Denny Reading Test, Iowa Test of Basic Skills, or virtually any state-selected reading comprehension test. Traditionally, these reading comprehension measures have been able to determine whether or not a student struggles with comprehension, but unable to determine the root cause of the struggle. Although these standardized assessments are useful for knowing how students are performing, they do not help teachers determine how to help struggling readers. 

Fortunately, recent developments in cognitive psychology and educational measurement have led to new tools that are coming online to aid administrators, teachers, students, and their parents in the identification of comprehension processing issues in order to then develop corresponding interventions or modify comprehension instruction.

Recent research has found that students who struggle with comprehension, but don’t struggle with the lower-level components of reading (e.g., decoding, fluency), do not all struggle in the same ways (Oakhill & Cain, 2012). To understand how these students struggle differently, it is important to recognize that a key component of comprehension when reading a narrative text is the ability to follow a causal chain of events in a story and generate causal inferences. For example, consider the actions of the main character in the following mini-narrative text:

Carina, a firefighter, was relaxing and watching her favorite show on Netflix. She was also enjoying her big bowl of strawberry ice cream when all of a sudden she was called to the fire station for an emergency. Before she left, she went to the freezer.

While reading the story, a good comprehender can infer that Carina did not eat all of her ice cream and had to place the ice cream in the freezer to prevent it from melting. A poor comprehender may not be able to make that causal connection while reading. Instead, a poor comprehender may be reading the text superficially and find no gaps requiring connections to missing information or may be trying to make connections, but the connections are to extratextual information (e.g., personal associations, elaborations). In either case, a causal inference would not be generated.  

Research indicates that when poor comprehenders are asked about the causal connections in a story, they struggle in at least two specific ways (Carlson, Seipel, & McMaster, 2014; Rapp et al., 2007; Seipel, Carlson, & Clinton, 2017). Findings from think-aloud studies (where a student reads a text aloud and then states what s/he is thinking) show that one group of poor comprehenders tends to generate paraphrases or repeat text verbatim, but does not generate causal connections. Thus, during a think aloud with the example text above, a “paraphraser” might state that “Carina went to the freezer” without indicating why (a paraphrase is not generally considered an inference, but it is text-based). The other group of poor comprehenders tends to make elaborations (a.k.a. lateral connections), idiosyncratic connections to their personal background knowledge (i.e., extratextual information). In the text example above, a “lateral connector” or “elaborator” might indicate that “Strawberry ice cream usually has real strawberries and artificial flavoring.” Additionally, there is evidence that these two groups of students respond differently to whole-class instruction and intervention (McMaster et al., 2012). The instructional strategy that is best for paraphrasers needs to be modified for lateral connectors. Thus, when it comes to interventions, one size does not fit all, and having appropriate assessment tools could enable teachers to choose the most appropriate intervention for each student.

It is important to note that paraphrases, text repetitions, personal associations, and elaborations are not bad comprehension processes or strategies—even good comprehenders use these processes during reading. In fact, specific classroom instructional reading strategies such as close reading rely on some of these skills. However, when these processes are used to the detriment of developing and maintaining causal connections in a text, comprehension will falter. Furthermore, when a reader does this repeatedly, they need an intervention. It is also important to note that using think-aloud protocols to identify student comprehension problems is time and energy prohibitive. Classroom teachers generally do not have the time or resources to collect, code, and analyze think alouds for an entire class to understand how students are processing a text.

Another set of advancements that have occurred in the field is the refinement of online testing, cognitive diagnostic testing, and statistical models. Greater access to computers and online testing has made testing easier to administer with quicker results. Additionally, cognitive diagnostic testing has made advancements demonstrating that cognitive processes can reliably be identified and classified with distractor-driven and hierarchically-ordered multiple-choice assessments. Consequently, new and refined statistical models are enabling researchers to classify students based on patterns of their incorrect responses.

With these progressions in the field, we developed and designed MOCCA to be an easy-to-use, untimed diagnostic assessment for classroom use based on item structure and comprehension processing content.

First, with regard to item structure, MOCCA uses a familiar multiple-choice format that is easy to administer, score, and interpret. It also capitalizes on the structure and predictive validity of reading maze tasks. In a traditional maze task, every nth word is deleted and replaced with three choices. One choice is the correct word, and the other two words are generated at random. MOCCA takes this maze approach to the next level by deleting a whole sentence from a paragraph. From three carefully crafted responses (further described below: causally coherent inference, paraphrase, lateral connection), the student is asked to select the sentence that best completes the paragraph. In order to answer a traditional maze item correctly, a student need only comprehend the text at the sentence level. However, with MOCCA, a student must comprehend the text at the discourse level. Figure 1 shows a practice item from the test directions.

Figure 1. MOCCA practice item
Source: Ben Seipel, MOCCA
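
For readers who want to see the traditional maze format in concrete terms, here is a minimal Python sketch of the word-level procedure described above. It is only an illustration, not the procedure used by any published maze test or by MOCCA: the function name `make_maze_items`, the distractor word pool, and the sample sentence are all hypothetical, and punctuation handling is ignored.

```python
import random


def make_maze_items(text, n, distractor_pool, seed=0):
    """Replace every n-th word with three choices: the correct word plus
    two distractors drawn at random from a word pool (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    words = text.split()
    items = []
    for i in range(n - 1, len(words), n):
        correct = words[i]
        # Exclude the correct word so the three choices are always distinct.
        candidates = sorted(set(distractor_pool) - {correct})
        distractors = rng.sample(candidates, 2)
        choices = [correct] + distractors
        rng.shuffle(choices)
        items.append({"position": i, "answer": correct, "choices": choices})
    return items


# Made-up sentence and word pool, for illustration only.
sentence = "Carina went to the freezer before she left for the station"
pool = ["bicycle", "window", "green", "quickly"]
items = make_maze_items(sentence, n=5, distractor_pool=pool)
for item in items:
    print(item["position"], item["choices"])
```

Because the distractors are random words, a reader can often rule them out from the surrounding sentence alone, which is the limitation MOCCA's sentence-level deletion is meant to address.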

Second, with regard to item content, MOCCA is designed around decades of think-aloud research on cognitive processes used during reading comprehension of narrative texts. Specifically, each MOCCA item is a separate seven-sentence story built around a causal chain of events. As indicated above, the sixth sentence of each story is deleted and replaced with three response choices. The three responses are designed to mimic the types of responses found in think alouds that are either typical of good comprehenders (a causally coherent inference) or the processes of the aforementioned struggling comprehenders (a paraphrase or a lateral connection).

The “correct” response completes the causal chain of a story (i.e., closes the gap when the sentence is missing from the text). The second choice is a paraphrase of the main or updated goal of the story. The third choice is a lateral connection or elaboration of the fifth sentence in the story but does not complete the causal chain of the story. Because each item is an independent story with consistent response types, MOCCA provides three separate scores that can help diagnose comprehension difficulties: a correct score, paraphrase score, and lateral connection score. In turn, these scores can be used to distinguish students who predominantly use paraphrases or predominantly use lateral connections and may need different instructional strategies. Because each item contains all three response types, overall raw scores and subscores can be calculated to determine a propensity to a comprehension process while reading.
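
To make the three-score idea concrete, the following Python sketch tallies a student's choices by response type and reports which incorrect type dominates. It is a simplified illustration under our own assumptions, not MOCCA's actual scoring or classification procedure (which relies on statistical models): the labels, function names, and the sample responses are hypothetical.

```python
from collections import Counter

# Hypothetical labels for the three response types described in the post.
CORRECT = "causal"
PARAPHRASE = "paraphrase"
LATERAL = "lateral"


def score_responses(responses):
    """Count how often each response type was chosen (one label per item)."""
    counts = Counter(responses)
    return {
        "correct": counts[CORRECT],
        "paraphrase": counts[PARAPHRASE],
        "lateral": counts[LATERAL],
    }


def predominant_error(scores):
    """Return the more frequent incorrect response type, or None on a tie
    (including the case where the student made no errors)."""
    if scores["paraphrase"] == scores["lateral"]:
        return None
    return "paraphrase" if scores["paraphrase"] > scores["lateral"] else "lateral"


# Made-up responses for a 10-item illustration (actual forms have 40 items).
student = [CORRECT] * 4 + [PARAPHRASE] * 5 + [LATERAL] * 1
scores = score_responses(student)
print(scores)
print(predominant_error(scores))
```

A simple count like this only hints at a tendency; deciding when a pattern is strong enough to classify a student is a separate question.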

MOCCA has been carefully constructed over a three-year period. We began by constructing 160 items at each grade level. All items were vetted and reviewed by 3rd-, 4th-, and 5th-grade teachers to ensure the content, vocabulary, and readability were grade appropriate and unbiased. We also corroborated and extended these findings using Coh-Metrix (McNamara, Louwerse, Cai, & Graesser, 2013). Coh-Metrix is an automated tool for calculating a variety of linguistic features such as readability, lexical diversity, and cohesion. Next, based on statistics from a pilot study in Spring 2015, the set of items was reduced to 120 in each grade and some of these items were rewritten. After a field test that included an evaluation of item fairness by gender and ethnicity, the items were revised even further in Spring 2016.

MOCCA has three forms each for grades 3 to 5. Based on data from a normative sample, the forms will be equated within and across grades in Spring 2018 so that, for purposes of progress monitoring, students can be tracked longitudinally without administering the same form twice. Within a grade, stories are assigned to forms so that the average story reading level and the number of words are as nearly equal as possible. Within the reading level and number of words constraints, stories were randomly assigned to forms within a grade. All forms have 40 stories (items) and all stories have exactly seven sentences with one missing (i.e., the sixth sentence). For each grade, story reading levels range from one level below grade to one level above grade. For instance, Grade 3 forms contain stories with reading levels from grades 2 to 4 with a mean of 3.0 on the Flesch-Kincaid scale.
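
For readers unfamiliar with the Flesch-Kincaid scale, the sketch below computes a grade level from word, sentence, and syllable counts using the standard published formula. This post does not describe how the MOCCA team computed reading levels, so the function name and the counts in the example are hypothetical and serve only to show how the scale behaves.

```python
def flesch_kincaid_grade(total_words, total_sentences, total_syllables):
    """Standard Flesch-Kincaid grade-level formula, applied to raw counts."""
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Made-up counts for a seven-sentence story: 70 words, 91 syllables.
grade = flesch_kincaid_grade(total_words=70, total_sentences=7, total_syllables=91)
print(round(grade, 2))  # about 3.65, i.e., roughly a mid-third-grade level
```

Longer sentences and longer words both raise the estimated grade level, which is why balancing forms on reading level and word count together is a sensible constraint.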

Given the online, computer-based administration, we are able to monitor comprehension efficiency. Although measuring comprehension efficiency was not an original design feature of MOCCA, it quickly became a desired tool for our research team and for teachers. Consequently, we have developed features to determine whether a good comprehender is fast or slow. This is important because slow readers have traditionally been assumed to be poor comprehenders, but this is not always the case. MOCCA scores and comprehension efficiency rates have helped teachers identify students who are comprehending well but may need assistance in developing comprehension fluency.

Finally, we are continuing to refine the procedure for identifying and classifying struggling comprehenders through the development of Item Response Theory (IRT) models. Because MOCCA yields three distinct scores (causally coherent, paraphrase, and lateral connection), this work has led to the development of new, complex IRT models that use multiple scores.

Currently, we are in the final stages of gathering a nationwide, demographically and regionally representative sample to validate MOCCA. One challenge that we are currently facing is testing fatigue and resistance from teachers and administrators to additional testing in schools. Although this is understandable, it is unfortunate because it limits the ability to validate new assessments such as MOCCA. Only through validated assessment (whether standardized, diagnostic, or otherwise) can students be identified for the services and interventions they need. We will continue to communicate and collaborate with teachers and administrators in order to help alleviate their concerns.

Beyond this challenge, we remain hopeful that MOCCA will be useful and used by teachers. We anticipate that future iterations of MOCCA will be more efficient (e.g., computer adaptive), more genre inclusive (e.g., narrative and information texts), and serve a larger population (e.g., grades 2 to adult). Future research will be directed toward improving MOCCA itself and developing guidance for teachers, students, and parents as to how instruction and learning can be optimized with the data provided by MOCCA.

Please view our YouTube Channel for a video preview of MOCCA. Those interested may also follow MOCCA on Twitter for news and updates as they become available.

This post is part of a special series curated by APA Division 15 Past President Bonnie J.F. Meyer. The series, centered around her presidential theme of "Welcoming and Advancing Research in Educational Psychology: Impacting Learners, Teachers, and Schools," is designed to broaden the dissemination and impact of meaningful educational psychology research. Those interested can learn more about this theme in Division 15's 2016 Summer Newsletter.


Carlson, S. E., Seipel, B., & McMaster, K. (2014). Development of a new reading comprehension assessment: Identifying comprehension differences among readers. Learning and Individual Differences, 32, 40-53.

McMaster, K. L., van den Broek, P., Espin, C. A., White, M. J., Rapp, D. N., Kendeou, P., Bohn-Gettler, K. M., & Carlson, S. (2012). Making the right connections: Differential effects of reading intervention for subgroups of comprehenders. Learning and Individual Differences, 22(1), 100-111.

McNamara, D. S., Louwerse, M. M., Cai, Z., & Graesser, A. (2013). Coh-Metrix version 3.0. http://cohmetrix.com
