I have just come back from the Association for Psychological Science (APS) meetings, where I heard an interesting symposium chaired by Philip Ackerman (Georgia Tech). In that symposium Angela Duckworth (U. Pennsylvania) presented a paper with a message that I think ought to be widely distributed, for it touches on what some think is a paradox in the field.
(Truth in packaging: Dr. Duckworth's thesis impressed me because I agree with it! See my book HUMAN INTELLIGENCE, and the part about testing. I wish I had heard Dr. Duckworth before the book was published, for she stated the ideas better than I did.)
The thesis, which applies to educational achievement testing and personality testing as well as intelligence testing, is that we can only do so much with what Robert Mislevy (U. Maryland) has called the "drop from the sky" method of testing. In "drop from the sky" testing an examiner poses a set of questions to the examinee, and does so out of the context of the examinee's normal life, and does so in a limited time. (Timed vs. power tests are not the issue here. The typical power test still has to be completed in a testing session of one to three hours.) Binet, and over a century of research after him, have shown that you can evaluate some of a person's cognitive capabilities by the "drop from the sky" method. You can evaluate a person's superficial knowledge of, say, the American Civil War by asking questions like "Who were the leaders?" and "Who won the battle at Gettysburg?" What you can't do is ask a question like "What was the mix of social, economic, and religious issues that led to the war?" You can also evaluate a person's capacity for attention and memory, within brief limits. You cannot evaluate a person's ability to order his or her efforts to achieve different goals (including choosing studying over partying), or the person's ability to look at things from various perspectives. More generally, you cannot test any capability that is only displayed over time. Yet these capabilities are an extremely important part of a person's cognitive abilities. Here's a historical example.
I once wrote a paper contrasting the leadership styles of Abraham Lincoln and Captain Bligh (of Mutiny on the Bounty fame, and the movies did him wrong! He was a distinguished commander in the Napoleonic Wars and died a Vice Admiral. But I digress.) I pointed out that Bligh was superb at sizing up situations and taking action, but he was terrible at weighing the nuances of situations over time. Lincoln was just the opposite. I would never want a Bligh in the White House, but I would not want to be on an airplane with a Lincoln in command during a storm. Our assessment procedures in intelligence, education, and personality are oriented toward evaluating Bligh characteristics, not Lincoln characteristics.
What Duckworth did was to apply this sort of reasoning, without the historical allusions, to modern education. She pointed out that intelligence tests draw on the capacities that are useful in taking a "high stakes" test, such as the NO CHILD LEFT BEHIND tests. (I'd add that they also provide useful information for predicting decision-making ability in brief, high-stress periods outside the educational field.) Grades, on the other hand, can provide information about behavior over time. But grades have their problems, not the least of which is that they reflect the social interactions between the grader and the student. Duckworth pointed out that sweet, reasonable students may not be terribly bright. I'd add that I can think of some really obnoxious people who are very bright. What we need to do is to develop a way to watch, and evaluate, cognitive behavior that develops over time, but that is not influenced by social interactions. (Social interactions are important in themselves, but should be evaluated separately.)