American children do much better at identifying the correct answers to simple scientific tasks than at using evidence from their experiments to explain those answers.
The National Assessment of Educational Progress, often called the Nation's Report Card, asked students in grades four, eight and 12 to perform actual experiments to apply principles they learn in the classroom on a practical level. The results of the 2009 tests were released this week.
"That tells us that our science teaching isn't getting us as far as we need to go," said Chris Dede, professor from the Harvard Graduate School of Education.
Katherine Carroll, an 11th- and 12th-grade chemistry teacher in Waterboro, Maine, said even her best students struggle to explain their conclusions in the lab reports they turn in for her class. She found them more accustomed to questions with one right answer.
"Teachers have moved towards teaching more knowledge, as opposed to the understanding behind that knowledge," Carroll said.
Like Carroll, Dede said kids' difficulty explaining their answers is old news to most teachers and parents, but this is the first time they have concrete evidence demonstrating the problem.
"Having something that is more than just anecdotes, that is rigorous research across a wide range of students, is very helpful because it's a better form of evidence on which to make decisions," Dede said.
The first test, called Hands On Tasks, allotted students 40 minutes to conduct experiments with physical objects. This allowed for a richer analysis of their understanding of the subject than pencil-and-paper tests can provide, according to Alan Friedman, chairman of the National Assessment Governing Board's Assessment Development Committee.
HOTs, however, are nothing new. NAEP tests used them as far back as 1996.
Friedman said the second type of test, Interactive Computer Tasks, went beyond what had previously been measured, testing how students ran their own experiments in simulated natural or laboratory environments with the ability to go back, adjust variables and correct their mistakes on a computer.
"This is a set of skills which in the real world is invaluable," Friedman said, "and which before this we'd never been able to know if students could do this or not."
Friedman said the computer tests are "dramatically more expensive" to design, but traditional assessments cannot measure these same skills.
During ICTs, just over a quarter of high school seniors could both select and explain their correct answers about heating and cooling. About double that share -- 54 percent -- of the eighth-grade group could support correct conclusions with evidence, but only 15 percent of fourth-grade students could do the same in their experiment.
The computer tasks eliminated limits of geography and time, so students could virtually see, for example, how a plant given a certain amount of sunlight would grow without waiting days or weeks to see the actual process.
Though the tests raised significant questions about students' abilities to apply scientific knowledge to the real world, they at least seemed to enjoy taking them, according to Peggy Carr, associate commissioner at the National Center for Education Statistics.
Carr usually observes students losing interest in the traditional NAEP tests. "Not so with these assessments," Carr said.
In the hands-on tasks, female students in every grade outdid their male counterparts by 2 to 4 percentage points on average. Girls also scored slightly better than male students in grades eight and 12 on interactive computer tasks.
This gender gap is a reversal from the traditional NAEP tests, in which eighth-grade boys scored at least four points higher on average than their female peers in 2009 and 2011.
White and Asian/Pacific Islander students outperformed black and Hispanic students in the hands-on tasks, and Asian/Pacific Islander students achieved higher scores on average than other students in all grades' computerized assessments.
The lowest-scoring group in both assessments was 12th-grade black students. They answered 19 percent of computerized questions correctly, whereas their Asian/Pacific Islander counterparts answered 33 percent correctly.