Want to understand your child’s test scores? Here’s what to ignore
Now that the first month of school is over, parents can get ready for the next milestone of the school year – they will soon get reports of the state tests their children took last year.
My estimates show that approximately 26 million students in public schools took statewide tests in reading and math last year. Many of them also took statewide tests in science. These tests provide important information to parents about how well their children are doing in school.
However, my research also shows that when parents receive their child’s test score report, they may have a tough time separating the important information from the statistical gibberish.
What’s more, the results might not even give them accurate information about their child’s academic growth.
Is your child ‘proficient’?
The No Child Left Behind law, enacted in 2002, required all states to set “achievement level standards” in reading and math for grades three through eight, and for one grade in high school, typically 10th or 11th grade. States were also required to develop tests to measure students’ level of “proficiency” in each subject.
The new federal law passed in December 2015, the Every Student Succeeds Act (ESSA), will continue this practice.
As a result, the test reports parents receive classify children into achievement levels such as “basic” or “proficient.” Each state decides what these classifications are called, but at least one category must signify “proficient.”
These achievement level categories are described on the test score reports, so parents can easily understand them. For example, I find it helpful each year to see if my sons reach proficiency in each subject area.
But children’s test scores in a given year, and their achievement level, are not the only information reported in some states. A new statistical index, called a “student growth percentile,” is finding its way into the reports sent home to parents in 11 states. Twenty-seven states use this index for evaluating teachers as well.
Although a measure of students’ “growth” or progress sounds like a good idea, student growth percentiles have yet to be supported by research. In fact, several studies suggest they do not provide accurate descriptions of student progress and teacher effectiveness.
What does it mean?
What exactly are “student growth percentiles”?
They are indexes proposed in 2008 by Damian W. Betebenner, a statistician who suggested they be used as a descriptive measure of students’ “academic growth” from one school year to the next. The idea was to describe students’ progress in comparison to their peers.
Like the growth charts pediatricians use to describe children’s height and weight, student growth percentiles range from a low of one to a high of 99. However, their calculation involves a lot more error than physical measurements such as height and weight. Our research at the University of Massachusetts Amherst indicates substantial error in their calculation.
Student growth percentiles are derived from test scores, which are not perfectly accurate descriptions of students’ academic proficiency: Test scores are influenced by many factors, such as the questions asked on a particular day, students’ temperament, their level of engagement when taking the test or just the methods used to score their answers.
Each student’s growth percentile is calculated using at least two different test scores, typically a year or more apart. The most recent test scores of a student are then compared to the most recent test scores of students who had similar scores in previous years. This is to see which of those students had higher or lower scores this year.
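For readers who want to see the idea in concrete terms, here is a minimal sketch of the comparison described above. It is an illustration under stated assumptions, not the method any state uses: the function name `growth_percentile`, the `band` parameter and the sample scores are all hypothetical, and the published approach relies on quantile regression rather than the fixed score band used here.

```python
def growth_percentile(prior, current, cohort, band=5):
    """Rank a student's current score among peers with similar prior scores.

    cohort is a list of (prior_score, current_score) pairs for other students.
    band is a hypothetical cutoff: peers are students whose prior score is
    within `band` points of this student's prior score.
    """
    # Keep only the current scores of students who started from a similar place.
    peers = [cur for pri, cur in cohort if abs(pri - prior) <= band]
    if not peers:
        return None  # no comparable students, so no percentile can be given

    # Share of similar-starting peers who scored lower this year.
    below = sum(1 for cur in peers if cur < current)
    pct = round(100 * below / len(peers))

    # Reported growth percentiles run from 1 to 99.
    return min(99, max(1, pct))


# Hypothetical cohort: four students started near 200, one started at 250.
cohort = [(200, 210), (202, 220), (198, 230), (201, 240), (250, 300)]
print(growth_percentile(200, 235, cohort))
```

In this made-up cohort, a student who went from 200 to 235 outscored three of the four peers who started near 200. Every score fed into the function is itself an imperfect measurement, which is where the trouble begins.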
The problem, however, is that each of the calculations carries some measurement error, and further calculations only compound that error, so much so that the results end up with twice as much error. No statistical sophistication can erase this error.
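One way to see how error could double is a small simulation. This is only a sketch under simplifying assumptions that are mine, not taken from the research cited here: each test score carries an independent random error of the same size, and "growth" is treated as the simple difference between this year's and last year's score. The function name and default values are hypothetical.

```python
import random
import statistics


def error_variance_ratio(n_students=20000, error_sd=10.0, seed=0):
    """Compare the error in a two-score difference with the error in one score.

    Assumes independent, normally distributed measurement errors of equal
    size on both tests (an illustrative assumption).
    """
    rng = random.Random(seed)
    single_errors = []
    gain_errors = []
    for _ in range(n_students):
        e_last = rng.gauss(0, error_sd)  # error in last year's score
        e_this = rng.gauss(0, error_sd)  # error in this year's score
        single_errors.append(e_this)
        # The error in the difference combines both years' errors.
        gain_errors.append(e_this - e_last)
    return statistics.pvariance(gain_errors) / statistics.pvariance(single_errors)


print(error_variance_ratio())
```

Under these assumptions the ratio comes out close to 2: the error variance of a difference between two scores is about double that of a single score, because the errors from both years add up rather than cancel.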
The question is, why are so many states using such an unreliable measure?
Using it for accountability
The use of student growth percentiles is due in part to a desire to see how much students learn in a particular year, and to link that progress to accountability systems such as teacher evaluation.
In 2010, the Race-to-the-Top grant competition invited states to come up with innovative ways of using test scores to evaluate teachers, which paved the way for this new measure of “growth” to be quickly applied across many states.
However, the use of student growth percentiles began before research was conducted on their accuracy. Only now is there a sufficient body of research to evaluate them, and all studies point to the same conclusion – they contain a lot of error.
In addition to our research at the University of Massachusetts Amherst, research on the accuracy of student growth percentiles has been conducted by education nonprofits such as WestEd, Educational Testing Service and other research institutions. Researchers J.R. Lockwood and Katherine E. Castellano recently concluded that “A substantial research base already notes that student growth percentile estimates for individual students have large errors.”
However, many states seem to be unaware of these research findings. Massachusetts even goes so far as to classify children with growth percentiles less than 40 as “lower growth” and children with growth percentiles greater than 60 as “higher growth.”
Measuring teacher performance
As I mentioned earlier, 27 states are using student growth percentiles to classify teachers as “effective” or “ineffective.” Research on the use of growth percentiles for this purpose indicates they could underestimate the performance of the most effective teachers, and overestimate the performance of the least effective teachers – the exact opposite of what these states are trying to do with their teacher evaluation systems.
A recent report by WestEd evaluated the use of student growth percentiles for evaluating teachers and concluded they “did not meet a level of stability” that would be needed for such high-stakes decisions.
Let’s go back to traditional measures
I believe student growth percentiles have taken us a step backwards in the use of educational tests to improve student learning.
Traditional measures of children’s performance on educational tests, such as whether they are “proficient” in a given year and their actual test scores, give a good idea of how well they performed in math or reading in a particular year.
Traditional percentile ranks are also still reported on many educational tests, just as they were when we as parents were in school. Traditional percentile ranks compared us to a national or state group in a given year, rather than comparing us to how other kids in the nation or state were “growing” across different tests they took in different years, as student growth percentiles attempt to do.
Given what we now know about student growth percentiles, my advice to parents is not only to ignore them on their children’s test score reports, but also to contact their state department of education and ask why they are reporting such an unreliable statistic.
Developing measures of how much students have learned over the course of a year is a good goal. Unfortunately, student growth percentiles do not do a good job of measuring that.