This is the second half of a two-part piece about the current state of educational testing in our K-12 schools. The first half covered the reliability of standardized testing, whether we should be using standardized tests for younger children, and digital educational design. Click here to read Part 1.
The tests don’t test what we think they test
My informant pointed out a huge problem with her third graders taking the test: much of the test had no audio component, and assumed that they could all read and write well enough. But as anyone who has taught children to read will tell you, some kids just learn later. They don’t learn any worse, and it has nothing to do with their overall intelligence. Late readers are not less successful in life.
But here’s what she had to say about her group of kids: “There was no audio component to the math, so a lot of the test was really a reading test. If they couldn’t read the paragraphs, they couldn’t answer the questions. And they sure as heck couldn’t write a paragraph.” The Common Core assumes that if you understand something, you should be able to write about it. (I won’t get into the question of why any reasonable 8-year-old would actually want to write about math!) But clearly, the less able readers were not being tested on their understanding of math—they were being tested on reading, which depressed their math scores.
On top of that, this test is also a test of a skill most kids don’t learn until middle school: typing. “And OMG they have no typing skills. I’m not sure a 3rd or 4th grader needs typing skills in general, but they were not ready to type for a grade. It was painful to watch.” Again, their math skills took a backseat to something the test designers didn’t even take into account. If we really wanted to find out their mastery of math, we’d let the teachers read the instructions out loud and type for the kids, or install voice recognition software so they could dictate.
Standardized tests have been around for a long time, and over those long years, we have learned a lot about them. Here are some things we know about the tests themselves:
- They are inherently biased. They can be made better and better through tinkering, but they can never reach the stated goal of being instruments that find out “what a child knows” because some children, for a variety of reasons, will never do well on them regardless of their mastery of a subject.
- They are not good predictors of much of anything except how well a child will do on his next standardized test. The SAT, a much better test than any ever designed by a state government, is retooling itself because of the much-publicized research that shows, conclusively, that a good SAT score predicts absolutely nothing. Except, maybe, a good GRE or LSAT score!
- They do not measure the worth of a teacher. Great teachers have all sorts of effects on their students’ lives, but improving their students’ standardized test scores is not a given effect. You can have a great teacher who does amazing things with kids but does not bring up their test scores.
- They do not measure the effectiveness of a school. There are so many other factors that are as important as test scores, or even more important. Test scores are one tiny factor that administrators can use to judge schools, but they are far from the most important one.
Yet we unreasonably expect that this test will somehow:
- Be better at testing all children at their own level. See the point above about the inevitable bias. These tests won’t do any better than other tests. Sure, the kids who have trouble tracking from a test booklet to the correct bubble to fill in might do better, but these tests will inevitably end up biased against some other group of kids.
- Predict a child’s success outside of test-taking. No, these tests will not predict any such thing. They will merely predict how well the child will do on the next standardized test. Period.
- Show how well a teacher is teaching. This is absolute idiocy and any idea that teachers should be punished or rewarded based on test scores is rooted in a deep cultural distrust of teachers, not in any sound educational theory. Some teachers may indeed bring up their students’ test scores, but I sure hope those teachers are also doing something useful for their students.
- Give us a way to “rate” schools. I have my own personal way to rate a school. I walk into the school and watch. In a great school, the students will be happy and relaxed. Yes, they may also be deeply focused on what they are doing, but that doesn’t mean they’re not also happy and relaxed. The parents will enjoy the school and feel welcome there. The teachers will feel energized to come to work; they will feel a partnership with the school administrators, other teachers, their students, and the parents. None of these important factors is represented in a composite test score. Yes, the score is a useful piece of information, but it alone does not rate a school.
Until we as a culture deal with our unreasonable expectations, it doesn’t matter how “good” the test is. A standardized test is a measure of how well students take standardized tests. In other words, it’s a measure of how much vocabulary they have heard in their few years on this earth. It’s a measure of what their parents discuss at the dinner table, assuming they have parents, a dinner table, and food to put on it. It’s a measure of how often the people they spend the most time with (and this is not teachers) talk about numbers in real life so that they become comfortable with number sense before being required to learn other skills that build on number sense.
A standardized test is also a measure of a child’s personality—nervous, anxious children don’t test as well regardless of their background. A child who didn’t have protein with breakfast won’t test as well. A child in the first day or two of coming down with the flu won’t test as well as she would otherwise. A child who lives daily with the fear that his older brother will be shot by his friends won’t test as well as he should. A child who is told he is too stupid to learn won’t do well on tests, and a child who has been overpraised about her intelligence (ironically enough) won’t test as well.
In conclusion, there are simply too many factors within the messiness of one person’s little life to put such weight on the results of a test. Sure, let’s make a better test, because we always need to improve the information we gather. But let’s not think that this test is going to solve any educational problems we have. It’s just a test, imperfect, limited in scope, and vulnerable to bias and technical problems. Education is just too important and complex to be judged by such a narrow, flawed instrument.