
Rating and testing Part III: So....what about those test scores?

In the first two posts in this series, we talked about two factors that might influence student scores: school resources and student family income. Now we're going to tackle a cornerstone of modern school reform: standardized testing.

If you recall, school reformers' theory of the case is that standardized tests will give parents a way to compare schools and then make an informed decision about where to send their kids. In this post we're going to look at whether the tests themselves can carry that weight.

A lot of the complaints about standardized testing concern its effects on teaching and learning. Opponents of high-stakes standardized testing argue that it measures only a limited range of skills, in a way that doesn't necessarily reflect a student's true talents. They may argue that it doesn't measure crucial skills like creativity, problem-solving, or leadership, and that resources devoted to ELA/math test prep crowd out other subjects like science and social studies. They point to the amount of the school year devoted to test-taking and preparation, and say that, given the high stakes, teachers feel forced to teach to the test, boring students and leaving less time for innovative teaching approaches that truly engage them. All of these are useful criticisms, but to me they miss an essential point.

Standardized tests were never meant to evaluate schools, teaching or learning. 

I'm not an education scholar, so instead I'm going to try to summarize W. James Popham's article on standardized tests and why they shouldn't be used to measure education quality. Popham is an educational assessment expert, a former president of the American Educational Research Association (AERA), the founding editor of the journal Educational Evaluation and Policy Analysis, and an Obama-era appointee to the National Assessment Governing Board. In short, this guy has not only studied assessment, he is one of THE guys who studies assessment. Here's my summary of what he has to say about using standardized tests to measure a school's teaching and learning.

  • One point of standardized tests is to compare students to each other, so the goal of test-makers is to find questions that about 50% of students will answer correctly - questions that help them differentiate students from each other. This drives a lot of test design. 
  • To keep tests a reasonable length, test-makers can only sample parts of a content area. Unfortunately, what gets sampled does not always match up with what students have been taught (one study found that 50-80% of the tested content had NOT been covered in class). 
  • To make matters worse, questions that many students get *right* are not that useful to test-makers - they don't help spread out scores and differentiate between students. So topics that most teachers would agree are important, and cover in detail, might not make it onto the test: "the better the job that teachers do in teaching important knowledge and/or skills, the less likely it is that there will be items on a standardized achievement test measuring such knowledge and/or skills."
  • He argues that standardized tests measure three kinds of knowledge: 1) what is taught in schools, 2) students' "native intelligence", and 3) students' out-of-school learning. The first item measures academic knowledge, but the second two measure different kinds of knowing. Some questions in the second category are more about logic than academic skills. And questions in the third category rely on students' knowledge about the world in general. Why would test-makers include the last two categories of questions? Popham argues that they both are extremely effective at differentiating among students. Test-makers know that students come in with varying levels of logical/verbal problem-solving ability, and a wide range of economic backgrounds - so they know that these questions will help spread students out over a wide range of scores. 
To me, this last item seems particularly damning. It implies not only that standardized tests are not well-suited to measure the effectiveness of teaching, but that test-makers actively choose to include questions that disadvantage low-income kids. I'm not sure how they can justify that to themselves, but once you know this, I'm not sure how *we* can justify continuing to use them. Popham argues that standardized tests are useful to help a student understand their strengths and weaknesses, or growth over time, but should not be used to measure educational quality. 
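A quick numeric sketch of why test-makers gravitate toward questions that about half of students answer correctly (this illustration is my own, not from Popham's article): treat a single right/wrong question as scoring 0 or 1. Its statistical variance, a rough proxy for how much the question "spreads out" students, is p × (1 − p), where p is the share of students who get it right, and that quantity peaks at p = 0.5.

```python
# Illustration (my own, not Popham's): a right/wrong question answered
# correctly by a proportion p of students scores 0 or 1, so its score
# variance is p * (1 - p). Variance here stands in for how well the
# question spreads students apart, and it is largest when p = 0.5.

def item_variance(p):
    """Score variance of a single right/wrong question with pass rate p."""
    return p * (1 - p)

for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"pass rate {p:.0%}: variance {item_variance(p):.2f}")
```

A question that nearly everyone gets right (p near 1) contributes almost nothing to spreading out scores, which is exactly Popham's point about well-taught material being dropped from the test.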

So. The educational reform theory states that standardized tests should be used to evaluate schools, and teachers within those schools, no matter what kinds of students are being taught. These scores should be used to label schools as either successful or failing, which in turn will help parents decide where to (or not to) send their child to school. These choices will determine schools' funding, with the logical end being that low-score, "failing" schools will shut down and high-score, "succeeding" schools will have packed classrooms and generous budgets. But what has happened? 

Across our state, teachers have spent enormous amounts of time and energy trying to prepare students for testing, in order to get scores up. Charter schools, understanding that every dollar was at stake, have made choices that resulted in them serving *much* smaller numbers of English language learners, low-income students, or students with moderate to severe disabilities - all of whom were left in more concentrated numbers in district schools. Boston charter high schools also have high rates of attrition - many students who start don't finish - calling into question the glowing graduation, MCAS and college-acceptance rates that appeal to parents. Are they doing a better job teaching - or are they simply educating a different, and more select, cohort of kids than the district? Schools with low scores fell into a vicious cycle of losing students, losing funding, losing resources and teachers, receiving worse test scores, and losing more students, until they closed or fell into state receivership. In the meantime, this meant that low-income kids at low-scoring schools were receiving even less investment than kids at higher-scoring schools. Whole schools have been shuttered and the entire staff fired over low test scores. Whole schools have been threatened with closure or state takeover if they don't raise scores. 

At the end of the day, test scores may tell us something about the teaching at a school. But it's much more likely that they tell us about the demographics of the students at that school. While poor students may need more resources to thrive academically, it is not atypical for them to be in schools with below-average resources (worse student-teacher ratios, fewer libraries, less experienced teachers), which does not set them up for success. What's more, the tests themselves are designed for poor kids to score lower. Intentionally. 

This means that poor kids are more likely to be clustered at designated "low-performing" schools, and I would guess that they'd be more likely to experience school closure and state takeover; more likely to face under-enrollment and the budget cuts that come with it; less likely to have the services they need; more likely to sit in schools in tumult, with teachers trying to regroup after being reformulated or fired during a "turnaround" process; and more likely to be handed over to disastrous charter management after state takeover. 

WHAT. THE. HELL?!

And at the end of the day, I still don't know where to send my kid. Or rather, I just have very little to go on from the state/city. My selection process was to post a question in a neighborhood Facebook group asking local parents to tell me about their local elementary schools. Any nearby school that got a positive review got added to my list, and I submitted that to the BPS K1 lottery. What I found was that, not knowing all of this about test scores, I did shy away from the schools with the lowest scores. There was one in particular that stayed off my list until I heard from another parent that the school was beloved, with parents who raved about the community and teaching. How could this be? How could a school that is literally labeled "lowest performing" and in danger of state takeover be beloved? After all that I've learned, I can only conclude that it's because scores often don't tell us much about the quality of a school - its environment, culture, innovative teaching, etc. They have a high risk of just telling us about the incomes of the families who go there. 

If you want to read more, check out this 2012 Massachusetts statement against high-stakes testing, signed by 172 MA education professors and researchers. They discuss the impacts of high-stakes testing, make policy recommendations, and share the research behind their position. 
