
Rating and testing Part III: So....what about those test scores?

In the first two posts in this series, we talked about two factors that might influence student scores: school resources and student family income. Now we're going to tackle a cornerstone of modern school reform: standardized testing.

If you recall, school reformers' theory of the case is that standardized tests will give parents a way to compare schools and then make an informed decision about where to send their kids. In this post we're going to look at whether the tests themselves can carry that weight.

A lot of the complaints about standardized testing concern its effects on teaching and learning. Opponents of high-stakes standardized testing argue that it measures only a limited range of skills, in a way that doesn't necessarily reflect a student's true talents. They may argue that it doesn't measure crucial skills like creativity, problem-solving, or leadership, and that resources devoted to ELA/math test prep crowd out other subjects like science and social studies. They point to the amount of the school year devoted to test-taking and preparation, and say that, given the high stakes, teachers feel forced to teach to the test, boring students and leaving less time for innovative teaching approaches that truly engage them. All of these are useful criticisms, but to me they miss an essential point.

Standardized tests were never meant to evaluate schools, teaching or learning. 

I'm not an education scholar, so instead I'm going to try to summarize W. James Popham's article on standardized tests and why they shouldn't be used to measure education quality. Popham is an educational assessment expert, a former president of the American Educational Research Association (AERA), the founding editor of the journal Educational Evaluation and Policy Analysis, and an Obama-era appointee to the National Assessment Governing Board. In short, this guy has not only studied assessment, he is one of THE guys who studies assessment. Here's my summary of what he has to say about using standardized tests to measure a school's teaching and learning.

  • One point of standardized tests is to compare students to each other, so the goal of test-makers is to find questions that about 50% of students will answer correctly - questions that help them differentiate students from each other. This drives a lot of test design. 
  • To keep tests a reasonable length, test-makers can only sample parts of a content area. Unfortunately, what gets sampled does not always match up with what students have been taught (one study found that 50-80% of the tested content had NOT been covered in class). 
  • To make matters worse, questions that many students get *right* are not that useful to test-makers - they don't help spread out scores and differentiate between students. So topics that most teachers would agree are important, and cover in detail, might not make it onto the test: "the better the job that teachers do in teaching important knowledge and/or skills, the less likely it is that there will be items on a standardized achievement test measuring such knowledge and/or skills."
  • He argues that standardized tests measure three kinds of knowledge: 1) what is taught in schools, 2) students' "native intelligence", and 3) students' out-of-school learning. The first item measures academic knowledge, but the second two measure different kinds of knowing. Some questions in the second category are more about logic than academic skills. And questions in the third category rely on students' knowledge about the world in general. Why would test-makers include the last two categories of questions? Popham argues that they both are extremely effective at differentiating among students. Test-makers know that students come in with varying levels of logical/verbal problem-solving ability, and a wide range of economic backgrounds - so they know that these questions will help spread students out over a wide range of scores. 
To me, this last item seems particularly damning. It implies not only that standardized tests are not well-suited to measure the effectiveness of teaching, but that test-makers actively choose to include questions that disadvantage low-income kids. I'm not sure how they can justify that to themselves, but once you know this, I'm not sure how *we* can justify continuing to use them. Popham argues that standardized tests are useful to help a student understand their strengths and weaknesses, or growth over time, but should not be used to measure educational quality. 
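A quick numeric sketch of why test-makers gravitate toward questions that about half of students answer correctly (this illustration is my own, not from Popham's article): treat a single right/wrong question as scoring 0 or 1. Its statistical variance, a rough proxy for how much the question "spreads out" students, is p × (1 − p), where p is the share of students who get it right, and that quantity peaks at p = 0.5.

```python
# Illustration (my own, not Popham's): a right/wrong question answered
# correctly by a proportion p of students scores 0 or 1, so its score
# variance is p * (1 - p). Variance here stands in for how well the
# question spreads students apart, and it is largest when p = 0.5.

def item_variance(p):
    """Score variance of a single right/wrong question with pass rate p."""
    return p * (1 - p)

for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"pass rate {p:.0%}: variance {item_variance(p):.2f}")
```

A question that nearly everyone gets right (p near 1) contributes almost nothing to spreading out scores, which is exactly Popham's point about well-taught material being dropped from the test.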

So. The educational reform theory states that standardized tests should be used to evaluate schools, and teachers within those schools, no matter what kinds of students are being taught. These scores should be used to label schools as either successful or failing, which in turn will help parents decide where to (or not to) send their child to school. These choices will determine schools' funding, with the logical end being that low-score, "failing" schools will shut down and high-score, "succeeding" schools will have packed classrooms and generous budgets. But what has happened? 

Across our state, teachers have spent enormous amounts of time and energy trying to prepare students for testing, in order to get scores up. Charter schools, understanding that every dollar was at stake, have made choices that resulted in them serving *much* smaller numbers of English language learners, low-income students, or students with moderate to severe disabilities - all of whom were left in more concentrated numbers in district schools. Boston charter high schools also have high rates of attrition - many students who start don't finish - calling into question the glowing graduation, MCAS and college-acceptance rates that appeal to parents. Are they doing a better job teaching - or are they simply educating a different, and more select, cohort of kids than the district? Schools with low scores fell into a vicious cycle of losing students, losing funding, losing resources and teachers, receiving worse test scores, and losing more students, until they closed or fell into state receivership. In the meantime, this meant that low-income kids at low-scoring schools were receiving even less investment than kids at higher-scoring schools. Whole schools have been shuttered and the entire staff fired over low test scores. Whole schools have been threatened with closure or state takeover if they don't raise scores. 

At the end of the day, test scores may tell us something about the teaching at a school. But it's much more likely that they tell us about the demographics of the students at that school. While poor students may need more resources to thrive academically, it is not atypical for them to be in schools with below-average resources (worse student-teacher ratios, fewer libraries, less experienced teachers), which does not set them up for success. What's more, the tests themselves are designed for poor kids to score lower. Intentionally. 

This means that poor kids are more likely to be clustered at designated "low-performing" schools, and I would guess that they'd be more likely to experience school closure and state takeover; more likely to face under-enrollment and the budget cuts that come with it; less likely to have the services they need; more likely to sit in schools in tumult, with teachers trying to regroup after being reformulated or fired during a "turnaround" process; and more likely to be handed over to disastrous charter management after state takeover. 

WHAT. THE. HELL?!

And at the end of the day, I still don't know where to send my kid. Or rather, I just have very little to go on from the state/city. My selection process was to post a question in a neighborhood Facebook group asking local parents to tell me about their local elementary schools. Any nearby school that got a positive review got added to my list, and I submitted that to the BPS K1 lottery. What I found was that, not knowing all of this about test scores, I did shy away from the schools with the lowest scores. There was one in particular that stayed off my list until I heard from another parent that the school was beloved, with parents who raved about the community and teaching. How could this be? How could a school that is literally labeled "lowest performing" and in danger of state takeover be beloved? After all that I've learned, I can only conclude that it's because scores often don't tell us much about the quality of a school - its environment, culture, innovative teaching, etc. They have a high risk of just telling us about the incomes of the families who go there. 

If you want to read more, check out this 2012 Massachusetts statement against high-stakes testing, signed by 172 MA education professors and researchers. They discuss the impacts of high-stakes testing, make policy recommendations, and share the research behind their position. 
