Thursday, January 31, 2013

We Know Better (Part II): No More Research!

Previously, I wrote about our failure to learn from the successes of other countries, namely Finland, and their educational reform efforts that now provide international models for success. Our competitive nature got the best of us, and we missed the true lessons in our quest to be #1.

Not only have we ignored the important lessons in international practice, we have also dismissed research conducted over the last several decades that gives us powerful methods with sound reasoning to improve teaching and increase learning. Maya Angelou says, "We do what we know to do. When we know better, we do better." This is not the case in education.

Long before we started racing to the top, ranking educational systems by state, and comparing ourselves to Finland and Singapore, we were provided with some very compelling research and meta-analyses that show how to increase student achievement scores on standardized tests without using standardized tests. Yes, we have almost 40 years of evidence that shows which teacher practices and student abilities increase learning. Additionally, we have at least ten years of educational research that says when we tend to the social-emotional learning of students, standardized test scores go up.

This research has been translated into many languages and cited thousands of times, in educational journals and during staff meetings. This work appears in books for teachers, administrators, and all involved in the education of our children.

For the sake of space, I want to focus on three bodies of research on teachers and students that provide solid evidence that we can increase student achievement scores on standardized tests by focusing on important beliefs and practices that have absolutely nothing to do with standardized tests.

1. Students who are engaged in authentic assessments do better on standardized tests. When students have assignments that require higher-order thinking, in-depth understanding, and elaborate communication, and that have a real-life connection, they perform better on standardized tests than students who are not given authentic assessments. It is not enough to require students to think more deeply; they must see the relevance of what they are doing in order to engage.

For African-American students in high poverty schools, the effects of authentic assessment are even greater. 

2. Students who are taught particular strategies can gain up to 45 percentile points on standardized tests. The top three strategies with the biggest gains are straightforward and attainable by any classroom teacher. First, students must be able to identify similarities and differences, including being able to use similes and metaphors. Second, when students learn how to take notes and summarize, they are able to synthesize and analyze information, both higher-order thinking skills. Third, my personal favorite, students need to be acknowledged and recognized for their effort. Thirty-five years of research shows that students learn more when they believe their efforts lead to success. Therefore, when their efforts are acknowledged, they learn more.

These strategies are available for any teacher at any time, given adequate resources and support.

3. Research and work done over 30 years on improving teaching show that increases in student achievement are predicted by two equally important factors. First, teachers must believe that their students can learn. Second, beyond this belief, teachers must have deep content knowledge. Even after taking into account all of the things that are used to excuse poor performance, teacher belief and teacher knowledge are the greatest predictors of student success.

The teacher remains the single most important predictor of student success.

I was an English language arts teacher of high school students for 11 years. I went into teaching because I absolutely adored teenagers and because I had a passion for literature, for reading, for writing. I wanted to share that passion with students.

I don't believe there are many teachers who enter this profession who feel any differently. Not about their specialty, not about their students. We have evidence that shows how beliefs and practices are critical in increasing student learning.

But what have we done? We have stripped teachers of any self-efficacy or autonomy. We have put into place policies and procedures that prevent teachers from doing what they do best - teach. Instead, we have increased class sizes, removed school psychologists and guidance counselors, reduced administrative support, and now hold teachers in a national accountability spotlight. We have attempted to strip them of any sense of professionalism and leave them demoralized. We hold teachers responsible for more than what is humanly possible. Most disturbing is that we know that the teacher is the key to student learning, yet we stop them from teaching.

We know what works. We have more than 30 years of research that makes one thing crystal clear: In order for teachers to do what they do best, we must provide them with the time, the structures, the resources, and the support to do just that. We need to hold teachers up, not keep them down.

The last thing we need to improve education in the United States is to allocate money for additional research. We do not need a new program or person to suggest a fix to a problem that already has been solved. And the very last thing we need is another standardized test or any initiative that takes teachers away from their students any more than we already have done.

We know where the magic occurs, and we know that it is not really magic.

We know better.

Tuesday, January 22, 2013

We Know Better (Part I): No Need for Competition!

Over the past few years, and with increasing frequency, international comparisons have been made of student achievement, and thus of the quality of schools, across the globe. The American response to these comparisons reflects two very different perspectives.

On one hand, our lawmakers have used these comparisons to note the weaknesses of our system. They then propose reform efforts that race to the top in order to leave no child behind. Over the last 10 years, we have watched our policymakers move from looking at student achievement to looking at individual states and now to individual schools and teachers. In Michigan we went from the Michigan Curriculum Framework to Grade-Level Content Expectations (GLCEs) and now to the Common Core State Standards in less than 20 years. The Michigan Educational Assessment Program (MEAP), our measure of quality, has morphed over the last decade, reflecting these changing standards. Now we anticipate yet another standardized test, one that is computer adaptive, that will allow us to make comparisons across the states involved in the Smarter Balanced Assessment Consortium. Meanwhile, across the nation, additional tests are imposed that aim to measure the progress of schools and teachers as they make these substantial improvements so that we can be at the top of the world. Billions of dollars have been given to individual states and to individual schools that have shown great promise in reform efforts that have reduced achievement gaps.

These well-intentioned and obscenely funded reform efforts have missed the mark. Policymakers, so impatient, so quick to blame, and so intent on putting the United States at the top of the world, have lost sight of some very valuable insights provided by these top-achieving countries. This competitive mind-set leads to decisions that are not in the best interest of our schools.

Educators, on the other hand, have looked at this through a very different lens. We have sent observers and researchers to places such as Finland in order to see what is happening in the schools. Linda Darling-Hammond, Diane Ravitch, and many other teaching experts and educational advocates have already described these school systems and what makes them successful. It takes only a Google search of "Finland education" to see that many have reported on these aspects. From science to business, we have some powerful and consistent evidence on why Finland's reform efforts have been so successful and why its student achievement remains among the top in the world.

The disconnect between educators and policymakers, basically between what we know and what we do, is daunting. For the sake of this post, I am going to limit what Finland has taught us to two simple ideas.

First, the Finnish educational system did not set out to be "The Best" in the world; instead, they sought to provide "The Best" educational experience for every child, regardless of region, background, or any other factor that is used to excuse low achievement.

Oh, we Americans with our short attention spans and competitive natures. When the US continued to fare so poorly in international comparisons that the test scores got the attention of our policymakers, we immediately formed task forces and subcommittees. The appalling inequities of our educational systems had been ignored for decades. It was only when the spotlight was directed at US schools scoring lower in international comparisons that we began the race. Finland was probably uninterested in how it stood globally, as its efforts were focused on Finnish schools and children. I heard once that when Finnish school reformers set out to make substantive changes in their educational systems, they used the research that was already conclusive on effective schools. No additional money or committees were needed to determine the best direction, as plenty of evidence existed that provided that direction. This brings me to the second idea.

Finland's educational reform efforts did not change with leadership, politics, or conventional wisdom; these efforts are more than 40 years old.

Compare that to US educational reform efforts, which are renamed, discarded, generated, revamped, and replaced before current efforts are given adequate time to show how, and if, they are working. In my almost 30-year career in education, I have seen fads come and go, entrepreneurs rise and fall, and standards and assessments change many times over. We latch onto the latest and greatest, the glitz and the glam. We learn new terms that become the rage, from "curriculum mapping" and "differentiated instruction" to "formative assessments" and "flipped classrooms." This is not to suggest these ideas are not powerful ways to improve the educational experience of students. Finland, however, began an endeavor that made fundamental changes in the business of schools. If curriculum mapping or flipped classrooms helped the progress toward these fundamental changes, I am sure they became tools and strategies that were incorporated toward the end goal. A key issue here is that Finland had an end goal, and nothing distracted from progress toward that goal through four decades of reform work.

Recently I saw an article with a headline similar to this: What Will Finland Do Next? My response? Finland will probably do nothing other than stay the course; that is, to work toward providing the best education for every student, regardless.

In order to make fundamental changes in the American educational system that lead to increased student achievement, we need to replace the scoreboard mentality with an abundance mentality, one that believes that a high-quality educational system is not only attainable but also available for every student in this country, regardless. Once we adopt this mentality, we need to step away from the microwave and know this effort will not bring immediate gratification. It is not for the faint of heart, nor for those who seek profit and fame in this process. The hard work will be done by those who have our children at the heart and at the forefront of any reform effort.

Quick fixes have done nothing more than further erode our educational systems. Long-term, sustained efforts toward a clear vision of student success are the way we bring fundamental changes to American schools.

We already know what to do. After all, we are the ones we've been waiting for.

Wednesday, January 16, 2013

Where Is Our Outrage Over Non-Writing Writing Assessment?

When someone enters the teaching profession as an English language arts teacher, it is with eyes wide open. One of the biggest challenges these teachers face is learning to manage the paper load; essays that are frequently traded from teacher to student and back come with this territory. ELA teachers are known for taking stacks of papers with them on vacations. Report card markings are brutal, and marathon essay-grading sessions at the end of semesters are common. While teachers do become more efficient in grading essays over time, the process of evaluating writing consumes much of their time.

Despite the time and effort involved in evaluation, ELA teachers continue to require students to write. It remains one of the important methods by which students show how they understand logic and organization and how they connect with literature. In writing, students show more than knowledge of the rules; they demonstrate this ability within a context. Because most of the writing in ELA is literature-based, students also show how they critique, make inferences, and develop arguments.

ELA teachers have students write, and then they assess the strengths and weaknesses of the writing. The assessment event is not over at this point, however, for they have students make edits and revisions. These teachers understand this critical step; it allows them to see even more deeply into the students' command of the English language, from mechanics and organizational skills to vocabulary and sentence construction. In addition to all of these benefits, ELA teachers understand that the writing process is a powerful way to see how student thinking is changing.

Cue music. Enter the large-scale standardized assessment developers.

Ever fixated on statistics that indicate validity, reliability, item difficulty, etc., the large-scale standardized assessment developers immediately face new realities. Before they can generate numbers to crunch, they must take into account the process of building high-quality generic, trait-specific rubrics. They identify the need to establish inter-rater reliability and the importance of preventing evaluator fatigue. These assessment developers quickly understand that evaluating actual writing takes a great deal of human input. With increased human input, assessment developers have increased costs, for example in training and monitoring. With humans evaluating writing, test results are not immediately provided to users. The developers embrace the conclusion that properly evaluating student writing takes much more time and money than they are willing to invest.

And so, it takes very little time, if there is any consideration at all, for these standardized test developers to make the decision to forgo actual writing in order to assess writing. The final product? Non-writing writing assessments.

Every time I read that paragraph, I chuckle.

I would rather not spend any more time thinking about non-writing writing assessments, for a host of reasons. Unfortunately, these tests are being used at an increasing rate and for higher stakes, even though we know better. For that reason, I raise two very basic questions.

What is really being measured in non-writing writing tests?

Doing a quick scan of a few standardized testing sites, including statewide assessments, I noticed that non-writing writing assessments are called a variety of things, from "language" and "English/language arts" assessments to "writing" and "communication arts" tests. Furthermore, these tests are made of multiple-choice items that ask students to choose options that show they understand rules and procedures of writing. My personal favorites include "Which of the following sentences uses a comma correctly?",  "Which of the following sentences is written in past tense?", and "Which of the following is an adverbial clause?"

In addition to punctuation and grammar, the non-writing writing tests claim to assess students on their composition/writing skills and on writing structure. For example, one assessment given widely across the nation evaluates students on these skills by having them correctly answer questions such as "Which words can we use to make the sentence more interesting?" and "Which of the following would be used to develop this idea into a poem?" Because students are forced to choose among a set of options, interesting is defined by the test maker, and a correct response means either that the student chose the answer they thought would be correct or that the student got the item right by chance. The question about creating a poem, a work of art at its highest form, is reduced to a set of steps predetermined by the developer. It is probably safe to assume that creativity, style, voice, and the ELA teacher are nowhere to be found during this process.

What information does the non-writing writing assignment actually provide?

The current "must give" assessment provides a continuum of scores that range across grade levels. The reports list students' scores with their current grade level and then categorize those scores into five possibilities: far below basic, below basic, basic, proficient, advanced. I have yet to find how these labels were determined or what they actually mean.

In addition to that not-so-informative report, the assessment developer provides other reports that claim to provide a window into student potential. Seriously. Teachers, parents, and students are able to track "growth" within and across years. Even more amazing, these results allow them to compare student progress with other students in a district, building, or classroom. For the ELA teacher, these non-writing writing assessment reports help them to adjust their curriculum and instruction in order to meet the needs of their students. I need someone to show me how this is possible.

I know little about language acquisition, and I can only imagine how our collective wisdom grew as humans began verbal exchange. The Socratic method continues to be a powerful way to evaluate higher-order thinking and problem-solving skills and has done so successfully for over 2000 years. Compare that to the standardized test, which first appeared in the US only 100 years ago and, not so coincidentally, was followed a short time later by the first standardized multiple-choice assessments given to students.

The written language stores collective wisdom. It creates a space for solving world problems and for starting world wars. The writing process is complex, messy work. Evaluating that process is no less arduous. The written work is powerful and serves so many purposes, from documenting history to considering possibilities. Pushing our students to become strong writers has so many benefits, for them and for us.

Where is our outrage over writing assessments that do not assess writing? What will it mean for the future if we continue this madness, making claims on language usage that are not based on using language?

As I say so many times these days, Maya Angelou says that we do what we know to do, and when we know better, we do better. We must do better.

Tuesday, January 15, 2013

Data: A Love Story

I was recently engaged in a lively #edchat on Twitter, and one of my Tweeps asked if the term data means the same thing as the term information. This is the gist of my not-so-reverent response:

Once upon a time, we made good decisions based on solid evidence. Then one day, someone said, "Data." The End.

Oh, how we are enamored with data. We love collecting them, talking about them, and using them to drive decision making. We aggregate, analyze, and map them. We are so in love with data that we use them to rationalize every major decision made in education, from the classroom to the board room.

Unfortunately, our love affair and extended honeymoon with data have blinded us to the realities and limitations of data. We have yet to wake up during an item analysis and ask ourselves, "What in the world have we done?" We need to separate ourselves from the allure of data and the presumed answers they provide, and take a step back to look at data with fresh eyes.

For example, we might consider the actual word data. For the record and from my former English teacher's lens, the word data is plural and the singular form is datum or data point. Growing up, I remember pronouncing the word as DA-ta. By the time I was in my doctoral program in the 1990s, I was guided to pronounce the word as DAY-ta. Although I have no empirical evidence to support this, I privately theorize that our switch from DA-ta to DAY-ta happened at the same time we replaced the word test with the word assessment.

Fresh eyes would also allow us to take data from the glamorous and powerful podium and recognize them for what they are - numbers, characters, and other bits of information. On their own, and without context, they have no meaning. As an example, consider this data point: 15. When 15 is noted as an age in years, we might picture a teenage boy or girl. Add a different context, such as the age of a grandmother's home computer, and we get a different visual: an enormously large, heavy, and slow PC, connected to a dot matrix printer. Stripped down and unplugged, data begin to lose their charm.

Seeing data without their star power would also force us to ask questions about the substance and the quality of the data, something we rarely do, especially when talking about assessment results. Before we jump into comparisons and trends, we must begin an analysis. This analysis does not have to be a sophisticated or complex manipulation of the numbers in order to make meaning. In fact, most of that sort of analysis is provided to us by the test originator. Instead, the first stage of our analysis is asking questions about both the test and the data (test results).

Some initial questions might be:
  1. What is the purpose of this assessment? Given the purpose, does the assessment measure what it intends to measure? How do you know?
  2. What additional information might be needed to meet the purpose of the assessment?
  3. How is this assessment related to the standards we have set in this building or district? How are the standards aligned?
Our love affair with data has caused us to assume test results come from quality assessments that measure substantive knowledge and abilities. We accept as a given that these assessments are also based on a set of clear and articulated standards. The biggest assumption we make is that the assessment and its results are aligned with our own classroom and district standards. These assumptions also encourage us to remain victims of data (test results), instead of being consumers of the results.

The answers to the initial questions might keep data at an appropriate status. For example, as consumers of the test, you might find that the test does not have clearly articulated standards. If this is the case, you might resist putting too much, if any, weight on the test results as it relates to your classroom or building. After all, if you don't know what the test is measuring, the test results will provide you with irrelevant instructional information. You might also find that the assessment does not live up to its purpose and that it needs additional data before you make assumptions about the results of the test. If so, you will be reluctant to read too much into the results until you have supplemental information that provides a more complete picture of student learning.

The most important finding would be to realize the assessment may be completely unrelated to what is happening in your classroom and in your building. If that is true, then the assessment may be serving a different purpose altogether. In this day of educator evaluation, this assumption is the most dangerous of all.

As we become consumers of assessments, we realize that we have the power and ability to decide what data are needed to inform our practice in order to increase student learning. It is likely that as we become more confident in our ability to evaluate the quality and substance of data, we will become proactive in choosing assessments that provide us the information we need to be better educators.

As we are able to identify what kind of data are needed to help us improve what we do, we can move from assessment victim to assessment consumer to assessment developer. The results of these assessments will provide us data that are finally worthy of our love.

Monday, January 14, 2013

Abuse, Misuse, and Overuse of Standardized Tests

Fifty years ago, American students and teachers were subjected to the administration of standardized assessments on a semi-regular basis. A portion of a school day was repurposed for the administration of the test. Most understood the need for standardized assessments; these tests had a particular purpose and provided meaningful information to policy makers and chief educational leaders. Furthermore, the results of these tests provided the means to make comparisons across buildings, districts, and even across states. The test results could also be filtered by categories such as gender, ethnicity, and special populations. The information was general, but powerful at the bird's eye level only. After all, if a teacher wanted to know how she could improve her practice to better meet the needs of her individual students, she would be looking at evidence of learning at the classroom level.

Something alarming happened, though, over the last 15-plus years that caused a shift in how we use the results of these tests. Over time, the standardized test has been stretched so far, and its results given so much weight, that it has been abused, misused, and overused. We have lived through the strange transformation of a tool initially intended for policy makers into one that allegedly provides valuable instructional and curricular information to teachers. This simply is not the case.

In order to understand why large-scale standardized assessments cannot provide valuable information to teachers, we must remember the original purpose of the standardized assessment. A standardized assessment is one that is administered and scored in the same way for every student: directions are uniform, and testing conditions are kept as standard as possible. This assessment experience allows a considerably large population of students to take the test, and the results are then replicable and generalizable. This means that we can make overall statements about how a large number of students do on the test, repeat the experience next year, and be confident that the test is measuring the same things. Despite common practice and understanding, these tests do not have to be timed or in multiple-choice form. The important characteristic is that they are administered and scored in a consistent way.

The reality of our educational system and of our society is that we consistently look for the most "efficient" way. The efficient way to assess our schools usually involves multiple-choice items on multiple forms of a test that can be electronically scored. The results are a jackpot for psychometricians, a North Star for policy makers, and a learning thermometer for Superintendents. The results, however, do not provide any useful information for teachers of individual students in their classrooms. The standardized test was never intended for this purpose.

The claim that this assessment is valuable to the individual teacher emerged when we lost sight of the purpose of the assessment. To explain: when developing these and all other assessments, the purpose and appropriate use of the assessment must be the first consideration. Determining a clear and appropriate use of the test is the most important step of assessment development. This determination includes not just the plan for how the test is used, but also the identification of the users of the test. In essence, the purpose of the assessment answers the question of why the assessment is needed.

All who are affected by the assessment, including students and teachers, must understand why an assessment is given.

There are districts across the country that are attempting to evaluate teachers and administrators using tests that were never intended to be, nor can they ever be, useful at the classroom level. Even more distressing is that we are subjecting our students to yearly (at minimum) assessments that were never intended to provide anything but a bird's eye view.

If individual teachers are to be evaluated by how much their students learn, then standardized tests will never fill this role. Instead, we must use assessments that allow students to demonstrate what they know and are able to do. We must use assessments that include self-reflection so that students show ownership in their learning and can communicate the "then" to "now" journey. Most importantly, we must use assessments that provide teachers with rich information used to make instructional decisions that result in increased student learning.

Learning is complex and messy. The most efficient way to measure it must honor its complexity.

Next up? Our love of data.

Sunday, January 13, 2013

Multiple Issues About Multiple-Choice Items

It's amazing how a late-night email to Diane Ravitch grew into a charge for me. As I wrote before, my friend Christine asked me why I was upset about the use of the NWEA MAP, especially when it would likely replace the MEAP, Michigan's statewide assessment. I wrote to her the following statements:

The NWEA MAP is a computer adaptive, standardized test that uses selected-response items that were written from national standards. The test items are aligned post hoc to state standards and the test results are used to measure student growth in language, reading, and mathematics.

I underlined the concerns I had about the claims attached to tests like the NWEA MAP and the Michigan Education Assessment Program (MEAP). In hindsight, I omitted many other issues, such as teacher evaluation, cut scores, data, and proficiency. I will return to all of these in time. Today's topic for consideration is the selected-response item, also known as the multiple-choice item.

Assessment experts, psychometricians, and others who look for item difficulty, item discrimination, and test validity and reliability will sing the advantages of assessments composed of multiple-choice items. Their evidence for these claims is based on statistical analyses where p-values, Pearson product-moment correlations, and reliability coefficients dominate the discussions. To the assessment experts, these measures indicate how solid the assessment is. While important considerations for those involved in the multi-billion-dollar test industry, this evidence is nearly meaningless to most educators.

Advocates of multiple-choice assessments claim other advantages. For example, because large populations of students take the tests, test developers can spread the content over different forms of the test, a process called "adequate sampling." To the classroom teacher, this helps to explain why students can have different versions of the same test during a mass administration of standardized paper-and-pencil tests. Furthermore, multiple-choice items are more objective than open-ended or performance items; there is little bias in evaluating them, and they easily lend themselves to electronic scoring. The greatest advantage of multiple-choice tests is that they are a faster and cheaper way to gauge student achievement than performance or writing assessments. Great educators probably never factor faster and cheaper into anything they do.

Every stakeholder involved in the education of our children has a need for assessment, but one assessment cannot serve every purpose. A large-scale multiple-choice assessment can sweep across large populations of students and give a quick pulse of progress. This information is good for policy makers, Superintendents, and others needing a bird's eye view. It cannot, however, provide the insight needed at the classroom level, the place where detailed information about individual students becomes the basis for changes in instruction. Changes in instruction bring about improved teaching and increased learning, the heart of what we want to see in our schools.

The results from multiple-choice tests rarely, if ever, show what students are thinking when they choose an option. Because we are unable to see the reasoning behind the choice, we are unable to see where students went wrong. Whether the items run A-D or A-E, the possibility remains that students will select the correct answer by chance. Students are forced to make a choice among options, prohibiting them from offering up a different response. Again, this fails to provide useful information about individual students.
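To put a rough number on that chance factor, here is a minimal sketch (my own illustration, not part of any testing company's methodology) that assumes every item has the same number of options and that guesses are independent and uniformly random:

```python
def expected_correct_by_chance(num_items: int, num_options: int) -> float:
    """Expected number of correct answers if a student guesses at random.

    Simplified illustration: assumes each item has num_options choices,
    exactly one correct, and guesses are independent and uniform.
    """
    return num_items / num_options

# A 40-item test with options A-D: 10 items expected correct by chance alone.
print(expected_correct_by_chance(40, 4))  # 10.0
# The same test with options A-E: 8 items expected correct by chance.
print(expected_correct_by_chance(40, 5))  # 8.0
```

In other words, even a student who knows nothing walks away with a quarter or a fifth of the items "correct," and the score cannot tell us which quarter or fifth.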

Furthermore, our standards are, for the most part, written in such vague and broad language that multiple-choice items cannot get at the essence of what is expected of our students. The multiple-choice item is limited in how deep it can go. Factoids and trivia lend themselves easily; however, if we want our students to demonstrate deep and complex thinking, it will take more than checking an option.

A Google search for the origin of the multiple-choice test will likely give Frederic Kelly the credit for bringing this item type to education, and he did so only a century or so ago. Educators with some extra time on their hands might be interested in how its popularity increased exponentially to the current day. It took little time for college admissions tests to forgo the once all-essay tests and morph into assessments like the ACT and SAT of today.

Sure, there are additional limitations to using multiple-choice items, but there is a greater issue here.

We look to Finland as a model for excellence in education, given their near-the-top status on international tests year after year. We have studied their model. The Finnish students rarely take multiple-choice assessments. I repeat, the Finnish students rarely take multiple-choice assessments. Their achievement is determined, instead, by demonstrating what they know. Despite not having to take semi-yearly and yearly multiple-choice tests from grades 3-10, the Finnish students outperform most of the world on a primarily multiple-choice test. We have research here in the states that shows this same phenomenon. When students have demonstrated what they know and are able to do through authentic and performance assessment, they perform better on standardized, multiple-choice tests.

If we glean no useful information about individual students from tests like the NWEA MAP and we know that students can excel on international standardized tests when they are engaged in school- and classroom-based authentic assessments, why do these multiple-choice assessments exist at all?

Maya Angelou says that we do what we know to do and that when we know better, we do better. Not in this case.

Next up? Standardized testing.

Saturday, January 12, 2013

Concerns about Online Assessment? Yes! It's CAT.

When I forwarded my desperate email to Diane Ravitch on to my good friends and kind listeners, my friend Christine, always so observant and such a careful reader, wondered what my concerns were with the NWEA MAP assessment. I realized that I had about 100 concerns within that email, and that if I am to help grow the national conversation, I need to take my rant down a notch. In this first blog, I am going to try to clarify my concerns with online tests, such as the NWEA MAP, that are computer adaptive tests.

Advocates of computer adaptive tests (CAT) say that the program behind the assessment tailors the test to the student's ability. No longer are students frustrated by an exam with items that are too difficult. Teachers and students are given immediate results; no longer do we have to wait for months before test results arrive. The tests have RIT reporting that allows all to see how student learning grows over time and over years. Educator evaluation is required by law for every teacher in every building in Michigan, and these MAP tests are being used to evaluate teachers in some districts, especially as pre- and post-tests are incorporated into classrooms. The CA tests are being used across the nation in growing numbers. They are purported to be cost effective, efficient, and objective.

I have several concerns about the CAT, especially with regard to equity. For the sake of time, I will limit this rant to three. First, the process of taking a computer adaptive assessment can easily become a psychological issue that has absolutely nothing to do with demonstrating knowledge. I have watched CA tests administered to 30-40 students at a time, as many as 150 students in all. Students who are not easily compliant and are sick of assessments find out quickly that if they choose random answers, the questions get easier and easier and the program soon kicks them out of testing. Students who are anxious and want to do a good job realize in short order that they are getting items wrong, because the items get easier and easier. For struggling students, this reinforces the expectation that they will likely get a "bad grade," further digging the hole of hopelessness. The longer a student takes the test, the better that student is doing. So, students who want to get a "good grade" will anxiously anticipate each question, self-assessing how they are doing as they go. This becomes less an opportunity to demonstrate knowledge and more a psychological "man vs. test" scenario.

Second, most of these CA tests are made by companies that exist to make a profit. The testing industry is a multi-billion-dollar industry. The largest of these companies is Pearson, which now owns the national lion's share of textbooks, student information systems, assessments, and other hugely profitable products. While NWEA is a non-profit organization, it partners with Pearson. I have learned that when you want to know who has the power and is driving the bus, you follow the money trail.

Third, and most concerning, students do not have the ability to review their answers on a CA test. When they have finished with an item, they move on to the next. This flies in the face of what we know about learning and demonstrating knowledge. How many times have you been allowed to go back and make revisions on a high-stakes assessment? Even with the GRE (still paper and pencil in 1996), I had an opportunity to go back and review my answers. For these types of assessments, one corrected answer can make a huge difference in the overall score. For any paper I write, I go back and make revisions and edits until I feel that it is 'ready.' Even then, I find mistakes afterward. The only time I have been unable to go back and check my work is when I was subjected to yearly standardized tests as a child. A final product or performance is the end result of revision, editing, and reviewing.

Our children must understand that self-correction is an indication of learning. With the stakes placed on these tests, students' inability to review their answers, correct mistakes, and make revisions fails to give us information about how students' thinking has changed. Thoughtful and careful test-taking is difficult, if not impossible.

Next up? Standardized testing.