Sunday, August 03, 2008

Guest post by Heather Johnson


I was interested when Heather asked me if she could do a guest post. The title link (above) links back to the blog where one can find some of Heather's other posts.

While I agree with all of Heather's concerns about testing - curricular narrowing, too many eggs in the single test basket, etc. - I am even more worried that status measures (proficiency levels) are terrible indicators of how well a school/grade/classroom is educating a particular set of students. For all of their flaws, standardized tests could be used to provide more useful information for improvement.

I also agree with Heather that high-stakes tests are here to stay. I'm pretty sure that performance pay linked to value-added measures is just over the horizon for many states and districts.


The Pros and Cons of Standardized Testing

The debate rages in educational circles: is standardized testing a fair way to assess students? Everyone has ideas on the issue, and many people are still on the fence about the merits of standardized testing. The “No Child Left Behind Act” has put an emphasis on this system of testing. Legislators believe that we need a raw score for students and that this is how we should rate all students in the United States. Good or bad, this is going to be around for the long haul. To help you find your stance on this contentious issue, we’ve come up with a list of the pros and cons of standardized testing.


The Pros

Reliability – a standardized test is reliable if it produces dependable results. If students answer the same questions in the same fashion across administrations, the test is deemed dependable and the research that can be gleaned from its results is valuable.

Validity – a test is considered valid if educators can make proper assessments of their students based on the information learned from the test results. The notion behind a standardized test is to find students' weaknesses and correct them in the future.

Trends – educators need to be able to see trends developing in students’ answers on a broad scale for the standardized test to be considered important. We are not simply looking to see who is smart and who is not; we are trying to pinpoint areas in the curriculum where students can improve and how we, as educators, can help them improve.


The Cons

Diminished Time in the Classroom – the advent of the standardized test means that teachers must teach their students how to take a test properly. That translates into hours of practice tests and lessons on following instructions, which takes away from the actual teaching of subjects.

Teaching to the Test – this is one of the most common complaints among educators, parents and students alike when the subject of standardized testing arises. Some teachers feel they have to teach specifically to the material they believe will be on the test and department heads will shift a given curriculum to follow suit. This diminishes the role educators have in forming lesson plans.

Unfair to Students – many students become extremely nervous before any sort of test. When they are aware of the weight a test carries, as with a standardized test, their stress levels soar through the roof, and the results of these tests become skewed.

Whether you like it or not, standardized tests are here to stay and all we can do is hope the students are always the center of our attention.


This article is contributed by Heather Johnson, who regularly writes on Alabama teacher certification courses. She invites your questions and writing job opportunities at her personal email address: heatherjohnson2323 at gmail dot com.

Friday, August 01, 2008

Projected Growth and Value-Added in Ohio

This is so cool. A policy discussion that differentiates between growth and value-added.


Wednesday, July 23, 2008

Ohio Governor wants to talk about wacky ideas

Governor Strickland wants to put everything on the table for education reform. One of his points was:
"What if we created a value-added system that measured results and compensated teachers for improving student achievement?"
Well...he's more than halfway there. The state now provides value-added analysis and will have access to classroom-level data. Four of the state's largest districts are recipients of federal Teacher Incentive Fund grant funds.


Tuesday, July 22, 2008

Colorado's State Growth Model Goes Live

Colorado's growth model is official now. A number of outlets have picked up the press release (linked to above in the title) and discussed how this new analytical information can inform policymaking in the state.

Colorado Charter Schools Blog has a post and links to a page with all of the charter schools in the state. Papers in the state begin to discuss how to read the diagrams and how parents might use the results (The Examiner, The Daily Sentinel, The Reporter Herald).

This level of visibility and dialog is a welcome addition to the national discussion of the appropriate usage of growth model data.


Monday, July 14, 2008

Data Quality Campaign 2007 Survey on Longitudinal Systems

One of the most important contributions of the Data Quality Campaign has been to gather solutions and examples of policy making from state agencies across the 50 states. In particular, the work done in Florida to update its already impressive PK-Workforce system to take advantage of new technologies and more flexible thinking about appropriate data use provides a broad backdrop for possible innovations across the country.


Friday, July 11, 2008

NSBA's Blog summarizes how parents feel about testing

BoardBuzz posted a link to an Associated Press poll about educational problems and possible solutions. One of the interesting findings is the gap between fairly solid support for tests as instruments of school accountability and the belief that classroom work better represents student learning.


Tuesday, July 08, 2008

Collaborative provides V-A Training Support in Ohio

The Ohio Resource Center is an interesting collaborative that combines scholars from the different schools of education in Ohio and is supported by the legislature and the Ohio Board of Regents. The "about" page describes the Center as follows:
ORC's resources are available primarily via the web and are coordinated with other state and regional efforts to improve student achievement and teacher effectiveness in preK–12 mathematics, science, and reading. The website is organized around Ohio's academic content standards.
What I find interesting, as a V-A guy, is that the Center has also developed a three-module set of lessons on understanding value-added outcome metrics. It builds up knowledge about assessments and their uses in general, and then goes on to describe how one should use V-A metrics.

A big thumbs up for the ORC!


Saturday, July 05, 2008

Science Cannot Be Secret

This blog post does a great job laying out some of the core concerns about complexity and proprietary information in accountability systems. Our mantra has been "simple is better, unless it's wrong". In the VA work we do, we might prefer a simpler model that has "strong" assumptions. However, we think it is important to develop model enhancements that deal with cases in which the assumptions cannot be met. We look at the results of both models to see what difference violating the assumption makes. I very much agree with the author of this blog that this is a policy decision and that the educational agency (state, local, etc.) must be engaged with the model design and be able to understand and explain any and all modeling choices.


Wednesday, July 02, 2008

Value-Added doesn't get any more real than this

This is the face of classroom value-added on the ground. The Ohio Department of Education and Battelle for Kids (which provides training for the interpretation of VA scores) have used classroom-level scores to identify teachers who produce large gains in student test scores.


Monday, June 30, 2008

VA and mobile students

I've been trying to pull together how the different districts and states using VA measures for accountability are dealing with mobile kids. The issue is discussed differently in the three states currently using SAS's EVAAS for state-wide accountability. Bill Sanders has addressed the issue very clearly in his writings around the EVAAS model. The model used by EVAAS can handle the missing observations generated by mobile students. In all of the jurisdictions for which I can find documentation, there is a rule for who falls into the accountability system and is therefore included in a particular year's model. Students who do not meet the definition are excluded from the growth model.
  • Ohio Rules - (page 1 of ODE VA FAQ and FAQ at Battelle for Kids)
    "How does high attrition or mobility affect the value-added measure?

    Schools and districts are accountable for students enrolled at that school for a full academic year. Only students who are continuously enrolled from October count week through March testing will be included in the analysis. The Ohio Department of Education will match students' test scores across years and schools using the SSID.

    The two FAQs seem to imply that mobile students are included, but only when they meet the above definition. It is possible that some non-zero percentage of students falls out of the analysis every year.

  • Pennsylvania Rules - (there are more detailed proposed rules here - but they do not differ on this issue)

    It seems that in Pennsylvania, the district - not the school - is accountable for the performance of students who do not attend for a "full academic year". As in the case of Ohio, the EVAAS system explicitly takes into account missing data for individual children.

    (page 21 of the Pennsylvania Consolidated State Application Accountability Workbook)
    "Schools, LEAs and educational entities are accountable for mobile students in the same manner as they are for other students. The “full academic year” criteria are applied to all students. In Pennsylvania, it is not uncommon for students to move from one school to another within the same district during an academic year. In these instances, the school in which the student is enrolled at the time of the assessment bears responsibility for test administration; however, the district, rather than the school, will be accountable for the student’s performance."

  • Tennessee rules (several districts (pdf page 2 and other external sites) refer to 150 day enrollment requirement before the test )
My concern is for the implications this has for high-mobility districts - particularly the large urban settings with high mobility rates. While I can see how techniques for dealing with missing data can be used to make good classroom and grade estimates, there might be a related incentive not to focus as consistently on the learning needs of mobile children - given the pressing needs of those fully included in the accountability system. A model that includes "dosage" or proportional assignment of student growth is what we should be shooting for. This would be relatively easy in a district in which most of the mobility is school-to-school within the district. However, as state-wide data systems improve, it should be relatively easy to track mobile students within a state and get access to their test data and current school. A dose-based model does the math right and provides consistent incentives to school staff.
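To make the "dosage" idea concrete, here is a minimal sketch of proportional attribution. Everything in it - the function names, the day counts, the school labels - is hypothetical and illustrative; no state's actual model is this simple, and real models would apply the weights inside the statistical estimation rather than to a raw gain.

```python
# Hypothetical sketch of "dosage" (proportional) attribution of one
# student's measured growth across the schools attended during the year.
# Names and numbers are illustrative only.

def dose_weights(enrollment_days):
    """Convert days enrolled at each school into proportional weights."""
    total = sum(enrollment_days.values())
    return {school: days / total for school, days in enrollment_days.items()}

def attribute_growth(growth, enrollment_days):
    """Split a student's growth across schools in proportion to enrollment."""
    weights = dose_weights(enrollment_days)
    return {school: w * growth for school, w in weights.items()}

# A mobile student: 90 days at School A, 60 at School B, a 10-point gain.
shares = attribute_growth(10.0, {"School A": 90, "School B": 60})
print(shares)  # School A gets 6.0 points of credit, School B gets 4.0
```

The point of the sketch is the incentive structure: under an all-or-nothing enrollment rule the student above would count for neither school, while dosage weighting gives both schools a stake in the student's learning.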


Friday, June 27, 2008

Accountability in Higher Education

There have been discussions of using standardized tests in undergraduate institutions as part of an institutional accountability system. A working committee was established by the National Association of State Universities and Land-Grant Colleges (NASULGC) and the American Association of State Colleges and Universities (AASCU). They developed a framework called the Voluntary System of Accountability (VSA). The VSA can be implemented with a series of different examinations (C-Base, CLA, CAAP, MAPP, GRE and ACT WorkKeys). There are currently not enough questions in common across these assessments to support a simple value-added model. There is the notion, however, that VA measures are the goal for the test-based elements of the accountability system.

The piece linked to by the title suggests that this approach follows the recommendations of the Spellings Commission too closely. Tests will not capture the range of what undergraduates are expected to learn across a wide range of subjects. The Association of American Colleges and Universities (AACU) has established the Valid Assessment of Learning in Undergraduate Education (VALUE) project to expand assessment efforts beyond tests to include an e-portfolio approach that would document both growth and breadth of student learning and development.


Tuesday, June 24, 2008

Obama and McCain on VA and teacher pay

There don't currently seem to be enormous differences between the two presumptive presidential nominees on student assessment, school reform, and the use of VA results. Both support public charter schools and programs to bring highly qualified teachers to low performing schools. Obama seems to be more open to a wider range of performance measures (including teacher knowledge, observed practices, etc.) than McCain when discussing teacher performance pay. Obama also puts more emphasis on early childhood development efforts.


Saturday, June 21, 2008

Adaptive tests as the assessment fix for NCLB's narrow approach to testing

In the Value-Added Research Center's work with districts and states, we have run into a number of instances of the NWEA MAP being used in untested grades and subjects to fill in gap years between NCLB-mandated tests. This approach is particularly appealing in districts engaged in teacher and/or school incentive projects. The ability to include more teachers in grade-to-grade growth models is appealing across the board. Administrators like the equity and external validation of external measures. Many educators like the respect given to tested subjects and prefer not to have their performance measured by walk-throughs or other observational measures alone.

On the other hand, there is still a great deal of research being done on the validity and reliability of growth models based on computer adaptive tests (over 2300 hits). Students under an adaptive test regime do not take the same form of the test, yet much of the science around understanding growth relies on students taking the same form of a test. However, when one looks at the work being done on VA use of adaptive tests, one gets 4 hits.

We are hoping to work with one or more districts using the MAP to see how well this works in practice. We are also looking at districts using quarterly diagnostic assessments to predict performance on the annual high-stakes tests. Likewise, we are likely to work with one or more districts that want to use the PLAN-ACT series of tests for measuring high school productivity.

There is certainly plenty of work to do.


Thursday, June 19, 2008

Colorado Growth Model Introduced

Colorado has been working for some time on a state-wide growth model. In early March, the state issued a press release and made a number of documents available on their web site.
  • Technical Report on Colorado’s Academic Growth Model (pdf)
  • Presentation to district assessment directors (pdf)
  • Changes to the accreditation process presentation (ppt)
The Technical Report includes both the authorizing legislation and a technical paper by external consultant Damian Betebenner. Betebenner and his colleagues at the Center for Assessment in Dover, NH are generally on the simpler-is-better side of student performance modeling recommendations.

I am not an economist, although I do play one on TV. However, I know from personal experience that we've been repeatedly put in the position of evaluating simple systems and the unintended consequences that flow from such models. As hard as one imagines it might be to do growth modeling well, it's probably two orders of magnitude harder than one imagines. In particular, what looks good to an outsider looks very different to a teacher or principal whose job performance or bonus is going to be based on that analysis. Educators suddenly discover a preference for complex models when the simpler models turn out to be unfair to some large portion of the adults in a system.


Wednesday, June 18, 2008

Colorado District Explains new state growth model

A district assistant superintendent does a nice job presenting the new Colorado growth model and how it differs from previous accountability measures.


Monday, June 16, 2008

UK Education officials struggle to explain attainment versus growth

Educators and policymakers in the US are not the only folks who struggle with explaining what can appear to be contradictory outcome measures. A recent report identifying failing schools described a wide range of growth performance. In particular, this story points to 30 schools that were in the top 5% on attainment measures (number of GCSEs) but failing on growth.

This is something many people have trouble discussing. The notion of "controlling" for prior ability and demographic characteristics gets confused with expectations. What controls do is level the playing field by making a fair comparison. From a social policy point of view, positive or negative coefficients on gender, economic status, or race have nothing to do with expectations. They are the growth equivalents of attainment gaps. If economically disadvantaged children in the 5th grade show average growth in test scale scores that is 4 points lower than that of non-poor students, that is the growth gap. It tells us how well we are doing at helping students overcome the educational impact of non-school economic resources. Policy should be focused on improving the rate of learning growth to erase the growth gap.
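The arithmetic behind a growth gap is simple enough to show directly. The scores below are invented to mirror the 4-point example; a real analysis would use model-adjusted growth estimates, not raw gains.

```python
# Illustrative only: a "growth gap" is the difference in mean scale-score
# growth between two groups of students. The gains below are made up.

def mean(xs):
    return sum(xs) / len(xs)

def growth_gap(growth_disadvantaged, growth_other):
    """Mean growth of non-poor students minus mean growth of poor students."""
    return mean(growth_other) - mean(growth_disadvantaged)

poor = [8, 10, 9, 9]         # scale-score gains, disadvantaged 5th graders
non_poor = [13, 14, 12, 13]  # scale-score gains, their peers

print(growth_gap(poor, non_poor))  # 4.0 - the gap policy should erase
```

The number itself carries no expectation about either group; like an attainment gap, it simply measures how unevenly learning growth is distributed.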

The challenge presented by interpreting high attainment versus low growth is actually not that hard to overcome. We all know of schools that are good at recruiting families and students with high prior test scores. Many ads for new homes include references to the attainment levels of local schools. High prior school-level attainment (and school magnet programs) tends to attract families with high-attainment students. Recruiting students with high prior attainment is the simplest way to be a high-performance school under an accountability system that focuses on attainment. A growth model, instead, controls for prior attainment and teases out what learning was delivered in that year. A school can be very good at recruiting while being not very good at challenging good students. The two things are quite different.


Friday, June 13, 2008

Bill Sanders responds to public criticism of his VA model

Ed Week covered a dispute (May 6, 2008) between Bill Sanders (SAS Inc.) and Audrey Amrein-Beardsley (Arizona State University). Assistant Professor Amrein-Beardsley published (Ed Researcher) her own analysis of data collected in a study Sanders conducted on the effectiveness of board-certified teachers in North Carolina.

What this exchange confirms is that smart, well meaning people can come to entirely different conclusions. In particular, the argument that a simple model is required for VA to be accepted is completely at odds with our experience that VA models have to be quite complex to be fair. There is no simple answer to the transparency-equity argument. It is a normative paradox that leaves scholars red-faced and exasperated on both sides of the argument.


Wednesday, June 11, 2008

U. S. Secretary of Education Margaret Spellings Approves Additional Growth Model Pilots for 2007-2008 School Year

The next round of approvals of state Growth Models.
Washington, D.C. — U.S. Secretary of Education Margaret Spellings today announced approval of two high-quality growth models, which follow the bright-line principles of No Child Left Behind. Michigan is immediately approved to use the growth model for the 2007-2008 school year. Missouri's growth model is approved on the condition that the state adopt a uniform minimum group size for all subgroups, including students with disabilities and limited English proficient students, in Adequate Yearly Progress determinations for the 2007-2008 school year. (7thSpace)


Tuesday, June 10, 2008

New York regional education board provides VA PD

The Capital Region Board of Cooperative Educational Services held a day long session on value added measures on May 28, 2008. Sponsors of the session included New York State School Boards Association (NYSSBA), New York State Council of School Superintendents (NYSCOSS), School Administrators Association of New York State (SAANYS), and Battelle for Kids in Ohio.

I ran across the blog linked to the posting title as I was looking for people around the net explaining VA to others. I am keenly interested in how folks exposed to VA discussions understand them and explain them to others. This is one of the better explanations I've found. It uses a particular form of reporting for its explanation structure - one found in Battelle materials. I'm on the lookout for other approaches, such as this year's VA versus this year's attainment compared to last year's attainment with this year's VA. The two graphs tell different stories. Neither is wrong, but they have different purposes. I am hoping that the sophistication of the analysis used in PD efforts broadens to include a wider range of program and school evaluation questions.


Sunday, June 08, 2008

Houston's VA PD and the criticism of complexity

Whatever I think about the difficulty of training teachers and administrators to understand and use value-added measures, I agree with my colleague Rob Meyer, who consistently argues that simple is better, unless it's wrong. I really like the quote from Bill Sanders at the end of the RedOrbit post linked above. Bill uses a great teaching aid as well:
"I'm not going to trade simplicity of calculations for the reliability of the information," he said. "Before groups of teachers, I often hold up a cell phone and I say, 'I don't have a clue what's inside this, but I have to have trust that when I punch the numbers, it's going to call the right number.' "
There are two challenges when working with educators on understanding and using VA measures. First, one has to expose how the simplicity of attainment masks its underlying inadequacy as a performance measure. Second, one has to show that the more complex analysis used in VA models is more fair and gives educators credit for improving student learning no matter where the student starts across the range of prior ability.

There is no way to dodge the complexity bullet if we want to be fair to students and educators.


Friday, June 06, 2008

Houston delivers new Value-Added numbers as part of performance pay system

Houston abandoned its local performance measurement system after data quality and communication problems last year. The Houston Chronicle described the new approach at the beginning of this school year. The new program, called ASPIRE, was designed to answer the core problems of the local system - clear rules, transparency around data quality, and equitable access to financial incentives. ASPIRE is supported by value-added analysis provided by SAS EVAAS and by professional development, online support services, and leadership consulting provided by Battelle for Kids. This reworking of the HISD approach to performance pay was supported by the Broad Foundation and the Bill and Melinda Gates Foundation.

We'll need to watch over the next few days and weeks to see how these efforts play out.


Wednesday, June 04, 2008

School board member digs in on tough issues

Wow. A recently elected school board member is really thinking hard about using VA results. Our local school board has been scheduling monthly VA briefings for the assessment subcommittee as we work to develop a statewide VA model for Wisconsin.

We are working to integrate our work with the Milwaukee and Madison school districts to show how the results and analysis differ in large and mid-size districts. We are also working with a regional service agency to explore the best ways to report results to small districts. Small districts have unique challenges for statistical analysis of VA given their low student counts. We will be looking at grouping similar rural districts into "quasi" districts to extract more explanatory power from the models.


Monday, June 02, 2008

Education Week on Using Value Added Data

Ed Week reported on a meeting at the Urban Institute that was a summary/policy implications round up of a much more technical meeting held here in Madison in late April, 2008.

One of the most important things delivered by the April meeting was a concerted attempt to translate between economists, statisticians, psychometricians, and sociologists. There is an emerging consensus on the minimum requirements for a value-added model used for high-stakes decisions. As the scholars engaged in real-world analysis converge on a set of recommendations, we should be able to form a more consistent, non-technical explanation of VA model features and assumptions. One of the biggest stumbling blocks facing the widespread adoption of VA models is the perception that ordinary people cannot understand them. We are getting close to the prerequisites for a coherent set of explanations.


Saturday, May 31, 2008

So, why have I been so busy?

I've been working with VARC Director Rob Meyer to grow our research center. We are engaged in a range of work in Milwaukee Public Schools, Chicago Public Schools, a series of Teacher Incentive Fund projects, as well as basic research on extensions of our value-added model.

The work includes a series of program evaluations, professional development around the use of VA measures, random assignment experiments, improvements in operational systems, and the use of VA models for quarterly diagnostic assessments. The current focus is on adding a number of tweaks to our model that uses two prior scores to compute VA measures. The complexity of including corrections for measurement error, mid-year testing, retention in grade, etc. has required lots of graph paper and whiteboard markers.
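For readers wondering what "a model that uses two prior scores" even looks like, here is a bare-bones sketch on simulated data. This is not VARC's actual model - it omits the measurement-error, mid-year-testing, and retention corrections mentioned above - and every number and school label is invented; it only shows the basic structure: regress the current score on two priors plus school indicators, and read school effects off the indicator coefficients.

```python
# A minimal two-prior-score value-added sketch (illustrative, not VARC's
# model), estimated by ordinary least squares on simulated data.
import numpy as np

rng = np.random.default_rng(0)
n = 300
prior1 = rng.normal(50, 10, n)           # score from two years ago
prior2 = prior1 + rng.normal(5, 3, n)    # score from last year
school = rng.integers(0, 3, n)           # three hypothetical schools
true_effects = np.array([0.0, 2.0, -1.0])

# Current score depends on both priors, the school attended, and noise.
post = (5 + 0.5 * prior1 + 0.5 * prior2
        + true_effects[school] + rng.normal(0, 2, n))

# Design matrix: intercept, two priors, indicators for schools 1 and 2
# (school 0 is the reference category).
X = np.column_stack([
    np.ones(n), prior1, prior2,
    (school == 1).astype(float), (school == 2).astype(float),
])
beta, *_ = np.linalg.lstsq(X, post, rcond=None)
print("school 1 vs 0:", round(beta[3], 2))  # should recover roughly +2
print("school 2 vs 0:", round(beta[4], 2))  # should recover roughly -1
```

Even this toy version hints at why the real work needs whiteboards: the two priors are highly correlated, and ignoring their measurement error biases the coefficients, which is exactly the kind of correction the tweaks above address.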

As for the development of the team, at the last PhD count, we were up to 2 in sociology, 2 in industrial engineering, 2 in educational leadership, 1 in statistics, and 6 in economics. We are probably going to add a couple more MAs in economics and statistics and a PhD in applied mathematics. I can also tell you that Fortran is alive and well here. We are hoping to start working with the Condor Project to address some of the computational problems of more complex VA models.

As the guy in charge of strategic planning and human capital development for this merry band, it's been an exciting 18 months.


Massachusetts Education Commissioner interested in VA

The new Commissioner for Education in MA is interested in VA for school and program evaluation. This notion that attainment alone does not tell enough of the story is popping up all over.


Ok. I'm going to try it again


I actually liked doing these posts. The pressure of starting up a major research and technical assistance center made this effort look like the back-breaking straw at the end of a long day. I am at the end of a growth phase and have added quite a few staff members. My goal is actually to bring in a few guest bloggers from amongst the VARCians to contribute analysis and materials.