Saturday, December 14, 2013

College and career readiness - what?


Who wouldn't think college and career readiness is a good thing?  Of course students should be prepared for college and career.  But what does this mean?  And why are we talking about college and career readiness as if they were the same thing?

First, let's consider college readiness.  Is this the same thing for all colleges?  How about for all majors even at the same university?  Does making sure all students can simplify radical expressions and solve quadratic equations mean they are all ready for every college program in every college?  What about technical college?  Business college?  Do all colleges have the same criteria for admission?  How can we have one set of standards to prepare all students for college when the requirements and expectations for different universities and different programs vary so greatly?

Now think about career readiness.  It's even more mind-boggling to think about preparing students for all careers with the same set of standards.  Seems to me we're hammering a lot of square pegs into round holes. How are we preparing the photographers, the artists, the dancers, the musicians, the athletes, the carpenters, the builders, the public servants when we expect everyone to meet the same expectations geared primarily to students with strong verbal aptitude?  

Nearly everything in the common core is geared to verbal skill and reasoning.  Reading complex text, writing to text, citing evidence from text.  How about the careers that lean more to concrete, visual-spatial intelligence?  Social-emotional intelligence?  Musical intelligence?  Students are not standardized.  Why should their education be?

And another thing - why do we spend so much time identifying and quantifying student weaknesses, focusing on what they can't do?  Is there room to also find what they can do?  Can we encourage them to grow and develop other skills and knowledge besides the ones in the standards?

I'm all for preparing students for college and careers.  I just happen to think that the best way to do this is to encourage students to discover and follow their passions.  To become lifelong learners.  To be equipped as confident, competent citizens.  To take risks.  To spread their wings.  I am not sure we can do that when we use the same standards for every child.

I don't have the answers.  Just lots of questions.  But perhaps we need to have some dialogue about where we're going with our obsession with standards and high stakes testing before we lose an entire generation of children.

What the FIP?

Student academic growth is a hot topic these days.  Many states are using student growth measures to evaluate teachers.  As I examine data with educators, one recurring question is "What can I do to impact student growth?"  Sometimes this query is a matter of professional curiosity, but sometimes it is accompanied by anxiety, frustration, and even panic, since 50% of a teacher's evaluation is based on this metric.

My usual answer is:  FIP.



What is FIP, you wonder? Glad you asked.  FIP stands for Formative Instructional Practices and it's hands-down the best way to improve teaching and learning that I've seen for a long time.  In fact, FIP isn't new.  It combines "unpacking the standards," developing clear learning targets, appropriate use of  feedback, collecting and using data to inform instruction, and the best of formative assessments all in one nicely organized initiative.  

Much of the training and work for FIP comes from Battelle for Kids (BfK), a non-profit organization headquartered in Columbus, Ohio, with a known track record for excellent, research-based educational products.  BfK has developed a series of online modules to guide teachers as they learn about and implement FIP in their classrooms.  FIP schools have found that not only do they improve their student growth scores, but they also see increased student engagement, self-confidence, and student ownership of learning.  

I'm a firm believer that students will not take responsibility for their own learning until we give them responsibility.  FIP modules give practical strategies teachers can use to increase student ownership.  BfK also has several subject-specific modules in which teachers can see what actual procedures can be set up to manage this system.

My personal experience with FIP comes through my grandson - I'll call him Joey (not his real name, or he would kill me).  He moved to a FIP school his 8th grade year.  Prior to his enrollment at Adams Middle School in Johnstown, Ohio, Joey hadn't done well in school.  As a young child he suffered from all kinds of medical problems that impacted his hearing and language development, and those effects follow his learning to this day.

At Adams, all of the teachers understand and implement FIP.  The school schedule is structured to allow students extra time to study and reassess if they don't achieve mastery on their first summative assessment.  The overwhelming sense for anyone entering the building is that every professional is involved in helping each student learn.

The amazing thing to me as a grandma is that Joey can now tell me exactly what he needs to study for tests.  He can articulate what he is learning in class.  And, best of all, Joey is a successful student with the confidence to take risks and learn new things.  Oh, Joey still sometimes fails tests the first time he takes them, but he sticks with it, studies, and ends up mastering the material so that he earns B's and C's - and even a few A's - in his classes.  He no longer feels stupid, and for a grandma, that's priceless.

No, I am not an employee of BfK, nor do I receive any benefit from plugging their work.  I'm just an educator interested in what works and a grandma who loves learning. 


Principal as Zombie

As I watched a zombie movie trailer the other day, it struck me how much the walking dead resemble the principals with whom I work in central Ohio.  I regularly hear comments like "It used to be more fun than this" or "I miss how I used to be able to interact with my staff and students."


Ohio has implemented a new evaluation system called the Ohio Teacher Evaluation System (OTES).  There are two parts to this evaluation: 50% is based on student growth measures and 50% is based on a teacher performance rubric.  The rubric is well-defined and based on known best practice.  I have much hope that it could be a tool that stimulates professional discussion and facilitates professional growth.  It could be, that is, if principals actually had time to do it.

Principals must manage a building of hundreds of students, supervise dozens of staff members, and interact with parents on a regular basis.  They attend ball games, plays, concerts, and Board meetings after hours.  On top of this already full agenda, the state of Ohio has added nearly 200 additional hours of work to every principal (and many, many more hours than this to some) by requiring that every teacher be evaluated every year.

Teacher performance does not drastically change from one year to the next.  Requiring a full-blown evaluation with multiple observations, conferences, and "walk-throughs" each year for every teacher is overkill, causing principals to be so overburdened that they can't take time to work the process in an effective manner.  Requiring evaluations every 2-3 years would allow for more time to actually make the process work.

Senate Bill 229, now in the Ohio House, would be a step toward setting up a system that could be used to improve teaching and learning.  In this bill, principals would only evaluate skilled teachers every other year and accomplished teachers every three years.  Teachers who are developing or ineffective would still be evaluated annually.  As educators, we want to do our very best.  Reducing the number of teachers that must be evaluated each year would make the task more manageable.  And maybe my principal friends wouldn't look so much like zombies.

When growth is not growth

Unless you're living in a cave, you know that teachers in many states are being evaluated and rated based on the scores their students achieve on standardized tests.  I have written about why this is a bad idea here, here, here, and here. Today, I'm going to explain yet another part of the system that is patently unfair to teachers.


Once upon a time, there was an excellent teacher in an excellent school district who taught excellent students.  In fact, the students were so excellent that they achieved far higher than most other students in the state in every way.  They competed in tests of scholastic aptitude, they excelled in debate and music, and of course, they always scored in the very top of statewide standardized tests.  

This year, the excellent teacher in the excellent school had seven of these excellent students in her excellent classroom.  When the standardized tests were given, she was quite confident that her students would do well, even though three of them had been out late the night before at a concert.  When the results were released, however, it turned out that four of the excellent students had excellent scores that were the same as, or a little better than, their scores from the previous year.  Unfortunately, their past scores had been so near the top that their total gain was very small.  One can't score better than 100%, after all, and all of the students had begun at scores of 93-99.  

What's really unfortunate, however, is that the three excellent students who always did excellent work and had excellent academic accomplishments didn't have such excellent scores on the state tests administered the day after the concert.  In fact, all three of them had scores that dropped anywhere from 12 to 28 points on the bell curve.  When the state averaged the net gain for this group of excellent students, the average change in growth was negative 8.  

The excellent teacher in the excellent school with the excellent students was devastated because, based on this data, the district was determined to be a FAILURE when it comes to student growth with their gifted and talented students.  

This story is based on a real situation in a real school district in Ohio.  And herein lies yet another problem with using Value Added Measures to determine teacher effectiveness: the average is not always the best representation of a set of data.  Let me give you another example.  Let's say 100 teachers are in a room, and we want to calculate the average income of the population of the room.  Looking at the wages of each teacher, we determine that the average annual income is $45,000.  Now, let's say Bill Gates walks into the room, and his annual income is $3,710,000,000.  We recalculate and find that the mean annual income in the room is now about $36,777,228.  Do you believe that calculating the average income gives us a truly representative, accurate look at this data?  
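To make the arithmetic concrete, here is a quick sketch (the incomes are of course hypothetical, per the example above).  Note how the median, unlike the mean, shrugs off the single outlier:

```python
from statistics import mean, median

incomes = [45_000] * 100           # 100 teachers in the room
print(round(mean(incomes)))        # 45000

incomes.append(3_710_000_000)      # Bill Gates walks in
print(round(mean(incomes)))        # 36777228 -- the mean explodes
print(median(incomes))             # 45000    -- the median doesn't budge
```

This is exactly why statisticians reach for the median (or trimmed means) when a data set has outliers or a hard ceiling.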

Of course not.  That's why using average is NOT a fair and accurate practice when there are students who are near the ceiling of the test score and/or there are outliers not representative of the overall student growth.  There is only so far that students can go up when they start near the top, but their ability to drop in score is out of proportion to what they can gain.  We simply can't use this as an accurate measure of student growth, and most certainly, we can't use this as a measure for a district's or a teacher's accountability.  

VAM: Size matters

What if I told you that two different teachers can get the exact same growth scores on the exact same test and have completely different Value Added scores?  Possible?  As it turns out, yes.  Fair?  I'll let you decide.

As a Value Added Leader (VAL) and educational consultant in Ohio, I have the opportunity to work with many teachers in several districts to look at their value added teacher level reports.  In case you have missed the news, Ohio determines educator effectiveness by measuring how much students in teachers' classrooms "grow" on mandated standardized tests.  Simply put, student scores are placed on a bell curve and then compared with where they place on the bell curve the following year.  Those changes in placement on the bell curve (measured in Normal Curve Equivalent, or NCE, scores) are averaged across a classroom to get a mean NCE gain.  In order to be "most effective," a teacher's students must have a mean NCE change at least 2 standard errors above the mean growth score.

That's a lot of math talk, I know, so let me explain a little about "standard error" for those not familiar with statistical concepts.  Standard error is basically a measure of the confidence I have in the data.  If I have a LOT of data, my standard error is small, since I have more confidence in the data.  When I have fewer data points, I'm not so confident, so the standard error is larger.  A couple of factors have a direct impact on the size of the standard error - the size of the population and the spread of the scores.
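Here is a small sketch of that relationship.  The standard deviation of 10 NCE points is an assumed figure for illustration, not from any actual report; the point is simply that the standard error of a mean shrinks with the square root of the number of students:

```python
import math

def se_of_mean(stdev, n):
    # Standard error of the mean = standard deviation / sqrt(sample size)
    return stdev / math.sqrt(n)

# Same spread of student gains (assumed stdev of 10 NCE points),
# different class sizes:
print(round(se_of_mean(10, 25), 2))   # 2.0  -- an elementary-sized roster
print(round(se_of_mean(10, 120), 2))  # 0.91 -- a middle-school teaching load
```

Same spread in the scores, but the teacher with 120 students ends up with a standard error less than half that of the teacher with 25.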


To put this in terms of a classroom teacher's rating, teachers in middle schools typically have 120 or so students and elementary teachers have maybe 25.  Special education teachers or gifted intervention specialists have even fewer, and if two or more teachers work with the same students, their numbers are decreased even further since the students are "linked" to all of the teachers who contribute to instruction.  The teachers with more students will have a small standard error and the ones with fewer students have a large standard error. So what?  The problem is that this becomes a big deal when determining a teacher's effectiveness rating.  Let's look at an example that I encountered at a school just yesterday.

Two middle school math teachers, one general ed and one special ed, co-teach a class of sixth grade math.  We will call them Mrs. A and Mrs. B.  They did an outstanding job, and their students, all low-performing in the past, did quite well on standardized tests.  Their mean NCE change was about 5.  Another teacher in the same building, Mr. C, teaches three classes of the same subject each day and had similar results - a mean NCE gain of 5.  In other words, their students grew the same amount on their standardized tests - the teachers all produced equal "growth" in terms of how our legislature defines growth.  The standard error for Mrs. A and Mrs. B was 4.9.  Mr. C, with similar results, has around 70 students and a standard error of 1.9.  Remember: more students, more confidence in the data.  Same growth, different standard errors because of the difference in the size of the teachers' classes.  Mrs. A and Mrs. B have the smaller class, and because they both "link" to all of their students, they each get credit for only 50% of their students' results.

In Ohio, the "most effective" teachers are those with a gain index of 2.0 or more; that is, their mean NCE change is 2 or more standard errors above the mean.  Teachers with a gain index of 1-2 are "above average" - their mean student gain is between 1 and 2 standard errors above the mean.  "Average" teachers are within 1 standard error of the mean in either direction.  Teachers between 1 and 2 standard errors below the mean are "approaching average," and those whose mean student change in NCE scores is more than 2 standard errors below the mean are "least effective."  It's all about the standard error - but, as I've explained, standard error depends on the size of a teacher's class.

The exact same growth in student achievement resulted in a gain index of over 2 for Mr. C in our example above, and so he is lauded as one of the "most effective" teachers in the state.  Mrs. A and Mrs. B, however, with the exact same gain but a standard error of 4.9, are rated merely average.  Same results - same gain in student achievement - but the teachers are evaluated very differently because of the size of their classes.
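The banding logic can be sketched as follows.  One caveat: the assumption that the 50% linkage roughly halves Mrs. A's and Mrs. B's credited gain is my reading of the example above, not an official formula:

```python
def effectiveness_rating(mean_nce_gain, standard_error):
    """Ohio-style rating bands based on the gain index:
    mean NCE gain divided by its standard error."""
    gain_index = mean_nce_gain / standard_error
    if gain_index >= 2:
        return "most effective"
    if gain_index >= 1:
        return "above average"
    if gain_index > -1:
        return "average"
    if gain_index >= -2:
        return "approaching average"
    return "least effective"

# Mr. C: ~70 students, mean gain 5, standard error 1.9
print(effectiveness_rating(5, 1.9))        # most effective (gain index ~2.6)

# Mrs. A and Mrs. B: credited with 50% of results, standard error 4.9
print(effectiveness_rating(5 * 0.5, 4.9))  # average (gain index ~0.5)
```

Identical classroom growth, very different ratings - the difference is entirely in the denominator.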

If this sounds unfair to you, you would be correct.  There are MANY problems with using standardized test results and norm-referenced testing for accountability that I addressed before here, here, and here.  But for teachers looking at "average" ratings, this problem is significant.  My effectiveness should not be determined by the size of my class.

Wednesday, November 27, 2013

To fail or not to fail, that is the question

A particularly disturbing meme making the rounds in social media boldly states:  "In some schools, they have abolished failing grades and they'll give you as many times as you want to get the right answer.  This doesn't bear the slightest resemblance to ANYTHING in real life."


When something is repeated widely, it can take on a truth of its own.  People accept it without question because of the ubiquitous nature of the quote without thinking through the underlying meaning.  I maintain that giving students the time and support they need to actually LEARN something is far more helpful than failing them the first time and moving on.  Let me explain.

In the real world, I have multiple chances to pass the bar exam, take a driver's license test, or demonstrate proficiency on the Praxis exam.  I have used this metaphor before: before a pilot is entrusted with landing a plane on his own, he must be given repeated attempts to demonstrate that he has acquired the skill.  Some will take fewer attempts, but ALL must learn to land the plane safely.  I don't show the pilot how to land one time and then give him a test.  No, I provide much practice and multiple attempts to reach proficiency.  On the job, if I am being trained to do a new task, my employer makes sure that I get the training I need to do the task correctly.  It may take some of us longer to learn, but multiple attempts to grasp the necessary knowledge are afforded so that all employees, in the end, can do the expected task.  Even after an employee is trained, if a job is botched, it will likely be sent back to be redone so that the final result is acceptable.  It is the rule, rather than the exception, in life that we have more than one chance to master the skills we are learning.

Why do we accept this notion that all children must achieve the same standard at the same time?  Historically, grades were used to sort and rank students.  Rather than serving as communication about student achievement relative to clear learning standards, grades were intended to put children into tracks.  These students would go to college, these would enter a trade, and these would be relegated to working on the farm or in an entry-level position in industry.  This sort and rank strategy worked in an industrial society where children could make productive livings on farms or in factories with little formal training.  College was not intended to be for all students.  Education was set up to be like the assembly line - all children moved along at the same pace and were sorted at the end of the line by grades, the quality control of the system.  

While this process functioned as intended in an industrialized economy,  it is disastrous in the twenty-first century.  Today, we know that all children can and do learn.  We understand that high expectations are possible.  Teachers strive to challenge each student at his or her own level, so that all children move forward toward clear learning targets.  It is vitally important that all children acquire the skills and knowledge necessary to be successful adults.  This means that ALL children should be given the opportunity to learn.  Assembly-line education is obsolete.  Grades should inform us of student achievement relative to learning goals.  While the goals may be the same for all children in a classroom, the time it takes each child to reach the goal is the variable.  

In the past, the TIME was the constant and the learning outcome was the variable.  All children didn't learn the same high standards.  Today, TIME is the variable and the learning outcome is the constant.  Quite simply, it takes some of us longer to learn than it does others, but that doesn't mean we all can't learn.  Students who fail a summative assessment initially are given extra time and support to reach the same level of mastery as students who grasp the material the first time presented.  Failing grades are not a mark of rigor.  They are a sign that someone has given up.

Tuesday, November 26, 2013

Sample Superintendent Letter for Ohio

Here is a letter that Ohio superintendents concerned about common core implementation and PARCC assessments can use as a template or model to send to Ohio legislators as they contemplate HB 237.





Dear (state legislator),

We are a group of superintendents from districts in Ohio who would like to express our concern over recent developments and requirements in K-12 education in our state.  You are examining a bill to halt implementation of the common core standards, as well as the PARCC assessments that are due next school year.  We believe that it is important for you to hear our voices and concerns as you debate this legislation.

There is much good in the common core standards, adopted in Ohio as the New Learning Standards in English/Language Arts and Math.  Our teachers are using Close Reading strategies and taking students more deeply into complex text.  We are using more focused math standards and solving rich problems.  We are purchasing materials to help with the instructional shifts required. However, we also have grave concerns about the common core standards and the assessments that accompany them.

First of all, child development specialists, early childhood experts, and teachers of young children are concerned about the cognitive level and developmental readiness required in the new standards.  The standards were developed by selecting the SAT score that would be required to achieve a B in a 4-year college program, and then back-mapping the skills and knowledge to preschool.  This is unrealistic and certainly not research-based.  If we are to implement these standards, early childhood experts MUST be involved in developing the benchmarks for young children.  

We are also concerned that these standards are totally untried.  They may lead to children being more prepared for college and career, and they may not.  They may also lead to higher dropout rates, frustration, and discouragement; we simply don’t know.  With the extreme accountability measures in place for teachers and schools, undue emphasis is placed on English and Math standards that may or may not work.  Not only that, but with budget cuts and such high stakes on standardized tests, subjects that are untested are falling by the wayside.    

Another question concerns the purpose of the new standards.  We have heard that they are intended to prepare students for college and career; however, research suggests that there is no correlation between student achievement and the rigor of standards.  We posit that the standards were developed for the purpose of creating a national market for companies that sell educational tests, textbooks, and test-prep materials.  Bill Gates, at the 2009 National Conference of State Legislators, stated, “When the tests are aligned to the common standards, the curriculum will line up as well—and that will unleash powerful market forces in the service of better teaching.”  Our children deserve better.

Next Generation Assessments will be piloted this spring and imposed on our schools next year, carrying high stakes with them as well, even though there is grave concern that we have neither the number of computers nor sufficient bandwidth to accommodate so many students taking online assessments at once.  Again, the tests are untried, hurriedly prepared, and designed to fail 70% of the students taking them.  There is no established reliability or validity for these assessments.  Using them to notify 9-year-old children that they are not on track for college and career is ludicrous, and using them to evaluate teachers is equally absurd.

In addition, our children will spend literally days of instructional time in lengthy assessments.  Children in grades 4-8 will spend 9.5 hours in standardized testing for the ELA and math assessments alone, and those who are most needy and require extended time, even longer.  Science and social studies assessments for Ohio are expected to mirror the format of PARCC assessments, adding a minimum of four more hours.  Children will also be required to take an assessment in “Speaking and Listening,” with no projected time specified yet.  When children are in the computer labs or classrooms for extended testing time, the operation of the entire school is disrupted.  The PARCC assessments alone take 40 days of assessment (20 for performance assessments and 20 for end-of-year assessments) to rotate all children through the limited computers we have.  If even half that number of days is required for science, social studies, and speaking and listening, our schools will be disrupted for 60 days of the school year.  This is a conservative estimate.

With these concerns in mind, we urge you to support HB 237 to halt the implementation of the common core and PARCC assessments.  Review the standards with child developmental readiness in mind.  Pause high-stakes testing until we can transition properly to new standards that are good for all of Ohio’s students.