Shadow & Genuine Assessment

Shadow assessment occurs when programs focus on what they think other people want to hear, as opposed to what is practical and meaningful for them. There are many reasons why programs engage in shadow assessment:

– Fear of assessment and evaluation.
– Lack of time, energy or motivation.
– Fear that assessment results will be used against someone, or to prove someone else wrong.
– Fear resulting from a dysfunctional or toxic work environment (see Where Assessment Works Best).
– Fear of having honest conversations about a program, usually leading to cynicism and skepticism.

With the exception of one item, the main theme running through shadow assessment is fear. This has almost everything to do with the organizational climate and culture of a program or institution. If a culture or climate is toxic, most people are going to engage in shadow assessment and just communicate what they think others want to hear. Because assessment is so dependent on genuine and civil conversations, I am not sure there is a lot one can do to foster a culture of genuine assessment in these environments.

In most programs and institutions, however, a culture of genuine assessment can be fostered. Genuine assessment is honest, transparent, and used for program self-reflection and improvement.

Note. Being genuine is not the same as being transparent. People get them mixed up all the time. Being transparent is about posting items to a website or message board, or sending an email. That is transparent communication, but it is not a genuine conversation, because it’s not a dialog – it’s a two-way monologue. Sharing assessment results on a website may be helpful in terms of access to information, but it’s not the same as sitting down with a person or a group and engaging in a real conversation about program improvement. There is a significant difference between what the data are and what the data mean.

Institutional leaders and colleagues can do several things to foster a culture that values genuine assessment instead of shadow assessment.

  • Reinforce the idea that assessment is about improvement first, and compliance second. In my experience, a program that is doing a good job with assessment anyway shouldn’t have to worry about compliance.
  • To paraphrase George Patton, tell people what to do, not how to do it. They will surprise you with their creativity and ingenuity. Granted, there are best practices in assessment, and there are a lot of great methods out there. Many programs are doing very creative things with assessment. Some programs, particularly those with less mature assessment expertise, will probably appreciate some kind of framework. Most programs, however, will naturally resist a process they feel they have little control over.
  • With responsibility comes authority. If programs are going to be responsible for assessment, they need to feel like they have authority over the process. The worst situation arises when programs feel like an authority has forced an evaluation process on them, but that authority has absolved itself of all responsibility and pushed it down to the program level. Instead, programs should feel like they have support that will help share the responsibility.
  • Resist commenting on the substance of a program. We cannot possibly be experts in every discipline. Expertise needs to be respected. Leaders and assessment professionals should remind programs they are only there to communicate institutional expectations and provide support, not judge the substance of a program. (Note. This may be different in program review, where budgeting and planning decisions need to be made. The point is to be clear about expectations and the purpose of the evaluation).

When potty training my daughter, we thought it would be a good idea to use cookies as an incentive. While making dinner one day, she toddled into the kitchen and asked me for a cookie. I told her “no,” as we were eating dinner soon. I could tell by the look on her face that she was strategizing: how can I get a cookie? Within seconds, she told me “I have to go potty.” We stopped using cookies as an incentive, and she was trained within a week. Not using an incentive was a much more effective strategy for us.

By focusing on support and professional development, as opposed to compliance and rewards, one can go a long way towards fostering a culture that discourages shadow assessment.

Posted in Culture

Getting Assessment Results Off the Shelf: The Where and the When of Assessment

If you look at many program and institutional assessment plans, they contain a grid. They usually look something like this:

Outcomes | Data Needed | Groups Assessed | Methods | Who | Timeline

This is a very useful and practical way to get organized with assessment, particularly the parts about using data that is already available. Many programs, however, hit a roadblock when completing the timeline column. A common entry in the “timeline” column is something like this:

– Fall 2013
– After Thanksgiving break
– Summer 2011
– End of the semester

These timelines almost never work. The problem is that people get busy with the most crucial elements of their work, and the parts of their job they care about most. Additionally, the timelines are so vague that it’s very easy to just put them off.

Even when programs are specific about using assessment results, problems can still occur. It’s all too easy to put off a meeting, particularly when these discussions are slated to occur at the end of the semester.

A better strategy is to focus on the where, as opposed to the when. Rather than planning to discuss or use assessment results in “March 2015,” state that you will discuss and use the results at the Annual Planning Retreat in March 2015. Instead of setting aside “Spring 2013,” maybe set aside the second bi-weekly meeting of every month to talk about assessment results. Instead of planning to discuss and use the results at “on-going curriculum committee meetings,” maybe include a presentation of assessment results as an agenda item one or two meetings before the curriculum revision meeting.

Keep in mind that the conversation and dialog that occur as a result of assessment are much more important than the results themselves. Assessment data, by itself, doesn’t do anything until actual human beings place value on and use the data. Assessment is a form of action research, and its ultimate utility lies in whether it is used for improvement.

Posted in Methods

Serendipitous Assessment

An Art History major in college, I never felt confident in science and math (although I later liked learning statistics). In my senior year, I needed one more hour of science. I discovered an obscure one-hour course in the biology department. To complete the hour, students had to read three books and write papers about them. This seemed like a good fit for me.

I met the biology professor and explained I had no idea what to read. He pulled a random book off the shelf called Serendipity: Accidental Discoveries in Science. I thought the idea of serendipity was wonderful and I really enjoyed the book. The idea that science is not a rigid field dictated by set rules and guidelines was very new to me. Human subjectivity and accidents can have a significant impact on science (Thomas Kuhn explored this idea in The Structure of Scientific Revolutions).

Is assessment, as a discipline, open to serendipity? It seems to be largely driven by learning outcomes and goals. This is not to suggest that goals or outcomes are not important. Goals and outcomes provide direction. They serve as symbols for what we care about. Goals communicate our aims. Too much direction, however, takes away from agility.

I think there were four causes that led to assessment becoming a linear, goal-oriented exercise that leaves little room for agility, openness, and discovery:

  1. As a discipline, assessment’s foundational roots are in empirical research, specifically quantitative-oriented research. This research is driven by the scientific method and emphasizes rational and orderly approaches to inquiry.
  2. In the 1980s and 1990s, accreditation became the reason for doing assessment, not intellectual curiosity. This created a high-stakes environment with little room for curiosity and random discovery.
  3. Accreditation increasingly borrowed tools and methods from quality improvement and strategic planning models, and assessment adopted many of these models in response to accreditation mandates. All of a sudden, TQM, CQI, benchmarking, and all kinds of management fads caught on. One of the strangest to take hold, in my opinion at least, was Six Sigma. A primary rationale behind these quality-improvement methods was minimizing variability. This seems like an odd fit for higher education, which is purposely designed and structured to be diverse on so many levels.
  4. Assessment became a strategic planning exercise and, thus, closely aligned with processes largely unrelated to learning outcomes, like planning and budgeting. In a classic article, The Fall and Rise of Strategic Planning, Mintzberg describes what can happen when leaders adopt this approach:

The problem is that planning represents a calculating style of management, not a committing style. Managers with a committing style engage people in a journey. They lead in such a way that everyone on the journey helps shape its course. As a result, enthusiasm inevitably builds along the way. Those with a calculating style fix on a destination and calculate what the group must do to get there, with no concern for the members’ preferences…. calculated strategies have no value in and of themselves… strategies take on value only as committed people infuse them with energy (Harvard Business Review, January–February 1994, p. 109).

Serendipitous assessment can work in the following ways:

  • Eliminating narrow definitions of what it means to use assessment data.
  • Respecting inter-institutional variability among programs and departments.
  • Realizing it’s okay to have frameworks and best practices, but allowing some flexibility in terms of their use and implementation.
  • Focusing on assessment processes and conversations, as opposed to exclusively on results.
  • Allowing for multiple methods.

Data-driven decision making doesn’t work. People, not data, make decisions. People informed by good data make better decisions. Life is a combination of what we intend and what happens along the way. Good assessment plans and processes should capture both.

August 2015 Update: In Grading Student Achievement in Higher Education, author Mantz Yorke describes a new form of validity that is associated with serendipity. I thought it was applicable to this post. Here is the excerpt:

What is missing from the list (of forms of validity, including predictive, concurrent, content, and construct) produced by Cronbach and Meehl – and generally from textbooks dealing with validity – is any conception of validity in terms of the capacity of the test to reveal something unexpected. In assessing students’ performances, there may be a need to go beyond the prescribed assessment framework in order to accommodate an aspect of achievement that was not built into the curriculum design. This does, of course, create problems for grading systems, especially when they are seen as measuring systems (p. 21-22).

Posted in Culture

Differentiating Between Levels of Assessment: Program Goals and Learning Outcomes

One of the problems with assessment is a lack of clarity about what is being assessed. A lot of this has to do with language – metrics, goals, learning outcomes, performance indicators, etc. can all be confusing. Overuse of this language can exhaust people and turn them off to assessment. A really good paper by Susan Hatfield clarifies and simplifies a lot of this. The paper’s better, but here’s a summary:

1. Learning outcomes should focus on students and what they learn. In this scenario, students are the units of analysis, not the program. You can assess what individual students have learned, or a cohort of students. Sometimes, this can be a broad program goal. For example, a program may have as a goal that students will learn teamwork skills. If the program agrees on an assessment method, creates space to talk about the results and everyone is willing to make changes based on the results, then it’s probably an acceptable program goal.

As an aside, the conversation and dialog about the assessment results matter just as much as, if not more than, the process or the data produced. Addressing one learning outcome a year in a meaningful way is a much better use of your time than addressing three or four a year that you don’t care about. Data do not drive or make decisions – people do. Data should inform decision-making, not drive it. Relying solely on data takes the human element out of decision-making and ignores contextual factors and subject-matter expertise.

Another advantage to keeping assessment simple is that it frees your unit or program to discover new things, many of which were probably unanticipated. This is serendipitous assessment.

2. Program goals refer to broader, operational aims. For example, a program goal may be to increase external funding through grants and fundraising. Another program goal might be to improve lab space, or to increase the hiring of faculty or staff in a specialized field. These types of goals are indirectly related to learning and, in this context at least, should not be viewed as learning outcomes. A problem arises when a unit decides to treat all goals as learning outcomes. That is a difficult argument to make – how exactly does increased external funding directly impact learning? Even more difficult, how does one measure that? This is not to suggest that it’s impossible, or that indirect items like space, funding, or human resources don’t impact learning. Rather, it is just practically problematic to measure in the context of program assessment and evaluation.

I think of goals and learning outcomes like squares and rectangles. Any square can be a rectangle, but not all rectangles are squares. Similarly, any learning outcome can be a program goal, but not all program goals can be learning outcomes.

So, what do you do when you have an annual report, accreditation report, or program review document to write? I think it’s a good idea to separate program goals and learning outcomes. This may seem like just more work, but I don’t think so. Goals are things a unit does continuously anyway. They might be expressed in a strategic plan. If you focus on just assessing one learning outcome a year, after four or five years, you will have a pretty robust and comprehensive report. Or, one could focus on program goals, and then articulate how learning outcomes support each goal.

Posted in Methods

Where Assessment Works Best

(Originally posted: Wednesday, February 19th, 2014)

In Strategic Planning for Public and Nonprofit Organizations, John Bryson states that strategic planning works best at places that need it the least, and worst at places that need it the most. The message is that if an organization is mired in politics, negativity, leadership turnover, board micro-management, poor selection of leaders, and all the other things associated with dysfunction, it is hard for planning to work, because people are going to base decisions on narrow political agendas, not data or plans, or even what is best for the people they serve.

However, if an organization takes care of and retains good, long-serving leaders, engages in healthy politics, focuses on problems and not people, and is moving forward in a healthy manner, then good governance, people, and leaders instinctively know the right course. Data and planning work well in these environments because leaders defer to plans that had input from, and were vetted by, a variety of people. Leaders make decisions based on evidence and what’s best for the organization, instead of relying on an agenda or what’s best for their careers. This Dilbert cartoon highlights an example of a bad organization considering a dashboard.

The same principle applies to assessment. Assessment works best at places that need it the least, and worst at places that need it the most. How does one approach assessment in these two types of organizations? If you work for a healthy organization, then assessment is about supporting the agreed-upon process, faculty and staff support and development, and meaningful dialog (not to be confused with a two-way monologue – talking but not listening).

In unhealthy organizations, there isn’t a lot one can do. Since assessment results are rarely used in decision-making (at least not intentionally), people are naturally going to be skeptical. There is also going to be an obvious lack of trust. The extent to which these barriers can be overcome will determine the success of quality improvement efforts like assessment in unhealthy organizations.

Posted in Culture

Making Assessment Useful: Why People Are Really Bad At Using Results for Decision-Making

(Originally posted: Friday, January 24th, 2014)

One of the most common challenges people face in assessment and planning is use. Some or maybe many programs go through a lot of work to gather data, talk in meetings, and sit through presentations, only to have the results sit on the shelf.

How does this happen? How do we allow this to happen? Furthermore, why do we allow it to happen? It is easy to blame assessment and planning frameworks, but ultimately these are just tools. People have the power to adapt these tools and use them for their own purposes.

If assessment and planning exercises are found to have little value, it is usually due to one of three reasons.

1. Assuming the process is the outcome. When students participate in a leadership program or take a class, most people assume they’ve learned something or somehow changed as a result. While a reasonable assumption, it is still an assumption. The number of students who participate in a leadership program, or the mere provision of the program, by itself doesn’t say much about the impact the program had on students. This is not to suggest that participation counts are meaningless; they are a measure of efficiency, and maybe effectiveness, but not of impact. (These categories come from a really good introductory book to evaluation, The ABCs of Evaluation).

In a situation like this, the program’s learning goals, outcomes, and activities aren’t examined, reflected on, or even really discussed. In Utilization-Focused Evaluation (1978), Patton describes two situations where this can occur:

  1. Programs operating under a charity model evaluate program success and worth by the amount of faith, hope, hard work, and emotions put into the program by staff. Obviously, evaluation will not be effective because it is seen as an exercise in questioning staff intentions and faith. It is just assumed that the program is effective, regardless of what an evaluation may or may not reveal.
  2. In programs operating under a pork-barrel approach, program effectiveness is measured by a policy maker’s or constituent’s will. If the program is effective, it is because a policy maker says it is or because it has the strong backing of a constituent group. Evaluation really doesn’t matter in this situation, and results will either be ignored or used for political advantage.

Programs that assume the process is the outcome are in a precarious position from an accountability standpoint. Without an assessment or evaluation process, it’s very easy for others to evaluate and judge the merit or worth of the program for you. This is not to suggest that one should engage in shadow assessment. Even evaluation that produces sobering results demonstrates accountability and attention to improvement, which is better than saying nothing at all.

In a situation where the process is the outcome, an assessment framework or cycle would only show the outcome and the activity, but nothing else:


2. A failure to make the results meaningful. The second obstacle to using assessment in decision-making is a failure to make the results meaningful – to evaluate them. There’s a big difference between what the data are and what the data mean. Data-driven decision making is a misnomer. Data don’t make decisions – people do.

In this situation, the assessment data is actually gathered, but not used. Gathering assessment data and not using it is worse than not doing any assessment at all. If a program isn’t doing an assessment, at least time is not wasted on assessment processes that will never be used.

Assessment is a form of action research. Thus, its value lies in its utility. If there is no plan to use the assessment data, then the assessment should not be conducted. The only exception may be when an external agency – like a government agency, granting agency, or accreditation group – specifically requires the assessment. Institution-wide processes related to planning and budgeting will also require assessment for compliance reasons.

For a program like this, the evaluation of the data and the use of it are missing from the assessment framework. It might look something like this:


3. A failure to act. The third challenge to using assessment in decision-making is a failure to act. In this situation, a program has gathered the data, assessed it, and evaluated it. But, for some reason, it failed to act. The assessment cycle resembles this kind of framework:


There are several reasons for a lack of action, even when evidence exists and it has been evaluated:

  • Budget and resource constraints.
  • A dysfunctional and toxic culture that makes it difficult to act.
  • Leadership or staff turnover.
  • Exhaustion or fatigue.
  • Distrust of anything that has the words ‘assessment’ or ‘evaluation’ tied to it. (I once attended an internal conference where people presented wonderful results of efforts to enhance learning and retention in classrooms and programs, none of which ever showed up in program review or assessment reports. When asked, the people in the room never associated their work with assessment and evaluation, which they viewed mainly as an administrative task aimed at compliance and, even worse, control). To be fair, we can’t blame a lot of people for thinking this way, because….
  • … some administrators identify and define assessment and evaluation as a formal planning or quality-improvement exercise. Assessment and evaluation can certainly be used to inform and feed into larger planning and quality-improvement processes. But the measurement and process principles that are characteristic of quality-improvement techniques (statistical control, reduction in variation, etc.) are difficult to apply to program evaluation.
  • Vague ideas about when the evaluation data will be used, or an unrealistic timeline. For example: “Results will be evaluated in mid-July.”
  • Assessment is viewed as an add-on, and not integrated or embedded into the curriculum.
  • Assessment is viewed as an evaluation of people, not processes.
  • Sometimes programs will state that communication is the reason assessment results aren’t shared. Communication is never a cause; it’s a symptom. If people don’t like each other, of course they will have problems communicating with each other.

Fortunately, programs that have evaluated their assessment data, engaged in dialog, and generated ideas about what they think is important are in a good position. The ideas should be documented in a report or other format, so that when the time comes to act, the program is ready and has an assessment framework that looks like this:


Acting on Assessment Data

A limitation of assessment frameworks is that they merely state: “use the results.” That’s easier said than done. There are so many factors that influence decision-making in colleges and universities, almost all of which aren’t captured in regular assessment cycles (see the figure on the right, below). To be fair, models necessarily simplify processes and can’t possibly capture all variables.

In Utilization-focused Evaluation (1978), Patton describes a situation where a program was struggling to come up with evaluation questions. After a lot of agonizing, debating, and indecision, Patton walked up to a board on the wall, and wrote:

“I would like to know _______ about my program.”

A program struggling with this can change the language of the assessment cycle/framework by using questions. Merely changing the language of a question or directive can have a significant impact on the answers. For example, a recent study from the Harvard Business School found that changing one word in a question can have a dramatic effect on the answer and the action of the recipient. Consider this question:

  1. What would you do if you won the lottery?

Most people will say things like: quit my job, buy a house, buy a new car, travel, etc. Now, think about this question:

  2. What could you do if you won the lottery?

With the change of just one letter, most people will give dramatically different answers, like: work on ending poverty, volunteer at an animal shelter, learn a musical instrument, etc.

Similarly, an assessment model that poses questions, like the one on the left below, can give different answers and, for some programs, be more helpful. A great start, then, in making assessment meaningful, useful, and practical is to ask questions like:

“What does the assessment data mean?”
“Where do we go now?” “How do we get there?”


Note. Graphic on left from J. Chapman, Heartland Comm. College, 2011.

Posted in Culture, Methods

Getting Organized with Assessment – A Framework

(Originally posted: Thursday, January 9th, 2014)

Assessment and evaluation can be confusing to people. I have found that this confusion is usually the result of two things:

1. A lack of clarity about what is being assessed. There is a difference between program goals and learning outcomes. A program goal may be to increase research funding or expand lab space. These items are operational in nature, and indirectly support learning. I suppose a program goal could be a learning outcome, but personally I would separate them. There is a really good paper by Susan Hatfield that touches on this subject. In fact, if you are new to assessment, this is probably the best introduction out there, and it is only eight pages.

2. A lack of clarity about how assessment can be used to evaluate learning. A college degree tells you very little about what someone has learned. I have met several brilliant people who lack college degrees, and vice versa :). An engagement survey, like the National Survey of Student Engagement (NSSE), tells you a little more, but engagement surveys only look at processes that support learning. At the other end of the spectrum, an achievement test, like the GRE or the NCLEX (nursing licensure exam), tells you a lot about what someone has learned. Here is a framework for thinking about it:


Posted in Methods