By Ken Nowack, President & Chief Research Officer/Co-Founder of Envisia Learning Inc.
“There’s nothing wrong with being shallow as long as you’re insightful about it…” – Dennis Miller
Have you ever asked someone for feedback and wondered whether that person's perception of you was accurate? Have you ever asked more than one person for feedback and wondered why there were differences in how you were perceived?
Ideally, if we could all find the one “all knowing and totally honest” person in the world we wouldn’t need to ask a variety of people to provide us with feedback–we would know the “truth” about us immediately.
Having used multi-rater or 360-degree feedback for many years, I can tell you that raters don't really see and experience the same things about your behavior. In fact, there can be as much disagreement within a single rater group (e.g., among your colleagues or peers) as there is between different groups.
Most vendors of 360-degree feedback products won't tell you what researchers in the field actually know about between-rater and within-rater agreement, so I thought I'd share a bit about this with you [1].
How We Perceive Ourselves and How Others Perceive Us
- Ratings between different rater groups are only modestly correlated with each other.
Research consistently shows that ratings from direct reports, peers, supervisors, self, and others overlap only modestly [2]. Self-ratings are typically weakly correlated with other rater perspectives, with greater convergence between peer and supervisor ratings [3].
Simply put, we are often seen differently by the various groups of people we interact and work with (e.g., our boss, our direct reports, our team members). As Brutus, Fleenor, & London (1998) note, ratings from different sources are not necessarily expected to be interchangeable or even highly correlated with each other, despite the finding that multi-rater assessment instruments generally function equivalently across the traditional rating sources.
It seems intuitive to expect some differences in perspective across rater groups. In general, direct reports tend to emphasize and filter interpersonal and relationship behaviors into their ratings, whereas superiors tend to focus more on "bottom line" results and task-oriented behaviors [4].
However, these meaningful rater-group differences can also confuse participants trying to use the results to decide which specific behaviors to modify and which stakeholders to target. This ambiguity in interpreting multi-rater feedback matters in light of recent research suggesting that even mildly neurotic people report more distress when oral and written feedback is uncertain than when they are given direct negative feedback (Hirsh & Inzlicht, 2008).
At a practical level, it means that participants might be challenged to understand how to interpret observed differences by rater groups and whether to focus their developmental “energy” on managing upward, downward and/or laterally in light of these potentially discrepant results.
- Ratings within rater groups are only modestly correlated with each other.
In one meta-analytic study by Conway & Huffcutt (1997), the average correlation between two supervisors was only .50; between two peers, .37; and between two subordinates, only .30. Greguras and Robie (1998) explored within-source variability in a study of 153 managers using 360-degree feedback. Using generalizability theory, they analyzed the number of raters and items required to achieve adequate reliability in practice [5].
These researchers suggest that if a 360-degree feedback assessment uses an average of 5 questions to measure each competency (not uncommon in practice), it would require at least 4 supervisors, 8 peers, and 9 direct reports to achieve acceptable levels of reliability (.70 or higher). Since our coachees can rarely find that one "all knowing and candid" rater to provide specific and useful feedback, recruiting an adequate number of raters within each group is critical to ensure data accurate and reliable enough to guide behavioral change efforts.
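To get a rough feel for why so many raters are needed, the classical Spearman-Brown prophecy formula can be applied to the Conway & Huffcutt single-rater correlations. This is only a simplified back-of-envelope sketch: Greguras and Robie used generalizability theory, which also models item sampling, so their published rater counts (4, 8, and 9) run higher than this formula alone would suggest.

```python
# Spearman-Brown prophecy formula: reliability of the average of n
# parallel raters, given the correlation r between any two raters.
def spearman_brown(r: float, n: int) -> float:
    return n * r / (1 + (n - 1) * r)

def min_raters(r: float, target: float = 0.70) -> int:
    """Smallest number of raters whose averaged ratings reach the target reliability."""
    n = 1
    while spearman_brown(r, n) < target:
        n += 1
    return n

# Single-rater correlations reported by Conway & Huffcutt (1997)
for group, r in [("supervisors", 0.50), ("peers", 0.37), ("direct reports", 0.30)]:
    print(f"{group}: r = {r:.2f} -> at least {min_raters(r)} raters for reliability >= .70")
```

Even under this optimistic simplification, the lower the agreement between any two raters in a group, the more raters you must average over before the composite score becomes trustworthy.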
From a practical perspective, since reliability sets an upper limit on validity, having too few raters in the 360-degree feedback process can actually limit the usefulness of the feedback given back to participants. Given these findings, vendors who do not give participants a way to evaluate within-rater agreement increase the probability that the average scores used in reports will be misinterpreted, particularly when coaches use them to help coachees select specific competencies and behaviors for developmental planning.
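The "upper limit" point follows from classical test theory: the correlation between an observed score and any criterion cannot exceed the square root of the score's reliability. A minimal illustration (the reliability values below are hypothetical, chosen only to show the effect of rater count):

```python
import math

def max_validity(reliability: float) -> float:
    # Classical test theory bound: r_xy <= sqrt(r_xx)
    return math.sqrt(reliability)

# Hypothetical scale reliabilities with few vs. many raters
for raters, rel in [("2 raters", 0.49), ("9 raters", 0.81)]:
    print(f"{raters}: reliability {rel:.2f} -> validity capped at {max_validity(rel):.2f}")
```

In other words, an unreliable composite score cannot predict anything well, no matter how valid the underlying competency model is.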
Best Practices for Getting Reliable (Accurate) Feedback from Others
Given the research about feedback from raters, how can you maximize the usefulness and meaningfulness of information you get from others?
Here are at least six things to consider doing:
- Communicate the purpose to your selected raters ahead of time and let them know you value their confidential input for your own growth and development.
- Ask more people than you think you need (remember that you need a “critical mass” within each rater category to minimize outliers and get a true picture of reality).
- Ask raters to provide concrete, constructive and behavioral suggestions for what you can do more, less or differently to be more effective.
- Follow up with all raters to thank them for their time and to share an insight, learning, or commitment to action (this helps signal that you heard their message).
- Don’t try to guess individual responses–look for themes from all raters to get a snapshot of your signature strengths and potential development areas.
- Review the feedback in balance: neither dwelling on anything that appears overly critical or negative, nor focusing only on "leveraging your strengths," because even the things we do well can become weaknesses in the long term if overdone.
In the spirit of this blog topic on feedback, how’d I do? Be well…
1. Nowack, K. (2009). Leveraging multirater feedback to facilitate successful behavioral change. Consulting Psychology Journal: Practice and Research, 61, 280-297.
2. Woehr, D. J., Sheehan, M. K., & Bennett, W. J. (2005). Assessing measurement equivalence across rating sources: A multi-trait approach. Journal of Applied Psychology, 90, 592-600.
3. Nowack, K. (1992). Self-assessment and rater-assessment as a dimension of management development. Human Resource Development Quarterly, 3, 141-155.
4. Nowack, K. (2002). Does 360-degree feedback negatively affect company performance? Feedback varies with your point of view. HR Magazine, 47, 54.
5. Greguras, G. J., & Robie, C. (1998). A new look at within-source interrater reliability of 360-degree feedback ratings. Journal of Applied Psychology, 83, 960-968.
Felix Global Corp. partners with Envisia Learning for its validated online assessments, a powerful and easy-to-use suite of tools for consultants and in-house teams alike.
Kenneth Nowack, Ph.D., is a licensed psychologist (PSY13758), President & Chief Research Officer/Co-Founder of Envisia Learning, a member of the Consortium for Research on Emotional Intelligence in Organizations, and a guest lecturer at the UCLA Anderson School of Management. Ken also serves as the Associate Editor of Consulting Psychology Journal: Practice and Research.