Identifying and Countering Bias during Performance Reviews: A Practical Guide for Engineering Managers
How managers' biases can affect annual performance reviews & what to do about it.
Welcome to the Captain’s Codebook newsletter. I'm Ilija Eftimov, an engineering manager at Stripe. Each week (or so) I write a newsletter article on tech industry careers, practical leadership advice and management how-tos.
January is usually the month when most companies go through their annual performance reviews. This year, I went through the process for the first time as a manager. As an individual contributor, I had a different image of the process. My realization is that annual performance reviews are not a perfect process, but they work.
One thing that blindsided me during this process was cognitive biases. I knew humans were flawed. I knew we had biases - I remember the first time I learned about them during interview training. But the wide-ranging effects of biases on annual performance reviews surprised me. As a new manager, I learned that giving any one signal disproportionate weight relative to the others can completely skew the review. Skewed reviews not only fail to do justice to the person under review but can also change the course of that person's career, usually for the worse.
Understanding biases
Imagine the human brain as a vast forest filled with countless trees. Initially, the forest is dense and uncharted. As we grow and experience the world, we walk through the forest, creating pathways. These pathways represent our thoughts and decision-making processes. Just like in a natural forest, there are a few ways to create these paths:
The first way to create paths is through familiarity and repetition. The more we walk down a particular path, the more pronounced it becomes. Similarly, the more we are exposed to a specific idea or way of thinking, the more ingrained it becomes in our brains. Repeatedly thinking the same way is like repeatedly walking the same route. Eventually, the road becomes a worn-out path that's easy to walk through.
Another way is by creating shortcuts for efficiency. Just as we might create shortcuts in the forest to reach our destination faster, our brains develop cognitive shortcuts (a.k.a. heuristics) to make decisions quicker. Shortcuts are helpful and efficient. But they can also lead us down the wrong path, arriving at the wrong destination.
The third way is through the influence of our environment. Our surroundings also influence the paths we choose in the forest. If we follow others' paths, we might walk the same routes as they do. In other words, we're adopting other people's biases. Sound familiar? It's how our society, culture, and families shape our thinking.
Another way is through avoidance of discomfort. In the forest, we avoid paths that seem uncomfortable or dangerous. Similarly, in thinking, people often avoid information or ideas that challenge their existing beliefs and cause discomfort.
I am not a cognitive or behavioral psychologist, but this mental model is sufficient for an engineering manager. The point is to understand how biases form in our thinking, not to get a PhD in the topic.
Effects of biases on performance reviews
Innovation requires fast but sound decision-making, especially in fast-paced companies with bold missions to change the world. At the same time, managers are accountable for their teams but have little time to make perfect decisions. Frequently, good enough is perfect. In such fast-paced situations, biases can creep in. As we mentioned before, biases are shortcuts for thinking, or heuristics. However, they can be ineffective, as the same heuristic cannot be applied in every situation. Biases can lead to inaccurate or ineffective decisions, negatively impacting the team's performance and overall organizational success.
At the core, biases lead managers to make mediocre decisions. I know that every individual in my team needs a different approach to management. So, using shortcuts, even subconsciously (biases), will surely lead me to suboptimal decisions. For example, imagine an engineer asking their manager how to broaden their influence in the organization. The manager has seen other engineers using 'brown bag' sessions to present and influence - so that's what they suggest to the engineer. The problem is that the individual might not have (or want to have) the presentation skills. Their manager should know that. Such suggestions can lower employee morale and make them feel unfairly treated. In their head, the engineer will think, "Why is my manager suggesting presenting in front of the team, even though they know I hate presenting?!" and "If presenting to groups is the best way to develop my career here - maybe I need to find a new company."
In the context of annual performance reviews, biases affect fair evaluation. As a manager, I have observed how biases have skewed my perception of an employee's performance. For example, imagine an individual performing very strongly in the first half of the year but ending the second half on a lower note. A manager who doesn't keep their biases in check might focus on the individual's recent performance, overlooking the employee's work over the entire year. Over-indexing on recent performance is just one example of how managers can skew their views if they're not keeping tabs on their biases.
An unjust evaluation can slow down an individual's career. For instance, that same individual who ended the year on a lower note might get discouraged from trying harder to contribute, knowing that their manager ignored their achievements from the beginning of the year. It will not only disappoint them, but it's not uncommon for individuals to leave their teams in such situations.
Common biases to be aware of
Here's a list of eight biases I would encourage any engineering manager to be aware of.
Recency bias
The first is recency bias. Recency bias occurs when a manager gives more weight to an individual's recent achievements (or failures), which skews their view of the individual's long-term performance. In such cases, instead of taking a holistic view of the individual's performance over the past year, the manager over-indexes on the work done in Q4. The manager might overlook an essential contribution the engineer made earlier in the year, which can demotivate the engineer.
For example, imagine a team whose system has excellent availability and latency during the heavy shopping week around Thanksgiving in the US (a.k.a. Black Friday/Cyber Monday). Their manager, who falls prey to recency bias, might over-index on that achievement, ignoring or downplaying the poor operational posture that the team had in prior quarters.
Primacy bias
Another bias in the same family is primacy bias. Primacy bias is effectively the reverse of recency bias - the tendency to favor earlier information when making decisions or forming impressions. These biases belong to a broader group called 'order effects'. A manager with primacy bias might ignore or downplay an engineer's growth in later quarters and over-index on the engineer's difficulties in earlier quarters. Despite the growth and strong performance for most of the year, the manager still rates the engineer lower than might be warranted.
Stereotype bias
Next is stereotype bias. Stereotype bias manifests as forming opinions or making decisions about an individual based on the group they belong to rather than on their actual performance. The bias is often rooted in oversimplified generalizations about groups based on race, gender, age, nationality, profession, etc.
For example, an engineering manager of a team with members of different generations, influenced by age-related stereotype bias, might assume that the younger engineers are more adept with newer technologies and innovative in their approaches during the annual performance evaluations. Conversely, they may perceive the older engineers as reliable but not as up-to-date with the latest tech trends or as innovative. As a result, in their evaluations, the manager might disproportionately highlight and reward the contributions of younger team members when it comes to innovation and adoption of new technologies while undervaluing similar accomplishments by older team members.
Similarity bias
Another bias is similarity bias. Similarity bias manifests as preferring team members similar to the manager regarding background, experiences, or interests. For example, a manager with expertise in a particular tech stack might align well with engineers heavily invested in that stack. The manager enjoys working with this technology and has a background in it, so team members who strongly prefer and excel in this particular technology get better performance ratings. Other team members who are equally skilled and prefer a different tech stack are rated less favorably.
Proximity bias
With the proliferation of remote and hybrid working models, proximity bias has become very relevant. Proximity bias is the tendency to think that projects, events, or people that come to mind more quickly are more representative of reality. In other words, if particular team members come to mind more easily for the manager, they can get better performance designations at the annual performance review.
For example, imagine a manager spending more time with on-site team members than remote team members. The manager might fall into the trap of proximity bias and favor the colocated team members more than the remote team members. Proximity bias occurs because the manager has more face-to-face interactions with the on-site team members, making their contributions more visible and top-of-mind. The remote workers, despite their equal or superior performance, might receive less recognition or lower performance ratings simply because they are not physically present and, thus, less 'visible' to the manager.
Proximity bias can also manifest in fully colocated teams. It's not always about geographical proximity, but geographical closeness can exacerbate the bias.
Ingroup bias
Ingroup bias is the tendency to favor individuals within a particular group over those outside the group. Another way to think about it is ingroup favoritism. Various factors can be used for grouping: race, gender, nationality, profession, organizational affiliation, hobbies, or even more arbitrary distinctions.
For example, a manager who has graduated from a prestigious university might look more favorably at team members who have graduated from their alma mater. Conversely, team members from other schools or self-taught software developers might be looked down upon or treated differently. During performance reviews, a manager with unchecked ingroup bias might give better performance ratings to reports who have attended their alma mater.
In effect, this alienates the other team members. The manager's bias pushes the "outsiders" out of the team, or they must work harder to overcome their manager's bias.
Confirmation bias
Confirmation bias is the phenomenon where individuals favor information confirming their pre-existing beliefs or hypotheses. At the same time, this bias gives disproportionately less consideration to alternative possibilities or contradictory information. When left unchecked, confirmation bias can damage a manager's judgment - it affects how they gather, interpret, and recall information. Simply put, managers develop selective memory.
For example, imagine a manager who believes great software engineers must have excellent Git hygiene (e.g., clean and atomic commits). This belief is a preconceived notion the manager holds about what constitutes a good software engineer.
During the year, one engineer on the team, despite having messier commits, produces clean, performant, and maintainable code and has a substantial business impact. Another colleague produces great commit messages but has less business impact than the first engineer.
During annual reviews, their manager, due to confirmation bias, equates the second engineer's good Git hygiene with better performance, giving them a better performance rating than the first engineer. The manager undervalues the first engineer's achievements because they perceive them as a "worse" engineer, purely based on their Git hygiene.
Organizational bias
Organizational bias is the tendency to overemphasize skills, behaviors, or characteristics valued within an organization, even when they are not required to succeed in a role. Unlike individual biases, which are personal, organizational biases embed themselves in the structure and culture of the organization. Because the organization is biased in a particular way, individuals perpetuate that bias, perceiving it as a way to get ahead or be successful in the organization.
A typical example, exacerbated in the tech industry, is when a company is biased toward long work hours. A manager who has been with the company long enough to absorb (and perpetuate) the bias will perceive engineers who spend extra hours at work as more dedicated and hardworking.
During annual performance reviews, biased managers equate longer hours at work with higher performance. So, they give better performance ratings to engineers who spend extra hours at work. Conversely, said managers hand out lower ratings to engineers with standard work schedules despite them being equally (or more) efficient.
Practical tips, tactics and bias blockers
Being aware of my biases is excellent, but humans are fallible. In the heat of the moment or when under pressure, we are quick to think we have our biases in check. We don't. To truly manage our biases, we need a structured approach. We need processes and frameworks to act as "bias blockers" or "bias interrupters".
Here are some ideas I have seen in practice or picked up from the Internet.
Build and maintain a brag doc
I like to encourage my team members to document their work regularly. Ships, challenges, learnings, accolades, feedback - anything is worth documenting as long as it shows the engineer's growth in their role. Companies might use internal platforms where they store the work log. While it's nice to have this centralized system, it adds overhead.
The most straightforward approach is the so-called brag doc. A brag doc is just a plain document with bullets and dates next to it, showcasing the engineer's career milestones. It's as easy as opening a new Google doc, writing down the date, and jotting down a few bullets on the most recent accomplishments.
Here's a straightforward example of what a brag doc can look like:

| Date of entry | What I did |
|---|---|
| 2023-12-15 | Led the remediation of incident Lorem Ipsum, effectively identifying the root cause and mitigating the incident with a forward fix. Handed over the ownership to the Payments team for completion. |
| 2023-12-04 | Shipped a new email notification for potentially dormant users. |
| 2023-11-20 | I am in the company's top 5% of all interviewers by the number of interviews conducted in Q3 of 2023! |
It should be the responsibility of both the manager and the engineer to keep the document current. While the engineer is supposed to add content to the brag doc, I also like to hold the engineer accountable by occasionally checking the doc for new content. The brag doc helps me help the engineer during performance reviews, where it's effortless for us to look at their achievements over time. It helps create a comprehensive record of each engineer's contributions and struggles, making it easier to recall their performance objectively during reviews. This win-win aspect of brag docs makes them powerful yet simple to implement.
Protip: I encourage my engineers to set periodic reminders to update their brag doc (e.g., fortnightly). Doing it at performance review season is an order of magnitude more difficult than doing it continuously. If each engineer can spend 20 minutes updating their brag doc every two weeks - we're both set up for success!
Build a rotation of roles & responsibilities
In every engineering team, there's work that goes beyond the mission and day-to-day operations of the team. Administrative tasks, like organizing team activities. Fielding questions from sales and customer support. Organizing dashboards. Cleaning up old A/B experiments. On-call. The list goes on. It's not all engineering work, but the engineering team would cease functioning without it.
While some engineers have a knack for some of these roles, I want to avoid trapping any engineer into an "extracurricular" role. All engineers on the team should share the burden of these additional responsibilities, so for that reason, a fixed rotation works best. A rotation of shared duties and roles allows each member to showcase their strengths and work on their weaknesses in different areas. In the context of performance reviews, it provides a more rounded view of their abilities and reduces bias based on specific roles or tasks.
With a rotation, there isn't only a single engineer who holds the pager - it's all of the team. It's not only one engineer cleaning up dead code - it's the whole team. It's not only one person organizing off-sites. Or answering questions from sales. Or helping support agents. You get the idea.
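To make the rotation concrete, here's a minimal sketch of a fixed round-robin schedule. The engineers, duties, and weekly cadence are hypothetical placeholders; a real team would swap in its own lists.

```python
# Minimal sketch of a fixed duty rotation; names, duties, and cadence are hypothetical.
engineers = ["Ana", "Bo", "Chen", "Dara"]
duties = ["on-call", "support questions", "dashboard upkeep", "experiment cleanup"]

def rotation(engineers, duties, weeks):
    """Round-robin: each week, shift which engineer owns each duty by one position."""
    schedule = []
    for week in range(weeks):
        assignments = {
            duty: engineers[(week + i) % len(engineers)]
            for i, duty in enumerate(duties)
        }
        schedule.append((week + 1, assignments))
    return schedule

for week, assignments in rotation(engineers, duties, weeks=4):
    print(f"Week {week}: {assignments}")
```

A fixed, mechanical schedule like this also doubles as a record: at review time, it shows who carried which responsibility and when, instead of relying on memory.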
Gather comprehensive feedback
To avoid biased signals in performance reviews, collecting feedback from various sources is essential. The familiar sources of feedback are great: self-reviews, peer reviews, and ad-hoc feedback from cross-team peers. But I like to get more creative. It's essential to think about who the engineers work with. Recruiters? Check. Technical enablement teams? Check. Legal? Check. Support? Check. Sales? Account executives? HR?
While most of these folks are not engineering-related, their perspective on an engineer's contribution will be insightful. Feedback from cross-functional partners is a high-leverage tool to escape the engineering bubble. I love bringing their exciting perspectives into the performance review of each engineer. This multi-source feedback helps provide a more balanced view of each individual's performance. It also mitigates the risk of my personal biases influencing the evaluation.
As a manager, I am aware that my feedback is also crucial in the mix. So, I hold regular team meetings with a structured agenda. These meetings ensure consistent communication about project progress, individual contributions, and areas of improvement. Structured meetings also help me track performance consistently and fairly for all team members.
Regular one-on-one meetings with my team members are vital to observing the individual's performance. It helps understand their challenges, progress, and achievements in near-real time. Through regular interactions and updates, I reduce the bias of the annual review. I don't base it on memory or recent events but rather a summary of the entire year's performance.
Keep the review specific
I avoid speculative or generic feedback. I perceive it as a cop-out. I understand speculative feedback as feedback based on assumptions, predictions, or possibilities instead of concrete evidence. It's rooted in hypothesizing about future outcomes or behaviors, not in facts. An example of speculative feedback is when a manager says, "If you refactored the code as you suggested, our latency might've halved, but there's also a risk that we might've created an incident."
On the other hand, generic feedback is broad, non-specific, and can apply to many situations. It lacks personalized insights or specific references to relevant situations that would give the engineer under review actionable feedback. For example, generic feedback would look like "Your RFC looks good; keep up the good work." It's generic because it doesn't address specific aspects of the RFC or provide actionable insights.
I push myself away from unspecific feedback for annual performance reviews. That way, I force myself to make my feedback signal-packed and actionable. The STAR methodology is a simple framework for writing feedback: Situation, Task, Action, Result.
Here's an example of STAR-shaped feedback that I try to emulate:
In the third quarter, our team faced a significant challenge with our database system, which was struggling to handle the increased load due to our growing customer base. (Situation)
You were assigned to lead the project to optimize our database performance and ensure it could scale effectively to meet our future needs. (Task)
You conducted a thorough analysis of the existing database structure and identified several key areas for improvement. You then developed a plan to introduce indices and implement caching strategies. You also collaborated with the team to integrate these changes without disrupting our live services. (Action)
Post-implementation, we've seen a 40% improvement in database performance and a marked reduction in downtime. Your innovative approach not only resolved the immediate issue but has also positioned us well for future scaling. Furthermore, your willingness to collaborate and communicate effectively with the team ensured a smooth implementation process. (Result)
In addition, giving more than one piece of evidence for the whole evaluation period is a great practice. The number of examples can depend on the number and size of the projects the engineer worked on, but I like writing at least two, maybe three, pieces of evidence to back up my overall assessment. All of them should follow the STAR method outlined above.
Lastly, I eliminate vague language. "Awesome," "culture fit," "strong presence," "vocal," or other vague concepts have no place in a performance review.
Be grounded in data
There's no better way to be specific than using data. I like to always refer to previous performance information I have available for the engineer. Companies use centralized platforms to track the performance of their employees, so I look there for previous feedback and performance evaluations.
Was there any feedback I gave in the previous cycle? It's good to acknowledge it. But also, it's terrific to reflect if the engineer worked on that feedback. If they've made strides on it - I like to acknowledge and praise them. If not, I would like to highlight it again. Continuity in giving feedback is essential. No feedback should be a surprise at annual reviews.
Remember the brag docs? I like to put them to use when I need data - they are full of it, ready to be used. I like to tap into each engineer's brag doc and pull out examples or use them as inspiration, especially at the start of the review cycle when I have difficulty getting the reviews started. It's much easier to start with data than with gut feelings about the engineer.
I like to use stats about the engineer's contributions: completed projects, participation in incidents, fixed tickets, closed epics, replies to customer support, number of interviews conducted, etc. While these stats are about as relevant as lines of code written (which is to say, not very!), they are still signals about the engineer's work. I like to use these engineering stats to support my performance evaluations. They reduce reliance on personal opinions or preferences and minimize the impact of biases.
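As an illustration, here's a minimal sketch of how such stats could be tallied from a hypothetical CSV export of contributions. The file name and column names ("engineer", "type") are assumptions for the example, not the output of any real tool.

```python
import csv
from collections import Counter

def contribution_stats(path):
    """Tally (engineer, contribution type) pairs from a CSV export with
    'engineer' and 'type' columns, e.g. ticket, incident, interview."""
    counts = Counter()
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            counts[(row["engineer"], row["type"])] += 1
    return counts

# Example usage with a hypothetical export covering the review period.
for (engineer, kind), n in sorted(contribution_stats("contributions_2023.csv").items()):
    print(f"{engineer}: {n} {kind}(s)")
```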
Keep expectations leveled
If your company has an engineering ladder - use it. It's what allows all engineering managers across all departments to anchor to a set of expectations for each engineering level. Being unable to base your expectations on a single source of truth will turn the performance reviews into chaos.
If you're part of an organization that has yet to define a set of expectations per level, that's the first thing you need to align on. You can find a bunch of examples on https://progression.fyi/. You might want to take Amazon's career progression ladder and implement it in your organization - I suggest you don't. Get inspired by it, but it's crucial to develop a ladder that matches your business's needs.
Each ladder can contain specific project goals, teamwork metrics, problem-solving abilities, and deadline adherence. Documenting each success criterion per engineering level ensures that all evaluations are based on the same standards, reducing the influence of subjective judgments.
The bottom line is that I believe in tying each item in the performance evaluations to a competency on the engineering ladder. If it's not on the ladder - it's irrelevant and should not be evaluated against.
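As a sketch of that principle, the snippet below assumes a hypothetical ladder definition and flags any review item whose competency isn't part of the level's expectations. The levels, competencies, and items are all made up for illustration.

```python
# Hypothetical ladder: each level maps to the competencies it expects.
ladder = {
    "L4": {"technical execution", "project ownership", "collaboration"},
    "L5": {"technical execution", "project ownership", "collaboration", "technical direction"},
}

# Hypothetical review items, each tied to a competency by the manager.
review_items = [
    ("Led the Q3 database optimization project", "project ownership"),
    ("Presented at three brown-bag sessions", "stage presence"),  # not on the ladder
]

def unmapped_items(review_items, level, ladder):
    """Return review items whose competency isn't part of the level's expectations."""
    expected = ladder[level]
    return [item for item, competency in review_items if competency not in expected]

# Anything returned here should be dropped or re-anchored to a real competency.
print(unmapped_items(review_items, "L4", ladder))
```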
Talk with my manager and mentor(s)
Once I write the performance reviews - I like to share them with my manager and mentor(s). Sharing my reviews with others does wonders for removing biases - an "outsider's" view, who does not work day-to-day with the team, will provide a different perspective. Sharing helps identify improvement areas and promote a more objective evaluation process.
With my manager, I like to talk about whether, in their view, I am up to par with my peer group. My department will calibrate my engineers' reviews against my EM peer group, so I like to be well-prepared. I present each review similarly to how I would present it at the calibration sessions. Then, I ask for feedback: is there something I could be doing or phrasing better?
With my mentor(s), the overview is not that relevant, as my mentors are usually outside my org and do not know my team well. So, I like to ask specific questions. How do they fight biases? What signals of bias do they commonly see? Do they see me over-indexing on particular aspects of someone's performance? And so on.
Calibrate
Calibration sessions are meetings where managers from within the same organization come together to ensure all performance reviews within their organization are consistent, fair, and unbiased. The goal is to eliminate the chance of any engineer within the organization getting preferential treatment over the rest, ensuring a leveled playing field for everyone.
At calibration sessions, it's straightforward to spot biases. A group of managers scrutinizes each review package, so any cracks in it will quickly surface. As a new manager, I found the prospect of having my writeups reviewed by so many of my peers daunting. But after going through the process once, I think it's one of the best ways to ensure a fair, balanced, and unbiased review. It may bruise my ego, but it is a net positive for the organization.
And that’s it for this week! If you liked this, consider doing any of these:
1) ❤️ Share it — Captain’s Codebook lives thanks to word of mouth. Share the article with your team or with someone to whom it might be useful!
2) ✉️ Subscribe — if you aren’t already, consider becoming a subscriber. Seeing folks read the newsletter energizes me and gives me ideas to write.
3) 🧠 Let’s chat — I am always on the lookout for questions to answer, to help out with challenges, or topic ideas to cover. Hit me up on LinkedIn, Twitter or send me an email to blog@ieftimov.com.
Until next time,
Ilija