"When you have alignment, cherish it."— Ray Dalio

Cost of Consensus

2019-08-24

TL;DR

You have two knobs at your disposal when managing the cost of alignment:

How effective your team is at getting aligned.
The number of people you are required to align.

Though it is an uncomfortable reality, I don't think any team's alignment effectiveness is able to overcome the sheer volume of connections as a team scales. Consequently, I'd like to encourage more thinking & discussion about how to reduce the number of people required to be aligned in the first place.

When you need to make a decision, limit its scope and prove it out in a safe way. Don't pay the cost of consensus until you have to. Weigh the value of consensus against the probability of reaching it. When you do this, a whole bunch of things that make you itch for uniformity show their true colors as minutiae, and you find tolerance is a viable strategy. Build consensus for the things that truly matter and cherish alignment when you have it.

The Cost of Consensus

Authentic alignment is precious. When a group of people has agreed on a given course of action, their collaboration produces something beautiful that is more than just the sum of its parts. Beyond that, each member of the team will be engaged because they understand the who, why, what, where and when of each item at hand to accomplish their shared purpose. Everyone has felt what it is like to be on a team that is truly aligned, so it is no wonder organizations everywhere pursue this kind of alignment.

Alignment Is Expensive

The cost of alignment increases with the number of agents that need to be aligned. On a small team, alignment is so cheap that it is taken for granted. For example, say Bob is on a team with Susan. Susan & Bob have a conversation about their next steps and they are effectively aligned as a side effect of their planning.

Hey Susan, I'm thinking we should use React for your next project, what do you think? - Bob

Hey Bob, sounds great! - Susan

Susan & Bob align, effectively for free. But throw in just one more person, Scuba Steve.

Hey all! I'm thinking we should use React for this next project, what do you think? - Bob

I was talking to Steve and he wanted to use Vue. - Susan

Steve waits a few days because he is on his scuba trip & he is kind of nervous to confront Bob about his desire to use Vue.

Yeah, I'd rather try out Vue, you ok with that? - Scuba for Life, Steve

No, I'm not OK with that. - Bob

Sounds like we need to meet. - Susan

This makes sense: each person needs to be aligned with every other person, and consequently the cost of alignment goes up roughly as:

connections = (number of people * (number of people - 1) / 2)

cost of alignment = connections * alignment effectiveness

This is the number of connections in a group multiplied by a score of how effective that group is at achieving alignment.
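To make the growth concrete, here is a minimal sketch of the two formulas above. The `costPerConnection` parameter is my own hypothetical stand-in for the alignment score, read as the cost paid per connection (so a more effective team has a smaller value); the article does not pin down the exact scale, so treat the numbers as illustrative only.

```javascript
// Number of pairwise connections in a group of `people`.
function connections(people) {
  return (people * (people - 1)) / 2;
}

// `costPerConnection` is a hypothetical reading of the alignment score:
// the cost paid for each connection (lower means a more effective team).
function costOfAlignment(people, costPerConnection) {
  return connections(people) * costPerConnection;
}

for (const people of [2, 3, 8, 50, 150]) {
  console.log(people, connections(people), costOfAlignment(people, 0.5));
}
// connections: 2 -> 1, 3 -> 3, 8 -> 28, 50 -> 1225, 150 -> 11175
```

Going from 2 to 3 people triples the connections, which is roughly what happened when Scuba Steve joined Bob & Susan.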

What exactly is the cost? The cost is a rough proxy for the amount of time spent doing "meta work" managing the connections before doing the "work" itself. The cost is the time spent creating presentations to convince everyone involved, the energy spent addressing questions, adapting to feedback, compromising & convincing until everyone is aligned. It is the time that you aren't getting user feedback because you're debating in the realm of imagination. The "cost of alignment" refers to the very real financial & opportunity cost that causes great people to look elsewhere. It makes your business ripe for disruption. Do you find yourself in a growing business, feeling like you get less done even though you have more people? Or shocked at how rapidly your side projects move along while your team is still trying to figure out which framework to use? That is the cost of alignment.

What Is Alignment Effectiveness

Alignment Effectiveness is simply how effective your team is at getting aligned. Are people radically candid with each other? How clear is the goal of the team in the first place? Are people safe & comfortable with healthy disagreement? How willing are people to disagree and commit? Are the incentives of the organization such that playing as a team is more rewarded than winning as a hero all-star contributor?

What about communication mechanisms? Are conversations had in the open so anyone can follow up and see why a decision is made? Are meetings recorded and run effectively in their own right?

Certainly, improving your team's ability to reach consensus is one angle of driving down the cost of consensus.

[Charts: Effective Alignment, Poor Alignment]

In the above examples, the team with Effective Alignment can scale to a reasonable number of people and keep the Cost of Alignment at bay. However, the team with Poor Alignment immediately struggles as more people are added! Who knows, maybe their goals aren't clear? Maybe they need to invest more time in nurturing their connections? Maybe they hired a diva who refuses to collaborate? Regardless, I just wish I could hug each one of them because that is a miserable place to be.

Worse yet, as each team tries to scale, the number of people creates an insurmountable cost, even with the Alignment Effectiveness from the previous charts held constant:

[Charts: Effective Alignment, Poor Alignment]

Oh no! Even the team with great alignment skills has to pay a high cost to keep their team aligned as they scale. No matter how effective each individual is at reaching alignment there is a growing cost in the limitations of human communication & the number of people attempting to communicate.

Eventually, the number of people passes our biological limit for coordinating effectively as a cohesive group (150 is the commonly used value), and subgroups are forced to emerge. The number at which this occurs is referred to as Dunbar's number. Dunbar theorized:

"this limit is a direct function of relative neocortex size, and that this in turn limits group size [...] the limit imposed by neocortical processing capacity is simply on the number of individuals with whom a stable inter-personal relationship can be maintained"

So, what is a growing company supposed to do? Just stop growing?! No, of course not. I shudder to think of the wonderful human accomplishments that never would have occurred if each organization decided to stop growing at this point. I think it is natural to reach for processes & tools to improve Alignment Effectiveness. I consider Agile, Kanban, Scrum & other planning methodologies to be tools people use to try to improve this very measure.

Unfortunately, these tools can come at the cost of autonomy and mastery and yield a shallow sense of alignment where many people disengage. Furthermore, the overhead of conforming to the process is at risk of exceeding the cost of alignment. Often the overhead is simply additional cost (process for process's sake) and doesn't improve alignment at all. I think encouraging teams to use these tools should they fit, and creating an "interface" for reporting to the organization, is a reasonable middle ground.

Some of my favorite tools for improving alignment:

RFC Processes
Communication Interfaces like Changelogs, Slack Channel Types & Status Updates

I'd like to propose some additional ways of thinking about this problem beyond improving a team's Alignment Effectiveness.

Small Teams

Given that we are not going to be talking about improving the Alignment Effectiveness variable, that leaves us with one additional variable to work with: number of people. What tools do we have at our disposal to keep the cost of consensus down by reducing the number of people that need to be aligned?

The answer here is obvious in principle, but extremely difficult in practice. Have small teams! Jeff Bezos famously referred to this as the "two pizza rule":

Bezos believes that no matter how large your company gets, individual teams shouldn’t be larger than what two pizzas can feed.

When I read this, I knew I could never work at Amazon because I eat an entire pizza by myself so I'd languish in a life of perpetual isolation.

[Charts: Effective Alignment, Poor Alignment]

By limiting team size to 8 people, the cost of alignment is much more manageable. This sounds great, but of course restructuring your organization into small teams doesn't mean that those teams suddenly don't need to have connections between them! The technical architecture has to facilitate this kind of strategy.
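A rough back-of-the-envelope sketch shows why this helps. It assumes a deliberately simple, hypothetical model: everyone aligns with everyone inside their own team, and each pair of teams shares exactly one connection (say, an agreed interface). Real organizations are messier, so read the numbers as an illustration rather than a measurement.

```javascript
// Number of pairwise connections in a group of `people`.
function connections(people) {
  return (people * (people - 1)) / 2;
}

// Hypothetical model: full alignment inside each team, plus a single
// connection between every pair of teams.
function connectionsWithTeams(totalPeople, teamSize) {
  const fullTeams = Math.floor(totalPeople / teamSize);
  const remainder = totalPeople % teamSize; // people in a last, smaller team
  const teamCount = fullTeams + (remainder > 0 ? 1 : 0);
  const withinTeams =
    fullTeams * connections(teamSize) + connections(remainder);
  const betweenTeams = connections(teamCount);
  return withinTeams + betweenTeams;
}

console.log(connections(48)); // one group of 48 people: 1128
console.log(connectionsWithTeams(48, 8)); // six teams of 8: 183
```

Under these assumptions, the same 48 people maintain 183 connections instead of 1128, but only if the links between teams really can stay that narrow.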

There is an adage referred to as Conway's law that states:

organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations.

If this is true, if systems are merely a reflection of the communication structures that create them, is the relationship bidirectional? Meaning, can a system backpropagate & change the communication structures of the organization that creates it? I think so! Let's explore the tip of the iceberg of a few ideas to get your gears turning on how to scale your organization with technology.

Event Sourcing

An event-sourced architecture allows subsystems to communicate over a sequential, replayable log of events. Subsystems communicate their writes by putting an event in the log. All systems derive and manage their internal state based on their processing of the log. With an architecture like this, each team controls its infrastructure, databases, caches, interfaces & deployments.

Subsystems should not talk to each other for state. Each one should derive a view of the state it needs for its app from the log, creating what is effectively a "copy" of the state. While this costs you in terms of (eventual) consistency, it buys you resilience and scalability.
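As a minimal sketch of that idea, here is an in-memory log and one subsystem deriving its own view from it. Everything here (`createEventLog`, `deriveOrderTotals`, the event shapes) is hypothetical and chosen only for illustration; a real system would likely sit on a durable log rather than an array.

```javascript
// An append-only, replayable log. All names here are hypothetical.
function createEventLog() {
  const events = [];
  return {
    append(event) {
      events.push(event);
    },
    // Return every event from `offset` onward, in order.
    readFrom(offset = 0) {
      return events.slice(offset);
    },
  };
}

// A subsystem keeps its own copy of the state it cares about,
// derived purely by folding over the log and ignoring unrelated events.
function deriveOrderTotals(log) {
  const totals = {};
  for (const event of log.readFrom(0)) {
    if (event.type === "item-added") {
      totals[event.orderId] = (totals[event.orderId] || 0) + event.price;
    }
  }
  return totals;
}

const log = createEventLog();
log.append({ type: "item-added", orderId: "a", price: 5 });
log.append({ type: "item-added", orderId: "a", price: 7 });
log.append({ type: "order-shipped", orderId: "a" });
console.log(deriveOrderTotals(log)); // { a: 12 }
```

Note that the subsystem never asks another subsystem for the totals. It only reads the log, which is what keeps the teams decoupled.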

There is then a "coordinator" that stitches all of the sub-systems together in a loosely coupled way. On the backend, that looks like routing/load balancing:

my-fancy-app.com/some-sub-app => <service-owned-by-some-sub-app-team>
my-fancy-app.com/another-sub-app => <service-owned-by-another-sub-app-team>

And on the front-end, there might be a lightweight coordinator like this:

// Router is a hypothetical coordinator API, shown only to illustrate the idea.
Router([
  {
    path: "/some-app",
    // Lazily load the sub-app, then hand it its element, the log & dispatch.
    init: (el, eventLog, dispatch) =>
      import("https://my-fancy-app.com/some-sub-app").then((SomeApp) =>
        SomeApp.init(el, eventLog, dispatch)
      ),
  },
  {
    path: "/another-app",
    init: (el, eventLog, dispatch) =>
      import("https://my-fancy-app.com/another-sub-app").then((AnotherSubApp) =>
        AnotherSubApp.init(el, eventLog, dispatch)
      ),
  },
]).run();

In this hypothetical example, SomeApp & AnotherSubApp are responsible for communicating writes via dispatch and deriving state from eventLog. They are also given an el to render into, and they would need to return a function that can be used to "tear them down". Both apps, in this case, have a clearly defined interface; how they are implemented is owned by some-sub-app-team & another-sub-app-team. Those teams each interview customers, deploy updates & make technical decisions, all autonomously within their "domain".
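Here is one way a sub-app might honor that contract, sketched under some assumptions of my own: `eventLog.subscribe` is assumed to replay past events, deliver new ones, and return an unsubscribe function, and `createLog` is a tiny in-memory stand-in for whatever the coordinator actually provides. A plain object stands in for the DOM element so the sketch runs anywhere.

```javascript
// A hypothetical sub-app honoring the init(el, eventLog, dispatch) contract.
const SomeApp = {
  init(el, eventLog, dispatch) {
    let count = 0; // state derived only from the log
    const unsubscribe = eventLog.subscribe((event) => {
      if (event.type === "todo-added") count += 1;
      el.textContent = `${count} todos`;
    });
    // Writes go through dispatch, never directly into another sub-app.
    dispatch({ type: "todo-added", title: "write docs" });
    // Teardown function for the coordinator to call on navigation.
    return () => {
      unsubscribe();
      el.textContent = "";
    };
  },
};

// In-memory stand-in for the coordinator's log, for illustration only.
function createLog() {
  const events = [];
  const listeners = new Set();
  return {
    dispatch(event) {
      events.push(event);
      listeners.forEach((listener) => listener(event));
    },
    subscribe(listener) {
      events.forEach((event) => listener(event)); // replay history first
      listeners.add(listener);
      return () => listeners.delete(listener);
    },
  };
}

const log = createLog();
const el = { textContent: "" }; // stands in for a DOM element
const teardown = SomeApp.init(el, log, log.dispatch);
console.log(el.textContent); // "1 todos"
teardown();
```

The coordinator only needs to know about `init` and the returned teardown function. Everything inside `init` is the sub-app team's own business.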

The "coordinator" implements the "shell" of the UI, such as the navigation. To avoid too much redundant work, sub-systems can publish embedded interfaces for each other to use. The team that owns the "coordinator" can also provide a Design System for sub-teams to use. Unfortunately, though, these design systems create a wide "link" for consensus cost.

This is a completely didactic & hypothetical interface. What an App looks like in your system needs to be clearly defined for your team. And of course, whether an architecture like this is a good idea depends a lot on the business domain you are serving.

Benefits

Teams are very encapsulated & therefore very autonomous.
The system has some great resiliency because subsystems can keep operating when other systems have an outage. Systems can self-heal by catching up with the log. Since there are copies of the state stored across various systems your data has some great redundancy.
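The "catching up" behavior can be sketched with a consumer that remembers how far into the log it has read. The names (`createConsumer`, `catchUp`) are hypothetical, and a plain array stands in for the shared log; a real consumer would also persist its offset somewhere durable.

```javascript
// Hypothetical consumer that remembers how far into the log it has read.
function createConsumer(log) {
  let offset = 0; // index of the next unread event
  const seen = [];
  return {
    // Process everything written since the last call; return how many events.
    catchUp() {
      const unread = log.slice(offset);
      for (const event of unread) seen.push(event.id);
      offset += unread.length;
      return unread.length;
    },
    seen,
  };
}

const log = []; // a plain array stands in for the shared log
const search = createConsumer(log);

log.push({ id: 1 }, { id: 2 });
search.catchUp(); // processes events 1 and 2

// The search subsystem is "down" while other teams keep writing.
log.push({ id: 3 }, { id: 4 }, { id: 5 });

// On restart it resumes from its stored offset; nothing is lost or repeated.
console.log(search.catchUp()); // 3
console.log(search.seen); // [ 1, 2, 3, 4, 5 ]
```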

Trade-offs

Teams are so encapsulated that there is a great deal of redundant work. I think tools like gRPC help drive down this cost but again, the more you share, the more links you introduce back into the system, the higher the cost of consensus rises.
A large investment in infrastructure is required, however, tools like Kubernetes & Kafka are improving the landscape.
There is a performance cost. I think this is most notable on clients, where sub-systems share a runtime environment. Duplicating state on the client is just wasteful. Shipping multiple runtime frameworks ultimately makes your user pay for reducing the cost of consensus. This cost can be mitigated by aligning your teams and user personas.
Analytics are harder to get right & cross compare, you'll likely need to create an "interface" for teams on this front.
When you create good boundaries to reduce the risk of allowing teams to innovate autonomously you have implicitly put boundaries on the positive scope of the impact they'll be able to make.
I don't know of a way to mitigate negative risk exclusively without also compressing your upper end. If Julie can only contribute to this sub app, she can only impact that sub app, regardless of the direction of that impact.

I have so many architectural ideas to explore here, especially as technologies like HTTP/2, Web Workers, WASM, Portals & SharedArrayBuffers become more widely available.

A humble bow of admiration to the people at Pluralsight, it was there that I encountered many of these ideas in practice.

Better Tools

Technology Agnostic Design Systems

I hope that tooling will allow design systems to be created in a framework-agnostic way and then "compile" to whatever technology the sub-system is using. This preserves the sub-system team's autonomy in their technical choices without the overhead of maintaining their own implementation of the design system. It also reduces the design team's friction in making changes across such a distributed & encapsulated system. Heck, Designers should be able to push new versions of presentational components directly to the package registry and sub systems should be able to import & use the latest version in the technology of their choice.

Better Client Build Tools

I hope that WebAssembly continues to thrive, and that dynamic linking & garbage collection integration make it a viable compilation target for polyglot teams.

I hope HTTP/2 adoption continues and that our build tools can take advantage of this reality, requiring less compile-time awareness of various sub-systems to create performant builds.

I hope new standards like Portals & Realms will make truly encapsulated sub-systems on the client a viable reality without paying iframe overhead.

Polyglot Typed RPC Systems

I hope that tools like gRPC will reduce the cost of communicating across encapsulated sub systems by preserving type information across languages, reducing performance overhead and having consistent interoperation when communicating across network boundaries.

Containers

Containers have created a wonderful encapsulation model and I hope tools like Kubernetes will continue to thrive.

Other Ideas

Here are a few high-level ideas that can mechanistically reduce the cost of consensus and allow you to scale your organization as a set of small teams.

Mono Repos, Modules & Interfaces
Actor architectures
Abstraction as a common language
The Levels of Process, from implicit people-oriented processes to automated processes
Relying on automation to reflect team boundaries, for example, disallowing importing from particular directories without going through a particular interface. Tools like Prettier leverage automation to drive down the cost of consensus.
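As a sketch of that last idea, here is a small check that only allows cross-team imports through a team's public entry point. The directory layout (`src/<team>/` with a public `src/<team>/index.js`) and the function names are assumptions made for illustration. In practice you would more likely configure an existing lint rule, such as ESLint's `no-restricted-imports`, than write this by hand.

```javascript
// Hypothetical layout: each team owns src/<team>/ and exposes src/<team>/index.js.
function teamOf(path) {
  const match = /^src\/([^/]+)\//.exec(path);
  return match ? match[1] : null;
}

// An import is allowed when it stays inside the importer's own team
// directory, or when it targets another team's public index.js.
function isImportAllowed(importerPath, importedPath) {
  const importerTeam = teamOf(importerPath);
  const importedTeam = teamOf(importedPath);
  if (importedTeam === null || importerTeam === importedTeam) return true;
  return importedPath === `src/${importedTeam}/index.js`;
}

console.log(isImportAllowed("src/billing/cart.js", "src/billing/tax.js")); // true
console.log(isImportAllowed("src/billing/cart.js", "src/search/index.js")); // true
console.log(
  isImportAllowed("src/billing/cart.js", "src/search/internal/rank.js")
); // false
```

Once a rule like this runs in CI, nobody has to debate where the team boundary is in code review. The automation carries the consensus.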

Lastly, if you'd like to play with different values for the visualization used throughout this article, here you go:

[Interactive visualization: Number of People; Alignment Effectiveness, from 0 (terribad) to 1 (world class)]

A Note On Top Down "Alignment"

Alignment is so valuable that it is tempting for leaders to try and force alignment using top-down mandates. Given the cost of consensus is so high, I can understand this impulse but the cost of imposed alignment is much higher. People disengage, they can't bring their best self to work, they don't surface important feedback, they look out for themselves. This course of action creates a specter of alignment at best and it looks so little like authentic alignment that I don't think it belongs in this article at all.

I'd rather fight the good fight for authentic alignment than fall for any artificial versions of it.