Minimize Functional Hierarchies in Agile Software Development
TLDR: Each functional hierarchy costs you agility at the square of the total count, and there are a lot of functions in agile devops software development.
1. Overview
Over the last 20 years, the software development industry has faced two independent, similar, but sometimes conflicting revolutions. Both are fundamentally about bringing different functional expertises (“functions”) together to collaborate and iterate rapidly via tighter feedback loops:
- the Agile¹ movement embracing cross-functional teams taking an iterative approach to delivery
- the Devops movement pulling software development and operations functions together to better address operational issues by fixing the core software.
Out of this union come two questions around organizational structure:
- How to structure the organization around functions that are important to the development and operation of software, but lack the technical expertise to be part of a devops team?
- For those functions within the development and operation of software, how to structure the different software and systems engineering specialties?
Today, there is no de facto standard across companies or even within companies across functions. Steven Sinofsky has the state of the art writeup of tradeoffs for software development, but as it is based on his experience in Windows and Office, which were neither devops nor agile, not all of it carries over for those doing both. Thus, in the rest of this document I attempt to lay out a specific proposal for companies (going forward, “agile software development” companies) who seek to get the best of both worlds.
2. Organizational Structures Background
There is a large amount of literature on tradeoffs of different organizational structures of companies. One of the important distinctions is whether the structure is Divisional, Functional or Matrixed (with divisional sometimes called “Unit”). Note some literature talks about these as being absolutes at the company level, yet most companies are blended in implementation, depending on the specific function.
Most agile software development companies follow a functional structure for functions significantly removed from the creation and operation of software. That is, functions like Legal, Finance, HR, and Sales are all hierarchies that report up to the CEO. One caveat for massive companies is that some of these are broken out at the mega-divisional level (say AWS having its own Finance org), but essentially this is just treating the mega-division as a sub-company.
Caveat aside, because these functions are predictable and well regulated in what their outputs are relative to what is needed by other functions, a functional hierarchical structure for them is widely viewed as the best option. As an example, while many people may wish their company’s legal team responded to requests more rapidly, that is different from companies needing their legal organization to be more agile in iterating with other functions to solve large complex problems.
All of that is in contrast to the roles involved in developing and operating software. The functional specialties here are diverse: Product Management, Program Management, Design, Usability, Support, Operations, Data Science, Security Engineering, Software Engineering. The last itself has many sub-functions which large companies treat as specialist functions, for example: Front End Engineering, Distributed Systems Engineering, Systems Engineering, Reliability Engineering, Data Engineering, Systems Software Engineering. Across both lists, complex coordination between individuals across functions is very important for the success of a project/product, and so highly functional parallel management orgs do not work. On the flip side, a lot of cross-system, cross-team and often cross-product communication within a function is necessary as well, so pure divisional structures do not work either.
Appearing in the 70’s and 80’s, one solution proposed for this in technology companies is matrixed management, where functional specialties have a management tree (under “functional management”), which is then matrixed into a project tree (under “project/program management”). This was first used at NASA, Andy Grove implemented it at Intel (as documented in High Output Management), and it is still in place today at companies such as Qualcomm. Those are all notable successes, and I am sure it is still used today at many more companies where physical manufacturing is an important part of the innovative process. Still, it has had its failures, for example Nokia. That article is well worth reading, as it highlights the particular failure and also makes a meta-statement about matrix management: the structure itself is heavyweight to implement at scale. So while the structure can make sense for large, unagile, slowly iterative technology industries with manufacturing overheads, for an industry that has embraced iterative release, like agile software development, all that structure seems a burden. Indeed, the only company I know of that has documented trying it, Spotify, admits it was a failure, mostly due to the overheads of the management structure.
3. Common Structure in Agile Software Development
The last section raises the question: what do agile software development organizations do? Here most companies start with a core software-development function splitting itself into a divisional hierarchy along the lines of the technical architecture²:
- Division (say under a VP)
- Product Group (say under a Director)
- Product (say under a Manager)
- Component/service (say under a Team Lead)
Then for the other functions, you see three options:
Cross-Functional: Individuals in a specific function are cross-functionally placed into core software development teams. For example, a Front End Engineer or Reliability Engineer on a team of Backend Software Engineers working on an Alerting product.
Mini-Matrix: They have functional teams, but any single team is pushed down to the lowest level of the core hierarchy that supports their mandate in terms of cross-functional collaboration, which allows the “matrix” aspect to be kept informal. For example, a team of 6 front end engineers under a Director, where 2 work on an Alerting product, 2 work on an SLO product, and 2 work on an Incident App product, each of those 3 products having its own backend development team of 5–6 people. Or even a company-wide Architecture team, time slicing their functional expertise across all teams in the company.
Functional Hierarchy: Functional teams are formed into a nested and parallel hierarchy reporting somewhere up in the core hierarchy. For example, a Front End org, or a Reliability org, reporting to a VP, SVP, or sometimes even the CEO.
As stated in the intro, beyond this what any company chooses tends to be unique to both the functions and the company’s mission (and organizational debt³ in light of mission).
4. Choosing Model For Specialty Functions
My basic rule comes from a combination of three things:
- The goal is the agile delivery of software that is also operated, which is a problem of cross-functional and cross-divisional complex coordination
- In section 2 I laid out at least fifteen different specialist functions involved in the delivery and operations of software
- Closeness in the hierarchy makes complex coordination easier
The third point has a strong underlying assertion that is important to address; namely, that management hierarchies create a power structure with the downside of making collaboration harder the further apart individuals are in it. This assumption underlies all management literature about why organization matters and so seems a truism. However, advocates for functional hierarchies often argue that they are not looking to create a power structure, but rather want to solve other important problems. Those problems include things like:
- Ensuring consistent hiring into the functional role
- Ensuring that promotions are evaluated by peers who understand excellence in the functional role
- Making it easy for the staff in the functional roles to float between teams as their skills are needed
However, all of these requests can be met with other mechanisms. Interviews can be conducted and reviewed by functional experts, functional committees can review promotions, and it is possible for teams to occasionally lend out experts to other teams where there is a need for a critical project. Creating a functional team makes these goals easier to achieve because it creates a formal hierarchy and grants power to the leader of the team. Creating a functional team to solve these problems is particularly seductive because, early in the team’s existence, everyone tends to agree on its priorities: what it should work on and why. But over time, this agreement will fade. The company will grow, and people will be further removed from the founding relationships. Newcomers to the team will assume that this team operates in the same way other teams operate: where the most important relationships are inside the team itself, followed by other nearby teams in the reporting chain. As this happens, the downsides start to occur more and more frequently, taking multiple levels of management to resolve, and turning into organizational debt.
With that in mind, I propose the basic rule of “be extremely careful any time you create a new nested management hierarchy for a functional specialization”, which means the preference should be Cross-Functional, then Mini-Matrix, and only finally Functional Hierarchy. Now, “extreme care” does not mean never having a functional hierarchy. For example, at Datadog we have a strong preference for a seamless user experience across product facets and even products. For us, that argues that nesting the Design and Product Management hierarchies is worth the cost. But we do it for two specialities, not fifteen.
5. Agile Software Development Management is Cross-Functional
The previous section leads us to see that companies that want to succeed in minimizing parallel hierarchies need to find mechanisms for empowering specialized functions so their concerns are addressed. That only comes from changing the way the organization thinks about “software development management”. There is an old saying in organizations where non-core functions are asked to succeed through influence: “Lots of responsibility, no authority”. The flip side of this, for those who have experienced middle management in rapidly growing agile software development orgs, is that our roles are “lots of authority, no autonomy”. That is, we have a role which nominally has a lot of power, but our time and focus end up being spent as the negotiator who makes everyone net maximally happy in light of divergent wants, across upwards/downwards/sideways stakeholders, both in-function and cross-function. In fact, this can feel like constantly owning “no-one is happy” tradeoffs, but that is a feature: the other option is having functional mini-empires whose leaders feel great about their tradeoffs, apart from their fights with other mini-empires who have different opinions, and nothing is resolved.
Thus, in the type of structure advocated for here, the company has to understand that “software development management” above “line” team lead level is not about software development, nor is it only about “people management”. This is why those are just two of the seven areas of Software Development Management that I think carry equal weight. In particular, there are two areas that managers need to spend significant time on for this type of structure to work: Partner Engagement and Company-Wide management processes. This means spending significant time and focus on problems such as “the 10% of my team who are front end engineers want to feel like they have career development prospects”. Needless to say, without both coaching and rewarding such behavior at the highest level of leadership, it won’t magically happen.
6. Addressing Concerns of Non-core Functions
As I stated in section 4, many of those who initially push for functional hierarchies do so out of legitimate concern about problems within the company that a functional hierarchy would address. The point is that the future costs of a functional hierarchy are very high, and there are other options. So in this section I lay out common concerns and how to address them.
6.1. What about standards of recruiting and promoting?
This is a heavily recurring concern, and it is often the thing that growing software development organizations do too late for their specialist functions, which leads to those in the function feeling like second class citizens and so wanting to centralize. My rule of thumb is that when you have ~10 people in a function company-wide, you need to invest in level matrices and standardized processes for both recruiting and promoting, and these need the time and focus of Recruiting, HR, Engineering Management, and senior ICs to be made legitimate.
6.2. Function X is extremely important in our culture/business/etc?
As covered in section 4, there are legitimate reasons for breaking out some functions into their own hierarchies to suit the culture and problems of a particular business. However, with each one, you are making cooperation easier internally in that function at the cost of impeding cooperation with other functions, and this impedance rises as an N² problem. This can be particularly impactful because it is a slippery slope: when a function centralizes, there is always a next in line who can use most of the same arguments about why they need to centralize too. So in general, the best solution here is to centralize into teams at the lowest place in the chart that makes sense. But don’t centralize company-wide unless you really need to empower that function to address a core challenge of your business.
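To make the N² claim concrete, here is a minimal sketch. It rests on an illustrative assumption that is not stated in the text above: that every pair of parallel hierarchies forms one boundary across which cooperation is impeded. Under that assumption the number of boundaries is N(N-1)/2, which grows with the square of N. The function names are hypothetical and exist only for this example.

```python
from itertools import combinations


def cross_hierarchy_boundaries(hierarchies):
    """List every unordered pair of hierarchies (assumed: one boundary per pair)."""
    return list(combinations(hierarchies, 2))


def boundary_count(n):
    """Closed form for the number of pairs among n hierarchies: n(n-1)/2."""
    return n * (n - 1) // 2


# Two nested hierarchies, as in the Design / Product Management example.
print(cross_hierarchy_boundaries(["Design", "Product Management"]))

# How the pair count grows as more functions get their own hierarchy.
for n in (2, 5, 10, 15):
    print(n, boundary_count(n))
```

Under this simplified model, two parallel hierarchies share a single boundary, while fifteen would share 105. Real organizations will not weight every pair equally, so treat the numbers as a shape of growth rather than a measurement.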
6.3. Function X is something our core software development org does badly?
This is an argument often used within the engineering functions; e.g. “we are bad at making our systems reliable”, “we are bad at building great front ends”, “we are bad at building scalable back ends”. These are often truths that come about based on the majority subfunction of engineers the company hired in the past and has now promoted into management. The key point is, putting a speciality function into its own hierarchy makes it easier for them to work with each other at the cost of making it harder to do so with everyone else. Now, if you need that, and can live with the consequences for the long term, then centralizing makes sense (say a Mobile development org). On the other hand a lot of times you need these agile coordination problems solved throughout the organization, and at that point you are better off finding better ways to empower the speciality without it needing to be a hierarchy.
6.4. What do I tell smaller functions about their limited growth options?
The first point is that a second level of hierarchy, say a manager of ~5 leads of ~5 ICs each, is roughly a 1 in 30 role. While growth opportunities for the 1 in 30 should not be neglected, you need to be thinking much more about growth opportunities for the 29 others in their roles as individual contributors or as close managers of individual contributors. That comes from 3 things:
- Greater functional technical expertise through increased depth and breadth of hands-on experience
- Greater social expertise allowing them to execute their functional expertise on projects/programs of greater risk, scope, ambiguity, complexity and so impact.
- Greater ability to impact the company through helping others to better execute and grow in the function.
Now sometimes, for a specific function, an org can’t supply the opportunities for the second or even the first. There’s just not the business need. Centralizing into a hierarchy isn’t going to fix that, and worse, may create a situation where opportunities get sought out contrary to the needs of the business. However, if the business does have needs, then the only thing a functional hierarchy makes easier is moving people between projects/programs to maximize average growth. That is important but comes with costs, and the need for moving people between teams to maximize growth applies just as much to core engineering as to other functions. Thus I argue you can meet this by holding management accountable for the growth of all individuals in all functions under them.
Coming to the third point, I would say this is a strength of not centralizing, because in a centralized hierarchy it is too easy for work like this to be treated as a “management problem”. Instead, with sufficient guidance, this can be seen as part of the role of senior ICs, with the second order effect of allowing them to build the social relationships and capital that also help them succeed at the second point.
Finally, while it comes with costs, don’t assume you can’t mix other functional backgrounds into the core software development management tree; i.e. get past the “only ex-engineers can manage engineers” mindset. The strongest example of that is Amazon, where leaders of any function can manage engineering teams provided they can demonstrate strong leadership in the face of their teams’ core challenges, which often are not technical innovation.
6.5. What about authority/advocacy of best practices of the function?
As stated in the last section, this can be done without a functional hierarchy. However, it does require management work and enablement of the community. Examples I have seen:
- When you have 1–5 people who are clearly most senior in the expertise, ask them to do this as 50% of their role, often with an EA or TPM helping them on the coordination side.
- When you have either too many or too few for that, I do think the more generalized notion of a “guild” that came out of Spotify can work, but likely only with someone senior up the tree (say Director level or above) to spend the time and focus to enable it.
- Should those not be possible, you can get away with a single “core team” (for example, a company-wide architecture team), although they are often limited by the fact that they are not community-based nor hierarchy-based in their source of power, so can struggle to have influence. To counter that I would suggest giving them a mission that still needs them to deliver as individual contributors, as per the next point.
6.6. What about common software/problems that need a team of specialists?
At the team level, this is completely fine. If the common problems are big in terms of internal coordination, there is no trouble with nesting either; this is just a “platform org”, which is common in core engineering. Further, platform teams are often a great place to put the most experienced ICs, drawing on their technical and organizational expertise to advocate for building the right things, and so also acting as an anchor of authority and advocacy.
The only warning here is a more general warning about platform organizations, which are infamous for saying “we will solve the problem you have today in a more general way in about 12 months”. That is, they can end up heavily favoring throughput over agility, because they are removed from the advantages agility brings to the broader business and only see its costs. My last manager Camille Fournier has a good writeup of the techniques you need to use to counter this.
6.7. What about flexibility for moving people to where the business needs it most?
It is clear that having all specialists in the function report up to a single point makes it easier for that point person to move people between their teams. If this is a major concern that you can see the business following through on, it more or less falls under the important exception described in section 6.2. However, be sure the follow-through is likely. All of these functions are high skill, projects usually take months, and specialist knowledge of a particular domain can take years to learn. Are you really going to move such functions around rapidly and en masse? In my experience I have only seen this exercised regularly for lower-skilled functions that are much less involved in the agile delivery and operations of software; really it has only been for un-agile pre-devops operations teams.
[1] I use “Agile” with trepidation, as what started with good intentions around rapidly-iterative minimal processes turned into an industry of consultants selling a whole lot of process. When I use “agile” I mean it in the original sense, which came from software development getting over its “engineering envy” of the processes and rigor used by the more physical forms of engineering. Instead it embraced the virtual nature of its implemented form, which allows late binding in meeting, and so in specifying, requirements. That in turn allows quick iteration to solve complex problems, something all the other forms of engineering now envy.
[2] When done thoughtfully, this is called embracing Conway’s law around shipping the org chart by structuring around what you want shipped.
[3] For lack of a better term, “organizational debt” is the heavy inertia of changing the structure of human organizations, which beyond the conflict of having winners and losers, is exacerbated by any proposed change having very clear losers and less clear winners.