Over the past few years, I have found a useful classification for all engineering work. While it is neither perfect nor universal, it serves as a powerful framework when I think about processes, communication and planning.
I split everything into these three categories:
- Three-week projects
- Problems that require immediate resolution
- Small chunks of continuous improvement work
Each type of work has its own specifics. Below is a description of how collaboration might be organised in each case.
Projects
We do most of the product development work in three-week projects. Within these weeks, engineers organise their work as they see fit. We don’t assign tasks. Developers get a fixed amount of time to implement a project within guidelines that the product and engineering teams agree on up front.
To prepare for each project we collaborate on a document that we call a proposal. The goal is to understand whether the work proposed seems achievable in three weeks and to identify risks as early as possible.
At the beginning, a member of the product team drafts the proposal. After some initial framing, it is shared with the developers for an engineering review. Once the feedback from engineering has been taken into account and we agree that the proposal is feasible, we kick it off and the implementation team starts work.
All further communication is asynchronous. We intentionally don’t create chat rooms per project. Instead, we communicate through comments on a Trello card that we create for each new project.
Surges
When a situation carries a sense of urgency, we start a surge. The pressure can come from an issue that requires an immediate fix or from a rare business opportunity that needs urgent engineering help. During a surge, engineers always act in accordance with predefined protocols and runbooks, follow strict communication rules and focus exclusively on delivering a quick solution. Engineering leadership is heavily involved in order to pull in the required resources and coordinate activities.
Surges require a completely different approach to communication. Given their urgency and high impact, more people within the company are interested in real-time updates. At the same time, an engineer investigating a problem needs to maintain clear focus and can easily be distracted by constant chat alerts requesting updates.
Email might be the best communication tool for a surge. We could create an email thread as the very first step and add everyone involved to it. Communication over email is asynchronous, but in this case we commit to responding as soon as we have more information. For example, if investigating an issue takes time, the engineer handling the surge will clearly state how much time they need for the first pass. The same engineer will reply no later than the time specified, either to request more time or to share the latest findings. A thread in a Slack channel might be a more synchronous alternative if the communication turns out to be closer to real time.
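The commitment to reply by a stated time can be sketched in a few lines of Python. This is only an illustration under my own assumptions: `SurgeUpdateCommitment`, `start_investigation` and the engineer name are hypothetical and not part of any tool we use.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SurgeUpdateCommitment:
    """Hypothetical record of when the next surge update is due."""
    engineer: str
    next_update_due: datetime

    def is_overdue(self, now: datetime) -> bool:
        # The engineer promised to reply no later than next_update_due.
        return now > self.next_update_due

    def extend(self, now: datetime, extra: timedelta) -> None:
        # Replying with "I need more time" sets a new deadline from now.
        self.next_update_due = now + extra


def start_investigation(engineer: str, now: datetime,
                        estimate: timedelta) -> SurgeUpdateCommitment:
    # The first message in the thread states how long the first pass takes.
    return SurgeUpdateCommitment(engineer, now + estimate)


started = datetime(2024, 1, 15, 10, 0)
commitment = start_investigation("alex", started, timedelta(minutes=45))
```

Everyone on the thread then knows when to expect the next message, and nobody needs to ask for a status before that time.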
Continuous improvement work
Certain aspects of our tech stack require regular enhancement. A series of small refinements helps us achieve this. Here are some examples: improving our DevOps flows, updating documentation, improving cross-browser compatibility, updating libraries, cleaning up unused service instances, upgrades, etc. These modest contributions produce useful results in a short span of time, from a few hours to a couple of days of work. The best moment for this mode of working is the period between projects.
It is good to have a tech strategy that identifies areas of responsibility for each engineer and sets long-term goals for them. Driven by these goals, engineers would self-manage their work on small improvements. Status updates on their most recent efforts can be shared and discussed in a team chat.
Improvement Kata can be used as a coordination and visualisation tool. We could describe iterative changes, clearly defining the current state and the next steps as we move towards an ideal future.
This categorisation also provides the engineering team and the executives with a shared language. Among other benefits, this language gives us a straightforward prioritisation mechanism: a surge is always more important than a project, and work on a project always has higher priority than a small improvement. It also helps us understand the trade-offs that have to be made as we allocate engineering resources. This is especially valuable in a dynamically changing environment. The ability to adjust the sails in an unpredictable situation is what defines a truly agile organisation.
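The prioritisation rule is simple enough to write down as code. Here is a minimal Python sketch, assuming a hypothetical `WorkItem` record and made-up backlog entries; the only thing taken from the text above is the ordering surge, then project, then improvement.

```python
from dataclasses import dataclass
from enum import IntEnum


class WorkType(IntEnum):
    # Lower value means higher priority.
    SURGE = 0
    PROJECT = 1
    IMPROVEMENT = 2


@dataclass(frozen=True)
class WorkItem:
    title: str  # hypothetical example titles, not real work items
    kind: WorkType


def prioritise(items: list[WorkItem]) -> list[WorkItem]:
    # sorted() is stable, so items of the same type keep their original order.
    return sorted(items, key=lambda item: item.kind)


backlog = [
    WorkItem("Update libraries", WorkType.IMPROVEMENT),
    WorkItem("New onboarding flow", WorkType.PROJECT),
    WorkItem("Checkout outage", WorkType.SURGE),
    WorkItem("Refresh documentation", WorkType.IMPROVEMENT),
]
ordered = prioritise(backlog)
```

The sketch deliberately has no ranking within a category; the categorisation only says how the three types of work compare with each other.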