Following on from our initial conversation, we’ve been speaking with lead engineer Nick Barton. Nick has just rolled off one of our biggest projects following a successful major release, and we’ll be speaking to him about how he managed the team that delivered it.
The release was a substantial Azure migration under intense time constraints. AL’s role was providing DevOps Engineers working in both development and operations roles, with Nick running the project. The Azure migration meant shifting a huge number of applications that were previously on-premises to public cloud. The challenge included migrating 400 servers and 6 main applications comprising around 50 microservices in total. To complete the workload we formed an initial team of 7 AL engineers, scaling up when needed, to work on building the new infrastructure over 8 months, overcoming many challenges including a lack of detailed design and frequent changes in priority.
We caught up with Nick to find out more about how he managed this release and to gain some insight into usable management techniques for large-scale projects like this one.
Why was this project particularly challenging for the team?
The project needed to be planned and releases given deadlines when we arrived on site. The nature of our clients working in highly regulated industries means we have to be prepared for external drivers creating hard deadlines, even if we’re missing key dependencies at the time. Estimating timescales for these types of projects, as every architect and engineer will know, always comes with a risk. Despite Agile methodologies offering a number of estimation techniques such as Affinity Mapping and The Bucket System, they all stress that it’s near impossible to get an accurate estimation and it only gets harder the longer the time-frame you’re estimating for.
This is why the team works in short sprint cycles: the extra detail you acquire as you move forward sheds more light on the next step, especially when you add in some architectural planning. We realised that the initial timescales we’d estimated were around 50% out, so quite a difference.
How did you pull together to deliver under pressure?
Our main achievement when we realised the challenges we were facing was to prioritise getting other disciplines such as security and architecture embedded within the team. Simply having them involved in the day-to-day stand-ups and able to answer questions as and when they came up, rather than working in a big feedback loop, made a huge difference. We came up with a solution quickly and automatically had their buy-in.
In terms of our AL team, I appointed Anna, one of our engineers who has actually just been shortlisted for the young professional of the year award, as a lead for the project, responsible for breaking down tasks for engineers. That meant giving them work they could get on with rather than seeing only the huge milestone ahead, which made a big difference.
It became clear we would need to work extra hours to get things done. Overtime payment, TOIL and days off in lieu were all in place, and it became apparent that shielding the team from the pressure of these deadlines wasn’t the best way to go.
The reality is, most people thrive under a little bit of pressure. I had to take a step back and get everyone involved in the planning of the work. Despite ambitious deadlines, allowing them to break it down and say ‘we can do this in these steps’ made it seem achievable to everyone.
I focussed more on team morale than incentives; we were all invested in this project and everyone wanted to see it work. It was little things like arranging a weekend working at our HQ for a change of scenery, the occasional lunch or after-work drinks. The aim was motivation while ensuring they all still got their downtime. Getting the buy-in from the engineers on wanting to achieve this was all it took. They know what it means to AL and what it means to the future of the client.
When it came down to the wire and you were faced with the risk of something going wrong, how did you keep a cool head and keep the team motivated?
It all starts with a calm discussion. Trying to get to the bottom of the issue and coming up with alternative solutions for discussion that will get buy-in from all different teams. Really, it sounds simple but it’s giving people the opportunity to try things out. In a way, having systems in place for failure and giving people autonomy works out better in the long run than trying to dictate people’s actions.
Otherwise, in a serious crisis, it comes down to getting the right people to swarm around it. Ensuring my team are giving me regular updates, giving them a direct communication channel to save time repeating things and mitigating the risk of miscommunication. Open communication without pressure is key.
When it came down to it, as long as the communication was strong, the team could act in a more flexible manner. Accepting changes in priorities and being flexible helps reach the end goal more efficiently.
There isn’t an exact science to managing something that doesn’t go as initially planned. All you can do is scope out the changes you need to make, work out how they’re going to impact other things, and impress the importance of prioritisation. It sounds simple, but without weighing up the other aspects of the project you can easily do something that will have more of a negative impact. Being honest with your client is also really important. It’s better to say ‘We can do this, but there’s a compromise that we’ll have to rectify at a later date, or we can do this and revisit that.’
Most important for me was taking the time to truly understand the aims of the project. Not just what we need to do but the wider business impacts and drivers. For example, there are times when you know the main goal is cost reduction – knowing that means when you’re planning you’ll choose the most cost-effective option which may help save costs in other areas of the business.
Did the DevOps team have much interaction with the testing in the lead up to go-live?
We had quite a good relationship with testers, but what we had to do was extract them from their teams and embed them within the Dev teams. Once Testing was embedded with Development, this allowed for test-driven development. In the end, we built a dedicated testing team. It was a unique release. We were duplicating systems, which allowed us to safely test all new infrastructure before we switched over. This was unusual because we could configure anything and run tests before starting the migration. In general, though, you should be sure of what you’re testing in pre-production without worrying whether it’ll work in production.
When you came out the other side, how did you celebrate success?
We’re lucky to work with a great client who organised drinks. AL also funded a dinner for everyone, and we had team lunches. It’s little things like that and, equally important, taking the time to personally thank people for their effort, with the programme director doing the same. Making it clear to people that their contribution was valued is really important.
Recognition back at AL is a key thing, but it’s hard to balance overall team recognition with calling out the shining stars. There are things like all-hands to allow for team recognition across the whole company, and to reiterate the wider business benefits, such as project extensions, that these successes can lead to.
How do you capture what you’ve learnt and do things better next time?
In terms of my personal leadership development, I want to focus on picking up more agile techniques for these intense situations. Knowing tricks and tools to employ while still keeping in control of the project will be helpful. Once they’re in place it’s easier to report back to the client. When everything is visible and they can see what’s going on in a system, it makes everything easier.
Throwing processes out of the window at crunch times can seem really tempting, but it leaves you as the only person who can provide an update to the team and the client. In the long run, that’s a bad position to be in. There needs to be a compromise between rigid processes and utter chaos.
Other than that, I’ve learnt that the team can handle a little stress, and that allowing for a little pressure actually gets more buy-in than completely protecting them. Just get the right people doing the right things and let them get on with it.
I kept up informal feedback, but moving forward I’d want more retros with the whole team on what went well and what could be done better. If people feel excluded at any time, it runs the risk of them not buying into the process. Everyone here has something to offer, and it’s up to me to ensure those skills are appreciated.
It’s pretty amazing what Nick and his team can achieve under pressure. He believes that everyone working to AL’s values helps, with humility going a long way but also with the addition of courage: the courage to push back and not accept something they don’t believe is right, knowing which battles need fighting but also when to concede and move on. As a team, we want to collaborate and compromise when we can. Ultimately, it’s the courage to just get on and do it, even when you’re not given all the rules on how to do it.