My CV is available on GitHub.
I’m an experienced operations engineer with a focus on automation and
tooling. Most of my career has been spent working with Linux systems,
and I cut my teeth working in customer-facing support.
I enjoy writing Go and Ruby. I have a lot of experience with
continuous integration (CI) and deployment systems, and have a strong
interest in using automation to solve problems. I particularly enjoy
making the lives of my colleagues better with intuitive tools.
Much of my work in the last few years has been in an AWS environment.
I feel comfortable working within the huge Amazon ecosystem and using
the AWS SDK.
October 2021–present: Site Reliability Engineer for Ably
I work in the Infrastructure team to help scale the global platform
as Ably continues to grow rapidly.
Ably is a distributed system that runs thousands of servers across
multiple regions in AWS. The Infrastructure team is responsible for
maintaining the system and ensuring its uptime, as well as providing
tooling and support for the engineering teams, and leading
improvements to legacy parts of the system.
July 2021–October 2021: Systems Development Engineer for Amazon Web Services
I briefly worked at Amazon, but chose to leave because it didn’t feel
like the right fit for me.
March 2018–June 2021: Site Reliability Engineer for FutureLearn
As the sole SRE at FutureLearn, I was responsible for improving the
development workflow, the CI and deployment methods, and the core
infrastructure of the site.
Achievements in this role:
- Gave regular technical lightning talks and interactive workshops
aimed at ensuring that software engineers, technical leads, and
technical architects were also able to support and improve the
platform.
- Reduced the overall build time for the main application by
cutting test times from ~15 minutes to ~5 minutes, greatly shortening
the development feedback loop.
- Built, configured, and migrated to a new CI system to replace the
out-of-date system previously in use.
- Unified all infrastructure code into Terraform and replaced manual
provisioning tasks with automated instance bootstrapping. This saved
operational support cost and reduced the complexity and number of tools
involved in making changes to the stack.
- Wrote a command-line tool in Go for developers to ease deployments
and allow working with SSH, AWS, CI, Datadog, Docker, and more.
- Upgraded the servers from Ubuntu 14.04 to Ubuntu 18.04.
- Modernised the platform by migrating from EC2-based deployments to
Amazon ECS, focusing on scalability and simplicity while balancing the
need for lower cost of ownership by moving to higher-value managed
services.
- Scaled the platform to handle an unexpected spike in traffic during
COVID-19.
As technical lead for GOV.UK Infrastructure, I led a team of web
operations engineers and software engineers, and worked alongside the
delivery and product managers in moving GOV.UK toward a modern
infrastructure. GOV.UK is made up of over 50 microservices, mostly using
Ruby on Rails, and the platform includes MySQL, PostgreSQL, MongoDB,
Elasticsearch, Jenkins (deployment and CI), Varnish, NGINX, and Redis.
It is a critical national resource, and so it is essential that the
general public are able to reliably access content published on the
site.
Achievements in this role:
- Enabled software engineers to deploy code as quickly and safely as
possible on the current platform, while maintaining and improving the
infrastructure and increasing automation and self-healing. This allowed
my team to concentrate on improvements rather than maintenance.
- Collaborated with other technical leads, delivery managers, and
product managers to ensure we were targeting the right goals, and
provided guidance and advice based on my experience.
- Worked with technical architects to provide the vision for the
future of the platform, which included highly available, ephemeral, and
dynamic infrastructure and a transition from self-hosted to cloud-native
services.
This position involved maintaining and supporting the ongoing needs
of the evolving GOV.UK infrastructure. As part of this, I provided
out-of-hours and in-office second-line support.
Achievements in this role:
- Worked in a multi-disciplinary team, responding to both technical
and business needs.
- Made substantial contributions and improvements to our Puppet
code.
- Planned and contributed to a number of improvements to functions
such as monitoring, logging, CI, deployment, backups, and disaster
recovery.
I joined DediPower as an intern on work experience, and was then
hired full time as part of the systems administration team, working with
customers, other members of the support team, and other parts of the
business including the networking, sales, datacentre, and provisioning
teams. Our services were primarily Linux-based, but we also supported a
wide range of systems including Windows, VMware, Cisco firewalls, and
server hardware.
Achievements in this role:
- Worked my way up from the lowest to the highest level of
support.
- Supported all customers with extensive on-call coverage, often
responding to and fixing issues from customers we’d had very little
experience with previously, along with being a point of contact for
larger and more complex systems.
- Provided technical guidance and ensured efficient knowledge sharing
with new hires during a period of rapid growth from a dozen to over 60
people.
- As an intern, helped to build one of our datacentres, including the
initial power cabling, air conditioning implementation, networking, and
full rack placement.