Canonical is a unique tech company: global, remote-first, open source, with 700 professionals across 50 countries. We want to be the world's best, not biggest, global software company.
Information Systems - Open Source Technical Architect
The Information Systems (IS) team is responsible for all IT operations at Canonical, including the infrastructure for building, packaging and distributing Ubuntu globally. As the IS architect, you will own the design of hardware and software implementation from PCI lanes to CDN and everything in between. Our goal is to be a reference operation, using the latest capabilities in Ubuntu and open source more generally, with the most modern operating principles. Your choices will impact the Ubuntu user experience for millions of users, and drive how Canonical's engineers engage with compute and network resources in a dev-ops setting.
As the IS Architect, you'll be in a unique position to improve Canonical products and the open-source technologies they're based on. You do this by advising System Reliability Engineers (SRE) and Data Center Engineers (DCE) about best practices, and by making informed decisions on technology choices in all aspects of cloud infrastructure and services. You will coach the IS team on automation, reliability, operational/technical scalability, network infrastructure and security.
Set up, maintain and update the technical design roadmap and guidelines for the SREs within IS, with the aim of improving reliability, resilience, operational scalability, and technical scalability
Own the design of the coming generations of global data centers and infrastructure
Collaborate with the cloud-ops software development teams and provide them with input on roadmap, requirements and prioritization to build a world-class, highly standardized and automated operation
Provide the IS management with input and advice with regard to technology, reliability, resilience and business cases
Lead technical choices to implement solutions as self-service products, ensuring scalable operation
Collaborate with product security as well as operations security to set best practice and mitigate new threats in a timely manner
Automate operations for reuse across the world's largest companies, taking into consideration the complexities of distributed systems
Collaborate with development teams to design service architecture, documentation, playbooks, policies and operational procedures
Analyze incidents and events, establish the root causes behind them, and identify structural improvements that minimize the chance of them recurring
Provide assistance and work with globally distributed engineering, operations, and support peers
Valued Skills And Experience
Bachelor's degree or higher, preferably in computer science or a related engineering field
Extensive knowledge of cloud computing concepts, technologies & operation
Practical knowledge of Linux networking, routing, and firewalls, internet transit and large-scale, high-bandwidth networking
Hands-on experience of automated administration of enterprise Linux servers at scale
Experience dealing with significant production outages, incident response and postmortems
The ability to see the big picture in all the details
A passion for writing, sharing, and maintaining enterprise open-source software solutions
Strong modern software engineering background (peer-review, unit testing, SCM, CI/CD, Agile)
Strong knowledge of best practices in data center design with regard to networking and hardware choices
Able to communicate clearly and effectively in English over email, chat, video or voice calls and in person
Be inspired by the needs of fast-changing environments
Happy to work within distributed teams
A willingness to be flexible and able to learn new things quickly
Be familiar with and passionate about open source, especially Ubuntu or Debian