We are looking for a highly technical Software Engineering Manager to lead a newly formed DevOps team. As the DevOps lead, you will play a crucial role in ensuring the reliability,
scalability, and performance of our customers’ systems and services. You will lead a small team of DevOps and Operations engineers and collaborate with
cross-functional teams to implement and maintain best practices for managing infrastructure, applications, and operations.
Your primary responsibility will be to oversee the day-to-day operations of the DevOps team and drive initiatives to improve system reliability, efficiency, and uptime.
We expect you to embrace Infrastructure as Code (IaC) at all levels, with automation as a core requirement for all projects. The person taking this role will have the freedom to make significant decisions with real impact.
This is very much a hands-on role; you will be expected to be in the weeds with your team.
Objectives
- Demonstrate your proven leadership capabilities and act as a change agent for the R&D team.
- Run the production environment, monitor availability, and take a holistic view of system health.
- Build software and systems to manage platform infrastructure and applications.
- Improve reliability and quality, and reduce time to production.
- Measure and optimize system performance, with an eye toward pushing our capabilities forward, and innovating for continual improvement.
- Evolve and modernize our CI/CD practices.
Responsibilities
- Lead, manage, develop, and grow a team of diverse, remote engineers.
- Work collaboratively with your team to plan their work and ensure that everyone knows what they should be working on.
- Respond to escalations, help with incident response and RCAs, and manage follow-up from post-mortems.
- Establish metrics for data-driven decisions to help increase availability, reliability, and velocity.
- Directly manage operations responsibilities: deployments, tools, troubleshooting, and performance tuning.
- Ensure service availability and performance of the SaaS platform.
- Manage rollouts of future releases and patches to the production environment.
- Develop automation wherever possible.
Qualifications
- At least 5 years of proven leadership experience within SaaS products.
- Proven ability to write code (Python, Go, Perl, etc.).
- Expert-level knowledge of Linux systems and network protocols.
- Experience with DevOps principles and concepts such as Infrastructure as Code, CI/CD, monitoring and visualization (Prometheus, Grafana).
- Experience with containers and container management (Docker, Kubernetes).
- Experience in developing software for, troubleshooting, and monitoring large-scale distributed systems.
- Ability to implement modern software engineering best practices and maintain the software development life cycle.
- Working knowledge of, and experience with, Agile software development methodologies.
- Outstanding collaboration, communication, and documentation skills, with a proven ability to work cross-functionally to establish and meet OKRs.
It would be an advantage for you to have:
- A bachelor’s degree or higher in engineering (IT, Telecom).
- Experience with big data technologies such as NoSQL/RDBMS, Elasticsearch, Databricks.
- Proven experience in supporting mission-critical applications on a global scale.