Principal Software Engineer
The Senior Infrastructure Engineer will work within the Sophos Core Platform Group System Engineering team. This team is responsible for driving system resiliency, scalability, and supportability through software & infrastructure development, system design consultation, and development of operational best practices. This team is also responsible for the development and operations of all MongoDB and Redis shared infrastructure on the platform. You will work closely with site reliability engineering, engineering services, and service development teams as you pursue your mission. As a senior member of the system engineering team, you will directly influence both system design and best practice across Sophos’ global cloud development organization.
- Develop and operate shared infrastructure on the Central platform. This includes large-scale MongoDB, Redis, and Memcached infrastructure.
- Actively refactor existing platform infrastructure and Java code to increase resiliency, scalability, and cost efficiency of Central applications and platform services.
- Develop and maintain sustainable automation for deployment and management of Java applications and infrastructure in production.
- Develop and promulgate system design and operational best practices to be adopted by development, engineering services, and SRE teams.
- Implement monitoring, telemetry, and debugging tools within the platform to improve operational response and troubleshooting capabilities.
- Triage and troubleshoot system errors in prod and pre-prod Central platform environments. Analyze logs and communicate potential code issues to development teams.
- Consult with application development teams on use of shared infrastructure and general system engineering aspects of service design.
- Actively develop system improvements and refactoring as part of incident postmortems to improve system resiliency.
- Influence service decomposition priorities as we extract services from the monolithic components on the platform.
- Actively monitor system performance of “SOA” processes in production.
Experience and Skills
- BS in Computer Science or equivalent experience
- 5+ years of experience in system engineering, infrastructure development, or SRE roles working with large-scale cloud systems.
- Strong competency in the following essential skills:
  - Linux operating systems
  - Scripting / software development with languages such as Bash, Python, Go, and Java
  - Public cloud platforms: Amazon Web Services (AWS), Azure, or GCP
  - Declarative infrastructure-as-code frameworks such as Terraform or CloudFormation
- Conceptual understanding of design, implementation, and operational patterns for:
  - Distributed cloud systems
  - Large-scale distributed data stores. Expertise in MongoDB and Redis is particularly valuable in this role.
  - System monitoring & telemetry
- Familiarity with relational, NoSQL, and in-memory data stores, particularly MongoDB and Redis. Ability to understand and modify both MongoDB and SQL statements and scripts to fulfill database configuration and setup needs for applications. Ability to perform database installation and configuration steps, including backup and restoration of test data in multiple environments.
- Excellent written and verbal communication skills to coordinate with worldwide development and operations teams
Job Type: Full-Time
Apply at: https://www.sophos.com/en-us/company/careers.aspx