Job profile summary
* Responsible for availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning. Creates a bridge between development and operations by applying a software engineering mindset to system administration topics. Splits time between operations/on-call duties and developing systems and software that help increase site reliability and performance.
What part will you play?
* Chaos engineering - thinking laterally about how systems might fail in theory, designing tests to demonstrate how they behave in practice, and then formulating and implementing remediation plans as appropriate.
* Using DevOps and GitOps practices to improve automation and processes so that self-service becomes possible.
* Pushing our systems to their limits, and then coming up with designs for how to get them to the next performance tier.
* Safeguarding reliability. Ensuring that our services are highly available, resilient against disasters, self-monitoring, and self-healing.
* Running "game days" to test assumptions about reliability and learn what will break before it matters to customers.
* Building systems to proactively monitor the health, performance, and security of our production and non-production virtualized infrastructure.
* Improving our monitoring and alerting systems to make sure engineers get paged when it matters (and don't get paged when it doesn't).
* Troubleshooting systems and network issues, alongside our technical operations team.
What are we looking for in this role?
Minimum qualifications
* BS in computer science, information technology, business / management information systems, or a related field
* No experience required. Typically has basic knowledge of programming in one or more languages, and of Unix/Linux systems internals and administration (e.g. filesystems, inodes, system calls) or networking (e.g. TCP/IP, routing, network topologies and hardware, SDN).
Preferred qualifications
* B2 English skills
What are our desired skills and capabilities?
* Skills / knowledge - learns to use professional concepts. Applies company policies and procedures to resolve routine issues.
* Job complexity - works on problems of limited scope. Follows standard practices and procedures in analyzing situations or data from which answers can be readily obtained. Builds stable working relationships internally.
* Supervision - normally receives detailed instructions on all work.