With 7+ years of experience, you’re not just supporting systems; you’re shaping the future of Oracle’s production services.
*What you’ll do*:
- Lead the installation, maintenance, monitoring, and optimization of production server infrastructure across Oracle Cloud.
- Act as a technical escalation point for highly complex issues, including coordinating cross-functional teams and third-party vendors to resolve incidents.
- Represent the infrastructure team as a technical SME on major incidents, service calls, and cross-org initiatives.
- Contribute to the evolution of SLOs, SLAs, and KPIs for services you support, driving reliability and performance at scale.
- Standardize, automate, and improve operational processes and system efficiency using your Linux and scripting expertise.
- Own or co-own key service improvement projects, from roadmap ideation to post-deployment impact analysis.
- Provide rotational on-call support to ensure high availability of services across a 12-hour, 7-day coverage model.
*What we’re looking for*:
- Expert-level experience with Linux system administration in large-scale or cloud-native environments
- Strong scripting skills in Python and Bash/shell
- Solid understanding of networking fundamentals
- Proficiency with monitoring tools such as Grafana, New Relic, and Prometheus
- Demonstrated success in supporting production environments with high performance and availability expectations
*Nice to have*:
- Familiarity with Oracle Cloud Infrastructure (OCI) or similar cloud platforms (AWS, Azure, GCP)
- Experience defining or evolving KPIs, SLOs, and SLAs
- Exposure to infrastructure automation tools or infrastructure-as-code (e.g., Terraform, Ansible)