About the role
We are looking for a mid-level SRE operations engineer to maintain reliability across a cloud-based SaaS platform. You’ll handle live incidents, improve observability, and reduce toil through automation using Kubernetes, Terraform, Grafana, and AWS. The role is hands-on and execution-focused, with real ownership across CI/CD pipelines, GitOps workflows, and on-call rotations.
What you will do
* monitor and support production and staging environments to ensure availability, performance, and stability;
* respond to incidents, perform triage and root cause analysis, and contribute to remediation efforts;
* participate in on-call rotations with defined SLAs;
* handle operational requests from internal teams;
* maintain and improve monitoring, alerting, dashboards, logs, and metrics;
* support CI/CD pipelines, production releases, and GitOps workflows;
* contribute to automation initiatives to reduce operational overhead;
* maintain and improve Kubernetes-based infrastructure and containerized workloads;
* support infrastructure as code practices and environment improvements.
Must haves
* 2+ years of experience in site reliability engineering, DevOps, or production operations;
* experience supporting production environments on AWS;
* experience supporting production SaaS applications;
* strong understanding of CI/CD systems (GitHub Actions, Jenkins, CircleCI);
* experience with GitOps and Git fundamentals;
* experience using GitHub, Jira, and Confluence;
* experience with Kubernetes (EKS, kOps, or similar);
* experience with Docker and containerization;
* experience with observability tools (Grafana, Prometheus, Loki, PagerDuty);
* proficiency in scripting (Bash, Python, or Go);
* experience with infrastructure as code (Terraform, Helm);
* ability to work within structured operational processes and SLAs;
* strong written and verbal English communication skills;
* self-driven with a growth mindset.
Nice to haves
* AWS certifications such as Solutions Architect, DevOps Engineer, or SysOps Administrator;
* experience with multi-tenant SaaS environments;
* experience working in globally distributed teams;
* familiarity with ChatOps practices;
* experience improving monitoring quality and reducing alert fatigue.
Perks and benefits
* Professional growth: mentorship, tech talks, and personalized growth roadmaps.
* Competitive compensation: USD-based pay with education, fitness, and team activity budgets.
* Exciting projects: modern solutions with Fortune 500 and top product companies.
* Flextime: flexible schedule with remote and office options.