Role Overview

We are hiring a Senior Site Reliability Engineer (SRE) to join the Platform Engineering team. This role is responsible for the shared AWS infrastructure that supports the client's core products, including a large monolithic web application as well as a growing set of microservices. This is not a "ticket-based ops" role; it is about building and evolving the platform that engineering relies on.

Core Responsibilities

Infrastructure & Reliability
- Own and manage shared AWS infrastructure used across the company
- Maintain and operate EKS clusters
- Ensure reliability, scalability, and performance of production systems
- Monitor infrastructure health and proactively address issues

Observability & Monitoring
- Own monitoring, logging, and alerting across infrastructure and applications
- Heavy use of:
  - Grafana
  - OpenSearch clusters
- Design alerts that:
  - Detect infrastructure and application issues early
  - Are actionable (not noisy)
- Drive observability standards across teams

CI/CD & Automation
- Design, build, and maintain CI/CD pipelines
- Improve deployment safety, speed, and consistency
- Automate infrastructure and development workflows
- Partner closely with engineering and QA to support reliable releases

Must-Have Experience
- Senior-level experience in SRE, DevOps, or platform engineering
- Strong AWS experience
- Infrastructure as code (Terraform preferred)
- Kubernetes / EKS in production environments
- Designing and operating CI/CD pipelines
- Hands-on experience with observability tooling:
  - Monitoring
  - Logging
  - Alerting (Grafana or similar)