Overview

The IC3 SRE Engineer is responsible for supporting and enhancing the reliability, availability, and performance of the company's IT infrastructure and applications. This semi-senior role focuses on improving system stability and efficiency through advanced monitoring, automation, and incident response, contributing to the overall success of IT operations and strategic initiatives.

Main Responsibilities

- Advanced system monitoring: Implement and maintain advanced monitoring solutions to ensure the health and performance of infrastructure and applications.
- Incident response: Lead incident response activities, diagnose and resolve system reliability issues, and conduct post-incident reviews.
- Automation and scripting: Develop and implement automation scripts and tools to improve system reliability and operational efficiency.
- Performance analysis: Collect, analyze, and interpret performance data to identify trends, anomalies, and potential issues, providing actionable insights.
- Documentation: Maintain accurate and up-to-date documentation of system configurations, processes, and procedures.
- Collaboration: Work closely with other IT team members and departments to support reliability engineering projects and initiatives.
- Mentorship: Provide guidance and support to junior engineers, helping them enhance their technical skills and knowledge.
- Security compliance: Implement and enforce security measures to protect systems and ensure compliance with security policies.
- Continuous improvement: Drive continuous improvement initiatives, exploring new technologies and methodologies to enhance system reliability.
- Autonomous work culture: Actively contribute to an autonomous work culture by taking initiative, being self-motivated, and collaborating effectively in an agile and lean environment.
- Spin culture ambassador: Embody and promote Spin's values in every action, fostering a positive and inclusive work environment.
- Disaster recovery: Develop and maintain disaster recovery plans to ensure business continuity in case of system failures.

Required Knowledge and Experience

- Bachelor's degree in computer science, information technology, or a related field, or equivalent work experience.
- Minimum of 5 years of experience in site reliability engineering or related fields.
- Strong understanding of system reliability concepts, including monitoring, automation, and incident response.
- Proficiency with scripting languages and automation tools.
- Strong problem-solving and troubleshooting skills.
- Excellent communication and teamwork skills.
- Willingness to learn and adapt to new technologies and processes.
- Data-driven mindset.
- Strong communication skills.
- English level: intermediate to advanced.

Spin is committed to a diverse and inclusive workplace. We are an equal opportunity employer and do not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, disability, age, or any other legally protected status. If you would like to request an accommodation, please notify your recruiter.

Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Engineering and Information Technology