Overview:
*We are PepsiCo*
PepsiCo is one of the world's leading food and beverage companies, with more than $79 billion in net revenue and a global portfolio of diverse and beloved brands. We have a complementary food and beverage portfolio that includes 22 brands that each generate more than $1 billion in annual retail sales. PepsiCo's products are sold in more than 200 countries and territories around the world. PepsiCo's strength is its people. We are over 250,000 game changers, mountain movers and history makers, located around the world and united by a shared set of values and goals.
Responsibilities:
*The opportunity*
We are seeking a talented Observability Engineer to join our global command center. This role focuses on maintaining and supporting our observability platforms, including Elastic (ELK Stack), while ensuring effective L1 BAU (business-as-usual) support. You’ll work proactively to reduce major incidents through advanced monitoring, anomaly detection, and predictive insights, playing a critical role in system reliability and continuous service improvement.
*Your impact*
As *CTO Assoc Manager - Observability Engineer & Elastic BAU*, your scope will include:
- Maintain and support observability platforms (Grafana, AppDynamics, SevOne, ThousandEyes, ELK), with a focus on L1 operational support and incident triage.
- Act as the first line of defense for observability-related alerts, handling log ingestion issues and dashboard or alert failures, and routing or escalating complex issues to L2/L3 as appropriate.
- Configure, optimize, and maintain alerting, logging, and telemetry for performance, capacity, and anomaly detection across diverse systems.
- Use the Elastic Stack (Elasticsearch, Logstash, Kibana) for log correlation, alerting, and operational dashboards to support incident analysis.
- Drive automation to reduce toil and improve platform reliability using scripting and automation tools (e.g., Bash, PowerShell, Python, Ansible).
- Participate in RCA/postmortem reviews and help document learnings for continuous improvement.
- Mentor command center staff on observability tooling and operational best practices.
Qualifications:
*Who are we looking for?*
*Education*
- Bachelor’s degree in computer science, information technology, or equivalent experience.
*Experience*
- 3+ years of experience with observability/monitoring tools (Grafana, AppDynamics, ThousandEyes, SevOne).
- Hands-on experience with the ELK Stack (Elasticsearch, Logstash, Kibana) for log ingestion, visualization, and basic troubleshooting.
- Strong understanding of cloud monitoring, preferably on Microsoft Azure.
- Basic knowledge of infrastructure components such as routers, load balancers, compute, and storage.
- Exposure to scripting and automation tools (Ansible, Bash, Python, PowerShell).
- Familiarity with ITIL processes and experience in operations or command center environments.
- Excellent communication and documentation skills.
- Strong problem-solving skills, with a proactive and analytical mindset.
*What we expect from you*:
- Excellent analytical skills and the ability to translate analytical findings into actionable solutions and processes
- Organized personality
- Team player
- Ability to manage stress and meet deadlines while maintaining high levels of accuracy
- Advocates for and embraces the use of new tools and techniques
- Problem solving
- Seeks opportunities to strengthen digital culture through collaborating and sharing knowledge
- Track record of improving processes, leading projects and influencing decision-makers
- Has an informed opinion on digital trends, including fluency with specific digital technologies
*We are an equal opportunity employer and value diversity at our company. We do not discriminate based on race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. We respect and value diversity in our workforce as a source of innovation for the organization.*