Cloud infrastructure specialist
The ideal candidate will own and evolve the cloud infrastructure that powers AI-driven products, ensuring scalability, reliability, and security across distributed systems.
* Lead cross-cutting infrastructure projects not tied to app/platform changes, such as domain migrations for pipelines and customer-facing sites, networking redesigns to avoid IP exhaustion, and medium-sized automation initiatives.
* Design and evolve AWS networking and environments to support more dev/test sites, future SaaS infrastructure, and sandboxes where dev teams can experiment safely.
* Define and implement a disaster recovery (DR) strategy, including a secondary region or DR zone, to improve resilience and recovery time.
* Implement cost, reliability, observability, and monitoring improvements across services, using metrics and logs to guide optimization.
* Design, maintain, and evolve AWS-based infrastructure, including ECS, RDS/Aurora, Lambda, S3, CloudWatch, CDK, VPCs, subnets, security groups, Route 53, and load balancers.
* Upgrade AWS Aurora Postgres clusters to the latest supported versions, ensuring high availability, data integrity, and minimal downtime.
* Own and improve CI/CD pipelines using GitHub Actions for production deployments, covering containerized services and Lambda-based workloads.
* Manage infrastructure as code using AWS CDK (TypeScript) and other IaC practices to drive automation, consistency, and repeatability.
* Consolidate and optimize shared tooling, utility scripts, and reusable components across multiple repositories.
* Collaborate with engineering and leadership to define the infrastructure roadmap, influence architecture decisions, and promote DevOps culture and best practices.
Required skills and qualifications
* 6+ years of experience in DevOps/site reliability engineering, including ownership of multi-quarter infrastructure projects or leadership roles.
* Strong expertise with AWS services such as ECS, RDS/Aurora, Lambda, S3, CloudWatch, CDK, and core networking (VPC design, routing, subnets, security groups, NAT, DNS/Route 53, load balancers).
* Proficient with Docker, GitHub Actions, and modern CI/CD patterns for cloud-native applications.
* Deep knowledge of Postgres administration, including upgrades, backups, and performance tuning.
* Strong scripting and automation skills with TypeScript, Python, or Bash.
* Proven ability to architect scalable, secure, and reliable cloud environments, including DR strategies and cost-optimization practices.
* Experience improving observability (metrics, logs, traces, alerting) and using it to guide reliability and cost improvements.
* Excellent communication and collaboration skills, with a track record of working closely with engineers and stakeholders to execute infra roadmaps.
* Self-driven, practical, and detail-oriented; comfortable making decisions, documenting trade-offs, and delivering high-quality results with limited supervision.
* Upper-intermediate English level.
Benefits
* Professional growth: Accelerate your professional journey with mentorship, tech talks, and personalized growth roadmaps.
* Competitive compensation: We match your ever-growing skills, talent, and contributions with competitive USD-based compensation and budgets for education, fitness, and team activities.
* A selection of exciting projects: Work on modern solutions for top-tier clients, including Fortune 500 enterprises and leading product brands.
* Flextime: Tailor your schedule for an optimal work-life balance, with the option of working from home or from the office, whichever makes you happiest and most productive.
Perks and benefits
* Meaningful career advancement opportunities
* Competitive salary and benefits package
* Collaborative and dynamic work environment
* Flexible work arrangements