Site Reliability Engineer (SRE) – AWS Cloud (Terraform & Ansible Focus) - 111

Remote
Posted within the last 24 hours

We are seeking a Site Reliability Engineer (SRE) with deep expertise in AWS cloud infrastructure, Infrastructure as Code (IaC), and large-scale production operations. This role is heavily focused on designing, deploying, automating, and optimizing cloud-native infrastructure in AWS using Terraform and Ansible.

You will work at the intersection of software engineering and cloud operations to build resilient, scalable, secure, and highly automated systems that power mission-critical applications.

Key Responsibilities

AWS Cloud Architecture & Deployments

  • Design, implement, and maintain scalable, secure, and highly available infrastructure in AWS
  • Lead large-scale AWS deployments across multi-account, multi-region environments
  • Architect and optimize solutions using services such as:
    • EC2, EKS, ECS, Lambda
    • VPC, Route 53, CloudFront
    • RDS, DynamoDB, S3
    • IAM, KMS, Secrets Manager
  • Implement solutions aligned with the AWS Well-Architected Framework and AWS best practices

Infrastructure as Code (Terraform Focus)

  • Develop and maintain reusable, modular Terraform code
  • Build CI/CD-driven infrastructure pipelines
  • Manage Terraform state securely (remote backends, locking, environment separation)
  • Enforce policy-as-code and guardrails
  • Review and optimize Terraform modules for performance and maintainability

Configuration Management & Automation (Ansible Focus)

  • Design and maintain Ansible playbooks and roles
  • Automate configuration management and application deployments
  • Integrate Ansible with CI/CD pipelines
  • Ensure idempotent, secure, and maintainable automation

Reliability & Operations

  • Define and implement SLOs, SLIs, and error budgets
  • Lead incident response, root cause analysis (RCA), and postmortems
  • Improve observability using logging, monitoring, and tracing tools
  • Optimize system performance, cost, and resilience
  • Build self-healing infrastructure and automation-first solutions
  • Participate in recurring on-call shifts (required)

DevOps & CI/CD

  • Design and maintain CI/CD pipelines for infrastructure and applications
  • Promote GitOps workflows
  • Integrate automated testing, security scanning, and compliance validation

Security & Compliance

  • Implement least-privilege IAM policies
  • Automate security controls within Terraform and Ansible
  • Ensure compliance with internal and regulatory standards
  • Implement infrastructure security best practices (network segmentation, encryption, patching)

Company Description

GR8 People Software provides the foundation for enterprising companies to cultivate, collaborate with, and engage high-potential talent, recruiting the best more quickly and transforming relationships into better business performance.

We believe that GR8 players do their best work when surrounded by other GR8 players, and we are always looking for individuals who are leaders in their fields to join our team. We are boldly disrupting talent software by bringing the whole talent world together into one powerful, seamless solution. Join us!