  • Full Time
  • Permanent
  • Remote
  • $100k-$150k USD / Year

Walrath Recruiting Inc.

Salary: $100,000 – $150,000
Job Title: Senior Site Reliability Engineer
Job #: 5406
Location: Texas or New York (Remote)
Category: Construction & Engineering
Position Type: Full-time, Direct Hire

Description:
Our client is seeking a Senior Site Reliability Engineer to join their team. This is a full-time, permanent position.

Job Responsibilities:

  • Design, build, and maintain scalable and reliable cloud infrastructure using AWS and other modern cloud technologies.
  • Provide technical leadership and mentorship to junior Site Reliability Engineers, fostering a culture of collaboration, growth, and continuous improvement.
  • Drive operational efficiency by automating workflows, reducing manual interventions, and optimizing processes to increase reliability and scalability.
  • Develop and implement strategies to meet and exceed Service Level Objectives (SLOs) and Service Level Agreements (SLAs), ensuring high system availability and performance.
  • Lead incident response efforts, perform thorough root cause analyses, and implement preventive actions to minimize downtime and service disruptions.
  • Conduct proactive capacity planning and system optimization to identify performance bottlenecks, manage resource usage, and support infrastructure scalability.
  • Apply cloud security best practices, including least-privilege IAM policies, secrets management, and compliance readiness for frameworks such as SOC 2 and ISO 27001.
  • Perform other related duties and contribute to special projects as needed.

Minimum Qualifications:

  • 5+ years of experience in Site Reliability Engineering or a related role with a strong focus on cloud infrastructure and operational excellence.
  • Deep expertise in AWS services and architecture, with hands-on experience building and managing scalable cloud environments.
  • Proficiency in scripting languages such as Python or PowerShell for automation and tooling development.
  • Strong understanding of Linux systems, networking fundamentals, load balancing, and core security principles.
  • Experience implementing GitOps practices and working with CI/CD pipelines using tools like GitHub Actions, Jenkins, ArgoCD, or similar.
  • Skill in Infrastructure as Code (IaC), using tools such as Terraform to provision and manage cloud resources.
  • Demonstrated experience designing and operating observability stacks (e.g., Prometheus, Grafana, ELK), focusing on metrics, alerting, and meeting SLOs.
  • Excellent problem-solving and analytical abilities with keen attention to detail.
  • Strong communication, collaboration, and organizational skills, with the ability to manage time effectively and prioritize tasks in a dynamic environment.

Physical Requirements:

  • Ability to remain seated at a desk and work on a computer for extended periods.
  • Ability to occasionally lift and move items weighing up to 15 pounds.

Hours & Benefits:

  • Monday through Friday, 8 a.m. to 5 p.m.
  • Remote
  • PTO
  • Health, Dental, & Vision
  • 401(k)

For more details on this role, please contact jobs@walrathrecruiting.com or call 518-275-4816. Please reference the specific job number you are inquiring about.

The specific salary/pay rate offered to a candidate may be influenced by a variety of factors, including, but not limited to, the candidate’s experience, education, and work location.

This is an immediate opportunity. The interview process will combine virtual and in-person interviews. If you are qualified for this position, please apply using our secure online form.
