
Walrath Recruiting Inc.
Salary: $100,000 – $150,000
Job Title: Senior Site Reliability Engineer
Job #: 5406
Location: Texas or New York (Remote)
Category: Construction & Engineering
Position Type: Full-time, Direct Hire
Description:
Our client is currently seeking a Senior Site Reliability Engineer to join their team. This is a full-time permanent position.
Job Responsibilities:
- Design, build, and maintain scalable and reliable cloud infrastructure using AWS and other modern cloud technologies.
- Provide technical leadership and mentorship to junior Site Reliability Engineers, fostering a culture of collaboration, growth, and continuous improvement.
- Drive operational efficiency by automating workflows, reducing manual interventions, and optimizing processes to increase reliability and scalability.
- Develop and implement strategies to meet or exceed Service Level Objectives (SLOs) and Service Level Agreements (SLAs), ensuring high system availability and performance.
- Lead incident response efforts, perform thorough root cause analyses, and implement preventive actions to minimize downtime and service disruptions.
- Conduct proactive capacity planning and system optimization to identify performance bottlenecks, manage resource usage, and support infrastructure scalability.
- Apply cloud security best practices, including least-privilege IAM policies, secrets management, and compliance readiness for frameworks such as SOC 2 and ISO 27001.
- Perform other related duties and contribute to special projects as needed.
Minimum Qualifications:
- 5+ years of experience in Site Reliability Engineering or a related role with a strong focus on cloud infrastructure and operational excellence.
- Deep expertise in AWS services and architecture, with hands-on experience building and managing scalable cloud environments.
- Proficiency in scripting languages such as Python or PowerShell for automation and tooling development.
- Strong understanding of Linux systems, networking fundamentals, load balancing, and core security principles.
- Experience implementing GitOps practices and working with CI/CD pipelines using tools like GitHub Actions, Jenkins, ArgoCD, or similar.
- Skilled in Infrastructure as Code (IaC) using tools such as Terraform to provision and manage cloud resources.
- Demonstrated experience designing and operating observability stacks (e.g., Prometheus, Grafana, ELK), focusing on metrics, alerting, and meeting SLOs.
- Excellent problem-solving and analytical abilities with keen attention to detail.
- Strong communication, collaboration, and organizational skills, with the ability to manage time effectively and prioritize tasks in a dynamic environment.
Physical Requirements:
- Ability to remain seated at a desk and work on a computer for extended periods.
- Must occasionally be able to lift and move items weighing up to 15 pounds.
Hours & Benefits:
- Monday to Friday, 8 a.m. to 5 p.m.
- Remote
- PTO
- Health, Dental, & Vision
- 401(k)
For more details on this role, please contact jobs@walrathrecruiting.com or call 518-275-4816. Please reference the specific job number you are inquiring about.
The specific salary/pay rate offered to a candidate may be influenced by a variety of factors including but not limited to the candidate’s experience, education, and work location.
This is an immediate opportunity. The interview process will consist of a combination of virtual and in-person interviews. If you are qualified for this position, please apply using our secure online form.