Site Reliability Engineer

Staffing and Recruiting

Reston

April 16, 2025

Job Description

Greetings!

We are hiring a Sr. Site Reliability Engineer. Please see the job description below and, if interested, send your resume to prashanth@brathon.com.

W2 and H-1B transfers only.

Site Reliability Engineer (SRE)

Reston, VA

2 years

Job Description:

We are seeking a highly skilled and experienced Site Reliability Engineer (SRE) to join our team. The ideal candidate will have a strong background in cloud platforms, DevOps practices, and modern software development frameworks. The SRE will play a critical role in designing, building, and maintaining highly scalable, fault-tolerant, and secure cloud infrastructure while ensuring operational excellence, high availability, and reliability.

1. Cloud Infrastructure & Automation:

• Design, implement, and manage cloud-based infrastructure using platforms like AWS, Azure, or GCP.

• Utilize Infrastructure-as-Code (IaC) tools such as Terraform, CloudFormation, and Ansible to automate deployments and configurations.

• Build robust automation for anomaly detection, toil reduction, recovery processes, and self-healing mechanisms, and optimize cloud costs.

2. DevSecOps & CI/CD:

• Apply a deep understanding of DevSecOps principles and CI/CD pipelines, using tools like GitLab, Jenkins, SonarQube, Nexus/Artifactory, and Docker.

• Implement security best practices, including IAM roles, RBAC, vulnerability remediation, and SAST/DAST/SCA tools.

3. Observability & Incident Management:

• Design and implement monitoring, logging, and distributed tracing solutions using tools like AWS CloudWatch, Splunk/SignalFx, Dynatrace, and OpenTelemetry.

• Lead root cause analysis, blameless postmortems, and proactive incident management to minimize MTTR and MTTD.

• Define and monitor SLOs, SLIs, and error budgets to ensure system reliability.

4. Microservices & API Management:

• Architect and manage microservices, serverless computing, and RESTful APIs.

• Ensure fault tolerance and resilience using design patterns like Circuit Breaker, Retry, Timeout, and Bulkhead.

5. Chaos Engineering & Resiliency:

• Conduct chaos engineering experiments using tools like AWS FIS and Chaos Toolkit.

• Perform resiliency assessments using Resilience Hub and implement self-healing solutions.

6. Database & Application Support:

• Manage and optimize database technologies such as PostgreSQL, MongoDB, DynamoDB, Oracle, and Redshift.

• Provide production support, including incident response, problem management, and runbook creation. Participate in on-call rotations.

7. Collaboration & Communication:

• Collaborate with cross-functional teams to implement shift-left testing practices (BDD, TDD, Unit, Regression).

• Create and maintain architecture diagrams, knowledge articles, and disaster recovery plans.

• Communicate effectively with stakeholders and demonstrate strong relationship management skills.
