Job Description
**Visa: USC or GC or GC-EAD (W2) / H1B (C2C)**
**Best way to reach:**
Email:
**dave@brightintelli.com**
**Position: API Production Support (100% REMOTE)**
**Duration: Multi-year Contract**
(3 Openings. 24×7, On-Call Production Support Team)
**Requirements:**
* **MUST be available to work in a 24×7, Level 2 API support and incident response service team**
* **ON CALL Required**
* **Expertise in MuleSoft API troubleshooting and support**
* **Experience using monitoring tools for API management like Azure Monitor, Splunk and Dynatrace**
* **Familiarity with ServiceNow tools for incident tracking and documentation**
* Ability to use enterprise runbooks and wiki documentation for issue resolution
* Ability to collaborate with multiple internal and external stakeholders, including the Tier 3 team and Support Lead
* **Preferably a Java background to understand stack traces** and logs in order to pinpoint root cause
* **Experience with SOAP/REST APIs with Spring Boot and Java microservices**
* **Experience with MuleSoft Anypoint Platform, including Exchange and monitoring**
**Responsibilities:**
* **Use Azure, Splunk and Dynatrace-based dashboards for monitoring and resolution**
* Conduct root cause analysis, escalate issues to internal Tier 3 team as necessary, and engage multiple vendors for resolution when required
* Use enterprise runbooks and wiki documentation, and collaborate with the Tier 3 team or Support Lead, to resolve issues
* Provide 24×7 on-call support as a primary or secondary contact (rotation basis)
* **Serve as API support on at least one major incident call per day, averaging 2 hours**
* Handle API-related incidents through ServiceNow, based on Moogsoft tickets
* **Troubleshoot and resolve issues within L2 incident criteria**
* Ensure timely response and resolution of API-related incidents per agreed SLAs
* Perform initial triage, log analysis, and impact assessment
* Ensure monitoring and alerts are accurate, current, and functional
* Utilize enterprise runbooks and wiki documentation for troubleshooting and resolution
* Participate in Problem and Knowledge Management process as requested
* Provide observability support for incident management to proactively identify, diagnose, and resolve issues
* Conduct detailed RCA (Root Cause Analysis) for recurring or high-impact incidents
* Provide RCA reports with contributing factors, corrective actions, and long-term recommendations
* Work with internal teams to implement preventative measures
* Collaborate with the Tier 3 team or support lead when necessary to resolve complex issues
* Maintain documentation of escalations, including logs, timestamps and resolution progress
* After RCA, determine and contact relevant vendors required for issue resolution
* Provide necessary logs, issue descriptions, and troubleshooting details to vendors
* Track vendor resolution progress, coordinate efforts, and update stakeholders