Job Description
My Email: dave@brightintelli.com
Visa: US Citizen, Green Card, GC-EAD (W2 Only); H1B (C2C)
Position: API Production Support (100% REMOTE)
Duration: 12 Months contract to hire
Requirements
· MUST be available to work in a 24×7, Level 2 API support and incident response service team
· ON CALL Required
· Expertise in MuleSoft API troubleshooting and support
· Experience using monitoring tools for API management like Azure Monitor, Splunk and Dynatrace
· Familiarity with ServiceNow tools for incident tracking and documentation
· Ability to use enterprise runbooks and wiki documentation for issue resolution
· Ability to collaborate with multiple internal and external stakeholders, including the Tier 3 team and Support Lead
· Preferably a Java background to understand stack traces, logs in order to pinpoint root cause
· Experience with SOAP/REST APIs with Spring Boot and Java microservices
· Experience with MuleSoft AnyPoint Platform including Exchange and monitoring
· Use Azure, Splunk and Dynatrace-based dashboards for monitoring and resolution
· Conduct root cause analysis, escalate issues to internal Tier 3 team as necessary, and engage multiple vendors for resolution when required
· Use enterprise runbooks, wiki documentation, and collaboration with the Tier 3 team or Support Lead
· Provide 24×7 on-call support as a primary or secondary contact (rotation basis)
· Serve as API support on least one major incident call per day, averaging 2 hours
· API-related incidents through ServiceNow and based on Moogsoft tickets
· Troubleshoot and resolve issues within L2 incident criteria
· Ensure timely response and resolution of API-related incidents per agreed SLAs
· Perform initial triage, log analysis, and impact assessment
· Ensure monitoring and alerts are accurate, current, and functional
· Utilize enterprise runbooks and wiki documentation for troubleshooting and resolution
· Participate in Problem and Knowledge Management process as requested
· Observability support for incident management to proactively identify, diagnose and resolve issues
· Conduct detailed RCA (Root Cause Analysis) for recurring or high-impact incidents
· Provide RCA reports with contributing factors, corrective actions, and long-term recommendations
· Work with internal teams to implement preventative measures
· Collaborate with the Tier 3 team or support lead when necessary to resolve complex issues
· Maintain documentation of escalations, including logs, timestamps and resolution progress
· After RCA, determine and contact relevant vendors required for issue resolution
· Provide necessary logs, issue descriptions, and troubleshooting details to vendors
· Track vendor resolution progress, coordinate efforts, and update stakeholders,