API Production Support

Job Description

My Email: dave@brightintelli.com

Visa: US Citizen, Green Card, GC-EAD (W2 Only); H1B (C2C)

Position: API Production Support (100% REMOTE)

Duration: 12-month contract-to-hire

Requirements

· MUST be available to work in a 24×7, Level 2 API support and incident response service team

· ON CALL Required

· Expertise in MuleSoft API troubleshooting and support

· Experience using monitoring tools for API management like Azure Monitor, Splunk and Dynatrace

· Familiarity with ServiceNow tools for incident tracking and documentation

· Ability to use enterprise runbooks and wiki documentation for issue resolution

· Ability to collaborate with multiple internal and external stakeholders, including the Tier 3 team and Support Lead

· Preferably a Java background, to understand stack traces and logs in order to pinpoint root causes

· Experience with SOAP/REST APIs with Spring Boot and Java microservices

· Experience with MuleSoft Anypoint Platform, including Exchange and monitoring

· Use Azure, Splunk and Dynatrace-based dashboards for monitoring and resolution

· Conduct root cause analysis, escalate issues to internal Tier 3 team as necessary, and engage multiple vendors for resolution when required

· Use enterprise runbooks and wiki documentation, and collaborate with the Tier 3 team or Support Lead, to resolve issues

· Provide 24×7 on-call support as a primary or secondary contact (rotation basis)

· Serve as API support on at least one major incident call per day, averaging 2 hours

· Handle API-related incidents raised through ServiceNow and based on Moogsoft tickets

· Troubleshoot and resolve issues within L2 incident criteria

· Ensure timely response and resolution of API-related incidents per agreed SLAs

· Perform initial triage, log analysis, and impact assessment

· Ensure monitoring and alerts are accurate, current, and functional

· Utilize enterprise runbooks and wiki documentation for troubleshooting and resolution

· Participate in Problem and Knowledge Management process as requested

· Provide observability support for incident management to proactively identify, diagnose, and resolve issues

· Conduct detailed RCA (Root Cause Analysis) for recurring or high-impact incidents

· Provide RCA reports with contributing factors, corrective actions, and long-term recommendations

· Work with internal teams to implement preventative measures

· Collaborate with the Tier 3 team or support lead when necessary to resolve complex issues

· Maintain documentation of escalations, including logs, timestamps and resolution progress

· After RCA, determine and contact relevant vendors required for issue resolution

· Provide necessary logs, issue descriptions, and troubleshooting details to vendors

· Track vendor resolution progress, coordinate efforts, and update stakeholders
