Our mission: To be Earth's most customer-centric company.
Amazon has built a reputation for excellence with a mission to be Earth’s most customer-centric company, a company that customers from all over the globe will recognize, value, and trust for both our products and our service. Amazon Web Services (AWS) is carrying on that tradition while leading the world in cloud technologies.
The Escalation and Event Management (E2M) team is part of the broader AWS Support organization and is dedicated to managing critical escalations and customer-facing communications, and to handling large-scale customer-impacting events. E2M’s purpose is to drive operational excellence and improvements to the overall customer experience.
E2M is looking for people who are detail-oriented, analytical thinkers as well as creative problem solvers, with a strong bias for action. You are not constrained by the notion of “how things are usually done”, and you are equally comfortable operating in the minute detail and coordinating efforts from the forty-thousand-foot view. You confidently act as an advocate for your customer, maintaining composure in dynamic and high-pressure situations. You are comfortable working on highly technical initiatives to consistently improve the AWS customer experience. You excel at working in a dynamic environment while collaborating with some of the smartest people in the industry, and you get excited about owning critical infrastructure services that serve global customers, every second of the day!
Finally, you are passionate about technology with a desire to learn more and do more with AWS.
ABOUT THE ROLE
As members of the AWS Support E2M Event Management team, we work to identify widespread and systemic customer-facing problems for Amazon Web Services. We are responsible for monitoring internal tools to identify customer-impacting issues. When a problem is identified, we ensure the appropriate parties are engaged to drive its resolution, and we act as an advocate for the customer to both report on and manage the customer experience. Because of our unique role as Escalation Engineers, we have front-and-center exposure to all things AWS, including numerous leading-edge technologies.
Every day will bring new and exciting challenges. You will:
- Provide critical incident response/management (including leading calls with internal/external participants) for customers’ critical workloads and AWS Service Teams
- Provide crisp and timely communication on developing issues to relevant internal and external customers
- Drive down mean time to engagement and communication for all incident types
- Facilitate Post-Mortem/Root Cause Analysis after each event to mitigate problem recurrence
- Work with key stakeholders across AWS to improve the customer experience and develop mechanisms that support operational excellence
- Analyze data trends on internal tickets, customer contacts, social media, and network monitors to identify potential issues
- Build a broad understanding of AWS architecture and service inter-dependencies
- Design, build, or collaborate on solutions using automation and self-repair rather than relying on human intervention
- Other duties as required by the organization
QUALIFICATIONS
- 5+ years of experience with Incident/Event Management for mission critical services
- 2+ years of experience in Support Engineering, Network Engineering, Solutions Architecture, or a similar IT role
- Bachelor’s degree in Computer Science, Information Science/Technology, Communications Engineering, Business, or a related field (or 6+ years of relevant work experience)
Candidates who have been most successful after joining our team have demonstrated capabilities in one or more of these areas:
- Excellent written and oral English communication skills
- Familiarity with operating or designing distributed architectures, with the ability to correlate system behaviors based on known inter-dependencies
- Experience creating or designing cloud application architectures with a focus on high availability
- Industry-specific certification(s) such as the AWS SysOps Administrator certification
- Ability to review complex technical details regarding ongoing issues/events and convey the key details to senior stakeholders to facilitate real-time decision making