Senior Principal Incident Response Analyst

Autodesk

Autodesk gives you the power to make anything.

Job Requisition ID #

22WD57047

Position Overview

Want to help make a better world? As an Incident Commander and Analyst at Autodesk, you can do just that. How is this possible? You will shape the frontier of customer-facing cloud services support at Autodesk as an elite Senior Incident Commander and Analyst. Your finger will be on the pulse of the Autodesk Cloud Services that empower our customers to “Make Anything”. At Autodesk, we believe that incidents are unplanned investments, so your focus will be as a member of a cadre that drives incidents to resolution and extracts deep learnings from them. Reporting to the customer-facing technology operations organization, you will demonstrate the ability to operate independently and collaborate with cross-functional teams to ensure that customer impact is mitigated and incidents are fully understood. Above all, you help Autodesk deliver the highest quality of service for all the customers we serve.

Responsibilities

Communication

  • You are a superior communicator. Written, verbal, and nonverbal language are all essential skills to be an effective and trustworthy leader.
  • You understand how to negotiate across multiple stakeholders and points of view.
  • You can develop and maintain strong relationships with team members through mutually earned respect and the ability to persuade with facts, logic, enthusiasm, and a proven track record.

Incident Management

  • Act in the role of an Incident Commander to facilitate high-severity incident triage. Ensure that high-severity incidents achieve the necessary cross-functional engagement to drive them to resolution in a timely fashion. Communicate clear updates to stakeholders in a timely fashion. Participate in the on-call rotation for the Incident Commander role, including after hours and weekends.
  • Participate in regular review of open incidents and evaluate whether Level 1 (Cloud SOC) and Level 2 (DevOps) teams are remediating incidents in a timely and effective manner.
  • Drive the use of incident metrics and perform a first-level analysis of incident data to gain insights as to service performance and patterns of emerging issues.
  • Run regular Incident Review meetings with Cloud Operations cross-functional teams. Provide focus for the meetings to maximize benefit and respect the valuable time of the cross-functional representatives who attend.
  • Provide oversight for Cloud Service Operation Center Level 1 engineer performance. Act as one escalation point for the Cloud Services Operation Center manager for investigation of issues regarding process or performance.

Incident Analysis

  • Run post-incident debrief meetings to drive engagement with incident responders.
  • Analyze incidents using an interview-based approach to extract deep learnings from them, allowing the organization’s knowledge to grow as a result.
  • Engage with cross-functional Engineering Teams to ensure that Incident follow-up (forensic) activities are happening in a timely fashion (as governed by our published internal processes).
  • Develop and implement data analyses, data collection, and other strategies that optimize platform resiliency and quality.
  • Work with Autodesk Engineering teams and leaders to recommend improvements based on analysis. Periodically review engagement/follow-through of cross-functional teams to ensure forward progress is being made.

Service on-boarding and documentation

  • Act as a facilitator for on-boarding of new services to the Cloud SOC. Conduct meetings with cross-functional teams to educate them and act as their mentor through the on-boarding process.
  • Perform required periodic review of new and revised runbooks, evaluating them for their efficacy and relevance. Confer with subject matter experts in Forge and Engineering regarding enhancement of existing runbook documentation.

Minimum Qualifications

  • 15-17 years of experience in a similar operations function within a high availability (HA), 24×7, mission critical operations environment providing or leading front-line support for a public-facing service with a high-volume, paying customer base.
  • 3-5 years’ experience leading or defining processes for high availability production environments or services.
  • Bachelor’s degree in computer science or a related technology field, or equivalent experience.
  • Proficient in communicating effectively with a wide range of audiences in both written and oral form. This includes leading online meetings in English, with many participants, about major customer-impacting incidents.
  • Ability to participate in an on-call rotation for the Incident Commander role including off-hours and weekends.
  • Must be process-oriented, energetic, and analytical.
  • Ability to understand how technical deployments and outages impact customers and partners, and the experience to drive mitigation.
  • Solid understanding of basic Amazon Web Services infrastructure services, with exposure to serverless technologies, such as Aurora and Lambda.
  • Solid understanding of concepts and technologies such as, but not limited to: cloud computing, server clusters, high availability network configurations, DNS, SMTP, NTP, NAS and HTTP.
  • Able to assimilate knowledge of new systems quickly and be adaptable.

Preferred Qualifications

  • 8+ years of experience in a similar operations function within a high availability (HA), 24×7, mission critical operations environment.
  • Experience with Jira, Confluence and ServiceNow.
  • Knowledge of incident analysis, problem, and change management practices.
  • Understanding of New Relic and similar observability tools.
  • Experience with defining and maintaining operational processes.
  • Experience administering Amazon Web Services accounts and instances, or network infrastructure (switches, routers, firewalls, etc.).
  • Experience defining, analyzing, and maintaining Operational Reports such as (but not limited to) SLA and Outage reports, Operations Performance reports, Maintenance reports, Operations Containment reports, etc. for internal as well as external consumption.
  • Experience with Managed Service Providers, particularly with global accounts.

About Autodesk

With Autodesk software, you have the power to Make Anything. The future of making is here, bringing with it radical changes in the way things are designed, made, and used. It’s disrupting every industry: architecture, engineering, and construction; manufacturing; and media and entertainment. With the right knowledge and tools, this disruption is your opportunity. Our software is used by everyone – from design professionals, engineers and architects to digital artists, students and hobbyists. We constantly explore new ways to integrate all dimensions of diversity across our employees, customers, partners, and communities. Our ultimate goal is to expand opportunities for anyone to imagine, design, and make a better world.

At Autodesk, we’re building a diverse workplace and an inclusive culture to give more people the chance to imagine, design, and make a better world. Autodesk is proud to be an equal opportunity employer and considers all qualified applicants for employment without regard to race, color, religion, age, sex, sexual orientation, gender, gender identity, national origin, disability, veteran status or any other legally protected characteristic. We also consider for employment all qualified applicants regardless of criminal histories, consistent with applicable law.

Are you an existing contractor or consultant with Autodesk? Please search for open jobs and apply internally (not on this external site). If you have any questions or require support, contact Autodesk Careers.
