Technical Incident Manager

US-WA-Seattle
1 year ago
Job ID
400158

Job Description

Amazon Web Services’ Technical Operations team (TechOps) is Amazon’s central defense against large-scale incidents and drives operational excellence across all Amazon businesses. Our key offering to Amazon is best-in-class Incident Management. Our engineers are front and center in driving down event duration through experience in operational excellence, best current practices and incident management tools.

We’re looking for engineers who have owned or participated in operational and incident management for at least one large-scale enterprise. You should have a passion for working with new technologies and not be afraid to exercise your creativity in pushing the boundaries of existing technologies.

Running incident management for AWS is unique in that AWS supports more than 30% of the internet’s businesses, and our ability to identify and mitigate issues is of the utmost importance to every Amazon employee. Because of our unique role, you will have limitless exposure to all things Amazon. TechOps engineers are encouraged to build solutions to problems while sharing the benefit of those solutions with other AWS service teams.

This is an excellent opportunity to join one of Amazon’s world-class teams of engineers and work with some of the best and brightest, while also developing your skills and career within one of the most dynamic, innovative and progressive technology companies anywhere. In addition to a stimulating and fun working environment, Amazon offers mentoring programs with experienced engineers, regular tech talks with technology Principals, and well-defined career paths for motivated engineers who want to contribute to our culture of operational excellence and relentless customer-focused technical innovation.

Responsibilities
• Be a technology evangelist and use your deep knowledge to solve business problems
• Reduce mean time to resolution for all incident types
• Design and/or build world-class listening systems
• Adapt and improve operations management systems and processes to accommodate rapid and increasing growth
• Participate in Agile sprints to evolve business processes and technologies
• Create and review documentation, and design new standard operating procedures
• Identify and troubleshoot recurring platform issues and engage service owners to drive resolution
• Automate tasks through creation and maintenance of scripts and tools
• Respond to and complete customer requests within SLA via a trouble ticketing system
• Take part in a “follow the sun” rotation split between Seattle, Dublin and Sydney sites, including weekends and holidays
• Mentor peers in your areas of technical and operational strength
• Participate in the interviewing process

AWS is an equal opportunity employer.

Basic Qualifications

Required:
• A degree in Computer Science or at least 8 years of relevant experience in a large-scale online technical operations environment
• Excellent English language written and verbal communication skills to facilitate efficient and effective interaction with peers and customers
• Effective organizational skills to maintain a consistently high standard of operations in a busy environment
• Development/scripting skills in at least one interpreted language (e.g. Perl/Python/Ruby) as well as shell
• Solid grasp of Linux and networking fundamentals
• Excellent troubleshooting skills and a commitment to document findings
• Experience driving collaborative projects from conception to delivery
• Experience in Agile/Scrum or a related collaborative workflow

Highly desirable:
• Knowledge of best current practice frameworks (ITIL, COBIT), particularly incident, problem and change management
• Confidence to drive and manage large conference calls
• Understanding of routing protocols to help facilitate troubleshooting and remediation of networking issues
• Working knowledge of a compiled language

Preferred Qualifications

• Experience dealing effectively with customers during problem resolution and operating efficiently under pressure
• Effective prioritization and time management