• AWS Region Operational Excellence Manager

    Location US-VA-Herndon
    Posted Date 3 months ago (4/13/2018 12:13 PM)
    Job ID
  • Job Description

    Are you an operations truffle-hound? Do you have the knack to sniff out complex operational issues, dig them out, and systematically solve them?

    We are seeking a talented Region Operational Excellence (ROE) Manager to help us evolve our practices, policies, and processes to improve the commercial cloud services customer experience for the national security, defense, and law enforcement communities. Region operations managers are experts in a region's purpose, policies, and operational posture. We know the customer's use cases and operations best practices, and we drive operations to meet customer needs at Amazon scale. We drive out snowflakes and drive in region operational parity to ensure these regions operate and deliver consistently with Commercial Regions.

    Amazon Web Services (AWS) is obsessed with raising the bar in our pursuit of delivering secure, highly available, software-defined infrastructure services. We develop new practices and sustain existing IT services through our rapid, customer-driven pace of innovation. We build, operate, and maintain our global infrastructure through continuous integration/continuous deployment automation. As a region operations manager, you know there is always room to improve operations for greater efficiency to meet customer needs. You are both reactive and proactive with operations. You have a sixth sense for areas of operations to improve, and you use your high judgment to act on and solve them. When you see a problem area arise at a micro level, you identify and solve for cross-team risks at the macro level. You iterate on practice, policy, and automation initiatives to continually improve operations for the benefit of our customers.

    In this role you will
    • Coordinate operations work with multiple AWS service team managers to plan, deploy, and support large scale AWS services and features
    • Be obsessed with our customers' needs and drive them back into AWS service teams
    • Dive into the data in operational support systems and dig out operations data artifacts to help us answer Operational Risk, Effort, and Priority questions for our service teams and customers
    • Identify, collect, and refine supporting key performance indicators (KPIs) to produce informed technical designs for region-wide operations
    • Monitor service trends to identify opportunities for improvements within existing frameworks, tools and processes to continuously improve systems
    • Be a point of escalation for operational events, support best practices, and drive operational issues to resolution
    • Provide operational clarity, communicate complex concepts across the region's diverse technical domains, and drive organizational resolution
    • Audit and improve system metrics, alarms, and architectures to increase availability and velocity at scale
    • Devise, develop, and champion AWS best practices within and between teams
    • Drive operational priorities to improve operations and deliver results
    • Define ways to measure Risk and Velocity by producing scorecards from operational data artifacts
    • Work with tools teams to define mechanisms to measure KPIs, correlate them into cause/effect relationships, and use them to drive operationally excellent results
    This position is perfect for someone who can think about the big picture and iterate operational improvements to get us there. If you insist on high operational standards and can think big to improve operations, this is the position for you.

    Basic Qualifications

    • Bachelor’s Degree in Computer Science or Systems Engineering or related discipline
    • 10+ years of experience managing DevOps teams in a Linux/Unix environment in large-scale, 24x7 production environments
    • 4+ years with data analysis and business metrics at large scales
    • 4+ years managing multiple cross-organizational, highly demanding projects
    • 4+ years of experience with common scripting or programming languages such as Java, Ruby, Bash, Python, or Perl
    • This position requires that the candidate selected be a U.S. citizen and obtain and maintain an active TS/SCI security clearance with polygraph.

    Preferred Qualifications

    • Network administration in a 24x7 environment
    • Experience with computer security and compliance: network security, application security, security protocols, cryptography
    • Experience with agile software development practices
    • Proven ability to troubleshoot large, complex distributed systems to identify root causes
    • Experience with service-oriented architecture and current web service technologies
    • Exposure to machine learning data analysis techniques
    • Excellent written and oral communication skills
    • Excellent technical operations process and procedural documentation skills
    • Excellent problem-solving skills with a strong attention to detail
    • Proven ability to define priorities and drive results across multiple DevOps teams, resulting in high customer satisfaction
    • Ability to dive deep to analyze complex issues, collect and analyze data to produce data-driven operational performance improvement recommendations, apply data analysis to identify root cause, solve problems, and automate repetitive tasks.

    Amazon is an Equal Opportunity-Affirmative Action Employer – Minority / Female / Disability / Veteran / Gender Identity / Sexual Orientation.

    **If you have any questions or would like to follow up on your application please contact Renee Rushefski directly at rrushefs@amazon.com.**