The vision of the Azure Production Infrastructure Engineering (PIE) group is to make it easy for everyone to create, consume, and manage planetary-scale, reliable cloud production services and infrastructure to achieve more. As a team, we bring together significant and complementary capabilities with tooling, infrastructure, monitoring and insights in new ways to increase our perspective. Our diversity of knowledge and experience comes together for the benefit of our users, our colleagues, our business, and ourselves.
If you enjoy analyzing complicated problems, coming up with creative solutions, and working in focused teams to build reliable and novel solutions, we want you to join our Site Reliability Engineering (SRE) team.
The Azure SRE team is looking for engineers with broad experience in distributed systems. SREs are people who take engineering-based approaches to solving operations problems: we like infrastructure, we like seeing how big, complicated things work, and most importantly, we gain great satisfaction from making them better.
You will be working across Azure with a focus on increasing the quality, performance, and reliability of the most essential services within Azure. The infrastructure SRE team works to iterate on and grow SRE practices from idea to planetary-scale adoption.
Our team has a wide variety of backgrounds, from Computer Science, Mathematics, and Engineering to Physics, Philosophy, Psychology, and English. Our diversity of knowledge and experience comes together for the benefit of our billions of daily users, our business, our colleagues, and ourselves.
As SREs, we are members of the Production Infrastructure Engineering (PIE) team and our vision is to make it easy for everyone to create, consume, and manage planetary-scale, reliable cloud production services and infrastructure to achieve more. As a team, we bring our complementary capabilities together with tooling and infrastructure in new ways to increase reliability through improved service health, incident response, planning, analysis, and change management.
If you are excited by this type of challenge, and you love to work in groups of people who are similarly excited, come join us. We value the input of people who aren't afraid to be learning all the time, who embrace mistakes as they show the way forward, and who are excited to continuously improve both services and themselves. We strongly believe that diverse experiences and backgrounds, together with an environment where everyone can feel safe to contribute their own insights in a data-driven, objective, and supportive way, are the key to making the best workplace possible, and the best workplace makes the best products and services. Not only is it the smart thing, it's the right thing.
- Work across Azure's internal systems and services to design, develop, launch, and improve platforms and processes that result in improved end-to-end reliability and maintainability
- Work across Azure PIE to drive processes, systems, and architectures that help deliver insights and automation to simplify the complex world of planetary scale services
- Communicate effectively and partner well with other disciplines of the project team to deliver high-quality solutions from idea to implementation
- Develop clean and thorough designs that exemplify quality, simplicity, and maintainability while achieving global scalability
- Design systems that prioritize the customer perspective and experience
- Quickly adapt and apply new technologies, tools, methods, and processes from both internal and external sources
- Influence design, implementation, and architectural direction
- Drive architectural consolidation and simplification
- Exemplify the Microsoft values of leveraging the work of others and helping others be successful through your behaviors and actions
- Bachelor of Science degree in Computer Science, or 5+ years of experience in distributed systems implementation and management or software development
- 3+ years of experience responding to, diagnosing, and mitigating production issues and improving systems related to production issue mitigation
- 3+ years of programming experience in automation (scripting languages such as Bash, Python, and PowerShell, or compiled languages such as C and C#, are most relevant, but others are acceptable)
- 2+ years of experience designing, building, or implementing distributed service health and telemetry
- Collaboration to accomplish large projects with excellent communication and demonstrated initiative
- Awareness of, and ability to reason about, modern software & systems architectures, including load-balancing, queueing, caching, distributed systems failure modes, and microservices
- Associated troubleshooting skills, including the ability to follow service dependency chains across arbitrary network steps
- Experience running large scale cloud systems
- Ability to analyze, understand, and solve complex problems by leveraging and extending existing technology
- Willingness and ability to respectfully challenge the status quo
- Able to operate in ambiguity and drive clarity through partnerships
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include but are not limited to the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.
Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.