Site Reliability Engineering (SRE) Practitioner

Course Description

DURATION - 24 Hours

Introduces a range of practices for advancing service reliability engineering through a mixture of automation, organizational ways of working and business alignment. Tailored for those focused on large-scale service scalability and reliability.

OVERVIEW
The SRE (Site Reliability Engineering) Practitioner course introduces ways to scale services economically and reliably in an organization. It explores strategies to improve agility, cross-functional collaboration, and transparency into service health, with the aim of building resiliency through design, automation and closed-loop remediation.

The course aims to equip participants with the practices, methods, and tools to engage people across the organization involved in reliability using real-life scenarios and case stories. Upon completion of the course, participants will have tangible takeaways to leverage when back in the office such as implementing SRE models that fit their organizational context, building advanced observability in distributed systems, building resiliency by design and effective incident responses using SRE practices.

The course was developed by drawing on key SRE sources, engaging with thought leaders in the SRE space and working with organizations embracing SRE to extract real-life best practices. It is designed to teach the key principles and practices necessary for starting SRE adoption.

This course positions learners to successfully complete the SRE Practitioner certification exam.

COURSE OBJECTIVES
At the end of the course, the following learning objectives are expected to be achieved:

Practical view of how to successfully implement a flourishing SRE culture in your organization.
The underlying principles of SRE, what SRE is not (its anti-patterns), and how to recognize and avoid those anti-patterns.
The organizational impact of introducing SRE.
Mastering SLIs and SLOs in a distributed ecosystem and extending the use of Error Budgets beyond the usual to innovate and avoid risks.
Building security and resilience by design in a distributed, zero-trust environment.
How to implement full-stack observability and distributed tracing, and bring about an Observability-driven development culture.
Curating data using AI to move from reactive to proactive and predictive incident management, and using DataOps to build clean data lineage.
Why Platform Engineering is so important in building consistency and predictability into SRE culture.
Implementing practical Chaos Engineering.
Major incident response responsibilities for an SRE based on the incident command framework, and examples of the anatomy of unmanaged incidents.
Perspective on why SRE can be considered the purest implementation of DevOps.
SRE Execution model
Understanding the SRE role and understanding why reliability is everyone’s problem.
SRE success story learnings

AUDIENCE
The target audience for the SRE Practitioner course is professionals including:

Anyone focused on large-scale service scalability and reliability
Anyone interested in modern IT leadership and organizational change approaches
Business Managers
Business Stakeholders
Change Agents
Consultants
DevOps Practitioners
IT Directors
IT Managers
IT Team Leaders
Product Owners
Scrum Masters
Software Engineers
Site Reliability Engineers
System Integrators
Tool Providers

LEARNER MATERIALS

Twenty-four (24) hours of instructor-led training and exercise facilitation
Learner Manual (excellent post-class reference)
Participation in unique exercises designed to apply concepts
Sample documents, templates, tools and techniques
Access to additional value-added resources and communities

PREREQUISITES
It is highly recommended that learners attend the SRE Foundation course with an accredited DevOps Institute Education Partner and earn the SRE Foundation certification prior to attending the SRE Practitioner course and exam. An understanding and knowledge of common SRE terminology, concepts, principles and related work experience are recommended.

CERTIFICATION EXAM
Successfully passing (65%) the 90-minute examination, consisting of 40 multiple-choice questions, leads to the SRE Practitioner certificate. The certification is governed and maintained by DevOps Institute.

COURSE OUTLINE

Course Introduction

Module 1: SRE Anti-patterns

Rebranding Ops or DevOps or Dev as SRE
Users notice an issue before you do
Measuring until my Edge
False positives are worse than no alerts
Configuration management trap for snowflakes
The Dogpile: Mob incident response
Point fixing
Production Readiness Gatekeeper
Fail-Safe really?

Module 2: SLO is a Proxy for Customer Happiness

Define SLIs that meaningfully measure the reliability of a service from a user’s perspective
Defining System boundaries in a distributed ecosystem for defining correct SLIs
Use error budgets to help your team have better discussions and make better data-driven decisions
Overall, Reliability is only as good as the weakest link on your service graph
Error thresholds when 3rd party services are used
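
The "weakest link" and error budget ideas in this module can be illustrated with a little arithmetic. The sketch below is not course material: the service names and availability figures are hypothetical, and it assumes that a request needs every dependency to succeed and that dependencies fail independently.

```python
from math import prod


def serial_availability(availabilities):
    """Availability of a request path that needs every dependency to succeed.

    Assumes independent failures, so the individual availabilities multiply.
    """
    return prod(availabilities)


def error_budget_minutes(slo, window_days=30):
    """Allowed downtime, in minutes, for an availability SLO over the window."""
    return (1.0 - slo) * window_days * 24 * 60


# Hypothetical service graph: frontend -> API -> third-party payment provider.
deps = {"frontend": 0.9995, "api": 0.999, "payments_3rd_party": 0.995}

composite = serial_availability(deps.values())
weakest = min(deps.values())

print(f"weakest dependency:     {weakest:.4%}")
print(f"composite availability: {composite:.4%}")
print(f"30-day budget at a 99.9% SLO: {error_budget_minutes(0.999):.1f} minutes")
```

Under these assumptions the composite availability is lower than that of the weakest dependency, which is why an SLO that relies on a third-party service cannot sensibly be set tighter than what that service delivers.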

Module 3: Building Secure and Reliable Systems

SRE and their role in Building Secure and Reliable systems
Design for Changing Architecture
Fault tolerant Design
Design for Security
Design for Resiliency
Design for Scalability
Design for Performance
Design for Reliability
Ensuring Data Security and Privacy

Module 4: Full-Stack Observability

Modern Apps are Complex & Unpredictable
Slow is the new down
Pillars of Observability
Implementing Synthetic and End user monitoring
Observability driven development
Distributed Tracing
What happens to Monitoring?
Instrumenting using Libraries and Agents

Module 5: Platform Engineering and AIOps

Taking a Platform Centric View solves Organizational scalability challenges such as fragmentation, inconsistency and unpredictability.
How do you use AIOps to improve Resiliency?
How can DataOps help you in the journey?
A simple recipe to implement AIOps
Indicative measurement of AIOps

Module 6: SRE & Incident Response Management

SRE Key Responsibilities towards incident response
DevOps & SRE and ITIL
OODA and SRE Incident Response
Closed Loop Remediation and the Advantages

Module 7: Chaos Engineering

Module 8: SRE is the Purest form of DevOps

Post-class assignments/exercises

Non-abstract Large Scale Design (after Day 1)
Engineering Instrumentation - Instrumenting Gremlin (after Day 2)

Site Reliability Engineering (SRE) Practitioner Course with Official Exam

£250.00

Notification of access will be emailed directly by PeopleCert®, providing full details of how to access the course. A device with an internet connection and video and audio capability will be required.
American Samoa
Antarctica
Bahamas
Bahrain
Barbados
Bermuda
Bouvet Island
Brunei Darussalam
Canada
Cayman Islands
China
Cook Islands
Dubai
Falkland Islands (Malvinas)
French Polynesia
French Southern Territories
Guam
Heard Island And McDonald Islands
Hong Kong
Israel
Korea, Democratic People's Republic Of
Korea, Republic Of
Kuwait
Macao
Malaysia
Mayotte
Nauru
New Caledonia
New Zealand
Norfolk Island
Northern Mariana Islands
Oman
Palau
Panama
Puerto Rico
Qatar
Saudi Arabia
Singapore
South Georgia And The South Sandwich Islands
South Korea
Taiwan
Taiwan, Province Of China
Thailand
United Arab Emirates
United States
Åland Islands
Andorra
Australia
Austria
Belgium
Croatia
Cyprus
Czech Republic
Denmark
Estonia
Faroe Islands
Finland
France
Germany
Gibraltar
Greece
Greenland
Guernsey
Holy See (Vatican City State)
Hungary
Iceland
Ireland
Isle Of Man
Italy
Japan
Jersey
Latvia
Liechtenstein
Lithuania
Luxembourg
Malta
Monaco
Netherlands
Norway
Poland
Portugal
Romania
San Marino
Slovakia
Slovenia
Spain
Svalbard And Jan Mayen
Sweden
Switzerland
Turkey
United Kingdom
Afghanistan
Albania
Algeria
Angola
Anguilla
Antigua And Barbuda
Argentina
Armenia
Aruba
Azerbaijan
Bangladesh
Belarus
Belize
Benin
Bhutan
Bolivia
Bonaire, Saint Eustatius and Saba
Bosnia And Herzegovina
Botswana
Brazil
British Indian Ocean Territory
Bulgaria
Burkina Faso
Burundi
Cambodia
Cameroon
Cape Verde
Central African Republic
Chad
Chile
Christmas Island
Cocos (Keeling) Islands
Colombia
Comoros
Congo
Congo, Democratic Republic of the
Costa Rica
Côte d'Ivoire
Cuba
Curaçao
Djibouti
Dominica
Dominican Republic
Ecuador
Egypt
El Salvador
Equatorial Guinea
Eritrea
Eswatini
Ethiopia
F.Y.R.O.M.
Fiji
French Guiana
Gabon
Gambia
Georgia
Ghana
Grenada
Guadeloupe
Guatemala
Guinea
Guinea-Bissau
Guyana
Haiti
Honduras
India
Indonesia
Iran, Islamic Republic Of
Iraq
Jamaica
Jordan
Kazakhstan
Kenya
Kiribati
Kosovo
Kyrgyzstan
Lagos
Lao People's Democratic Republic
Laos
Lebanon
Lesotho
Liberia
Libyan Arab Jamahiriya
Madagascar
Malawi
Maldives
Mali
Marshall Islands
Martinique
Mauritania
Mauritius
Mexico
Micronesia, Federated States Of
Moldova, Republic Of
Mongolia
Montenegro
Montserrat
Morocco
Mozambique
Myanmar
Namibia
Nepal
Netherlands Antilles
Nicaragua
Niger
Nigeria
Niue
North Macedonia
Pakistan
Palestinian Territory, Occupied
Papua New Guinea
Paraguay
Peru
Philippines
Pitcairn
Reunion
Russia
Russian Federation
Rwanda
Saint Barthélemy
Saint Helena
Saint Kitts And Nevis
Saint Lucia
Saint Martin
Saint Pierre And Miquelon
Saint Vincent And The Grenadines
Samoa
Sao Tome and Principe
Senegal
Serbia
Seychelles
Sierra Leone
Solomon Islands
Somalia
South Africa
South Sudan
Sri Lanka
Sudan
Suriname
Swaziland
Syrian Arab Republic
Tajikistan
Tanzania, United Republic Of
Timor-Leste
Togo
Tokelau
Tonga
Trinidad And Tobago
Tunisia
Turkmenistan
Turks And Caicos Islands
Tuvalu
Uganda
Ukraine
United States Minor Outlying Islands
Uruguay
Uzbekistan
Vanuatu
Venezuela
Viet Nam
Vietnam
Virgin Islands, British
Virgin Islands, U.S.
Wallis And Futuna
Western Sahara
Yemen
Zambia
Zimbabwe

Site Reliability Engineering (SRE) Practitioner Course with Official Exam

Accessing the eLearning Course

Certificates

Order Processing

Privacy Policy

Region 1 Countries

Region 2 Countries

Region 3 Countries

Plus
