Advancing service resilience in Azure Active Directory with its backup authentication service | Azure Blog and Updates


“Continuing our Advancing Reliability blog series, which highlights key updates and initiatives related to improving the reliability of the Azure platform and services, today we turn our focus to Azure Active Directory (Azure AD). We laid out the core availability principles of Azure AD as part of this series back in 2019, so I’ve asked Nadim Abdo, Corporate Vice President, Engineering, to provide the latest update on how our engineering teams are working to ensure the reliability of our identity and access management services that are so critical to customers and partners.” - Mark Russinovich, CTO, Azure


The most critical promise of our identity services is ensuring that every user can access the apps and services they need without interruption. We’ve been strengthening this promise to you through a multi-layered approach, leading to our improved promise of 99.99 percent authentication uptime for Azure Active Directory (Azure AD). Today, I’m excited to share a deep dive into generally available technology that allows Azure AD to achieve even higher levels of resiliency.

The Azure AD backup authentication service transparently and automatically handles authentications for supported workloads when the primary Azure AD service is unavailable. It adds an additional layer of resilience on top of the multiple levels of redundancy in Azure AD. You can think of it as a backup generator or uninterruptible power supply designed to provide additional fault tolerance while staying completely transparent and automatic to you. This system operates in the Microsoft cloud but on separate and decorrelated systems and network paths from the primary Azure AD system. This means it can continue to operate in case of service, network, or capacity issues across many Azure AD and dependent Azure services.

What workloads are covered by the service?

This service has been protecting Outlook Web Access and SharePoint Online workloads since 2019. Earlier this year we completed backup support for applications running on desktops and mobile devices, or “native” apps. All Microsoft native apps, including Office 365 and Teams, plus non-Microsoft and customer-owned applications running natively on devices, are now covered. No special action or configuration changes are required to receive the backup authentication coverage.

Starting at the end of 2021, we’ll begin rolling out support for more web-based applications. We will be phasing in apps using OpenID Connect, starting with Microsoft web apps like Teams Online and Office 365, followed by customer-owned web apps that use OpenID Connect and Security Assertion Markup Language (SAML).

How does the service work?

When a failure of the Azure AD primary service is detected, the backup authentication service automatically engages, allowing the user’s applications to keep working. As the primary service recovers, authentication requests are re-routed back to the primary Azure AD service. The backup authentication service operates in two modes:

  • Normal mode: The backup service stores essential authentication data during normal operating conditions. Successful authentication responses from Azure AD to dependent apps generate session-specific data that is securely stored by the backup service for up to three days. The authentication data is specific to a device-user-app-resource combination and represents a snapshot of a successful authentication at a point in time.
  • Outage mode: Any time an authentication request fails unexpectedly, the Azure AD gateway automatically routes it to the backup service. The backup service then authenticates the request, verifies that the artifacts presented (such as the refresh token and session cookie) are valid, and looks for a strict session match in the previously stored data. An authentication response, consistent with what the primary Azure AD system would have generated, is then sent to the application. Upon recovery, traffic is dynamically re-routed back to the primary Azure AD service.
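To make the two modes concrete, here is a minimal sketch of the idea in Python. It is not how Azure AD implements the service; the `BackupStore`, `SessionKey`, and `Snapshot` names, and the use of a refresh token identifier as the only validated artifact, are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict

# "Up to three days" comes from the article; everything else here is hypothetical.
RETENTION = timedelta(days=3)


@dataclass(frozen=True)
class SessionKey:
    """The device-user-app-resource combination a snapshot is tied to."""
    device_id: str
    user_id: str
    app_id: str
    resource_id: str


@dataclass
class Snapshot:
    stored_at: datetime
    refresh_token_id: str  # hypothetical stand-in for the validated artifacts


class BackupStore:
    def __init__(self) -> None:
        self._snapshots: Dict[SessionKey, Snapshot] = {}

    def record_success(self, key: SessionKey, refresh_token_id: str, now: datetime) -> None:
        """Normal mode: keep a snapshot of a successful primary authentication."""
        self._snapshots[key] = Snapshot(stored_at=now, refresh_token_id=refresh_token_id)

    def authenticate(self, key: SessionKey, refresh_token_id: str, now: datetime) -> bool:
        """Outage mode: serve only a strict session match inside the retention window."""
        snapshot = self._snapshots.get(key)
        if snapshot is None:
            return False  # no prior authentication for this exact combination
        if now - snapshot.stored_at > RETENTION:
            return False  # snapshot is older than the storage window
        return snapshot.refresh_token_id == refresh_token_id


store = BackupStore()
t0 = datetime(2021, 11, 1, tzinfo=timezone.utc)
key = SessionKey("laptop-1", "alice", "outlook", "exchange-online")
store.record_success(key, "rt-123", t0)
served = store.authenticate(key, "rt-123", t0 + timedelta(days=1))
```

In this sketch the same user on a different device, or the same session four days later, would not be served, which mirrors the strict match and the three-day window described above.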

Diagram showing clients/services like Outlook and Exchange Online accessing tokens – including cached access tokens from the new Backup Auth service

Routing to the backup service is automatic, and its authentication responses are consistent with those usually coming from the primary Azure AD service. This means that the protection kicks in without any need for application modifications or manual intervention.

Note that the priority of the backup authentication service is to keep user productivity alive for access to an app or resource where authentication was recently granted. This happens to be the majority of requests to Azure AD: 93 percent, in fact. “New” authentications beyond the three-day storage window, where access was not recently granted on the user’s current device, are not currently supported during outages, but most users access their most important applications daily from a consistent device.

How are security policies and access compliance enforced during an outage?

The backup authentication service continuously monitors security events that affect user access to keep accounts secure, even if these events are detected right before an outage. It uses Continuous Access Evaluation to ensure that sessions that are no longer valid are revoked immediately. Examples of security events that would cause the backup service to restrict access during an outage include changes to device state, account disablement, account deletion, access being revoked by an admin, or detection of a high user risk event. Only once the primary authentication service has been restored would a user with a security event be able to regain access.
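As a rough illustration of that behavior, the sketch below tracks security events per user and refuses to serve anyone who has one on record. The `SecurityEvent` and `RevocationList` names are hypothetical, and tracking events per user (rather than per device or session) is a simplifying assumption.

```python
from enum import Enum
from typing import Dict, Set


class SecurityEvent(Enum):
    """The example events listed in the article."""
    DEVICE_STATE_CHANGED = "device_state_changed"
    ACCOUNT_DISABLED = "account_disabled"
    ACCOUNT_DELETED = "account_deleted"
    ACCESS_REVOKED_BY_ADMIN = "access_revoked_by_admin"
    HIGH_USER_RISK = "high_user_risk"


class RevocationList:
    """Hypothetical record of security events seen by the backup service."""

    def __init__(self) -> None:
        self._events: Dict[str, Set[SecurityEvent]] = {}

    def report(self, user_id: str, event: SecurityEvent) -> None:
        self._events.setdefault(user_id, set()).add(event)

    def can_serve_from_backup(self, user_id: str) -> bool:
        # Any recorded event restricts access for the rest of the outage;
        # there is deliberately no "clear" operation in this sketch, because
        # the user regains access only through the restored primary service.
        return not self._events.get(user_id)


revocations = RevocationList()
revocations.report("bob", SecurityEvent.ACCOUNT_DISABLED)
alice_ok = revocations.can_serve_from_backup("alice")
bob_ok = revocations.can_serve_from_backup("bob")
```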

In addition, the backup authentication service enforces Conditional Access policies. Policies are re-evaluated by the backup service before granting access during an outage to determine which policies apply and whether the required controls for applicable policies, like multi-factor authentication (MFA), have been satisfied. If an authentication request is received by the backup service and a control like MFA has not been satisfied, then that authentication would be blocked.

Conditional Access policies that rely on conditions such as user, application, device platform, and IP address are enforced using real-time data as detected by the backup authentication service. However, certain policy conditions (such as sign-in risk and role membership) cannot be evaluated in real time and are instead evaluated based on resilience settings. Resilience defaults enable Azure AD to safely maximize productivity when a condition (such as group membership) is not available in real time during an outage. The service will evaluate a policy assuming that the condition has not changed since the latest access just before the outage.
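The decision logic described in this section can be sketched roughly as follows. This is an assumption-laden illustration, not the service's actual algorithm: the `Policy` type, the condition names, and the three outcome strings are all hypothetical, and the sketch assumes the policy has already been found to apply to the request.

```python
from dataclasses import dataclass
from typing import Set

# Conditions the article says are enforced with real-time data:
# user, application, device platform, IP address.
# Conditions the article says are not available in real time during an outage:
NON_REALTIME_CONDITIONS = {"sign_in_risk", "role_membership", "group_membership"}


@dataclass
class Policy:
    name: str
    conditions: Set[str]
    requires_mfa: bool = False
    resilience_defaults_enabled: bool = True


def evaluate_during_outage(policy: Policy, mfa_satisfied: bool) -> str:
    """Return 'granted', 'blocked', or 'not_served' for an applicable policy."""
    stale_conditions = policy.conditions & NON_REALTIME_CONDITIONS
    if stale_conditions and not policy.resilience_defaults_enabled:
        # The admin opted out of evaluating with pre-outage data,
        # so the backup service does not serve the request.
        return "not_served"
    # With resilience defaults enabled, stale conditions are assumed
    # unchanged since the last access just before the outage.
    if policy.requires_mfa and not mfa_satisfied:
        return "blocked"
    return "granted"


mfa_policy = Policy("Require MFA", {"user", "application"}, requires_mfa=True)
risk_policy = Policy("Risk-based", {"user", "sign_in_risk"})
strict_policy = Policy("Strict", {"role_membership"}, resilience_defaults_enabled=False)

results = [
    evaluate_during_outage(mfa_policy, mfa_satisfied=False),
    evaluate_during_outage(risk_policy, mfa_satisfied=False),
    evaluate_during_outage(strict_policy, mfa_satisfied=True),
]
```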

While we highly recommend that customers keep resilience defaults enabled, there may be some scenarios where admins would rather block access during an outage when a Conditional Access condition cannot be evaluated in real time. For these rare cases, administrators can disable resilience defaults per policy within Conditional Access. If resilience defaults are disabled by policy, the backup authentication service will not serve requests that are subject to real-time policy conditions, meaning those users may be blocked by a primary Azure AD outage.
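For admins who script their policy changes, the snippet below builds (but does not send) a request that would turn off resilience defaults for one policy. It assumes the Microsoft Graph conditional access policy resource exposes this setting as `sessionControls.disableResilienceDefaults`; treat the endpoint and field name as assumptions to verify against the current Graph documentation, and note that `policy_id` is a placeholder.

```python
import json

# Assumed Microsoft Graph endpoint for a single Conditional Access policy.
GRAPH_POLICY_URL = (
    "https://graph.microsoft.com/v1.0/identity/conditionalAccess/policies/{policy_id}"
)


def build_disable_resilience_request(policy_id: str) -> dict:
    """Describe a PATCH request that disables resilience defaults for one policy.

    Nothing is sent here; pass the result to the HTTP client of your choice
    together with a valid access token.
    """
    body = {
        # Assumed field name. Any session controls the policy already uses
        # may need to be included so they are not overwritten.
        "sessionControls": {"disableResilienceDefaults": True}
    }
    return {
        "method": "PATCH",
        "url": GRAPH_POLICY_URL.format(policy_id=policy_id),
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }


request = build_disable_resilience_request("00000000-0000-0000-0000-000000000000")
```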

What’s next?

The Azure AD backup authentication service helps users stay productive in the unlikely scenario of an Azure AD primary authentication outage. The service provides another transparent layer of redundancy, running on decorrelated Microsoft cloud systems and network pathways. In the future, we’ll continue to expand protocol support, scenario support, and coverage beyond public clouds, and we’ll increase the visibility of the service for our advanced customers.

Thank you for your ongoing trust and partnership.

