Here at Atlassian, we recently went through an exercise to consolidate the authentication and identity management of our key support systems.  As we have grown, we have seen a number of account silos materialize across our system landscape. This required customers to have separate logins for support, forums, account management, etc., resulting in a frustrating experience for our customers, and a tough situation for Atlassian staff.

The problem of multiple account silos is common across the technology domain, yet is a surprisingly difficult one to resolve. It starts with a simple requirement: “We want to use the same login for multiple systems.”

But what sounds easy on the surface can quickly evolve into a complex blend of concerns across technology, data migration and separate functional teams.

So I am pleased to be able to outline the process and technology solution we used as part of this project, in the hopes that it will mitigate some of the headaches in delivering similar initiatives.

Welcome to a Behind the Scenes look at the creation of Atlassian ID.

Challenges and Learnings

Before diving into the solution and design process, I will comment on some of the key challenges (both technical and non-technical) and learnings. Our biggest challenges were:

  • Multiple teams. By its nature the project required engagement across a number of functional groups within Atlassian. This was made somewhat easier than in most organisations due to Atlassian’s fast moving pace and can-do-it culture, but it was also a double-edged sword. Once we had agreement with the teams, it was necessary to keep up with them as they powered ahead providing new services and capabilities on their system.
  • Availability. Having one user base for all your applications can be effectively like putting all your eggs in one basket. You don’t want that basket to break. Availability was a significant challenge to tackle throughout the solution.
  • Customer migration. This was a big one. We had seven systems and a large user base that had been diverging for over 10 years. You name the scenario… we had it.

What did we learn from the process? What stood out was:

  • Data migration is hard. I have learnt this a couple of times before, and this project reminded me once again. Data migration always takes more effort than you think. Plan it out… then double it.
  • SSO is important. From a technical perspective there is an urge to dismiss it as unimportant. The reality is that SSO is hard: it raises a number of security and distributed ‘state’ concerns that need to be dealt with. The counter-argument is that all browsers remember usernames and passwords, so a shared login costs users only one additional click. Surely a single username and password, rather than SSO, is ‘good enough’? More often than not the answer is ‘no.’ By giving up SSO you sacrifice the ability to provide a truly seamless customer experience. You also lose technical capabilities that can be extremely useful. The primary one I would cite is the ability to produce finer-grained applications or services that act together to form a larger site (micro-sites/services). Dividing your applications into a number of smaller modules has significant benefits for maintenance and development speed.
  • Authentication is complex. It is quite amazing how complicated a simple concept such as authentication can become. This domain is littered with many different (and often competing) standards and implementation options. We dealt with a handful of these methods, including standard web-based login pages, OpenID, BASIC-AUTH and some custom authentication mechanisms, including of course the Seraph connector (used by Confluence and JIRA). There are many others, and it is an area to tread with caution.
  • Design patterns are good. Implementing without a reference architecture is equivalent to exploring uncharted territory: you never know what is going to happen. Having a good design pattern is like having a map with all the pitfalls and dangerous areas highlighted. If possible, always find a relevant design pattern.

Our Solution

We had seven systems that were immediately in scope for the project, each owned by a different business unit here at Atlassian. So the first step was to understand the immediate and potential future requirements of each of these systems and units. They boiled down to the following key capabilities:

  • Single sign-on (SSO). The ability to sign-on once and access multiple applications without the need to re-authenticate.
  • User aliasing. The ability to ‘harmonize’ non-uniform local user identifiers by means of aliasing or mapping, e.g., username brendan on system A is the same user as bhaire on system B. This was an important capability for legacy users and migration, as it meant we could onboard systems to the solution without having to rename user accounts on the target system.
  • Central identity management. The ability to capture, store and manage user identity information centrally.
  • Provisioning. The ability to push user identity events across multiple applications e.g. New User, Updated Profile.
  • Federated authentication. The ability to provide authentication services to a third-party system external to Atlassian, e.g., OpenID or SAML-based auth styles.
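
To make the user aliasing capability concrete, here is a minimal sketch of an alias lookup. It is an illustration only, not how Crowd implements aliasing: the AliasDirectory class, its method names and the application name "system-b" are all hypothetical. The usernames come from the example in the list above.

```python
from typing import Dict, Tuple


class AliasDirectory:
    """Hypothetical alias table: maps a central identity to per-application usernames."""

    def __init__(self) -> None:
        # (central username, application) -> local username on that application
        self._aliases: Dict[Tuple[str, str], str] = {}

    def set_alias(self, central_user: str, application: str, local_name: str) -> None:
        self._aliases[(central_user, application)] = local_name

    def resolve(self, central_user: str, application: str) -> str:
        # Fall back to the central username when no alias is registered,
        # so applications without legacy accounts need no extra setup.
        return self._aliases.get((central_user, application), central_user)


directory = AliasDirectory()
directory.set_alias("brendan", "system-b", "bhaire")
```

With this in place, resolving brendan for "system-b" yields bhaire, while any application without a registered alias simply sees brendan. The target system never has to rename the account.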

We used Crowd to form the backbone of the solution, and extended it using the Interceptor or Gateway architecture pattern. We also settled on SCIM as a standard for the identity management space. The final solution can provide access and identity management services to JIRA and Confluence applications as well as to commercial off-the-shelf (COTS), open source and home-grown applications.

The overall solution covers identity and access management (IAM) but is best split for discussion across identity management (IDM – left of diagram) and access management (AM – right of diagram).

Atlassian ID - Architecture

Access Management (AM)

This section of the architecture is concerned with providing authentication and access services and capabilities. It follows an architectural design pattern that is often referred to as an interceptor or gateway pattern. To show how it functions, the two key message flows for this solution are detailed below, followed by a breakdown of the components.

Authentication flow:

  1. User (1) requests a particular resource from a protected application (3).
  2. An interceptor (2) intercepts this request, authenticates the user (if they are logged in) by validating their token against Crowd – Access (5), and forwards the request to the protected application (3).
  3. The protected application (3) trusts the forwarded request and connection from the interceptor (2) and logs the user into the application under the supplied credentials.
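
The authentication flow above can be sketched in a few lines. This is an illustrative sketch only, not the real interceptor (which is an Apache module): the cookie name sso.token, the header name X-Forwarded-User and the intercept function are assumptions, and a plain dictionary lookup stands in for token validation against Crowd.

```python
from typing import Callable, Dict, Optional

SSO_COOKIE = "sso.token"              # hypothetical cookie name
IDENTITY_HEADER = "X-Forwarded-User"  # hypothetical header name


def intercept(
    cookies: Dict[str, str],
    headers: Dict[str, str],
    validate_token: Callable[[str], Optional[str]],
) -> Dict[str, str]:
    """Return the headers to forward to the protected application."""
    # Drop any identity header supplied by the client so it cannot be spoofed.
    forwarded = {
        name: value
        for name, value in headers.items()
        if name.lower() != IDENTITY_HEADER.lower()
    }
    token = cookies.get(SSO_COOKIE)
    # validate_token stands in for the call to Crowd; it returns a username or None.
    username = validate_token(token) if token else None
    if username is not None:
        forwarded[IDENTITY_HEADER] = username
    return forwarded


sessions = {"token-123": "brendan"}  # stand-in for sessions held by Crowd
headers = intercept({"sso.token": "token-123"}, {"Accept": "text/html"}, sessions.get)
```

One design point worth noting: because the protected application trusts whatever identity the interceptor forwards, the interceptor should remove any identity header that arrives from the outside before adding its own.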

Login flow:

  1. The user (1) is directed (or redirected, potentially from the protected application (3)) to the login (4) component.
  2. The user (1) supplies their authentication credentials to the login (4) component.
  3. The login (4) component validates these credentials against Crowd – Access (5).
  4. Crowd – Access (5) creates a session and returns a unique token to the login (4) component.
  5. The login (4) component returns the token to the user (1).
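
As a rough illustration of the login flow, the sketch below validates credentials, creates a session and hands a token back as a cookie. It is a toy stand-in under stated assumptions: AccessService, login and the sso.token cookie name are hypothetical, and the in-memory password store replaces what Crowd and LDAP do in the real solution.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Optional, Tuple


def hash_password(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


class AccessService:
    """Hypothetical stand-in for the access service that holds sessions."""

    def __init__(self) -> None:
        self._credentials: Dict[str, Tuple[bytes, bytes]] = {}
        self._sessions: Dict[str, str] = {}

    def add_user(self, username: str, password: str) -> None:
        salt = secrets.token_bytes(16)
        self._credentials[username] = (salt, hash_password(password, salt))

    def authenticate(self, username: str, password: str) -> Optional[str]:
        record = self._credentials.get(username)
        if record is None:
            return None
        salt, expected = record
        # Constant-time comparison avoids leaking how much of the hash matched.
        if not hmac.compare_digest(expected, hash_password(password, salt)):
            return None
        token = secrets.token_urlsafe(32)  # unique, unguessable session token
        self._sessions[token] = username
        return token

    def validate(self, token: str) -> Optional[str]:
        return self._sessions.get(token)


def login(service: AccessService, username: str, password: str) -> Dict[str, str]:
    """Login component: returns the cookies to set on the user's browser."""
    token = service.authenticate(username, password)
    return {"sso.token": token} if token is not None else {}


service = AccessService()
service.add_user("brendan", "correct horse")
cookies = login(service, "brendan", "correct horse")
```

The token returned here is the same one the interceptor later validates on every request, which is what lets one login cover multiple applications.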

(1) User

Typically a person at a web browser, but the user can also be another system operating over HTTP/S.

(2) Interceptor

The interceptor’s role is to perform all required authentication and to remove these concerns from the protected application (3); in other words, it is a delegated auth provider. It intercepts all requests that are made to the protected application (3) and then forwards them with the appropriate authentication details.

It is implemented as an Apache module and interfaces with Crowd’s API to validate tokens and exchange identity information such as user account aliases. As well as cookie-based authentication, it provides BASIC-AUTH authentication and elevated auth capabilities (a ‘sudo’ equivalent for the web). After authenticating a user, it passes the user’s credentials through to the protected application (3) by encoding the details in the HTTP headers.
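
For readers unfamiliar with BASIC-AUTH, the sketch below shows how the standard Authorization header can be decoded before the credentials are checked. It only illustrates the header format (the word Basic followed by base64 of username:password); the function name parse_basic_auth is hypothetical and this is not code from the interceptor itself.

```python
import base64
import binascii
from typing import Optional, Tuple


def parse_basic_auth(header_value: str) -> Optional[Tuple[str, str]]:
    """Decode an Authorization header value into (username, password), or None."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None
    # Only the first colon separates the parts; passwords may contain colons.
    username, separator, password = decoded.partition(":")
    if not separator:
        return None
    return username, password


example_header = "Basic " + base64.b64encode(b"brendan:s3cret:x").decode("ascii")
credentials = parse_basic_auth(example_header)
```

Since the credentials are only base64 encoded, not encrypted, this style of authentication is only appropriate over HTTPS.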

This component is deployed as a load balanced cluster for availability.

(3) Protected Application

This is the application that is being protected. In most cases this is a web application, but it can also take the form of a REST service end-point or another service operating over HTTP/S.

The protected application is required to extract the encoded user credential information from the forwarded HTTP request and to log the user into the application. It is also required to listen for identity events (such as profile updates) on the provisioning queue (11) and process them accordingly.
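
On the application side, the key point is that the identity header is only believed when the connection comes from the interceptor. A minimal sketch of that check follows, assuming a hypothetical X-Forwarded-User header and made-up interceptor addresses; a real deployment might rely on network isolation or mutual TLS rather than an address list.

```python
from typing import Dict, FrozenSet, Optional

# Hypothetical addresses of the interceptor cluster nodes.
TRUSTED_INTERCEPTORS: FrozenSet[str] = frozenset({"10.0.0.5", "10.0.0.6"})
IDENTITY_HEADER = "X-Forwarded-User"  # hypothetical header name


def current_user(peer_address: str, headers: Dict[str, str]) -> Optional[str]:
    """Return the forwarded username, or None for anonymous or untrusted requests."""
    if peer_address not in TRUSTED_INTERCEPTORS:
        # Anyone can set a header; only the interceptor's word is trusted.
        return None
    username = headers.get(IDENTITY_HEADER, "").strip()
    return username or None
```

For example, current_user("10.0.0.5", {"X-Forwarded-User": "bhaire"}) returns bhaire, while the same header arriving from any other address is ignored.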

(4) Login

This is a web application that provides authentication services to the user (1), the most notable being the ability to log in to the system. It removes the need for the protected application (3) to handle user authentication credentials (i.e., passwords), thereby minimizing the impact of any security vulnerability or compromise in the protected application (3).

This component is deployed as a load balanced cluster for availability.

(5) Crowd – Access

This is a Crowd installation that points to a local read-only copy of the LDAP user directory. It provides the underlying access management and directory management services that drive the solution. Most notable for this section of the architecture is the creation and management of the access tokens.

This component is deployed as a load balanced cluster for availability.

Identity Management (IDM)

This section of the architecture is concerned with providing identity management services and capabilities, e.g., profile updates. It is largely a set of CRUD-based services with a provisioning queue that provides a push-based communication capability to keep downstream systems in sync.

The component breakdown of this section is detailed below.

(6) Identity Management

This is a web application that provides the user interface to create and maintain the user’s personal identity information, as well as account migration and merging services for legacy users. It interfaces with the identity services (7) component to enact these services.

(7) Identity Services

This is a REST-based service layer that implements identity management services for the user-facing identity management (6) component. It adheres to the SCIM standard for message formats.
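
To give a feel for what a SCIM message looks like, the sketch below builds a user resource in the shape defined by the SCIM 2.0 core schema. This is an assumption for illustration: the version of SCIM and the attributes actually used by the identity services are not stated here, and the helper name and example values are hypothetical.

```python
import json
from typing import Any, Dict

# Schema URN for a user resource in the SCIM 2.0 core schema.
SCIM_USER_SCHEMA = "urn:ietf:params:scim:schemas:core:2.0:User"


def build_scim_user(
    user_name: str, given_name: str, family_name: str, email: str
) -> Dict[str, Any]:
    """Build a SCIM-style user resource as a plain dictionary."""
    return {
        "schemas": [SCIM_USER_SCHEMA],
        "userName": user_name,
        "name": {"givenName": given_name, "familyName": family_name},
        "emails": [{"value": email, "primary": True}],
    }


# Example values are made up for illustration.
payload = json.dumps(
    build_scim_user("brendan", "Brendan", "Example", "brendan@example.com"),
    indent=2,
)
```

Adopting a standard message format like this means client applications can share one representation of a user instead of each inventing its own.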

(8) Crowd – Provisioning

This is a Crowd installation that points to the LDAP source (10), the writable LDAP user directory that acts as the source of truth for user credentials. In this section of the architecture Crowd also provides the aliasing capabilities that enable the migration and merging of existing accounts without the need to change downstream account setup, i.e., it is not invasive.

(9) Crowd Database

This is the single Crowd database used by both the Crowd – Access (5) and Crowd – Provisioning (8) components. It is the persistent data source for identity information, including aliasing and central configuration elements.

(10) LDAP Source

This is the central source for managing the authentication details of the user base.

(11) Provisioning Queue

This provides a push mechanism for identity updates. Messages are written to this queue by the identity services (7) component.
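
The push model can be sketched with an in-process queue: the identity services publish events and each protected application applies them to its local user store. This is only an illustration of the idea. The IdentityEvent class, the event names and the functions are hypothetical, and the real queue would be a messaging system rather than Python's queue module.

```python
import queue
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class IdentityEvent:
    kind: str       # hypothetical event names: "NEW_USER" or "UPDATED_PROFILE"
    username: str
    attributes: Dict[str, str] = field(default_factory=dict)


def publish(events: "queue.Queue[IdentityEvent]", event: IdentityEvent) -> None:
    """Called by the identity services when a user is created or changed."""
    events.put(event)


def drain(events: "queue.Queue[IdentityEvent]", profiles: Dict[str, Dict[str, str]]) -> int:
    """Called by a protected application: apply all pending events to its local store."""
    applied = 0
    while True:
        try:
            event = events.get_nowait()
        except queue.Empty:
            return applied
        if event.kind == "NEW_USER":
            profiles[event.username] = dict(event.attributes)
        elif event.kind == "UPDATED_PROFILE":
            profiles.setdefault(event.username, {}).update(event.attributes)
        applied += 1


events: "queue.Queue[IdentityEvent]" = queue.Queue()
local_profiles: Dict[str, Dict[str, str]] = {}
publish(events, IdentityEvent("NEW_USER", "brendan", {"email": "brendan@example.com"}))
publish(events, IdentityEvent("UPDATED_PROFILE", "brendan", {"email": "b@example.com"}))
applied_count = drain(events, local_profiles)
```

Because updates are pushed, a protected application does not need to poll the central directory to keep its local copy of a profile current.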

Final Stats

The rollout was a huge success! We were able to resolve a longstanding problem in our system landscape. Tens of thousands of people were affected by the change, yet we had only a handful of complaints and issues, many of them from unrelated causes such as customers not receiving authentication emails (due to over-aggressive spam filters on their companies’ servers). The migration stats at the time of writing were 20,494 accounts migrated with a 99.3% first-time success rate. For those with complex data scenarios that were not immediately successful, we have worked hard to resolve their problems as quickly as possible through our support channels.