Taming the Appocalypse by Creating a Source of Truth

Over time & as we built solutions for our first customers, we began to realize that a kind of source of truth for apps & accounts was missing in the enterprise.

by Alan Flores-López, Co-Founder @Lumos / Varun Jindal, Engineer @Lumos / Matt Millican, Engineer @Lumos

Which accounts have you signed up for? No one knows!

If we asked you how many accounts you’ve signed up for in your lifetime, how would you go about answering? Would you check a password vault? Look over a spreadsheet or maybe read through your email?

At Lumos, we're fascinated by this question. We were a fairly technical bunch early on, some of us with postgraduate computer science degrees. And yet, unless we used password managers or spreadsheets, we had nothing but our memory to help us grok our online accounts.

As we dug further into the problem, we learned that enterprises face a similar situation. Despite each having a unique mix of IdPs (identity providers), IGA (identity governance & administration) solutions, spreadsheets, SaaS management tools, internal tools, and lots (lots!) of glue in between to stitch everything together, the organizations we spoke with continued to have a loose grasp of the state of their apps, users, and accounts. We found a multi-faceted problem spanning multiple departments.

Over time & as we built solutions for our first customers, we began to realize that a kind of source of truth for apps & accounts was missing in the enterprise. Imagine if you had a tool you could plug into your organization & immediately shed light on the IT environment, empowering teams to solve problems of security, cost, compliance, and productivity. Turns out that building such a tool is tough!

Source of Truth for Apps & Accounts: Surprisingly Tough

Let’s be crystal clear about what we mean by a source of truth and state the problem more precisely:

Definition. Source of truth of apps & accounts. A list of all internet (or intranet) applications (services accessed over HTTP, hopefully over HTTPS!) in use at an enterprise, the accounts associated with each of them, and important metadata about those accounts such as user status, associated cost, and activity.
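
To make the definition concrete, here is a minimal sketch of how such a list could be modeled. Every name in it (App, Account, AccountStatus, monthly_cost_usd, the example URL and emails) is an assumption made for illustration and not a description of the actual Lumos schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class AccountStatus(Enum):
    # Hypothetical set of user statuses.
    ACTIVE = "active"
    SUSPENDED = "suspended"
    DEPROVISIONED = "deprovisioned"


@dataclass
class Account:
    # Identifier of the account inside the app (often, but not always, an email).
    identifier: str
    status: AccountStatus
    monthly_cost_usd: float = 0.0
    last_active: Optional[str] = None  # ISO 8601 date, or None if unknown


@dataclass
class App:
    name: str
    url: str
    accounts: list[Account] = field(default_factory=list)

    def active_cost(self) -> float:
        """Sum the monthly cost of accounts that are still active."""
        return sum(
            a.monthly_cost_usd
            for a in self.accounts
            if a.status is AccountStatus.ACTIVE
        )


# The "source of truth" is then simply the list of all known apps.
source_of_truth = [
    App("Zoom", "https://zoom.example", [
        Account("ada@example.com", AccountStatus.ACTIVE, 15.0, "2024-01-10"),
        Account("bob@example.com", AccountStatus.SUSPENDED, 15.0),
    ]),
]
```

Even this toy model hints at the hard part: the structure is easy to write down, but keeping it complete and fresh is not.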

Problem statement. Consider an enterprise of anywhere between 20 and 20,000 employees that relies on software applications to fuel its operations. How do you produce a source of truth for apps & accounts for that enterprise?

Here’s a trivial solution. Why not have every employee keep a detailed record of all the apps they are using and update a spreadsheet at the end of each day? Interesting to consider, but unlikely to work in today’s industry.

Or you could think of something that sits in the network, like a CASB. But, since we’re aiming for a holistic solution that spans the organization’s interactions with apps & accounts, you’d want a larger list of requirements: scalability, an acceptable threshold for data staleness, accurate & consolidated data, an intuitive UX for admins, an intuitive UX for end users who need to interact with applications… we could keep going.

An acceptable solution gets tricky quickly:

  • Multiple methods of discovery. Knowledge of an app or an account may come from different places, such as APIs, integrations, and user UI interactions. How do you wire up all the data to create a useful representation that satisfies requirements?
  • State updates. As the outside world changes—for example, as employees sign up for new apps, take action on those apps, use up licenses—how do you carefully update the internal representation of the world?
  • Employee identities. Companies typically have HR management systems which define all employee identities. How do you ensure that all SaaS accounts match up with employee identities? Email addresses are problematic identifiers.
  • Triangulating accounts. Sometimes one app will tell you information about accounts in another app. Common examples are IdPs like Okta or Azure AD. How do you use this knowledge to show a useful and consolidated representation?
  • How do you define an “app” anyway? Are Jira or Trello the apps? Or is Atlassian—the overall service behind both Jira & Trello—the app? Both? Should there be a hierarchical relationship between them? What about your internal admin dashboard? Or your hundreds of internal tools? How do you model “apps” unique to your enterprise versus apps that are public and shared across many enterprises such as “Slack”?
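
To illustrate just one of these, the employee identity problem, here is a rough sketch of matching SaaS account emails to HR identities. The normalization rules (lowercasing, dropping "+tag" suffixes) and the function names are assumptions for illustration. Real matching needs more signals than email, which is exactly why email addresses are problematic identifiers.

```python
from collections import defaultdict


def normalize_email(email: str) -> str:
    """Lowercase the address and drop any "+tag" suffix in the local part."""
    local, _, domain = email.strip().lower().partition("@")
    local = local.split("+", 1)[0]
    return f"{local}@{domain}"


def match_accounts(employees: dict, account_emails: list):
    """Map account emails to employee IDs.

    `employees` maps a (hypothetical) HR employee ID to that person's
    known email addresses. Returns (matched, unmatched).
    """
    index = {}
    for employee_id, emails in employees.items():
        for email in emails:
            index[normalize_email(email)] = employee_id

    matched = defaultdict(list)
    unmatched = []
    for email in account_emails:
        employee_id = index.get(normalize_email(email))
        if employee_id is None:
            unmatched.append(email)  # e.g. a personal address; needs review
        else:
            matched[employee_id].append(email)
    return dict(matched), unmatched
```

Anything that lands in the unmatched list (personal addresses, shared mailboxes, renamed employees) still needs another signal or a human decision.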

At Lumos we’ve faced many of the complications above and more. We’ve solved some. Others are a work in progress. A few continue to stump us. In the next section we’ll give a peek of a concept that has helped us tackle some of these challenges.


A Working Solution: App Clusters

Let’s go one level deeper into the problem of “multiple methods of discovery.”

An enterprise might offer employees different ways to authenticate into an application — sometimes intentionally, sometimes not. Each authentication method yields a point where a third party tool (usually the IdP that mediates the authentication handshake) can become aware of a relationship between an app and an account. Additionally, if more information about the app is needed to satisfy business requirements, we might also need to speak directly with the application via a direct integration. In short, often we’ll find information about the same app but from different sources.

For reasons that are outside of the scope of this article, we found that representing discovered apps on a per-source basis is helpful. To link a group of these discovered apps together into a logical app representing what’s out there in the real world, we landed on the concept of an app cluster.

Example

Let’s walk through a simple example to see how an app cluster comes about.

Here we have three apps that all represent the same “instance” of Zoom. Because one was manually uploaded, another was discovered from the IdP, and the third was trivially discovered by direct integration, it’s not immediately clear that they all correspond to the same application. In fact, we don’t always have a 100% reliable way of knowing that these applications are the same entity. However, we need to model the notion that these three apps are “the same one,” even if some user input is needed.

Enter the app cluster. An app cluster is a set of 𝑛 apps linked together by a relationship to a single designated “root” app. An app can only ever be in one app cluster. There are constraints on the apps that can be part of a cluster, depending on metadata such as where the app was discovered.
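
A minimal sketch of this structure might look like the following. The class names and the specific constraint (at most one app per discovery source in a cluster) are hypothetical, chosen only to show where such rules would live. The source above does not spell out the real constraints.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DiscoveredApp:
    app_id: str
    name: str
    source: str  # hypothetical values: "manual", "idp", "integration"


@dataclass
class AppCluster:
    root: DiscoveredApp
    members: list = field(default_factory=list)

    def apps(self) -> list:
        """All apps in the cluster, root first."""
        return [self.root, *self.members]


class ClusterRegistry:
    """Enforces that an app belongs to at most one cluster."""

    def __init__(self):
        self._cluster_of = {}  # app_id -> AppCluster

    def create(self, root: DiscoveredApp) -> AppCluster:
        self._ensure_unclustered(root)
        cluster = AppCluster(root)
        self._cluster_of[root.app_id] = cluster
        return cluster

    def add(self, cluster: AppCluster, app: DiscoveredApp) -> None:
        self._ensure_unclustered(app)
        # Hypothetical metadata constraint: one app per discovery source.
        if any(a.source == app.source for a in cluster.apps()):
            raise ValueError(f"cluster already has an app from {app.source!r}")
        cluster.members.append(app)
        self._cluster_of[app.app_id] = cluster

    def _ensure_unclustered(self, app: DiscoveredApp) -> None:
        if app.app_id in self._cluster_of:
            raise ValueError(f"{app.app_id!r} is already in a cluster")
```

With this shape, the three Zoom apps from the example would be one root plus two members, and the UI only ever needs to show the root.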

Here’s a graphic that illustrates the setup:

With this representation, our customers see the single Zoom app that they know and use. However, on the backend, we are able to determine from which instance of Zoom to provision or deprovision a user. We also can determine from which instance of Zoom we should sync information and at what cadence. The list goes on!
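
As a rough sketch of the backend side, a cluster can answer questions like "which member do we provision through?" by ranking its members by source. The priority order and sync intervals below are invented for illustration and are not Lumos's actual policy.

```python
from typing import Optional

# Hypothetical policy: prefer a direct integration, then the IdP.
# Manually uploaded apps cannot be provisioned through at all.
PROVISIONING_PRIORITY = {"integration": 0, "idp": 1}

# Hypothetical sync cadence per source, in minutes.
SYNC_INTERVAL_MINUTES = {"integration": 15, "idp": 60}


def provisioning_source(cluster_apps: list) -> Optional[dict]:
    """Pick the cluster member to provision or deprovision through."""
    candidates = [
        app for app in cluster_apps
        if app["source"] in PROVISIONING_PRIORITY
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda app: PROVISIONING_PRIORITY[app["source"]])


def sync_schedule(cluster_apps: list) -> dict:
    """Map each syncable member's id to its sync interval in minutes."""
    return {
        app["id"]: SYNC_INTERVAL_MINUTES[app["source"]]
        for app in cluster_apps
        if app["source"] in SYNC_INTERVAL_MINUTES
    }
```

The customer still sees one Zoom. The choice of which underlying instance does the work stays an internal detail.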

This representation helps us absorb all the complexity involved in syncing, processing, and performing actions on an application while presenting a simple UI to our customers. 

We automatically create the clusters we have high confidence in and enable our customers to create clusters themselves if the need arises, helping keep the source of truth of apps & accounts clean & precise.

Creating Clusters

Determining which apps to automatically cluster is a non-trivial problem. For example, one might ingest two apps that have the names “Slack” and “Slack.com”. At first glance, you might imagine that we would want to cluster these apps. However, it’s possible that a company just went through a merger and wants to keep its two Slack instances separate!
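
One way to think about this is as a decision with three outcomes rather than two. The sketch below is an illustration under assumptions, not our production logic: the tenant_id field and the name normalization rules are hypothetical. It only auto-clusters when a second, stronger signal agrees with the name.

```python
import re


def normalize_name(name: str) -> str:
    """Reduce "Slack.com" and " slack " to the same key, "slack"."""
    name = name.strip().lower()
    name = re.sub(r"\.(com|io|net|org)$", "", name)
    return re.sub(r"[^a-z0-9]", "", name)


def cluster_decision(a: dict, b: dict) -> str:
    """Return "auto_cluster", "separate", or "ask_admin" for two apps."""
    if normalize_name(a["name"]) != normalize_name(b["name"]):
        return "separate"

    # Hypothetical stronger signal: an identifier for the specific
    # instance (workspace, tenant) that each record points at.
    tenant_a, tenant_b = a.get("tenant_id"), b.get("tenant_id")
    if tenant_a is not None and tenant_b is not None:
        return "auto_cluster" if tenant_a == tenant_b else "separate"

    # Names match but nothing confirms it is the same instance,
    # so leave the call to a human.
    return "ask_admin"
```

Under these assumptions, “Slack” and “Slack.com” with no other information come back as "ask_admin", which leaves room for the post-merger company that really does run two Slack instances.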


A Work in Progress

Our journey with data representations of complex IT environments is not over. While our representation and usage of app clusters has helped so far, it has limitations, and our engineering team is actively learning about them & evolving the model.

A Problem Worth Solving

Creating a source of truth for apps and accounts is a hard problem…but it’s one worth solving. With a single source of truth to shine a light on a complex IT environment, organizations can more easily take control of that environment in a cost-effective way. We believe it is a necessary ingredient in creating One Platform to Govern It All.

In this blog post we focused on showing some of the technical difficulties of the source of truth problem. We also talked about how the notion of an App Cluster has been a helpful ingredient in the solution so far.

We’re excited to peel back the curtain and continue to share what we’ve done (and what is still left to do!) to tame the APPocalypse. Stay tuned!