Strategies for Managing Permissions in RAG
How do you reconcile permissions in a vector database within a RAG system when each 3rd party data source has its own permissions system? We explore the various approaches to implementing RAG with permissions in a multi-tenant environment.
Mathew Pregasen, Developer Relations
12 mins to read
If you're interested in seeing a tutorial (with code samples) we put together that walks through how to build a permissions system into a multi-tenant RAG application, click here.
Handling data permissions in a RAG (Retrieval-Augmented Generation) based world is a hairy challenge. RAG applications leverage vector databases, which are designed to be unopinionated stores of vectorized contextual data. Out of the box, these databases don’t have native integrations with third-party data or its associated permissions.
For the most part, vector databases flatten data. Data can be namespaced into separate groups, but generally speaking, vector databases simply contain heaps of contextual data. They are reductive for the right reasons: vector databases can efficiently pull relevant information via semantic search to assist any AI query, irrespective of the data’s origins.
But this reductive approach sidesteps permissions. Because vector databases are just aggregate stores of contextual data (often brought in by an admin account from 3rd party sources), by default they lack a permissions system that matches those of the original data sources, and some data may be off-limits to the querying user. Without the right precautions, this creates an obvious hazard: you wouldn’t want a summer intern to gain unauthorized access (even diluted access) to a CEO’s boardroom notes because the RAG pipeline lacked the mechanisms to prevent those notes from being retrieved.
The main approach to handling permissions in RAG is through reconciling third-party permissions together via a central permissions service.
Alternatively, companies can use OAuth to outsource permissions to the third-party applications or verify access at query-time, though this comes with many drawbacks which we'll cover later in this article.
While these strategies are all nascent, we’ve seen our customers (SaaS product and engineering teams) implement permissions in various ways, and we’ve seen the tradeoffs of each approach. Our goal today is to build on those learnings and clarify which approach may be right for you.
The default strategy: permissions reconciliation
One approach—handling permissions in-house—involves a threefold process.
(i) First, permissions must be pulled from third-party integrations like Google Drive or Salesforce via their APIs. This involves working with various API and permission systems.
(ii) Then, permissions need to be reconciled into one identity system, so that access across integrated data is collapsed together.
(iii) Finally, permissions need to be respected at query time, where any unauthorized information is filtered out.
To provide texture on the difficulty of this process, let’s run through the various types of permission systems.
The types of permissions
Understanding permissions systems can sometimes feel like an acronym hell. From ACLs to RBAC to ABAC, there are plenty of permissions systems, each with varying complexities depending on your scale.
Additionally, each 3rd party application may have its own permissions system, which makes reconciling permissions across these different applications more difficult.
Some of the most common permissions systems include:
ACLs
ACLs (Access Control Lists, sometimes known as user-level permissions) are lists of the users that have access to a given entity. The entity could be the entire application (e.g. Salesforce), a specific collection (e.g. Opportunities), a specific entry (e.g. a Task), or a specific field (e.g. a Name).
ACLs are simple because they explicitly spell out which users should be able to access something. However, as organizations scale to thousands of users and millions of entities, ACLs can be tedious to manage.
That said, all other permissions systems can be converted into an ACL, which makes it the most popular approach within RAG.
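As a minimal sketch of the idea, an ACL can be pictured as a mapping from each entity to the set of users allowed to read it. The entity IDs, email addresses, and the `can_read` helper below are hypothetical and only illustrate the shape of the data.

```python
# Hypothetical ACL: each entity ID maps to the set of users allowed to read it.
acl: dict[str, set[str]] = {
    "salesforce:opportunity:42": {"ana@example.com", "raj@example.com"},
    "gdrive:file:board-notes": {"ceo@example.com"},
}


def can_read(user: str, entity_id: str) -> bool:
    """Return True only if the user is explicitly listed for the entity."""
    # Entities missing from the ACL default to "no access" (deny by default).
    return user in acl.get(entity_id, set())
```

Because the check is a plain set lookup, the same structure maps naturally onto per-chunk metadata in a vector database.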
RBAC
RBAC, or role-based access control, uses a concept of roles to abstract permissions for users. Each role has a set of permissions, and all users with that role inherit the respective permissions.
Often, applications will have roles in a set hierarchy, e.g. Administrator, Editor, Viewer. An Administrator might have read-write access to everything, while a Viewer may only have read access to a subset of things. However, mature RBAC applications have more complex roles, where there is no explicit hierarchy and some roles have no overlap with other roles. This is ideal for applications where some data should be visible to sales but not to engineers, and vice versa.
Often, complex RBAC applications allow for users to inherit multiple roles, where permissions are additive. In this scenario, more permissive permissions (e.g. read-write access) trump overlapping lesser permissions (e.g. just read access).
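The additive behavior can be sketched as follows, assuming a hypothetical set of roles and a simple ordering of permission levels. The second function shows one way role assignments might be expanded into an ACL for a single resource; none of these names come from a specific product.

```python
# Permission levels ordered from least to most permissive (illustrative).
LEVELS = {"none": 0, "read": 1, "read-write": 2}

# Hypothetical role definitions: role -> {resource: permission level}.
ROLE_PERMISSIONS = {
    "viewer": {"reports": "read"},
    "editor": {"reports": "read-write", "drafts": "read-write"},
    "sales": {"pipeline": "read"},
}


def effective_permission(roles: list[str], resource: str) -> str:
    """Combine a user's roles additively: the most permissive level wins."""
    best = "none"
    for role in roles:
        level = ROLE_PERMISSIONS.get(role, {}).get(resource, "none")
        if LEVELS[level] > LEVELS[best]:
            best = level
    return best


def flatten_to_acl(user_roles: dict[str, list[str]], resource: str) -> set[str]:
    """Expand role assignments into an ACL: users with at least read access."""
    return {
        user
        for user, roles in user_roles.items()
        if LEVELS[effective_permission(roles, resource)] >= LEVELS["read"]
    }
```

A user holding both `viewer` and `editor` ends up with `read-write` on `reports`, while a user holding only `sales` is left out of that resource's ACL.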
ReBAC
ReBAC, or Relationship-based Access Control, is distinct from ACLs and RBAC because it models access as a graph of relationships between users and entities, often stored in a graph database. ReBAC is ideal for quickly answering questions in both directions, which is common in file storage applications like Google Drive and OneDrive. For example: (i) does user X have access to Y, and (ii) to whom does Y extend access?
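To make the two directions concrete, here is a minimal in-memory sketch. It assumes relationships are stored as hypothetical `(subject, relation, object)` tuples and handles only one level of group membership; a real ReBAC system would traverse deeper graphs.

```python
# Hypothetical relationship tuples: (subject, relation, object).
TUPLES = [
    ("ana", "member", "group:sales"),
    ("group:sales", "viewer", "file:q3-forecast"),
    ("raj", "viewer", "file:q3-forecast"),
]

members: dict[str, set[str]] = {}  # group -> users in that group
viewers: dict[str, set[str]] = {}  # file -> users or groups that can view it
for subject, relation, obj in TUPLES:
    index = members if relation == "member" else viewers
    index.setdefault(obj, set()).add(subject)


def has_access(user: str, file_id: str) -> bool:
    """Forward question: can this user view the file, directly or via a group?"""
    for grantee in viewers.get(file_id, set()):
        if grantee == user or user in members.get(grantee, set()):
            return True
    return False


def who_has_access(file_id: str) -> set[str]:
    """Reverse question: which users does this file extend access to?"""
    users: set[str] = set()
    for grantee in viewers.get(file_id, set()):
        # A grantee is either a group (expand it) or an individual user.
        users |= members.get(grantee, {grantee})
    return users
```

The reverse lookup is what makes it straightforward to flatten a ReBAC model into an ACL per file before writing metadata to a vector database.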
Accessing Permissions
Permissions need to be pulled from their respective third-party applications. This may involve an API route, as with Google Drive, or, when pulling from a managed database, a specific query such as Snowflake’s SHOW GRANTS command. Generally speaking, permissions are accessible via the same techniques used to ingest the underlying data. Here’s a rudimentary example of how you would ingest your users’ Google Drive data and permissions in Paragon.
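Independently of any particular ingestion platform (this is not Paragon's API), the normalization step might look like the sketch below. It assumes the permission records for a file have already been fetched and that each record carries `type`, `role`, and `emailAddress` or `domain` fields in the style of the Google Drive API's permission resource. The function name and the output shape are hypothetical, and the sketch assumes every role grants at least read access.

```python
def drive_permissions_to_acl(file_id: str, permissions: list[dict]) -> dict:
    """Turn a list of Drive-style permission records into one ACL entry."""
    entry = {
        "entity_id": f"gdrive:{file_id}",
        "users": set(),
        "groups": set(),
        "domains": set(),
        "public": False,
    }
    for perm in permissions:
        kind = perm.get("type")
        email = (perm.get("emailAddress") or "").lower()
        if kind == "user" and email:
            entry["users"].add(email)
        elif kind == "group" and email:
            # Group membership still has to be expanded in a later step.
            entry["groups"].add(email)
        elif kind == "domain" and perm.get("domain"):
            entry["domains"].add(perm["domain"].lower())
        elif kind == "anyone":
            entry["public"] = True
    return entry
```

The resulting entry can be stored alongside each chunk of the file so that later filtering never needs to call the source API again.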
Now, once permissions are imported, they then need to be reconciled.
Permissions Reconciliation
There are a few sub-elements to permissions reconciliation.
Merging identities
The first step of reconciliation involves merging user identities across applications (Google Drive, Salesforce, Snowflake, etc). Typically, this is done using a shared identifier like an email. However, this can be difficult because some applications (e.g. Jira) make extracting emails hard, as email is considered PII (Personally Identifiable Information). If an organization uses an identity provider, they may use the IdP’s user ID as the source of truth, while leveraging other identifiers during reconciliation.
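A minimal sketch of email-based merging, assuming each ingested account is a hypothetical record with `app`, `app_user_id`, and `email` fields, and that an optional email-to-IdP-ID lookup is available. The field names and function are illustrative only.

```python
def merge_identities(
    accounts: list[dict], idp_ids: dict[str, str]
) -> dict[str, dict[str, str]]:
    """Group per-application accounts under one canonical identity."""
    merged: dict[str, dict[str, str]] = {}
    for account in accounts:
        # Normalize the email so casing and stray spaces don't split one person.
        email = account["email"].strip().lower()
        # Prefer the IdP's user ID as the canonical key; fall back to the email.
        canonical = idp_ids.get(email, email)
        merged.setdefault(canonical, {})[account["app"]] = account["app_user_id"]
    return merged
```

Accounts from applications that do not expose an email would need a separate, manually maintained mapping before they can be merged this way.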
Reconciling permissions
Third-party permissions need to be collapsed into a master permissions system. While it's common to see RBAC as the default permissions model, if you are ingesting data from dozens of 3rd party applications, ACL or ReBAC is the way to go. After all, it is difficult to reconcile different 3rd party permission sets into a single RBAC model, since each application defines its own roles.
Using a Permissions provider
Vendors will often use an authorization-as-a-service platform like Aserto to provide a GUI for managing permissions, with an RBAC, ReBAC, or ABAC core model. By coupling that with built-in metadata features in a vector database (e.g. Pinecone), companies can effectively filter data with respect to permissions. Pinecone has a fantastic tutorial detailing this exact process.
One issue, however, is that data is filtered after similarity search, so it is possible for every retrieved result to be filtered out. As a result, permission systems should flag whenever data is withheld due to permissions so the user is informed. That way, users can widen the search (for example, by retrieving more candidates) to surface more results.
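The post-filtering step and the "withheld" flag can be sketched as below. This assumes each retrieved match carries a hypothetical `allowed_users` set copied from its metadata; the `Match` type and `filter_matches` function are illustrative and not part of any vector database client.

```python
from dataclasses import dataclass


@dataclass
class Match:
    chunk_id: str
    score: float
    allowed_users: frozenset[str]  # ACL copied from the chunk's metadata


def filter_matches(matches: list[Match], user: str) -> tuple[list[Match], int]:
    """Drop matches the user may not see and report how many were withheld."""
    visible = [m for m in matches if user in m.allowed_users]
    withheld = len(matches) - len(visible)
    return visible, withheld
```

If `visible` comes back empty while `withheld` is greater than zero, the caller can tell the user that results were hidden for permission reasons and retry with a larger candidate set.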
Drawbacks
The main issue with managing permissions in-house is that it introduces risk. For example, if the reconciliation code breaks, permissions may be computed incorrectly. There are also two notable constraints on this strategy.
Stale Permissions Data
Permissions also change over time. Whether a Confluence, Notion, or Google doc is shared with someone new, or an employee switches teams, there are many reasons why permissions data can go stale.
But permissions APIs generally don’t provide webhooks, which means you need to decide how frequently to re-index permissions from each of the third-party data sources. Thankfully, because permissions typically expand instead of contract, an end-user is more likely to lack relevant results, and not gain unauthorized access to data. Regardless, the hazard exists. It can be minimized by routinely re-indexing permissions or outsourcing permissions to OAuth / individual API keys.
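One simple way to reason about re-indexing frequency is a per-source refresh interval. The sketch below assumes hypothetical intervals and a record of when each source's permissions were last synced; the values are placeholders, not recommendations.

```python
from datetime import datetime, timedelta

# Hypothetical refresh intervals; tune to how often sharing changes per source.
REFRESH_INTERVALS = {
    "gdrive": timedelta(hours=1),
    "salesforce": timedelta(hours=6),
}
DEFAULT_INTERVAL = timedelta(hours=24)


def sources_due_for_reindex(
    last_synced: dict[str, datetime], now: datetime
) -> list[str]:
    """Return the sources whose permissions are older than their interval."""
    due = []
    for source, synced_at in last_synced.items():
        interval = REFRESH_INTERVALS.get(source, DEFAULT_INTERVAL)
        if now - synced_at >= interval:
            due.append(source)
    return sorted(due)
```

Shorter intervals shrink the window in which a revoked permission can still leak results, at the cost of more API calls against each source.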
Lack of API access for permissions
Certain 3rd party APIs don’t provide a permissions API that makes it easy to pull permissions for its data. For example, for Notion, the only way to respect permissions is via OAuth for each end-user (the next discussed strategy).
This requires organizations to implement different permissions management strategies for different 3rd party apps. This can get quite complex (and confusing for employees) when dealing with dozens of data sources.
Alternative 1: Offload permissions via multi-tenancy
The first alternative to handling permissions in-house is to rely on OAuth. Every third-party application API enforces permissions through its own permissions system. In other words, an arbitrary user cannot access another arbitrary user’s files if they're not an administrator.
Accordingly, RAG applications can leverage the power of namespaces in vector databases to achieve multi-tenancy. Multi-tenancy is where every user individually authenticates using OAuth, integrating data into a namespace attributed strictly to them. When they return to query the RAG system, they’ll strictly pull context from files that they own.
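The isolation property can be illustrated with a toy in-memory store, assuming one namespace per user ID and a plain dot product as the similarity score. The `NamespacedStore` class is hypothetical and stands in for whatever namespace feature your vector database provides.

```python
class NamespacedStore:
    """Toy in-memory vector store with one isolated namespace per user."""

    def __init__(self) -> None:
        self._namespaces: dict[str, list[tuple[str, list[float]]]] = {}

    def upsert(self, user_id: str, chunk_id: str, vector: list[float]) -> None:
        # Data ingested through a user's own OAuth grant lands in their namespace.
        self._namespaces.setdefault(user_id, []).append((chunk_id, vector))

    def query(self, user_id: str, vector: list[float], top_k: int = 3) -> list[str]:
        # Only the caller's namespace is searched, so no permission check is needed.
        candidates = self._namespaces.get(user_id, [])
        ranked = sorted(
            candidates,
            key=lambda item: sum(a * b for a, b in zip(item[1], vector)),
            reverse=True,
        )
        return [chunk_id for chunk_id, _ in ranked[:top_k]]
```

A user who has never authenticated has no namespace at all, so their queries simply return nothing.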
Benefits
Simplicity
We’ve seen a few customers take this approach due to an obvious benefit: it is simpler. By enabling every user to ingest data via OAuth or their own API key into a dedicated namespace, companies don’t need to manage a permissions system or deal with numerous edge cases. They also don’t need any reconciliation logic, and setting up a RAG system becomes dramatically easier.
Usage based storage
Additionally, by having every user manually authenticate, only the users who will actually use the RAG system have their files pulled. For organizations where AI usage is sparse, this could mean much lower storage costs.
Drawbacks
Multiple auths
The primary drawback with multi-tenancy is that every end-user needs to manually authenticate. This could be difficult if your customers’ organizations have massive headcounts, where a significant percentage are leveraging your RAG application/feature.
Duplicate data
The other issue is that multi-tenancy creates duplicate data, which makes it very inefficient from a storage perspective. For example, if your customer had 10 sales reps, and each authenticated access to their Sales Google Drive folder, this would result in you storing 10 copies of the same Google Drive files.
Of course, companies might be able to trim storage costs by merging duplicate files, but that then requires in-house permissions management again.
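The sketch below shows why deduplication pulls permissions back in-house. It assumes files are deduplicated by a SHA-256 hash of their content; the `DedupedStore` class is hypothetical. Once ten copies collapse into one, the store has to remember which users contributed that file, which is effectively an ACL.

```python
import hashlib


class DedupedStore:
    """Stores each distinct file once and tracks which users contributed it."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}      # content hash -> file content
        self.owners: dict[str, set[str]] = {}  # content hash -> users with access

    def add(self, user: str, content: bytes) -> str:
        digest = hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(digest, content)           # stored only once
        self.owners.setdefault(digest, set()).add(user)  # this is now an ACL
        return digest

    def can_read(self, user: str, digest: str) -> bool:
        return user in self.owners.get(digest, set())
```

With ten sales reps uploading the same file, `blobs` holds a single copy while `owners` holds ten users, and keeping that owner set accurate is the permissions problem all over again.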
Alternative 2: Check permissions at query time
Instead of importing permissions ahead-of-time, companies could manually check for access at runtime, iterating over each piece of context.
To check permissions at query time, organizations would need to store relevant identifiers in metadata to check if the querying user has access to the entity.
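A minimal sketch of the runtime check, assuming each retrieved chunk stores a hypothetical `source_id` in its metadata and that `has_access` is a caller-supplied function wrapping the third-party API call. Decisions are memoized within a single query so that several chunks from the same source trigger only one remote check.

```python
from typing import Callable


def filter_at_query_time(
    user_id: str,
    chunks: list[dict],
    has_access: Callable[[str, str], bool],
) -> list[dict]:
    """Keep only chunks whose source entity the user can access right now."""
    decisions: dict[str, bool] = {}  # one remote check per distinct source
    allowed = []
    for chunk in chunks:
        source_id = chunk["source_id"]
        if source_id not in decisions:
            decisions[source_id] = has_access(user_id, source_id)
        if decisions[source_id]:
            allowed.append(chunk)
    return allowed
```

The per-query memoization is a design choice to reduce calls without serving stale answers; caching across queries would cut latency further but reintroduces the staleness this approach is meant to avoid.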
Benefits
Checking permissions at query time avoids the need to manage an in-house permissions system, similar to the previous alternative. However, it also doesn’t require each user to manually authenticate.
It also checks against the most up-to-date permissions, something that neither of the previous approaches accounts for. Permissions do change, especially as users change roles.
Drawbacks
The obvious issue, however, is latency. Checking permissions via the 3rd party API at runtime can be extremely expensive, especially when APIs are rate-limited or generally slow. That is the price of the benefit above: because permissions do occasionally change, checking them at runtime is the most up-to-date and accurate approach.
Closing Thoughts
Integrating permissions with RAG is a tricky process. The first approach is for organizations to manage a master permissions system that all other integrations are reconciled against, allowing the system to elegantly filter out restricted entities at query time. Another approach is multi-tenancy, where each user manually integrates the data they own via OAuth, with it stored in a dedicated namespace. The final approach is to check permissions at runtime, which can be slow but is the most up-to-date and accurate.
Choosing the right strategy comes down to evaluating your needs as an organization and your expectations to scale, but the reality is that you may need to implement a combination of the three permissions strategies. Changing strategies down the road is possible, but it can be expensive in engineering hours, as each approach relies on a fundamentally different design.