Scaling 3rd Party Permissions and Access Control for RAG
Handling permissions and access control in your RAG application can look very different when handling one or two integrations versus handling an increasing number of third party integrations. Follow Parato as we walk through how we took inspiration from cache invalidation strategies and graph schema best practices to build a system that scales with more integrations.
In our last tutorial, we walked through how to build a permissions/access control system for your RAG use case that can model and enforce permissions across different file storage integrations (namely Google Drive and Dropbox). In this tutorial, we wanted to go deeper into how your permissions system can handle not only a larger number of integrations from a latency and performance aspect, but also handle more types of integrations like CRM platforms.
Tutorial Overview
The permissions/access control system we built for our chatbot in the last tutorial, which we’ll call Parato v2.0, used a very safe design.
The design pattern we showcased used two components:
An FGA (fine-grained authorization) graph database as a ReBAC (relationship-based access control) strategy for modeling permissions
The schema focused on Google Drive and Dropbox (file storage integrations)
The graph helped us filter down the number of documents we needed in our next step - the third party check
A third party check via Google Drive and Dropbox’s API at every query
Consulting the graphic below, we can see that the third party check adds 4 network hops per user query (steps 5-8), a cost that grows linearly with the number of integrations you have.
Parato v2.5
We prioritized permissions above all else in Parato v2.0 for demonstrative purposes, but to show how this can scale in a production-level use case, we implemented a new pattern in this tutorial. In Parato v2.5, we will be using our FGA database as a cache-like database to cut out the third party check on a per-query basis. The third party API (Google Drive, Dropbox) will only be used to update permissions in our FGA database; in other words, consulting with the third party integration at write-time with background jobs, rather than at read-time.
This speeds up permission checking and allows our design to scale as the number of integrations increases. Even with more integrations, Parato v2.5 will still only need to consult the FGA graph a single time per query. This brings us to our second topic: expanding the FGA graph for more integration types.
File storage integrations like Google Drive, Dropbox, and Sharepoint more or less share the same permissions structure - users have read/write access to files, files are kept in folders, folders can be subfolders, etc. However, when other integration types like CRMs are introduced, our permissions graph schema will need to be modified to handle their permissions cleanly. In Parato v2.5, we will be expanding our existing FGA graph schema to fit CRM permissions, opening the door for even more schema flexibility.
In summary, Parato v2.5 can scale with the number of integrations by speeding up permission checks via caching and by expanding the permissions schema in our FGA graph. Let’s explore how each of these features is implemented.
Tutorial Architecture and Schema
Caching permissions
As mentioned in the overview, Parato v2.5 will use our FGA graph as a cache, speeding up permission “reads.” Our application will only need to consult the FGA graph for permitted documents before going to our Pinecone database to extract the vectors for those permitted documents.
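The read path described above can be sketched roughly as follows. This is a minimal illustration, not Parato's actual code: the in-memory `FGA_TUPLES` set stands in for the FGA graph, the relation names and the `artifact_id` metadata field are assumptions, and the filter is only built here (using the `$in` operator that Pinecone metadata filters support), not sent to a vector database.

```python
from typing import Iterable

# Hypothetical in-memory stand-in for the FGA graph: (user, relation, object) tuples.
Tuple3 = tuple[str, str, str]

FGA_TUPLES: set[Tuple3] = {
    ("user:ana@example.com", "reader", "artifact:gdrive-doc-1"),
    ("user:ana@example.com", "owner", "artifact:dropbox-doc-7"),
    ("user:bo@example.com", "reader", "artifact:gdrive-doc-2"),
}

# Relations that imply read access (an assumption for this sketch).
READ_RELATIONS = {"reader", "writer", "owner"}


def permitted_artifacts(user: str, tuples: Iterable[Tuple3]) -> list[str]:
    """Return the sorted artifact IDs the user can read, using only the cache."""
    ids = {
        obj.split(":", 1)[1]
        for subject, relation, obj in tuples
        if subject == user
        and relation in READ_RELATIONS
        and obj.startswith("artifact:")
    }
    return sorted(ids)


def build_vector_filter(artifact_ids: list[str]) -> dict:
    """Build a metadata filter that restricts a vector query to permitted artifacts."""
    return {"artifact_id": {"$in": artifact_ids}}


if __name__ == "__main__":
    ids = permitted_artifacts("user:ana@example.com", FGA_TUPLES)
    print(build_vector_filter(ids))
```

No third party API is called anywhere on this path, which is the point of treating the graph as a cache: the only permission lookup at query time is against the FGA graph.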
If you’re familiar with cache invalidation strategies, we will be using a “write-through”-like method to make sure our FGA database is always up-to-date. In a write-through cache, new data is written to both the database and the cache. The tradeoff is that “write” operations are more expensive because they need to be performed twice, but “reads” are fast since only the cache needs to be consulted. This tradeoff makes sense for a chat application, where response time needs to be kept low for users.
In our case, permissions are updated in the third party (like a new user was added as an editor in Google Drive) and those same updates are written through to our cache (the FGA graph database).
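As a rough sketch of that write-through step, assume each third party change arrives as a small dictionary with `action`, `email`, `role`, and `artifact_id` keys. Those field names and the `PermissionCache` class are hypothetical stand-ins for illustration; a real deployment would write to the FGA service instead of a Python set.

```python
class PermissionCache:
    """Hypothetical stand-in for the FGA graph, storing (user, relation, object) tuples."""

    def __init__(self) -> None:
        self.tuples: set[tuple[str, str, str]] = set()

    def write(self, user: str, relation: str, obj: str) -> None:
        self.tuples.add((user, relation, obj))

    def delete(self, user: str, relation: str, obj: str) -> None:
        # discard() is a no-op if the tuple was never cached.
        self.tuples.discard((user, relation, obj))

    def check(self, user: str, relation: str, obj: str) -> bool:
        return (user, relation, obj) in self.tuples


def apply_permission_change(cache: PermissionCache, change: dict) -> None:
    """Mirror one third party permission change into the cache."""
    key = (
        f"user:{change['email']}",
        change["role"],
        f"artifact:{change['artifact_id']}",
    )
    if change["action"] == "grant":
        cache.write(*key)
    elif change["action"] == "revoke":
        cache.delete(*key)
    else:
        raise ValueError(f"unknown action: {change['action']!r}")


if __name__ == "__main__":
    cache = PermissionCache()
    apply_permission_change(
        cache,
        {"action": "grant", "email": "ana@example.com", "role": "writer", "artifact_id": "gdrive-doc-1"},
    )
    print(cache.check("user:ana@example.com", "writer", "artifact:gdrive-doc-1"))
```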
To increase confidence that our FGA graph is always in-sync with third party permissions, we are also layering a TTL (time-to-live) for permissions in the cache. Each permission will only be kept in our FGA graph for 1 day before being invalidated and re-indexed.
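One way to picture the TTL layer is to stamp each cached permission with the time it was written and treat anything older than a day as stale. The sketch below does this with per-entry timestamps. This is an assumption made for illustration: the class and method names are hypothetical, and a scheduled full re-index achieves the same effect without per-entry bookkeeping.

```python
import time

TTL_SECONDS = 24 * 60 * 60  # 1 day, matching the TTL described above


class TtlPermissionCache:
    """Hypothetical permission cache where each tuple expires after a fixed TTL."""

    def __init__(self, ttl_seconds: float = TTL_SECONDS, clock=time.time) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised without waiting
        self._written_at: dict[tuple[str, str, str], float] = {}

    def write(self, user: str, relation: str, obj: str) -> None:
        self._written_at[(user, relation, obj)] = self._clock()

    def check(self, user: str, relation: str, obj: str) -> bool:
        """True only if the tuple exists and has not outlived its TTL."""
        written = self._written_at.get((user, relation, obj))
        return written is not None and self._clock() - written < self._ttl

    def evict_expired(self) -> list[tuple[str, str, str]]:
        """Remove stale tuples and return them so they can be re-indexed."""
        now = self._clock()
        stale = sorted(k for k, t in self._written_at.items() if now - t >= self._ttl)
        for key in stale:
            del self._written_at[key]
        return stale
```

Returning the evicted tuples gives the background job a worklist: each stale permission can be re-fetched from the third party and written back if it still holds.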
Paragon acts as the middleware for both mechanisms: it receives webhook messages to perform the “write-through” operations in real time, and it runs the scheduled job that enforces the TTL on permissions in our FGA graph. Both run as background processes rather than at query time.
Permissions Schema
When it comes to the FGA schema, the naming of objects and relationships may seem trivial, but they can help drive the way your application unifies and differentiates behavior with data from Google Drive versus data from Salesforce.
The principles for schema design we followed are:
Generalize objects enough to re-use application logic
e.g. getting permitted users from a data artifact (this could be a file in Google Drive or the contacts table in Salesforce)
Allow for specificity where it makes sense for a specific permission structure
e.g. folders may be a good representation for Google Drive, but for Salesforce and other integrations that do not have hierarchical permissions, they may not be relevant
Scaling Parato
Speeding up Parato
In terms of implementing cache reads, the application code we built in our last tutorial is more than enough. In our last tutorial, we were using the FGA database as a pre-filter for document IDs before checking with the third party API. Using our FGA database as a cache, we can rip out the code that performs the third party check at every query.
Where we do need to add code is in the background jobs that keep our cache up to date. First, we need a webhook-triggered workflow that lets our application know whenever a permission is changed. This allows Parato to stay up-to-date on permission changes in real time for integrations that support webhooks.
In this Paragon workflow example, whenever a file permission is updated in Google Drive, Google’s webhook sends an event to Paragon’s webhook listener. Our workflow receives the event, parses the event data in a custom JavaScript step, and sends that data to our application backend, which writes it through to our FGA graph.
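On the application backend, handling the parsed event might look something like the sketch below. The payload shape (`file_id`, `permissions`, `email`, `role`) is hypothetical and is not the actual Google Drive or Paragon event format; the role names are a subset of Google Drive's permission roles, and mapping `commenter` to `reader` is an assumption of this sketch. The handler replaces every cached tuple for the affected file, so revocations are picked up as well as grants.

```python
# Map third party roles to graph relations (an assumed mapping for this sketch).
ROLE_TO_RELATION = {
    "owner": "owner",
    "writer": "writer",
    "commenter": "reader",
    "reader": "reader",
}

Graph = set[tuple[str, str, str]]


def tuples_from_event(event: dict) -> Graph:
    """Translate a parsed permission event into (user, relation, object) tuples."""
    artifact = f"artifact:{event['file_id']}"
    result: Graph = set()
    for perm in event.get("permissions", []):
        relation = ROLE_TO_RELATION.get(perm.get("role"))
        email = perm.get("email")
        if relation and email:  # skip roles or grantees we do not model
            result.add((f"user:{email.lower()}", relation, artifact))
    return result


def write_through(graph: Graph, event: dict) -> Graph:
    """Return a new graph where the event's artifact has exactly the fresh permissions."""
    artifact = f"artifact:{event['file_id']}"
    kept = {t for t in graph if t[2] != artifact}
    return kept | tuples_from_event(event)


if __name__ == "__main__":
    graph: Graph = {("user:old@example.com", "reader", "artifact:doc-1")}
    event = {
        "file_id": "doc-1",
        "permissions": [{"email": "Ana@example.com", "role": "writer"}],
    }
    print(write_through(graph, event))
```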
This workflow enables the write-through-like cache invalidation strategy. To enable TTL cache refreshes where we re-index our graph every 24 hours, we can use another type of workflow trigger.
In this example, a cron job is triggered at 5 AM. It retrieves all Salesforce users and their profiles, extracts permissions from those profiles, and sends those permissions to our application for FGA graph re-indexing. This TTL workflow is built per integration.
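The re-indexing step on the application side could be sketched as below. The input dictionaries (`users`, `profiles`, `object_permissions`, and the `read`/`edit` flags) are simplified, hypothetical shapes, not the Salesforce API's actual response format, and the `artifact:salesforce-` naming convention is likewise an assumption.

```python
Graph = set[tuple[str, str, str]]


def salesforce_tuples(users: list[dict], profiles: dict[str, dict]) -> Graph:
    """Expand each user's profile into object-level permission tuples."""
    tuples: Graph = set()
    for user in users:
        profile = profiles.get(user["profile_id"], {})
        for object_name, perms in profile.get("object_permissions", {}).items():
            artifact = f"artifact:salesforce-{object_name}"
            subject = f"user:{user['email']}"
            if perms.get("read"):
                tuples.add((subject, "reader", artifact))
            if perms.get("edit"):
                tuples.add((subject, "writer", artifact))
    return tuples


def reindex(graph: Graph, artifact_prefix: str, fresh: Graph) -> Graph:
    """Drop every cached tuple for one integration and replace it with fresh data."""
    kept = {t for t in graph if not t[2].startswith(artifact_prefix)}
    return kept | fresh


if __name__ == "__main__":
    users = [{"email": "ana@example.com", "profile_id": "sales"}]
    profiles = {"sales": {"object_permissions": {"Contact": {"read": True, "edit": False}}}}
    graph: Graph = {("user:gone@example.com", "reader", "artifact:salesforce-Lead")}
    print(reindex(graph, "artifact:salesforce-", salesforce_tuples(users, profiles)))
```

Because the replacement is scoped by prefix, re-indexing Salesforce leaves tuples from other integrations untouched, which fits the one-workflow-per-integration approach.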
Keeping Parato Flexible
Bringing back the illustration from the tutorial overview, you can see how our graph schema can fit both file tree schemas and Salesforce objects in the same graph.
Using Okta FGA to provision a ReBAC graph, we can define our schema in YAML format and explicitly lay out the relationships between object types. For Parato v2.5:
user: these are users of your SaaS application; they will need a unique identifier like email to map to their third party accounts
artifact: these are the most granular data assets that we need to keep permissions for
In this tutorial, we deemed Salesforce object level permissions to be sufficient
Extending it further, if we needed record level permissions, we would have to define Salesforce records as artifacts and create a new object type for the “Contacts” or “Leads” table
integration: this object helps us keep track of which artifact comes from which integration
team: allows us to propagate permissions indirectly; for example, all members of the marketing team can be granted read access to an artifact
folder: this is used in file tree integrations; there is no requirement for an artifact to map to a folder. In the case of Salesforce artifacts, this object type will not be used
organization: another hierarchy to propagate permissions across all users and teams in your customers’ company; this can also be thought of as a tenant in your multi-tenant application
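To make the object types above more concrete, here is a minimal Python sketch that expresses them as a lookup table and validates tuples against it. It is not the Okta FGA model itself: the relation names (`member`, `parent`, `source`, `reader`, `writer`, `owner`) and which subject types each relation accepts are assumptions chosen for illustration.

```python
# object type -> relation -> subject types allowed to hold that relation.
# All relation names here are assumptions, not the tutorial's actual model.
SCHEMA: dict[str, dict[str, set[str]]] = {
    "user": {},
    "integration": {},
    "organization": {"member": {"user"}},
    "team": {"member": {"user"}, "organization": {"organization"}},
    "folder": {
        "parent": {"folder"},
        "reader": {"user", "team", "organization"},
        "writer": {"user", "team", "organization"},
        "owner": {"user"},
    },
    "artifact": {
        "source": {"integration"},  # which integration the artifact came from
        "parent": {"folder"},       # optional; unused for Salesforce artifacts
        "reader": {"user", "team", "organization"},
        "writer": {"user", "team", "organization"},
        "owner": {"user"},
    },
}


def is_valid_tuple(subject: str, relation: str, obj: str) -> bool:
    """Check that a (subject, relation, object) tuple is allowed by the schema."""
    subject_type = subject.split(":", 1)[0]
    object_type = obj.split(":", 1)[0]
    allowed = SCHEMA.get(object_type, {}).get(relation)
    return allowed is not None and subject_type in allowed


if __name__ == "__main__":
    print(is_valid_tuple("team:marketing", "reader", "artifact:salesforce-Contact"))
    print(is_valid_tuple("user:ana@example.com", "owner", "team:marketing"))
```

Note that `folder` and `parent` are optional in this table: a Salesforce artifact simply never receives a `parent` tuple, while a Google Drive file can.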
These defined relationships not only help us propagate permissions indirectly, they also allow Parato to grant specific permissions to an artifact (read, write, ownership access). While there are no set-in-stone rules (you could theoretically create a new object type per integration), a well-defined schema is:
generalizable enough to re-use logic for different integrations
In this method, we are returning all users that have read access to an artifact (integration agnostic)
extensible to allow for specificity where it makes sense for a new integration type
this specificity will force specific logic such as writing new relationship types or performing graph operations like traversal
in this example, we have a method that is only used in file storage integrations that writes parent child relationships for folders and files
Keeping these principles in mind, the possibilities for expanding your graph schema are endless, allowing your SaaS application to model permissions for even more integration types like messaging, ticketing, and beyond.
Wrapping Up
In this tutorial we covered two major enhancements that scale our permissions system:
Treating our FGA graph as an application cache: speeding up response times during chat interactions
Expanding our graph schema to generalize to Salesforce, demonstrating flexibility across CRM and file storage integrations
Thoughtful design patterns like these can turn frustrating roadblocks into manageable hurdles. Paragon has helped many enterprise-scale SaaS companies overcome integration-related roadblocks, such as data ingestion and permissions/access control. It’s always a fulfilling exercise to take advantage of different technologies like Paragon and methods like write-through to create solutions that bring applications to a more production-ready state.
Jack Mu, Developer Advocate