Ingest All Files from File Storage

Ingest all of your users' existing files, and any newly added files, in real time.

Popular integrations

Many AI applications use Paragon to ingest their users’ external data for RAG (retrieval-augmented generation), and much of that knowledge lives in third-party file storage apps like Google Drive, Dropbox, SharePoint, and OneDrive.

The first step is enabling your users to authenticate with the third-party file storage application. The Connect Portal can be embedded natively in your application so that users can log in to their file storage provider of choice with their own credentials, and even specify which folders and files they would like to share for your application to ingest.

This is an example of Paragon's Connect Portal embedded natively in a demo application.

Using ingested file data

On your frontend, your users authenticate with their file storage applications using Paragon’s embedded UI. On the backend, Paragon’s workflow engine acts as middleware between your application backend and your users’ file storage providers.

Generally, there are two workflows required for file ingestion:

  1. An initial ingestion of all files (or files in selected folders) when a user successfully authenticates with the third-party file storage system

  2. A webhook-triggered ingestion of any updates or new files in the third-party file storage system
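On the receiving side, both workflows can feed the same ingestion routine in your backend. The following is a minimal sketch, not Paragon's API: the payload fields (`id`, `name`, `content`, `modified_at`), the `FileIndex` class, and `ingest_batch` are hypothetical names chosen for illustration, and the in-memory dictionary stands in for whatever document store your application uses. It assumes timestamps arrive as UTC ISO 8601 strings in a single consistent format, so they can be compared as strings.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class FileRecord:
    file_id: str
    name: str
    content: str
    modified_at: str  # assumed: UTC ISO 8601 string in one consistent format


@dataclass
class FileIndex:
    """In-memory stand-in for your application's document store (hypothetical)."""
    records: Dict[Tuple[str, str], FileRecord] = field(default_factory=dict)

    def upsert(self, user_id: str, record: FileRecord) -> bool:
        """Insert or update a file. Returns True if the index changed."""
        key = (user_id, record.file_id)
        existing = self.records.get(key)
        # Skip stale or duplicate deliveries so either workflow can be retried safely.
        if existing is not None and existing.modified_at >= record.modified_at:
            return False
        self.records[key] = record
        return True


def ingest_batch(index: FileIndex, user_id: str, files: List[dict]) -> int:
    """Handle one batch from either workflow; returns how many files changed."""
    changed = 0
    for f in files:
        record = FileRecord(f["id"], f["name"], f["content"], f["modified_at"])
        if index.upsert(user_id, record):
            changed += 1
    return changed
```

Keying records by user and file ID, and ignoring anything that is not newer than what is already stored, means the initial ingestion and later webhook-triggered updates can overlap or be re-delivered without creating duplicates.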

Initial Ingestion

Here you can see a Paragon workflow that performs an ingestion job of all of your users' file data to your application as soon as the user logs in to the third-party file storage app, via an integration-enabled trigger.

Authentication, rate limiting, pagination, and other API-specific details of a provider like Dropbox are all handled by Paragon's workflow engine and connectors, which makes it easy to ingest large volumes of files and data.

Real-time ingestion

Beyond the initial ingestion, real-time ingestion workflows trigger whenever your users add new files to their file storage application or update existing ones.

As pictured above, workflows can be triggered by webhook events, for example when a file is created, updated, or deleted in Google Drive.
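To illustrate how your backend might react to these events, here is a small sketch that applies created, updated, and deleted events to an index. The event names (`file.created`, `file.updated`, `file.deleted`) and the event shape are assumptions made for this example; the actual payload depends on how you configure the workflow that forwards events to your application.

```python
from typing import Dict, Tuple

# Hypothetical event names; real names depend on how the workflow is configured.
UPSERT_EVENTS = {"file.created", "file.updated"}
DELETE_EVENTS = {"file.deleted"}


def apply_file_event(index: Dict[Tuple[str, str], dict], event: dict) -> str:
    """Apply one file event to the index and report the action taken."""
    key = (event["user_id"], event["file"]["id"])
    if event["type"] in UPSERT_EVENTS:
        # Created and updated events are handled identically: store the latest copy.
        index[key] = event["file"]
        return "upserted"
    if event["type"] in DELETE_EVENTS:
        # pop() with a default keeps deletes idempotent if the event is re-sent.
        index.pop(key, None)
        return "deleted"
    # Unknown event types are left alone rather than raising an error.
    return "ignored"
```

Handling deletes matters for RAG in particular: if a removed file stays in the index, it can keep surfacing in retrieval results after the user expects it to be gone.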

Ingesting Permissions

Although not explained in detail on this page, many AI companies use Paragon to ingest permissions metadata for each file so that their RAG application can respect their users' external file permissions. The third-party permissions metadata is ingested in the same way: once when the integration is first enabled, and then in real time via webhooks whenever permissions change. This way, your application can maintain a full index of your users' files and their corresponding permissions metadata.
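One way such an index might be used is to filter retrieved content before it reaches the model. The sketch below is an assumption about how an application could apply ingested permissions, not a description of Paragon's implementation: the `permissions` mapping (file ID to the set of principals allowed to read it), the `principals` set (the requesting user's identities, such as an email address and group names), and both function names are hypothetical.

```python
from typing import Dict, List, Set


def allowed_file_ids(permissions: Dict[str, Set[str]], principals: Set[str]) -> Set[str]:
    """Return IDs of files readable by at least one of the caller's principals."""
    return {fid for fid, readers in permissions.items() if readers & principals}


def filter_chunks(
    chunks: List[dict], permissions: Dict[str, Set[str]], principals: Set[str]
) -> List[dict]:
    """Drop retrieved chunks whose source file the caller cannot read."""
    allowed = allowed_file_ids(permissions, principals)
    # Files missing from the permissions index are excluded (deny by default).
    return [c for c in chunks if c["file_id"] in allowed]
```

For example, given `permissions = {"f1": {"alice@example.com"}, "f2": {"group:eng"}}`, a user whose principals are `{"bob@example.com", "group:eng"}` would only receive chunks drawn from `f2`.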

Wrapping Up

With Paragon, you can easily build data pipelines to ingest your customers' external knowledge from their third-party file storage systems. Paragon’s platform is invisible to your users because they stay in your application with no redirects, and it makes it easier for your development team to scale the number of integrations you support without worrying about third-party API details.

If you’re interested in ingesting files from integration providers at scale and would like to get access to our pre-built file ingestion workflow templates, book a call with our team.
