We’re at the outset of the next big technological revolution. Generative AI is poised to deliver immense value, especially when it comes to the ways that companies manage, analyze, create, and extract value from their content. As organizations look to leverage AI, there are several essential components of a successful implementation. The first: ensuring the right governance, security, and compliance framework is in place so that your most critical data is protected. And the second: a solid grasp of the technology that generative AI uses to securely produce meaningful outputs and coherent content. That means understanding the technical enablement measures including encryption, inputs and outputs, retention requirements, access management, and administrative controls.
The adoption of AI brings unique challenges and risks that must be addressed responsibly. At Box, ensuring the security, privacy, and conscientious use of AI on your most important business content is at the core of our mission. Here we explain large language models (LLMs) and offer insight into the technology Box AI uses to deliver accurate, comprehensive results without compromising the integrity of proprietary data or operations, all while maintaining the enterprise-grade security that Box customers rely on.
About Box AI
Box AI is the platform-neutral set of capabilities built into Box’s intelligent and secure content platform. These capabilities enhance the user experience through intelligent interaction across multiple models and a wide range of content types, helping users extract insights and automate workflows while keeping data secure.
Read on to dive into the comprehensive workflow of Box AI, from the initial user engagement to the delivery of tailored answers. This process is the same whether the user views a document in Box or opens a Box Note or a Box Hub.
Initiating content interactions with Box AI
Let’s take an example in which a user opens a 20-page legal contract that is stored in Box. Box AI checks the document’s permissions in the background to verify whether Box AI capabilities are enabled in this context. This preliminary step ensures a secure and authorized interaction.
Engaging with Box AI
Once the permissions are confirmed, the user can engage with Box AI by clicking on the Box AI icon. When the user engages Box AI and submits a question, Box AI performs a secondary check to reaffirm the user’s document permissions and their access to use Box AI features in the given context. This double-check mechanism reinforces the security framework around document interaction.
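The double-check described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not Box’s implementation: `Document`, `can_use_ai`, `open_document`, and `ask_question` are hypothetical names, and the idea that access might change between opening the document and asking the question is our own example of why a second check is useful.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for a file stored in a content platform."""
    doc_id: str
    viewers: set = field(default_factory=set)  # user IDs allowed to view the file
    ai_enabled: bool = True                    # whether AI features are enabled here


def can_use_ai(user_id: str, doc: Document) -> bool:
    """True only if the user may view the document and AI is enabled for it."""
    return user_id in doc.viewers and doc.ai_enabled


def open_document(user_id: str, doc: Document) -> bool:
    """First check: run when the document is opened."""
    return can_use_ai(user_id, doc)


def ask_question(user_id: str, doc: Document, question: str) -> str:
    """Second check: run again at the moment a question is submitted."""
    if not can_use_ai(user_id, doc):
        raise PermissionError(f"{user_id} may not use AI on {doc.doc_id}")
    # Placeholder only: a real system would start the retrieval pipeline here.
    return f"(answer to {question!r} would be generated here)"
```

In this sketch, a user who passes the first check but loses access before submitting a question is stopped by the second check rather than receiving an answer.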
Uncovering key business insights with Box AI
Retrieval Augmented Generation (RAG) is a process that helps AI models be more accurate. It works like this: first, the AI reads information relevant to the user’s question. Then, it focuses on the details of the content that are relevant to the question. Finally, it uses this knowledge to create a response. This ensures that real and contextual information is used, which improves accuracy and reduces the risk of hallucinations.
Box AI leverages this intelligent response mechanism in a secure and permission-integrated manner to help answer users’ questions about their documents, which we refer to as Secure RAG.
- When the user opens the 20-page legal contract in Preview, and even before the document loads, Box performs a pre-access security check and a service enablement status check to ensure that the user has the right permissions and access to the document.
- Once in the document, the user clicks the “Box AI” button at the top right and asks a question.
- Now, Box AI performs a secondary permission and eligibility status check. This second checkpoint ensures that the user is allowed to use Box AI and leverage Box’s AI capabilities.
- Box AI segments the document into smaller, manageable pieces of text known as chunks. This is called document chunking.
- Next, Box AI securely engages an external AI provider to compute embeddings from these text chunks and the user’s question. Advanced embedding models, such as Azure OpenAI’s text-embedding-ada-002, are used for this purpose. Let’s peek behind the curtain and see what happens as part of computing embeddings:
a. Secure data transfer: The embeddings computation involves an end-to-end encrypted network API call to one of Box’s supported AI providers (GCP or Azure, depending on which AI model is being used).
b. In-memory processing: The AI provider processes the data in memory without storing any data in logs or writing it to disk, preventing any permanent record of the document. We take precautions to ensure our providers do not save the data during processing. For example, we disable capabilities to prevent model providers from performing any logging operations.
c. Rapid processing: Designed to be swift, this process is typically completed in under 60 seconds, after which the query, the answer, and any context from the document are purged from the AI provider’s memory.
d. Data use assurance: AI model providers never use customer data to train any of their models or to log any information.
- The embedding model provider sends back the embeddings, which are saved in Box’s embeddings index.
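To make document chunking more concrete, here is a minimal sketch. The source does not describe how Box AI actually splits documents, so the word-based windows, the overlap between neighboring chunks, and the name `chunk_text` with its parameters are all assumptions made for illustration.

```python
def chunk_text(text: str, max_words: int = 50, overlap: int = 10) -> list:
    """Split text into word-based chunks that overlap slightly.

    Assumption: fixed-size word windows. A production system might split on
    sentences, sections, or tokens instead.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

The overlap is a common design choice: it keeps a sentence that straddles a chunk boundary from being cut off from its context in both neighboring chunks.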
The six steps above are all about Box engaging AI in a secure manner that takes the user’s permissions into account, and about getting the content organized so that the AI model is ready to answer the user’s question. Let’s take a look at how Box AI answers the query:
- Box AI selects the document chunks most relevant to the user’s question using cosine similarity calculations between the embeddings of the chunks and the embedding of the question. This is how Box AI determines which pieces, or chunks, of information are relevant to the query.
- The user’s original question, along with the relevant text chunks, is then sent to an external AI provider to generate a response using a sophisticated LLM, such as a model offered through Azure OpenAI Service or Google’s Gemini series. Let’s again take a peek behind the curtain at what happens when this is sent:
a. Secure data transmission: Similar to the embedding process, this call is securely encrypted and transmitted to supported providers.
b. In-memory processing: The AI provider computes the answer entirely in memory, with no data storage occurring on disks or in logs.
c. Prompt response: The entire process is usually concluded within 60 seconds, ensuring quick and efficient user interaction.
d. Training data assurance: AI providers never use the data to train their models.
e. Delivery of the response: The computed answer is then relayed back to the user, providing them with the precise information they need.
- The LLM then returns the prompt’s answer, along with specific citations to indicate how it arrived at the answer.
- Box AI displays the contextual answer and its citations to the user.
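The retrieval step above can be sketched in a few lines. This is a simplified illustration rather than Box’s implementation: the three-dimensional vectors in the usage note are made up by hand (real embeddings come from the external embedding model and have many more dimensions), and `cosine_similarity`, `top_k_chunks`, and `build_prompt`, along with the prompt wording, are hypothetical.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(question_vec: list, index: list, k: int = 2) -> list:
    """Return the k chunk texts whose embeddings are most similar to the question.

    `index` is a list of (chunk_text, embedding) pairs, standing in for an
    embeddings index.
    """
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(question_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, chunks: list) -> str:
    """Combine the question with numbered excerpts so the answer can cite them."""
    context = "\n\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    return (
        "Answer using only the numbered excerpts and cite them.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

For example, with a toy index of `("termination clause", [1.0, 0.0, 0.0])`, `("payment terms", [0.0, 1.0, 0.0])`, and `("governing law", [0.7, 0.7, 0.0])`, a question vector of `[0.9, 0.1, 0.0]` ranks the termination clause first and governing law second, and only those two chunks would be placed in the prompt sent to the LLM.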
Data privacy and security
At Box, we are steadfast in our commitment to data privacy and security. Throughout the interaction with Box AI, our customers experience:
- No training on data: Questions, answers, and document contents are never used as inputs for training third-party LLMs.
- User-controlled logging: Box AI does not log or store the clear text of user questions, document content, or answers unless explicitly directed and configured by the Box user.
- Temporary data handling: The raw text of the user question, document sections, and answers are kept in Box servers’ memory only for the duration required to complete the request (typically less than 60 seconds), and are never written to disk.
- Browser cache management: Box AI caches interaction history in the user’s browser as long as the document remains open. This memory cache is flushed when the document is closed, ensuring no residual data remains.
The benefits of Box AI
Box AI makes it possible for enterprises to leverage cutting-edge technology while keeping their data secure. By breaking down complex documents into manageable chunks and leveraging powerful AI models for real-time query responses, Box AI significantly enhances user productivity while maintaining top-tier security. This approach underscores Box’s commitment to delivering secure, intelligent content management. Enterprises can confidently trust Box for secure and intuitive AI solutions, enabling them to adopt and roll out these technologies while maintaining privacy, security, and compliance.
Read our Box AI Frequently Asked Questions or visit the Box AI page.