Shield AI Classification is available only as part of the Shield Pro add-on.
Box AI Classification helps to assess and classify your content, applying the appropriate classification label automatically.
AI Classification can work alongside existing classification policies. For example, you can keep automated classification policies used to detect specific information types or file extensions, then use AI Classification to label a broader set of content that wasn’t easily identifiable via specific data types or keywords.
One AI Classification policy is permitted per enterprise.
To classify your content using Box AI, you need to:
Classification text file types
AI Classification scans the text in files for all of the following extensions:
| Extensions | Text Extraction Limit |
| as, as3, bat, boxcanvas, boxnote, cmake, css, diff, doc, docx, gdoc, gslide, gslides, haml, htm, html, key, less, log, make, md, mm, msg, odp, odt, pages, pdf, ppt, pptx, properties, rst, rtf, sass, scm, script, sh, sml, txt, vi, vim, webdoc, wpd, xbd, xdw, xhtml, xml, xsd, xsl | 2MB |
| asm, c, cc, cpp, cs, csv, cxx, erb, groovy, gsheet, h, hh, java, js, json, m, ml, php, pl, py, rb, scala, sql, ods, xls, xlsm, xlsx, yaml | 100KB |
AI Classification supports different text extraction limits depending on the file type. The amount of text in a file is usually much smaller than the file itself. For example, a 20MB PowerPoint file (.ppt) may contain only 200KB of extractable text.
Box scans only up to the text extraction limit. For example, if the text extracted from a PDF file exceeds 2MB, the classification decision is based only on whether the first 2MB of text meets the conditions specified in the classification policy.
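The per-extension limits can be read as a simple lookup followed by truncation. The following is a minimal Python sketch, not Box code. The names `extraction_limit` and `text_considered` are hypothetical, the extension sets are abbreviated from the table above, and the sketch assumes that 2MB means 2 x 1024 x 1024 bytes and that the limit is measured on UTF-8 encoded text. This article states neither.

```python
from pathlib import Path
from typing import Optional

MB = 1024 * 1024  # assumption: binary megabytes
KB = 1024

# Abbreviated from the table above; extend with the remaining extensions as needed.
LIMIT_2MB = {"doc", "docx", "html", "md", "pdf", "ppt", "pptx", "rtf", "txt", "xml"}
LIMIT_100KB = {"c", "cpp", "csv", "java", "js", "json", "py", "sql", "xls", "xlsx", "yaml"}


def extraction_limit(filename: str) -> Optional[int]:
    """Return the text extraction limit in bytes, or None if the extension is not listed."""
    ext = Path(filename).suffix.lstrip(".").lower()
    if ext in LIMIT_2MB:
        return 2 * MB
    if ext in LIMIT_100KB:
        return 100 * KB
    return None


def text_considered(filename: str, extracted_text: str) -> str:
    """Return the part of the extracted text that falls within the limit."""
    limit = extraction_limit(filename)
    if limit is None:
        return ""
    raw = extracted_text.encode("utf-8")[:limit]
    # Drop a trailing partial character that byte truncation may have produced.
    return raw.decode("utf-8", errors="ignore")
```

Under these assumptions, a CSV file with 200,000 characters of plain ASCII text would be evaluated on its first 102,400 characters only, so criteria that depend on content near the end of a large file may not match.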
Note: Automated classification in Box does not support optical character recognition (OCR), so Box cannot extract and consider text in scanned PDFs or images embedded in text-based files (for example, images in a PPT).
Classification image file types
Supported image file types are:
ai, bmp, cr2, crw, dng, eps, gif, heic, indd, idml, indt, inx, jpeg, jpg, nef, png, ps, psd, raf, raw, svg, svs, tga, tif, tiff, webp
Unlike traditional OCR, which extracts visible text from an image, AI Classification analyzes the whole image, including text and objects within that image, to determine what the content means, not just what the text says.
Note: AI Classification uses a version of the image that is a maximum of 2048 x 2048 pixels. This means very small or fine details might not be visible if the original image was larger. This may impact the classification result.
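To estimate how much detail survives, you can compute the dimensions an image would have after being fit into a 2048 x 2048 box. This Python sketch is illustrative only. The function name `fit_within` is hypothetical, and it assumes that the aspect ratio is preserved and that smaller images are not enlarged, neither of which this article specifies.

```python
from typing import Tuple

MAX_SIDE = 2048  # maximum width and height stated in the note above


def fit_within(width: int, height: int, max_side: int = MAX_SIDE) -> Tuple[int, int]:
    """Scale (width, height) down to fit inside max_side x max_side.

    Assumption: aspect ratio is kept and images are never scaled up.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = min(1.0, max_side / width, max_side / height)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, a 4096 x 2048 image would be evaluated at 2048 x 1024 under these assumptions, so text that is 10 pixels tall in the original would be about 5 pixels tall in the version that is analyzed.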
AI Classification policy limits
The combined criteria across all labels are limited to 25,000 bytes in total. Because the limit is measured in bytes, the number of characters available varies by language. The approximate number of characters supported is:
| Language | Characters |
| English | 25,000 |
| Japanese | 8,500 |
| French | 23,000 |
| Chinese | 8,500 |
| Korean | 8,500 |
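Because the limit is expressed in bytes, the same number of characters uses more of the budget in some languages than in others. The Python sketch below estimates usage across label criteria. It assumes criteria are measured as UTF-8 bytes, which this article does not confirm, and the names `criteria_bytes`, `remaining_bytes`, and `LIMIT_BYTES` are hypothetical.

```python
from typing import Dict

LIMIT_BYTES = 25_000  # total limit for all combined criteria across labels


def criteria_bytes(criteria_by_label: Dict[str, str]) -> int:
    """Total size of all label criteria, assuming UTF-8 encoding."""
    return sum(len(text.encode("utf-8")) for text in criteria_by_label.values())


def remaining_bytes(criteria_by_label: Dict[str, str]) -> int:
    """Bytes left in the budget; a negative value means the limit is exceeded."""
    return LIMIT_BYTES - criteria_bytes(criteria_by_label)
```

Under that assumption, a Japanese, Chinese, or Korean character typically takes 3 bytes, so 25,000 bytes works out to roughly 8,300 characters, which is close to the approximate figures listed for those languages.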
Create an AI Classification policy
Admins, and co-admins with the following permissions, can create, modify, and delete AI Classification policies:
- Create and edit metadata templates for your company
- View Shield Dashboard for your company
- Create, edit, and delete Shield configuration for your company
To create an AI Classification policy:
- Navigate to the Admin Console.
- Select Classification.
- Select Create, then choose AI Classification Policy from the dropdown options.
Note: This option does not display if you already have an AI Classification policy configured and listed in the Classification policies list.
- Select a classification label, then enter detailed information about the type of content that should be classified. For example, an internal classification may include content such as payroll slips, resumes, or policy documents.
You can remove the classification label by selecting the X above the description box.
- Optionally, test and iterate using up to 10 files.
- Select the policy setting of Apply to all folders or Only selected folders.
- Select a conflict handling behavior.
- Click Next.
- Click either Save as Draft or Enable. After selecting Enable, the classification policy will be in effect for files that are triggered by classification events.
Notes:
- There is a limit of 50 classification policies per enterprise ID (EID). AI Classification policies count towards this limit.
- If you have multiple auto-classification policies, the AI Classification policy is assigned the lowest priority by default. You can change this by editing the priority order.
- One AI Classification policy is permitted per enterprise.
- Content is only scanned prospectively after the policy is enabled.
Test and iterate
By selecting test files, you can ensure you are seeing the expected classification results and modify the criteria if needed.
You can select up to 10 files at a time. After you select files, your inputs are used to create a prompt that is sent to Box AI to evaluate each test file.
Follow the process to create an AI Classification policy up to step 5 (the optional test and iterate step), then:
- Click Select Files in the Test and iterate section.
- Select up to 10 files to test.
- The test results display, with a classification applied based on the provided guidance, along with the AI's reasoning for choosing that label.
If no classification is applied, the reasoning explains why. Common reasons for Box AI being unable to classify the file include:
- The file may no longer exist.
- Box is unable to extract text from the file (AI only works on content with extractable text).
- The file is empty.
You can rerun a test by selecting the circular arrow that sits above the uploaded files. This is particularly helpful after refining your input in the classification label descriptions.
Select Clear at the bottom of the section to remove your test results.
AI Classification policy settings
Folder criteria
| Setting | Description |
| Apply to all folders | The policy will apply to files in all folders in your enterprise. |
| Only selected folders | The policy will apply to files only in folders you select and in all sub-folders of those folders. To select folders: |
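The folder scope behaves like a match against a file's folder ancestry: a file is in scope if any selected folder is one of its ancestors. As an illustration only, this Python sketch models that with folder paths. `in_scope` and its parameters are hypothetical names, and using paths is an assumption made for readability rather than a description of how Box stores folder selections.

```python
from pathlib import PurePosixPath
from typing import Iterable, Optional


def in_scope(file_path: str, selected_folders: Optional[Iterable[str]] = None) -> bool:
    """Return True if the policy would cover the file.

    selected_folders=None models "Apply to all folders".
    Otherwise the file must sit in a selected folder or any of its sub-folders.
    """
    if selected_folders is None:
        return True
    ancestors = set(PurePosixPath(file_path).parents)
    return any(PurePosixPath(folder) in ancestors for folder in selected_folders)
```

Comparing whole path components, rather than raw string prefixes, avoids treating a folder such as `/FinanceOld` as if it were inside `/Finance`.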
Conflict handling
Determines the behavior of the AI Classification policy for conflicts when content has an existing classification label:
- Skip files that already have a classification label (Recommended) - The policy will:
  - Overwrite a classification label that was previously applied by another classification policy
  - Skip files with classification labels applied by a user, by folder cascade, by workflow, or that were applied via Microsoft Purview Information Protection (MPIP) integration from MPIP sensitivity labels
- Overwrite any existing classification label - The policy will overwrite any existing classification label, whether that label was previously applied by a user, by folder cascade, by a workflow, or by a previous policy, except when:
  - The auto-classified label was overridden manually by a user for the latest file version
  - A classification label was applied from an MPIP sensitivity label and the MPIP Prevent Modification setting is enabled
Note: Overwriting existing classification labels cannot easily be undone. It is recommended you only select Overwrite any existing classification label if you're confident in the accuracy of your AI Classification guidance.
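The two conflict handling options can be read as a decision rule over how the existing label was applied. The Python sketch below is one way to model the behavior described above. It is not Box's implementation. All type and function names are hypothetical, and it assumes that a file with no existing label is always eligible for classification.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Source(Enum):
    USER = "user"
    FOLDER_CASCADE = "folder_cascade"
    WORKFLOW = "workflow"
    POLICY = "policy"
    MPIP = "mpip"


class Mode(Enum):
    SKIP_EXISTING = "skip"
    OVERWRITE_ANY = "overwrite"


@dataclass
class ExistingLabel:
    source: Source
    # True if a user manually overrode an auto-classified label on the latest version.
    user_overrode_auto_label: bool = False
    # True if the MPIP Prevent Modification setting is enabled.
    mpip_prevent_modification: bool = False


def policy_may_apply(mode: Mode, existing: Optional[ExistingLabel]) -> bool:
    """Return True if the AI Classification policy may set or replace the label."""
    if existing is None:
        return True  # assumption: unlabeled files are always eligible
    if mode is Mode.SKIP_EXISTING:
        # Only labels applied by another classification policy are overwritten.
        return existing.source is Source.POLICY
    # Mode.OVERWRITE_ANY: overwrite unless one of the two exceptions applies.
    if existing.user_overrode_auto_label:
        return False
    if existing.source is Source.MPIP and existing.mpip_prevent_modification:
        return False
    return True
```

For example, a label applied by folder cascade is left alone in the recommended mode but replaced in overwrite mode, while an MPIP label with Prevent Modification enabled is preserved in both modes.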
Enable, disable, or delete an AI Classification policy
To enable, disable, or delete an AI Classification policy:
- Navigate to the Admin Console.
- Select Classification.
- Click the name of your AI Classification policy.
- Click either Enable, Disable, or Delete.
An AI Classification policy can be in a disabled state if you saved it as a draft when creating it, or if you disabled it at any time after creating it.
Disabling a classification policy does not remove any classifications that have already been applied. It only stops the policy from being applied until the policy is enabled again.
When you delete an AI Classification policy, Box does not remove classifications that this policy applied to content.
Notes:
- You cannot duplicate an AI Classification policy, as you can only create one policy.
- Once enabled, the classification policy will be in effect for files that are triggered by classification events.
AI Classification user experience
AI Classification details are accessible by:
- Selecting a file within Box.
- Clicking the Details button in the panel on the right-hand side.
Information shows in the Applied by section, where it states the classification was applied by Box AI on a specific date.
A description is shown with the reasoning for why that label was applied. For example: “The file was marked internal only because it contains non-public financial results.”
AI Classification results information
If a classification label is applied to a file by AI Classification, you can view the AI’s reasoning in the side panel as explained above in AI Classification user experience. If no label was applied, you must first make the AI Classification Results information visible to users:
Make AI Classification Results information visible:
- In the Admin Console, select Content.
- Select the Metadata tab.
- Select AI Classification Results.
- In the Visibility setting, disable Hide template from users to make the template visible.
- Click Save.
Viewing AI Classification Results information:
- In the Box web application, select a file.
- Click the Metadata icon next to the right-hand pane.
The AI Classification Results section shows the Box AI Classification Agent reasoning for the decision it made when evaluating the file.
AI Classification policy best practices
Define effective label criteria
To ensure accurate AI Classification, label definitions should be:
- Distinct: Each label should have non-overlapping, clearly differentiated criteria that target a unique set of document characteristics.
- Descriptive: Use plain language to specify:
  - Document types (e.g., contracts, strategy decks, spreadsheets)
  - Topics or intent (e.g., product roadmap, security breach, deal terms)
  - Data types (e.g., PII, source code, financials)
  - Audience (e.g., internal teams, legal)
Avoid:
- Vague descriptors (e.g., "High risk to the company")
- Overlapping labels (e.g., "Confidential" vs. "Highly Confidential")
- Undefined technical jargon
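One rough way to spot overlapping labels before testing them is to compare the wording of their descriptions. The Python sketch below uses Jaccard similarity over words as a heuristic. It is not a Box feature, the function name `overlapping_labels` and the 0.5 threshold are arbitrary assumptions, and shared wording is only a hint that two labels may be hard to tell apart.

```python
import re
from itertools import combinations
from typing import Dict, List, Set, Tuple


def _words(text: str) -> Set[str]:
    """Lowercased set of alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlapping_labels(
    criteria: Dict[str, str], threshold: float = 0.5
) -> List[Tuple[str, str, float]]:
    """Return label pairs whose descriptions share a high fraction of words."""
    flagged = []
    for (label_a, text_a), (label_b, text_b) in combinations(criteria.items(), 2):
        words_a, words_b = _words(text_a), _words(text_b)
        union = words_a | words_b
        score = len(words_a & words_b) / len(union) if union else 0.0
        if score >= threshold:
            flagged.append((label_a, label_b, round(score, 2)))
    return flagged
```

A flagged pair is a prompt to revisit the descriptions, for example by naming the document types or data types that belong to one label and not the other.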
Troubleshooting tips
If AI Classification results are not meeting expectations:
- Use fewer, well-defined labels: Add examples and tighten criteria
- Check for overlap: Ensure labels are clear and unambiguous without overlap and avoid “catch-all” labels
- Ensure the file is a supported file type: View the text and image file types that are supported
Known Limitations
AI Classification returns mixed and sometimes inaccurate results for criteria that include the following conditions or topics:
- Calculations, table structures, and numbers
- Counting words or phrases
- Document metadata such as page number, authors, file size, word count, and collaborators (AI Classification doesn't take into account any of these document components)
- Images, charts, graphs, etc. that are within text documents (it can only analyze image files directly)