Automated classification enables you to automatically apply policy-based security classifications to your sensitive enterprise content. You can configure classification policies to look for matches of defined file types or of specific data types in file content and specify a security classification for the content when the file type is a match or when data type conditions are met.
This topic has the following sections:
- Classification Events
- Classification File Types
- Classification Policy Limits
- Classification Conflicts
- Classification User Experience
- Viewing a Classification Policy
- Creating a Classification Policy
- Editing a Classification Policy
- Duplicating a Classification Policy
- Enabling a Classification Policy
- Disabling a Classification Policy
- Deleting a Classification Policy
- Classification Policy Priority
- Classification Policy Matches
- Classification Policy Examples
Classification Events
Automated classification is triggered when any of the following file events occur in folders specified in classification policies:
- Upload a new file
- Preview a file
- Update (edit) a file
- Download a file
- Move or copy a file
- Invite people to a file
- Create a shared link for an existing file
- Modify the scope of a shared link of a file, such as changing the scope from People in the company to People with the link
- Mark a file version current
- Undelete a file from trash
Classification File Types
The automated classification process recognizes many different data types, also called info types. See the Data Types section in Classification Settings for details. For matching data types, automated classification will scan the text in files with filenames that use any of the following extensions:
- as
- as3
- asm
- bat
- boxnote
- c
- cc
- cmake
- cpp
- cs
- css
- csv
- cxx
- diff
- doc
- docx
- erb
- gdoc
- groovy
- gsheet
- h
- haml
- hh
- htm
- html
- java
- js
- json
- less
- log
- m
- make
- md
- ml
- mm
- msg
- ods
- odt
- php
- pl
- ppt
- pptx
- properties
- py
- rb
- rst
- rtf
- sass
- scala
- scm
- script
- sh
- sml
- sql
- txt
- vi
- vim
- webdoc
- wpd
- xhtml
- xls
- xlsb
- xlsm
- xlsx
- xml
- xsd
- xsl
- yaml
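As an illustration of the extension list above, the following minimal sketch checks whether a filename would be eligible for data type scanning. It is not Box code: the `SCANNABLE_EXTENSIONS` and `is_scannable` names are hypothetical, and the case-insensitive comparison is an assumption, since the text does not say how extension case is handled.

```python
from pathlib import PurePosixPath

# The 68 extensions listed above, copied verbatim.
SCANNABLE_EXTENSIONS = frozenset("""
as as3 asm bat boxnote c cc cmake cpp cs css csv cxx diff doc docx erb gdoc
groovy gsheet h haml hh htm html java js json less log m make md ml mm msg
ods odt php pl ppt pptx properties py rb rst rtf sass scala scm script sh
sml sql txt vi vim webdoc wpd xhtml xls xlsb xlsm xlsx xml xsd xsl yaml
""".split())


def is_scannable(filename: str) -> bool:
    """Return True if the filename's last extension is in the list above.

    Assumption: extensions are compared case-insensitively.
    """
    suffix = PurePosixPath(filename).suffix  # for example ".DOCX", or "" if none
    return suffix[1:].lower() in SCANNABLE_EXTENSIONS
```

For example, `is_scannable("report.docx")` is true, while `is_scannable("photo.png")` is false because `png` is not in the list.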
Note
Automated classification in Box does not support optical character recognition (OCR), so Box cannot extract and consider text embedded in images.
Data type classification supports up to 1 MB of text extraction in a file, which covers the vast majority of files in Box. The amount of text in a file is usually much less than the size of the file. For example, a 20 MB PowerPoint file (.ppt) may have 200 KB of text that can be extracted for evaluation. For text extraction that exceeds 1 MB, Box scans only the first 1 MB and applies the classification policy if any text in the first 1 MB meets the conditions specified in classification policies.
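The effect of the 1 MB limit can be sketched as follows. This is only an illustration under stated assumptions, not how Box implements extraction: the `text_to_scan` function is hypothetical, "1 MB" is read as 1,000,000 bytes, and the text is measured as UTF-8. The source does not specify either detail.

```python
MAX_SCAN_BYTES = 1_000_000  # assumption: "1 MB" read as 10**6 bytes


def text_to_scan(extracted_text: str, limit: int = MAX_SCAN_BYTES) -> str:
    """Return the portion of the extracted text that would be evaluated.

    Assumption: size is measured on the UTF-8 encoding of the text.
    """
    encoded = extracted_text.encode("utf-8")
    if len(encoded) <= limit:
        return extracted_text
    # errors="ignore" drops a multi-byte character cut in half at the boundary.
    return encoded[:limit].decode("utf-8", errors="ignore")
```

The practical consequence is that sensitive data appearing only after the first 1 MB of extracted text does not contribute to a match.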
Classification Policy Limits
The following table describes the limits in classification policies.
Item | Limit |
---|---|
Name | 80 characters |
Description | 255 characters |
# of conditions per group in File Criteria per policy | 10 |
# of groups in File Criteria per policy | 3 |
# of classification policies | 50 |
# of terms per entry | 50 |
# of characters per term | 100 |
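If you draft policies outside the Admin Console before entering them, a small check against the limits in the table can catch problems early. The sketch below is hypothetical throughout: the data model (a policy as a name, a description, and a list of condition groups) and the `limit_violations` function are assumptions made for illustration, and "entry" is assumed to mean one set of custom terms. It does not check the account-wide limit of 50 policies.

```python
from typing import List, Union

# Limits copied from the table above.
MAX_NAME_CHARS = 80
MAX_DESCRIPTION_CHARS = 255
MAX_CONDITIONS_PER_GROUP = 10
MAX_GROUPS_PER_POLICY = 3
MAX_TERMS_PER_ENTRY = 50
MAX_CHARS_PER_TERM = 100

# Hypothetical data model: a condition is either an info type name (str)
# or a custom-terms entry (a list of term strings).
Condition = Union[str, List[str]]


def limit_violations(name: str, description: str,
                     groups: List[List[Condition]]) -> List[str]:
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    if len(name) > MAX_NAME_CHARS:
        problems.append(f"name is longer than {MAX_NAME_CHARS} characters")
    if len(description) > MAX_DESCRIPTION_CHARS:
        problems.append(
            f"description is longer than {MAX_DESCRIPTION_CHARS} characters")
    if len(groups) > MAX_GROUPS_PER_POLICY:
        problems.append(f"more than {MAX_GROUPS_PER_POLICY} condition groups")
    for index, group in enumerate(groups, start=1):
        if len(group) > MAX_CONDITIONS_PER_GROUP:
            problems.append(
                f"group {index} has more than {MAX_CONDITIONS_PER_GROUP} conditions")
        for condition in group:
            if isinstance(condition, list):  # a custom-terms entry
                if len(condition) > MAX_TERMS_PER_ENTRY:
                    problems.append(
                        f"group {index} has an entry with more than "
                        f"{MAX_TERMS_PER_ENTRY} terms")
                if any(len(term) > MAX_CHARS_PER_TERM for term in condition):
                    problems.append(
                        f"group {index} has a term longer than "
                        f"{MAX_CHARS_PER_TERM} characters")
    return problems
```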
Classification Conflicts
Classification conflicts can occur when a file has an existing classification and a file event, such as preview, move, edit, copy, or share (see the complete list above), causes the file's classification to be evaluated by existing classification policies.
A Conflict Handling setting in classification policies allows you to determine on a per-policy level what happens when there is a classification conflict. See the Classification Label section in Classification Settings for details.
Classification User Experience
Box users will have two visual experiences for files that are classified in any way:
- In both List View and Grid View, any file that has a classification applied will have a Classification badge to the right of the file name. A folder that contains any file that has a classification applied will also have a Classification badge to the right of the folder name.
- When previewing a file or folder that has a classification label applied, the label and description are shown in the right-hand sidebar. See Classification Labels for details.
When you hover the mouse pointer over a Classification badge, a card will appear with details about any classifications applied, including:
- Classification name and its corresponding color
- Any security controls associated with the classification
Viewing a Classification Policy
- Go to Admin Console > Classification.
- Click the Classification Policies tab.
- Click a classification policy name.
Creating a Classification Policy
- Go to Admin Console > Classification.
- Click the Classification Policies tab.
- Click Create Policy.
- Enter the information for the classification policy. See the Classification Policy Tab section in the Classification Settings topic for details.
- Click Next.
- Click either Save as Draft or Enable. Enable begins applying the classification policy to all your existing content immediately.
Editing a Classification Policy
- Go to Admin Console > Classification.
- Click the Classification Policies tab.
- Click a classification policy name.
- Make any desired changes. See the Classification Policy Tab section in the Classification Settings topic for details.
- Click Save.
Duplicating a Classification Policy
To reuse the folders or file criteria selected for an existing policy, you can duplicate that policy and use the copy as a starting point. A duplicate policy does not contain the name, description, or applied classification label from the original.
- Go to Admin Console > Classification.
- Click the Classification Policies tab.
- Click a policy name.
- Click Duplicate.
- Enter a name, description, and applied classification label for the new policy, and make any desired changes to the policy criteria. See the Classification Policy Tab section in the Classification Settings topic for details.
- Click Next.
- Click either Save as Draft or Enable.
Enabling a Classification Policy
You can have disabled classification policies if you saved them as drafts when you created them or if you disabled them anytime after creating them.
- Go to Admin Console > Classification.
- Click the Classification Policies tab.
- Click a policy name of a policy with the status of Disabled.
- Click Enable.
Disabling a Classification Policy
Disabling a classification policy does not remove any classifications that have been already applied. It just stops application of the policy until the policy is enabled again.
- Go to Admin Console > Classification.
- Click the Classification Policies tab.
- Click a policy name of a policy with the status of Enabled.
- Click Disable.
Deleting a Classification Policy
When you delete a classification policy, Box does not remove classifications that this policy applied to content.
- Go to Admin Console > Classification.
- Click the Classification Policies tab.
- Click a policy name.
- Click Delete.
Classification Policy Priority
When you have more than one classification policy, Box enables you to prioritize which policies Box executes when conflicts occur among policies. The smaller the Priority value is, the higher the priority.
If conditions for multiple policies are met for the same file, Box applies the policy with the smallest Priority value. For example, Box executes a conflicting policy whose Priority is 2 and does not execute a conflicting policy whose Priority is 3.
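The priority rule described above amounts to picking the matching policy with the smallest Priority value. A minimal sketch, assuming a hypothetical `MatchedPolicy` record (the field names are illustrative and not part of any Box API):

```python
from typing import Iterable, NamedTuple, Optional


class MatchedPolicy(NamedTuple):
    """Hypothetical record for a policy whose conditions a file has met."""
    name: str
    priority: int  # smaller value means higher priority
    label: str     # classification label the policy would apply


def winning_policy(matched: Iterable[MatchedPolicy]) -> Optional[MatchedPolicy]:
    """Return the matched policy with the smallest Priority value, or None."""
    return min(matched, key=lambda policy: policy.priority, default=None)
```

With one matched policy at Priority 2 and another at Priority 3, `winning_policy` returns the Priority 2 policy, which mirrors the example in the text.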
Change Classification Policy Priority
- Go to Admin Console > Classification.
- Click the Classification Policies tab. The Priority column lists, from top to bottom, the order in which Box executes the policies.
- Click Change Priority Order.
- Drag a policy up or down the list to the desired position.
- Click Save.
- In the Confirm Priority Order Change dialog box, click Confirm.
Note
After a new priority order takes effect, Box does not re-classify classified files.
Classification Policy Matches
When you are using classification policies, you can view what terms and info types have been found in your content. However, before that information can be viewed, it must be made visible.
Making Classification Policy Matching Information Visible
- In the Admin Console, click Content.
- Click the Metadata tab.
- Click Classification Policy Matches.
- In the Status drop-down list, select Visible.
- Click Save.
Viewing Classification Policy Matching Information
- In the Box web application, select a file.
- Click the Metadata icon next to the right-hand pane.
The Auto Classification Policy Section lists the number of instances of each type of term and info type found in the file.
Classification Policy Examples
This section describes several examples of classification policy configuration.
Policy With Several Unique Terms and Data Types
This policy has a condition group with 4 conditions: 3 info types and 1 set of custom terms.
- The first condition of the group reads "If there are 1 or more unique U.S. social security numbers found with Low, Medium, or High confidence in this file, this condition is met."
- The second condition of the group reads "If there are between 1 and 10 (inclusive) unique U.S. driver's license numbers found with Medium or High confidence in this file, this condition is met."
- The third condition of the group reads "If there are no more than 5 unique passport numbers found with High confidence in this file, this condition is met."
- The fourth condition of the group, "6 Terms," is treated as one data type in the group. This condition reads "If there are 3 or more unique terms from the list of 6 custom terms found in the file, this condition is met."
- For the group of four conditions, the governing condition "When a file contains the following conditions: Any 2" reads "If two or more conditions in this group are met, the file criteria defined in this condition group are met."
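The logic of this example group can be sketched as follows. The sketch is only a model of the rules as written, not Box's evaluation engine: the `Condition` class, the info type keys, and the shape of the `findings` dictionary are all hypothetical. One further assumption is labeled in the code: the text gives only an upper bound for passport numbers, and this sketch assumes at least one match is also required.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Tuple

# Hypothetical shape: info type -> list of (matched value, confidence level).
Findings = Dict[str, List[Tuple[str, str]]]


@dataclass(frozen=True)
class Condition:
    info_type: str
    min_count: int = 1
    max_count: float = float("inf")
    # None means confidence is ignored (used here for custom terms).
    confidences: Optional[FrozenSet[str]] = None

    def is_met(self, findings: Findings) -> bool:
        # Count unique values found at an accepted confidence level.
        unique = {
            value
            for value, confidence in findings.get(self.info_type, [])
            if self.confidences is None or confidence in self.confidences
        }
        return self.min_count <= len(unique) <= self.max_count


EXAMPLE_GROUP = [
    Condition("us_ssn", min_count=1,
              confidences=frozenset({"low", "medium", "high"})),
    Condition("us_drivers_license", min_count=1, max_count=10,
              confidences=frozenset({"medium", "high"})),
    # Assumption: at least 1 match is required in addition to the
    # "no more than 5" upper bound stated in the text.
    Condition("passport", min_count=1, max_count=5,
              confidences=frozenset({"high"})),
    Condition("custom_terms", min_count=3),
]


def group_is_met(conditions: List[Condition], findings: Findings,
                 required: int = 2) -> bool:
    """'Any N' rule: the group is met when at least `required` conditions are met."""
    return sum(condition.is_met(findings) for condition in conditions) >= required
```

For instance, a file with two unique social security numbers and one driver's license number found with Medium confidence meets two conditions, so the group is met. A file with one social security number and six High-confidence passport numbers meets only one condition, so the group is not met.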
Custom Terms Example
This example shows a set of common terms an enterprise might want to search for in documents to be able to classify those documents appropriately.