- Overview
- Classification text file types
- Classification image file types
- AI Classification policy limits
- Create an AI Classification policy
- Test and iterate
- AI Classification policy settings
- Enable, disable, or delete an AI Classification policy
- AI Classification user experience
- AI Classification results information
Overview
Box AI Classification assesses your content and automatically applies the appropriate classification label. AI Classification can work alongside existing classification policies. For example, you can keep automated classification policies that detect specific information types or file extensions, then use AI Classification to label a broader set of content that isn’t easily identifiable through specific data types or keywords. One AI Classification policy is permitted per enterprise. To classify your content using Box AI, you need to create and enable an AI Classification policy, as described in the sections below.
Classification text file types
AI Classification scans the text in files with any of the following extensions:
| Extensions | Text Extraction Limit |
|---|---|
| as, as3, bat, boxcanvas, boxnote, cmake, css, diff, doc, docx, gdoc, gslide, gslides, haml, htm, html, less, log, make, md, mm, msg, odp, odt, pages, pdf, ppt, pptx, properties, rst, rtf, sass, scm, script, sh, sml, txt, vi, vim, webdoc, wpd, xbd, xdw, xhtml, xml, xsd, xsl | 2MB |
| asm, c, cc, cpp, cs, csv, cxx, erb, groovy, gsheet, h, hh, java, js, json, m, ml, php, pl, py, rb, scala, sql, ods, xls, xlsm, xlsx, yaml | 100KB |
Note: Automated classification in Box does not support optical character recognition (OCR), so Box cannot extract and consider text in scanned PDFs or images embedded in text-based files (for example, images in a PPT).
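
The table above can be read as a lookup from file extension to text extraction limit. The following is a minimal sketch of that lookup, not part of any Box API: the function name `text_extraction_limit` is hypothetical, and it assumes "2MB" and "100KB" are binary units (1024-based), which the table does not specify.

```python
from pathlib import Path
from typing import Optional

# Assumption: the limits are binary units; the source table only says "2MB" and "100KB".
LIMIT_2MB = 2 * 1024 * 1024
LIMIT_100KB = 100 * 1024

# Extension lists copied from the table above.
EXTENSIONS_2MB = set("""
as as3 bat boxcanvas boxnote cmake css diff doc docx gdoc gslide gslides haml
htm html less log make md mm msg odp odt pages pdf ppt pptx properties rst rtf
sass scm script sh sml txt vi vim webdoc wpd xbd xdw xhtml xml xsd xsl
""".split())

EXTENSIONS_100KB = set("""
asm c cc cpp cs csv cxx erb groovy gsheet h hh java js json m ml php pl py rb
scala sql ods xls xlsm xlsx yaml
""".split())


def text_extraction_limit(filename: str) -> Optional[int]:
    """Return the text extraction limit in bytes, or None if the
    extension is not in either list of supported text file types."""
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext in EXTENSIONS_2MB:
        return LIMIT_2MB
    if ext in EXTENSIONS_100KB:
        return LIMIT_100KB
    return None


for name in ("report.PDF", "data.xlsx", "photo.png"):
    print(name, text_extraction_limit(name))
```

A result of `None` only means the extension is not in the text file table; image types are covered separately in the next section.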
Classification image file types
Supported image file types are: ai, bmp, cr2, crw, dng, eps, gif, heic, idml, indt, inx, jpeg, jpg, nef, png, ps, psd, raf, raw, svg, svs, tga, tif, tiff, webp
Unlike traditional OCR, which extracts visible text from an image, AI Classification analyzes the whole image, including text and objects within that image, to determine what the content means, not just what the text says.
Note: AI Classification uses a version of the image that is a maximum of 2048 x 2048 pixels. If the original image was larger, very small or fine details might not be visible, which may affect the classification result.
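
To get a rough sense of how much detail a large image loses, you can estimate the size of the version that AI Classification analyzes. This is a minimal sketch that assumes the image is scaled down proportionally to fit within 2048 x 2048 pixels and is never scaled up; the exact resizing method is not documented here, and `scaled_size` is a hypothetical helper.

```python
MAX_SIDE = 2048  # maximum width and height stated in the note above


def scaled_size(width: int, height: int, max_side: int = MAX_SIDE):
    """Return the (width, height) after fitting the image within
    max_side x max_side. Assumption: aspect ratio is preserved and
    images that already fit are left unchanged."""
    scale = min(1.0, max_side / width, max_side / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


original = (8192, 4096)
analyzed = scaled_size(*original)
print(analyzed)                    # size of the analyzed version
print(analyzed[0] / original[0])   # linear scale factor applied
```

Under these assumptions, an 8192 x 4096 scan is reduced to a quarter of its linear size, so text that was 12 pixels tall in the original would be about 3 pixels tall in the analyzed version.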
AI Classification policy limits
There is a total limit of 25,000 bytes for all criteria combined across labels. Because characters in different languages take up different numbers of bytes, the number of characters that fit within the limit varies by language. The approximate character limits are:
| Language | Characters |
|---|---|
| English | 25,000 |
| Japanese | 8,500 |
| French | 23,000 |
| Chinese | 8,500 |
| Korean | 8,500 |
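
Because the limit is expressed in bytes rather than characters, it can help to measure your criteria before entering them. The sketch below assumes the criteria are counted as UTF-8 bytes, which is consistent with the approximate figures in the table but is not confirmed by this article; the function names and the sample labels are illustrative only.

```python
CRITERIA_BYTE_LIMIT = 25_000  # total limit across all labels, from the text above


def criteria_bytes(criteria_by_label: dict) -> int:
    """Total size of all label criteria, in bytes.
    Assumption: the limit is measured on UTF-8 encoded text."""
    return sum(len(text.encode("utf-8")) for text in criteria_by_label.values())


def remaining_bytes(criteria_by_label: dict) -> int:
    """Bytes left before the combined limit is reached (negative if over)."""
    return CRITERIA_BYTE_LIMIT - criteria_bytes(criteria_by_label)


# Hypothetical labels and criteria, for illustration only.
criteria = {
    "Internal": "Payroll slips, resumes, or policy documents.",
    "Confidential": "契約書、財務報告書",
}
print(criteria_bytes(criteria), remaining_bytes(criteria))
```

In UTF-8, unaccented Latin letters take 1 byte each, accented letters typically take 2, and most Japanese, Chinese, and Korean characters take 3, which is why the character counts in the table differ by language.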
Create an AI Classification policy
Admins, and co-admins with the following permissions, can create, modify, and delete AI Classification policies:
- Create and edit metadata templates for your company
- View Shield Dashboard for your company
- Create, edit, and delete Shield configuration for your company
To create an AI Classification policy:
1. Navigate to the Admin Console.
2. Select Classification.
3. Select Create, then choose AI Classification Policy from the dropdown options.
   Note: This option does not display if you already have an AI Classification policy configured and listed in the Classification policies list.
4. Select a classification label, then enter detailed information about the type of content that should be classified. For example, an internal classification may include content such as payroll slips, resumes, or policy documents.
5. Optionally, test and iterate using up to 10 files.
6. Select the policy setting of Apply to all folders or Only selected folders.
7. Select a conflict handling behavior.
8. Click Next.
9. Click either Save as Draft or Enable. After you select Enable, the classification policy is in effect for files that are triggered by classification events.
Notes:
- There is a limit of 50 classification policies per enterprise ID (EID). AI Classification policies count towards this limit.
- If you have multiple auto-classification policies, the AI Classification policy is given the lowest priority by default. You can change this by reordering the policy priorities.
- One AI Classification policy is permitted per enterprise.
- Content is only scanned prospectively after the policy is enabled.
- Do not perform large-scale migrations or use Shuttle while AI Classification is enabled on your account. If you are interested in scanning large volumes of content, reach out to your Account team.
- Because LLMs are non-deterministic, the Security Classification Agent may not always return the same classification result for the same content.
Test and iterate
By selecting test files, you can confirm that you are seeing the expected classification results and modify the criteria if needed. You can select up to 10 files at a time. Once files are selected, the chosen inputs are used to create a prompt that is sent to the AI to evaluate each test file. Follow the process to create an AI Classification policy up to step 5, then:
- Click Select Files in the Test and iterate section.
- Select up to 10 files to test.
- The test results will display, with a classification applied based on the provided guidance. Reasoning is shown for why the AI chose the label that it did.
If a test file does not return a result, possible causes include:
- The file may no longer exist.
- We are unable to extract text from the file (AI only works on content with extractable text).
- The file is empty.
AI Classification policy settings
Folder criteria
| Setting | Description |
|---|---|
| Apply to all folders | The policy will apply to files in all folders in your enterprise. |
| Only selected folders | The policy will apply to files only in folders you select and in all sub-folders of those folders. To select folders: 1. Click Select Folders. 2. Enter a search term and press Enter. 3. Select one or more folders. 4. Click Save. |
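
The two folder settings amount to a simple scope check: with Only selected folders, a file is in scope when any folder above it, at any depth, is one of the selected folders. The sketch below models that rule with plain folder IDs. It is an illustration of the description in the table, not Box's implementation, and the function and parameter names are hypothetical.

```python
from typing import Iterable, Optional, Set


def in_policy_scope(
    ancestor_folder_ids: Iterable[str],
    selected_folder_ids: Optional[Set[str]] = None,
) -> bool:
    """Return True if a file falls within the policy's folder criteria.

    ancestor_folder_ids: IDs of every folder that contains the file
        (parent, grandparent, and so on).
    selected_folder_ids: None models "Apply to all folders"; a set of
        IDs models "Only selected folders".
    """
    if selected_folder_ids is None:
        return True
    # Sub-folders are included, so any matching ancestor is enough.
    return any(fid in selected_folder_ids for fid in ancestor_folder_ids)


# Hypothetical folder IDs: the file sits in folder "300", inside "200", inside "100".
ancestors = ["300", "200", "100"]
print(in_policy_scope(ancestors))            # Apply to all folders
print(in_policy_scope(ancestors, {"200"}))   # selected folder is an ancestor
print(in_policy_scope(ancestors, {"999"}))   # file is outside the selection
```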
Conflict handling
Determines how the AI Classification policy behaves when content already has a classification label:
- Skip files that already have a classification label (Recommended) - The policy will:
  - Overwrite a classification label that was previously applied by another classification policy
  - Skip files with classification labels applied by a user, by folder cascade, by workflow, or that were applied via Microsoft Purview Information Protection (MPIP) integration from MPIP sensitivity labels
- Overwrite any existing classification label - The policy will overwrite any existing classification label, whether that label was previously applied by a user, by folder cascade, by a workflow, or by a previous policy, except when:
  - The auto-classified label was overridden manually by a user for the latest file version
  - A classification label was applied from an MPIP sensitivity label and the MPIP Prevent Modification setting is enabled
Note: Overwriting existing classification labels cannot easily be undone. It is recommended that you only select Overwrite any existing classification label if you’re confident in the accuracy of your AI Classification guidance.
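
The conflict handling rules can be summarized as a small decision function. The following sketch encodes the two options exactly as described above. The `ExistingLabel` type, the source names, and the mode strings are hypothetical and chosen for illustration; they do not correspond to Box API fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExistingLabel:
    # How the current label was applied:
    # "policy", "user", "folder_cascade", "workflow", or "mpip".
    source: str
    # A user manually overrode an auto-classified label on the latest version.
    user_overrode_auto_label: bool = False
    # The MPIP Prevent Modification setting is enabled for this label.
    mpip_prevent_modification: bool = False


def should_apply(mode: str, existing: Optional[ExistingLabel]) -> bool:
    """Return True if the AI Classification policy would set its label.

    mode is "skip" (Skip files that already have a classification label)
    or "overwrite" (Overwrite any existing classification label).
    """
    if existing is None:
        return True  # no conflict: the file has no label yet
    if mode == "skip":
        # Only labels applied by another classification policy are overwritten.
        return existing.source == "policy"
    if mode == "overwrite":
        if existing.user_overrode_auto_label:
            return False
        if existing.source == "mpip" and existing.mpip_prevent_modification:
            return False
        return True
    raise ValueError(f"unknown conflict handling mode: {mode!r}")


print(should_apply("skip", ExistingLabel(source="user")))
print(should_apply("overwrite", ExistingLabel(source="user")))
```

Walking through the cases this way shows why the skip option is the safer default: it never replaces a label that a person, a workflow, a folder cascade, or MPIP put in place.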
Enable, disable, or delete an AI Classification policy
To enable, disable, or delete an AI Classification policy:
1. Navigate to the Admin Console.
2. Select Classification.
3. Click the name of your AI Classification policy.
4. Click either Enable, Disable, or Delete.
Notes:
- You cannot duplicate an AI Classification policy, as you can only create one policy.
- Once enabled, the classification policy will be in effect for files that are triggered by classification events.
AI Classification user experience
To access AI Classification details:
1. Select a file within Box.
2. Click the Details button in the panel on the right-hand side.
AI Classification results information
If AI Classification applies a classification label to a file, you can view the AI’s reasoning in the side panel, as explained in AI Classification user experience. If no label was applied, you must first make this information visible.
To make AI Classification Results information visible:
1. In the Admin Console, select Content.
2. Select the Metadata tab.
3. Select AI Classification Results.
4. In the Visibility setting, disable Hide template from users to make the template visible.
5. Click Save.
6. In the Box web application, select a file.
7. Click the Metadata icon next to the right-hand pane.