# AI Classification Prompts - Best Practices

[*Shield AI Classification*](/en/box-shield/shield-classification-labels-and-policies/ai-classification) *is available only as part of the Shield Pro add-on.*

AI Classification assesses your content and automatically applies the appropriate classification label. This guide explains how to write effective label prompts so that content is classified quickly and accurately.

For setup and feature details, see [AI Classification](/en/box-shield/shield-classification-labels-and-policies/ai-classification).

<h2 id="how-ai-classification-works">
  How AI Classification works
</h2>

The Security Classification Agent:

* Reads your label definitions (prompts)
* Evaluates each file against all labels in a policy
* Applies the single best-fit label, or **none** if no definition is met confidently
* Displays applied labels and reasoning to end users in the file sidebar under Additional Details
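
The "single best-fit label, or none" behavior can be pictured with a minimal sketch. It assumes each label receives a numeric confidence score and that a fixed cutoff decides whether any label is applied. Both the scores and the `CONFIDENCE_THRESHOLD` value are hypothetical, chosen only for illustration; they are not documented details of the Security Classification Agent.

```python
from typing import Optional

# Hypothetical cutoff for illustration only; the agent's real scoring is not documented here.
CONFIDENCE_THRESHOLD = 0.7


def pick_label(scores: dict[str, float],
               threshold: float = CONFIDENCE_THRESHOLD) -> Optional[str]:
    """Return the single best-scoring label, or None if no label is confident enough."""
    if not scores:
        return None
    best_label = max(scores, key=lambda name: scores[name])
    return best_label if scores[best_label] >= threshold else None


# One label clearly fits, so it is applied.
print(pick_label({"Confidential": 0.91, "Restricted": 0.40}))  # Confidential
# No label is met confidently, so none is applied.
print(pick_label({"Confidential": 0.35, "Restricted": 0.40}))  # None
```

The takeaway for prompt writing is that a file ends up with at most one label, so definitions that compete for the same documents make the outcome less predictable.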

<h2 id="configuration-recommendations">
  Configuration recommendations
</h2>

The following settings work well in most situations:

* Consider allowing end users to modify classifications using [Classification Modification Permissions](/en/box-shield/shield-classification-labels-and-policies/classification-modification-permissions)
* Set conflict handling to **Skip**, especially if users can apply or modify labels
* Use 1 to 3 labels, the range in which AI Classification performs best
  * As with a human reviewer, too many labels increase the likelihood of inconsistent classifications
  * Not every label needs to be applied by AI. For example, you may want only humans to apply *Public*

<h2 id="ai-classification-policy-best-practices">
  AI Classification policy best practices
</h2>

#### Define effective label criteria

To ensure accurate AI Classification, label definitions need to be:

* **Distinct**: Each label needs non-overlapping, clearly differentiated criteria that target a unique set of document characteristics.
* **Descriptive**: Use plain language to specify:
  * Document types (e.g., contracts, strategy decks, spreadsheets)
  * Topics or intent (e.g., product roadmap, security breach, deal terms)
  * Data types (e.g., PII, source code, financials)
  * Audience (e.g., internal teams, legal)

**Avoid**:

* Vague descriptors (e.g., "High risk to the company")
* Overlapping labels (e.g., "Confidential" vs. "Highly Confidential")
* Undefined technical jargon
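
Before saving label definitions, a quick offline check can catch some of the problems listed above. The following is a minimal sketch, not a Box feature: the `VAGUE_PHRASES` list, the stopword list, and the `max_overlap` cutoff are all assumptions for illustration, and shared-word overlap is only a rough proxy for semantic overlap between labels.

```python
import re
from itertools import combinations

# Hypothetical list of vague phrases; extend it with wording your reviewers flag.
VAGUE_PHRASES = ("high risk", "sensitive information", "important")

# Common words ignored when comparing definitions (illustrative, not exhaustive).
STOPWORDS = {"a", "an", "and", "the", "or", "of", "to", "for", "in", "that", "is", "are", "with"}


def keywords(definition: str) -> set[str]:
    """Return the distinct lowercase words in a definition, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", definition.lower()) if w not in STOPWORDS}


def overlap(a: str, b: str) -> float:
    """Jaccard similarity of keyword sets: 0.0 means no shared words, 1.0 means identical sets."""
    ka, kb = keywords(a), keywords(b)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def review_labels(labels: dict[str, str], max_overlap: float = 0.5) -> list[str]:
    """Collect warnings about vague wording and heavily overlapping definitions."""
    warnings = []
    for name, definition in labels.items():
        for phrase in VAGUE_PHRASES:
            if phrase in definition.lower():
                warnings.append(f"{name}: vague phrase '{phrase}'")
    for (n1, d1), (n2, d2) in combinations(labels.items(), 2):
        score = overlap(d1, d2)
        if score > max_overlap:
            warnings.append(f"{n1} / {n2}: definitions overlap ({score:.2f})")
    return warnings


labels = {
    "Confidential": "Documents that pose a high risk to the company",
    "Highly Confidential": "Documents that pose a very high risk to the company",
    "Source Code": "Source code files and software design documents for internal engineering teams",
}
for warning in review_labels(labels):
    print(warning)
```

With these sample definitions, the sketch flags the vague phrase in both "Confidential" labels and reports that the two definitions overlap, while the more descriptive "Source Code" definition passes.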

#### Troubleshooting tips

If AI Classification results are not meeting expectations:

* **Use fewer, well-defined labels**: Add examples and tighten criteria
* **Check for overlap**: Ensure labels are clear, unambiguous, and non-overlapping, and avoid “catch-all” labels
* **Ensure the file is a supported file type**: View the [text](/en/box-shield/shield-classification-labels-and-policies/ai-classification#classification-text-file-types) and [image](/en/box-shield/shield-classification-labels-and-policies/ai-classification#classification-image-file-types) file types that are supported

#### Known limitations

AI Classification can return mixed and sometimes inaccurate results for criteria that involve the following conditions or topics:

* Calculations, table structures, and numbers
* Counting words or phrases
* Document metadata such as page numbers, authors, file size, word count, and collaborators (AI Classification does not take any of these into account)
* Images, charts, and graphs embedded in text documents (images are analyzed only when the file itself is an image)
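
One way to catch criteria that depend on these weak spots is to scan the label text before saving it. The sketch below is illustrative only: the `LIMITATION_PATTERNS` keyword patterns are assumptions mapped loosely onto the categories listed above, and a match means "review this wording", not "this will fail".

```python
import re

# Illustrative patterns for the limitation categories listed above; tune for your own wording.
LIMITATION_PATTERNS = {
    "calculations or numeric reasoning": r"\b(sum|total|average|calculate|greater than|less than|more than)\b",
    "counting words or phrases": r"\b(number of times|occurrences?|appears? at least)\b",
    "document metadata": r"\b(page numbers?|authors?|file size|word count|collaborators?)\b",
    "embedded images or charts": r"\b(charts?|graphs?|embedded images?|screenshots?)\b",
}


def check_criteria(criteria: str) -> list[str]:
    """Return the limitation categories that a label's criteria appear to rely on."""
    text = criteria.lower()
    return [
        category
        for category, pattern in LIMITATION_PATTERNS.items()
        if re.search(pattern, text)
    ]


risky = "Apply if the file has more than 10 pages and the author is in Legal"
print(check_criteria(risky))

safe = "Contracts and deal terms shared with legal"
print(check_criteria(safe))
```

The first criteria string is flagged for numeric reasoning and metadata, while the second, which describes document type and audience, is not flagged.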

<h2 id="using-ai-to-improve-your-prompts">
  Using AI to improve your prompts
</h2>

<Note>
  **Note:**

  The guidance in this section reflects general best practices intended to help guide prompt design and usage. Results may vary depending on your specific use case, data, and configuration. These recommendations are intended as guidance only and may not produce consistent or expected results in all scenarios.
</Note>

Use Box AI to help refine your existing classification label definitions into LLM-friendly criteria:

1. In Box, open the document that contains your existing definitions.
2. Select **Box AI** from the right-hand sidebar or the top navigation bar.
3. Use the example prompt below to rewrite each label definition into clear, descriptive criteria optimized for AI-based classification.
4. Copy the output into the relevant [classification label](/en/box-shield/shield-classification-labels-and-policies/classification-labels).

#### Example prompt

Please rewrite each **data classification label definition** so it is LLM-friendly, clear, and semantically precise, suitable for use as label criteria in an AI-based content classification system.

When rewriting the criteria:

* Focus only on what types of documents or content belong in each label.
* Use plain, descriptive language that reflects document types, topics, data sensitivity, and intended audience.
* Make each label distinct and clearly differentiated from the others.
* Prefer concise paragraphs or short bullet lists.

Do not include:

* System or instructional language (for example: “you should classify,” “evaluate,” or “only apply if”).
* Decision logic, prioritization rules, or conflict-resolution guidance.
* Any description of how an AI model should behave.

Output format:

* Label name
* Refined label criteria

Do not include explanations or additional commentary.

<h2 id="tips-tricks">
  Tips & tricks
</h2>

#### Exclusion criteria / negative examples

Explicitly exclude common false positives.\
**Example:** “Not restricted if data is anonymized or aggregated and cannot be linked to an individual (for example, ‘Average customer income by age group’).”

#### Label prioritization (tie breakers)

Define which label wins if multiple criteria are met.\
**Example:** “If criteria for both *Internal* and *Sensitive* are met, classify as *Sensitive*.”

#### Default label logic

Use one label as a fallback.\
**Example:** “Apply *Internal Only* if no other label criteria are met.”
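
The outcome that tie-breaker and default-label wording is meant to produce can be modeled in a few lines. This is a sketch of the intended decision, not of how Box evaluates prompts: the `PRIORITY` order and the `DEFAULT_LABEL` name are hypothetical and would come from your own policy.

```python
from typing import Iterable

# Hypothetical label names, ordered from highest to lowest priority.
PRIORITY = ["Restricted", "Sensitive", "Internal"]

# Hypothetical fallback applied when no other label's criteria are met.
DEFAULT_LABEL = "Internal Only"


def resolve(matched: Iterable[str],
            priority: list[str] = PRIORITY,
            default: str = DEFAULT_LABEL) -> str:
    """Pick the highest-priority matched label, or the default if nothing matched."""
    matched_set = set(matched)
    for label in priority:
        if label in matched_set:
            return label
    return default


print(resolve(["Internal", "Sensitive"]))  # Sensitive
print(resolve([]))                         # Internal Only
```

Writing the same order out in plain language in the label prompts, as in the examples above, leaves less for the model to infer.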

#### Time sensitivity

If dates matter, explicitly state they should be evaluated against **today’s** date. Don’t assume the model infers the comparison.\
**Example:** “Only classify as Restricted if the date in the file is after January 1, 2023. Compare against today’s date.”

<h2 id="example-label-prompts">
  Example label prompts
</h2>

<Note>
  **Note:**

  These example prompts are provided for demonstration purposes only and do not constitute legal or compliance advice. Actual requirements and outcomes may vary based on your organization's policies, use cases, and regulatory obligations.
</Note>

#### Confidential Data

Includes:

* **Business records**: audit findings, management and board reports, strategic presentations, incident response materials, third-party risk documentation (e.g., SOC reports)
* **Operational data**: KPI reports, productivity metrics, security logs, meeting transcripts or recordings
* **Employee information**: performance reviews, disciplinary actions
* **Masked or anonymized data**: masked PII (e.g., last 4 digits of SSN), anonymized or aggregated NPI

#### Restricted Data

Includes:

* **Corporate information**: non-public strategies, pre-announcement M\&A data, legal or regulatory investigations
* **PII**: SSNs, driver’s license numbers, passport numbers, payment card numbers, medical records, biometric data (not restricted if masked or truncated and cannot be reconstructed)
* **NPI**: bank account numbers, balances, DOB, income or salary data, credit reports, transaction history, loan or insurance data\
  (not restricted if anonymized or aggregated)

