Overview

This policy keeps AI interactions within acceptable boundaries by preventing discussion of predefined sensitive or high-risk topics. It applies to both user inputs and AI-generated responses, monitoring them for the specified topics and enforcing restrictions in line with organisational guidelines and ethical standards.

The Sensitive Topic Policy enables organisations to monitor and control interactions with a Large Language Model (LLM) that involve predefined sensitive subjects. Unlike the Restrict to Topic Policy, which accepts both valid and invalid topic lists, this policy works from a single list of user-selected sensitive topics. It helps prevent discussions that could breach company guidelines, violate ethical standards, or trigger reputational or regulatory risks. This policy is ideal for organisations that need to flag or block conversations around specific topics (such as mental health, self-harm, violence, or other high-risk content areas) without defining a list of valid alternatives.


What the Policy Does

Purpose

The Sensitive Topic Policy is designed to help organisations maintain responsible AI use by regulating how and when sensitive subjects are discussed. It supports ethical content moderation, safeguards users from harmful interactions, and assists in meeting regulatory and reputational obligations. Whether your goal is to flag such content or block it entirely, the policy gives you direct control over which sensitive topics the AI can engage with.

Scope

Sensitive Topic Control

Organisations can configure a custom list of Sensitive Topics to regulate the AI’s behaviour:

  • Topics are user-defined and entirely customisable to reflect organisational, cultural, legal, or ethical concerns.
  • There is no option to define valid or invalid topics; only a list of sensitive subjects is required.
  • In addition to entering your own topics, you can choose from a default list of predefined sensitive topics:
    • Sexual Harassment
    • Bullying & Harassment
    • Child Exploitation
    • Trafficking & Exploitation
    • Terrorism & Extremism
    • Cyberbullying
    • Privacy Violations
    • Substance Abuse
    • Body Shaming
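
As a rough illustration of how a custom list and the default list might be combined, the sketch below merges both into a single topic list. The function name `build_topic_list` and the de-duplication behaviour are assumptions made for this example; they do not describe the platform's actual implementation or API.

```python
# Hypothetical sketch: names and behaviour are illustrative, not a documented API.
DEFAULT_SENSITIVE_TOPICS = (
    "Sexual Harassment",
    "Bullying & Harassment",
    "Child Exploitation",
    "Trafficking & Exploitation",
    "Terrorism & Extremism",
    "Cyberbullying",
    "Privacy Violations",
    "Substance Abuse",
    "Body Shaming",
)


def build_topic_list(selected_defaults, custom_topics):
    """Merge selected default topics with custom ones.

    Blank entries and case-insensitive duplicates are dropped; order is kept.
    """
    unknown = [t for t in selected_defaults if t not in DEFAULT_SENSITIVE_TOPICS]
    if unknown:
        raise ValueError(f"Not in the default list: {unknown}")

    seen = set()
    topics = []
    for topic in list(selected_defaults) + list(custom_topics):
        cleaned = topic.strip()
        key = cleaned.casefold()
        if cleaned and key not in seen:
            seen.add(key)
            topics.append(cleaned)
    return topics


topics = build_topic_list(
    selected_defaults=["Substance Abuse", "Cyberbullying"],
    custom_topics=["self-harm", "Self-Harm", "  eating disorders "],
)
print(topics)
```

Treating "self-harm" and "Self-Harm" as the same entry is a design choice of this sketch; the platform may handle casing differently.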

Prompt & Response Configuration

The policy can be applied independently to:

  • Prompts (inputs provided by users)
  • Responses (outputs generated by the LLM)

Each can be enabled or disabled based on business needs, allowing for flexible implementation.

Operational Modes

  • Log Only: Flags any prompts or responses containing sensitive topics, but allows interactions to proceed.
  • Log and Override: Logs and blocks any prompts or responses containing sensitive topics, for strict enforcement.
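
The difference between the two modes can be sketched as a small decision function. The `Mode` enum, the `Outcome` type, and `apply_mode` are hypothetical names introduced for this example only, and detection itself is assumed to have happened elsewhere.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    # Values mirror the mode names used in this document.
    LOG_ONLY = "Log Only"
    LOG_AND_OVERRIDE = "Log and Override"


@dataclass
class Outcome:
    logged: bool   # a detection record is written
    blocked: bool  # the interaction is stopped


def apply_mode(mode: Mode, topic_detected: bool) -> Outcome:
    """Hypothetical decision logic: both modes log, only one blocks."""
    if not topic_detected:
        return Outcome(logged=False, blocked=False)
    return Outcome(logged=True, blocked=mode is Mode.LOG_AND_OVERRIDE)


print(apply_mode(Mode.LOG_ONLY, topic_detected=True))
print(apply_mode(Mode.LOG_AND_OVERRIDE, topic_detected=True))
```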

Threshold Sensitivity

Organisations can set the detection sensitivity between 0.2 and 0.9:

  • Lower values (e.g., 0.2) allow more leeway, reducing false positives.
  • Higher values (e.g., 0.9) enable strict detection and enforcement.
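
If threshold values are prepared outside the interface (for example in a review checklist or script), a simple range check can catch out-of-range settings early. The helper below is a hypothetical sketch; only the 0.2 to 0.9 bounds come from this document.

```python
# Hypothetical helper; the bounds reflect the documented 0.2 to 0.9 range.
MIN_THRESHOLD = 0.2
MAX_THRESHOLD = 0.9


def validate_threshold(value: float) -> float:
    """Return the value unchanged if it is within the supported range, else raise."""
    if not MIN_THRESHOLD <= value <= MAX_THRESHOLD:
        raise ValueError(
            f"Threshold {value} is outside the supported range "
            f"{MIN_THRESHOLD} to {MAX_THRESHOLD}"
        )
    return value


print(validate_threshold(0.9))
```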

Key Features

  • Customisable Sensitive Topic List: Define a unique list of sensitive topics relevant to your organisation.
  • Default Topic Library: Quickly select from a set of commonly flagged sensitive topics.
  • Independent Prompt & Response Filtering: Apply controls where they are most needed.
  • Flexible Enforcement Options: Choose between passive logging or active blocking.
  • Threshold-Based Sensitivity: Tune how strictly sensitive topics are detected, from 0.2 to 0.9.
  • Comprehensive Logging: All detections are recorded for auditing, training, or review.

Why Use This Policy?

Benefits

  • Promotes ethical and safe use of AI within your organisation.
  • Helps mitigate reputational, legal, and psychological risks.
  • Ensures sensitive content is managed proactively and consistently.
  • Enables detailed monitoring for transparency and compliance.

Use Case: Healthcare Organisation Safeguarding

Scenario

A national healthcare provider uses AltrumAI to assist with patient education, internal research, and support services. Leadership is concerned about the AI inadvertently engaging with sensitive mental health topics or crisis-related content without appropriate safeguards.

Challenge

To ensure patient safety and compliance with ethical and regulatory frameworks, the provider must prevent AI interactions that touch on topics like self-harm, abuse, or medical misinformation—especially in unsupervised or automated settings.

Solution: Implementing the Sensitive Topic Policy

  1. Sensitive Topic Configuration

    • The organisation defines a list of high-risk subjects, such as self-harm, eating disorders, addiction, and abuse.
    • They also select relevant topics from the platform’s predefined list for quick deployment.
  2. Prompt & Response Filtering

    • Prompt Filtering: Prevents users from inputting content that initiates sensitive discussions.
    • Response Filtering: Stops the AI from producing responses that engage with sensitive subjects.
  3. Enforcement Mode & Sensitivity

    • Set to Log and Override to fully block interactions involving high-risk topics.
    • Sensitivity is adjusted to 0.9 for maximum protection.
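
For review or documentation purposes, the settings chosen in this scenario could be summarised as a plain data structure. The field names below are assumptions made for this sketch and do not reflect an official configuration format; the predefined topics the organisation selects are not specified in the scenario, so only the custom topics are listed.

```python
import json

# Hypothetical representation of the scenario's settings; field names are illustrative only.
healthcare_policy = {
    "policy": "Sensitive Topic",
    "enabled": True,
    "sensitive_topics": ["self-harm", "eating disorders", "addiction", "abuse"],
    "apply_to": ["prompt", "response"],
    "behaviour": "Log and Override",
    "threshold": 0.9,
}

print(json.dumps(healthcare_policy, indent=2))
```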

How to Use the Policy

Note: The following steps guide you through configuring the Sensitive Topic Policy using the policy workflow interface.

Step 1: Navigate to the Policy Workflow

  1. From the Dashboard, open your project to access the Project Overview.
  2. In the Policy section, click Edit Policy to begin the policy configuration workflow.

Step 2: Select and Enable the Sensitive Topic Policy

  1. In the Configure Policies tab, click on Sensitive Topic from the list.
  2. The configuration panel will appear on the right-hand side.
  3. Toggle Enable Policy to ON to begin editing.

Step 3: Add Sensitive Topics

  1. In the Sensitive Topics input field, type the topics you want to monitor (e.g., “self-harm”, “terrorism”, “bullying”).
  2. Press Enter after each entry to add it to the list.
  3. Each topic will appear as a tag beneath the field. Tags can be removed by clicking the X next to them.

Step 4: Set Application Scope

  1. Under Apply Policy To, select where this policy should be enforced:
    • Prompt – Monitor user input only.
    • Response – Monitor AI output only.
    • Both – Enforce across both input and output directions.
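
The effect of the three scope options can be sketched as a lookup that decides whether text travelling in a given direction is checked. `Scope` and `is_checked` are hypothetical names used only for illustration.

```python
from enum import Enum


class Scope(Enum):
    # Values mirror the options shown under Apply Policy To.
    PROMPT = "Prompt"
    RESPONSE = "Response"
    BOTH = "Both"


def is_checked(scope: Scope, direction: str) -> bool:
    """Return True if text in the given direction ("prompt" or "response") is monitored."""
    if direction not in ("prompt", "response"):
        raise ValueError(f"Unknown direction: {direction!r}")
    if scope is Scope.BOTH:
        return True
    return scope.value.lower() == direction


for scope in Scope:
    print(scope.value, is_checked(scope, "prompt"), is_checked(scope, "response"))
```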

Step 5: Configure Enforcement Behaviour

  1. Under Behaviour, choose how violations should be handled:
    • Log Only – Log sensitive topic occurrences without blocking.
    • Log and Override – Block interaction and return a custom smart response.

Step 6: Adjust Detection Threshold

  1. Use the Threshold Slider to set how strictly the system should identify sensitive topics:
    • A lower value (e.g., 0.2) allows more leeway, reducing false positives.
    • A higher value (e.g., 0.9) enforces stricter detection and enforcement.

Step 7: Save, Test, and Apply the Policy

  1. Click Save Changes to store your configuration settings.
  2. (Optional) Go to the Test Policies tab to evaluate policy performance in a live testing environment.
  3. Return to Configure Policies and click Apply Policies to activate the policy.
  4. A confirmation message appears once the policy has been applied successfully.

The Sensitive Topic Policy helps your organisation proactively detect and manage harmful or inappropriate topics, supporting responsible AI usage across sensitive subject areas.