Human-in-the-Loop (HITL)

New in v0.9

Enforce human oversight for high-risk AI agent actions. Require approval for sensitive operations and prevent automation abuse.

Overview

HITL (Human-in-the-Loop) enforcement ensures that humans maintain oversight over AI agent actions. It provides approval workflows for sensitive operations and protection against automation abuse attacks.

Approval Workflows

Require human approval for sensitive or high-risk actions

Velocity Limits

Prevent approval fatigue by limiting approval rates

Slicing Detection

Detect attempts to bypass controls through multiple small requests

Cool-down Periods

Enforce breaks after high activity to prevent burnout

Security Consideration
HITL is designed to prevent "slicing attacks" where an attacker breaks a large harmful action into many small, individually harmless requests that collectively bypass controls.

Request Approval

Request human approval before executing a high-risk action:

python
from tork import TorkClient, HITLEnforcement

client = TorkClient(api_key="your_key")
hitl = HITLEnforcement(client)

# Request approval for a high-risk action
request = hitl.request_approval(
    agent_id="agent-1",
    action_type="delete_user_data",
    description="Delete all data for user-123",
    payload={
        "user_id": "user-123",
        "data_types": ["profile", "history", "preferences"]
    },
    risk_level="critical",  # 'low', 'medium', 'high', 'critical'
    timeout_minutes=60,  # Auto-expire after 60 minutes
    approvers=["admin@company.com"]  # Optional: specific approvers
)

print(f"Request ID: {request['id']}")
print(f"Status: {request['status']}")  # 'pending'
print(f"Expires: {request['expiresAt']}")

Check Approval Status

Poll for the approval decision or use webhooks for real-time notifications:

python
import time

# Poll for decision
while True:
    status = hitl.check_status(request['id'])
    state = status['status']

    if state == 'approved':
        print(f"Approved by: {status['decidedBy']}")
        print(f"Reason: {status['decisionReason']}")
        # Proceed with the action (delete_user_data is your own application code)
        delete_user_data(user_id="user-123")
        break

    elif state == 'rejected':
        print(f"Rejected by: {status['decidedBy']}")
        print(f"Reason: {status['decisionReason']}")
        # Handle rejection (notify_requester is your own application code)
        notify_requester("Your request was rejected")
        break

    elif state == 'expired':
        print("Request expired without decision")
        break

    elif state == 'pending':
        print("Still waiting for approval...")
        time.sleep(30)  # Wait 30 seconds before checking again

    else:
        # Fail loudly on an unrecognized status instead of looping without a delay
        raise RuntimeError(f"Unexpected status: {state!r}")
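The loop above runs until the request reaches a final status. A client-side deadline can be layered on top, as in the minimal sketch below. It assumes only that the status checker returns a dict with a 'status' key, as in the example above. The helper name `wait_for_decision`, its parameters, and the `timed_out_locally` marker are hypothetical and not part of the Tork API.

```python
import time
from typing import Callable, Dict

# Statuses after which polling should stop (taken from the example above).
TERMINAL_STATUSES = {"approved", "rejected", "expired"}


def wait_for_decision(
    check_status: Callable[[str], Dict],
    request_id: str,
    poll_seconds: float = 30.0,
    max_wait_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll until the request reaches a terminal status or the local deadline passes.

    check_status is any callable returning a dict with a 'status' key,
    for example hitl.check_status. Hypothetical helper, not a Tork function.
    """
    waited = 0.0
    while True:
        status = check_status(request_id)
        if status["status"] in TERMINAL_STATUSES:
            return status
        if waited >= max_wait_seconds:
            # Give up locally; the request itself may still be pending server-side.
            return {"status": "timed_out_locally", "last": status}
        sleep(poll_seconds)
        waited += poll_seconds


# Usage with a stand-in checker that approves on the third poll.
_responses = iter([
    {"status": "pending"},
    {"status": "pending"},
    {"status": "approved", "decidedBy": "admin@company.com"},
])
result = wait_for_decision(lambda _id: next(_responses), "req-1", sleep=lambda s: None)
print(result["status"])  # approved
```

Passing `sleep` as a parameter keeps the helper testable without real delays. In production code the default `time.sleep` would be used.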

Slicing Attack Detection

Slicing attacks attempt to bypass HITL controls by:

  • Making many small requests that individually seem harmless
  • Submitting rapid-fire approval requests to exhaust the approver
  • Aggregating value across multiple requests to exceed limits

python
# Detect slicing attacks
result = hitl.detect_slicing_attack(
    agent_id="agent-1",
    approver_id="approver-1",
    time_window_minutes=60
)

if result['attackDetected']:
    print("Slicing attack detected!")
    print(f"Confidence: {result['confidence']}")

    for alert in result['alerts']:
        print(f"  Alert: {alert['type']}")
        print(f"  Description: {alert['description']}")
        print(f"  Evidence: {alert['evidence']}")

    # Take protective action
    hitl.pause_approvals(agent_id="agent-1", duration_minutes=30)
else:
    print("No attack patterns detected")

Attack Types Detected
Tork detects several attack patterns: velocity abuse, value aggregation, temporal clustering, approver fatigue, and pattern repetition.
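To make the value aggregation pattern concrete, here is a minimal sketch of a sliding-window check that sums the value of recent requests. It illustrates the general idea only. The class `AggregateValueMonitor`, the limit of 1000, and the 60-minute window are assumptions for this example and say nothing about how Tork implements detection internally.

```python
from collections import deque
from datetime import datetime, timedelta
from typing import Deque, Tuple


class AggregateValueMonitor:
    """Flags when the summed value of recent requests exceeds a limit (illustrative only)."""

    def __init__(self, limit: float, window: timedelta):
        self.limit = limit
        self.window = window
        self._events: Deque[Tuple[datetime, float]] = deque()
        self._total = 0.0

    def record(self, when: datetime, value: float) -> bool:
        """Record one request; return True if the window total now exceeds the limit."""
        self._events.append((when, value))
        self._total += value
        # Drop requests that have fallen out of the time window.
        cutoff = when - self.window
        while self._events and self._events[0][0] <= cutoff:
            _, old_value = self._events.popleft()
            self._total -= old_value
        return self._total > self.limit


monitor = AggregateValueMonitor(limit=1000.0, window=timedelta(minutes=60))
start = datetime(2024, 1, 1, 12, 0)
# Six requests of 200 each, five minutes apart: each is small, the sum is not.
flags = [monitor.record(start + timedelta(minutes=5 * i), 200.0) for i in range(6)]
print(flags)  # [False, False, False, False, False, True]
```

Each request stays far below the limit on its own, but the sixth pushes the one-hour total to 1200, which is the situation a slicing attack relies on.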

Velocity Limits

Velocity limits prevent approvers from being overwhelmed with too many requests:

python
# Check if approver is within velocity limits
velocity = hitl.check_velocity(
    agent_id="agent-1",
    approver_id="approver-1"
)

print(f"Approvals this hour: {velocity['currentCount']}/{velocity['maxApprovals']}")
print(f"Approvals today: {velocity['dailyCount']}/{velocity['dailyMax']}")
print(f"Can approve: {velocity['allowed']}")

if not velocity['allowed']:
    print(f"Reason: {velocity['reason']}")
    print(f"Reset at: {velocity['resetsAt']}")
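For intuition, the hourly and daily counts can be pictured as trailing-window counts over an approver's approval timestamps. The sketch below is a local illustration under that assumption. The function `evaluate_velocity` is hypothetical, the result keys simply mirror the fields printed above, and the actual server-side windowing may differ.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def evaluate_velocity(
    approval_times: List[datetime],
    now: datetime,
    per_hour: int = 10,
    per_day: int = 50,
) -> Dict:
    """Count approvals in the trailing hour and day and compare them to the limits."""
    hourly = sum(1 for t in approval_times if now - timedelta(hours=1) < t <= now)
    daily = sum(1 for t in approval_times if now - timedelta(days=1) < t <= now)
    if hourly >= per_hour:
        reason = "hourly limit reached"
    elif daily >= per_day:
        reason = "daily limit reached"
    else:
        reason = None
    return {
        "currentCount": hourly,
        "dailyCount": daily,
        "allowed": reason is None,
        "reason": reason,
    }


now = datetime(2024, 1, 1, 12, 0)
# Ten approvals within the last 45 minutes: the hourly limit of 10 is reached.
recent = [now - timedelta(minutes=5 * i) for i in range(10)]
print(evaluate_velocity(recent, now)["allowed"])      # False
print(evaluate_velocity(recent[:3], now)["allowed"])  # True
```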

Cool-down Periods

Cool-down periods are triggered automatically after high approval activity, and can also be triggered manually:

python
# Check if in cool-down
cooldown = hitl.is_in_cooldown(
    agent_id="agent-1",
    approver_id="approver-1"
)

if cooldown['active']:
    print(f"In cool-down until: {cooldown['endsAt']}")
    print(f"Reason: {cooldown['reason']}")
    print(f"Triggered by: {cooldown['trigger']}")

    # Wait for cool-down or escalate
    # (is_urgent and escalate_to_manager are placeholders for your own logic)
    if is_urgent:
        escalate_to_manager(request)
else:
    print("No cool-down active, can proceed")

# Manually trigger cool-down if needed
hitl.trigger_cooldown(
    agent_id="agent-1",
    approver_id="approver-1",
    duration_minutes=30,
    reason="Manual security review"
)
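As a rough illustration of an automatic trigger, the sketch below treats a burst of approvals inside a short window as the condition that starts a cool-down. The defaults (5 approvals within 15 minutes, 30-minute cool-down) are taken from the comments in the configuration example on this page. The function `cooldown_ends_at` and its exact burst logic are assumptions, not Tork's implementation.

```python
from datetime import datetime, timedelta
from typing import List, Optional


def cooldown_ends_at(
    approval_times: List[datetime],
    threshold: int = 5,
    burst_window: timedelta = timedelta(minutes=15),
    duration: timedelta = timedelta(minutes=30),
) -> Optional[datetime]:
    """Return when the most recent cool-down would end, or None if no burst occurred.

    A burst is `threshold` approvals falling within `burst_window`.
    The cool-down starts at the approval that completes the burst.
    """
    times = sorted(approval_times)
    ends_at = None
    for i in range(threshold - 1, len(times)):
        # times[i - threshold + 1] through times[i] are `threshold` consecutive approvals.
        if times[i] - times[i - threshold + 1] <= burst_window:
            ends_at = times[i] + duration
    return ends_at


start = datetime(2024, 1, 1, 9, 0)
burst = [start + timedelta(minutes=2 * i) for i in range(5)]  # 09:00 to 09:08
print(cooldown_ends_at(burst))      # 2024-01-01 09:38:00
print(cooldown_ends_at(burst[:4]))  # None
```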

Configuration

Configure HITL settings for each agent in your organization:

python
# Get HITL configuration
config = hitl.get_config("agent-1")

# Update configuration
hitl.update_config(
    agent_id="agent-1",
    config={
        "enabled": True,
        "requireApprovalFor": [
            "delete_data",
            "modify_permissions",
            "send_external_email",
            "access_pii"
        ],
        "riskThresholds": {
            "low": {"autoApprove": True},
            "medium": {"requireApproval": True, "timeout": 60},
            "high": {"requireApproval": True, "timeout": 30, "notifyAdmin": True},
            "critical": {"requireApproval": True, "timeout": 15, "notifyAdmin": True, "requireMFA": True}
        },
        "velocityLimits": {
            "perHour": 10,
            "perDay": 50
        },
        "cooldownConfig": {
            "threshold": 5,  # Trigger after 5 approvals in 15 min
            "duration": 30   # 30 minute cool-down
        }
    }
)
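One way to read the riskThresholds block is as a routing table from risk level to handling rule. The sketch below shows that reading. The function `route_action`, its result keys, the interpretation of `timeout` as minutes, and the rule that a listed action type always needs approval are all assumptions made for illustration. The page does not specify how Tork combines requireApprovalFor with riskThresholds.

```python
from typing import Dict, Set

# Copied from the configuration example above.
RISK_THRESHOLDS: Dict[str, Dict] = {
    "low": {"autoApprove": True},
    "medium": {"requireApproval": True, "timeout": 60},
    "high": {"requireApproval": True, "timeout": 30, "notifyAdmin": True},
    "critical": {"requireApproval": True, "timeout": 15, "notifyAdmin": True, "requireMFA": True},
}


def route_action(action_type: str, risk_level: str, require_approval_for: Set[str]) -> Dict:
    """Decide how an action would be handled under the thresholds above (hypothetical)."""
    if risk_level not in RISK_THRESHOLDS:
        raise ValueError(f"unknown risk level: {risk_level!r}")
    rule = RISK_THRESHOLDS[risk_level]
    # Assumption: a listed action type always needs approval, even at low risk.
    needs_approval = action_type in require_approval_for or rule.get("requireApproval", False)
    return {
        "needsApproval": needs_approval,
        "timeoutMinutes": rule.get("timeout") if needs_approval else None,
        "notifyAdmin": needs_approval and rule.get("notifyAdmin", False),
        "requireMFA": needs_approval and rule.get("requireMFA", False),
    }


listed = {"delete_data", "modify_permissions", "send_external_email", "access_pii"}
print(route_action("summarize_ticket", "low", listed)["needsApproval"])  # False
print(route_action("delete_data", "critical", listed))
```

Under these assumptions, an unlisted low-risk action is auto-approved, while a listed critical action requires approval with a 15-minute timeout, admin notification, and MFA.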

MCP Tools

Tool                         Description
tork_hitl_request_approval   Request human approval for an action
tork_hitl_check_status       Check the status of an approval request
tork_hitl_check_velocity     Check velocity limits for an approver
tork_hitl_detect_slicing     Detect slicing attack patterns
tork_hitl_cooldown_status    Check or manage cool-down periods