This is an example implementation of a content moderation workflow using Humancheck. It demonstrates how to integrate human review for borderline cases, but the same pattern can be adapted to any content review or moderation scenario.

Overview

The content moderation workflow:
  1. AI analyzes content for policy violations
  2. If confidence is low or content is borderline, request human review
  3. Reviewer makes final decision
  4. System applies the decision (approve, reject, flag, etc.)

Implementation

Basic Content Moderation

import httpx
import asyncio

async def moderate_content_workflow(content, content_type="text"):
    """Moderate content with human review for borderline cases."""
    async with httpx.AsyncClient() as client:
        # Analyze content with AI
        analysis = await analyze_content(content, content_type)
        confidence = analysis["confidence"]
        violation_type = analysis["violation_type"]
        
        # Thresholds for auto-decision vs. human review. Here `confidence` is the
        # model's confidence that the content is policy-compliant: very high values
        # are auto-approved, very low values are auto-rejected, and everything in
        # between goes to a human reviewer.
        AUTO_APPROVE_THRESHOLD = 0.95
        AUTO_REJECT_THRESHOLD = 0.05
        
        if AUTO_REJECT_THRESHOLD < confidence < AUTO_APPROVE_THRESHOLD:
            # Borderline case - request human review
            response = await client.post(
                "https://api.humancheck.dev/reviews",
                headers={
                    "Authorization": "Bearer your-api-key-here",
                    "Content-Type": "application/json"
                },
                json={
                    "task_type": "content_moderation",
                    "proposed_action": f"Flag content as {violation_type}",
                    "agent_reasoning": f"Low confidence ({confidence:.2%}) in moderation decision. Content may violate policy.",
                    "confidence_score": confidence,
                    "urgency": "high",  # Content moderation is time-sensitive
                    "metadata": {
                        "content": content[:500],  # First 500 chars
                        "content_type": content_type,
                        "violation_type": violation_type,
                        "content_length": len(content),
                        "analysis_details": analysis,
                        "user_id": analysis.get("user_id"),
                        "content_id": analysis.get("content_id")
                    },
                    "blocking": True  # Wait for decision before publishing
                }
            )
            review = response.json()
            
            # Get decision
            decision = review.get("decision")
            if not decision:
                decision_response = await client.get(
                    f"https://api.humancheck.dev/reviews/{review['id']}/decision",
                    headers={"Authorization": "Bearer your-api-key-here"}
                )
                decision = decision_response.json()
            
            # Apply decision
            if decision["decision_type"] == "approve":
                # Approve content (allow publishing)
                await approve_content(review["metadata"]["content_id"])
                return {"status": "approved", "content_id": review["metadata"]["content_id"]}
            elif decision["decision_type"] == "modify":
                # Apply modification (e.g., redact specific parts)
                modified_content = decision.get("modified_action")
                await update_content(review["metadata"]["content_id"], modified_content)
                return {"status": "modified", "content_id": review["metadata"]["content_id"]}
            else:
                # Reject content
                await reject_content(review["metadata"]["content_id"], decision.get("notes"))
                return {"status": "rejected", "reason": decision.get("notes")}
        elif confidence >= AUTO_APPROVE_THRESHOLD:
            # Auto-approve
            await approve_content(analysis["content_id"])
            return {"status": "auto_approved"}
        else:
            # Auto-reject
            await reject_content(analysis["content_id"], "High confidence violation detected")
            return {"status": "auto_rejected"}

async def analyze_content(content, content_type):
    """Analyze content for policy violations."""
    # Your AI moderation logic goes here; this is a simplified example.
    # Scores are the model's estimated probability for each category, and
    # "none" is its confidence that the content is policy-compliant.
    violations = {
        "hate_speech": 0.3,
        "spam": 0.1,
        "harassment": 0.2,
        "none": 0.4
    }

    # Most likely violation category, plus the model's confidence that the
    # content is safe (this is what the thresholds above compare against)
    violation_type = max(
        (v for v in violations if v != "none"), key=violations.get
    )
    confidence = violations["none"]

    # generate_content_id() and extract_user_id() are application-specific helpers
    return {
        "confidence": confidence,
        "violation_type": violation_type,
        "content_id": generate_content_id(),
        "user_id": extract_user_id(content)
    }
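
To run the workflow end to end, call it from an asyncio entry point. A minimal usage sketch, assuming the helpers above (approve_content, reject_content, update_content, and the analysis helpers) are wired up to your own storage and publishing logic:

import asyncio

async def main():
    # Run the moderation workflow for a single piece of user-generated text
    result = await moderate_content_workflow(
        "Example user comment that needs moderation",
        content_type="text"
    )
    print(result)  # e.g. {"status": "approved", "content_id": "..."}

asyncio.run(main())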

Advanced Features

Image Moderation

async def moderate_image_workflow(image_url, image_id):
    """Moderate image content."""
    analysis = await analyze_image(image_url)

    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://api.humancheck.dev/reviews",
            headers={
                "Authorization": "Bearer your-api-key-here",
                "Content-Type": "application/json"
            },
            json={
                "task_type": "content_moderation",
                "proposed_action": f"Flag image as {analysis['violation_type']}",
                "agent_reasoning": f"Image analysis shows potential {analysis['violation_type']} violation",
                "confidence_score": analysis["confidence"],
                "metadata": {
                    "content_type": "image",
                    "image_url": image_url,
                    "image_id": image_id,
                    "violation_type": analysis["violation_type"],
                    "image_metadata": analysis.get("metadata", {})
                },
                "blocking": True
            }
        )
        return response.json()
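
The analyze_image helper is application-specific and not part of Humancheck. A hypothetical placeholder, assuming your image model returns a violation type, a confidence score, and some metadata:

async def analyze_image(image_url):
    """Hypothetical stand-in for your image moderation model or vision API."""
    # Fetch and score the image here; the values below are illustrative only.
    return {
        "violation_type": "graphic_content",
        "confidence": 0.55,
        "metadata": {"width": 1024, "height": 768}
    }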

Video Moderation

async def moderate_video_workflow(video_url, video_id):
    """Moderate video content."""
    # Analyze video (may take longer than text or images)
    analysis = await analyze_video(video_url)

    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://api.humancheck.dev/reviews",
            headers={
                "Authorization": "Bearer your-api-key-here",
                "Content-Type": "application/json"
            },
            json={
                "task_type": "content_moderation",
                "proposed_action": f"Flag video as {analysis['violation_type']}",
                "agent_reasoning": f"Video analysis indicates {analysis['violation_type']} at timestamp {analysis['timestamp']}",
                "confidence_score": analysis["confidence"],
                "metadata": {
                    "content_type": "video",
                    "video_url": video_url,
                    "video_id": video_id,
                    "violation_type": analysis["violation_type"],
                    "timestamp": analysis.get("timestamp"),
                    "duration": analysis.get("duration")
                },
                "blocking": True
            }
        )
        return response.json()
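
Because the request is blocking, the returned review may already carry the reviewer's decision (see the fallback to the decision endpoint in the text workflow above). A sketch of acting on it, assuming moderate_video_workflow returns the review object and reusing the decision-action helpers defined below:

review = await moderate_video_workflow(video_url, video_id)
decision = review.get("decision", {})

if decision.get("decision_type") == "approve":
    # Leave the video up
    await approve_content(video_id)
elif decision.get("decision_type") == "modify":
    # e.g. the reviewer asks to trim or age-restrict a segment
    await update_content(video_id, decision.get("modified_action"))
else:
    await reject_content(video_id, decision.get("notes"))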

Batch Moderation

async def batch_moderation_workflow(content_items):
    """Moderate multiple content items."""
    # Analyze all items
    analyses = await analyze_batch(content_items)

    # Find items needing review
    needs_review = [
        item for item, analysis in zip(content_items, analyses)
        if 0.05 < analysis["confidence"] < 0.95
    ]

    if needs_review:
        async with httpx.AsyncClient() as client:
            response = await client.post(
                "https://api.humancheck.dev/reviews",
                headers={
                    "Authorization": "Bearer your-api-key-here",
                    "Content-Type": "application/json"
                },
                json={
                    "task_type": "content_moderation",
                    "proposed_action": f"Review batch of {len(needs_review)} content items",
                    "agent_reasoning": f"{len(needs_review)} items need human review",
                    "metadata": {
                        "content_type": "batch",
                        "item_count": len(needs_review),
                        "items": needs_review,
                        "total_items": len(content_items)
                    },
                    "blocking": True
                }
            )
            return response.json()
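
Items outside the borderline band do not have to wait for the batch review. A sketch of auto-deciding them inside the same workflow, assuming each analysis carries a content_id as in analyze_content above:

for item, analysis in zip(content_items, analyses):
    if analysis["confidence"] >= 0.95:
        # Clearly compliant: publish immediately
        await approve_content(analysis["content_id"])
    elif analysis["confidence"] <= 0.05:
        # Clear violation: reject without waiting for the batch review
        await reject_content(analysis["content_id"], "High confidence violation detected")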

Decision Actions

Approve Content

async def approve_content(content_id):
    """Approve content for publishing."""
    # Update content status
    await update_content_status(content_id, "approved")
    # Publish content
    await publish_content(content_id)

Reject Content

async def reject_content(content_id, reason):
    """Reject content."""
    # Update content status
    await update_content_status(content_id, "rejected")
    # Notify user
    await notify_user(content_id, f"Content rejected: {reason}")

Modify Content

async def modify_content(content_id, modifications):
    """Apply modifications to content."""
    # Apply redactions or edits
    modified_content = await apply_modifications(content_id, modifications)
    # Update content
    await update_content(content_id, modified_content)
    # Approve modified version
    await approve_content(content_id)
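
The helpers used above (update_content_status, publish_content, notify_user, update_content, apply_modifications) are application-specific, not part of Humancheck. Minimal in-memory placeholders for experimenting with the workflow might look like this:

# Hypothetical placeholders; replace with your own storage, publishing,
# and notification logic.
CONTENT_STORE = {}

async def update_content_status(content_id, status):
    CONTENT_STORE.setdefault(content_id, {})["status"] = status

async def publish_content(content_id):
    CONTENT_STORE.setdefault(content_id, {})["published"] = True

async def notify_user(content_id, message):
    print(f"[notify] content {content_id}: {message}")

async def update_content(content_id, new_body):
    CONTENT_STORE.setdefault(content_id, {})["body"] = new_body

async def apply_modifications(content_id, modifications):
    # In a real system you would redact or edit the stored content;
    # here we simply return the reviewer's modified version.
    return modifications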

Routing Rules

Route content moderation requests to the appropriate team:
# Route hate speech to a specialized team.
# Note: this creates a routing rule, not a review; the endpoint path below is
# assumed, so adjust it to your Humancheck deployment's routing-rules API.
await client.post(
    "https://api.humancheck.dev/routing-rules",
    headers={
        "Authorization": "Bearer your-api-key-here",
        "Content-Type": "application/json"
    },
    json={
        "name": "Hate speech to moderation team",
        "organization_id": 1,
        "priority": 100,
        "conditions": {
            "task_type": {"operator": "=", "value": "content_moderation"},
            "metadata.violation_type": {"operator": "=", "value": "hate_speech"}
        },
        "assign_to_team_id": 2,  # Specialized moderation team
        "is_active": True
    }
)
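
Since rules carry a priority, you can typically add a lower-priority catch-all behind the specialized rule. A hypothetical fallback (team and organization IDs are illustrative, and the endpoint path is assumed as above):

await client.post(
    "https://api.humancheck.dev/routing-rules",  # assumed endpoint, as above
    headers={
        "Authorization": "Bearer your-api-key-here",
        "Content-Type": "application/json"
    },
    json={
        "name": "Content moderation fallback",
        "organization_id": 1,
        "priority": 10,  # lower than the hate speech rule above
        "conditions": {
            "task_type": {"operator": "=", "value": "content_moderation"}
        },
        "assign_to_team_id": 1,  # General moderation team (illustrative)
        "is_active": True
    }
)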

Dashboard Integration

The content moderation request appears in the dashboard with:
  • Content preview (text, image thumbnail, etc.)
  • Violation type and confidence
  • User information
  • Analysis details
  • Content metadata
Reviewers can:
  • ✅ Approve content for publishing
  • ❌ Reject with reason
  • ✏️ Modify content (redact, edit, etc.) before approving

Best Practices

  1. Set appropriate thresholds: Use different confidence thresholds for different violation types (see the sketch after this list)
  2. Provide context: Show why content was flagged
  3. Include content preview: Make it easy for reviewers to see the content
  4. Use urgency levels: High urgency for time-sensitive content
  5. Track patterns: Monitor which types of content need review most often
  6. Feedback loop: Use feedback to improve AI confidence scores
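
For the first point, a minimal sketch of per-violation-type thresholds (categories and numbers are illustrative, not recommendations):

# Illustrative per-category thresholds: stricter auto-approval for severe
# violation types, looser for spam.
THRESHOLDS = {
    "hate_speech": {"auto_approve": 0.99, "auto_reject": 0.10},
    "harassment": {"auto_approve": 0.98, "auto_reject": 0.10},
    "spam": {"auto_approve": 0.90, "auto_reject": 0.02},
}
DEFAULT_THRESHOLDS = {"auto_approve": 0.95, "auto_reject": 0.05}

def needs_human_review(violation_type, confidence):
    """Return True if confidence falls in the borderline band for this category."""
    t = THRESHOLDS.get(violation_type, DEFAULT_THRESHOLDS)
    return t["auto_reject"] < confidence < t["auto_approve"]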

Performance Considerations

  • Non-blocking for non-critical content: Use non-blocking reviews for lower-priority content (see the sketch after this list)
  • Batch processing: Group similar content for batch review
  • Caching: Cache analysis results for similar content
  • Async processing: Process moderation requests asynchronously
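
For the first point, a sketch of a non-blocking review that returns immediately and is checked later via the decision endpoint used earlier on this page (field names mirror the examples above):

async def queue_low_priority_review(client, content, analysis):
    """Submit a non-blocking review so publishing is not held up."""
    response = await client.post(
        "https://api.humancheck.dev/reviews",
        headers={
            "Authorization": "Bearer your-api-key-here",
            "Content-Type": "application/json"
        },
        json={
            "task_type": "content_moderation",
            "proposed_action": f"Flag content as {analysis['violation_type']}",
            "agent_reasoning": "Low-priority content; review can happen after publishing",
            "confidence_score": analysis["confidence"],
            "urgency": "low",
            "metadata": {"content": content[:500], "content_id": analysis.get("content_id")},
            "blocking": False  # don't wait for the reviewer
        }
    )
    return response.json()["id"]

async def check_decision(client, review_id):
    """Fetch the decision later, e.g. from a scheduled job."""
    response = await client.get(
        f"https://api.humancheck.dev/reviews/{review_id}/decision",
        headers={"Authorization": "Bearer your-api-key-here"}
    )
    return response.json()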

Next Steps