This is an example implementation of a content moderation workflow using Humancheck. It demonstrates how to integrate human review for borderline cases, but the same pattern can be adapted to any content review or moderation scenario.

Overview

The content moderation workflow:
  1. AI analyzes content for policy violations
  2. If confidence is low or content is borderline, request human review
  3. Reviewer makes final decision
  4. System applies the decision (approve, reject, flag, etc.)

Implementation

Basic Content Moderation

import httpx
import asyncio

async def moderate_content_workflow(content, content_type="text"):
    """Moderate content with human review for borderline cases."""
    async with httpx.AsyncClient() as client:
        # Analyze content with AI
        analysis = await analyze_content(content, content_type)
        confidence = analysis["confidence"]
        violation_type = analysis["violation_type"]
        
        # Thresholds for auto-decision vs. human review. Here `confidence` is the
        # model's confidence that the content is policy-compliant: very high values
        # are auto-approved, very low values are auto-rejected, and everything in
        # between goes to a human reviewer.
        AUTO_APPROVE_THRESHOLD = 0.95
        AUTO_REJECT_THRESHOLD = 0.05
        
        if AUTO_REJECT_THRESHOLD < confidence < AUTO_APPROVE_THRESHOLD:
            # Borderline case - request human review
            response = await client.post(
                "https://api.humancheck.dev/reviews",
                headers={
                    "Authorization": "Bearer your-api-key-here",
                    "Content-Type": "application/json"
                },
                json={
                    "task_type": "content_moderation",
                    "proposed_action": f"Flag content as {violation_type}",
                    "agent_reasoning": f"Low confidence ({confidence:.2%}) in moderation decision. Content may violate policy.",
                    "confidence_score": confidence,
                    "urgency": "high",  # Content moderation is time-sensitive
                    "metadata": {
                        "content": content[:500],  # First 500 chars
                        "content_type": content_type,
                        "violation_type": violation_type,
                        "content_length": len(content),
                        "analysis_details": analysis,
                        "user_id": analysis.get("user_id"),
                        "content_id": analysis.get("content_id")
                    },
                    "blocking": True  # Wait for decision before publishing
                }
            )
            review = response.json()
            
            # Get decision
            decision = review.get("decision")
            if not decision:
                decision_response = await client.get(
                    f"https://api.humancheck.dev/reviews/{review['id']}/decision",
                    headers={"Authorization": "Bearer your-api-key-here"}
                )
                decision = decision_response.json()
            
            # Apply decision
            if decision["decision_type"] == "approve":
                # Approve content (allow publishing)
                await approve_content(review["metadata"]["content_id"])
                return {"status": "approved", "content_id": review["metadata"]["content_id"]}
            elif decision["decision_type"] == "modify":
                # Apply modification (e.g., redact specific parts)
                modified_content = decision.get("modified_action")
                await update_content(review["metadata"]["content_id"], modified_content)
                return {"status": "modified", "content_id": review["metadata"]["content_id"]}
            else:
                # Reject content
                await reject_content(review["metadata"]["content_id"], decision.get("notes"))
                return {"status": "rejected", "reason": decision.get("notes")}
        elif confidence >= AUTO_APPROVE_THRESHOLD:
            # Auto-approve
            await approve_content(analysis["content_id"])
            return {"status": "auto_approved"}
        else:
            # Auto-reject
            await reject_content(analysis["content_id"], "High confidence violation detected")
            return {"status": "auto_rejected"}

async def analyze_content(content, content_type):
    """Analyze content for policy violations."""
    # Your AI moderation logic goes here; this is a simplified example.
    # Scores are the model's estimated probability for each category, and
    # "none" is its confidence that the content is policy-compliant.
    violations = {
        "hate_speech": 0.3,
        "spam": 0.1,
        "harassment": 0.2,
        "none": 0.4
    }

    # Most likely violation category, plus the model's confidence that the
    # content is safe (this is what the thresholds above compare against)
    violation_type = max(
        (v for v in violations if v != "none"), key=violations.get
    )
    confidence = violations["none"]

    # generate_content_id() and extract_user_id() are application-specific helpers
    return {
        "confidence": confidence,
        "violation_type": violation_type,
        "content_id": generate_content_id(),
        "user_id": extract_user_id(content)
    }
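
To run the workflow end to end, call it from an asyncio entry point. A minimal usage sketch, assuming the helpers above (approve_content, reject_content, update_content, and the analysis helpers) are wired up to your own storage and publishing logic:

import asyncio

async def main():
    # Run the moderation workflow for a single piece of user-generated text
    result = await moderate_content_workflow(
        "Example user comment that needs moderation",
        content_type="text"
    )
    print(result)  # e.g. {"status": "approved", "content_id": "..."}

asyncio.run(main())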

Advanced Features

Image Moderation

async def moderate_image_workflow(image_url, image_id):
    """Moderate image content."""
    analysis = await analyze_image(image_url)

    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://api.humancheck.dev/reviews",
            headers={
                "Authorization": "Bearer your-api-key-here",
                "Content-Type": "application/json"
            },
            json={
                "task_type": "content_moderation",
                "proposed_action": f"Flag image as {analysis['violation_type']}",
                "agent_reasoning": f"Image analysis shows potential {analysis['violation_type']} violation",
                "confidence_score": analysis["confidence"],
                "metadata": {
                    "content_type": "image",
                    "image_url": image_url,
                    "image_id": image_id,
                    "violation_type": analysis["violation_type"],
                    "image_metadata": analysis.get("metadata", {})
                },
                "blocking": True
            }
        )
        return response.json()
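
The analyze_image helper is application-specific and not part of Humancheck. A hypothetical placeholder, assuming your image model returns a violation type, a confidence score, and some metadata:

async def analyze_image(image_url):
    """Hypothetical stand-in for your image moderation model or vision API."""
    # Fetch and score the image here; the values below are illustrative only.
    return {
        "violation_type": "graphic_content",
        "confidence": 0.55,
        "metadata": {"width": 1024, "height": 768}
    }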

Video Moderation

async def moderate_video_workflow(video_url, video_id):
    """Moderate video content."""
    # Analyze video (may take longer than text or images)
    analysis = await analyze_video(video_url)

    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://api.humancheck.dev/reviews",
            headers={
                "Authorization": "Bearer your-api-key-here",
                "Content-Type": "application/json"
            },
            json={
                "task_type": "content_moderation",
                "proposed_action": f"Flag video as {analysis['violation_type']}",
                "agent_reasoning": f"Video analysis indicates {analysis['violation_type']} at timestamp {analysis['timestamp']}",
                "confidence_score": analysis["confidence"],
                "metadata": {
                    "content_type": "video",
                    "video_url": video_url,
                    "video_id": video_id,
                    "violation_type": analysis["violation_type"],
                    "timestamp": analysis.get("timestamp"),
                    "duration": analysis.get("duration")
                },
                "blocking": True
            }
        )
        return response.json()
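
Because the request is blocking, the returned review may already carry the reviewer's decision (see the fallback to the decision endpoint in the text workflow above). A sketch of acting on it, assuming moderate_video_workflow returns the review object and reusing the decision-action helpers defined below:

review = await moderate_video_workflow(video_url, video_id)
decision = review.get("decision", {})

if decision.get("decision_type") == "approve":
    # Leave the video up
    await approve_content(video_id)
elif decision.get("decision_type") == "modify":
    # e.g. the reviewer asks to trim or age-restrict a segment
    await update_content(video_id, decision.get("modified_action"))
else:
    await reject_content(video_id, decision.get("notes"))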

Batch Moderation

async def batch_moderation_workflow(content_items):
    """Moderate multiple content items."""
    # Analyze all items
    analyses = await analyze_batch(content_items)

    # Find items needing review
    needs_review = [
        item for item, analysis in zip(content_items, analyses)
        if 0.05 < analysis["confidence"] < 0.95
    ]

    if needs_review:
        async with httpx.AsyncClient() as client:
            response = await client.post(
                "https://api.humancheck.dev/reviews",
                headers={
                    "Authorization": "Bearer your-api-key-here",
                    "Content-Type": "application/json"
                },
                json={
                    "task_type": "content_moderation",
                    "proposed_action": f"Review batch of {len(needs_review)} content items",
                    "agent_reasoning": f"{len(needs_review)} items need human review",
                    "metadata": {
                        "content_type": "batch",
                        "item_count": len(needs_review),
                        "items": needs_review,
                        "total_items": len(content_items)
                    },
                    "blocking": True
                }
            )
            return response.json()
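
Items outside the borderline band do not have to wait for the batch review. A sketch of auto-deciding them inside the same workflow, assuming each analysis carries a content_id as in analyze_content above:

for item, analysis in zip(content_items, analyses):
    if analysis["confidence"] >= 0.95:
        # Clearly compliant: publish immediately
        await approve_content(analysis["content_id"])
    elif analysis["confidence"] <= 0.05:
        # Clear violation: reject without waiting for the batch review
        await reject_content(analysis["content_id"], "High confidence violation detected")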

Decision Actions

Approve Content

async def approve_content(content_id):
    """Approve content for publishing."""
    # Update content status
    await update_content_status(content_id, "approved")
    # Publish content
    await publish_content(content_id)

Reject Content

async def reject_content(content_id, reason):
    """Reject content."""
    # Update content status
    await update_content_status(content_id, "rejected")
    # Notify user
    await notify_user(content_id, f"Content rejected: {reason}")

Modify Content

async def modify_content(content_id, modifications):
    """Apply modifications to content."""
    # Apply redactions or edits
    modified_content = await apply_modifications(content_id, modifications)
    # Update content
    await update_content(content_id, modified_content)
    # Approve modified version
    await approve_content(content_id)
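
The helpers used above (update_content_status, publish_content, notify_user, update_content, apply_modifications) are application-specific, not part of Humancheck. Minimal in-memory placeholders for experimenting with the workflow might look like this:

# Hypothetical placeholders; replace with your own storage, publishing,
# and notification logic.
CONTENT_STORE = {}

async def update_content_status(content_id, status):
    CONTENT_STORE.setdefault(content_id, {})["status"] = status

async def publish_content(content_id):
    CONTENT_STORE.setdefault(content_id, {})["published"] = True

async def notify_user(content_id, message):
    print(f"[notify] content {content_id}: {message}")

async def update_content(content_id, new_body):
    CONTENT_STORE.setdefault(content_id, {})["body"] = new_body

async def apply_modifications(content_id, modifications):
    # In a real system you would redact or edit the stored content;
    # here we simply return the reviewer's modified version.
    return modifications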

Routing Rules

Route content moderation requests to the appropriate team:
# Route hate speech to a specialized team.
# Note: this creates a routing rule, not a review; the endpoint path below is
# assumed, so adjust it to your Humancheck deployment's routing-rules API.
await client.post(
    "https://api.humancheck.dev/routing-rules",
    headers={
        "Authorization": "Bearer your-api-key-here",
        "Content-Type": "application/json"
    },
    json={
        "name": "Hate speech to moderation team",
        "organization_id": 1,
        "priority": 100,
        "conditions": {
            "task_type": {"operator": "=", "value": "content_moderation"},
            "metadata.violation_type": {"operator": "=", "value": "hate_speech"}
        },
        "assign_to_team_id": 2,  # Specialized moderation team
        "is_active": True
    }
)
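
Since rules carry a priority, you can typically add a lower-priority catch-all behind the specialized rule. A hypothetical fallback (team and organization IDs are illustrative, and the endpoint path is assumed as above):

await client.post(
    "https://api.humancheck.dev/routing-rules",  # assumed endpoint, as above
    headers={
        "Authorization": "Bearer your-api-key-here",
        "Content-Type": "application/json"
    },
    json={
        "name": "Content moderation fallback",
        "organization_id": 1,
        "priority": 10,  # lower than the hate speech rule above
        "conditions": {
            "task_type": {"operator": "=", "value": "content_moderation"}
        },
        "assign_to_team_id": 1,  # General moderation team (illustrative)
        "is_active": True
    }
)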

Dashboard Integration

The content moderation request appears in the dashboard with:
  • Content preview (text, image thumbnail, etc.)
  • Violation type and confidence
  • User information
  • Analysis details
  • Content metadata
Reviewers can:
  • ✅ Approve content for publishing
  • ❌ Reject with reason
  • ✏️ Modify content (redact, edit, etc.) before approving

Best Practices

  1. Set appropriate thresholds: Use different confidence thresholds for different violation types (see the sketch after this list)
  2. Provide context: Show why content was flagged
  3. Include content preview: Make it easy for reviewers to see the content
  4. Use urgency levels: High urgency for time-sensitive content
  5. Track patterns: Monitor which types of content need review most often
  6. Feedback loop: Use feedback to improve AI confidence scores
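
For the first point, a minimal sketch of per-violation-type thresholds (categories and numbers are illustrative, not recommendations):

# Illustrative per-category thresholds: stricter auto-approval for severe
# violation types, looser for spam.
THRESHOLDS = {
    "hate_speech": {"auto_approve": 0.99, "auto_reject": 0.10},
    "harassment": {"auto_approve": 0.98, "auto_reject": 0.10},
    "spam": {"auto_approve": 0.90, "auto_reject": 0.02},
}
DEFAULT_THRESHOLDS = {"auto_approve": 0.95, "auto_reject": 0.05}

def needs_human_review(violation_type, confidence):
    """Return True if confidence falls in the borderline band for this category."""
    t = THRESHOLDS.get(violation_type, DEFAULT_THRESHOLDS)
    return t["auto_reject"] < confidence < t["auto_approve"]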

Performance Considerations

  • Non-blocking for non-critical content: Use non-blocking reviews for lower-priority content (see the sketch after this list)
  • Batch processing: Group similar content for batch review
  • Caching: Cache analysis results for similar content
  • Async processing: Process moderation requests asynchronously
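
For the first point, a sketch of a non-blocking review that returns immediately and is checked later via the decision endpoint used earlier on this page (field names mirror the examples above):

async def queue_low_priority_review(client, content, analysis):
    """Submit a non-blocking review so publishing is not held up."""
    response = await client.post(
        "https://api.humancheck.dev/reviews",
        headers={
            "Authorization": "Bearer your-api-key-here",
            "Content-Type": "application/json"
        },
        json={
            "task_type": "content_moderation",
            "proposed_action": f"Flag content as {analysis['violation_type']}",
            "agent_reasoning": "Low-priority content; review can happen after publishing",
            "confidence_score": analysis["confidence"],
            "urgency": "low",
            "metadata": {"content": content[:500], "content_id": analysis.get("content_id")},
            "blocking": False  # don't wait for the reviewer
        }
    )
    return response.json()["id"]

async def check_decision(client, review_id):
    """Fetch the decision later, e.g. from a scheduled job."""
    response = await client.get(
        f"https://api.humancheck.dev/reviews/{review_id}/decision",
        headers={"Authorization": "Bearer your-api-key-here"}
    )
    return response.json()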

Next Steps