Everything you need to integrate SafeModeration into your platform.
Get up and running in under 60 seconds.
Step 1
Get your API key
Sign up at safemoderation.com/pricing to start your free trial. Your API key will be emailed to you immediately after checkout. It looks like this: sm_live_a1b2c3d4e5f6…
Step 2a
Moderate text
cURL:
curl -X POST https://api.safemoderation.com/.netlify/functions/moderate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "media_type": "text",
    "content": "Hello, how are you today?",
    "reference_id": "comment_84729"
  }'

JavaScript:
const response = await fetch(
  'https://api.safemoderation.com/.netlify/functions/moderate',
  {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.SAFEMODERATION_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      media_type: 'text',
      content: 'Hello, how are you today?',
      reference_id: 'comment_84729',
    }),
  }
);
const data = await response.json();
console.log(data.decision); // "allow"

Python:
import requests

response = requests.post(
    'https://api.safemoderation.com/.netlify/functions/moderate',
    headers={
        'Authorization': f'Bearer {api_key}',
        'Content-Type': 'application/json',
    },
    json={
        'media_type': 'text',
        'content': 'Hello, how are you today?',
        'reference_id': 'comment_84729',
    },
)
data = response.json()
print(data['decision'])  # "allow"

Step 2b
Moderate an image
cURL:
curl -X POST https://api.safemoderation.com/.netlify/functions/moderate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "media_type": "image",
    "content": "https://example.com/user-upload.jpg",
    "reference_id": "post_a8f9c2"
  }'

JavaScript:
const response = await fetch(
  'https://api.safemoderation.com/.netlify/functions/moderate',
  {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.SAFEMODERATION_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      media_type: 'image',
      content: 'https://example.com/user-upload.jpg',
      reference_id: 'post_a8f9c2',
    }),
  }
);
const data = await response.json();
console.log(data.decision); // "block"

Python:
import requests

response = requests.post(
    'https://api.safemoderation.com/.netlify/functions/moderate',
    headers={
        'Authorization': f'Bearer {api_key}',
        'Content-Type': 'application/json',
    },
    json={
        'media_type': 'image',
        'content': 'https://example.com/user-upload.jpg',
        'reference_id': 'post_a8f9c2',
    },
)
data = response.json()
print(data['decision'])  # "block"

Step 3
Read the response
Text response:
{
"request_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
"reference_id": "comment_84729",
"decision": "allow",
"confidence": 0.95,
"categories": {
"hate_speech": 0.01,
"harassment_bullying": 0.02,
"adult_content": 0.00,
"violence_gore": 0.00,
"spam_scam": 0.01,
"suicide_self_harm": 0.00,
"pii_exposure": 0.00,
"profanity": 0.00
},
"warnings": [],
"usage": {
"credits_used": 1,
"monthly_credits": 15000
}
}
Image response:
{
"request_id": "550e8400-e29b-41d4-a716-446655440000",
"reference_id": "post_a8f9c2",
"decision": "block",
"confidence": 0.92,
"categories": {
"adult_content": 0.04,
"violence_gore": 0.92,
"hate_speech": 0.01,
"suicide_self_harm": 0.00,
"weapons": 0.78,
"drugs": 0.00,
"alcohol": 0.00,
"tobacco": 0.00
},
"warnings": [],
"usage": {
"credits_used": 47,
"monthly_credits": 15000
}
}
The decision field tells you what to do with the content.
reference_id is a required field that links each moderation result back to the record in your own data model. It is stored in the log and echoed in every response. It has no effect on the moderation decision itself.
| Field | Required | Format | Description |
|---|---|---|---|
| reference_id | Yes | String, 1-256 chars, alphanumeric plus ._-/: | Your unique identifier for this content. Use your database row ID, post slug, or comment ID: any key that lets you look up the original record. |
reference_id is required

SafeModeration assigns its own request_id to every call. reference_id is yours: it makes the moderation log immediately actionable without a secondary lookup. When a result comes back block, your code already knows exactly which record to act on.
A forum stores user comments in a comments table, each with an integer primary key. When a user submits a new comment, the forum's backend calls SafeModeration before writing the row to the database:
const result = await fetch(
  'https://api.safemoderation.com/.netlify/functions/moderate',
  {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.SAFEMODERATION_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      media_type: 'text',
      content: comment.body,
      reference_id: `comment_${comment.id}`, // e.g. "comment_84729"
    }),
  }
).then(r => r.json());

if (result.decision === 'block') {
  // reference_id is echoed back, no extra lookup needed
  await markCommentRejected(result.reference_id);
}
The response echoes "reference_id": "comment_84729", so the forum can act on the result without tracking SafeModeration's internal request_id at all.
The optional metadata field accepts any plain JSON object you want stored alongside the log record and echoed in the response. It has no effect on the moderation decision itself.
| Field | Required | Format | Description |
|---|---|---|---|
| metadata | No | Plain object, max 4,096 bytes (UTF-8 JSON) | Arbitrary key-value pairs you want stored with the log record. Common uses: content type, author ID, thread ID, locale, and other platform-specific context. |
Metadata is a free-form envelope: use any keys that make sense for your platform. Common examples:
- content_type: your category for the content (e.g. comment, post, profile), useful for filtering in your moderation dashboard
- author_id: identifier of the user who created the content, enabling author-level abuse tracking and repeat-offender detection
- thread_id, locale, client_version: any other context your team finds useful when reviewing flagged content

const result = await fetch(
  'https://api.safemoderation.com/.netlify/functions/moderate',
  {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.SAFEMODERATION_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      media_type: 'text',
      content: comment.body,
      reference_id: `comment_${comment.id}`,
      metadata: {
        content_type: 'comment',
        author_id: `user_${comment.authorId}`,
        thread_id: `thread_${comment.threadId}`,
      },
    }),
  }
).then(r => r.json());

// result.metadata is echoed back exactly as sent
console.log(result.metadata.author_id); // "user_84729"
The metadata object is echoed in the response unchanged, so your downstream code can read any field it needs without an extra lookup.
SafeModeration uses API key authentication. Pass your key as a Bearer token in the Authorization header on every request.
Authorization: Bearer sm_live_your_key_here
Keep your API key secret. Never expose it in client-side code, public repositories, or browser requests. Always make API calls from your server.
API keys are issued immediately after checkout and emailed to you. If you lose your key, contact support@safemoderation.com to have it revoked and reissued.
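In practice, the server-side pattern can be as small as a thin proxy route that holds the key and forwards requests. Here is a minimal sketch using Express and Node 18+'s built-in fetch; the /api/moderate route, port, and request shape are illustrative choices, not part of SafeModeration:

// Minimal proxy sketch: the browser calls /api/moderate on your server,
// so only the server ever sees the SafeModeration key.
const express = require('express');

const app = express();
app.use(express.json());

app.post('/api/moderate', async (req, res) => {
  const response = await fetch(
    'https://api.safemoderation.com/.netlify/functions/moderate',
    {
      method: 'POST',
      headers: {
        // The key is read from the server environment and never sent to clients.
        'Authorization': `Bearer ${process.env.SAFEMODERATION_API_KEY}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        media_type: 'text',
        content: req.body.content,
        reference_id: req.body.reference_id,
      }),
    }
  );
  // Pass the API's status and body through to the caller.
  res.status(response.status).json(await response.json());
});

app.listen(3000);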
POST https://api.safemoderation.com/.netlify/functions/moderate
Headers
| Header | Required | Value |
|---|---|---|
| Authorization | Yes | Bearer YOUR_API_KEY |
| Content-Type | Yes | application/json |
Body parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| media_type | string | Yes | Either "text" or "image". |
| content | string | Yes | For text: the text to moderate, max 1,024 characters. For image: a public HTTPS URL pointing to a JPEG or PNG. |
| reference_id | string | Yes | Your internal ID for this content (e.g. post_id, comment_id). Echoed in every response. 1-256 characters; alphanumeric plus ._-/:. |
| metadata | object | No | Arbitrary key-value pairs stored with the log record and echoed in the response. Any plain JSON object up to 4,096 bytes (UTF-8). Has no effect on the moderation decision. |
For text requests, content is limited to 1,024 characters. Requests with longer text are rejected with a 400 error. Trim content to the relevant portion before submitting.
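If your content can exceed that limit, a guard like the sketch below avoids 400 errors; the truncateForModeration helper is illustrative, and keeping the start of the text is just one reasonable trimming strategy:

// Illustrative helper: trim text to the API's 1,024-character limit
// before submitting, keeping the beginning of the content.
function truncateForModeration(text, limit = 1024) {
  return text.length <= limit ? text : text.slice(0, limit);
}

const body = {
  media_type: 'text',
  content: truncateForModeration(comment.body),
  reference_id: `comment_${comment.id}`,
};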
| Field | Type | Description |
|---|---|---|
| request_id | string | Unique identifier for this request, assigned by SafeModeration. Reference this ID when contacting support. |
| reference_id | string | Echoes the reference_id you sent. Always present in the response. |
| metadata | object | Echoes the metadata object you sent, unchanged. Only present if provided in the request. |
| decision | string | One of: allow, flag, block. |
| confidence | number | Confidence score from 0.00 to 1.00. |
| categories | object | Score for each moderation category (0.00-1.00). Keys vary by media_type. See the categories reference below. |
| warnings | array | Reserved for future warning codes. Currently always an empty array. |
| usage.credits_used | number | Credits consumed this month so far. |
| usage.monthly_credits | number | Your plan's monthly credit limit. |
These properties are stable and guaranteed in every response:
- All category keys for the given media_type are always present, even if their score is 0.00
- decision is always one of: allow, flag, or block
- confidence reflects the classifier's certainty in the decision, not an average of category scores

cURL:
curl -X POST https://api.safemoderation.com/.netlify/functions/moderate \
  -H "Authorization: Bearer sm_live_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "media_type": "text",
    "content": "I know where you live. Im going to find you and hurt you.",
    "reference_id": "post_a8f9c2"
  }'

JavaScript:
const response = await fetch(
  'https://api.safemoderation.com/.netlify/functions/moderate',
  {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.SAFEMODERATION_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      media_type: 'text',
      content: 'I know where you live. Im going to find you and hurt you.',
      reference_id: 'post_a8f9c2',
    }),
  }
);
const data = await response.json();
// data.decision === "block"

Python:
import requests

response = requests.post(
    'https://api.safemoderation.com/.netlify/functions/moderate',
    headers={
        'Authorization': f'Bearer {api_key}',
        'Content-Type': 'application/json',
    },
    json={
        'media_type': 'text',
        'content': 'I know where you live. Im going to find you and hurt you.',
        'reference_id': 'post_a8f9c2',
    },
)
data = response.json()
# data["decision"] == "block"

Response:
{
"request_id": "9b2d3e4f-1a2b-3c4d-5e6f-7a8b9c0d1e2f",
"reference_id": "post_a8f9c2",
"decision": "block",
"confidence": 0.97,
"categories": {
"hate_speech": 0.08,
"harassment_bullying": 0.96,
"adult_content": 0.00,
"violence_gore": 0.45,
"spam_scam": 0.01,
"suicide_self_harm": 0.00,
"pii_exposure": 0.00,
"profanity": 0.02
},
"warnings": [],
"usage": {
"credits_used": 42,
"monthly_credits": 15000
}
}
Every response includes a decision field. Here is how to act on each value:
| Decision | Meaning | Recommended action |
|---|---|---|
| allow | Content passes moderation | Publish |
| flag | Ambiguous, possible violation | Route to human review, or add friction |
| block | Clear violation | Reject |
switch (data.decision) {
  case 'allow':
    return publishContent();
  case 'flag':
    return sendToHumanReview();
  case 'block':
    return rejectContent();
}
How you act on each decision is entirely up to your platform's policies. Many platforms auto-block on block, route flag to a human review queue, and auto-approve on allow. You decide the right thresholds for your use case.
The confidence field reflects overall certainty in the decision, from 0.00 (uncertain) to 1.00 (very certain). Individual category scores reflect how strongly each category applies to the content.
Confidence scores are probabilistic, not deterministic. No automated system is 100% accurate. We recommend human review for high-stakes decisions and for content near decision boundaries.
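One way to act on this, sketched below, is to escalate low-confidence results to human review regardless of the automated decision, reusing the handlers from the switch example above. The 0.80 threshold is an illustrative assumption to tune for your platform, not a SafeModeration recommendation:

// Illustrative routing: treat any low-confidence result like a "flag",
// even when the automated decision is allow or block.
function routeDecision(data, minConfidence = 0.80) {
  if (data.decision === 'flag' || data.confidence < minConfidence) {
    return sendToHumanReview();
  }
  return data.decision === 'allow' ? publishContent() : rejectContent();
}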
When media_type is "text", the response includes scores for the following eight categories. Each value is a float from 0.00 to 1.00. All keys are always present.
| Category | Key | What it detects |
|---|---|---|
| Hate speech | hate_speech | Slurs, dehumanizing language, and content targeting people based on race, religion, ethnicity, gender, sexual orientation, or other protected characteristics. |
| Harassment & bullying | harassment_bullying | Targeted abuse, threats, doxxing, coordinated harassment, and content designed to intimidate or demean specific individuals. |
| Adult content | adult_content | Explicit sexual content, graphic nudity, and solicitation. |
| Violence & gore | violence_gore | Graphic violence, threats of physical harm, glorification of violence, and disturbing imagery descriptions. |
| Spam & scams | spam_scam | Phishing attempts, fake prizes, fraudulent solicitation, impersonation, and promotional abuse. |
| Suicide & self-harm | suicide_self_harm | Content that promotes, glorifies, or provides methods for self-harm or suicide. Context-aware: prevention and awareness content is not flagged. |
| PII exposure | pii_exposure | Personally identifiable information including Social Security numbers, credit card numbers, bank account details, passwords, and similar sensitive data. |
| Profanity | profanity | Explicit language and offensive terms. |
SafeModeration automatically detects common evasion techniques including l33tspeak substitution, spaced characters, Unicode homoglyphs, and repeated character patterns.
The classifier detects harmful content in 50+ languages. Coverage is most thoroughly tested in English, Spanish, French, German, Portuguese, Arabic, and Russian, with strong performance across other major European, East Asian, South Asian, and Middle Eastern languages.
When media_type is "image", the response includes scores for the following eight categories. Each value is a float from 0.00 to 1.00. All keys are always present.
| Category | Key | What it detects |
|---|---|---|
| Adult content | adult_content | Nudity, sexual content, or sexually suggestive imagery. |
| Violence & gore | violence_gore | Violence, gore, or graphic disturbing imagery. |
| Hate speech | hate_speech | Hate symbols, extremist iconography, or supremacist imagery. |
| Suicide & self-harm | suicide_self_harm | Self-injury imagery or suicide-related imagery. |
| Weapons | weapons | Firearms, knives, or weapons in threatening contexts. |
| Drugs | drugs | Illegal substances, paraphernalia, or drug use imagery. |
| Alcohol | alcohol | Alcoholic beverages or drinking imagery. Use thresholds appropriate to your jurisdiction. |
| Tobacco | tobacco | Tobacco products or smoking imagery. Use thresholds appropriate to your jurisdiction. |
SafeModeration fetches the image from the URL you provide, classifies it, then discards the bytes. We store the URL and a one-way hash of the image content for caching and audit purposes. The image bytes are not retained.
| Requirement | Value |
|---|---|
| Supported formats | JPEG, PNG |
| Maximum file size | 10 MB |
| Minimum dimensions | 80 × 80 pixels |
| Maximum dimensions | 8192 × 8192 pixels |
| URL protocol | HTTPS only. The URL must be publicly accessible. |
| Fetch timeout | 5 seconds |
| Maximum redirects | 3 |
GIF and WebP are not supported. Requests with unsupported formats return a 400 error with code IMAGE_FORMAT_UNSUPPORTED. You are not charged for failed image requests.
Pass the URL of the image as stored on your own infrastructure or CDN. The URL must be reachable from our servers at the time of the request. Pre-signed URLs with short expiration windows may fail if the window closes before the request is processed.
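A lightweight pre-flight check can catch obvious rejections before spending a round trip. The sketch below is illustrative: the helper name is ours, and the extension test is only a heuristic, since SafeModeration validates the actual bytes:

// Illustrative pre-flight check before calling the API.
function looksSubmittable(imageUrl) {
  let url;
  try {
    url = new URL(imageUrl);
  } catch {
    return false; // malformed URL would fail with INVALID_IMAGE_URL
  }
  if (url.protocol !== 'https:') return false; // HTTPS only
  // Extension heuristic only; the API inspects the real image bytes.
  return /\.(jpe?g|png)$/i.test(url.pathname);
}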
| Status | Description |
|---|---|
| 400 | INVALID_MEDIA_TYPE: media_type must be "text" or "image". |
| 400 | INVALID_CONTENT: content must be a non-empty string. |
| 400 | Content must be 1024 characters or fewer: the content field exceeded 1,024 characters for a text request. Trim the content before retrying. |
| 400 | "reference_id" is required: reference_id was not included in the request body. |
| 400 | "reference_id" must be 1-256 characters: reference_id was empty or exceeded the length limit. |
| 400 | "reference_id" contains invalid characters (allowed: a-z A-Z 0-9 . _ - / :): reference_id contained whitespace, quotes, or other disallowed characters. |
| 400 | metadata must be a plain object: metadata was not a JSON object (e.g. an array or string was sent). |
| 400 | metadata exceeds 4096-byte limit (5200 bytes): the JSON-serialised metadata object exceeded 4,096 bytes (UTF-8); 5200 is the actual byte count from your request. |
| Image-specific errors | |
| 400 | INVALID_IMAGE_URL: the image URL is malformed or does not use HTTPS. |
| 400 | IMAGE_FETCH_FAILED: the image could not be fetched. The server returned an error, the request timed out, or a network error occurred. |
| 400 | IMAGE_FORMAT_UNSUPPORTED: the image is not JPEG or PNG. GIF, WebP, and other formats are not supported. |
| 400 | IMAGE_TOO_LARGE: the image file exceeds 10 MB. |
| 400 | IMAGE_TOO_SMALL: the image dimensions are below 80 × 80 pixels. |
| 400 | IMAGE_DIMENSIONS_TOO_LARGE: the image dimensions exceed 8192 × 8192 pixels. |
| 400 | IMAGE_URL_BLOCKED: the image URL was rejected by URL safety checks (private IP ranges, localhost, non-public hosts). |
| 401 | Unauthorized: missing, invalid, or revoked API key. |
| 429 | Rate limit exceeded: burst limit hit. Maximum 600 requests per minute per API key. Slow down and retry; the retry_after field in the response body and the Retry-After header indicate how many seconds until the current window resets. |
| 429 | Monthly credit limit reached: monthly cap exhausted. The response body includes "limit_type": "monthly". All requests return 429 until the 1st of the next month or until you upgrade your plan. |
| 502 | Internal error: retry the request. If it persists, contact support@safemoderation.com. |
Failed image requests do not consume credits. If the image cannot be fetched, is in an unsupported format, or fails any safety check, the request is not charged.
{
"error": "Invalid or revoked API key."
}
Image errors include an additional code field:
{
"error": "Image is not JPEG or PNG.",
"code": "IMAGE_FORMAT_UNSUPPORTED"
}
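A minimal handling sketch, assuming the error shapes shown above; the handleModerationCall wrapper name is illustrative:

// Illustrative wrapper: surface API errors, including the machine-readable
// code that image errors carry, as a thrown Error.
async function handleModerationCall(url, options) {
  const res = await fetch(url, options);
  const data = await res.json();
  if (!res.ok) {
    const label = data.code ? `${data.code}: ${data.error}` : data.error;
    throw new Error(`SafeModeration ${res.status} - ${label}`);
  }
  return data;
}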
There are two distinct 429 conditions. A burst 429 is temporary: wait the number of seconds in retry_after and resend. A monthly 429 ("limit_type": "monthly") blocks all requests until the 1st of next month or until you upgrade your plan.
Each moderation request consumes credits from your monthly allowance:
| Content type | Credits |
|---|---|
| Text | 1 credit |
| Image | 3 credits |
Failed image requests do not consume credits. If the image cannot be fetched, is in an unsupported format, or fails any safety check, you are not charged.
| Plan | Monthly credits | Price |
|---|---|---|
| Starter | 15,000 | $99/mo |
| Growth | 150,000 | $249/mo |
| Pro | 500,000 | $499/mo |
| Enterprise | Custom | Contact us |
Credits reset on the 1st of each calendar month. Unused credits do not roll over.
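To size a plan, multiply expected volume by the per-type cost. For example, a platform moderating 10,000 comments (10,000 × 1 = 10,000 credits) and 1,000 image uploads (1,000 × 3 = 3,000 credits) per month consumes 13,000 credits, which fits within the Starter plan's 15,000.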
All plans share a burst limit of 600 requests per minute per API key. Exceeding this returns a 429 with a Retry-After header and a retry_after field in the response body indicating the seconds remaining in the current window.
async function moderateWithRetry(url, options) {
  const res = await fetch(url, options);
  if (res.status === 429) {
    const data = await res.json();
    // A monthly 429 will not clear until the next billing period; don't retry.
    if (data.limit_type === 'monthly') throw new Error('Monthly limit reached');
    const wait = (data.retry_after ?? 60) * 1000;
    await new Promise(r => setTimeout(r, wait));
    return fetch(url, options); // one retry after the window resets
  }
  return res;
}
Track your credit usage with the usage object returned in every response. You'll also receive email alerts at 80%, 90%, and 100% of your monthly limit.
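If you also want an in-process check, a sketch like the following works off that usage object; the 90% threshold and the notifyOps helper are illustrative assumptions, not part of the API:

// Illustrative in-process usage check based on the usage object
// returned with every response.
function checkUsage(data, warnAt = 0.9) {
  const { credits_used, monthly_credits } = data.usage;
  if (credits_used / monthly_credits >= warnAt) {
    notifyOps(`SafeModeration credits at ${credits_used}/${monthly_credits}`);
  }
}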
How quickly does the API respond?
Most text requests complete in under 200ms. Image requests typically take 500ms to 2 seconds, depending on image fetch time and file size.
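If you set client-side timeouts, size them to these latencies. A sketch using the standard AbortSignal.timeout (available in Node 18.17+ and modern browsers); the 5-second default is an illustrative assumption covering the slower image path:

// Illustrative client timeout around a moderation call.
async function moderateWithTimeout(url, options, ms = 5000) {
  return fetch(url, { ...options, signal: AbortSignal.timeout(ms) });
}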
Does SafeModeration store the content I submit?
For authenticated production API requests, we store the moderation result, your submitted text content (for text moderation), and the image URL (for image moderation). This data is accessible to you through your dashboard and supports moderation review, audit, and analytics. For image moderation, we fetch the image, classify it, then discard the bytes. We do not retain raw image content. We store a SHA-256 hash of the image bytes for caching purposes only. Moderation logs are retained for one year and then automatically deleted. See our Privacy Policy for full details on retention and your rights.
What image formats are supported?
JPEG and PNG are supported. GIF, WebP, and other formats are not supported and will return a 400 error. Failed image requests are not charged.
How are image categories different from text categories?
Both content types return eight categories. Four overlap: hate_speech, adult_content, violence_gore, and suicide_self_harm. Text adds harassment_bullying, spam_scam, pii_exposure, and profanity. Image adds weapons, drugs, alcohol, and tobacco.
Do failed image requests count against my monthly credits?
No. If the image cannot be fetched, is in an unsupported format, or fails any safety check, the request is not charged. Credits are only consumed when a moderation result is successfully returned.
What languages are supported?
SafeModeration handles text content in 50+ languages, including all major European, East Asian, South Asian, and Middle Eastern languages. English, Spanish, French, German, Portuguese, Arabic, and Russian have the most thoroughly tested coverage.
What should I do with flag decisions?
That depends on your platform's policies. Common approaches: route to human review, add a friction step before posting, hold content pending secondary analysis, or treat identically to block. There is no single right answer.
Can I test without a paid plan?
Your 7-day free trial includes full API access. No charge until the trial ends.
What happens if the API is down?
Requests that fail with a 502 should be retried; transient errors usually resolve on retry. For urgent or persistent issues, email support@safemoderation.com.
How do I cancel?
Manage your subscription from the billing section of your dashboard. Cancellation takes effect at the end of your current billing period.