Unified chat endpoint for the Handa Uncle assistant.
Option 1: Registered users (`bearerAuth`)

1. Obtain an access token (via `/api/v1/auth/signup` or `/api/v1/auth/signin`).
2. Send `Authorization: Bearer <token>` on every chat request.

Option 2: Guest/device users (`deviceAuth`)

1. Call `GET /app/launch` with `x-device-id` and `x-platform` to receive a synthetic `userId`.
2. Send `x-device-id`, `x-platform`, and optionally `x-user-id` (from launch) on `/api/v1/ai/chat` until `FREE_MESSAGE_THRESHOLD` is hit.
3. The `(threshold + 1)` request returns `429 FREE_LIMIT_EXCEEDED` with `error.details.isGuest = true` and `error.details.requiresSignup = true`; show the signup prompt.
4. After signup, the account keeps the same `userId`. Optional `x-user-email` / `x-user-phone` hints help link devices to existing accounts.

Send `X-Stream-Response: true` (alias `Stream: true`) to receive `text/event-stream` chunks (`token`, `toolCall`, `toolResult`, `done`). Set it to `false` for a buffered JSON response.

| Scenario | Required headers |
|---|---|
| Option 1: Registered user (`bearerAuth`) | `Authorization: Bearer <Auth0 access token>` |
| Option 2: Guest / device flow (`deviceAuth`) | `x-device-id` (required), `x-platform` (required: `android`/`ios`/`web`), `x-user-id` (optional), `x-user-email` (optional), `x-user-phone` (optional) |
| Streaming toggle | `X-Stream-Response: false` or `Stream: false` to disable streaming (streaming is ON by default) |
The two authentication methods (bearerAuth and deviceAuth) are mutually exclusive—choose one based on whether the user is registered or a guest. The playground will show the appropriate fields based on your selection.
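The header rules in the table above can be sketched as a small helper. This is a minimal illustration, not part of any official SDK: the function name `build_auth_headers` and its error messages are hypothetical, while the phone pattern is the one documented for `x-user-phone`.

```python
import re

ALLOWED_PLATFORMS = {"android", "ios", "web"}
# Pattern documented for the x-user-phone hint.
E164_HINT = re.compile(r"^\+?[1-9]\d{7,14}$")


def build_auth_headers(token=None, device_id=None, platform=None,
                       user_id=None, user_email=None, user_phone=None):
    """Return headers for exactly one auth mode (hypothetical helper)."""
    if token and device_id:
        raise ValueError("bearerAuth and deviceAuth are mutually exclusive")
    if token:
        return {"Authorization": f"Bearer {token}"}
    if not device_id:
        raise ValueError("x-device-id is required when no bearer token is sent")
    if platform not in ALLOWED_PLATFORMS:
        raise ValueError("x-platform must be android, ios, or web")
    headers = {"x-device-id": device_id, "x-platform": platform}
    # Optional hints are only sent when the caller has them.
    if user_id:
        headers["x-user-id"] = user_id
    if user_email:
        headers["x-user-email"] = user_email
    if user_phone:
        if not E164_HINT.match(user_phone):
            raise ValueError("x-user-phone must look like an E.164 number")
        headers["x-user-phone"] = user_phone
    return headers
```

Rejecting a call that supplies both a token and a device ID is a client-side choice made here to mirror the "mutually exclusive" rule; how the server treats such a request is not stated in this document.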
To disable streaming, send `X-Stream-Response: false` or `Stream: false` in your request headers. Free-message limits are enforced by `usageLimitMiddleware`.

`x-platform` must be `android`, `ios`, or `web`.

1. `GET /app/launch`: call `GET /app/launch` with `x-device-id` and `x-platform`. The response returns a synthetic `userId` (e.g. `U-813e62a0-...`) even when no Auth0 account exists.
2. Send `POST /api/v1/ai/chat` with the device headers plus the `x-user-id` returned from the launch step. Requests work immediately until the free counter reaches the configured `FREE_MESSAGE_THRESHOLD` (default 100).
3. On the `(threshold + 1)` request the API responds with `429 FREE_LIMIT_EXCEEDED` and the payload includes `error.details.isGuest = true` and `error.details.requiresSignup = true`. Frontends should show the signup modal at this point.
4. After signup, the `userId` and conversation history stay intact because the backend links the identity via the stored device metadata.

Device headers are subject to the same chat rate limiting and audit trails as
JWTs. Requests with spoofed or missing headers are rejected with `401 UNAUTHORIZED`.
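The threshold behaviour in the guest flow can be modelled in a few lines. This is a toy model for reasoning about client behaviour, not the backend's implementation: the class name `GuestQuotaModel` is hypothetical, and the error payload shape is inferred from the `error.code` and `error.details.*` fields named in this document.

```python
FREE_MESSAGE_THRESHOLD = 100  # documented default; the real value is server configuration


class GuestQuotaModel:
    """Toy model of the free counter, for reasoning about client behaviour only."""

    def __init__(self, threshold=FREE_MESSAGE_THRESHOLD):
        self.threshold = threshold
        self.used = 0

    def send(self):
        """Return the HTTP status and error payload a guest could expect."""
        if self.used >= self.threshold:
            # The (threshold + 1) request and every later one is refused.
            return 429, {
                "error": {
                    "code": "FREE_LIMIT_EXCEEDED",
                    "details": {"isGuest": True, "requiresSignup": True},
                }
            }
        self.used += 1
        return 200, None
```

With `threshold=3`, the first three calls to `send()` succeed and the fourth returns the 429 payload, which is the point where a frontend would show the signup modal.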
| Field | Type | Required | Notes |
|---|---|---|---|
| `message` | string | ✅ | 1–2000 characters of user input. |
| `conversationId` | string | ⛔ | MongoDB ObjectId; omit to create a new conversation. |
| `model` | string | ⛔ | Overrides the configured default (subject to allowlists). |
| `attachments[].fileId` | string | ⛔ | Reference to a file uploaded via the File Upload service. |
| `attachments[].data` | string | ⛔ | Base64 payload for inline uploads (max 20MB after decoding). |
| `attachments[].mimeType` | string | ⛔ | Required when `data` is provided. |
| `attachments[].filename` | string | ⛔ | Required when `data` is provided. |
| `prepromptKey` | string | ⛔ | Optional identifier for a backend-managed pre-prompt. When present, the chat pipeline injects the masked instructions tied to this key right after the system prompt. |

Each attachment must provide either `fileId` or (`data` + `mimeType` + `filename`). The
backend validates file size and ownership, then converts supported media into the AI SDK
multimodal format before invoking the LLM.
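A client can check the attachment rule before sending a request. The sketch below is an assumption-laden illustration: `validate_attachment` is a hypothetical helper, "20MB" is read as 20 × 1024 × 1024 bytes, and the server's own validation (including the ownership check) may differ.

```python
import base64
import binascii

MAX_INLINE_BYTES = 20 * 1024 * 1024  # assumption: "20MB" read as 20 MiB


def validate_attachment(att):
    """Return None if the attachment dict is well-formed, else an error string."""
    if att.get("fileId"):
        return None  # a reference to an uploaded file needs nothing else
    data = att.get("data")
    if not data:
        return "attachment needs fileId or data"
    if not att.get("mimeType") or not att.get("filename"):
        return "mimeType and filename are required when data is provided"
    try:
        decoded = base64.b64decode(data, validate=True)
    except (binascii.Error, ValueError):
        return "data is not valid base64"
    # The documented limit applies to the decoded size, not the base64 length.
    if len(decoded) > MAX_INLINE_BYTES:
        return "inline payload exceeds 20MB after decoding"
    return None
```

Returning an error string rather than raising keeps the helper easy to use when validating a whole `attachments` list and reporting every problem at once.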
Pre-prompts are managed server-side (`POST /api/v1/preprompts`, protected by the backend
secret). Client applications should fetch the user-facing catalog from
`GET /api/v1/public/preprompts` and send the selected `prepromptKey` alongside the chat
payload. Users only see the friendly label, while the backend silently injects the
corresponding hidden prompt into the LLM context.

`usageLimitMiddleware` inspects `c.get('isGuest')` and returns tailored limit
errors:

- Guests: `"isGuest": true, "requiresSignup": true`.
- Registered users: `"Please upgrade your account..."` plus `"isGuest": false, "requiresSignup": false`.

Use the flags to decide whether to show a signup modal or upsell screen.
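The flag-based decision described above might look like this on the client. It is a sketch under assumptions: `limit_prompt` is a hypothetical helper, and the payload is assumed to nest the code and flags under `error.code` and `error.details`, as the field paths in this document suggest.

```python
def limit_prompt(status, payload):
    """Map a chat error response to a UI prompt: "signup", "upgrade", or None."""
    error = (payload or {}).get("error") or {}
    if status != 429 or error.get("code") != "FREE_LIMIT_EXCEEDED":
        return None  # not a free-quota error; handle elsewhere
    details = error.get("details") or {}
    if details.get("requiresSignup") or details.get("isGuest"):
        return "signup"
    return "upgrade"
```

Other 429 responses, such as `RATE_LIMIT_EXCEEDED`, deliberately return `None` so that a rate-limit hit never triggers a signup or upsell screen.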
Each event is sent as `data: <json>\n\n`. Expect the following shapes:

- `{"type":"token","content":"..."}` – incremental tokens.
- `{"type":"toolCall","toolName":"...","args":{...}}` – tool invocation start.
- `{"type":"toolResult","toolName":"...","result":{...}}` – tool output.
- `{"type":"done","conversationId":"...","messageId":"...","tokenCount":123,"meta":{"timestamp":"...","streaming":true,"durationMs":1400}}`
- `{"type":"error","message":"Streaming failed","error":"..."}` – terminal failures.

Keep the connection open until a `done` or `error` event arrives.

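The event shapes above can be consumed with a small parser. This sketch works on a complete response body held in memory to keep it short; a real client would buffer incoming chunks and split on blank lines as they arrive. The function names are hypothetical.

```python
import json


def parse_sse(raw):
    """Yield decoded event dicts from a text/event-stream body held in memory."""
    for frame in raw.split("\n\n"):
        for line in frame.splitlines():
            if line.startswith("data:"):
                yield json.loads(line[len("data:"):].strip())


def collect_reply(raw):
    """Concatenate token events and stop at the first done or error event."""
    text, final = [], None
    for event in parse_sse(raw):
        kind = event.get("type")
        if kind == "token":
            text.append(event.get("content", ""))
        elif kind in ("done", "error"):
            final = event
            break
    return "".join(text), final
```

`collect_reply` ignores `toolCall` and `toolResult` events; a UI that wants to show tool activity would handle those two types in the same loop.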
| HTTP | error.code | When it fires |
|---|---|---|
| 401 | `UNAUTHORIZED` | Missing bearer token and device headers. |
| 429 | `RATE_LIMIT_EXCEEDED` | More than 20 requests/min per user/device. |
| 429 | `FREE_LIMIT_EXCEEDED` | Guest or free-tier user exhausted the free quota (`details.requiresSignup` will be true for guests). |
| 500 | `INTERNAL_ERROR` | Upstream failure (LLM, storage, RAG, etc.). |

On `FREE_LIMIT_EXCEEDED`, surface a
signup or upgrade prompt before re-sending traffic.
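To avoid `RATE_LIMIT_EXCEEDED` in the first place, a client can throttle itself to the documented 20 requests per minute. The sliding-window guard below is one possible approach; the class name is hypothetical, and the server's actual windowing algorithm is not described in this document, so treat the numbers as a conservative approximation.

```python
import time
from collections import deque


class ClientThrottle:
    """Sliding-window guard that keeps a client under N requests per window."""

    def __init__(self, max_requests=20, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self.sent = deque()  # timestamps of requests inside the current window

    def try_acquire(self):
        """Return True and record the request if it fits in the window."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window_seconds:
            self.sent.popleft()
        if len(self.sent) >= self.max_requests:
            return False
        self.sent.append(now)
        return True
```

The `clock` parameter exists so the limiter can be driven by a fake clock in tests; production code would leave it at the default `time.monotonic`.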
Note: No X-Stream-Response header needed—streaming is the default behavior.
| Parameter | In | Constraints | Description |
|---|---|---|---|
| `Authorization` | header | | Auth0 access token for registered users. |
| `x-device-id` | header | length ≥ 1 | Required when the `Authorization` header is omitted. Stable identifier for the calling device. |
| `x-platform` | header | `android`, `ios`, `web` | Client platform. Required with device auth. |
| `x-user-id` | header | | Optional hint for mapping the device to an existing user (returned by `GET /app/launch`). Strongly recommended so conversations persist once the guest signs up. |
| `x-user-email` | header | | Optional email hint for device-auth flows. |
| `x-user-phone` | header | `^\+?[1-9]\d{7,14}$` | Optional phone hint in E.164 format for device-auth flows. |
| `X-Stream-Response` | header | `true`, `false` | Streaming is ON by default. Set to `false` to receive buffered JSON instead of SSE streaming. |
| `Stream` | header | `true`, `false` | Alias for `X-Stream-Response`. Streaming is ON by default. Set to `false` to disable. |
| `message` | body | 1 - 2000 characters | User input. |
| `conversationId` | body | `^[0-9a-fA-F]{24}$` | Existing conversation ID. Omit to start a new thread. |
| `model` | body | | Optional override for the default model. |
| `prepromptKey` | body | 2 - 64 characters, `^[a-zA-Z0-9][a-zA-Z0-9_-]{1,63}$` | Optional identifier for a backend-managed pre-prompt. When supplied, the associated masked instructions are injected right after the system prompt. |

The body also accepts an optional hint for how the conversation was initiated. Use 'voice' for voice input, 'file' for file uploads (auto-detected if attachments are present), or 'text' for plain messages (default). Allowed values: `text`, `file`, `voice`.
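The documented body constraints can be checked client-side before a request is sent. This is a minimal sketch, assuming a plain dict payload; `validate_chat_body` and its messages are hypothetical, and only the length limits and the two patterns come from the reference above.

```python
import re

OBJECT_ID = re.compile(r"^[0-9a-fA-F]{24}$")
PREPROMPT_KEY = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9_-]{1,63}$")


def validate_chat_body(body):
    """Return a list of problems found in a chat request body (empty if none)."""
    problems = []
    message = body.get("message")
    if not isinstance(message, str) or not 1 <= len(message) <= 2000:
        problems.append("message must be a string of 1-2000 characters")
    conversation_id = body.get("conversationId")
    if conversation_id is not None and not (
        isinstance(conversation_id, str) and OBJECT_ID.match(conversation_id)
    ):
        problems.append("conversationId must be a 24-character hex ObjectId")
    key = body.get("prepromptKey")
    if key is not None and not (isinstance(key, str) and PREPROMPT_KEY.match(key)):
        problems.append("prepromptKey must match the documented pattern")
    return problems
```

Optional fields are only validated when present, which matches the "omit to start a new thread" behaviour of `conversationId`.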