Video API
Overview
The EnableX Video API is a session provisioning and reporting system that powers video communications. It does not handle media transport directly. Instead, it manages the server-side infrastructure that coordinates participants, enforces access control, and tracks session data. Your app server uses the API to create rooms, authorize users, and retrieve post-session analytics.
Think of it this way: the Video API is the control plane. The EnableX media infrastructure is the data plane. Your app server owns the user experience; the API connects you to our platform.
What the API gives you:
- Room provisioning (create, configure, list, delete video sessions)
- User authorization via tokens (JWT-based, role-aware, time-limited)
- Session analytics and CDR (call detail records, participant duration, quality metrics)
- Post-session data retrieval (recordings, chat transcripts, session transcripts, metadata)
- Webhook notifications (session start, end, participant join/leave, recording ready)
Base URL: https://api.enablex.io/video/v2
All endpoints require HTTP Basic Auth (App ID + App Key). Your app server makes these calls from a backend environment, never from client code.
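As a minimal sketch of the Basic Auth scheme described above, the header value is the Base64 encoding of `AppID:AppKey`. The function name and the placeholder credentials are illustrative, not part of the EnableX API.

```python
import base64


def basic_auth_header(app_id: str, app_key: str) -> dict[str, str]:
    """Build an HTTP Basic Auth header from an App ID and App Key."""
    raw = f"{app_id}:{app_key}".encode("utf-8")
    encoded = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}


# Placeholder credentials for illustration only; load real ones from
# server-side configuration, never from client code.
headers = basic_auth_header("my-app-id", "my-app-key")
```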
Core Concepts
Room
A room is a server-side session container that exists independently of any active participants. You create a room via REST API before anyone joins; the room persists and can host multiple sessions over time.
Why rooms exist: They decouple session provisioning from participant availability. You can schedule a room days in advance, share the room ID with users, and they join when ready. Or create ad-hoc rooms on demand. The room is the unit of access control and billing.
Room types:
- Permanent: Exists indefinitely. Useful for persistent spaces (team channels, support queues).
- Scheduled: Created for a specific time window. Useful for classes, webinars, meetings.
- Ad hoc: Created on demand and destroyed after use. Useful for instant peer connections, support chats.
Room modes: Each room operates in one of three modes that determines media flow and participant roles:
- Group (SFU): Up to 100 participants, all of whom can publish and subscribe. Behaves like peer-to-peer from the user's point of view, but media is routed centrally through our SFU (Selective Forwarding Unit).
- Lecture (MCU): Up to 2000 participants, strict role separation. Moderators publish; participants subscribe (with optional floor access for Q&A). Ideal for broadcasts and large events.
- P2P: Two participants. Direct media flow. Lowest latency, minimal signalling overhead.
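One way an app server might pick a mode from an expected head count is sketched below. The participant ceilings are copied from the list above; the helper name is hypothetical, and the limits available to you may also depend on your subscription plan. Size alone does not capture role separation, so treat the result as a starting point rather than a rule.

```python
# Participant ceilings per mode, as stated in the list above.
MODE_LIMITS = {"p2p": 2, "group": 100, "lecture": 2000}


def smallest_mode_for(participants: int) -> str:
    """Return the smallest-capacity mode that fits the expected head count."""
    if participants < 1:
        raise ValueError("participants must be at least 1")
    # Walk the modes from the smallest ceiling to the largest.
    for mode, limit in sorted(MODE_LIMITS.items(), key=lambda item: item[1]):
        if participants <= limit:
            return mode
    raise ValueError(f"no mode supports {participants} participants")
```

For example, `smallest_mode_for(30)` returns `"group"`, while a 500-person event maps to `"lecture"`.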
Token
A token is a short-lived JWT credential that authorizes a specific user to join a specific room with a specific role. Your app server generates tokens using your App Key; clients use tokens to authenticate with EnableX media servers.
Why tokens exist: They enforce fine-grained access control without exposing your App Key to clients. A token is bound to a room, a user, a role, and an expiry time. If a user should not join a room, don't issue them a token.
Token properties:
- Bound to: One room, one user ID, one role (moderator or participant).
- Lifetime: Default 24 hours; configurable per token.
- Single-use: A token is valid until the session ends or the token expires, whichever comes first.
- Generated server-side: Via REST API call with your App Key.
- Passed to client: Your backend sends the token to the client (via your own API, WebSocket, email, QR code, etc.). The client uses it to join the room.
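Because a token is a JWT, a backend can inspect its payload to decide whether to reissue it. The sketch below assumes the token carries the standard `exp` claim from the JWT specification; the claims EnableX actually includes are not documented here, so check a real token before relying on this. It decodes the payload without verifying the signature, which is fine for bookkeeping but never for authorization.

```python
import base64
import json
import time
from typing import Optional


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated JWT segments")
    payload_b64 = parts[1]
    # JWT segments are base64url without padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def is_expired(token: str, now: Optional[float] = None) -> bool:
    """Compare the (assumed) `exp` claim, in Unix seconds, with the clock."""
    claims = decode_jwt_payload(token)
    if "exp" not in claims:
        raise ValueError("token has no 'exp' claim")
    current = time.time() if now is None else now
    return current >= claims["exp"]
```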
Session
A session is the active instance of a room. It begins when the first participant joins and ends when the last participant leaves (or the session duration limit is reached).
Why sessions exist: They represent the actual video call. Multiple sessions can occur in the same room over time (e.g., a classroom room hosts a new session every day). Sessions are the unit of recording, analytics, and billing.
Session lifecycle:
- Session starts → first participant joins with a valid token.
- Session active → participants publish/subscribe media, send chat, share screens.
- Session ends → last participant leaves or session duration limit is hit.
- Post-session data available → ~5 minutes after session ends (CDR, recordings, transcripts).
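The start and end rules above can be modelled with a small tracker on your own server, for example to mirror join and leave events. This is an illustrative sketch only: the class and state names are hypothetical, and the duration limit is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class SessionTracker:
    """Mirror of one session: idle -> active -> ended."""

    participants: set[str] = field(default_factory=set)
    state: str = "idle"

    def join(self, user_id: str) -> None:
        if self.state == "ended":
            # A later join belongs to a new session in the same room.
            raise RuntimeError("session has ended; track a new session")
        self.participants.add(user_id)
        self.state = "active"  # the first join starts the session

    def leave(self, user_id: str) -> None:
        self.participants.discard(user_id)
        if self.state == "active" and not self.participants:
            self.state = "ended"  # the last participant left
```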
Session Lifecycle: Step by Step
Here is how a video session flows from creation to completion:
Step 1: Create a Room (Server-side)
Your app server calls the REST API to create a room. You specify the room name, mode (group/lecture/p2p), duration limit, and any settings (e.g., auto-record, broadcast settings). EnableX returns a room_id.
POST https://api.enablex.io/video/v2/rooms
Authorization: Basic [Base64(AppID:AppKey)]
Content-Type: application/json

{
  "name": "Sales Demo",
  "mode": "group",
  "duration": 3600,
  "settings": {
    "auto_record": true
  }
}

Response: A JSON object containing room_id, room settings, and status.
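The same request can be assembled from a backend using only the Python standard library. This sketch builds the request object without sending it; the function name and placeholder credentials are illustrative, and the body fields simply mirror the example above.

```python
import base64
import json
from urllib.request import Request

BASE_URL = "https://api.enablex.io/video/v2"


def build_create_room_request(
    app_id: str,
    app_key: str,
    name: str,
    mode: str,
    duration: int,
    auto_record: bool = False,
) -> Request:
    """Assemble (but do not send) the room-creation request."""
    credentials = base64.b64encode(f"{app_id}:{app_key}".encode("utf-8"))
    body = {
        "name": name,
        "mode": mode,
        "duration": duration,
        "settings": {"auto_record": auto_record},
    }
    return Request(
        f"{BASE_URL}/rooms",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Basic {credentials.decode('ascii')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder credentials; the body matches the sample request above.
req = build_create_room_request(
    "my-app-id", "my-app-key", "Sales Demo", "group", 3600, auto_record=True
)
```

Passing `req` to `urllib.request.urlopen` would send it; the response body can then be parsed with `json.loads` to read the `room_id`.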
Step 2: Generate a Token (Server-side)
When a user is ready to join, your backend generates a token for that user. The token is bound to the room, the user, and their role.
POST https://api.enablex.io/video/v2/rooms/{room_id}/tokens
Authorization: Basic [Base64(AppID:AppKey)]
Content-Type: application/json

{
  "user_id": "user_12345",
  "user_name": "Alice",
  "role": "moderator",
  "ttl": 86400
}

Response: A JWT string. Your backend sends this to the client via your own API, email, QR code, or other secure channel.
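A backend helper for this call might look like the sketch below. It only builds the request; the function name and the up-front role check are illustrative additions, while the path and body fields follow the example above.

```python
import base64
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://api.enablex.io/video/v2"
ROLES = {"moderator", "participant"}


def build_token_request(
    app_id: str,
    app_key: str,
    room_id: str,
    user_id: str,
    user_name: str,
    role: str,
    ttl: int = 86400,
) -> Request:
    """Assemble (but do not send) a token request for one user in one room."""
    if role not in ROLES:
        raise ValueError(f"role must be one of {sorted(ROLES)}")
    credentials = base64.b64encode(f"{app_id}:{app_key}".encode("utf-8"))
    body = {"user_id": user_id, "user_name": user_name, "role": role, "ttl": ttl}
    return Request(
        # Percent-encode the room ID so it is safe inside the URL path.
        f"{BASE_URL}/rooms/{quote(room_id, safe='')}/tokens",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Basic {credentials.decode('ascii')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Rejecting an unknown role before the call keeps an obvious mistake from costing a network round trip.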
Step 3: Client Connects Using Token
The client receives the token and calls the EnableX SDK (or WebRTC API) to join the room. The client passes the token; EnableX media servers validate it and establish a WebRTC connection. This involves ICE negotiation, STUN/TURN, and codec selection. The SDK handles all of this, so your app does not have to.
Step 4: Session is Active
The first participant joins → session starts. Media flows through EnableX media servers (SFU or MCU, depending on room mode). Participants publish audio/video, send chat messages, share screens. EnableX records media (if enabled), logs CDR events, and monitors quality.
Step 5: Session Ends
When the last participant leaves (or duration limit is reached), the session ends. EnableX:
- Stops recording (if enabled).
- Finalizes CDR and session metadata.
- Triggers a session.ended webhook to your server.
- Makes post-session data available via REST API (~5 min later).
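On the receiving side, a webhook endpoint typically parses the request body and dispatches on the event name. The sketch below shows only that dispatch step. The payload shape (an `event` field and a `session_id` field) is an assumption for illustration; only the `session.ended` event name comes from this page, so consult the webhook reference for the real schema.

```python
import json
from typing import Callable

Handler = Callable[[dict], None]
HANDLERS: dict[str, Handler] = {}
ended_sessions: list[str] = []


def on(event_name: str) -> Callable[[Handler], Handler]:
    """Register a handler function for one webhook event name."""
    def register(handler: Handler) -> Handler:
        HANDLERS[event_name] = handler
        return handler
    return register


@on("session.ended")
def handle_session_ended(payload: dict) -> None:
    # Hypothetical field name; here we just remember which sessions ended.
    ended_sessions.append(payload.get("session_id", ""))


def dispatch(raw_body: bytes) -> bool:
    """Route a raw webhook body to its handler; False if none is registered."""
    payload = json.loads(raw_body)
    handler = HANDLERS.get(payload.get("event"))
    if handler is None:
        return False
    handler(payload)
    return True
```

Returning `False` for unknown events lets the HTTP layer acknowledge them without failing, which keeps the endpoint tolerant of event types added later.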
Step 6: Post-Session Data Retrieval
Your server can now retrieve:
- CDR (Call Detail Record): Participant list, duration per participant, quality metrics (bandwidth, latency, packet loss).
- Recording: MP4 or WebM video file (if auto-recorded or manually recorded).
- Chat transcript: All chat messages, with timestamps and sender info.
- Session transcript: AI-generated speech-to-text of audio (if transcription enabled).
- Metadata: Room settings, mode, duration, participant list.
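Since post-session data can lag the end of the session by a few minutes, a server that polls for it needs a bounded retry loop. The helper below is a generic sketch: `fetch` stands in for whatever call retrieves the CDR or recording and is expected to return `None` until the data exists. The attempt count and delay are arbitrary defaults, not EnableX recommendations.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until_ready(
    fetch: Callable[[], Optional[T]],
    attempts: int = 6,
    delay_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call `fetch` until it returns a value or the attempts run out."""
    for attempt in range(attempts):
        result = fetch()
        if result is not None:
            return result
        # Wait between attempts, but not after the final one.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return None
```

Where a webhook such as the recording-ready notification is available, reacting to it avoids polling altogether; the loop is a fallback.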
What You Can Build
The Video API and media infrastructure support a wide range of use cases:
- Telehealth consultations: Doctor-patient 1-on-1 (P2P) or group clinic visits (group mode). Auto-record for compliance, retrieve CDR for billing.
- Virtual classrooms: Instructor (moderator) streams to 100+ students (lecture mode). Breakout rooms for group work. Retrieve recordings for asynchronous learning.
- Customer support video: Support agent (moderator) video chats with customers (p2p or group). Screen sharing for troubleshooting. Chat for quick reference.
- Team meetings and collaboration: Up to 100 participants in group mode. Screen sharing, chat, breakout rooms. Persistent meeting rooms for recurring standups.
- Interview platforms: Interviewer and candidate (p2p). Auto-record for review. CDR for scheduling analytics.
- Live webinars: Speaker (moderator) broadcasts to thousands (lecture mode, view-only or with Q&A floor access). RTMP/HLS live streaming option.
- Peer-to-peer marketplaces: Service providers and customers connect via p2p video. Automated room creation and token issuance via webhooks.
Frequently Asked Questions
What platforms does the EnableX Video SDK support?
EnableX Video SDK is available for iOS (Swift/Objective-C), Android (Java/Kotlin), Web (JavaScript/WebRTC), Flutter, React Native, and Apache Cordova. A prebuilt Video UI Kit is available for iOS, Android, Flutter, and React Native, ready to embed with a single class and a session token.
How many participants can join an EnableX video session?
EnableX video rooms support up to 1,000 participants per session, depending on the room configuration and subscription plan. Typical use cases include 1-on-1 video calls, group meetings, webinars, and large virtual events.
Does the EnableX Video API support session recording?
Yes. Cloud recording is built into the platform. A moderator or your server can initiate recording, and the output is delivered to cloud storage. Recordings are available for download from the Portal or via API.
What is the difference between the Video SDK and the Video UI Kit?
The raw Video SDK gives you full API access to manage streams, layouts, and controls so you build the UI yourself. The Video UI Kit is a prebuilt, customizable video call interface with chat, screen sharing, participant list, and recording controls that you embed with a single class and a token, typically in under an hour.
Does EnableX Video support screen sharing and annotation?
Yes. Screen sharing and annotation are supported on Android and Web. On iOS, screen sharing uses ReplayKit via a broadcast extension. Annotation is available for Android and Web participants; iOS participants can receive annotations but cannot initiate them in the current release.
Is a free trial available for the EnableX Video API?
Yes. Sign up on the EnableX Portal to create a free project. New accounts receive trial credits to test the full Video API feature set before committing to a paid plan.
Next Steps
Ready to integrate? Here's the recommended path:
- Architecture Deep Dive: Understand the three-tier model and how your server interacts with EnableX.
- Quickstart Guide: Create your first room and token in 5 minutes.
- App Server Implementation Guide: Best practices for server-side room and token management, webhooks, and error handling.
- Full API Reference: All endpoints, parameters, and response codes.
- SDK Integration Guide: How your clients use the SDK to join rooms.