Guide — Building a Video Application
A complete EnableX video application is made up of two parts: an App Server and a Client Application. Neither works in isolation — the app server provisions the session, and the client application is what users actually interact with. Both must be in place before a user can join a live video session.
This guide walks through how the two parts fit together, how a user gets into a session, and the essential steps your client application performs from device access to disconnect. Code examples use the Web SDK. The same logical steps apply to all other SDK variants — refer to the relevant SDK reference for the exact method names.
The App Server is a backend web service you build and host. It holds your EnableX API
credentials (App ID and App Key) and is the only component
that talks directly to the EnableX Video API. Your client applications never
call the Video API directly.
The App Server is responsible for:
- Creating video rooms — A room defines the configuration of the session: maximum participants, recording settings, mode (group or lecture), and more. Rooms are created via the Video API and identified by a unique room_id. A room can be reused across many sessions over time.
- Issuing access tokens — Each user who joins a session needs a short-lived token generated for that specific room. The token is signed and encodes the user's role (moderator or participant). The App Server requests it from the Video API using your App credentials and delivers it to the client application before connection.
- Handling webhooks — EnableX posts notifications to your App Server when sessions start and end, when recordings are ready, and when files are transferred. Your server acts on these to automate workflow (updating your database, notifying users, fetching CDR reports, and so on). See Webhook Notifications.
- Post-session reporting — After a session ends, your App Server can fetch the Call Detail Report (CDR) and recording files via the Video API and store them in your own infrastructure.
Your App ID and App Key must only ever be used on your App
Server — never embedded in a client application, a browser page, or a mobile app.
The client receives a token, not credentials.
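To illustrate how the credentials stay on the server, here is a minimal sketch of how an App Server might assemble its token request. The helper name `buildTokenRequest`, the `apiBase` parameter, the `/rooms/{room_id}/tokens` path, the use of HTTP Basic authentication, and the `name`, `role`, and `user_ref` body fields are all assumptions made for illustration. Confirm each of them against the Video API reference before relying on it.

```javascript
// Hypothetical helper: builds the HTTP request an App Server would send
// to obtain a token. Nothing here runs in the client application.
function buildTokenRequest(apiBase, appId, appKey, roomId, user) {
  // Assumption: the Video API accepts HTTP Basic auth with App ID and App Key.
  const credentials = Buffer.from(appId + ":" + appKey).toString("base64");
  return {
    // Assumption: tokens are created under the room resource.
    url: apiBase + "/rooms/" + encodeURIComponent(roomId) + "/tokens",
    options: {
      method: "POST",
      headers: {
        "Authorization": "Basic " + credentials,
        "Content-Type": "application/json"
      },
      // Assumed field names for the display name, role, and user reference.
      body: JSON.stringify({ name: user.name, role: user.role, user_ref: user.ref })
    }
  };
}

// Example: credentials come from server-side environment variables.
const request = buildTokenRequest(
  "https://video-api.example.com", // placeholder base URL, not a real endpoint
  process.env.ENX_APP_ID || "demo-app-id",
  process.env.ENX_APP_KEY || "demo-app-key",
  "room-123",
  { name: "Alice", role: "participant", ref: "user-42" }
);
console.log(request.url);
```

On Node 18 or later the result could be passed to `fetch(request.url, request.options)`. Keeping the request builder separate from the network call makes it easy to check that no credential ever appears in a response sent to the client.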
The client application is what your end users run — a browser tab, an Android app, an iOS app, or a hybrid mobile app. It is built using one of the EnableX Video SDKs and handles everything the user sees and interacts with during a session.
Choosing an SDK
Select the SDK that matches the platform your application targets:
- Web SDK — Browser-based applications on desktop and mobile browsers.
- Android SDK — Native applications for Android devices.
- iOS SDK — Native applications for iOS devices.
- Flutter SDK — Hybrid applications on the Flutter framework (iOS, Android, Web).
- React Native SDK — Hybrid applications on the React Native framework.
- Cordova / Ionic SDK — Hybrid applications on Cordova-based frameworks.
All SDK variants connect to the same EnableX infrastructure. A user on a browser, an Android app, and an iOS app can all join the same video room and communicate with each other in real time — the SDK handles platform differences transparently.
All code examples below use the Web SDK. The same flow — request a token, connect, publish, subscribe, handle active talkers, disconnect — applies to every SDK. Refer to your platform's SDK reference for exact method signatures and parameters.
Before a user can join a session, they need a token. The token flow is always the same regardless of which SDK or platform you are using:
- User opens your application and requests to join a session (e.g., by clicking a meeting link or entering a room code).
- Your client application calls your App Server — an authenticated request to your own backend endpoint (e.g., GET /api/token?roomId=…).
- Your App Server calls the EnableX Video API — specifically the Token route — passing the room_id, the user's role, and any user metadata. The Video API returns a signed token.
- Your App Server returns the token to the client — in the HTTP response body or via a push mechanism.
- The client application uses the token to connect — passed directly to the SDK's join/connect method. EnableX validates the token before granting access to the room.
Tokens carry an expiry. Request a fresh token shortly before the user connects, and do not cache tokens across sessions. If a token expires before the user joins, the connection is rejected.
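The client side of this flow can be sketched as follows. The route mirrors the example endpoint mentioned in the steps. The helper names `tokenUrl` and `requestToken`, the cookie-based authentication, and the `{ token: "..." }` response shape are assumptions, since your App Server defines its own contract.

```javascript
// Builds the URL for the App Server's token endpoint. The path follows the
// example route in the steps; your backend may expose a different one.
function tokenUrl(roomId) {
  return "/api/token?roomId=" + encodeURIComponent(roomId);
}

// Requests a fresh token right before connecting. `fetchImpl` defaults to the
// global fetch and can be replaced with a stub in tests.
async function requestToken(roomId, fetchImpl) {
  const doFetch = fetchImpl || fetch;
  // Assumption: the user is authenticated to your backend via cookies.
  const response = await doFetch(tokenUrl(roomId), { credentials: "include" });
  if (!response.ok) {
    throw new Error("Token request failed with status " + response.status);
  }
  // Assumption: the App Server responds with JSON of the form { token: "..." }.
  const body = await response.json();
  return body.token;
}
```

Calling `requestToken` immediately before the SDK's join call, rather than at page load, keeps the gap between issuance and use short, which is the point of the expiry guidance.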
Once the client application has a token, the session follows a predictable sequence. Each step below shows the Web SDK call and links to the SDK reference for full detail.
Step 1 — Access Media Devices
Before connecting to the room, your application should confirm that the user's camera and microphone are accessible. The Web SDK provides a method to enumerate available media devices and check permissions. This allows you to surface the right UI — for example, letting the user pick a camera before joining, or gracefully handling the case where permissions are denied.
// Enumerate available media devices
EnxRtc.getDevices(function(devices) {
  // devices: { audioinput: [...], videoinput: [...], audiooutput: [...] }
  // Let the user choose or apply defaults before connecting
});
See Web SDK — Device Access for full enumeration options and permission handling.
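As a complement, the sketch below shows one way to turn a flat device list into the grouped shape described in the comment above and to decide what the user can publish. It works on objects shaped like the browser's standard `MediaDeviceInfo` (`kind`, `deviceId`, `label`). The helper names `groupDevicesByKind` and `summarizeAvailability` are hypothetical, and the SDK's own method may already return a different structure.

```javascript
// Groups MediaDeviceInfo-like objects by their `kind` property.
function groupDevicesByKind(devices) {
  const groups = { audioinput: [], videoinput: [], audiooutput: [] };
  for (const device of devices) {
    // Ignore any kind that is not one of the three standard values.
    if (Object.prototype.hasOwnProperty.call(groups, device.kind)) {
      groups[device.kind].push({ deviceId: device.deviceId, label: device.label });
    }
  }
  return groups;
}

// Reports whether the user has at least one microphone and one camera.
function summarizeAvailability(groups) {
  return {
    canPublishAudio: groups.audioinput.length > 0,
    canPublishVideo: groups.videoinput.length > 0
  };
}

// Sample data standing in for a real device list.
const sample = [
  { kind: "audioinput", deviceId: "mic-1", label: "Built-in Microphone" },
  { kind: "audiooutput", deviceId: "spk-1", label: "Built-in Speakers" }
];
console.log(summarizeAvailability(groupDevicesByKind(sample)));
```

In a browser, the standard call `navigator.mediaDevices.enumerateDevices()` resolves to a list that can be passed straight to `groupDevicesByKind`. A result with `canPublishVideo` set to false is a cue to offer an audio-only join instead of failing later.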
Step 2 — Connect, Publish, and Subscribe
Joining a room, publishing your local media stream, and subscribing to remote streams
all happen through a single method call: EnxRtc.joinRoom(). The SDK
connects the signalling socket, negotiates media, publishes your stream, and returns
the room object along with any remote streams already in the session.
// Define what you want to publish (camera + mic)
var publishOptions = {
  audio: true,
  video: true
};

// Holds the connected room object once joinRoom succeeds
var room;

// Join the room: connect, publish, and get remote streams in one call
var localStream = EnxRtc.joinRoom(TOKEN, publishOptions, function(success, error) {
  if (error) {
    // Handle connection failure; check error.code for the reason
    return;
  }
  // 'success.room' is the connected room object; save it for later method calls
  room = success.room;
  // 'success.streams' is the list of remote streams already in the room
  // Subscribe to each one so you can receive their audio and video
  success.streams.forEach(function(stream) {
    room.subscribe(stream);
  });
});
New remote streams published after you connect are delivered via the
stream-added event — subscribe to those too:
room.addEventListener("stream-added", function(event) {
  room.subscribe(event.stream);
});
See Web SDK — Connecting to a Session and Web SDK — Stream Management for the full set of publish options, stream events, and subscription handling.
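If your codebase prefers Promises over callbacks, the join step can be wrapped along the following lines. This is a sketch that assumes the `(success, error)` callback signature shown in the example above. The wrapper name `joinRoomAsync` is hypothetical, and the SDK object is passed in as an argument so that the code can also run against a stub outside the browser.

```javascript
// Hypothetical wrapper: turns the callback-style joinRoom call into a Promise.
// `sdk` is the EnxRtc object, or a stub with the same joinRoom signature.
function joinRoomAsync(sdk, token, publishOptions) {
  return new Promise(function (resolve, reject) {
    const session = {};
    session.localStream = sdk.joinRoom(token, publishOptions, function (success, error) {
      if (error) {
        reject(error);
        return;
      }
      session.room = success.room;
      // Subscribe to streams that were already in the room at join time.
      success.streams.forEach(function (stream) {
        session.room.subscribe(stream);
      });
      resolve(session);
    });
  });
}

// Stub standing in for the SDK so the sketch runs without a browser.
const subscribed = [];
const stubSdk = {
  joinRoom: function (token, options, callback) {
    const room = { subscribe: function (stream) { subscribed.push(stream.id); } };
    callback({ room: room, streams: [{ id: "s1" }, { id: "s2" }] }, null);
    return { id: "local" };
  }
};

joinRoomAsync(stubSdk, "TOKEN", { audio: true, video: true }).then(function (session) {
  console.log("joined with local stream", session.localStream.id, "subscribed to", subscribed);
});
```

The resolved `session` object carries both the room and the local stream, so later steps such as playing streams and disconnecting can take a single value instead of relying on shared variables.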
Step 3 — Play Streams
Once subscribed, call stream.play() to render a stream into a DOM element.
This works for your local stream as well as any subscribed remote stream, screen share
stream, or canvas stream.
// Play your local stream in an element with id "local-player"
localStream.play("local-player");

// Play a remote stream (call this inside stream-subscribed or active-talkers-updated)
room.addEventListener("stream-subscribed", function(event) {
  event.stream.play("remote-player-" + event.stream.streamId);
});
Step 4 — Handle Active Talkers
In a multi-party session, EnableX continuously tracks who is speaking and sends the client an updated Active Talker list — up to 12 of the most actively speaking participants, ordered with the latest talker first. This is the primary signal your application uses to decide which video tiles to show on screen.
Listen to the active-talkers-updated event and re-render your video grid
whenever the list changes:
room.addEventListener("active-talkers-updated", function(event) {
  var talkers = event.message.activeList;
  talkers.forEach(function(talker) {
    var stream = room.remoteStreams.get(talker.streamId);
    if (stream) {
      // Render this stream in the grid slot for this talker
      stream.play("grid-slot-" + talker.streamId);
    }
  });
});
The active talker list also includes screen share (streamId: 101) and
canvas streams (streamId: 102) when those features are active.
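Because the list mixes camera talkers with the two reserved streams, it can help to separate them before laying out the grid. The sketch below does that and also computes which tiles to add or remove between updates. The helper names `classifyTalkers` and `planGridUpdate` are hypothetical, and each entry is assumed to expose a `streamId` that is either a number or a numeric string.

```javascript
// Reserved stream IDs mentioned above.
const SCREEN_SHARE_STREAM_ID = 101;
const CANVAS_STREAM_ID = 102;

// Splits an active talker list into camera talkers and the two special streams.
function classifyTalkers(activeList) {
  const result = { talkers: [], screenShare: null, canvas: null };
  for (const entry of activeList) {
    // Number() lets the comparison work for numeric and string IDs alike.
    const id = Number(entry.streamId);
    if (id === SCREEN_SHARE_STREAM_ID) {
      result.screenShare = entry;
    } else if (id === CANVAS_STREAM_ID) {
      result.canvas = entry;
    } else {
      result.talkers.push(entry);
    }
  }
  return result;
}

// Given the stream IDs currently rendered and the latest talker entries,
// returns the IDs whose tiles should be removed and the IDs to add.
function planGridUpdate(renderedIds, talkers) {
  const nextIds = talkers.map(function (t) { return t.streamId; });
  return {
    remove: renderedIds.filter(function (id) { return !nextIds.includes(id); }),
    add: nextIds.filter(function (id) { return !renderedIds.includes(id); })
  };
}

const classified = classifyTalkers([{ streamId: 3 }, { streamId: 101 }, { streamId: 7 }]);
console.log(planGridUpdate([3, 5], classified.talkers));
```

Applying only the difference avoids tearing down and re-playing tiles for participants who remain in the list, and a non-null `screenShare` or `canvas` entry can be routed to a dedicated, larger slot.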
Step 5 — Disconnect
When the user leaves the session, call room.disconnect(). The SDK
releases media tracks and closes the signalling socket. Listen to
room-disconnected to clean up your UI.
// Register the listener first so the room-disconnected event is not missed
room.addEventListener("room-disconnected", function(event) {
  // Session is over: clean up the UI, navigate away, etc.
});

// Also handle remote users leaving
room.addEventListener("user-disconnected", function(event) {
  // event.message contains the disconnected user's info
  // Remove their video tile from the grid
});

// Disconnect from the room when the user leaves
room.disconnect();
See Web SDK — Session Management for moderator controls, room mode switching, and forced disconnection of participants.
The steps above cover the essential session lifecycle — enough to have users joining rooms, seeing each other, and leaving cleanly. Everything beyond this is feature development driven by your use case:
- Recording — Start and stop session recording, configure layouts.
- Screen Share — Share the user's screen as a separate stream.
- Live Streaming / HLS — Broadcast to a larger audience via HLS or RTMP.
- Floor Access Control — Manage who can speak in lecture-mode sessions.
- Breakout Rooms — Split participants into sub-sessions.
- Chat, Reactions, Annotations — In-session communication features.
All of these are covered in the SDK reference pages for your chosen platform: