Technical Specifications

This page documents the exact media parameters, codec support, streaming constraints, recording behaviour, and infrastructure details of EnableX Video. Use it as a reference when designing your integration, sizing bandwidth, configuring quality layers, or troubleshooting media quality.

Video Streaming Quality

EnableX Video supports three quality tiers for camera streams: HD, SD, and LD. Each tier defines a maximum and minimum resolution at a fixed 26 FPS frame rate. The actual quality delivered to each subscriber is selected automatically based on available bandwidth — you specify the quality layers you want to publish; EnableX handles the rest.

Tier	Label	Maximum Resolution	Minimum Resolution	Frame Rate
HD	720 pixels	1280 × 720	320 × 180	26 FPS
SD	480 pixels	640 × 480	320 × 180	26 FPS
LD	240p / 180 pixels	640 × 360	80 × 45	26 FPS

Video Layers (Simulcast)

Simulcast lets a publisher send multiple quality versions of their stream simultaneously. Subscribers with high-bandwidth connections receive HD; those on slower connections automatically receive SD or LD — all from a single published stream. You configure the number of layers when initialising the local stream.

1 Layer — HD only (720p). Suitable for small rooms or fixed high-bandwidth environments.
2 Layers — HD (720p) + SD (480p). Recommended for most group sessions.
3 Layers — HD (720p) + SD (480p) + LD (240p/180p). Best for large, bandwidth-heterogeneous audiences.

More simulcast layers increase the publisher's upload bandwidth requirement proportionally, but give subscribers the best adaptive experience. For webinars or lecture mode with many participants, 3 layers is recommended.
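As a rough illustration of the layer options above, the following sketch maps a layer count to the tiers a publisher would send. The resolutions are copied from the quality-tier table; the `Tier` class and `tiers_for_layers` function are hypothetical names used only for this example and are not part of the EnableX SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    max_resolution: tuple[int, int]
    min_resolution: tuple[int, int]


# Values copied from the quality-tier table in this document.
TIERS = {
    "HD": Tier("HD", (1280, 720), (320, 180)),
    "SD": Tier("SD", (640, 480), (320, 180)),
    "LD": Tier("LD", (640, 360), (80, 45)),
}

# Layers are added from highest to lowest quality: 1 = HD, 2 = HD + SD, 3 = HD + SD + LD.
LAYER_ORDER = ["HD", "SD", "LD"]


def tiers_for_layers(layer_count: int) -> list[Tier]:
    """Return the tiers published for a given simulcast layer count (hypothetical helper)."""
    if layer_count not in (1, 2, 3):
        raise ValueError("layer_count must be 1, 2, or 3")
    return [TIERS[name] for name in LAYER_ORDER[:layer_count]]
```

Calling `tiers_for_layers(3)` returns all three tiers, which corresponds to the configuration recommended above for webinars and lecture mode.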

Codec Support

Video Codec

EnableX uses VP8 as the standard video codec across all platforms and SDKs. VP8 is royalty-free, well-supported across all WebRTC-capable browsers and devices, and produces consistent quality results on constrained networks.

Support for H.264 has been discontinued. Do not design your integration around H.264 codec assumptions. EnableX will negotiate VP8 for all sessions regardless of client preference.

Audio Codec

All audio is encoded and decoded using Opus (RFC 6716). Opus operates at adaptive bitrates from 6 kbps to 510 kbps, has built-in Forward Error Correction (FEC), and handles packet loss gracefully, which makes it well suited to real-time voice over varying network conditions.

Active Talker Streams

In a multi-participant session, not all streams are delivered at full quality simultaneously. EnableX processes, records, and transmits a maximum of 9 top active talker streams at any given time. The platform dynamically determines the "top 9" based on audio activity (who is currently speaking or has spoken most recently).

Streams beyond the top-9 threshold remain present in the session but are not forwarded to other participants until their publisher enters the active set, which happens when a current active talker drops out of it. This model keeps sessions with many participants bandwidth-efficient without manual stream management.
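The selection described above can be pictured with a minimal sketch. It assumes the ranking depends only on when each stream last carried speech; the platform's actual algorithm is not documented here, and `top_active_talkers` and its input shape are hypothetical.

```python
def top_active_talkers(last_spoke_at: dict[str, float], limit: int = 9) -> list[str]:
    """Return up to `limit` stream IDs, most recent speaker first.

    `last_spoke_at` maps a stream ID to the timestamp (in seconds) of its
    most recent audio activity. Ranking by recency alone is an assumption.
    """
    ranked = sorted(last_spoke_at, key=lambda stream_id: last_spoke_at[stream_id], reverse=True)
    return ranked[:limit]
```

With twelve publishers in a room, the three who spoke least recently fall outside the returned list and would not be forwarded until they speak again.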

Screen Share

Screen sharing transmits the presenter's display (or application window) as a dedicated stream. It uses a separate stream slot with its own resolution and bandwidth budget, independent of the camera stream. Enable screen sharing in your room settings with settings.screen_share: true.

Parameter	Value
Default quality	1080p HD
Maximum resolution	1920 × 1080 @ 6 FPS
Transmit bandwidth	300 KBps
Receive bandwidth	300 KBps

The actual resolution may vary depending on the physical size and DPI of the source screen. A 4K source screen is downscaled to 1920×1080; a smaller source screen may produce a lower effective resolution.
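To make the downscaling behaviour concrete, here is a small sketch that caps a source screen at 1920 × 1080. It assumes the aspect ratio is preserved and that smaller screens are never upscaled; the source only states the 4K case, so treat the function (a hypothetical helper) as an approximation.

```python
def effective_screen_resolution(
    width: int, height: int, max_width: int = 1920, max_height: int = 1080
) -> tuple[int, int]:
    """Scale a source screen down to fit within the maximum, never up (assumed behaviour)."""
    scale = min(max_width / width, max_height / height, 1.0)
    return (round(width * scale), round(height * scale))
```

A 3840 × 2160 source maps to 1920 × 1080, while a 1366 × 768 laptop screen is passed through unchanged.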

Canvas Streaming

Canvas Streaming allows you to publish a programmatically rendered HTML5 canvas as a video stream. This is used for whiteboard sessions, overlays, composite video scenes, or any scenario where your application generates the video frame rather than a camera. Enable it with settings.canvas: true in your room configuration.

Parameter	Value
Maximum resolution	1280 × 720 @ 23 FPS
Minimum resolution	320 × 180 @ 6 FPS
Transmit bandwidth	300 KBps
Receive bandwidth	300 KBps

Recording

EnableX recording captures each participant stream individually and then composites them into a single playable video file. Understanding the recording pipeline helps you plan for storage, post-processing, and playback integration.

Recording Pipeline

Source streams: Each individual stream in a session is recorded separately in MKV format. MKV is used for raw capture because it is resilient to connection drops mid-session.
Post-session processing: After the session ends, EnableX transcodes and composites the individual MKV streams into a single MP4 file. The MP4 is what you download and play back.
Maximum recording quality: Up to 720p HD (as received on the server).
Maximum transcoded quality: 480p SD. The playable MP4 output is capped at 480p regardless of the source quality.

File Sizes

Quality	Approximate File Size
720p HD (source)	~11 MB per minute
480p SD (transcoded MP4)	~4 MB per minute
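For storage planning, the per-minute figures above can be turned into a quick estimate. The sketch below assumes the 720p source figure applies to each individually recorded stream, which the table does not state explicitly; `estimate_recording_mb` is a hypothetical helper.

```python
# Approximate sizes from the File Sizes table, in MB per minute.
MB_PER_MINUTE = {"720p_source": 11, "480p_mp4": 4}


def estimate_recording_mb(duration_minutes: float, quality: str, streams: int = 1) -> float:
    """Estimate storage in MB. `streams` only matters for per-stream source files (assumption)."""
    return duration_minutes * MB_PER_MINUTE[quality] * streams
```

For a 60-minute session, the transcoded MP4 comes to roughly 240 MB, and three 720p source streams would come to roughly 1980 MB under the per-stream assumption.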

Output Format & Storage

The final playable recording is delivered as MP4 (H.264 + AAC).
Recordings are stored on EnableX servers and are downloadable via REST API using a signed URL.
Retention is 90 days by default; download and archive before this window closes.
A recording.ready webhook is emitted when the file is available for download.

Recording Triggers

Auto-record: Set auto_record: true when creating the room. Recording starts automatically with the session.
On-demand: Start and stop recording during an active session via SDK or REST API.

In regulated industries (healthcare, finance), verify local data protection laws before enabling recording. Many jurisdictions require explicit participant consent prior to recording.

Quality Adaptation

EnableX Video automatically adapts to changing network conditions without any intervention from your application code. The adaptation lifecycle works as follows:

Auto adaptation: The platform continuously monitors available bandwidth for each participant and adjusts the received video quality (resolution and bitrate) to deliver the best experience possible within the current network constraint.
Audio-only fallback: When available bandwidth drops below the threshold required for any video quality tier, the platform falls back to audio-only mode automatically. The participant remains in the session without any UI code change required.
Auto restoration: When network connectivity improves, the platform automatically restores video communication, stepping the quality back up through LD → SD → HD as bandwidth allows.
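The adaptation lifecycle above amounts to picking the highest tier the current bandwidth can sustain, with audio-only as the floor. The sketch below illustrates that logic only; the kbps thresholds are invented for the example because this page does not publish the real ones, and the platform performs this selection for you.

```python
# Hypothetical thresholds in kbps, highest tier first. These are NOT EnableX values.
ASSUMED_MIN_KBPS = [("HD", 1200), ("SD", 500), ("LD", 150)]


def select_quality(available_kbps: float) -> str:
    """Return the best tier the bandwidth allows, or 'audio-only' below every threshold."""
    for tier, min_kbps in ASSUMED_MIN_KBPS:
        if available_kbps >= min_kbps:
            return tier
    return "audio-only"
```

As bandwidth recovers from 50 to 200 to 600 to 2000 kbps, the result steps from audio-only through LD and SD back to HD, mirroring the restoration order described above.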

Media Tunneling

EnableX Video sends real-time media over UDP for the lowest latency. When UDP is blocked (corporate firewalls, restrictive NATs), the platform falls back to a TURN relay automatically.

Parameter	Value
Protocol	UDP (RTP)
Port range	30000 – 35000
Fallback	TURN server relay (configurable)
Signalling	WSS (WebSocket Secure) over TCP 443

If your users are behind a corporate firewall, ensure UDP ports 30000–35000 are open outbound. If UDP is blocked entirely, EnableX TURN fallback will maintain the session — at slightly higher latency — over TCP.
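When auditing firewall rules, it can help to check whether the allowed outbound UDP ranges fully cover 30000 to 35000. The following sketch does that for a list of inclusive port ranges; the function name and the rule representation are assumptions made for this example.

```python
MEDIA_PORTS = (30000, 35000)  # inclusive UDP media range from the table above


def media_range_allowed(allowed_ranges: list[tuple[int, int]]) -> bool:
    """Return True if the union of inclusive (start, end) ranges covers every media port."""
    next_needed = MEDIA_PORTS[0]
    for start, end in sorted(allowed_ranges):
        if start > next_needed:
            break  # gap: next_needed is not covered by any range
        next_needed = max(next_needed, end + 1)
        if next_needed > MEDIA_PORTS[1]:
            return True
    return False
```

A rule set that returns False here leaves at least one media port blocked, so some sessions may end up on the TURN relay path.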

PSTN / SIP

EnableX supports both inbound and outbound telephony integration for hybrid sessions that mix WebRTC participants with phone-line participants.

Dial-In to Session: Participants can join a session from a regular phone by dialing a PSTN number. Audio is bridged into the video session.
In-Session Dial-Out: The moderator or application can dial out to a phone number from within an active session, bridging the call into the room.

SIP trunking is also supported for enterprise PBX and contact-centre integrations. Contact EnableX sales for PSTN/SIP provisioning.

Platform Capabilities at a Glance

Capability	Specification
Participants per room	Up to 100 (group mode), up to 2000 (lecture mode, view-only), 2 (p2p)
Session duration	Configurable per room, up to 24 hours (extendable on request)
Recording	Auto-record or on-demand. MP4 and WebM output. Multiple layouts (grid, spotlight, sidebar).
Live streaming	RTMP and HLS output to external services (YouTube Live, Twitch, etc.)
Screen sharing	Built-in. Presenter shares screen; other participants see screen + presenter video.
Chat	In-session text chat. Persistent (available in post-session transcript). Supports rich text, files, emojis.
Whiteboard	Collaborative whiteboard for brainstorming and teaching. Recordable.
Breakout rooms	Facilitator can divide participants into subgroups. Automated regrouping.
Transcription	Live and post-session speech-to-text. Multiple language support. Timestamps and speaker identification.
Metadata and CDR	Participant list, duration per participant, bitrate, latency, packet loss, connection changes, platform/browser info.
Webhooks	Real-time event notifications: session start/end, participant join/leave, recording ready, transcription ready.
Global deployment	Media servers in 50+ countries. Auto-routing by geography. No manual region selection.
Security	DTLS-SRTP for media, WSS for signalling, HTTPS for API. TLS 1.2+. Support for end-to-end encryption (E2EE) on request.

Room Modes: Detailed Comparison

Aspect	Group (SFU)	Lecture (MCU)	P2P
Max participants	100	2000 (view-only)	2
Media routing	SFU: each participant publishes one stream, SFU forwards to others	MCU: streams composited into single output	Direct peer-to-peer
Publish capability	All participants can publish	Only moderators publish by default; participants with floor access can publish	Both participants can publish
Roles	Participant (all equal)	Moderator (publishes, controls) and Participant (receives, requests floor)	Symmetric (both equal)
Typical bandwidth per participant	500-1500 kbps (incoming, varies by participant count)	500-2000 kbps (incoming, depends on layout)	300-1000 kbps (direct, low latency)
Latency	100-300 ms (SFU relay)	200-500 ms (MCU processing)	20-100 ms (direct peer)
Moderator controls	Mute participant audio/video, remove participant, lock room	Mute, remove, grant floor access, control recording, switch layouts	None (symmetric)
Use cases	Team meetings, collaborative sessions, group interviews, peer support	Webinars, live streams, large classes, town halls, broadcasts	1-on-1 support, pair interviews, doctor-patient, sales calls
Recording layouts	Grid (all participants), gallery (featured + grid)	Grid, spotlight, sidebar, custom layouts	Side-by-side
Screen sharing	Yes (presenter screen + video visible to all)	Yes (screen + layout adjusts)	Yes (both see each other and shared screen)
Scaling recommendations	Keep under 50 participants for optimal quality; beyond 50, consider lecture mode	Optimized for 100s to 1000s; no performance degradation with participant count	Always 2; no scaling needed
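The limits and scaling recommendations in this table can be condensed into a simple rule of thumb. The sketch below returns a mode label based only on expected participant count; it ignores other factors such as whether attendees need to publish, and the returned strings are labels for this example rather than confirmed API values.

```python
def recommend_room_mode(participants: int) -> str:
    """Suggest a room mode from the participant count, following the table's guidance."""
    if participants < 2:
        raise ValueError("a session needs at least 2 participants")
    if participants == 2:
        return "p2p"
    if participants <= 50:
        return "group"  # group mode allows up to 100, but under 50 is advised
    if participants <= 2000:
        return "lecture"
    raise ValueError("more than 2000 participants exceeds the lecture-mode limit")
```

Note that a session of 51 to 100 participants is still permitted in group mode; the function prefers lecture mode there only because the table advises it for optimal quality.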

Session Behavior and Lifecycle

Session Start

A session starts when the first participant joins a room with a valid token. At this moment:

A session instance is created (assigned a session_id).
A session.started webhook is emitted to your server.
CDR recording begins (start time, initial participant list).
If auto-record is enabled, recording begins.
If live streaming is enabled, RTMP/HLS broadcast begins.

Session Active

While the session is active:

Participants can publish/subscribe media (mode-dependent).
Participants can send chat, share screens, collaborate on whiteboard, request floor access.
Transcription (if enabled) is continuously processed.
Quality metrics are sampled (bandwidth, latency, packet loss, jitter).
Webhooks are emitted for participant join/leave events.
Moderators can mute, remove participants, switch recording layouts.

Session End

A session ends when one of these conditions is met:

The last participant leaves voluntarily.
The session duration limit is reached (configured per room, default or custom). All participants are auto-disconnected.
The room is explicitly deleted via API (all active sessions end).
A moderator ends the session via SDK or dashboard.

When a session ends:

Media streams stop.
A session.ended webhook is emitted with final metadata.
Recording is finalized (if enabled).
Live stream is stopped.
CDR is finalized and marked as complete.
Post-session data becomes available (recordings, transcripts, final metrics) within ~5 minutes.
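A server receiving these lifecycle webhooks typically routes each event to its own handler. The sketch below shows one way to do that with the event names used on this page (session.started, session.ended, recording.ready). The JSON payload shape, including the `type` field, is an assumption; consult the webhook reference for the actual schema.

```python
import json
from typing import Callable


def handle_webhook(raw_body: str, handlers: dict[str, Callable[[dict], None]]) -> str:
    """Dispatch a webhook body to the handler registered for its event type."""
    event = json.loads(raw_body)
    event_type = event.get("type")  # assumed field name, not confirmed by this page
    handler = handlers.get(event_type)
    if handler is None:
        return "ignored"  # unknown events are acknowledged but not processed
    handler(event)
    return "handled"
```

Because post-session data can take about 5 minutes to appear, a session.ended handler would normally schedule retrieval rather than fetch recordings immediately.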

Post-Session Data Availability

After a session ends, post-session data is available for 90 days by default. This includes:

CDR (full participant list, durations, quality metrics).
Recording (MP4/WebM file).
Chat transcript.
Audio transcript (if transcription enabled).
Session metadata (participants, duration, mode, settings).

Retrieve this data via REST API or via webhook event. Plan your storage and archival strategy accordingly.

Warning

Data older than 90 days is automatically purged. If you need longer retention, retrieve and store the data before the 90-day window closes, or contact EnableX to arrange extended retention.

Encryption and Protocols

Protocols

WebRTC (RTP over UDP): Bi-directional media. Adaptive bitrate, FEC (forward error correction), jitter buffers.
WSS (WebSocket Secure): Signalling (SDP, ICE candidates). TLS-encrypted, runs over TCP, traverses most firewalls.
HTTPS: All Video API calls. TLS 1.2 or higher.

Encryption and Security

Media encryption: DTLS-SRTP. All RTP media is encrypted using keys negotiated via DTLS. Verified via SRTP authentication tags.
Signalling encryption: WSS (WebSocket over TLS). All SDP and ICE candidate exchange is encrypted.
API encryption: HTTPS. All REST API calls use TLS 1.2 or higher.
Certificate pinning: Optional. If your app requires pinning, contact EnableX for our certificate chain.
End-to-End Encryption (E2EE): Available upon request for HIPAA/regulated use cases. Ensures media is encrypted before leaving the client; EnableX never has plaintext access.

Media quality adapts automatically to available bandwidth. See the Quality Adaptation section above for details.

SDK and Platform Support

EnableX provides SDKs for Web (JavaScript), iOS, Android, React Native, Flutter, and Cordova. The native SDKs use the platform's WebRTC implementation directly, which provides better performance and battery efficiency. See the Video SDK overview and the Browser Compatibility guide for full support matrices.

Scalability and Performance

Horizontal Scaling

EnableX media servers are deployed globally and scale horizontally. You do not provision or manage capacity:

Automatic scaling: As you add rooms and sessions, our infrastructure scales transparently. No configuration, no quota requests.
Geographic distribution: Media servers are deployed in 50+ countries. Clients automatically route to the nearest server. Latency is minimized without manual intervention.
No region selection: You do not specify regions when creating rooms. The system handles routing.
Multi-region redundancy: If a region fails, users are automatically rerouted to a neighboring region.

Performance Characteristics

Metric	Typical Value	Note
Connection setup time	2-5 seconds	From SDK join() call to media flowing. Varies by network and distance to media server.
One-way latency	50-150 ms	P2P: 20-100 ms. SFU: 100-300 ms. MCU: 200-500 ms. Depends on geography and network path.
Session start time	<100 ms	From first participant join to session.started webhook.
Participant add/remove time	<200 ms	Media streams added/removed. Renegotiation for other participants.
Recording initialization	1-3 seconds	Auto-record starts when session starts. First frames may be brief.
Post-session data availability	~5 minutes	Recording, transcripts, CDR available via API ~5 min after session ends.

Capacity Limits

Rooms per app: Unlimited. You can create as many rooms as needed.
Sessions per room: Unlimited. A room can host multiple sequential sessions over time.
Participants per session: 100 (group), 2000 (lecture), 2 (p2p). Hard limits enforced by the platform.
Concurrent sessions: Depends on plan. Contact EnableX for limits specific to your pricing tier.
API rate limits: 100 requests per second per App ID (standard tier). Higher limits available on enterprise plans.
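To stay under the standard-tier limit of 100 requests per second, a client can throttle itself before calling the API. The sketch below is a generic token bucket, not an EnableX component; the class name and the injectable `clock` parameter are choices made for this example.

```python
import time
from typing import Callable


class TokenBucket:
    """Client-side limiter: allows bursts up to `capacity`, refilled at `rate_per_sec`."""

    def __init__(
        self,
        rate_per_sec: float = 100.0,
        capacity: float = 100.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the caller should wait."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Call `try_acquire()` before each REST request and back off briefly when it returns False. Passing a custom `clock` makes the limiter easy to test without sleeping.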

Recording Specifications

Formats and Output

Format	Container	Video Codec	Audio Codec	Use Case
MP4	ISO Base Media	H.264	AAC	Web playback, email, mobile sharing. Widely compatible.
WebM	Matroska	VP8 or VP9	Opus	Web-native, lower file size, good for streaming.

Recording Layouts

Grid: All active participants in a grid. Scales automatically as participants join/leave.
Spotlight: One participant full-screen, others in a strip below (lecture mode). Moderator can switch spotlight.
Sidebar: Main speaker full-screen on left, others in a sidebar on right.
Custom: Custom layouts available for enterprise customers. Contact EnableX for details.

Recording Resolutions

480p: 854x480, 30 fps. Suitable for mobile playback, lower bandwidth, smaller files.
720p: 1280x720, 30 fps. Balanced quality and file size. Recommended for most use cases.
1080p: 1920x1080, 30 fps. High quality. Larger file size. For screen sharing and detailed content.

You specify resolution per room during creation or update.

Storage and Retrieval

Hosted storage: Recordings are stored on EnableX servers by default. Retention: 90 days. Downloadable via REST API.
Download: Recordings are available as direct HTTP downloads (signed URLs, expire in 24 hours). Stream or save to your storage.
Webhook notification: When recording is ready, a recording.ready webhook is sent with download URL and metadata.
Custom storage: Enterprise customers can configure S3 bucket sync. Recording files are automatically uploaded to your S3 bucket.
Lifecycle policies: Set up automatic deletion or archival based on retention policies.
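The two time windows above (90-day retention and 24-hour signed URLs) are easy to track in an archival job. This sketch assumes retention is counted from the end of the session and that URL expiry is counted from when the URL was issued; both starting points are assumptions, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=90)       # default hosted retention
SIGNED_URL_TTL = timedelta(hours=24)  # signed download URL lifetime


def purge_deadline(session_ended_at: datetime) -> datetime:
    """Latest moment to archive a recording, assuming retention starts at session end."""
    return session_ended_at + RETENTION


def url_is_fresh(issued_at: datetime, now: datetime) -> bool:
    """True while a signed URL is still inside its 24-hour window (assumed start: issuance)."""
    return now - issued_at < SIGNED_URL_TTL
```

An archival job could download each recording as soon as recording.ready arrives, and request a new signed URL whenever `url_is_fresh` returns False.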

Recording Triggers

Auto-record: Set auto_record: true during room creation. Recording starts automatically when session starts, ends when session ends.
On-demand recording: Start/stop recording via REST API or SDK during an active session. Useful for selective recording or recording only certain segments.

Important

In regulated industries (healthcare, finance), ensure compliance with local data protection laws before enabling auto-record. Some jurisdictions require explicit participant consent for recording.

Limits and Quotas Summary

Resource	Limit	How to Request Increase
Rooms per app	Unlimited	N/A
Sessions per room	Unlimited	N/A
Participants per group session	100	Not extendable (use lecture mode instead)
Participants per lecture session	2000	Not extendable (architecture limit)
Session duration	24 hours (configurable)	Contact support for longer sessions
Concurrent active sessions	Plan-dependent	Upgrade plan or contact sales
API requests per second	100 (standard tier)	Upgrade plan or contact sales
Post-session data retention	90 days	Contact sales for extended retention
Recording storage	Plan-dependent	Upgrade plan or configure external S3

Tip

Monitor your usage via the EnableX dashboard. Set up billing alerts to avoid surprises. For large-scale deployments, engage EnableX sales to discuss custom limits and pricing.
