Tencent Cloud Real-Time Audio and Video (TRTC) in Practice: Building Low-Latency Live Streaming and Multi-Person Video Calls

cloud 2026-05-29

Today, with online co-hosting, remote work, online teaching, and interactive live streaming all popular, many developers have been asked to "build a low-latency audio and video call feature." If you start from scratch and grind through the WebRTC protocol, media streaming servers, weak-network optimization, and acoustic echo cancellation (AEC), you may well lose your hair before you ship a stable commercial release.

For enterprise-grade audio and video development, the path that saves the most time and effort is to integrate Tencent Cloud TRTC (Tencent Real-Time Communication) directly. It wraps the complex audio and video internals in a few simple SDK calls and runs on Tencent's global acceleration network, which keeps global end-to-end latency within 300 milliseconds.

Today we won't recite the official documentation, and we'll skip the filler. Grab your computer: we'll walk through the core architecture and build a low-latency live streaming and multi-person video call system of your own.

Phase I: Understand TRTC's underlying call model and the "room" concept

Before writing any code, you need a mental model of how TRTC works; otherwise you won't even know how a stream gets from one user to another.

All audio and video interaction in TRTC takes place inside a virtual space called a "room".

Entering the room (EnterRoom): Any user, anchor or audience, who wants to speak or watch others speak must first enter the same room, identified by a "room number".

Pushing a stream (Publish): After entering the room, if you want others to see you, you "push" the audio and video captured by your phone or computer camera and microphone to the cloud through a Tencent Cloud edge node.

Pulling a stream (Subscribe): If you want to watch Zhang San in the room, the SDK "pulls" Zhang San's stream from Tencent Cloud, decodes it, and plays it.

Throughout the call, TRTC's 3A processing (acoustic echo cancellation AEC, noise suppression ANS, automatic gain control AGC) is applied automatically, which is why you don't need to write your own noise-removal code.
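The enter, publish, and subscribe relationship described above can be pictured with a toy in-memory model. This is only a mental aid, not how TRTC is implemented; the class name `ToyRoom` and its methods are made up for illustration.

```javascript
// Toy model of the "room" concept. NOT the TRTC implementation:
// it only shows who is allowed to publish and who can pull which stream.
class ToyRoom {
  constructor(roomId) {
    this.roomId = roomId;
    this.members = new Set();    // userIds currently in the room
    this.published = new Map();  // userId -> description of the published stream
  }

  enter(userId) {
    this.members.add(userId);
  }

  publish(userId, stream) {
    if (!this.members.has(userId)) {
      throw new Error(`${userId} must enter the room before publishing`);
    }
    this.published.set(userId, stream);
  }

  // Streams a member could pull: every published stream except their own.
  subscribable(userId) {
    if (!this.members.has(userId)) {
      throw new Error(`${userId} must enter the room before subscribing`);
    }
    return [...this.published.keys()].filter(id => id !== userId);
  }
}

const room = new ToyRoom(12345);
room.enter('zhang_san');
room.enter('viewer_1');
room.publish('zhang_san', 'camera+mic');
console.log(room.subscribable('viewer_1')); // the viewer can pull zhang_san's stream
```

The point of the model is the ordering: nothing can be published or pulled until the user is inside the room.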

Phase II: Tencent Cloud console setup and generating the UserSig credential

Log in to the Tencent Cloud console, then search for and open "Real-time Audio and Video TRTC".

Click "App Management" -> "Create App" and give your app a name (such as My Low Latency Audio and Video System).

After the app is created, the system issues two core credentials. Record them somewhere safe and never disclose them:

SDKAppID: your application's unique ID (a string of digits).

Key (SecretKey): the string used to compute the signature.

Key pitfall: what is UserSig?

To prevent others from maliciously stealing your TRTC traffic, any user who wants to enter your room must carry a security signature called UserSig (the equivalent of a temporary pass).

Development and testing (the shortcut): micro-build or the console provides a "basic configuration" page where you enter your user name (UserId) and it computes a temporary UserSig with one click, which you can copy straight into your code.

Production (the hard-core way): Never hard-code your SecretKey in the app or front-end code! The correct approach is to compute UserSig on your back-end server (for example, with Node.js, Java, or Python). Each time the app enters a room, it requests a UserSig from your own server interface, so the key never leaves the server.
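As a rough sketch of that production flow in Node.js: the server holds the SecretKey and hands back only the signature. The function `createUserSigIssuer` and the `sign` parameter are hypothetical names for this example; `sign` stands in for Tencent's official server-side signing library, whose actual algorithm is not reproduced here.

```javascript
// Server-side sketch. `sign` is a placeholder for the official UserSig
// signing library; swap in the real one. All names here are illustrative.
function createUserSigIssuer({ sdkAppId, secretKey, sign, ttlSeconds = 3600 }) {
  if (!secretKey) {
    throw new Error('SecretKey must be configured on the server');
  }
  return function issue(userId) {
    if (typeof userId !== 'string' || userId === '') {
      throw new Error('userId is required');
    }
    const userSig = sign({ sdkAppId, secretKey, userId, ttlSeconds });
    // Only the signature goes back to the client; the SecretKey never does.
    return { sdkAppId, userId, userSig, expiresInSeconds: ttlSeconds };
  };
}

// Demo with a fake signer, for illustration only (not a valid UserSig).
const issueUserSig = createUserSigIssuer({
  sdkAppId: 0,                          // placeholder, use your SDKAppID
  secretKey: 'loaded-from-server-config', // placeholder, never ship to clients
  sign: ({ userId }) => `fake-sig-for-${userId}`,
});
console.log(issueUserSig('user_boss'));
```

In a real service this function would sit behind an authenticated HTTP endpoint, so that only logged-in users can obtain a UserSig for their own UserId.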

Phase III: Hands-On Exercise 1: Building a multi-person video call (fully interactive)

Video calls (e.g. executive meetings, online murder-mystery games) have these characteristics: everyone in the room is a main character, everyone pushes a stream and also pulls everyone else's, and the latency requirements are extremely strict.

We'll take the most common option, the Web/H5 JavaScript SDK, as the example (the iOS/Android logic is equivalent). A few lines of core code take you through it:

1. Import and initialize the SDK

JavaScript

import TRTC from 'trtc-js-sdk';

// 1. Create a TRTC client object
const client = TRTC.createClient({
  mode: 'rtc', // rtc is the multi-person video call mode, tuned for very low latency
  sdkAppId: 1400xxxxxx, // placeholder: fill in your numeric SDKAppID
  userId: 'user_boss', // ID of the current user
  userSig: 'xxxxxxxxx' // placeholder: the UserSig issued for this user
});

2. Enter the room, capture, and publish

JavaScript

// 2. Enter the room (room number: 12345)
await client.join({ roomId: 12345 });

// 3. Capture audio and video from the local microphone and camera
const localStream = TRTC.createStream({ audio: true, video: true });
await localStream.initialize(); // Requests camera and microphone access

// 4. Play the local preview in the <div> with id 'local-video-view' so you can see yourself
localStream.play('local-video-view');

// 5. Publish your stream to Tencent Cloud so others in the room can see you
await client.publish(localStream);

3. Listen for and subscribe to remote streams

When another user (such as user_employee) enters the room and publishes a stream, the SDK fires an event. We only need to listen for it and subscribe:

JavaScript

// Register these listeners before calling client.join() so no event is missed.

// 6. Listen for the remote stream added event
client.on('stream-added', event => {
  const remoteStream = event.stream;
  // Subscribe to this user's stream
  client.subscribe(remoteStream);
});

// 7. Listen for the subscription success event and attach the stream to the page
client.on('stream-subscribed', event => {
  const remoteStream = event.stream;
  // Play the remote video in the <div> whose id ends with this user's ID
  // (the element must already exist on the page)
  remoteStream.play('remote-video-view-' + remoteStream.getUserId());
});

As long as the devices on both sides run this logic, you have a working multi-person video conferencing system with high-definition picture quality and latency as low as 200 ms.
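The subscription handler above plays each remote stream into a container whose id is derived from the user ID, so that container has to exist first. One possible way to create it on demand and remove it when the stream goes away is sketched below. The helper name `wireRemoteVideo` and the parent element id `remote-videos` are assumptions for this example; `client` is assumed to be a trtc-js-sdk client and `doc` the page's `document`.

```javascript
// Sketch: create a <div> per remote user when their stream is subscribed,
// and clean it up when the stream is removed. Names are illustrative.
function wireRemoteVideo(client, doc, parentId = 'remote-videos') {
  const containerId = stream => 'remote-video-view-' + stream.getUserId();

  client.on('stream-subscribed', event => {
    const stream = event.stream;
    const parent = doc.getElementById(parentId);
    if (!parent) return; // nothing to attach to on this page

    const div = doc.createElement('div');
    div.id = containerId(stream);
    parent.appendChild(div);
    stream.play(div.id); // render the remote video inside the new <div>
  });

  client.on('stream-removed', event => {
    const stream = event.stream;
    stream.stop(); // stop playback before removing the container
    const div = doc.getElementById(containerId(stream));
    if (div) div.remove();
  });
}
```

In a page this would be called once, for example `wireRemoteVideo(client, document)`, before entering the room.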

Phase IV: Hands-On Exercise 2: Building a low-latency live streaming scenario (10,000 viewers with co-hosting)

Live streaming scenarios (such as live commerce and influencer PK battles) behave very differently from meetings:

Only one or two anchors in the room are pushing streams, while tens or even hundreds of thousands of viewers watch. If hundreds of thousands of people were all allowed to enter the room and push streams to each other, server bandwidth would be overwhelmed instantly and the cost would be enormous.

To handle live streaming, TRTC relies on two mechanisms: "role switching" and "cloud stream mixing".

1. Mode switching

When initializing the client, the mode must be changed to live:

JavaScript

const client = TRTC.createClient({
  mode: 'live', // live is the interactive live streaming mode
  sdkAppId: 1400xxxxxx, // placeholder: fill in your numeric SDKAppID
  userId: 'user_audience',
  userSig: 'xxxxxxxxx'
});

2. Distinguish between anchor and audience roles

When entering the room, you must explicitly declare your role:

Anchor: has permission to push a stream and can speak to the camera.

Ordinary audience (Audience): by default can only watch by pulling streams and uses no upstream bandwidth to Tencent Cloud, which is very economical.

JavaScript

// Enter the room as an audience member
await client.join({ roomId: 88888, role: 'audience' });

// To go on mic (co-host), an audience member does not need to leave the room;
// one call switches the role in place:
await client.switchRole('anchor');

// After becoming an anchor, reuse the publishing code from the third phase to turn on the camera and publish your own stream.
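Putting the role switch and the publishing steps together, going on mic and coming back off might look like the sketch below. The helper names `goOnMic` and `leaveMic` and the `createLocalStream` factory are assumptions for this example; `client` is assumed to be a live-mode trtc-js-sdk client that has already joined as 'audience', and the factory would typically be `() => TRTC.createStream({ audio: true, video: true })`.

```javascript
// Sketch: switch to anchor, capture, and publish; undo everything on failure.
async function goOnMic(client, createLocalStream) {
  await client.switchRole('anchor');
  const localStream = createLocalStream();
  try {
    await localStream.initialize();    // ask for camera and microphone
    await client.publish(localStream);
    return localStream;
  } catch (err) {
    // Capture or publish failed: release devices and fall back to viewer.
    localStream.close();
    await client.switchRole('audience');
    throw err;
  }
}

// Sketch: stop publishing and return to the audience role.
async function leaveMic(client, localStream) {
  await client.unpublish(localStream);
  localStream.close();                 // release camera and microphone
  await client.switchRole('audience');
}
```

Falling back to 'audience' on failure keeps a viewer who denied camera access from staying in the anchor role without a stream.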

3. The ultimate money-saving trick: enable CDN relay (bypass) live streaming

If millions of people watch your broadcast at once and all of them pull TRTC's real-time streams over the premium backbone network, the downstream traffic bill will be painfully high.

The standard industry solution: turn on the "bypass live" option in the Tencent Cloud console.

How it works: the anchor pushes a stream in the TRTC room, and the Tencent Cloud back end automatically copies this high-fidelity real-time stream, transcodes it in the cloud into a standard live stream (RTMP/HLS/WebRTC), and distributes it to the millions of ordinary viewers through the CDN.

This way, the co-hosting anchors enjoy latency as low as 300 ms, while ordinary viewers see an extra 1-2 seconds of normal CDN delay, and you save up to 70% of the bandwidth budget.
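To get a feel for the volume of downstream traffic involved, a back-of-the-envelope estimate helps. The function name `downstreamTrafficGB` and the viewer count, bitrate, and duration in the example are illustrative assumptions, not figures from TRTC; actual billing depends on your plan.

```javascript
// Rough downstream traffic for one live session, in gigabytes (1 GB = 1e9 bytes).
function downstreamTrafficGB(viewers, bitrateKbps, durationSeconds) {
  const bytesPerSecond = (bitrateKbps * 1000) / 8;       // kilobits/s -> bytes/s
  const bytesPerViewer = bytesPerSecond * durationSeconds;
  return (viewers * bytesPerViewer) / 1e9;
}

// Illustrative numbers: 100,000 viewers, 1,500 kbps video, one hour.
console.log(downstreamTrafficGB(100000, 1500, 3600)); // 67500
```

Each viewer in this example pulls 0.675 GB per hour, so the total scales linearly with audience size, which is why the delivery path for ordinary viewers dominates the bill.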

Phase V: Hard-won lessons from day-to-day operations

Device permission blocked: For web (H5) audio and video, browser security policy requires the page to run over HTTPS or on localhost before the camera and microphone can be opened. If the page is deployed to a server over plain HTTP, the SDK reports an error as soon as it initializes.
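A simple pre-flight check along these lines can warn users before the SDK fails. The function name `canCaptureMedia` is made up for this sketch, and the host list is a simplification; in a real browser, `window.isSecureContext` is the authoritative signal.

```javascript
// Returns true when a page at this location is expected to be allowed to
// open the camera and microphone. `loc` is any object with `protocol` and
// `hostname`, such as window.location. Simplified rule for illustration.
function canCaptureMedia(loc) {
  const localHosts = ['localhost', '127.0.0.1', '[::1]'];
  return loc.protocol === 'https:' || localHosts.includes(loc.hostname);
}

console.log(canCaptureMedia({ protocol: 'http:', hostname: 'example.com' })); // false
console.log(canCaptureMedia({ protocol: 'http:', hostname: 'localhost' }));   // true
```

Running this check before creating the local stream lets the page show a clear "please use HTTPS" message instead of a raw SDK error.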

Mobile weak-network optimization: In the console, remember to select the "Smooth First (Smooth)" policy. When a user's 4G/5G signal suddenly deteriorates, the system automatically lowers the resolution and frame rate, giving priority to keeping the audio from stuttering and the video from turning into a slideshow. This is a quality-degradation strategy; it is distinct from packet loss concealment (PLC), which masks lost audio packets.

Summary

The practical know-how for Tencent Cloud TRTC is straightforward:

Choose rtc mode for meetings, where all members have equal rights. Choose live mode for live streaming, and distinguish anchors from audience through roles. Large audiences must be paired with bypass CDN.

Once this logic is clear, whether you are building custom video customer service for an enterprise or an interactive live streaming platform with tens of millions of users, you can tackle the hard problem of audio and video development with clean code.