Tencent Cloud: A Guide to Building Same-City Active-Active and Cross-Region Multi-Active (MySQL + Redis) Architectures
In today's Internet environment of high concurrency and heavy traffic, system stability is an enterprise's lifeline. Early in a project, many technical teams pile their servers and databases into a single Tencent Cloud availability zone (data center). In normal times all is calm, but once that data center loses power, a fiber is cut, or a large-scale network failure hits, the entire business is paralyzed instantly.
To achieve "high availability", architects usually reach for one of two big moves: same-city active-active (Multi-AZ) and cross-region multi-active (Multi-Region).
Many people find "multi-active" out of reach when they first hear it, assuming it is low-level black magic that only big tech companies can pull off. In fact, with Tencent Cloud's mature cloud products (such as TDSQL, CDB, TcaplusDB, and the DTS data transmission service), small and medium-sized teams can also build a highly available multi-active architecture within a few days.
Today's tutorial goes straight to the core and skips the empty theory. It breaks down, in plain language, the architecture logic of same-city active-active and cross-region multi-active, and walks you step by step through configuring a MySQL + Redis active-active deployment.
Core review: what's the difference between same-city active-active and cross-region multi-active?
Before getting to work, we must be clear about the boundary between the two plans. Choose the wrong one and you will burn money without solving the problem.
1. Same-city active-active (multiple availability zones in one city)
How to build it: deploy applications and databases in two different data centers in the same city (for example, Shanghai Availability Zone 2 and Shanghai Availability Zone 3).
Network latency: very low (usually <2ms), because dedicated fiber links connect the two data centers.
Data synchronization: strong consistency. MySQL and Redis can synchronize with almost zero latency.
Risk prevention: a solid defense against physical failures such as a power outage or fire in a single data center.
2. Cross-region multi-active (multi-active across regions)
How to build it: build a complete system in each of two cities that are far apart (such as Beijing and Guangzhou).
Network latency: higher (even at the speed of light, transmission between Beijing and Guangzhou carries a physical delay of about 30ms).
Data synchronization: eventual consistency. Because of this delay, real-time strong synchronization is not possible; only asynchronous replication is.
Risk prevention: a defense against city-level disasters (e.g. earthquakes, backbone network outages).
Plain-language advice: for 95% of enterprises, same-city active-active is enough to handle the vast majority of outages. Only when your DAU exceeds 10 million, or when you face extreme compliance requirements, do you need to consider cross-region multi-active.
Stage one: building same-city active-active (MySQL + Redis) in practice
The core idea of same-city active-active is: the application layer is active in both zones (traffic can be shifted at will), while the data layer runs one primary and one standby (with automatic switchover in seconds).
1. Infrastructure: plan the VPC and subnets
When you buy servers on Tencent Cloud, be sure to place them in different availability zones.
Create a VPC in the Shanghai region.
Create two subnets: Subnet A belongs to Shanghai Zone 2, and Subnet B belongs to Shanghai Zone 3.
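The subnet plan above can be sanity-checked before you touch the console. This is a minimal sketch using Python's standard `ipaddress` module; the `10.0.0.0/16` VPC range and the `/24` subnet size are assumptions made for illustration, not values from Tencent Cloud.

```python
import ipaddress

# Assumed address plan for illustration only; pick CIDRs that fit your network.
vpc = ipaddress.ip_network("10.0.0.0/16")

# Carve two /24 subnets out of the VPC, one per availability zone.
subnet_a, subnet_b = list(vpc.subnets(new_prefix=24))[:2]

plan = {
    "Subnet A (Shanghai Zone 2)": subnet_a,
    "Subnet B (Shanghai Zone 3)": subnet_b,
}

for name, net in plan.items():
    # Every subnet must sit inside the VPC range.
    assert net.subnet_of(vpc)
    print(f"{name}: {net} ({net.num_addresses} addresses)")

# Subnets in the same VPC must not overlap.
assert not subnet_a.overlaps(subnet_b)
```

Writing the plan down like this makes it easy to catch an overlapping or out-of-range CIDR before any resources are purchased.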
2. Application layer (CLB + CVM) configuration
Buy two Lighthouse (lightweight) servers or CVM instances, one in Subnet A and one in Subnet B.
Buy a Tencent Cloud Load Balancer (CLB) and add both servers to its backend server list with a weight of 50:50.
This way, whichever data center goes down, CLB's health checks remove the failed backend and send all traffic to the surviving application in the other data center. How quickly that happens depends on the health-check interval and threshold you configure.
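To make the 50:50 weighting and the failover behavior concrete, here is a small simulation of weighted selection among healthy backends. It is only a sketch of the general idea, not CLB's actual algorithm; the `Backend` class, the `pick_backend` function, and the names `cvm-a` and `cvm-b` are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    zone: str
    weight: int
    healthy: bool = True


def pick_backend(backends, rng=random):
    """Pick one healthy backend at random, in proportion to its weight."""
    alive = [b for b in backends if b.healthy and b.weight > 0]
    if not alive:
        raise RuntimeError("no healthy backend available")
    return rng.choices(alive, weights=[b.weight for b in alive], k=1)[0]


backends = [
    Backend("cvm-a", "Shanghai Zone 2", 50),
    Backend("cvm-b", "Shanghai Zone 3", 50),
]

# Simulate a Zone 2 outage: the health check marks cvm-a as unhealthy.
backends[0].healthy = False

# Every request now lands on the surviving backend in Zone 3.
picks = {pick_backend(backends).name for _ in range(1000)}
print(picks)
```

Note the capacity implication: each server must be able to carry 100% of the traffic on its own, otherwise the failover just moves the outage to the other zone.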
3. Data layer: MySQL (CDB High Availability Edition) configuration
Don't install and run a cluster on your own servers; just buy Tencent Cloud's Cloud Database CDB (High Availability Edition).
Purchase options: on the purchase page, select cross-zone deployment.
Node placement: select Shanghai Zone 2 for the primary node and Shanghai Zone 3 for the standby node.
Underlying principle: under the hood, Tencent Cloud uses a strong synchronization mechanism (Semi-Sync) to keep the primary and standby data consistent. If Zone 2 loses power, CDB automatically promotes the standby node in Zone 3 to primary within 30 seconds, and the application's database connection address (VIP) stays the same, so no manual code change is needed.
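Even though the VIP does not change, connections opened before the switchover will fail while the standby is being promoted, so the application should retry rather than give up on the first error. The following is a minimal sketch of retry with exponential backoff; `call_with_retry` and `FlakyDatabase` are hypothetical names, and the retry budget (about 31 seconds of total waiting with the defaults) is an assumption chosen to cover the 30-second window mentioned above.

```python
import time


def call_with_retry(operation, attempts=7, base_delay=1.0, max_delay=8.0,
                    sleep=time.sleep):
    """Run operation(); on ConnectionError, wait and retry with backoff."""
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise  # retry budget exhausted, surface the error
            sleep(delay)
            delay = min(delay * 2, max_delay)


class FlakyDatabase:
    """Stand-in for a database that is unreachable during a failover."""

    def __init__(self, failures):
        self.failures = failures

    def query(self):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("primary unavailable, failover in progress")
        return "ok"


# Record the waits instead of really sleeping, so the demo runs instantly.
waits = []
db = FlakyDatabase(failures=3)
result = call_with_retry(db.query, sleep=waits.append)
print(result, waits)
```

In a real service the `operation` would open a fresh connection to the VIP on each attempt, since a pooled connection to the old primary stays broken after the switch.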
4. Cache layer: Redis (same-city active-active)
Redis is usually used as a cache. For same-city active-active, Tencent Cloud's Redis Premium (cross-zone high availability) is recommended. Its primary/standby architecture is similar to MySQL's: Zone 2 handles reads and writes, and Zone 3 is kept in sync by asynchronous replication. Because same-city latency is extremely low, the probability of cache penetration caused by replication lag is minimal.
Stage two: building cross-region multi-active (MySQL + Redis) in practice
The difficulty of cross-region multi-active rises exponentially. Because the delay between Beijing and Guangzhou is about 30ms, if the application in Beijing reads the database in Guangzhou across regions, the network overhead will make your system as choppy as a slideshow.
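A quick back-of-the-envelope calculation shows why. The sketch below assumes a request that issues 20 sequential database queries (an illustrative number, not a measurement) and treats the latencies quoted in this article, about 2ms within a city and about 30ms between Beijing and Guangzhou, as the network cost of each query.

```python
def db_network_time_ms(queries: int, latency_ms: float) -> float:
    """Network time spent on sequential queries that each pay the full latency."""
    return queries * latency_ms


QUERIES_PER_REQUEST = 20   # assumption for illustration
SAME_CITY_MS = 2.0         # upper bound quoted for same-city zones
CROSS_REGION_MS = 30.0     # Beijing <-> Guangzhou figure quoted above

same_city = db_network_time_ms(QUERIES_PER_REQUEST, SAME_CITY_MS)
cross_region = db_network_time_ms(QUERIES_PER_REQUEST, CROSS_REGION_MS)

print(f"same city:    {same_city:.0f} ms per request")
print(f"cross region: {cross_region:.0f} ms per request")
```

Under these assumptions the same request goes from about 40ms of network wait to about 600ms, before the database has done any actual work.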
The iron law of cross-region multi-active:
Deploy in units, let each unit read and write only its own data, and merge data asynchronously.
1. Core trick: traffic tagging and unitization (GSLB)
Suppose you partition by user ID: users with odd IDs go to the Beijing data center and users with even IDs go to the Guangzhou data center.
Use Tencent Cloud's Global Load Balancing (GSLB), or control DNS resolution, so that requests from Beijing users resolve directly to the CLB in the Beijing data center.
Cross-region calls are strictly prohibited: applications in Beijing may only read and write the MySQL and Redis instances in Beijing, and applications in Guangzhou may only read and write those in Guangzhou.
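The two rules above, route by ID parity and never cross regions, can be sketched in a few lines. All names here are hypothetical, including the `*.internal.example` hostnames; the point is only that an application instance resolves endpoints for its own region and refuses requests that belong to the other unit.

```python
# Hypothetical per-region endpoints; replace with your own instance addresses.
ENDPOINTS = {
    "beijing": {"mysql": "mysql.bj.internal.example",
                "redis": "redis.bj.internal.example"},
    "guangzhou": {"mysql": "mysql.gz.internal.example",
                  "redis": "redis.gz.internal.example"},
}


def unit_for_user(user_id: int) -> str:
    """Odd user IDs belong to the Beijing unit, even IDs to Guangzhou."""
    return "beijing" if user_id % 2 == 1 else "guangzhou"


def endpoint_for_request(app_region: str, user_id: int, service: str) -> str:
    """Return the local endpoint, rejecting requests routed to the wrong unit."""
    home = unit_for_user(user_id)
    if home != app_region:
        # Fail fast instead of silently making a cross-region call.
        raise ValueError(
            f"user {user_id} belongs to {home}, not {app_region}; "
            "fix the routing layer instead of calling across regions"
        )
    return ENDPOINTS[app_region][service]


print(unit_for_user(1001))                          # beijing
print(endpoint_for_request("guangzhou", 1002, "mysql"))
```

Rejecting a misrouted request loudly is a design choice: it surfaces routing bugs during testing instead of hiding them behind slow cross-region queries.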
2. Data layer: cross-region MySQL active-active (using DTS two-way synchronization)
Beijing has one MySQL deployment and Guangzhou has another, and both are accepting writes at the same time. How do you keep them in sync?
Go to the Tencent Cloud console and search for DTS.
Create a two-way data synchronization task.
Select the CDB instance in Beijing as the source database and the CDB instance in Guangzhou as the target database.
Configure the conflict resolution strategy. Pay attention here! You must select either "Stagger Primary Key" or "Timestamp First". For example, make the IDs generated in Beijing odd and the IDs generated in Guangzhou even, to prevent primary key conflicts when both sides insert rows at the same time.
🚨 The "invisible pit", data loops: if data from Beijing is synchronized to Guangzhou, and Guangzhou mistakes it for newly generated local data and synchronizes it back to Beijing, the system will loop until it dies. Tencent Cloud's DTS has a built-in "anti-loopback" mechanism that tags synchronized data so that each change is copied only once.
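Both ideas, staggered primary keys and anti-loopback tagging, can be illustrated with a toy in-memory simulation of two regions. This is a sketch of the general technique only, not how DTS is implemented; the `Region` class and the `replicate` function are hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Region:
    name: str
    id_offset: int                               # 1 -> odd IDs, 2 -> even IDs
    rows: dict = field(default_factory=dict)     # id -> (value, origin)
    outbox: list = field(default_factory=list)   # changes awaiting replication

    def __post_init__(self):
        # A step of 2 keeps the two regions' ID sequences disjoint.
        self._ids = count(self.id_offset, 2)

    def insert(self, value):
        row_id = next(self._ids)
        self.apply(row_id, value, origin=self.name)
        return row_id

    def apply(self, row_id, value, origin):
        if row_id in self.rows:
            raise KeyError(f"primary key conflict on id {row_id}")
        self.rows[row_id] = (value, origin)
        # Anti-loopback: only changes that originated here are queued for
        # replication; rows that arrived from the peer are never sent back.
        if origin == self.name:
            self.outbox.append((row_id, value, origin))


def replicate(source, target):
    """Ship every queued change from source to target; return how many."""
    shipped = len(source.outbox)
    while source.outbox:
        row_id, value, origin = source.outbox.pop(0)
        target.apply(row_id, value, origin)
    return shipped


beijing = Region("beijing", id_offset=1)
guangzhou = Region("guangzhou", id_offset=2)
beijing.insert("order-A")     # gets id 1
guangzhou.insert("order-B")   # gets id 2
beijing.insert("order-C")     # gets id 3

first_pass = replicate(beijing, guangzhou) + replicate(guangzhou, beijing)
second_pass = replicate(beijing, guangzhou) + replicate(guangzhou, beijing)
print(first_pass, second_pass, sorted(beijing.rows), sorted(guangzhou.rows))
```

The first pass ships three changes and the second ships none, which is exactly the "copied only once" property. In MySQL itself, odd/even staggering is commonly achieved with the `auto_increment_increment` and `auto_increment_offset` system variables; whether to set them yourself or rely on the DTS option depends on your setup.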
3. Cache layer: cross-region Redis multi-active
Data in Redis usually has an expiration time and changes very quickly. In a cross-region multi-active scenario, ordinary Redis replication simply cannot keep up.
The official tool: Tencent Cloud provides a Redis global multi-active (Global Replication) product.
Configuration: buy one Redis instance in Beijing and one in Guangzhou, then bind them into a "global multi-active group" in the console.
Business logic: each side reads its local Redis. For cached data that must be shared (such as a user's login token status), after Beijing writes it, Tencent Cloud asynchronously pushes it to Guangzhou over the dedicated line within tens of milliseconds. There is a short time lag, but it fully meets business needs.
Stage three: disaster recovery drills and traffic cut-over (don't fumble at the critical moment)
Once the architecture is in place, how do you verify that it will really save you? You need to run regular drills.
Scenario 1: in same-city active-active, a single data center goes down (for example, Availability Zone 2 fails)
Automatic part: Tencent Cloud CLB automatically removes the unhealthy CVM in Zone 2. MySQL automatically triggers a primary/standby switchover, and the standby database in Zone 3 takes over reads and writes.
Manual confirmation: the operations team logs in to the console, checks the business logs, confirms there is no dirty data, and ends the drill.
Scenario 2: in cross-region multi-active, an entire city goes down (for example, the Beijing data center is lost completely)
Here, routing and traffic cut-over require manual intervention:
Log in to the Tencent Cloud DNS resolution / traffic scheduling console.
Switch the 50% of traffic from odd-ID users, originally routed to Beijing, over to the CLB in the Guangzhou data center.
The application in Guangzhou now carries 100% of user requests. Because DTS has been synchronizing in the background all along, MySQL in Guangzhou already holds about 99.9% of the data written before the Beijing data center went down.
Accepting the cost: during the switch, a very small number of Beijing users who registered in the last few milliseconds may see "login failed" or stale data. This is entirely acceptable under the "eventual consistency" model of cross-region multi-active.
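The cut-over in Scenario 2 can be rehearsed on paper before a real drill. The sketch below models the routing decision with a set of healthy units and measures how the traffic share shifts; `route`, `traffic_share`, and the sample of 1,000 user IDs are hypothetical and stand in for whatever your DNS or GSLB layer actually does.

```python
from collections import Counter

HOME_UNIT = {1: "beijing", 0: "guangzhou"}   # keyed by user_id % 2


def route(user_id: int, healthy_units: set) -> str:
    """Send a user to their home unit, or to a survivor if it is down."""
    home = HOME_UNIT[user_id % 2]
    if home in healthy_units:
        return home
    survivors = sorted(healthy_units)
    if not survivors:
        raise RuntimeError("no healthy unit left")
    return survivors[0]


def traffic_share(user_ids, healthy_units: set) -> dict:
    """Fraction of users served by each unit."""
    counts = Counter(route(uid, healthy_units) for uid in user_ids)
    return {unit: n / len(user_ids) for unit, n in counts.items()}


users = list(range(1, 1001))   # 500 odd IDs and 500 even IDs

normal = traffic_share(users, {"beijing", "guangzhou"})
after_cutover = traffic_share(users, {"guangzhou"})

print("normal:       ", normal)
print("after cutover:", after_cutover)
```

The output makes the capacity requirement explicit: Guangzhou goes from half of the traffic to all of it, so each unit has to be sized for the full load if the drill is going to pass.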
Summary: four rules of high availability
A highly available architecture is not a silver bullet but an art of compromising with the laws of physics (the speed of light and network latency). To sum up in four sentences:
Same-city active-active is the most cost-effective: buy cross-zone services to replace your single-zone ones. Without changing a line of code, 95% of your disaster protection is done.
Cross-region multi-active requires splitting first: the business must be unitized. Beijing users read the Beijing database; cross-city reads and writes are a disaster.
Primary key conflicts must be avoided: use DTS for two-way writes across regions, and stagger primary keys by odd and even to keep the data from getting scrambled.
Regular drills are the only way to have peace of mind: don't wait until something really goes wrong to start digging through the documentation. Train your troops in peacetime and show your strength in wartime.
