AWS | Azure | GCP and International Alibaba Cloud/Tencent Cloud/Huawei Cloud: A Step-by-Step Guide to Security Groups and Network Configuration
In the cloud computing world, two sayings have been soaked in the tears of countless network engineers:
"If the network won't connect, chances are a route or a security group isn't open."
"If the network got hacked, chances are a security group was opened to 0.0.0.0/0."
If you work in IT for overseas business, cross-border architecture, or multi-cloud deployment, you deal with the network configuration of the major cloud vendors every day. Many people think it is just "clicking the mouse to open a port." Once you move to a multi-cloud architecture, though, you find that each vendor's control logic, terminology, and even pitfalls are completely different: AWS's stateful security groups, Azure's NSG priorities, GCP's global VPC... Misunderstand a single detail and you will either be troubleshooting a dead network late into the night or leave your core database fully exposed on the public internet.
This long, hands-on article skips the slide-deck theory and walks through the network topology and firewall (security group) configuration of six mainstream overseas cloud vendors (AWS, Azure, GCP, OCI, Alibaba Cloud International, Tencent Cloud International), step by step. Bookmark it: it is meant to be your network troubleshooting manual when it counts.
1. Aligning core concepts: don't let vendor jargon confuse you
Before touching the keyboard, we first need to put the six vendors' jargon on the same footing. The names vary, but the underlying logic comes down to three things: the private network (the plot of land), the router/route table (the signposts), and the firewall (the doorman).
Cloud provider | Private network (VPC/VNet) | Subnet | Stateful firewall (security group, instance level) | Subnet/network-level firewall
AWS | VPC (regional) | Subnet (Availability Zone level) | Security Group (stateful) | Network ACL (stateless)
Azure | VNet (regional) | Subnet | Network Security Group (NSG) | NSG again, bound to the subnet
Google Cloud Platform | VPC (global) | Subnet (regional) | VPC Firewall Rules (stateful) | Controlled through network tags/service accounts
Oracle (OCI) | VCN (regional) | Subnet (regional or Availability Domain level) | Security List / NSG | Security List (subnet level)
Alibaba Cloud International | VPC (regional) | vSwitch (Availability Zone level) | Security Group | Network ACL
Tencent Cloud International | VPC (regional) | Subnet (Availability Zone level) | Security Group | Network ACL
Core underlying logic: stateful vs stateless
This is where newcomers are most likely to stumble:
Stateful (security groups/firewall rules): if traffic was allowed in, the reply is allowed out automatically. For example, if you allow inbound access to port 80, the server's reply packets do not need an extra outbound rule for the client's random high port.
Stateless (network ACLs/some specific routes): inbound and outbound are evaluated separately. If you open inbound port 80 but the outbound rules do not allow the ephemeral ports (1024-65535), the reply traffic still dies on the way back.
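To make the difference concrete, here is a minimal Python sketch of the two models. The class names StatefulFirewall and StatelessAcl, the Packet fields, and the sample addresses are illustrative assumptions, not any vendor's API; real firewalls track far more connection state than a set of tuples.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Packet:
    src: str
    src_port: int
    dst: str
    dst_port: int


@dataclass
class StatefulFirewall:
    """Security-group style: replies to allowed flows pass automatically."""
    inbound_ports: set
    outbound_ports: set = field(default_factory=set)
    _tracked: set = field(default_factory=set)

    def allow_inbound(self, p: Packet) -> bool:
        if p.dst_port in self.inbound_ports:
            # Remember the flow so its reply can leave later.
            self._tracked.add((p.src, p.src_port, p.dst, p.dst_port))
            return True
        return False

    def allow_outbound(self, p: Packet) -> bool:
        # A reply is the tracked flow with source and destination swapped.
        if (p.dst, p.dst_port, p.src, p.src_port) in self._tracked:
            return True
        return p.dst_port in self.outbound_ports


@dataclass
class StatelessAcl:
    """Network-ACL style: each direction is checked on its own."""
    inbound_ports: set
    outbound_ranges: list  # list of (low, high) destination-port ranges

    def allow_inbound(self, p: Packet) -> bool:
        return p.dst_port in self.inbound_ports

    def allow_outbound(self, p: Packet) -> bool:
        return any(lo <= p.dst_port <= hi for lo, hi in self.outbound_ranges)


request = Packet("203.0.113.10", 50000, "10.0.1.5", 80)
reply = Packet("10.0.1.5", 80, "203.0.113.10", 50000)

sg = StatefulFirewall(inbound_ports={80})
print(sg.allow_inbound(request), sg.allow_outbound(reply))

acl_closed = StatelessAcl(inbound_ports={80}, outbound_ranges=[])
acl_open = StatelessAcl(inbound_ports={80}, outbound_ranges=[(1024, 65535)])
print(acl_closed.allow_outbound(reply), acl_open.allow_outbound(reply))
```

The stateful object lets the reply out with no outbound rule at all, while the stateless one only does so once the ephemeral port range is opened explicitly.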
2. The six major overseas vendors in detail: hands-on configuration and pitfalls
1. AWS (Amazon Web Services): the market leader's "two-tier defense"
AWS's network design is a classic that many other cloud vendors imitate. Its core idea is:
A regional VPC is divided into subnets in different Availability Zones. A stateless network ACL (NACL) guards the subnet door, and a stateful security group guards the instance door.
🛠️ Recommended configuration path:
Create a custom VPC: don't use the default VPC; plan your own address range (for example, 10.0.0.0/16).
Split subnets: at a minimum, create a public subnet (with a route to an Internet Gateway) and a private subnet (no direct public route; outbound only through a NAT Gateway).
Security group configuration: for inbound, open precisely what is needed. For example, a web server opens only 80 and 443, with the source set to 0.0.0.0/0 or to the security group ID of the ALB (load balancer). For outbound, the default allows everything to 0.0.0.0/0; under strict compliance requirements, restrict it to specific private CIDR blocks.
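As a planning aid for the VPC and subnet steps above, the following sketch carves a /16 into /24 subnets with Python's standard ipaddress module. The helper name plan_subnets and the subnet names are hypothetical; the /24 size is an assumption, not an AWS requirement.

```python
import ipaddress


def plan_subnets(vpc_cidr, names, new_prefix=24):
    """Assign consecutive blocks of the VPC range to the given subnet names."""
    vpc = ipaddress.ip_network(vpc_cidr)
    # zip stops at the shorter input, so extra blocks are simply left unused.
    return dict(zip(names, vpc.subnets(new_prefix=new_prefix)))


plan = plan_subnets(
    "10.0.0.0/16",
    ["public-a", "public-b", "private-a", "private-b"],
)
for name, block in plan.items():
    print(name, block)
```

Writing the plan down this way before creating anything in the console makes it easy to review the layout and to keep the ranges of different environments apart.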
⚠ AWS pitfalls learned the hard way:
Security group references: don't hard-code IP addresses! The most powerful feature of AWS security groups is that the source can be another security group's ID. That lets you declare "back-end security group B only accepts traffic from front-end security group A." No matter how the front-end EC2 instances scale or how their IPs change, the rule stays valid.
NACL default rules: a custom NACL denies all traffic until you add rules (the default NACL that comes with the VPC allows everything). If you create a custom NACL and forget to add rules, the entire subnet instantly becomes an isolated island.
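A security group reference can be sketched as the permission structure that boto3's EC2 client accepts. The helper sg_reference_rule, both security group IDs, and the choice of port 3306 are hypothetical values for illustration. The block only builds the payload; the actual API call is left commented out because it needs boto3 and AWS credentials.

```python
def sg_reference_rule(port, source_group_id, protocol="tcp"):
    """Build an ingress permission whose source is another security group."""
    return {
        "IpProtocol": protocol,
        "FromPort": port,
        "ToPort": port,
        # Referencing a group ID instead of a CIDR keeps the rule valid
        # while the front-end instances scale and change IPs.
        "UserIdGroupPairs": [{"GroupId": source_group_id}],
    }


FRONTEND_SG = "sg-0123456789abcdef0"  # hypothetical front-end group A
BACKEND_SG = "sg-0fedcba9876543210"   # hypothetical back-end group B

rule = sg_reference_rule(3306, FRONTEND_SG)
print(rule)

# With boto3 installed and credentials configured, the rule could be applied as:
#   import boto3
#   ec2 = boto3.client("ec2")
#   ec2.authorize_security_group_ingress(GroupId=BACKEND_SG, IpPermissions=[rule])
```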
2. Microsoft Azure: priority-driven security groups
Microsoft Azure calls its private network a VNet. Azure's security control logic differs from AWS in two ways: its NSG (Network Security Group) can be bound to a network interface (NIC) or to a subnet, and every rule carries a priority (100-4096).
🛠️ Recommended configuration path:
Create the VNet and subnets: an Azure subnet is not bound to an Availability Zone; it spans the region, which makes high-availability deployment more flexible.
Configure NSG rules: the smaller the priority number, the higher the priority. Inbound: allow administrators to reach 22/3389 from specific IPs at priority 100; allow HTTP/HTTPS at priority 200; leave the rest to the default rules.
⚠ Azure pitfalls learned the hard way:
Three hidden default rules: every Azure NSG has three default inbound rules at the bottom (you cannot delete them, only override them with higher-priority rules). AllowVNetInBound: all subnets inside the VNet can talk to each other by default! AllowAzureLoadBalancerInBound: allows Azure health probes. DenyAllInBound: blocks everything else. Many beginners assume that putting workloads in different subnets isolates them. Then the development environment reaches straight into the production database, because nobody wrote an explicit higher-priority Deny rule to block cross-subnet traffic.
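The priority logic and the cross-subnet trap can be reproduced in a small simulation. This is a simplified model, not Azure's implementation: real default rules match on the VirtualNetwork and AzureLoadBalancer service tags, which are approximated here with an assumed VNet range (10.0.0.0/16) and the platform probe address. The NsgRule class, the evaluate function, and the subnet layout are illustrative assumptions.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NsgRule:
    priority: int
    name: str
    source: str               # source CIDR
    dest_port: Optional[int]  # None means any port
    action: str               # "Allow" or "Deny"


VNET = "10.0.0.0/16"  # assumed VNet address space

DEFAULT_RULES = [
    NsgRule(65000, "AllowVNetInBound", VNET, None, "Allow"),
    NsgRule(65001, "AllowAzureLoadBalancerInBound", "168.63.129.16/32", None, "Allow"),
    NsgRule(65500, "DenyAllInBound", "0.0.0.0/0", None, "Deny"),
]


def evaluate(rules, src_ip, dest_port):
    """Return the first matching rule, lowest priority number first."""
    src = ipaddress.ip_address(src_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        port_ok = rule.dest_port is None or rule.dest_port == dest_port
        if port_ok and src in ipaddress.ip_network(rule.source):
            return rule
    return None


dev_vm = "10.0.1.4"  # a VM in an assumed development subnet 10.0.1.0/24

# With only the defaults, the dev VM reaches the production database port.
print(evaluate(DEFAULT_RULES, dev_vm, 1433).name)

# An explicit higher-priority Deny closes the gap.
block_dev = NsgRule(300, "DenyDevSubnet", "10.0.1.0/24", None, "Deny")
print(evaluate(DEFAULT_RULES + [block_dev], dev_vm, 1433).name)
```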
3. GCP (Google Cloud Platform): the convention-breaking "global network" and "tag master"
Google Cloud (GCP) has the most unusual network architecture of the six. Its VPC is global and does not belong to a specific region. As long as they are in the same VPC, subnets around the world are routable to each other over Google's private backbone by default. More importantly, GCP has no traditional security group bound to a network interface or subnet. Instead, it uses firewall rules combined with network tags or service accounts.
🛠️ Recommended configuration path:
Define network tags: for example, tag your virtual machines http-server or backend-db.
Write a firewall rule: create an ingress allow rule for destination ports 80 and 443. Targets: select "Specified target tags" and enter http-server. Source: enter 0.0.0.0/0.
⚠ GCP pitfalls learned the hard way:
Mistyped tags: a tag is a plain text string. If you tag the virtual machine web-server but type webserver in the firewall rule (a dropped hyphen), GCP gives no warning that the rule matches no instance, and your service stays unreachable from the public network until someone spots the typo.
Default deny vs default allow: the default network ships with pre-created rules such as default-allow-internal. If you use the default network, machines in every region can ping each other out of the box. Remember to trim these rules in production.
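Because a tag mismatch fails silently, a small lint step can catch it before deployment. The sketch below works on plain dictionaries rather than the GCP API; the function name unmatched_target_tags and the sample instance and rule names are hypothetical.

```python
def unmatched_target_tags(firewall_rules, instances):
    """Return {rule_name: [tags]} for target tags that no instance carries."""
    used = set()
    for tags in instances.values():
        used.update(tags)

    report = {}
    for rule_name, target_tags in firewall_rules.items():
        missing = sorted(set(target_tags) - used)
        if missing:
            report[rule_name] = missing
    return report


instances = {
    "vm-web-1": {"web-server"},
    "vm-db-1": {"backend-db"},
}
firewall_rules = {
    "allow-http": ["webserver"],           # typo: hyphen dropped
    "allow-db-internal": ["backend-db"],
}

print(unmatched_target_tags(firewall_rules, instances))
```

In practice the two dictionaries would be filled from an inventory export or from IaC state; that wiring is left out here.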
4. OCI (Oracle Cloud Infrastructure): "default deny" taken to the extreme
Oracle Cloud (OCI) has grown quickly in recent years thanks to its generous free tier and strong hardware. OCI calls its network a VCN (Virtual Cloud Network). Its security model combines traits of AWS and Azure, offering both Security Lists (subnet level) and Network Security Groups (NSGs).
🛠️ Recommended configuration path:
Create a VCN: use the VCN Wizard, which automatically sets up the Internet Gateway, NAT Gateway, and route tables, sparing you the pain of wiring them together by hand.
Choose NSGs over Security Lists: Oracle currently recommends NSGs, because a Security List applies to the entire subnet and is too coarse-grained.
Add inbound rules: the OCI rule editor is straightforward; choose stateless or stateful, then specify the TCP protocol and port.
⚠ OCI pitfalls learned the hard way:
Locked down by default: OCI's out-of-the-box posture is the strictest of the major cloud vendors. Even if you assign a public IP when creating the virtual machine and open the NSG as wide as possible in the console (allow all from 0.0.0.0/0), you will most likely still be unable to reach the services on the machine.
OS-level firewall (Ubuntu/Oracle Linux): this is OCI's most notorious trap! The official images ship with an operating-system firewall (iptables or ufw) that blocks most inbound traffic by default. You must SSH in (if you can) and open the port there, or flush or adjust the iptables rules in the initialization script (cloud-init), before outside traffic can actually get in.
5. Alibaba Cloud International: built for heavy traffic and high concurrency
Alibaba Cloud International's network design (VPC) and security groups are logically very close to AWS, with the management experience tuned to the habits of Asian developers. It offers two kinds of security group: basic (common) security groups and advanced (enterprise) security groups.
🛠️ Recommended configuration path:
Create a VPC and a vSwitch: note that Alibaba Cloud calls its subnet a vSwitch.
Select a security group type. Basic security group: instances in the group can reach each other by default. Advanced security group: instances in the group are isolated from each other by default and a single group can hold more instances, which suits production environments such as finance and e-commerce.
Rule authorization: a rule can authorize either a CIDR block or another security group.
⚠ Alibaba Cloud International pitfalls learned the hard way:
ICMP (ping) blocked by default: after creating an ECS instance, beginners habitually ping its public IP from their own machine and, when it fails, conclude the network is dead. In fact, the default security group rules on Alibaba Cloud International often do not allow ICMP. Adding a rule with protocol type ICMP to the security group solves it.
Outbound blocking: Alibaba Cloud blocks certain ports (such as port 25, commonly used for mail) at the platform level. If you want to run your own mail server on the cloud, you must submit a ticket to request unblocking; opening the port in a security group has no effect.
6. Tencent Cloud International: minimalist and fine-grained management
Tencent Cloud International's network architecture is also built on VPC. Its security group design puts strong emphasis on "readability" and "templating": you can create a few standard security group templates (such as a "general web server" template and a "database, intranet only" template) and bind them to different instances with one click.
🛠️ Recommended configuration path:
Create a VPC and subnet.
Associate security groups with instances: Tencent Cloud supports binding multiple security groups to one network interface.
Rule evaluation order: top to bottom; once a rule matches, evaluation stops (similar to Azure and traditional firewalls, and unlike AWS, where the union of all allow rules applies).
⚠ Tencent Cloud International pitfalls learned the hard way:
The "shadowing effect" of stacked security groups: a Tencent Cloud CVM instance can have multiple security groups, rules are matched in order, and a single group can contain both allow and deny rules. If you accidentally move a security group containing "Deny All" to the top, every carefully configured allow rule below it is instantly void.
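The contrast between ordered matching and the AWS-style union of allow rules can be shown in a few lines. This is a toy model under stated assumptions: rules are (action, port) pairs, "ALL" matches any port, unmatched traffic is dropped, and the group names web_template and lockdown are made up for the example.

```python
def first_match(rules, port):
    """Top-down evaluation: the first rule that matches the port decides."""
    for action, rule_port in rules:
        if rule_port in ("ALL", port):
            return action
    return "deny"  # assumption: traffic with no matching rule is dropped


def union_of_allows(rules, port):
    """Allow-only model: traffic passes if any allow rule matches."""
    allowed = any(
        rule_port in ("ALL", port)
        for action, rule_port in rules
        if action == "allow"
    )
    return "allow" if allowed else "deny"


web_template = [("allow", 80), ("allow", 443)]
lockdown = [("deny", "ALL")]

# Ordered matching: the binding order of the groups changes the outcome.
print(first_match(web_template + lockdown, 443))
print(first_match(lockdown + web_template, 443))

# Union model: order is irrelevant because only allow rules are considered.
print(union_of_allows(lockdown + web_template, 443))
```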
3. Advanced architecture pitfalls: common problems and fixes in multi-cloud network configuration
Once you become an architect and have to orchestrate several of these vendors in a single project, the real challenge begins.
1. The "MTU cliff" of cross-cloud connectivity
When you connect an AWS VPC and an Azure VNet over a VPN or a dedicated line, you often hit a strange symptom:
Low-volume traffic such as ping and SSH works fine, but as soon as you transfer a large file or run a large SQL query, the connection times out for no apparent reason.
Cause: the vendors use different underlying network encapsulation, so the MTU (Maximum Transmission Unit) is inconsistent along the path. An AWS instance may default to 9001 (jumbo frames) or 1500, while a typical VPN tunnel only carries about 1420 or 1350 bytes after encapsulation overhead. Full-size packets are dropped when path MTU discovery fails, while small packets get through.
Solution: enable MSS clamping on the virtual machines or routers at both ends, or manually lower the MTU of the instance's network interface to 1420.
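The arithmetic behind MSS clamping is short enough to check by hand. The sketch below assumes IPv4 and TCP headers of 20 bytes each with no options; the function names are illustrative.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def tcp_mss(mtu):
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def fits(payload_bytes, path_mtu):
    """True if a TCP segment with this payload crosses the path unfragmented."""
    return payload_bytes + IPV4_HEADER + TCP_HEADER <= path_mtu


for mtu in (9001, 1500, 1420, 1350):
    print(mtu, tcp_mss(mtu))

# A segment sized for a 1500-byte MTU does not fit through a 1420-byte tunnel,
# while a small interactive packet does.
print(fits(tcp_mss(1500), 1420), fits(64, 1420))
```

Clamping the MSS to the tunnel's value (1380 for a 1420-byte MTU under these assumptions) makes both ends agree on segments that fit the narrowest hop.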
2. Overlapping address ranges across interconnected clouds
When you start planning a network, remember:
Never reuse the same private address range anywhere.
If both AWS and Azure use 10.0.0.0/16, then when the business later needs cross-cloud connectivity (VPC peering/Transit Gateway), your only option is to rebuild. The best practice is to allocate ranges strictly by vendor and line of business:
AWS Production: 10.1.0.0/16
Azure Production: 10.2.0.0/16
GCP Production: 10.3.0.0/16
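An address plan like this can be checked for overlaps automatically with the standard ipaddress module. The function name find_overlaps and the plan keys are hypothetical; the CIDR values come from the text.

```python
import ipaddress
from itertools import combinations


def find_overlaps(plan):
    """Return pairs of plan entries whose CIDR blocks overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in plan.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]


good_plan = {
    "aws-prod": "10.1.0.0/16",
    "azure-prod": "10.2.0.0/16",
    "gcp-prod": "10.3.0.0/16",
}
bad_plan = {
    "aws-prod": "10.0.0.0/16",
    "azure-prod": "10.0.0.0/16",
}

print(find_overlaps(good_plan))
print(find_overlaps(bad_plan))
```

Running a check like this whenever a new range is allocated catches the conflict long before a peering request exposes it.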
4. Summary: the "four iron rules" of cloud networking
Whichever cloud you use, four general rules will carry you toward being a solid network architect:
Principle of least privilege: if a /32 (single IP) will do, never open a /24 (subnet); if a /24 will do, never open 0.0.0.0/0.
Strict inbound, fine-grained outbound: restrict the outbound permissions of core intranet databases and middleware. Even if an attacker gets a webshell on the server through a web vulnerability, outbound blocking keeps them from downloading a trojan, let alone exfiltrating data (the C2 connection fails).
Use Infrastructure as Code (IaC): once you have more than 10 security group rules, human eyes and memory are no longer reliable. Manage the network with Terraform or Pulumi so that every security group change is tracked and reviewed.
Don't forget the OS layer: when the network is down, always troubleshoot in this order: local service listening status -> operating-system firewall (iptables/ufw/Windows Firewall) -> cloud vendor security group -> cloud vendor subnet ACL/route table.
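The least-privilege rule lends itself to an automated check that can run alongside IaC reviews. The sketch below is a minimal example under assumptions: rules are (name, source CIDR, port) tuples, only IPv4 is handled, and the set of sensitive ports and the /24 threshold are illustrative choices rather than a standard.

```python
import ipaddress

SENSITIVE_PORTS = {22, 3389, 3306, 5432}  # assumed list of admin/database ports


def audit_rules(rules):
    """Flag sensitive ports that are open to overly broad source ranges."""
    findings = []
    for name, cidr, port in rules:
        if port not in SENSITIVE_PORTS:
            continue
        prefix = ipaddress.ip_network(cidr).prefixlen
        if prefix == 0:
            findings.append((name, "CRITICAL: sensitive port open to 0.0.0.0/0"))
        elif prefix < 24:
            findings.append((name, "WARN: source range wider than /24"))
    return findings


rules = [
    ("web-https", "0.0.0.0/0", 443),
    ("ssh-anywhere", "0.0.0.0/0", 22),
    ("db-from-vpc", "10.1.0.0/16", 5432),
    ("ssh-from-bastion", "10.1.0.10/32", 22),
]

for name, message in audit_rules(rules):
    print(name, message)
```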
Master the network configuration fundamentals of these six vendors and you have your ticket to evolving a multi-cloud architecture. Faced with complex cross-border business, you can calmly type on your keyboard: Connection Allowed.

