Common Problems with Alibaba Cloud ECS Instances and How to Solve Them
Introduction: Many people who are new to Alibaba Cloud ECS (Elastic Compute Service) assume that once the server is bought and the website is uploaded, the job is done. After actually going live, they find the site will not open, the CPU spikes, SSH will not connect, or the bandwidth is exhausted. This article skips the officialese and, from a hands-on operations perspective, walks through the 10 most common ECS problems and their solutions.
Suitable for:
Novice webmasters: new to cloud servers and unsure about Linux operations and maintenance.
Application developers: deploying applications such as WordPress, Java, Python, and Docker.
Cross-border e-commerce/foreign trade: maintaining standalone stores and cross-border business sites.
Linux operations beginners: people who bought an ECS instance but do not know how to maintain it day to day.
## 1. Why can't the public IP be reached? (This is the cause 90% of the time)
This is the question beginners ask most often: the browser times out, the SSH connection fails, and the Pagoda (BT) Panel will not load.
Root cause: Alibaba Cloud enables a security group firewall by default, and it blocks the vast majority of inbound access.
Solution
Go to the ECS console → Security Groups → Inbound Rules.
Make sure the following common ports are allowed:

| Service name | Port | Recommendation |
| --- | --- | --- |
| SSH | 22 | Required for basic connections |
| HTTP | 80 | Website service |
| HTTPS | 443 | Encrypted website service |
| Pagoda Panel | 8888 | Default port (changing it later is recommended) |
| MySQL | 3306 | Never open to the public network |
| Redis | 6379 | Intranet access only (recommended) |

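To check the last two rows on the server itself, a small filter over `ss -tln` output can flag database ports that listen on more than loopback. This is only a sketch: the function name `exposed_db_ports` is made up for illustration, and it assumes the usual `ss` column layout where the local address is the fourth field.

```bash
# Reads `ss -tln` style output on stdin and reports MySQL (3306) or
# Redis (6379) listeners bound to anything other than loopback.
# exposed_db_ports is a hypothetical helper name, not a standard tool.
exposed_db_ports() {
  awk '$1 == "LISTEN" {
    n = split($4, parts, ":")      # the port is the last colon-separated piece
    port = parts[n]
    addr = substr($4, 1, length($4) - length(port) - 1)
    if ((port == "3306" || port == "6379") && addr != "127.0.0.1" && addr != "[::1]")
      print port " is listening on " addr
  }'
}

# Typical use on the server:
#   ss -tln | exposed_db_ports
```

No output means neither port is bound to a non-loopback address; the security group rules still need to be checked separately.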
Hidden pitfall: the system's internal firewall
If the security group allows the port and the site still will not open, the Linux system's own firewall (firewalld or UFW) is usually what is blocking it.
```bash
# CentOS: stop and disable firewalld
systemctl stop firewalld
systemctl disable firewalld
# Ubuntu: disable UFW
ufw disable
```
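Turning the firewall off is the quickest way to confirm it is the culprit, but a gentler option is to leave it running and open only the ports you need. The sketch below prints the relevant `firewall-cmd` or `ufw` commands instead of running them, so they can be reviewed first. The helper name `open_port_cmds` is invented for this example.

```bash
# Prints (does not run) the commands that open the given TCP ports.
# Usage: open_port_cmds firewalld|ufw PORT...
# open_port_cmds is a hypothetical helper, not part of either firewall.
open_port_cmds() {
  fw="$1"
  shift
  for port in "$@"; do
    case "$fw" in
      firewalld) echo "firewall-cmd --permanent --add-port=${port}/tcp" ;;
      ufw)       echo "ufw allow ${port}/tcp" ;;
      *)         echo "unknown firewall: $fw" >&2; return 1 ;;
    esac
  done
  # firewalld applies --permanent rules only after a reload
  if [ "$fw" = "firewalld" ]; then
    echo "firewall-cmd --reload"
  fi
}

# Print the commands for a web server on CentOS
open_port_cmds firewalld 80 443
```

Once the output looks right, it can be piped to `sh` as root, for example `open_port_cmds ufw 22 80 443 | sh`.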
## 2. Why is the CPU load often pegged at 100%?
1. The website is under a CC attack
WordPress sites are hit especially hard: attackers flood `/wp-login.php` or `/xmlrpc.php` with requests and exhaust the CPU almost instantly.
Countermeasure: enable Nginx rate limiting or use Alibaba Cloud WAF.
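Before adding rate limits, it helps to confirm that these two endpoints really are being hammered, and by whom. The sketch below counts requests per client IP from an Nginx access log. It assumes the default "combined" log format (client IP in field 1, request path in field 7); the function name `top_login_hitters` and the log path in the usage comment are illustrative.

```bash
# Reads an Nginx access log (default "combined" format) on stdin and prints
# the N client IPs with the most requests to wp-login.php or xmlrpc.php.
# top_login_hitters is a hypothetical helper name.
top_login_hitters() {
  awk '$7 ~ /^\/(wp-login|xmlrpc)\.php/ { print $1 }' |
    sort | uniq -c | sort -rn | head -n "${1:-10}"
}

# Typical use (the log path is an assumption; adjust it to your setup):
#   top_login_hitters 10 < /var/log/nginx/access.log
```

Each output line is a request count followed by an IP address, highest count first.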
2. The instance specification is too low
A 1-core 1 GB or 1-core 2 GB instance is basically unresponsive once it runs MySQL + Docker + Java.
Realistic configuration recommendations:
Personal blog: start at 2 cores, 2 GB
WordPress/foreign trade site: start at 2 cores, 4 GB
Java project: at least 8 GB of memory
## 3. What if the remote connection (SSH) fails?
When you see `Connection timed out`, check in this order:
1. Check the security group: confirm that port 22 is allowed.
2. Use the VNC function: the Alibaba Cloud console provides "Send Remote Command (VNC)", a lifesaver that gets you into the system even when SSH is down.
3. Check the SSH service status:
```bash
# View service status
systemctl status sshd
# Restart the SSH service
systemctl restart sshd
```
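For the first check, the port can also be probed from your own machine. The sketch below assumes bash (for its `/dev/tcp` redirection) and the coreutils `timeout` command are available locally; `port_open` is a made-up helper name and `203.0.113.10` is a placeholder address.

```bash
# Returns 0 if a TCP connection to HOST:PORT succeeds within 5 seconds.
# Relies on bash's /dev/tcp redirection and coreutils `timeout`.
# port_open is a hypothetical helper name.
port_open() {
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# 203.0.113.10 is a placeholder; substitute your instance's public IP:
#   port_open 203.0.113.10 22 && echo "port 22 reachable" || echo "port 22 blocked or closed"
```

A failure here only says the connection could not be made. The security group, the system firewall, and the `sshd` service each still need to be ruled out in turn.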
## 4. The website suddenly returns 502 Bad Gateway?
A 502 usually means Nginx itself is fine but the back-end program has died.
Common cause: the system runs out of memory and the OOM Killer kills the process.
Troubleshooting command:
```bash
# Check the kernel log for OOM Killer activity, including "Killed process" lines
dmesg -T | grep -iE 'oom|killed process'
```
Solution: reduce the program's memory usage or add swap space.
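To see at a glance which programs the OOM Killer has been terminating, the kernel log can be reduced to PID and process name pairs. This sketch assumes the kernel's usual "Killed process PID (name)" wording; `oom_victims` is a hypothetical helper name.

```bash
# Reads kernel log text on stdin and prints "PID NAME" for each process
# the OOM Killer terminated. oom_victims is a hypothetical helper name.
oom_victims() {
  sed -n -E 's/.*[Kk]illed process ([0-9]+) \(([^)]*)\).*/\1 \2/p'
}

# Typical use on the server (dmesg may require root):
#   dmesg -T | oom_victims
```

If the same back-end process shows up repeatedly, that is the one to tune or to give more memory.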
## 5. Disk space inexplicably full?
You have not uploaded much, yet `df -h` shows 100% usage.
Log accumulation: system or application logs under `/var/log` have no log rotation configured.
Docker images: leftovers from frequent builds.
```bash
# Clean up unused Docker data (-a also removes every image not used by a container)
docker system prune -a
```
Find large files: quickly locate the root cause.
```bash
# View the size of each folder under the root directory
du -sh /*
```
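Once the top-level view points at a directory, the same idea can be repeated one level at a time with sorted output. The sketch below is one way to do that; `biggest_dirs` is an invented name, and it assumes a `du` that supports `-d` (GNU and BSD both do).

```bash
# Prints the N largest lines of a one-level du listing for DIR, in KB,
# biggest first. The first line is the total for DIR itself.
# -x keeps du on DIR's filesystem, so other mounts below it are skipped.
# biggest_dirs is a hypothetical helper name.
biggest_dirs() {
  dir="${1:-/}"
  n="${2:-10}"
  du -xk -d 1 "$dir" 2>/dev/null | sort -rn | head -n "$n"
}

# Typical use: drill down step by step
#   biggest_dirs / 10
#   biggest_dirs /var 10
```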
---
## 6. Why do bandwidth costs keep climbing?
Public network bandwidth on ECS is very expensive. If you serve static resources such as images and videos directly from ECS, the bandwidth will be used up quickly.
**Correct architecture:** `User -> CDN (Content Delivery Network) -> ECS`.
**Benefits:** Static resources are cached on CDN nodes. ECS only processes the core logic, which greatly reduces bandwidth costs.
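To estimate how much traffic a CDN could take off the instance, the access log can be split into static and non-static bytes. This is a rough sketch that assumes the default Nginx "combined" log format (request path in field 7, response body bytes in field 10). The extension list and the name `static_traffic_share` are illustrative choices.

```bash
# Reads an Nginx access log (default "combined" format) on stdin and prints
# total response bytes for static files versus everything else.
# static_traffic_share is a hypothetical helper name.
static_traffic_share() {
  awk '{
    path = $7
    sub(/\?.*/, "", path)          # drop any query string
    if (tolower(path) ~ /\.(jpe?g|png|gif|webp|svg|css|js|mp4|woff2?)$/) s += $10
    else o += $10
  }
  END { printf "static=%.0f other=%.0f\n", s, o }'
}

# Typical use (the log path is an assumption; adjust it to your setup):
#   static_traffic_share < /var/log/nginx/access.log
```

A large `static` figure relative to `other` suggests most of the bandwidth bill is cacheable content.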
---
## 7. The database gets slower the longer you use it?
It was blazing fast at first, then laggy six months later. This is usually not a drop in ECS performance but a **missing index** or **slow queries**.
* **Optimization tool:** Install `mysqltuner` for automated diagnosis.
* **View live connections:**
```sql
-- Execute in MySQL
SHOW PROCESSLIST;
```
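On a busy server the process list can be long, so it may help to filter it down to statements that have been running for a while. The sketch below works on the tab-separated output of `mysql -B` and assumes the standard column order (Id, User, Host, db, Command, Time, State, Info). The name `long_queries` is made up, and the usage line assumes credentials are already configured, for example in `~/.my.cnf`.

```bash
# Reads tab-separated SHOW FULL PROCESSLIST output (with header) on stdin
# and prints Id, Time and Info for queries running at least MIN seconds.
# long_queries is a hypothetical helper name.
long_queries() {
  awk -F '\t' -v min="${1:-5}" '
    NR > 1 && $5 == "Query" && $6 + 0 >= min + 0 { print $1 "\t" $6 "\t" $8 }'
}

# Typical use (assumes client credentials are already configured):
#   mysql -B -e 'SHOW FULL PROCESSLIST' | long_queries 10
```

Statements that appear here repeatedly are good candidates for `EXPLAIN` and an index review.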
## 8. Security hardening: don't wait until you are hacked to regret it
As long as you have a public IP, bots are trying to brute-force your password every second.
1. **Change the default SSH port:** Change 22 to a random port between 20000 and 60000.
2. **Disable direct root login:**
```bash
# Edit /etc/ssh/sshd_config
PermitRootLogin no
```
3. **Use SSH keys:** Disable password login entirely.
4. **Install Fail2ban:** Automatically ban brute-force attackers.
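A quick way to confirm that the root-login and password-login settings actually made it into the config file is a small audit function. The sketch below is deliberately simple: it reads only the given file, ignores `Include` and `Match` blocks, and relies on sshd using the first occurrence of each keyword. The name `audit_sshd_config` is made up for illustration.

```bash
# Checks an sshd_config file for two hardening settings and prints
# OK/WARN per setting. audit_sshd_config is a hypothetical helper name.
audit_sshd_config() {
  cfg="${1:-/etc/ssh/sshd_config}"
  for want in "PermitRootLogin no" "PasswordAuthentication no"; do
    key="${want%% *}"
    expected="${want#* }"
    # sshd uses the first occurrence of a keyword; commented lines do not match
    got=$(awk -v k="$key" 'tolower($1) == tolower(k) { print tolower($2); exit }' "$cfg")
    if [ "$got" = "$expected" ]; then
      echo "OK   $key $expected"
    else
      echo "WARN $key is ${got:-unset}, expected $expected"
    fi
  done
}

# Typical use on the server:
#   audit_sshd_config /etc/ssh/sshd_config
```

Before restarting the service, `sshd -t` can be run to check the file for syntax errors. Keeping an existing session open while testing a new login avoids locking yourself out.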
---
## 9. The pitfalls of snapshot recovery
Snapshots are not a panacea. If you create a snapshot while the database is under heavy writes, the data may be inconsistent after recovery.
* **Best practice:** Stop the database service before restoring. Snapshot the system disk and the data disk together to prevent a version mismatch.
---
## 10. Summary: from "building" to "operations and maintenance"
The real barrier with ECS is not buying the server but looking after it.
A mature O&M setup should include:
* **Monitoring:** Alibaba Cloud monitoring alerts (CPU, memory, and bandwidth).
* **Security:** Regular patching of system vulnerabilities.
* **Backup:** Automated snapshot backups of important data.
**In the server world, stability is more important than fancy features.**
---
