How Is Amazon Cloud GPU Server Billing Calculated? A Full Breakdown of Amazon EC2 G4dn/G5 Instance Compute Power and Pricing

cloud 2026-06-03 Views 95

In today's era of artificial intelligence, large-model fine-tuning, and graphics rendering, buying your own high-performance GPU is not only expensive, the cards are also often out of stock. As a result, the vast majority of developers, architects, and startup teams look to the cloud, especially the cloud computing heavyweight Amazon Web Services (AWS).

Amazon EC2 GPU Instances

In the AWS GPU family, G4dn and G5 are the perennial best sellers: cost-effective all-rounders. They can run AI inference, fine-tune small models, and handle 3D rendering and cloud gaming.

However, people who are new to AWS are often confused by its maze-like billing rules and its many instance sizes. Choosing the wrong billing model, or forgetting to shut an instance down, often ends with a huge, painful bill at the end of the month.

Today's tutorial goes straight to the practical details, skips the slide-deck concepts, and uses plain language to break down the compute differences, billing details, and money-saving tactics of G4dn and G5 instances until everything is clear.

Stage 1: Hardware and compute power breakdown (what is the difference between G4dn and G5?)

Before doing the math, we first need to know exactly what we are buying. The core difference between G4dn and G5 is the GPU architecture inside them.

1. Amazon EC2 G4dn Instances: The Cost-Effective King of Inference

Core GPU: NVIDIA T4 (based on the Turing architecture).

Video memory capacity: 16 GB of video memory per card.

Sweet spot: Its single-precision floating-point (FP32) performance is modest, but it supports Tensor Cores. It is well suited to inference on already-trained AI models, lightweight object detection, and 3D rendering or video transcoding where image-quality requirements are not extreme.

In plain terms: If your large model is already trained and you now want to deploy it online to serve API requests from users, G4dn is the cheapest choice with the best price-performance ratio.

2. Amazon EC2 G5 Instances: The All-Round Fighter

Core GPU: NVIDIA A10G (based on the Ampere architecture).

Video memory capacity: 24 GB of video memory per card.

Sweet spot: A big leap in compute power over the T4. Graphics rendering performance is up to 3 times higher, and AI training and inference performance is up to 3.3 times higher. It handles high-concurrency inference comfortably, and thanks to the larger 24 GB of video memory and the stronger compute, it can also be used for fine-tuning and lightweight training of small and medium-sized large models.

In plain terms: If you want to run Stable Diffusion XL for high-resolution image generation, fine-tune a Llama language model with a few billion parameters, or do high-precision real-time 3D rendering in the cloud, spending a little more on G5 will make life much easier.

Stage 2: Amazon Cloud's three billing models (which decide how large a bill you receive at the end of the month)

AWS billing is not one-size-fits-all; it offers three completely different options. For the same server, choosing the wrong model can make the price differ by a factor of 3 to 4.

Model 1: On-Demand - Flexible but Most Expensive

How it is billed: true pay-as-you-go, billed per second (with a one-minute minimum). You can shut the instance down whenever you no longer need it.

Suitable scenarios: temporary coding and debugging, or test jobs that run for a few hours.

Hidden pitfall: Never treat an On-Demand instance as a permanent server! If you launch a G5 instance and leave it running untouched for a month, next month's bill may be a nasty shock. In addition, On-Demand instances do not guarantee capacity, so in today's AI boom you may hit the awkward situation during peak business hours where the system reports that the Availability Zone has no GPU capacity left to launch your instance.
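The per-second rule with a one-minute minimum is easy to sanity-check with a few lines of code. The sketch below is only an illustration: the function name is made up for this article, the rate is the approximate g5.xlarge US East On-Demand price quoted later on, and it assumes each start of the instance is billed for at least 60 seconds.

```python
def on_demand_charge(seconds_running: int, hourly_rate: float) -> float:
    """Return the approximate charge in USD for one run of an On-Demand instance.

    Assumption: billing is per second with a 60-second minimum per run.
    """
    if seconds_running <= 0:
        return 0.0
    billed_seconds = max(seconds_running, 60)  # one-minute minimum
    return billed_seconds * hourly_rate / 3600


# Approximate g5.xlarge On-Demand price used in this article (USD per hour).
G5_XLARGE_HOURLY = 1.006

# A 20-second smoke test is still billed as a full minute.
print(f"20 s run : ${on_demand_charge(20, G5_XLARGE_HOURLY):.4f}")
# A 3-hour debugging session.
print(f"3 h run  : ${on_demand_charge(3 * 3600, G5_XLARGE_HOURLY):.3f}")
# Forgetting the instance for a 30-day month.
print(f"30 days  : ${on_demand_charge(30 * 24 * 3600, G5_XLARGE_HOURLY):.2f}")
```

The last line is the one to remember: a short test costs cents, but an instance that is never shut down is billed for every second of the month.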

Model 2: Reserved Instances (RI) / Savings Plans - Most Cost-Effective for Long-Term, Stable Use

How it is billed: You sign a commitment with AWS to use the machine for 1 or 3 years. In return, AWS gives you a direct discount: typically the price drops to about 60% of On-Demand (roughly 40% off) for a 1-year term, and to as low as 30% to 40% of On-Demand (roughly 60% to 70% off) for a 3-year term. You can choose to pay all upfront, partially upfront, or with no upfront payment (billed monthly).

Suitable scenarios: your AI service is already in production, and this server has to stay up 24 hours a day, 365 days a year, no matter what.

In plain terms: as long as your machine runs for more than about half of each month, buying a Savings Plan is usually the wisest move.
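The "more than half a month" rule of thumb can be checked with a small break-even calculation. This is a rough sketch: the function name is invented for illustration, and the rates are this article's approximate g5.xlarge figures (about $1.006 per hour On-Demand and about $0.63 per hour effective on a 1-year commitment). The key assumption is that a commitment is paid for every hour of the month, used or not, while On-Demand is paid only for the hours actually used.

```python
HOURS_PER_MONTH = 720  # 30-day month, as used elsewhere in this article


def break_even_hours(on_demand_rate: float, committed_rate: float,
                     hours_in_month: int = HOURS_PER_MONTH) -> float:
    """Hours of use per month above which the commitment becomes cheaper.

    On-Demand cost  = on_demand_rate * hours_used
    Commitment cost = committed_rate * hours_in_month (paid regardless of use)
    """
    return committed_rate * hours_in_month / on_demand_rate


hours = break_even_hours(on_demand_rate=1.006, committed_rate=0.63)
share = hours / HOURS_PER_MONTH
print(f"break-even: about {hours:.0f} hours/month ({share:.0%} of the month)")
```

The exact threshold depends on the discount: with the 1-year figures above it lands somewhat above half a month, and a deeper discount (for example on a 3-year term) pulls it lower.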

Model 3: Spot Instances - The Bargain Hunter's Favorite

How it is billed: This is the most remarkable part of the AWS billing system. AWS sells off GPU capacity that is currently sitting idle in its data centers at steep discounts, with prices as low as 10% to 30% of On-Demand (equivalent to saving 70% to 90%)!

Fatal drawback: AWS can reclaim the server at any time. When demand for On-Demand instances rises and GPU capacity in the data center gets tight, AWS sends you a notice 2 minutes in advance and then forcibly shuts down your instance and takes it back.

Suitable scenarios: large-scale distributed AI training, and video rendering jobs that do not need to be online in real time. You must build checkpoint-and-resume logic into your code, so that even if the server suddenly dies, the job can continue on another machine.
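The checkpoint-and-resume idea mentioned above can be sketched in a few lines. This is a minimal illustration, not a training framework: the function names, the JSON state layout, and the "work" done in each step are all hypothetical stand-ins. The interruption is simulated with a `stop_after` parameter instead of a real Spot reclaim.

```python
import json
import os
import tempfile
from typing import Optional


def save_checkpoint(path: str, state: dict) -> None:
    """Write the state atomically so a sudden shutdown cannot leave a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)  # atomic rename within the same directory


def load_checkpoint(path: str) -> dict:
    """Return the last saved state, or a fresh state if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_step": 0, "running_total": 0}


def run_job(path: str, total_steps: int, stop_after: Optional[int] = None) -> dict:
    """Process steps, saving a checkpoint after each one.

    stop_after simulates the instance being reclaimed after that many steps.
    """
    state = load_checkpoint(path)
    done_this_run = 0
    while state["next_step"] < total_steps:
        if stop_after is not None and done_this_run >= stop_after:
            break  # simulated Spot interruption
        state["running_total"] += state["next_step"]  # stand-in for real work
        state["next_step"] += 1
        save_checkpoint(path, state)
        done_this_run += 1
    return state


if __name__ == "__main__":
    ckpt = os.path.join(tempfile.mkdtemp(), "job.ckpt.json")
    print(run_job(ckpt, total_steps=10, stop_after=4))  # "reclaimed" after 4 steps
    print(run_job(ckpt, total_steps=10))                # a second run resumes at step 4
```

On a real Spot instance the checkpoint has to live somewhere that outlives the machine, such as a shared file system or object storage; a local temporary directory is used here only to keep the sketch self-contained.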

Stage 3: G4dn and G5 price breakdown table (keep a firm grip on your books)

AWS pricing differs between regions around the world (US regions are usually the cheapest; China, Japan, and Europe are slightly more expensive). Taking the classic US East (N. Virginia) Region as an example, the official list prices are as follows (actual prices may be adjusted over time, but the ratios stay roughly the same):

| Instance name | GPU count and model | Total video memory | vCPUs / RAM | On-Demand price (per hour) | 1-year commitment, effective price (per hour) |
|---|---|---|---|---|---|
| g4dn.xlarge | 1 x NVIDIA T4 | 16 GB | 4 vCPUs / 16 GB | Approx. $0.526 | Approx. $0.35 (30%+ less) |
| g4dn.12xlarge | 4 x NVIDIA T4 | 64 GB | 48 vCPUs / 192 GB | Approx. $3.912 | Approx. $2.55 |
| g5.xlarge | 1 x NVIDIA A10G | 24 GB | 4 vCPUs / 16 GB | Approx. $1.006 | Approx. $0.63 (about 40% less) |
| g5.12xlarge | 4 x NVIDIA A10G | 96 GB | 48 vCPUs / 192 GB | Approx. $5.672 | Approx. $3.57 |

💡 A quick worked example: suppose you buy a basic g5.xlarge for image generation or model fine-tuning. On-Demand for a full month (720 hours): 1.006 × 720 = 724.32 USD (roughly 5,000+ RMB). With a 1-year Savings Plan: about 0.63 × 720 = 453.60 USD a month. That is a saving of about 270 USD (close to 2,000 RMB) every month.
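The same arithmetic can be repeated for every instance in the table. The snippet below is a small sketch that hard-codes the approximate US East prices quoted in this article (they are not fetched from AWS and may be out of date); the dictionary and function names are made up for illustration.

```python
HOURS_PER_MONTH = 720  # 30-day month

# Approximate prices from this article, USD per hour:
# (On-Demand, effective 1-year commitment)
PRICES = {
    "g4dn.xlarge": (0.526, 0.35),
    "g4dn.12xlarge": (3.912, 2.55),
    "g5.xlarge": (1.006, 0.63),
    "g5.12xlarge": (5.672, 3.57),
}


def monthly_costs(instance: str, hours: int = HOURS_PER_MONTH):
    """Return (on_demand_cost, committed_cost, saving_percent) for a month of use."""
    on_demand, committed = PRICES[instance]
    on_demand_cost = on_demand * hours
    committed_cost = committed * hours
    saving_percent = (1 - committed / on_demand) * 100
    return on_demand_cost, committed_cost, saving_percent


for name in PRICES:
    od, committed, pct = monthly_costs(name)
    print(f"{name:14s} on-demand ${od:8.2f}   1-year ${committed:8.2f}   saves {pct:4.1f}%")
```

Running it for g5.xlarge reproduces the 724.32 USD and 453.60 USD figures from the worked example.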

Stage 4: Three "hidden vampires" in AWS GPU billing

Many people assume that budgeting with the roughly $1 per hour from the table is enough. Then the bill arrives with hundreds of dollars extra. Remember, AWS bills each component separately: on top of the GPU server itself, the following three meters are running as well:

EBS volume charges (you keep paying if you only stop the instance and do not delete it): to run a large model, you downloaded 200 GB of Hugging Face model weights and provisioned a 300 GB gp3 volume. Note: even if you stop the EC2 instance (Stopped), as long as you do not terminate it (Terminated), that 300 GB volume keeps accruing storage charges every day! (In US East, a 300 GB gp3 volume costs about $24 a month.)

Data Transfer Out: inbound data (uploads from your machine to the server) is free on AWS, but outbound data (downloads from the server to your local machine or to clients) is charged. If you use the GPU to render large amounts of ultra-high-definition video, or serve high-frequency model calls that emit huge amounts of text, then once public internet traffic exceeds 100 GB in a month, you are charged about $0.09 per GB.

Elastic IP idle charges (do not leave the IP behind when you stop the server): suppose you allocate a fixed Elastic IP (EIP) for the server. If you stop the server and leave the IP idle, AWS keeps charging about $0.005 per hour for it, to discourage you from holding on to scarce public IPv4 addresses. Note that since February 2024 AWS charges this rate for every public IPv4 address, including ones attached to running instances, so an idle EIP is pure waste.
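To get a feel for how these three items add up, here is a rough estimator. It is only a sketch: the unit prices are the approximate US East figures mentioned above (the gp3 rate of $0.08 per GB-month is inferred from "300 GB is about $24 a month"), the function name is hypothetical, and the usage numbers in the example (600 GB of egress, 500 billed EIP hours) are invented for illustration.

```python
GP3_PER_GB_MONTH = 0.08   # inferred from this article: 300 GB is about $24 a month
EGRESS_FREE_GB = 100      # outbound traffic allowance before charges start
EGRESS_PER_GB = 0.09      # approximate price per GB beyond the allowance
EIP_PER_HOUR = 0.005      # approximate hourly charge for an Elastic IP


def hidden_monthly_costs(ebs_gb: float, egress_gb: float, eip_billed_hours: float) -> dict:
    """Estimate the monthly extras that are billed separately from the GPU instance."""
    ebs = ebs_gb * GP3_PER_GB_MONTH
    egress = max(egress_gb - EGRESS_FREE_GB, 0) * EGRESS_PER_GB
    eip = eip_billed_hours * EIP_PER_HOUR
    return {"ebs": ebs, "egress": egress, "eip": eip, "total": ebs + egress + eip}


# Hypothetical month: a 300 GB volume kept all month, 600 GB sent out to the
# internet, and an Elastic IP billed for 500 hours.
costs = hidden_monthly_costs(ebs_gb=300, egress_gb=600, eip_billed_hours=500)
for item, usd in costs.items():
    print(f"{item:7s} ${usd:7.2f}")
```

Even in this modest scenario the extras come to tens of dollars a month, and the EBS part keeps accruing while the instance is stopped.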

Summary and Pitfall-Avoidance Tips

Managing GPU servers on Amazon Cloud is essentially a balancing act between performance needs and budget. To finish, here are four self-defense tips that veterans rely on:

G4dn for lightweight inference: for models that are already trained and serving at small scale, the T4 GPU is the most cost-effective choice.

G5 for fine-tuning and rendering: with 24 GB of video memory and the newer Ampere architecture, the A10G gives the best experience for fine-tuning and image generation.

Commit for the long haul, use On-Demand for sprints: if the server runs for more than 12 hours a day, buy Savings Plans without hesitation.

Clean up completely when you are done: after an experiment, do not just stop the instance. Also check the volumes and IPs, and decisively terminate (Terminate) the machines you no longer need.