Reducing the Cost of Network Bandwidth within AWS
How I reduced my bandwidth costs by 80% using VPC endpoints
If you are running private services on AWS, it is quite likely that you are using a NAT gateway. A NAT gateway lets your private services reach networks on the public internet. It is a great service, and running a managed NAT is much easier than running your own. However, this comes at a cost. A single managed NAT gateway costs about $37/month on AWS just for running it. Next, you have bandwidth charges for all the data the NAT gateway processes. Then you have charges for regional data transfer within AWS. When you add it all up, it gets pricey.
For example, let’s say you transfer 615 GB of data through the NAT gateway in a month. Here’s the cost breakdown (you can find these charges under EC2-Other in Cost Explorer):
Running the NAT gateway for the month:
$0.052 per hour × 720 hours (24 × 30) = $37.44
Data processed by the NAT gateway:
$0.052 per GB × 615 GB = $31.98
Regional data transfer (in/out/between EC2 AZs, or using Elastic IPs or ELB):
$0.010 per GB × 615 GB = $6.15
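The arithmetic behind that breakdown can be checked with a short sketch. The rates and the 615 GB figure are taken from the breakdown above; the function and constant names are made up for illustration, and the rates will differ in other regions.

```python
from decimal import Decimal

# Rates and usage copied from the breakdown above (they vary by region).
HOURLY_RATE = Decimal("0.052")      # USD per NAT gateway hour
PROCESSING_RATE = Decimal("0.052")  # USD per GB processed by the NAT gateway
REGIONAL_RATE = Decimal("0.010")    # USD per GB of regional data transfer
HOURS_IN_MONTH = 24 * 30


def nat_gateway_costs(data_gb):
    """Return each line item and the total, rounded to cents."""
    cents = Decimal("0.01")
    items = {
        "hourly": (HOURLY_RATE * HOURS_IN_MONTH).quantize(cents),
        "processing": (PROCESSING_RATE * data_gb).quantize(cents),
        "regional": (REGIONAL_RATE * data_gb).quantize(cents),
    }
    # Sum the three line items; start from Decimal so the result stays exact.
    items["total"] = sum(items.values(), Decimal("0"))
    return items
```

Calling `nat_gateway_costs(615)` gives a total of $75.57, of which $38.13 (the processing and regional items) depends on how much data goes through the gateway. That data-dependent half is the part the rest of this post goes after.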
These are my real charges for the month of January. How the hell did I transfer 615 GB of data? Was I downloading movies? Was I streaming Netflix on a private VPN? Did I download World of Warcraft through my EC2 server and then download it back onto my local machine?
No.
I deployed my applications on ECS.
But wait, shouldn’t the traffic be within the same VPC if you’re using ECS and ECR? They should just talk to one another if you’re within private subnets, right?
Also, no.
You see, AWS doesn’t really let you know about this in a nice way when you set up a NAT gateway. (With good reason $$$). Here’s a basic diagram of the services that currently run for my ECS application.

Here’s how a reasonable person would expect the network to flow in this system.

A reasonable person would be wrong. This is more like how the network traffic flows.

Every request from the private subnet to an AWS service gets sent through the NAT gateway, out to the public AWS endpoint, and only then routed to the service. This is how I ended up at 615 GB. I was deploying my application frequently because I was testing out new fixes and working on the architecture, which meant updating my Docker images and redeploying them over and over. As a result, I was pulling tons of data from ECR, fetching data from S3, and also doing some load testing to ensure resiliency (which called my database).
How do we fix it?
It’s surprisingly simple to fix this problem, though. AWS has a notion of VPC endpoints: endpoints for the AWS services you use, created inside your own VPC. Once they exist, traffic to those services no longer goes through the NAT gateway to the public AWS endpoints; it never touches the NAT at all. Your services will automagically talk to the internal service endpoints you set up instead.
This drastically reduces the bandwidth cost of your NAT gateway because, well, you are no longer using it to transfer data within AWS. If you are using the interface type, an ENI gets set up in each subnet you specify. You need to specify the private subnets that your application runs in for this to work! Otherwise, there’s no way for your app’s networking to see these endpoints.
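One way to sanity-check an interface endpoint is to resolve the service hostname from inside the private subnet and see whether DNS hands back private addresses (the ENIs) instead of public ones. The sketch below assumes the endpoint was created with private DNS enabled, which the post does not state; the helper names are made up. It does not apply to gateway-type endpoints such as the one for S3, which work through route tables and still resolve to public addresses.

```python
import ipaddress
import socket


def resolved_addresses(hostname):
    """Resolve a hostname to the set of IP addresses DNS returns for it."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    # info[4][0] is the address string; drop any IPv6 zone suffix ("%eth0").
    return {info[4][0].split("%")[0] for info in infos}


def all_private(addresses):
    """True when there is at least one address and every one is private."""
    addresses = list(addresses)
    return bool(addresses) and all(
        ipaddress.ip_address(address).is_private for address in addresses
    )


def endpoint_is_private(hostname):
    """Resolve the hostname and report whether it points at private IPs only."""
    return all_private(resolved_addresses(hostname))
```

Run from a task or instance in the private subnet, something like `endpoint_is_private("api.ecr.eu-central-1.amazonaws.com")` should return `True` once the matching interface endpoint is reachable from that subnet. The region in that hostname is only an example; substitute your own.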
It’s important to know that each AWS service has its own VPC endpoint, and some services need more than one. For instance, ECS uses 3 separate service endpoints and ECR uses 2. You may also find that some services, such as CloudWatch, have endpoint names that differ from the service name: CloudWatch’s endpoint is called “monitoring”.
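As a rough sketch of what that looks like in practice, the snippet below builds one `create_vpc_endpoint` request per interface endpoint and hands them to boto3. The six service-name suffixes are the ECS, ECR, and CloudWatch endpoints counted above; the function names and every ID in the usage note are placeholders, and enabling private DNS is an assumption of this sketch, not something the setup described here is known to use.

```python
# Service-name suffixes for the interface endpoints mentioned above.
INTERFACE_SERVICES = [
    "ecs", "ecs-agent", "ecs-telemetry",  # ECS: three endpoints
    "ecr.api", "ecr.dkr",                 # ECR: two endpoints
    "monitoring",                         # CloudWatch
]


def build_endpoint_requests(region, vpc_id, subnet_ids, security_group_ids):
    """Build one create_vpc_endpoint argument set per interface service."""
    return [
        {
            "VpcEndpointType": "Interface",
            "VpcId": vpc_id,
            "ServiceName": f"com.amazonaws.{region}.{service}",
            "SubnetIds": list(subnet_ids),  # the app's private subnets
            "SecurityGroupIds": list(security_group_ids),
            "PrivateDnsEnabled": True,  # assumption: keep default hostnames
        }
        for service in INTERFACE_SERVICES
    ]


def create_endpoints(requests, region):
    """Send the requests to EC2. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without it

    ec2 = boto3.client("ec2", region_name=region)
    return [ec2.create_vpc_endpoint(**request) for request in requests]
```

A call such as `build_endpoint_requests("eu-central-1", "vpc-123", ["subnet-a", "subnet-b"], ["sg-1"])` returns six requests that you can inspect before passing them to `create_endpoints`. The security group you attach has to allow inbound HTTPS (port 443) from your tasks, or the endpoints will exist but stay unreachable.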
Although we are only a few days into February, my bill for NAT gateway charges is down significantly, and I am still deploying just as often. I am sitting at 0.15 GB of data transfer, which is likely coming only from external API calls. I’m quite happy with these changes.
I hope this helps someone out there who is using NAT gateways on AWS. I was surprised at first that this is not simply the default behavior when you run things on private subnets. Why should my egress traffic go out to public endpoints on the same platform? It’s a bit bizarre. Regardless, this fix is in place for me, and I’ve reduced my charges by about 25-30%, which makes me happy.
If you are a freelancer in the Czech Republic and looking for a premium invoicing solution, check out my app Živno! The app has a clean, modern interface, helps you track clients and payments easily, and, most importantly, lets you generate professional-looking invoices with ease.
Best of luck out there, devs!