AWS VPC Networking Guide

Yashrajsinh

January 15, 2025 · 15 min read · Intermediate

Amazon Virtual Private Cloud (VPC) is the networking foundation of almost every AWS deployment. When you launch an EC2 instance, deploy a container on ECS, provision an RDS database, or attach a Lambda function to your private network, the resource operates within a VPC. Understanding VPC architecture is not optional for cloud engineers. It is the layer that determines how your resources communicate with each other, how they reach the internet, and how you isolate workloads from unauthorized access. Without a properly designed VPC, even the most carefully built application is vulnerable to misconfiguration, data leakage, and connectivity failures.

A VPC gives you complete control over your virtual networking environment. You choose the IP address range, create subnets across multiple Availability Zones, configure route tables that direct traffic, attach internet gateways for public access, deploy NAT gateways for outbound-only connectivity, and layer security groups and network ACLs to filter traffic at both the instance and subnet level. This level of control means you can replicate traditional data center network architectures in the cloud while gaining the elasticity and global reach that AWS provides.

This guide takes you through every VPC component you need to design and operate production networks on AWS. We start with the fundamentals of CIDR blocks and subnets, move through routing and internet connectivity, cover security filtering at multiple layers, explain VPC peering and Transit Gateway for multi-VPC architectures, and finish with production best practices that keep your networks secure and performant. If you are following the AWS services roadmap, VPC knowledge is essential because services like EC2 and RDS depend entirely on your network design.

What You Will Learn

After completing this guide, you will have a thorough understanding of AWS VPC networking and how to design production-grade network architectures. Specifically, you will learn:

How VPCs provide isolated network environments within AWS and how CIDR block selection affects your ability to scale and peer networks
How subnets divide your VPC across Availability Zones and how the public versus private subnet pattern protects backend resources
How route tables direct traffic between subnets, to the internet, and across peered VPCs, and why each subnet must have exactly one associated route table
How internet gateways enable bidirectional internet access for public subnets and how Elastic IPs provide stable public addresses
How NAT gateways allow private subnet resources to reach the internet for updates and API calls without exposing them to inbound traffic
How security groups act as stateful firewalls at the instance level with allow-only rules that simplify management
How network ACLs provide stateless subnet-level filtering with both allow and deny rules for defense-in-depth
How VPC peering and Transit Gateway connect multiple VPCs and on-premises networks into a unified architecture
How VPC Flow Logs capture network traffic metadata for troubleshooting and security auditing

Each section builds on the previous one, giving you a coherent path from basic VPC creation to production-ready multi-tier network design.

Prerequisites

Before working through this guide, ensure you have the following ready:

An active AWS account with permissions to create VPC resources including subnets, route tables, internet gateways, NAT gateways, security groups, and network ACLs
The AWS CLI installed and configured with credentials using aws configure so you can execute networking commands from your terminal
Basic understanding of IP addressing including CIDR notation, subnet masks, and the difference between public and private IP ranges as defined in RFC 1918
Familiarity with IAM policies and permissions since VPC resource creation requires specific IAM actions like ec2:CreateVpc, ec2:CreateSubnet, and related permissions
Comfort with the concept of Availability Zones and AWS regions, as VPC design is inherently multi-AZ for high availability

No prior VPC experience is required, but understanding basic networking concepts like IP addresses, ports, and protocols will help you absorb the material faster.

Concept Overview

A VPC is a logically isolated section of the AWS cloud where you launch resources in a virtual network that you define. Think of it as your own private data center within AWS, except you do not manage physical hardware, cabling, or rack space. You define the network topology, and AWS handles the underlying infrastructure that makes it work at scale.

Every VPC exists within a single AWS region but spans all Availability Zones in that region. When you create a VPC, you assign it a CIDR block, which is the range of private IP addresses available for your resources. For example, a VPC with CIDR 10.0.0.0/16 provides 65,536 IP addresses. You then carve this range into smaller subnets, each placed in a specific Availability Zone, creating the foundation for high-availability architectures.

The critical architectural pattern in VPC design is the separation between public and private subnets. Public subnets have a route to an internet gateway, meaning resources inside them can have public IP addresses and communicate directly with the internet. Private subnets have no direct internet route. Instead, they reach the internet through a NAT gateway sitting in a public subnet, which allows outbound connections while blocking all unsolicited inbound traffic. This pattern protects your databases, application servers, and internal services from direct internet exposure while still allowing them to download updates and call external APIs.

Traffic filtering in a VPC happens at two levels. Security groups operate at the network interface level, are stateful, and only support allow rules. Network ACLs operate at the subnet level, are stateless, and support both allow and deny rules. Using both together provides defense-in-depth, where security groups handle fine-grained instance-level access and network ACLs provide broad subnet-level guardrails.

Step-by-Step Explanation

This section walks through the essential implementation steps in order. Each step builds on the previous one, providing a clear path from initial configuration to a production-ready setup that follows AWS best practices.

Creating a VPC with CIDR Block Planning

The first decision in VPC design is choosing your CIDR block. This choice is permanent for the primary CIDR and affects your ability to peer with other VPCs, connect to on-premises networks, and scale your subnet count. The RFC 1918 private ranges available are 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16. For production VPCs, a /16 block from the 10.x.0.0 range gives you 65,536 addresses, which is sufficient for most workloads while leaving room for future growth.

When planning CIDR blocks across multiple VPCs, avoid overlapping ranges. If your production VPC uses 10.0.0.0/16 and your staging VPC uses 10.1.0.0/16, you can peer them without conflicts. If both use 10.0.0.0/16, peering is impossible because the routing tables cannot distinguish between local and remote destinations.
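Before allocating a new range, it can help to check it against the ranges already in use. The following is a minimal sketch in POSIX shell, not an AWS tool: the helper names `ip_to_int`, `cidr_range`, and `cidrs_overlap` are hypothetical, and the script assumes well-formed IPv4 CIDR input.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Print the first and last address of a CIDR block as two integers.
cidr_range() (
  size=$(( 1 << (32 - ${1#*/}) ))
  start=$(( $(ip_to_int "${1%/*}") & ~(size - 1) ))
  echo "$start $(( start + size - 1 ))"
)

# Succeed (exit 0) when two CIDR blocks share at least one address.
cidrs_overlap() (
  set -- $(cidr_range "$1") $(cidr_range "$2")
  # $1/$2 = start/end of the first block, $3/$4 = start/end of the second
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
)

# Production and staging ranges from the example above
if cidrs_overlap 10.0.0.0/16 10.1.0.0/16; then
  echo "overlap: these VPCs cannot be peered"
else
  echo "no overlap: safe to peer"
fi
```

Running the same check with 10.0.0.0/16 on both sides reports an overlap, which is the case where peering fails.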

# Create a VPC with a /16 CIDR block
aws ec2 create-vpc \
  --cidr-block 10.0.0.0/16 \
  --tag-specifications 'ResourceType=vpc,Tags=[{Key=Name,Value=production-vpc},{Key=Environment,Value=prod}]'
 
# Enable DNS hostnames (required for public DNS resolution)
aws ec2 modify-vpc-attribute \
  --vpc-id vpc-0abc123def456 \
  --enable-dns-hostnames '{"Value": true}'
 
# Enable DNS support (enabled by default but verify)
aws ec2 modify-vpc-attribute \
  --vpc-id vpc-0abc123def456 \
  --enable-dns-support '{"Value": true}'
 
# Verify VPC creation
aws ec2 describe-vpcs \
  --vpc-ids vpc-0abc123def456 \
  --query 'Vpcs[0].{VpcId:VpcId,CidrBlock:CidrBlock,State:State}'

Enabling DNS hostnames is essential if you want EC2 instances in public subnets to receive public DNS names automatically. Without this setting, instances still get public IP addresses but no matching public DNS hostname, which breaks many service discovery patterns.
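The commands above use the placeholder ID vpc-0abc123def456. In a script you would normally capture the real ID from the create call instead. A minimal sketch, assuming the AWS CLI is configured; the function name `create_vpc` is hypothetical, while `--query` and `--output` are standard AWS CLI options.

```shell
# Create a VPC and print only its ID so that later commands can reuse it.
create_vpc() {
  aws ec2 create-vpc \
    --cidr-block "$1" \
    --query 'Vpc.VpcId' \
    --output text
}

# Usage (requires AWS credentials, so it is left commented out here):
#   VPC_ID=$(create_vpc 10.0.0.0/16)
#   aws ec2 modify-vpc-attribute --vpc-id "$VPC_ID" --enable-dns-hostnames '{"Value": true}'
```

The same pattern works for subnets, route tables, and gateways by changing the query path to match the field returned by each create command.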

Designing Subnets Across Availability Zones

Subnets are the building blocks of your VPC network topology. Each subnet exists in exactly one Availability Zone and has its own CIDR block carved from the VPC CIDR. The standard production pattern uses at least two Availability Zones with both public and private subnets in each zone, giving you four subnets minimum for a basic highly available architecture.

A common subnet sizing strategy for a /16 VPC allocates /20 subnets, each providing 4,091 usable IP addresses after AWS reserves five addresses per subnet for internal use. The first four addresses and the last address in each subnet CIDR are reserved by AWS for the network address, VPC router, DNS server, future use, and broadcast address respectively.
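The arithmetic behind these numbers is simple enough to script when comparing subnet sizes. A small sketch, with `usable_ips` as a hypothetical helper name:

```shell
# Usable addresses in an AWS subnet: total addresses minus the five AWS reserves.
usable_ips() {
  echo $(( (1 << (32 - $1)) - 5 ))
}

for prefix in 20 24 28; do
  echo "/$prefix -> $(usable_ips "$prefix") usable addresses"
done
# /20 -> 4091 usable addresses
# /24 -> 251 usable addresses
# /28 -> 11 usable addresses
```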

# Create public subnets in two AZs
aws ec2 create-subnet \
  --vpc-id vpc-0abc123def456 \
  --cidr-block 10.0.0.0/20 \
  --availability-zone ap-south-1a \
  --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=public-subnet-1a},{Key=Tier,Value=public}]'
 
aws ec2 create-subnet \
  --vpc-id vpc-0abc123def456 \
  --cidr-block 10.0.16.0/20 \
  --availability-zone ap-south-1b \
  --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=public-subnet-1b},{Key=Tier,Value=public}]'
 
# Create private subnets in two AZs
aws ec2 create-subnet \
  --vpc-id vpc-0abc123def456 \
  --cidr-block 10.0.32.0/20 \
  --availability-zone ap-south-1a \
  --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=private-subnet-1a},{Key=Tier,Value=private}]'
 
aws ec2 create-subnet \
  --vpc-id vpc-0abc123def456 \
  --cidr-block 10.0.48.0/20 \
  --availability-zone ap-south-1b \
  --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=private-subnet-1b},{Key=Tier,Value=private}]'
 
# Enable auto-assign public IP on both public subnets
aws ec2 modify-subnet-attribute \
  --subnet-id subnet-0pub1a \
  --map-public-ip-on-launch
 
aws ec2 modify-subnet-attribute \
  --subnet-id subnet-0pub1b \
  --map-public-ip-on-launch

The auto-assign public IP setting on public subnets means any EC2 instance launched there automatically receives a public IPv4 address without requiring an Elastic IP. This is convenient for web servers and bastion hosts but should never be enabled on private subnets.

Configuring Route Tables and Internet Gateways

Route tables determine where network traffic is directed. Every subnet is associated with exactly one route table at a time, though a single route table can be associated with multiple subnets. When you create a VPC, AWS creates a main route table automatically with a single local route that enables communication between all subnets within the VPC. Any subnet you do not explicitly associate with a custom route table uses the main route table.

The internet gateway is a horizontally scaled, redundant, and highly available VPC component that enables communication between your VPC and the internet. It performs network address translation for instances with public IP addresses and serves as the target in route table entries for internet-bound traffic.

# Create an internet gateway
aws ec2 create-internet-gateway \
  --tag-specifications 'ResourceType=internet-gateway,Tags=[{Key=Name,Value=prod-igw}]'
 
# Attach it to the VPC
aws ec2 attach-internet-gateway \
  --internet-gateway-id igw-0abc123 \
  --vpc-id vpc-0abc123def456
 
# Create a route table for public subnets
aws ec2 create-route-table \
  --vpc-id vpc-0abc123def456 \
  --tag-specifications 'ResourceType=route-table,Tags=[{Key=Name,Value=public-rt}]'
 
# Add a route to the internet gateway for all non-local traffic
aws ec2 create-route \
  --route-table-id rtb-0public \
  --destination-cidr-block 0.0.0.0/0 \
  --gateway-id igw-0abc123
 
# Associate public subnets with the public route table
aws ec2 associate-route-table \
  --route-table-id rtb-0public \
  --subnet-id subnet-0pub1a
 
aws ec2 associate-route-table \
  --route-table-id rtb-0public \
  --subnet-id subnet-0pub1b

The 0.0.0.0/0 route means all traffic not matching the local VPC CIDR route gets sent to the internet gateway. This is what makes a subnet public. Without this route, even if an instance has a public IP, it cannot communicate with the internet because the VPC router does not know where to send the packets.
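The router picks the most specific route that matches the destination, which is why the local route wins over 0.0.0.0/0 for traffic inside the VPC. The sketch below models that longest-prefix-match decision in POSIX shell. It is an illustration only, not how AWS implements routing, and the helper names (`ip_to_int`, `in_cidr`, `route_lookup`) are hypothetical.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed when an IPv4 address ($1) falls inside a CIDR block ($2).
in_cidr() (
  size=$(( 1 << (32 - ${2#*/}) ))
  mask=$(( ~(size - 1) ))
  [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "${2%/*}") & mask )) ]
)

# Print the target of the most specific matching route.
# Routes are passed as "CIDR=target" pairs after the destination address.
route_lookup() (
  dest=$1; shift
  best_prefix=-1; best_target="no route"
  for route in "$@"; do
    cidr=${route%%=*}; target=${route#*=}
    prefix=${cidr#*/}
    if in_cidr "$dest" "$cidr" && [ "$prefix" -gt "$best_prefix" ]; then
      best_prefix=$prefix; best_target=$target
    fi
  done
  echo "$best_target"
)

# The public route table from this section
route_lookup 10.0.40.7 10.0.0.0/16=local 0.0.0.0/0=igw-0abc123   # prints: local
route_lookup 8.8.8.8   10.0.0.0/16=local 0.0.0.0/0=igw-0abc123   # prints: igw-0abc123
```

Dropping the 0.0.0.0/0 entry makes the second lookup print "no route", which mirrors a subnet that has no path to the internet.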

Deploying NAT Gateways for Private Subnet Outbound Access

Private subnets intentionally lack a route to the internet gateway, which means resources inside them cannot initiate connections to the internet. However, these resources often need outbound internet access to download software updates, pull container images, call third-party APIs, or send logs to external services. NAT gateways solve this by providing outbound-only internet connectivity.

A NAT gateway sits in a public subnet, has an Elastic IP address, and performs network address translation. When a private subnet resource sends a packet to the internet, the route table directs it to the NAT gateway, which replaces the source IP with its own Elastic IP, forwards the packet through the internet gateway, and routes the response back to the originating resource. Inbound connections initiated from the internet cannot reach private resources through the NAT gateway because it only tracks connections that originated internally.

# Allocate an Elastic IP for the NAT gateway
aws ec2 allocate-address \
  --domain vpc \
  --tag-specifications 'ResourceType=elastic-ip,Tags=[{Key=Name,Value=nat-eip-1a}]'
 
# Create a NAT gateway in the public subnet
aws ec2 create-nat-gateway \
  --subnet-id subnet-0pub1a \
  --allocation-id eipalloc-0abc123 \
  --tag-specifications 'ResourceType=natgateway,Tags=[{Key=Name,Value=nat-gw-1a}]'
 
# Wait for the NAT gateway to become available
aws ec2 wait nat-gateway-available \
  --nat-gateway-ids nat-0abc123
 
# Create a route table for private subnets
aws ec2 create-route-table \
  --vpc-id vpc-0abc123def456 \
  --tag-specifications 'ResourceType=route-table,Tags=[{Key=Name,Value=private-rt-1a}]'
 
# Add a route to the NAT gateway
aws ec2 create-route \
  --route-table-id rtb-0private1a \
  --destination-cidr-block 0.0.0.0/0 \
  --nat-gateway-id nat-0abc123
 
# Associate the private subnet in ap-south-1a with its route table
aws ec2 associate-route-table \
  --route-table-id rtb-0private1a \
  --subnet-id subnet-0priv1a

For high availability, deploy one NAT gateway per Availability Zone and create separate route tables for private subnets in each zone. If a single NAT gateway serves both zones and its AZ experiences an outage, private resources in the other zone lose internet access. The cost of running multiple NAT gateways is justified in production by the availability guarantee.
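One way to wire up the per-zone layout is to pair each private route table with the NAT gateway in the same Availability Zone. This is a sketch under assumptions: the function name `add_nat_routes` is hypothetical, and the second-zone IDs in the usage line (rtb-0private1b, nat-0def456) are placeholders that do not appear in the commands above.

```shell
# Add a default route to the zone-local NAT gateway for each
# "route-table-id:nat-gateway-id" pair passed as an argument.
add_nat_routes() (
  for pair in "$@"; do
    aws ec2 create-route \
      --route-table-id "${pair%%:*}" \
      --destination-cidr-block 0.0.0.0/0 \
      --nat-gateway-id "${pair#*:}"
  done
)

# Usage (requires AWS credentials, so it is left commented out here):
#   add_nat_routes rtb-0private1a:nat-0abc123 rtb-0private1b:nat-0def456
```

Each private subnet is then associated with the route table for its own zone, so an outage in one zone leaves the other zone's outbound path intact.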

Implementing Security Groups as Stateful Firewalls

Security groups are the primary traffic filtering mechanism in AWS. They operate at the elastic network interface level, meaning every EC2 instance, RDS database, Lambda function in a VPC, and ECS task has one or more security groups attached. Security groups are stateful, which means if you allow inbound traffic on port 443, the response traffic is automatically allowed regardless of outbound rules. This simplifies configuration significantly compared to stateless firewalls.

Security groups use allow-only rules. You cannot create a deny rule in a security group. If traffic does not match any allow rule, it is implicitly denied. This default-deny posture means you only need to think about what to permit, not what to block. Rules can reference CIDR blocks, individual IP addresses, or other security groups as sources, which enables powerful patterns like allowing all instances in a web-tier security group to communicate with instances in an application-tier security group without hardcoding IP addresses.

# Create a security group for web servers
aws ec2 create-security-group \
  --group-name web-servers-sg \
  --description "Allow HTTP and HTTPS from the internet" \
  --vpc-id vpc-0abc123def456 \
  --tag-specifications 'ResourceType=security-group,Tags=[{Key=Name,Value=web-servers-sg}]'
 
# Allow inbound HTTPS from anywhere
aws ec2 authorize-security-group-ingress \
  --group-id sg-0web123 \
  --protocol tcp \
  --port 443 \
  --cidr 0.0.0.0/0
 
# Allow inbound HTTP from anywhere (for redirect to HTTPS)
aws ec2 authorize-security-group-ingress \
  --group-id sg-0web123 \
  --protocol tcp \
  --port 80 \
  --cidr 0.0.0.0/0
 
# Create a security group for application servers
aws ec2 create-security-group \
  --group-name app-servers-sg \
  --description "Allow traffic only from web tier" \
  --vpc-id vpc-0abc123def456
 
# Allow inbound from web-servers security group on port 8080
aws ec2 authorize-security-group-ingress \
  --group-id sg-0app123 \
  --protocol tcp \
  --port 8080 \
  --source-group sg-0web123
 
# Create a security group for databases
aws ec2 create-security-group \
  --group-name database-sg \
  --description "Allow PostgreSQL from app tier only" \
  --vpc-id vpc-0abc123def456
 
# Allow inbound PostgreSQL from app-servers security group
aws ec2 authorize-security-group-ingress \
  --group-id sg-0db123 \
  --protocol tcp \
  --port 5432 \
  --source-group sg-0app123

This tiered security group pattern creates a chain of trust. The internet can reach web servers on ports 80 and 443. Web servers can reach application servers on port 8080. Application servers can reach databases on port 5432. No other paths exist. If an attacker compromises a web server, they cannot directly access the database because the database security group only allows traffic from the application tier security group.
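To confirm the chain is wired as intended, you can list which security groups a given group accepts inbound traffic from. A minimal sketch, assuming configured AWS credentials; `allowed_source_groups` is a hypothetical helper name, and the query path follows the structure returned by describe-security-groups.

```shell
# Print the IDs of the security groups referenced as inbound sources
# by the security group given as the first argument.
allowed_source_groups() {
  aws ec2 describe-security-groups \
    --group-ids "$1" \
    --query 'SecurityGroups[0].IpPermissions[].UserIdGroupPairs[].GroupId' \
    --output text
}

# Usage (requires AWS credentials, so it is left commented out here):
#   allowed_source_groups sg-0db123
```

For the database group created above, the only source group listed should be the application tier group.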

Configuring Network ACLs for Subnet-Level Defense

Network Access Control Lists provide an additional layer of security at the subnet boundary. Unlike security groups, NACLs are stateless, meaning you must explicitly allow both inbound and outbound traffic for a connection to work. They also support deny rules, which lets you block specific IP ranges or ports regardless of what security groups allow. For inbound traffic, the NACL is evaluated before the security group, so a packet denied by a NACL never reaches the instance even if a security group rule would allow it.

The default NACL that AWS creates with every VPC allows all inbound and outbound traffic. For production environments, you should create custom NACLs with explicit rules. NACL rules are evaluated in order by rule number, with lower numbers evaluated first. Once a rule matches, evaluation stops, so place your most specific deny rules at low numbers and broader allow rules at higher numbers.
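The first-match behaviour is easy to get wrong, so here is a small model of it in POSIX shell. It is a deliberate simplification that looks only at the port and ignores protocol and source CIDR; the function name `nacl_decision` and the "number:action:from-to" rule format are hypothetical.

```shell
# Evaluate a port against NACL-style rules given as "number:action:from-to".
# Rules are checked in ascending rule-number order and the first match wins.
# If nothing matches, the result is deny, like the implicit final rule.
nacl_decision() (
  port=$1; shift
  for rule in $(printf '%s\n' "$@" | sort -t: -k1,1n); do
    action=${rule#*:}; action=${action%%:*}
    range=${rule##*:}
    if [ "$port" -ge "${range%-*}" ] && [ "$port" -le "${range#*-}" ]; then
      echo "$action"
      exit 0
    fi
  done
  echo "deny"
)

# A specific deny at rule 90 beats the broad allow at rule 200
nacl_decision 22  100:allow:443-443 90:deny:22-22 200:allow:0-65535   # prints: deny
nacl_decision 443 100:allow:443-443 90:deny:22-22 200:allow:0-65535   # prints: allow
nacl_decision 22  100:allow:443-443                                   # prints: deny
```

Swapping the numbers so that the broad allow sits at 90 and the deny at 200 makes port 22 come back as allow, which is the ordering mistake to avoid.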

# Create a custom NACL for public subnets
aws ec2 create-network-acl \
  --vpc-id vpc-0abc123def456 \
  --tag-specifications 'ResourceType=network-acl,Tags=[{Key=Name,Value=public-nacl}]'
 
# Allow inbound HTTPS (rule 100)
aws ec2 create-network-acl-entry \
  --network-acl-id acl-0pub123 \
  --rule-number 100 \
  --protocol tcp \
  --port-range From=443,To=443 \
  --cidr-block 0.0.0.0/0 \
  --rule-action allow \
  --ingress
 
# Allow inbound HTTP (rule 110)
aws ec2 create-network-acl-entry \
  --network-acl-id acl-0pub123 \
  --rule-number 110 \
  --protocol tcp \
  --port-range From=80,To=80 \
  --cidr-block 0.0.0.0/0 \
  --rule-action allow \
  --ingress
 
# Allow inbound ephemeral ports for return traffic (rule 120)
aws ec2 create-network-acl-entry \
  --network-acl-id acl-0pub123 \
  --rule-number 120 \
  --protocol tcp \
  --port-range From=1024,To=65535 \
  --cidr-block 0.0.0.0/0 \
  --rule-action allow \
  --ingress
 
# Allow all outbound traffic (rule 100)
aws ec2 create-network-acl-entry \
  --network-acl-id acl-0pub123 \
  --rule-number 100 \
  --protocol -1 \
  --cidr-block 0.0.0.0/0 \
  --rule-action allow \
  --egress

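The ephemeral port range (1024-65535) is critical for NACLs because they are stateless. When a client connects to your web server on port 443, the response leaves the subnet addressed to a random ephemeral port chosen by the client operating system, so the outbound rules must permit that range. The allow-all outbound rule above covers it. The inbound ephemeral rule serves the opposite direction: when an instance or NAT gateway in the subnet opens a connection to an external service, the reply arrives on an ephemeral port, and without rule 120 that reply is dropped at the subnet boundary.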

Real-World Use Cases

VPC design patterns vary based on workload requirements, compliance needs, and organizational structure. Here are the most common production architectures that cloud engineers implement.

The three-tier web application is the most common VPC pattern. Public subnets host load balancers that terminate TLS and distribute traffic. Private subnets in the application tier run EC2 instances or ECS tasks that process requests. A separate private subnet tier hosts RDS databases with Multi-AZ deployment. Each tier has its own security group that only allows traffic from the tier above it, creating a strict access chain from internet to data.

Multi-account VPC architectures use AWS Transit Gateway to connect VPCs across accounts. A shared services VPC hosts centralized logging, monitoring, and CI/CD infrastructure. Application VPCs in separate accounts connect through Transit Gateway route tables, allowing controlled communication without exposing every VPC to every other VPC. This pattern is standard in enterprises that use AWS Organizations with separate accounts per team or environment.
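As a rough sketch of the Transit Gateway wiring, each VPC is attached to the gateway through one subnet per Availability Zone, and each VPC route table gets a route for the remote ranges. The function names `attach_vpc_to_tgw` and `route_via_tgw` are hypothetical, and every ID in the usage lines is a placeholder. Cross-account sharing through AWS Resource Access Manager is assumed to be set up separately.

```shell
# Attach a VPC to a transit gateway.
# Arguments: transit gateway ID, VPC ID, then one subnet ID per Availability Zone.
attach_vpc_to_tgw() (
  tgw_id=$1; vpc_id=$2; shift 2
  aws ec2 create-transit-gateway-vpc-attachment \
    --transit-gateway-id "$tgw_id" \
    --vpc-id "$vpc_id" \
    --subnet-ids "$@"
)

# Send traffic for a remote CIDR range through the transit gateway.
# Arguments: route table ID, destination CIDR, transit gateway ID.
route_via_tgw() {
  aws ec2 create-route \
    --route-table-id "$1" \
    --destination-cidr-block "$2" \
    --transit-gateway-id "$3"
}

# Usage (requires AWS credentials, so it is left commented out here):
#   attach_vpc_to_tgw tgw-0abc123 vpc-0abc123def456 subnet-0priv1a subnet-0priv1b
#   route_via_tgw rtb-0private1a 10.1.0.0/16 tgw-0abc123
```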

Hybrid cloud connectivity uses AWS Site-to-Site VPN or Direct Connect to extend on-premises networks into AWS VPCs. The VPC acts as an extension of the corporate network, with route tables directing traffic destined for on-premises CIDR ranges through the VPN or Direct Connect gateway. This enables gradual cloud migration where some workloads remain on-premises while others run in AWS, all communicating over private network paths.

Container networking on ECS and EKS relies heavily on VPC subnet design. ECS tasks using awsvpc network mode each receive their own elastic network interface in the subnet, consuming one IP address per task. This means your subnet sizing must account for the maximum number of concurrent tasks. A /20 subnet with 4,091 usable IPs can support thousands of concurrent containers, but a /24 subnet with only 251 usable IPs fills up quickly under load.

Best Practices

Design your CIDR blocks with future growth and peering in mind. Use a consistent addressing scheme across all VPCs in your organization. Document which CIDR ranges are allocated to which VPCs and accounts in a central registry to prevent overlaps that block peering later.

Always deploy across at least two Availability Zones for production workloads. Place NAT gateways, load balancers, and application instances in multiple zones so that a single zone failure does not take down your entire application. The additional cost of multi-AZ NAT gateways is usually small compared to the downtime cost of a single-AZ architecture.

Use security groups as your primary access control mechanism and NACLs as a coarse-grained safety net. Security groups are easier to manage because they are stateful and reference other security groups by ID. Reserve NACLs for blocking known malicious IP ranges or enforcing compliance requirements that demand subnet-level controls.

Enable VPC Flow Logs on every production VPC and send them to CloudWatch Logs or S3 for analysis. Flow logs capture metadata about every network connection including source, destination, ports, protocol, and whether the traffic was accepted or rejected. This data is invaluable for troubleshooting connectivity issues, detecting unauthorized access attempts, and satisfying audit requirements.
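A minimal sketch of turning on flow logs for a whole VPC with an S3 destination follows. The function name `enable_vpc_flow_logs` and the bucket ARN in the usage line are hypothetical, and the bucket is assumed to exist already.

```shell
# Enable flow logs for every network interface in a VPC and deliver them to S3.
# Arguments: VPC ID, ARN of the destination S3 bucket.
enable_vpc_flow_logs() {
  aws ec2 create-flow-logs \
    --resource-type VPC \
    --resource-ids "$1" \
    --traffic-type ALL \
    --log-destination-type s3 \
    --log-destination "$2"
}

# Usage (requires AWS credentials, so it is left commented out here):
#   enable_vpc_flow_logs vpc-0abc123def456 arn:aws:s3:::example-flow-logs-bucket
```

Setting the traffic type to REJECT instead of ALL captures only denied traffic, which reduces log volume when the goal is spotting blocked connection attempts.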

Tag every VPC resource with at minimum a Name, Environment, and Owner tag. As your VPC count grows, untagged resources become impossible to manage. Consistent tagging enables cost allocation, automated cleanup of unused resources, and clear ownership when issues arise.

Use VPC endpoints for AWS services like S3, DynamoDB, and ECR to keep traffic on the AWS private network rather than routing it through NAT gateways and the public internet. Gateway endpoints, available for S3 and DynamoDB, carry no additional charge and reduce NAT gateway data processing charges. Interface endpoints for other services such as ECR are billed per hour and per gigabyte processed but eliminate internet dependency for critical service communication.
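A gateway endpoint is attached to route tables rather than to subnets, so creating one means naming the private route tables that should use it. A sketch under assumptions: `create_s3_gateway_endpoint` is a hypothetical helper name and the IDs in the usage line are the placeholders used earlier in this guide.

```shell
# Create an S3 gateway endpoint and add it to the given route tables.
# Arguments: VPC ID, region, then one or more route table IDs.
create_s3_gateway_endpoint() (
  vpc_id=$1; region=$2; shift 2
  aws ec2 create-vpc-endpoint \
    --vpc-id "$vpc_id" \
    --vpc-endpoint-type Gateway \
    --service-name "com.amazonaws.$region.s3" \
    --route-table-ids "$@"
)

# Usage (requires AWS credentials, so it is left commented out here):
#   create_s3_gateway_endpoint vpc-0abc123def456 ap-south-1 rtb-0private1a
```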

Common Mistakes

Choosing a CIDR block that is too small is the most frequent VPC design error. A /24 VPC with 256 addresses seems sufficient initially but becomes a constraint as you add subnets and deploy more instances. Start production VPCs with a /16, which is the largest block a VPC accepts, unless you have a specific reason to go smaller. You cannot expand the primary CIDR after creation, though you can add secondary CIDR blocks.

Placing databases in public subnets is a critical security mistake. Even if the security group restricts access, a public subnet means the database has a route to the internet and could potentially be reached if the security group is misconfigured. Always place databases, caches, and internal services in private subnets with no internet gateway route.

Forgetting ephemeral ports in NACL rules causes mysterious connectivity failures. Because NACLs are stateless, you must allow the ephemeral port range for return traffic in both directions: outbound for responses to clients, and inbound for replies to connections that your own instances initiate. Engineers who are accustomed to security groups often forget this because security groups handle return traffic automatically through statefulness.

Using the default VPC for production workloads is risky. The default VPC has all subnets configured as public with auto-assign public IP enabled. It uses a single route table and the default NACL that allows all traffic. Production workloads should always run in a custom VPC with intentional subnet design, explicit routing, and restrictive security controls.

Not planning for IP address exhaustion in container environments leads to deployment failures. When using ECS with awsvpc networking or EKS with the VPC CNI plugin, each pod or task consumes an IP address. If your subnets are too small, new tasks fail to launch because no IP addresses are available. Monitor subnet IP utilization and size subnets based on peak concurrent container count plus a growth buffer.
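One simple way to watch for exhaustion is to list the free address count that EC2 reports for each subnet. A minimal sketch, assuming configured AWS credentials; `subnet_free_ips` is a hypothetical helper name, and `AvailableIpAddressCount` is the field returned by describe-subnets.

```shell
# Print subnet ID, CIDR block, and remaining free addresses for every subnet in a VPC.
subnet_free_ips() {
  aws ec2 describe-subnets \
    --filters "Name=vpc-id,Values=$1" \
    --query 'Subnets[].[SubnetId,CidrBlock,AvailableIpAddressCount]' \
    --output text
}

# Usage (requires AWS credentials, so it is left commented out here):
#   subnet_free_ips vpc-0abc123def456
```

Running this on a schedule and alerting when the free count drops below your peak task count gives early warning before launches start failing.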

Relying solely on security groups without NACLs removes a defense layer. While security groups are sufficient for most access control, NACLs provide a subnet-level safety net. If a security group is accidentally opened too broadly, a restrictive NACL still blocks unauthorized traffic at the subnet boundary.

Summary

AWS VPC is the networking foundation that every cloud engineer must master. A well-designed VPC provides isolation, security, and connectivity for all your AWS resources. The key components work together in a layered architecture: the VPC defines your address space, subnets divide it across Availability Zones into public and private tiers, route tables direct traffic to internet gateways or NAT gateways, and security groups plus NACLs filter traffic at the instance and subnet levels respectively.

The public-private subnet pattern is the cornerstone of secure VPC design. Public subnets with internet gateway routes host load balancers and bastion hosts. Private subnets with NAT gateway routes host application servers and databases. This separation ensures that backend resources are never directly reachable from the internet while still having outbound connectivity for updates and API calls.

Production VPC design requires multi-AZ deployment, proper CIDR planning for future growth and peering, tiered security groups that create chains of trust between application layers, and VPC Flow Logs for visibility into network traffic. As your architecture grows, Transit Gateway connects multiple VPCs and on-premises networks, while VPC endpoints keep AWS service traffic off the public internet.

Start by building a two-AZ VPC with public and private subnets, an internet gateway, NAT gateways, and tiered security groups. Once this foundation is solid, layer on VPC peering, endpoints, and flow logs as your workload demands grow. The investment in proper VPC design pays dividends in security, reliability, and operational clarity for every service you deploy on AWS.
