Network Tunneling & VPNs — From First Principles

A structured guide to understanding network tunneling, VPNs, NAT traversal, and L7 tunneling solutions. Organized for gradual understanding — each section builds on the previous one.

1. The Big Picture

The core problem: Two machines on private networks need to communicate. Neither has a public IP, and neither can accept inbound connections, so a plain direct connection is impossible.

The L7 solution: Both machines make outbound HTTPS connections (port 443) to a relay service. The relay matches the two sessions and forwards bytes between them. No inbound ports, no public IPs, no VPN — just outbound HTTPS, which passes through virtually any firewall. This is how AWS SSM, Cloudflare Tunnel, and ngrok work.

The L3 VPN solution: Create a virtual network interface and encrypt IP packets between peers directly. Faster and transparent to all apps, but the peers must be able to exchange UDP packets (WireGuard, IPsec with NAT-T) or ESP packets (plain IPsec), which restrictive networks may block. This is how WireGuard and IPsec work.

Tailscale's approach: Try L3 first (direct WireGuard over UDP). If the network won't allow it, fall back to L7 (DERP relay over HTTPS). Best of both worlds.

The rest of this document unpacks each piece in detail, building up from network layers to concrete solutions.

2. The Network Layer Cake

All network communication happens in layers. Each layer has a specific job and wraps the layer above it.

Layer    Name            What It Moves          Examples
─────   ──────────────  ────────────────────   ──────────────────────────
L7      Application     HTTP requests,          HTTP, DNS, SSH, SMTP
                        API calls, web pages

L6      Presentation    Encryption, encoding    TLS/SSL, compression

L5      Session         Connections, sessions   WebSocket, RPC sessions

  (In practice, L5–L7 blur together. Most people just say "L7".)

L4      Transport       Segments / datagrams    TCP, UDP
                        (port numbers)

L3      Network         IP packets              IP, ICMP
                        (IP addresses, routing)

L2      Data Link       Ethernet frames         Ethernet, Wi-Fi, ARP
                        (MAC addresses)

L1      Physical        Bits on wire /          Ethernet cable, fiber,
                        radio waves             Wi-Fi radio, 5G

Where VPNs and Tunnels Fit

┌─────────────────────────────────────────────────────────┐
│  L7 — Application Layer Tunnels                         │
│    AWS SSM Port Forwarding · Cloudflare Tunnel · ngrok  │
│    SSH Tunnels / SOCKS Proxy · Tailscale DERP Relay     │
├─────────────────────────────────────────────────────────┤
│  L4 — Transport Layer                                   │
│    stunnel                                              │
├─────────────────────────────────────────────────────────┤
│  L3 — Network Layer VPNs                                │
│    WireGuard (← Tailscale) · IPsec (tunnel mode) · GRE  │
│    OpenVPN (TUN mode)                                   │
├─────────────────────────────────────────────────────────┤
│  L2 — Data Link Layer VPNs                              │
│    OpenVPN (TAP mode) · L2TP · VXLAN / MPLS             │
└─────────────────────────────────────────────────────────┘

3. What Is a VPN?

A VPN (Virtual Private Network) creates a private network over a public one. It does two things:

Tunneling — wraps your original network packets inside new packets, so they can travel across the internet as if on a private wire

Encryption — makes the wrapped content unreadable to anyone in between
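The two jobs above can be sketched in a few lines of Python. This is a toy illustration only: the 4-byte tag, the header layout, and the function names are made up for this example, and the XOR step is a placeholder that provides no real security.

```python
import struct

MAGIC = b"TUN1"  # hypothetical 4-byte tag for this toy tunnel format


def encapsulate(inner_packet: bytes, key: int) -> bytes:
    """Tunneling: wrap the inner packet behind a toy header (tag + length)."""
    # XOR is a stand-in for real encryption and provides NO security.
    body = bytes(b ^ key for b in inner_packet)
    return MAGIC + struct.pack("!H", len(body)) + body


def decapsulate(outer_payload: bytes, key: int) -> bytes:
    """Undo encapsulate(): check the header, then recover the inner packet."""
    if outer_payload[:4] != MAGIC:
        raise ValueError("not a tunnel packet")
    (length,) = struct.unpack("!H", outer_payload[4:6])
    body = outer_payload[6:6 + length]
    return bytes(b ^ key for b in body)


inner = b"[IP 10.0.0.1 -> 10.0.0.2][TCP][hello]"
wire = encapsulate(inner, key=0x5A)
print(inner in wire)                          # the original bytes are not visible on the wire
print(decapsulate(wire, key=0x5A) == inner)   # the far end recovers them
```

A real VPN replaces the XOR with authenticated encryption and sends the wrapped bytes as the payload of an ordinary outer packet.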

Two Fundamental VPN Models

Remote Access VPN ("Road Warrior"): one client connects into a network, for example a laptop on hotel Wi-Fi joining the office LAN.

Site-to-Site VPN: two entire networks are connected, typically gateway to gateway, so individual hosts on either side need no VPN software.

4. VPN Layers — Where the Tunnel Lives

VPNs operate at different layers of the network stack. The layer determines what gets tunneled and the tradeoffs involved.

L3 VPN (Network Layer) — Most Common

Tunnels IP packets. Creates a virtual network interface on your machine; anything routed through it gets encrypted and tunneled.

Examples: WireGuard, IPsec, Tailscale

Your OS sees a new network interface (e.g., wg0, utun3)

Any application works transparently — it just sends packets to an IP

L2 VPN (Data Link Layer)

Tunnels Ethernet frames. The remote machine appears to be on the same LAN segment.

Examples: OpenVPN (TAP mode), L2TP, VXLAN, MPLS

Can carry non-IP protocols (ARP, broadcast traffic)

Heavier overhead, but useful when you need true L2 adjacency

L7 Tunnel (Application Layer)

Tunnels at the application level. Per-application or per-port, not system-wide.

Examples: SSH tunnels, SOCKS proxies, SSM Port Forwarding, Cloudflare Tunnel, ngrok

Not traditional VPNs — they don't create a virtual network interface or route IP packets

How Tunneling Works at Each Layer

L7 Tunnel (e.g., SSM Port Forwarding):

L7: WebSocket frame carrying tunnel data
 └─ L6: TLS encryption
     └─ L4: TCP segment
         └─ L3: IP packet (public IPs)
             └─ L2: Ethernet frame
                 └─ L1: bits on wire

The tunneled content (e.g., Docker API call) is just
bytes inside the WebSocket message. The relay doesn't
know or care what's inside.

L3 Tunnel (e.g., WireGuard):

L3: IP packet (outer, public IPs)
 └─ L4: UDP datagram (outer, port 51820)
     └─ WireGuard header + encryption
         └─ L3: IP packet (inner, VPN private IPs)  ◄── tunneled content
             └─ L4: TCP/UDP (inner, original traffic)
                 └─ L7: Application data

An entire IP packet is encrypted and stuffed inside
a UDP packet. The OS routes it like any other packet.

L2 Tunnel (e.g., OpenVPN TAP / VXLAN):

L3: IP packet (outer, public IPs)
 └─ L4: UDP datagram (outer)
     └─ Tunnel header + encryption
         └─ L2: Ethernet frame (inner, MAC addrs)  ◄── tunneled content
             └─ L3: IP packet (inner)
                 └─ L4: TCP/UDP (inner)
                     └─ L7: Application data

An entire Ethernet frame (including MAC addresses,
ARP, broadcast) is encrypted and tunneled.

The Key Tradeoff

	Higher Layer Tunnel (L7)	Lower Layer Tunnel (L2/L3)
Firewall traversal	✓ Works through most firewalls (outbound 443)	✗ May be blocked (non-HTTPS)
Setup	✓ Easy	Moderate to complex
Granularity	✓ Per-service	Full network emulation
App transparency	✗ Per-app configuration	✓ Transparent to all apps
Overhead	✗ Higher	✓ Lower (kernel)
Path	✗ Relay always in path	✓ Can go direct (P2P)

General rule: the lower the layer, the more transparent and performant the tunnel, but the harder it is to traverse restrictive networks. The higher the layer, the easier it passes through firewalls, but the more limited and application-specific it becomes.

5. Major VPN Protocols

IPsec (1990s — The Enterprise Standard)

A suite of protocols operating at L3, implemented in the kernel:

IKE (Internet Key Exchange) — negotiates encryption keys between peers

ESP (Encapsulating Security Payload) — encrypts and authenticates packets

AH (Authentication Header) — authenticates but doesn't encrypt (rarely used alone)

Original:                 [IP Header][TCP Header][Data]
IPsec ESP (tunnel mode):  [New IP Header][ESP Header][encrypted: IP Header | TCP Header | Data | ESP Trailer][ESP ICV]

Very fast (runs in the kernel; AES is hardware-accelerated on most modern CPUs)

Extremely complex configuration — dozens of parameters to negotiate

Two modes: transport (encrypts payload only) and tunnel (encrypts entire original packet)

The standard for site-to-site VPNs (AWS VPN, most enterprise gear)

OpenVPN (2001 — The Flexible Workhorse)

Uses TLS to establish the tunnel, then sends encrypted packets over UDP or TCP.

Runs in userspace (not kernel) — easier to deploy, but slower

TUN mode (L3, routes IP packets) or TAP mode (L2, bridges Ethernet frames)

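TCP mode can traverse firewalls that block UDP, but it suffers from TCP-over-TCP meltdown: under packet loss the inner and outer TCP connections both retransmit and back off, and throughput collapses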

Verbose configuration — certs, keys, DH params, cipher negotiation

WireGuard (2018 — The Modern Minimalist)

Designed as a reaction to the complexity of IPsec and OpenVPN.

Design philosophy:

~4,000 lines of code (vs ~100,000 for OpenVPN, ~400,000 for IPsec)

No cipher negotiation — one fixed set of modern cryptography

Cryptokey routing — identity IS the public key, routing IS the configuration

Silent by default — doesn't respond to unauthenticated packets

Fixed cryptography (no negotiation):

Function	Algorithm
Key exchange	Curve25519 (ECDH)
Symmetric encryption	ChaCha20
MAC	Poly1305
Hashing	BLAKE2s
Key derivation	HKDF
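As a rough illustration of the last two rows, here is HKDF (RFC 5869, with empty "info") built on HMAC-BLAKE2s using only the Python standard library. The function names and the sample inputs are assumptions for this sketch; it is not WireGuard's code and has not been checked against its test vectors.

```python
import hashlib
import hmac


def hmac_blake2s(key: bytes, data: bytes) -> bytes:
    """HMAC with BLAKE2s as the underlying hash (32-byte output)."""
    return hmac.new(key, data, hashlib.blake2s).digest()


def hkdf2(chaining_key: bytes, input_material: bytes) -> tuple[bytes, bytes]:
    """HKDF extract-then-expand with empty info, returning two 32-byte keys."""
    prk = hmac_blake2s(chaining_key, input_material)   # extract
    out1 = hmac_blake2s(prk, b"\x01")                  # expand, block 1
    out2 = hmac_blake2s(prk, out1 + b"\x02")           # expand, block 2
    return out1, out2


# Hypothetical inputs: an all-zero chaining key and a placeholder shared secret.
k1, k2 = hkdf2(bytes(32), b"placeholder-shared-secret")
print(len(k1), len(k2), k1 != k2)
```

Because the algorithms are fixed, both peers can run this kind of derivation without first agreeing on which hash or KDF to use.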

Configuration — the entire thing:

# Peer A
[Interface]
PrivateKey = <A's private key>
Address = 10.0.0.1/24
ListenPort = 51820

[Peer]
PublicKey = <B's public key>
AllowedIPs = 10.0.0.2/32
Endpoint = 203.0.113.5:51820

# Peer B
[Interface]
PrivateKey = <B's private key>
Address = 10.0.0.2/24
ListenPort = 51820

[Peer]
PublicKey = <A's public key>
AllowedIPs = 10.0.0.1/32
Endpoint = 198.51.100.10:51820

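Handshake (Noise Protocol Framework, 1-RTT): the initiator sends one handshake initiation message, the responder answers with one handshake response, and both sides then hold fresh symmetric session keys for data transport.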

Data transport:

Original:  [IP Header 10.0.0.1 → 10.0.0.2][TCP][Data]
On wire:   [UDP 198.51.100.10:51820 → 203.0.113.5:51820]
           [WireGuard Header (type, index, counter)]
           [ChaCha20-Poly1305 encrypted payload]
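The per-packet cost of this wrapping is simple arithmetic. The sketch below assumes a 1500-byte link MTU and outer IP headers without options or extension headers; the constant names are made up for the example.

```python
OUTER_IPV4 = 20     # bytes, IPv4 header without options
OUTER_IPV6 = 40     # bytes, fixed IPv6 header
UDP = 8             # bytes, UDP header
WG_HEADER = 16      # type (1) + reserved (3) + receiver index (4) + counter (8)
POLY1305_TAG = 16   # authentication tag appended to the ciphertext


def tunnel_mtu(link_mtu: int, outer_ip_header: int) -> int:
    """Largest inner IP packet that still fits in one outer packet."""
    return link_mtu - outer_ip_header - UDP - WG_HEADER - POLY1305_TAG


print(tunnel_mtu(1500, OUTER_IPV4))  # 1440
print(tunnel_mtu(1500, OUTER_IPV6))  # 1420
```

The IPv6 case (80 bytes of overhead) is the conservative one, which matches the 1420-byte MTU commonly seen on WireGuard interfaces.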

Cryptokey routing — the AllowedIPs field serves double duty:

Outbound: routing table — "packets to 10.0.0.2/32 go to peer B"

Inbound: ACL — "packets from peer B must have source IP in 10.0.0.2/32"
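Both roles of AllowedIPs can be sketched with Python's ipaddress module. The table below mirrors Peer A's view of peer B and adds a second, made-up peer C to show longest-prefix matching; the names and function signatures are illustrative assumptions, not WireGuard's internal data structures.

```python
import ipaddress

# Hypothetical table: peer name -> AllowedIPs.
ALLOWED_IPS = {
    "peer-B": [ipaddress.ip_network("10.0.0.2/32")],
    "peer-C": [ipaddress.ip_network("10.0.1.0/24")],  # made-up second peer
}


def peer_for_destination(dst: str):
    """Outbound: pick the peer whose AllowedIPs has the longest matching prefix."""
    addr = ipaddress.ip_address(dst)
    best = None
    for peer, networks in ALLOWED_IPS.items():
        for net in networks:
            if addr in net and (best is None or net.prefixlen > best[1].prefixlen):
                best = (peer, net)
    return best[0] if best else None


def accept_inbound(peer: str, inner_src: str) -> bool:
    """Inbound: after decryption, the inner source must be in that peer's AllowedIPs."""
    addr = ipaddress.ip_address(inner_src)
    return any(addr in net for net in ALLOWED_IPS.get(peer, []))


print(peer_for_destination("10.0.0.2"))     # peer-B
print(peer_for_destination("192.168.1.1"))  # None: no peer, packet is not tunneled
print(accept_inbound("peer-B", "10.0.1.7")) # False: B may not spoof C's range
```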

What WireGuard deliberately doesn't do:

No user authentication (key-based only)

No key distribution (manual exchange)

No peer discovery (must know endpoints)

No NAT traversal (raw UDP)

No IP address assignment (static config)

WireGuard is a tunnel primitive, not a complete VPN product.

6. NAT and NAT Traversal

Why NAT Exists

Most devices don't have public IP addresses. Your laptop has a private IP like 192.168.1.50; your router has one public IP. When you send a packet out, the router rewrites the source address — this is NAT (Network Address Translation).

This works for outbound connections. The problem: if another device wants to reach your laptop directly, it can't. No mapping exists yet, so the router drops the packet.

SNAT vs DNAT

NAT has two directions:

SNAT (Source NAT) — rewrites the source address of outbound packets. This is what your home router does: your laptop sends a packet with source 192.168.1.50, the router rewrites it to 73.45.x.x before sending it to the internet. The most common form is masquerade (dynamic SNAT where the public IP may change). Every device behind a home router uses SNAT.

DNAT (Destination NAT) — rewrites the destination address of inbound packets. This is what port forwarding does: a packet arrives at the router's public IP 73.45.x.x:8080, and the router rewrites the destination to 192.168.1.50:8080 before forwarding it to your laptop. Load balancers also use DNAT — traffic arrives at the LB's IP, gets rewritten to a backend server's private IP.

SNAT (outbound):  src 192.168.1.50 → src 73.45.x.x       (hide private IP)
DNAT (inbound):   dst 73.45.x.x:8080 → dst 192.168.1.50:8080  (port forward)

In the context of tunneling, SNAT is the reason both sides can connect outbound to a relay — the router handles the address rewriting transparently. DNAT is what we're trying to avoid — it requires manual port forwarding or a public IP, which L7 tunnels eliminate entirely.

If two devices — both behind SNAT — want to talk directly, neither can initiate. Both doors are closed from the outside. NAT traversal is the collection of tricks to solve this.
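A toy model makes the asymmetry concrete. The class below is a simplified sketch: the public IP 203.0.113.7 and the starting port 40000 are arbitrary, and it ignores timeouts, protocols, and per-destination filtering.

```python
class Nat:
    """Toy NAT: tracks SNAT mappings and static DNAT (port forward) rules."""

    def __init__(self, public_ip: str):
        self.public_ip = public_ip
        self.next_port = 40000   # arbitrary first public port
        self.out = {}            # (private ip, private port) -> public port
        self.back = {}           # public port -> (private ip, private port)
        self.forwards = {}       # DNAT rules: public port -> (private ip, port)

    def outbound(self, src_ip: str, src_port: int):
        """SNAT: rewrite the source, creating a mapping on first use."""
        key = (src_ip, src_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.out[key]

    def inbound(self, dst_port: int):
        """Return the private destination, or None if the packet is dropped."""
        if dst_port in self.forwards:     # DNAT: explicit port forward
            return self.forwards[dst_port]
        return self.back.get(dst_port)    # reply to an existing SNAT mapping


nat = Nat("203.0.113.7")
print(nat.inbound(40000))                   # None: no mapping yet, dropped
print(nat.outbound("192.168.1.50", 51000))  # ('203.0.113.7', 40000)
print(nat.inbound(40000))                   # ('192.168.1.50', 51000)
nat.forwards[8080] = ("192.168.1.50", 8080) # manual port forward
print(nat.inbound(8080))                    # ('192.168.1.50', 8080)
```

The first inbound packet is dropped because only an outbound packet or an explicit forward can create state, which is exactly the situation two NATed peers start in.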

The Tricks

Trick 1: STUN (discover your public address)

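You don't even know your own public IP and port. STUN solves this: you send a request to a public STUN server, and it replies with the source IP and port it saw, which is the mapping your NAT created.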
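The STUN wire format is small enough to sketch by hand. The code below builds a Binding Request and decodes an IPv4 XOR-MAPPED-ADDRESS attribute following the RFC 5389/8489 layout; it skips retransmission, authentication, and IPv6. The fake_binding_response helper is a stand-in for a real server so the example runs offline.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request() -> bytes:
    """20-byte header: type, length 0, magic cookie, random transaction ID."""
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + os.urandom(12)


def parse_xor_mapped_address(response: bytes):
    """Return (ip, port) from an IPv4 XOR-MAPPED-ADDRESS attribute, else None."""
    msg_type, length, cookie = struct.unpack("!HHI", response[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    pos, end = 20, 20 + length
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", response[pos:pos + 4])
        value = response[pos + 4:pos + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        pos += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None


def fake_binding_response(ip: str, port: int, txid: bytes) -> bytes:
    """Test helper standing in for a real STUN server (hypothetical)."""
    xport = port ^ (MAGIC_COOKIE >> 16)
    xaddr = struct.unpack("!I", socket.inet_aton(ip))[0] ^ MAGIC_COOKIE
    attr = struct.pack("!HHBBHI", XOR_MAPPED_ADDRESS, 8, 0, 0x01, xport, xaddr)
    return struct.pack("!HHI", BINDING_SUCCESS, len(attr), MAGIC_COOKIE) + txid + attr


request = build_binding_request()
response = fake_binding_response("203.0.113.5", 51820, request[8:20])
print(parse_xor_mapped_address(response))  # ('203.0.113.5', 51820)
```

Against a real server, the request would be sent over UDP from the same socket the tunnel will use, since the discovered mapping belongs to that socket's source port.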

Trick 2: Hole Punching (the core trick)

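Once both peers know each other's public address, they send packets to each other at roughly the same time. Each outbound packet opens a mapping in the sender's own NAT, so the peer's packet, arriving from the address just sent to, looks like a reply and is let through.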

Trick 3: Port Mapping (UPnP / NAT-PMP / PCP)

Explicitly ask the router for a mapping. Cleanest solution, but depends on router support (many corporate networks disable it).

Trick 4: TURN / Relay (when all else fails)

Give up on direct connection and relay through a server. Always works (both sides connect outbound), but adds latency.

NAT Types

NAT Type	Behavior	Hole Punching?
Full Cone	Any external host can send to the mapped port	Easy ✓
Address-Restricted	Only the specific IP you sent to can reply	Works ✓
Port-Restricted	Only the specific IP:port you sent to can reply	Works ✓
Symmetric	Different mapping for every destination	Very hard ✗

Symmetric NAT is the nemesis — the port STUN discovers is useless because the router assigns a different port for each destination. Most home routers are port-restricted (hole punching works). Corporate firewalls are often symmetric (need relay).
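The difference can be simulated in a few lines. This is a simplified model, not a faithful NAT implementation: every NAT here filters like a port-restricted one, the symmetric flag only changes how public ports are allocated, and the STUN server address 192.0.2.1:3478 is a placeholder.

```python
class FilteringNat:
    """Toy NAT with port-restricted filtering and optional symmetric mapping."""

    def __init__(self, public_ip: str, symmetric: bool = False):
        self.public_ip = public_ip
        self.symmetric = symmetric
        self.next_port = 40000
        self.mappings = {}    # destination (or "any") -> public port
        self.allowed = set()  # (public port, remote address) pairs we have sent to

    def send(self, remote):
        """Outbound packet to remote=(ip, port); returns the public source address."""
        key = remote if self.symmetric else "any"
        if key not in self.mappings:
            self.mappings[key] = self.next_port
            self.next_port += 1
        port = self.mappings[key]
        self.allowed.add((port, remote))
        return (self.public_ip, port)

    def receive(self, source, dst_port: int) -> bool:
        """Accept only packets from an ip:port this mapping has already sent to."""
        return (dst_port, source) in self.allowed


def hole_punch(nat_a, nat_b, stun=("192.0.2.1", 3478)) -> bool:
    a_pub = nat_a.send(stun)   # step 1: each side learns its address via STUN
    b_pub = nat_b.send(stun)
    a_src = nat_a.send(b_pub)  # step 2: both send to the STUN-discovered address
    b_src = nat_b.send(a_pub)
    # step 3: do follow-up packets now pass the other side's NAT?
    return nat_a.receive(b_src, a_pub[1]) and nat_b.receive(a_src, b_pub[1])


print(hole_punch(FilteringNat("198.51.100.10"), FilteringNat("203.0.113.5")))
# True: both port-restricted
print(hole_punch(FilteringNat("198.51.100.10", symmetric=True), FilteringNat("203.0.113.5")))
# False: A's port toward B differs from the one STUN reported
```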

ICE — The Standard Framework

These tricks aren't ad-hoc — they're formalized in ICE (Interactive Connectivity Establishment), a protocol from the VoIP/WebRTC world. ICE combines STUN + TURN + candidate gathering into a single framework:

Gather all possible connection paths ("candidates"): local address, STUN-discovered address, TURN relay address

Exchange candidates with the peer via a signaling channel

Run connectivity checks on the candidate pairs, highest priority first

Pick the best working path (direct preferred over relayed)

WebRTC uses ICE under the hood for browser-to-browser video calls. Tailscale implements the same ideas adapted for WireGuard.
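The "priority order" in ICE comes from two formulas in RFC 8445. The sketch below applies them with the RFC's recommended type preferences; treating the local side as the controlling agent and using a single local preference value are simplifying assumptions.

```python
# Recommended type preferences from RFC 8445 (higher is preferred).
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(kind: str, local_pref: int = 65535, component: int = 1) -> int:
    """2^24 * type preference + 2^8 * local preference + (256 - component ID)."""
    return (TYPE_PREFERENCE[kind] << 24) + (local_pref << 8) + (256 - component)


def pair_priority(controlling: int, controlled: int) -> int:
    """2^32 * min(G, D) + 2 * max(G, D) + (1 if G > D else 0)."""
    g, d = controlling, controlled
    return (min(g, d) << 32) + 2 * max(g, d) + (1 if g > d else 0)


def ordered_pairs(local_kinds, remote_kinds):
    """All candidate pairs, highest pair priority first."""
    pairs = [
        (pair_priority(candidate_priority(l), candidate_priority(r)), l, r)
        for l in local_kinds
        for r in remote_kinds
    ]
    return [(l, r) for _, l, r in sorted(pairs, reverse=True)]


kinds = ["host", "srflx", "relay"]   # local, STUN-discovered, TURN relay
for pair in ordered_pairs(kinds, kinds):
    print(pair)
```

Direct host-to-host pairs sort first and relay-to-relay pairs last, which is how "direct preferred over relayed" falls out of the numbers.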

7. Tailscale: The Complete Product on WireGuard

Tailscale fills in everything WireGuard deliberately leaves out:

WireGuard Gap	Tailscale Solution
No key distribution	Coordination server distributes public keys
No peer discovery	Coordination server shares endpoint info
No NAT traversal	STUN + hole punching + DERP relays
No user auth	OAuth/OIDC (Google, Microsoft, GitHub, etc.)
No IP assignment	Automatic from 100.64.0.0/10 (CGNAT range)
No access control	ACLs defined in a central policy file
No DNS	MagicDNS — each node gets `hostname.tailnet.ts.net`
Manual config	Zero-config — install, login, done

Architecture

What happens when you install Tailscale:

tailscale up → browser opens → you log in with your identity provider

tailscaled generates a WireGuard key pair, sends the public key to the coordination server

Coordination server authenticates you, assigns a Tailscale IP (e.g., 100.64.0.1), pushes the network map (all peers' public keys + known endpoints)

Your node runs STUN to discover its public address, reports it to the coordination server

When you reach another node, tailscaled tries direct UDP (hole punching), falls back to DERP if needed

Traffic flows over WireGuard — the coordination server is never in the data path (unless DERP is needed)

Connection Path Selection
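The preference order described in the steps above (direct UDP first, DERP as fallback) can be sketched as a small selection function. This is an illustration of that ordering only, not tailscaled's actual algorithm; the Path type and its fields are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Path:
    kind: str          # "direct" or "derp" (hypothetical labels)
    reachable: bool    # did a probe over this path succeed?
    latency_ms: float


def select_path(paths: list[Path]) -> Optional[Path]:
    """Prefer any working direct path; otherwise the fastest working relay."""
    working = [p for p in paths if p.reachable]
    direct = [p for p in working if p.kind == "direct"]
    pool = direct or working
    return min(pool, key=lambda p: p.latency_ms) if pool else None


candidates = [
    Path("direct", reachable=False, latency_ms=12.0),  # hole punching failed
    Path("derp", reachable=True, latency_ms=48.0),
    Path("derp", reachable=True, latency_ms=95.0),
]
print(select_path(candidates))  # the 48 ms DERP path
```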

Tailscale's Hybrid Nature

Tailscale's control plane is L7 (HTTPS to coordination server), but the data plane is L3 (WireGuard UDP). It uses the L7 channel to bootstrap and maintain the L3 mesh, and falls back to L7 (DERP) when direct connections fail — getting the best of both worlds.

Headscale is an open-source reimplementation of Tailscale's coordination server. You self-host it; the data plane (WireGuard) is identical.

8. L7 Tunneling Solutions

All L7 tunneling solutions share the same fundamental architecture:

The Three Participants

The Agent — runs on the private/target machine, maintains a persistent outbound connection to the relay, never listens on a public port

The Relay — publicly reachable service, accepts connections from both sides, matches them together, forwards bytes (dumb pipe)

The Client — runs on your machine, connects outbound to the relay, opens a local listener and pipes traffic through

The Network Flow

Both connections point inward toward the relay (Client → Relay ← Agent). Neither side accepts inbound connections. No inbound firewall rules, public IPs, or VPNs needed.
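A minimal relay shows how little the middle box has to do. The sketch below uses asyncio over plain TCP on localhost; the one-line session token, the port choice, and the function names are assumptions for illustration, and a real relay would add TLS and authentication on both legs.

```python
import asyncio

waiting = {}  # token -> (reader, writer) of whichever side arrived first


async def pipe(reader, writer):
    """Copy bytes one way until EOF, then close the destination."""
    try:
        while data := await reader.read(4096):
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


async def handle(reader, writer):
    """Pair two outbound connections that present the same token."""
    token = (await reader.readline()).strip()
    if token not in waiting:
        waiting[token] = (reader, writer)  # first side: park and wait
        return
    peer_reader, peer_writer = waiting.pop(token)
    # Second side arrived: forward bytes in both directions (dumb pipe).
    await asyncio.gather(pipe(reader, peer_writer), pipe(peer_reader, writer))


async def demo():
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    # Agent and client both dial OUT to the relay with the same token.
    agent_r, agent_w = await asyncio.open_connection("127.0.0.1", port)
    agent_w.write(b"session-42\n")
    client_r, client_w = await asyncio.open_connection("127.0.0.1", port)
    client_w.write(b"session-42\n")
    client_w.write(b"ping")                   # client -> relay -> agent
    at_agent = await agent_r.readexactly(4)
    agent_w.write(b"pong")                    # agent -> relay -> client
    at_client = await client_r.readexactly(4)
    agent_w.close()
    client_w.close()
    server.close()
    return at_agent, at_client


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The relay never parses the forwarded bytes, which is why the same pattern can carry a Docker API call, SSH, or a database protocol unchanged.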

The Lifecycle

Auth Boundaries

Client → Relay ("Are you allowed to request this tunnel?"):

IAM roles, OAuth/OIDC tokens, API keys, pre-shared tokens — the relay is the gatekeeper.

Agent → Relay ("Are you a legitimate agent?"):

Instance identity documents, enrollment tokens, mutual TLS, API keys — prevents rogue agents.

The tunneled traffic itself is opaque to the relay.

Why Port 443

Almost every network allows outbound HTTPS (port 443)

Corporate firewalls, NATs, hotel WiFi, private subnets — all pass it through

WebSocket upgrade starts as a normal HTTPS request, so it passes through most HTTP proxies and looks like ordinary web traffic to any inspection that does not terminate TLS

Cloudflare Tunnel — Brief Overview

Cloudflare Tunnel (cloudflared) is a popular L7 tunneling solution that follows this exact pattern:

The agent (cloudflared) runs on your server and connects outbound to Cloudflare's edge network via HTTP/2 or QUIC

Cloudflare's edge acts as the relay. It is a global anycast network, so a relay point is usually close to both sides

Clients are end users hitting a public hostname (e.g., app.example.com) — Cloudflare routes the request through the tunnel to your agent

Auth is handled by Cloudflare Access (OIDC, SAML, etc.) for private apps, or simply DNS for public-facing services

Can expose HTTP services, TCP services (SSH, RDP, databases), or act as a private network connector

Free tier available — no need for public IPs, no inbound firewall rules

9. AWS SSM Port Forwarding — Detailed Example

AWS Systems Manager (SSM) port forwarding is a concrete implementation of the L7 tunneling pattern. It lets you reach services on private EC2 instances without public IPs, SSH, inbound security group rules, or VPNs.

Architecture

No public IP, no SSH, no inbound security group rules, no VPN. Both sides connect outbound to the SSM service over HTTPS (port 443).

Step-by-Step Flow

Real-World Use Case: Remote Docker Builder

A common use case for SSM port forwarding is offloading Docker builds to a remote EC2 instance — for example, when your Mac has Apple Silicon (ARM) but you need to build x86/amd64 images, or you want faster builds on a powerful instance.

The setup:

An EC2 instance in a private subnet runs Docker daemon listening on 127.0.0.1:2375

Your Mac uses SSM port forwarding to tunnel localhost:2376 → EC2's 127.0.0.1:2375

Docker CLI on your Mac connects to the remote daemon as if it were local

The commands:

# 1. Start the SSM tunnel (maps local port 2376 → remote port 2375)
#    Requires the AWS CLI plus the Session Manager plugin on your machine.
aws ssm start-session \
  --target i-0abc123def456 \
  --document-name AWS-StartPortForwardingSession \
  --parameters '{"portNumber":["2375"],"localPortNumber":["2376"]}'

# 2. In another terminal, point Docker CLI at the tunnel
export DOCKER_HOST=tcp://localhost:2376

# 3. Now docker commands execute on the remote EC2 instance
docker build --platform linux/amd64 -t my-app:latest .
docker compose up -d

What's happening under the hood:

The Docker CLI thinks it's talking to a local daemon. The SSM tunnel transparently relays every Docker API call to the remote instance. Build context, image layers, and logs all flow through the tunnel. The EC2 instance builds x86 images natively (no emulation). The resulting image stays in the remote daemon's image store; push it to a registry (or docker save it through the tunnel) to use it elsewhere.

10. L7 Tunneling Solutions Comparison

Solution	Self-hosted?	Protocol	Auth Model	Primary Use Case
AWS SSM Port Forwarding	No (AWS)	WebSocket	IAM	AWS infra access
Cloudflare Tunnel	No	HTTP/2, QUIC	Cloudflare Access	Expose services
ngrok	No (open-source v1 could be self-hosted)	WebSocket	API keys/OAuth	Dev tunnels
Tailscale DERP	Partial (Headscale)	WireGuard + HTTPS	OAuth/OIDC	Mesh networking fallback
Teleport	Yes	WebSocket/gRPC	SSO/RBAC	Infra access
Inlets	Yes	WebSocket	Token	Lightweight tunnels
frp / Rathole	Yes	TCP/WebSocket	Token	Self-hosted tunnels
Boundary (HashiCorp)	Yes	gRPC	OIDC/Vault	Infra access
GCP IAP TCP Forwarding	No (GCP)	WebSocket	Google IAM	GCP infra access
Azure Bastion	No (Azure)	HTTPS	Microsoft Entra ID (formerly Azure AD)	Azure infra access

Complexity Spectrum

Dimension	Simple (ngrok, inlets)	Medium (Cloudflare, SSM)	Full (Teleport, Boundary)
Auth	API key / token	Cloud IAM	SSO + RBAC + MFA
Relay	Vendor-hosted	Vendor-hosted	Self-hosted option
Audit	Basic logs	Cloud audit trail	Session recording
Scope	Single tunnel	Service exposure	Full infra access
Credential mgmt	Manual	Cloud-native	Injected (e.g., Vault)

11. When to Use What — A Decision Guide

Quick Reference

Scenario	Best Fit	Why
Reach an EC2 instance in a private subnet	SSM Port Forwarding	Zero setup, IAM-native, no public IP needed
Remote Docker builds from Mac to EC2	SSM Port Forwarding	Tunnel Docker API, build x86 natively
Expose a local dev server for webhooks	ngrok	One command, instant public URL
Expose a production service without public IP	Cloudflare Tunnel	Global edge, DDoS protection, free tier
Connect all your devices in a mesh	Tailscale	Zero-config WireGuard mesh, works everywhere
Site-to-site between two offices	WireGuard or IPsec	Direct, kernel-level, high throughput
Secure infra access with audit trail	Teleport or Boundary	Session recording, RBAC, credential injection
Self-hosted tunnel on a budget	frp or inlets	Lightweight, runs on a cheap VPS

12. Key Takeaways

WebSockets / HTTP/2 over port 443 are the universal firewall bypass — outbound HTTPS is almost never blocked, making L7 tunnels work from virtually any network.

All L7 tunneling solutions share the same architecture: agent dials out, relay matches, client dials out, bytes flow.

VPN layer choice is a tradeoff: lower layers (L2/L3) are more transparent and performant; higher layers (L7) traverse firewalls more easily.

WireGuard is a tunnel primitive, not a complete VPN — it does one thing (encrypt and route packets) extremely well in ~4,000 lines of code.

Tailscale bridges both worlds: L3 data plane (WireGuard) for performance, L7 control plane + DERP fallback for reachability.

NAT traversal (STUN, hole punching, port mapping, relay fallback) enables direct P2P connections despite both sides being behind NAT. These techniques are formalized in the ICE framework from the WebRTC/VoIP world.

Choose by scope: single port → SSM/cloud-native tunnel; full network → WireGuard/Tailscale; expose to internet → Cloudflare/ngrok.