Kubernetes Services and DNS Guide
Service Types
Kubernetes Service types build on each other — each type includes everything the previous one does.
ClusterIP (Base)
Internal-only. Gets a virtual IP from the service CIDR.
Pod → my-svc.default.svc.cluster.local
→ CoreDNS returns ClusterIP (e.g., 172.20.10.5)
→ kube-proxy/eBPF intercepts traffic to 172.20.10.5
→ DNAT to a healthy pod IP (e.g., 10.0.2.47:8080)
- Only reachable from within the cluster
- DNS: `<service>.<namespace>.svc.cluster.local` → ClusterIP
- Load balancing: random or round-robin across endpoints via iptables/eBPF
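A minimal ClusterIP Service manifest (names and ports are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-svc
spec:
  selector:
    app: my-app        # routes to pods carrying this label
  ports:
  - port: 80           # the Service (ClusterIP) port
    targetPort: 8080   # the container port on the pods
```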
NodePort (ClusterIP + port on every node)
Allocates a static port (30000–32767) on every node. A ClusterIP is still created underneath.
External client → <any-node-ip>:31234
→ kube-proxy/eBPF on that node intercepts
→ DNAT to a healthy pod IP (same as ClusterIP routing)
Internal pod → ClusterIP still works as before
- Reachable externally via `<node-ip>:<node-port>`
- DNS: same internal DNS as ClusterIP; no external DNS created
- The port is opened on all nodes, even those not running the target pods — traffic gets forwarded across nodes
- Rarely used directly in production; mainly a building block
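A NodePort Service sketch (port values illustrative; omit `nodePort` to let Kubernetes pick one from the range):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-svc
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
  - port: 80          # ClusterIP port (still created underneath)
    targetPort: 8080  # container port
    nodePort: 31234   # opened on every node (30000–32767)
```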
LoadBalancer (NodePort + external LB)
Provisions a cloud load balancer (NLB or ALB on AWS) that routes to the NodePort — or directly to pod IPs in IP target mode.
Instance target mode (traditional):
Client → AWS NLB/ALB → <node-ip>:<node-port>
→ kube-proxy/eBPF → pod IP (double hop)
IP target mode (default in EKS Auto Mode):
Client → AWS NLB/ALB → pod IP directly (single hop)
- DNS: AWS assigns a load balancer DNS name (e.g., `abc123.elb.ca-central-1.amazonaws.com`). No Kubernetes DNS — you create a Route 53 alias/CNAME yourself or use external-dns.
- One LB per Service — gets expensive with many services
- NLB for TCP/UDP (Layer 4), ALB for HTTP/HTTPS (Layer 7)
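A LoadBalancer Service sketch for an internet-facing NLB; the annotations follow AWS Load Balancer Controller conventions, and all names and ports are illustrative:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-svc
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: external
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
  - port: 443
    targetPort: 8443
```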
IP Target vs Instance Target Mode
With VPC CNI, pods have real VPC IPs, so the load balancer can route directly to them.
| | IP Target Mode | Instance Target Mode |
|---|---|---|
| Traffic path | LB → pod directly | LB → node → kube-proxy → pod |
| Latency | Lower (one fewer hop) | Higher (extra node hop) |
| Distribution | Even across pods | Uneven (nodes may have different pod counts) |
| Source IP | Preserved | May require externalTrafficPolicy: Local |
| Default in | EKS Auto Mode | Standard EKS |
In standard EKS, opt in with annotations:
```yaml
# For NLB
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
# For ALB via Ingress
alb.ingress.kubernetes.io/target-type: ip
```
Bare-Metal / MetalLB
Deep dive: See 02-k8s-network-infrastructure.md for how L2 Announcements and BGP work under the hood, and how MetalLB compares to Cilium LB-IPAM and LoxiLB.
On bare-metal clusters (k3s, kubeadm, etc.) with MetalLB, there’s no cloud LB that can register pod IPs. The flow is always:
Client → MetalLB VIP → Node → kube-proxy → Pod
MetalLB operates in L2 mode (ARP, single node owns VIP) or BGP mode (multiple nodes advertise VIP to upstream router). Either way, the node hop is unavoidable because pod IPs on overlay networks aren’t directly routable from outside the cluster.
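A minimal MetalLB configuration sketch, assuming L2 mode and an unused address range on the nodes' subnet (names and addresses illustrative):

```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  addresses:
  - 192.168.1.240-192.168.1.250   # VIPs MetalLB may hand out
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default-l2
  namespace: metallb-system
spec:
  ipAddressPools:
  - default-pool                  # announce these VIPs via ARP
```

With this in place, LoadBalancer Services receive a VIP from the pool instead of staying in `Pending`.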
ExternalName (DNS alias)
No ClusterIP, no proxying — purely a DNS CNAME redirect.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-database
spec:
  type: ExternalName
  externalName: mydb.abc123.ca-central-1.rds.amazonaws.com
```
Pod resolves my-database.default.svc.cluster.local
→ CoreDNS returns CNAME → mydb.abc123.ca-central-1.rds.amazonaws.com
→ Pod resolves and connects directly
Use cases:
- Reference external services with in-cluster DNS names
- Migration path — switch from ExternalName to ClusterIP when bringing a service in-cluster
- Cross-namespace references
Limitations:
- `externalName` must be a DNS name, not an IP address
- No port remapping, no health checks
- TLS/SNI issues: if the client does hostname verification, the in-cluster name won’t match the external certificate. For HTTPS services this can break TLS. For databases with `sslmode=verify-ca` (verifies the CA only, not the hostname) it’s fine; `sslmode=verify-full` would fail.
Headless Service (clusterIP: None)
No ClusterIP allocated. DNS returns individual pod IPs directly instead of a single virtual IP.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-svc
spec:
  clusterIP: None
  selector:
    app: my-app
```
Pod resolves my-svc.default.svc.cluster.local
→ CoreDNS returns A records: 10.0.2.10, 10.0.2.11, 10.0.2.12
→ Pod connects to one of them directly (client's choice)
No kube-proxy/eBPF interception — DNS returns the IPs and the pod connects directly.
Headless Services + StatefulSets
StatefulSets need headless Services because each pod has a stable identity and clients often need to reach a specific pod.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  clusterIP: None
  selector:
    app: mysql
  ports:
  - port: 3306
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql  # links to the headless service
  replicas: 3
```
The serviceName field tells Kubernetes to create per-pod DNS records:
# Service-level — returns all pod IPs
mysql.default.svc.cluster.local → 10.0.2.10, 10.0.2.11, 10.0.2.12
# Per-pod — stable DNS name for each pod
mysql-0.mysql.default.svc.cluster.local → 10.0.2.10
mysql-1.mysql.default.svc.cluster.local → 10.0.2.11
mysql-2.mysql.default.svc.cluster.local → 10.0.2.12
Clients can connect to a specific pod (mysql-0.mysql) or any pod (mysql). The service-level DNS returns all IPs — which one the client picks depends on the DNS client (not true load balancing).
A common pattern for databases is two Services:
```yaml
# Headless — for per-pod addressing
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  clusterIP: None
  selector:
    app: mysql
---
# Regular ClusterIP — for load-balanced reads
apiVersion: v1
kind: Service
metadata:
  name: mysql-read
spec:
  selector:
    app: mysql
    role: replica
  ports:
  - port: 3306
```
Writes go to mysql-0.mysql (specific primary), reads go to mysql-read (ClusterIP load-balances across replicas).
The concept of primary
Kubernetes itself has no concept of “primary” or “replica”. The pattern above assumes mysql-0 is the primary; this is a widely used convention that must be implemented at the application level.
StatefulSets guarantee ordered, stable pod identities. Setup scripts typically leverage the ordinal index (the number in the pod name) to assign roles:
```yaml
# Typically done in an initContainer or entrypoint script
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql
  replicas: 3
  template:
    spec:
      initContainers:
      - name: init-mysql
        image: mysql:8.0
        command:
        - bash
        - -c
        - |
          # Extract ordinal index from hostname
          ORDINAL=$(hostname | grep -o '[0-9]*$')
          if [ "$ORDINAL" -eq 0 ]; then
            # mysql-0 → configure as PRIMARY
            echo "[mysqld]" > /etc/mysql/conf.d/server.cnf
            echo "server-id=1" >> /etc/mysql/conf.d/server.cnf
            echo "log-bin=mysql-bin" >> /etc/mysql/conf.d/server.cnf
          else
            # mysql-1, mysql-2 → configure as REPLICA
            echo "[mysqld]" > /etc/mysql/conf.d/server.cnf
            echo "server-id=$((ORDINAL + 1))" >> /etc/mysql/conf.d/server.cnf
            echo "read-only=1" >> /etc/mysql/conf.d/server.cnf
            echo "super-read-only=1" >> /etc/mysql/conf.d/server.cnf
          fi
```
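The ordinal-to-role mapping can be sanity-checked outside the cluster; this sketch simulates the hostnames a 3-replica StatefulSet would produce:

```shell
#!/bin/sh
# Simulate StatefulSet pod hostnames and derive each pod's role
for host in mysql-0 mysql-1 mysql-2; do
  ordinal=$(echo "$host" | grep -o '[0-9]*$')
  if [ "$ordinal" -eq 0 ]; then
    echo "$host -> primary (server-id=1)"
  else
    echo "$host -> replica (server-id=$((ordinal + 1)))"
  fi
done
```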
The Key Chain of Responsibility
Kubernetes guarantees Application implements Clients assume
───────────────────── ────────────────────── ──────────────
• mysql-0 always exists → • mysql-0 runs as primary → • Writes → mysql-0.mysql
• Stable DNS names → • mysql-1,2 run as replicas→ • Reads → mysql-read
• Ordered creation → • Replication configured
A More Robust Alternative: Label-Based Selection
Rather than hardcoding mysql-0.mysql as the write endpoint, production setups often use labels + a dedicated primary service:
# Primary-only service (writes)
apiVersion: v1
kind: Service
metadata:
name: mysql-primary
spec:
selector:
app: mysql
role: primary # ← only the primary pod has this label
ports:
- port: 3306
---
# Replica-only service (reads)
apiVersion: v1
kind: Service
metadata:
name: mysql-read
spec:
selector:
app: mysql
role: replica
ports:
- port: 3306
The operator or sidecar manages the labels:
Normal operation:
mysql-0 → labels: {app: mysql, role: primary}
mysql-1 → labels: {app: mysql, role: replica}
mysql-2 → labels: {app: mysql, role: replica}
After failover (mysql-0 dies, mysql-1 promoted):
mysql-1 → labels: {app: mysql, role: primary} ← label changed
mysql-2 → labels: {app: mysql, role: replica}
This is exactly what MySQL Operator, Vitess, and similar operators do — they watch cluster state and shuffle labels so the services automatically route to the correct pod.
Headless Service with Manual Endpoints
For pointing at external IPs that aren’t managed by Kubernetes. Create a Service with no selector and a matching Endpoints object:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: external-db
spec:
  clusterIP: None
  ports:
  - port: 5432
---
apiVersion: v1
kind: Endpoints
metadata:
  name: external-db  # must match Service name
subsets:
- addresses:
  - ip: 10.0.5.20
  - ip: 10.0.6.21
  ports:
  - port: 5432
```
DNS returns the IPs directly. Use a regular ClusterIP Service (instead of headless) with manual Endpoints to get kube-proxy load balancing across the external IPs.
Use cases:
- External services that only have IPs (ExternalName requires a DNS name)
- Load balancing across multiple external endpoints
- Gradual migration — start with manual Endpoints, add a `selector` later when the service moves in-cluster
- Health-aware routing — update Endpoints to remove unhealthy IPs
For newer clusters, EndpointSlice is the preferred API:
```yaml
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: external-db-1
  labels:
    kubernetes.io/service-name: external-db
addressType: IPv4
endpoints:
- addresses: ["10.0.5.20"]
- addresses: ["10.0.6.21"]
ports:
- port: 5432
```
Ingress
Not a Service type — a separate resource that sits in front of multiple ClusterIP Services, routing by host/path through a shared ALB.
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
spec:
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-svc
            port:
              number: 80
      - path: /
        pathType: Prefix
        backend:
          service:
            name: frontend-svc
            port:
              number: 80
```
Client → ALB (single LB, shared)
→ /api → api-svc pods (IP target)
→ / → frontend-svc pods (IP target)
- One ALB handles many services — cost-efficient vs one LoadBalancer Service per service
- Layer 7 only (HTTP/HTTPS)
- TLS termination at the ALB
- Requires an Ingress controller (built into EKS Auto Mode, or AWS Load Balancer Controller in standard EKS)
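TLS termination and controller selection can be sketched as follows; the annotations follow AWS Load Balancer Controller conventions, and the certificate ARN is a placeholder:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS": 443}]'
    alb.ingress.kubernetes.io/certificate-arn: <acm-cert-arn>  # placeholder
spec:
  ingressClassName: alb   # selects the AWS Load Balancer Controller
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: frontend-svc
            port:
              number: 80
```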
Gateway API
The successor to Ingress. Splits concerns across multiple resources for better role separation.
GatewayClass → who provides the infrastructure (e.g., AWS ALB)
Gateway → the actual LB instance (ports, TLS, listeners)
HTTPRoute → routing rules (replaces Ingress rules)
```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: my-gateway
spec:
  gatewayClassName: amazon-alb
  listeners:
  - name: https
    protocol: HTTPS
    port: 443
    tls:
      certificateRefs:
      - name: my-cert
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: api-route
spec:
  parentRefs:
  - name: my-gateway
  hostnames:
  - "app.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /api
    backendRefs:
    - name: api-svc
      port: 80
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - name: frontend-svc
      port: 80
```
Advantages over Ingress:
- Role separation — platform team manages GatewayClass + Gateway, app teams manage HTTPRoutes
- Multi-protocol — supports TCP, UDP, gRPC, TLS natively via `TCPRoute`, `GRPCRoute`, etc.
- Cross-namespace routing — HTTPRoutes in different namespaces can attach to a shared Gateway
- Traffic splitting — native weighted routing for canary deployments
- Standardized — features expressed in the spec, not vendor-specific annotations
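Traffic splitting, for example, uses `weight` on backendRefs; a canary sketch sending roughly 10% of requests to a new version (service names illustrative):

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: canary-route
spec:
  parentRefs:
  - name: my-gateway
  rules:
  - backendRefs:
    - name: api-svc         # stable version gets ~90%
      port: 80
      weight: 90
    - name: api-svc-canary  # canary version gets ~10%
      port: 80
      weight: 10
```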
DNS in Kubernetes
DNS Hierarchy
CoreDNS serves the cluster.local domain. The full DNS structure:
<service>.<namespace>.svc.cluster.local → Service ClusterIP
<pod-name>.<service>.<namespace>.svc.cluster.local → Pod IP (StatefulSet + headless)
<pod-ip-dashed>.<namespace>.pod.cluster.local → Pod IP (e.g., 10-0-2-47.default.pod.cluster.local)
Within the same namespace, pods can use short names:
my-svc → works (same namespace)
my-svc.other-namespace → works (cross-namespace)
my-svc.other-namespace.svc.cluster.local → fully qualified
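Short names work because kubelet writes search domains into each pod's /etc/resolv.conf — roughly this, for a pod in the default namespace (the nameserver IP is the cluster DNS Service; value illustrative):

```
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
```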
DNS Resolution by Service Type
| Service Type | DNS Response |
|---|---|
| ClusterIP | A record → ClusterIP |
| Headless | A records → individual pod IPs |
| Headless + StatefulSet | A records for service + per-pod A records |
| ExternalName | CNAME → external DNS name |
| NodePort | A record → ClusterIP (same as ClusterIP) |
| LoadBalancer | A record → ClusterIP (internal); external DNS managed separately |
Service Discovery Methods
Kubernetes provides two ways for pods to discover services:
DNS (recommended) — pods resolve service names via CoreDNS. Works for all service types and updates automatically.
Environment variables — kubelet injects <SERVICE_NAME>_SERVICE_HOST and <SERVICE_NAME>_SERVICE_PORT into every pod. Only includes services that existed when the pod started. Doesn’t update if services change. Mainly a legacy mechanism.
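The injected variable prefix is the Service name uppercased with dashes turned into underscores; the transformation can be sketched in shell (service name hypothetical):

```shell
# Derive the env var prefix kubelet would use for a Service name:
# uppercase letters, dashes become underscores
svc=my-svc
prefix=$(echo "$svc" | tr 'a-z-' 'A-Z_')
echo "${prefix}_SERVICE_HOST"   # → MY_SVC_SERVICE_HOST
```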
Traffic Policies
Both traffic policies are per-Service fields on the Service spec:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-svc
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local  # or Cluster (default)
  internalTrafficPolicy: Local  # or Cluster (default)
```
externalTrafficPolicy
Applies only to NodePort and LoadBalancer Services (traffic originating from outside the cluster) and controls how that traffic is routed to pods.
Cluster (default) — traffic can be routed to pods on any node. If the pod isn’t on the receiving node, there’s an extra hop. Source IP is lost (SNAT’d to the node IP).
Client → Node A (no pod here) → SNAT → Node B (pod here)
Pod sees source IP: Node A's IP
Local — traffic is only routed to pods on the receiving node. No extra hop, source IP preserved. But if a node has no pods, traffic to that node is dropped.
Client → Node B (pod here) → Pod
Pod sees source IP: Client's real IP
| | Cluster | Local |
|---|---|---|
| Distribution | Even across all pods | Only to local pods |
| Source IP | Lost (SNAT) | Preserved |
| Extra hops | Possible | None |
| Risk | None | Dropped traffic if node has no pods |
With IP target mode in EKS, externalTrafficPolicy is less relevant since the LB routes directly to pods, bypassing nodes entirely.
internalTrafficPolicy
Applies to all service types (traffic originating from within the cluster). Controls how cluster-internal traffic is routed.
Cluster (default) — traffic can go to any pod backing the service, on any node.
Local — traffic is only routed to pods on the same node as the caller. Useful for node-local caches or DaemonSet services where you want each pod to talk to the local instance.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: node-cache
spec:
  internalTrafficPolicy: Local
  selector:
    app: cache
```
ExternalDNS
The cluster.local DNS is purely internal — CoreDNS only serves it to pods inside the cluster. External clients have no visibility into it. ExternalDNS bridges that gap by watching Kubernetes resources and automatically creating/updating DNS records in an external DNS provider.
How It Works
You create: Ingress with host: app.example.com → ALB provisioned
ExternalDNS: watches the Ingress, sees the ALB DNS name
→ creates DNS record: app.example.com → ALB DNS name
Without ExternalDNS, you’d manually create that DNS record every time you deploy a new Ingress or LoadBalancer Service.
What It Watches
- LoadBalancer Services — registers the LB’s external DNS name/IP
- Ingress resources — registers based on the `host` field
- Gateway API HTTPRoutes — registers based on `hostnames`
- NodePort Services — registers node IPs (less common)
Typical Setup with Route 53
ExternalDNS runs as a pod in the cluster with IAM permissions to manage Route 53:
```yaml
args:
- --source=ingress
- --source=service
- --provider=aws
- --domain-filter=example.com  # only manage this domain
- --policy=upsert-only         # safety: never delete records
- --txt-owner-id=my-cluster    # ownership tracking
```
The txt-owner-id creates TXT records alongside each DNS record to track ownership, preventing multiple clusters from clobbering each other’s records.
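ExternalDNS can also be driven directly from a Service via its hostname annotation (a documented external-dns convention; names and domain illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-svc
  annotations:
    external-dns.alpha.kubernetes.io/hostname: app.example.com  # record to create
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
  - port: 80
```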
End-to-End Flow
1. Deploy Ingress with host: app.example.com
2. AWS LB Controller provisions ALB → abc123.elb.amazonaws.com
3. ExternalDNS sees the Ingress, creates Route 53 record:
app.example.com → ALIAS → abc123.elb.amazonaws.com
4. External client resolves app.example.com → ALB → pods
Providers
ExternalDNS supports many providers — Route 53, CloudFlare, Google Cloud DNS, Azure DNS, and others.
It also supports CoreDNS as a provider, which is useful in on-prem or air-gapped environments without cloud DNS. In this setup, a separate CoreDNS instance runs on the network segment (outside the cluster) with a backend that supports dynamic updates (commonly etcd, though CoreDNS supports multiple backends including file-based zone files, the kubernetes API, and others). ExternalDNS writes records to the backend, and the external CoreDNS serves them to clients on the network — making Kubernetes service endpoints resolvable from outside the cluster without a cloud DNS provider.
Cluster: ExternalDNS watches Services/Ingresses → writes to backend (e.g., etcd)
Network: External CoreDNS reads from backend → serves DNS to on-prem clients
Summary
| Type | Scope | DNS | Load Balancing | Cost |
|---|---|---|---|---|
| ClusterIP | Internal only | svc.cluster.local → ClusterIP | kube-proxy/eBPF | Free |
| NodePort | External via node IP | No external DNS | kube-proxy/eBPF | Free |
| LoadBalancer | External via cloud LB | LB DNS name (manual Route 53) | Cloud LB → pods | 1 LB per service |
| ExternalName | DNS alias to external | CNAME to external name | None | Free |
| Headless | Internal, direct pod IPs | A records → pod IPs | None (client chooses) | Free |
| Ingress | External via shared ALB | 1 ALB DNS (manual Route 53) | ALB path/host routing → pods | 1 ALB shared |
| Gateway API | External via shared LB | Same as Ingress | Same, with more control | 1 LB shared |
The progression: ClusterIP (internal) → NodePort (expose on nodes) → LoadBalancer (cloud LB per service) → Ingress/Gateway API (shared LB, smart routing).
Cross-References
Related: Network Infrastructure Deep Dive
See Kubernetes Network Infrastructure — L2, BGP, and On-Prem Load Balancing for:
- L2 Announcements — how VIPs work via ARP, failover via gratuitous ARP, broadcast domain limitations
- BGP Route Advertisement — peering, ECMP multi-node load balancing, route propagation
- MetalLB vs Cilium vs LoxiLB — on-prem load balancer comparison (control plane vs data plane)
- Overlay vs Native CNI — why Flannel pod IPs aren’t externally routable, contrast with VPC CNI
- External LBs (NGINX/HAProxy) — service discovery via Consul, static config, external ingress controllers
- Protocol deep dives — ARP mechanics, BGP message types, SCTP