Routing, NAT, and Firewalls

Routing decides the next hop. NAT rewrites addresses or ports. Firewalls enforce policy. Most production network failures involve these three together, especially when return paths are asymmetric or connection tracking state is exhausted.

Core Commands

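ip route                      # routes in the main table
ip route get 198.51.100.10    # the route the kernel would pick for this destination
ip rule                       # policy rules, in priority order
nft list ruleset              # the full nftables ruleset, filter and NAT
conntrack -S                  # connection tracking counters, including drops and failed inserts
conntrack -L | head           # a sample of currently tracked connections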

Routing

Linux chooses routes by longest-prefix match within the selected routing table. Policy rules can select alternate tables based on source address, fwmark, TOS, interface, or other packet attributes.
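Longest-prefix match can be sketched in a few lines. This is an illustration only, not how the kernel stores routes: the function name longest_prefix_match and the sample table are hypothetical, and route metrics are ignored.

```python
import ipaddress

def longest_prefix_match(routes, destination):
    """Return (network, next_hop) for the most specific matching prefix, or None."""
    dst = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in routes:
        net = ipaddress.ip_network(prefix)
        # A route matches when the destination falls inside its prefix;
        # among matches, the longer prefix wins.
        if dst in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best

# Hypothetical table: a default route plus two more specific prefixes.
main_table = [
    ("0.0.0.0/0", "192.0.2.1"),
    ("198.51.100.0/24", "192.0.2.2"),
    ("198.51.100.0/28", "192.0.2.3"),
]

net, hop = longest_prefix_match(main_table, "198.51.100.10")
print(net, "via", hop)  # 198.51.100.0/28 via 192.0.2.3
```

The /28 wins over the /24 and the default because it is the most specific prefix that still contains 198.51.100.10.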

Key ideas:

  • local routes live in the local table, which the default rules consult before main,
  • the default route is the fallback,
  • policy routing can make ip route alone misleading,
  • source address selection affects return traffic,
  • ECMP can split flows across next hops.
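Why ip route alone can mislead is easier to see with a toy model of rule evaluation. The sketch below assumes rules that match only on source prefix and omits the local table; the names route, lookup, rules, and tables are hypothetical.

```python
import ipaddress

def lookup(table, dst):
    """Longest-prefix match inside one table; returns a next hop or None."""
    addr = ipaddress.ip_address(dst)
    matches = [(ipaddress.ip_network(p), hop) for p, hop in table
               if addr in ipaddress.ip_network(p)]
    return max(matches, key=lambda m: m[0].prefixlen)[1] if matches else None

def route(rules, tables, src, dst):
    """Walk rules in ascending priority; the first table that yields a route wins."""
    for priority, selector, table_name in sorted(rules, key=lambda r: r[0]):
        # A selector of None stands for "from all".
        if selector is None or ipaddress.ip_address(src) in ipaddress.ip_network(selector):
            hop = lookup(tables[table_name], dst)
            if hop is not None:
                return table_name, hop
            # No route in this table: fall through to the next rule.
    return None

tables = {
    "200": [("0.0.0.0/0", "10.10.20.1")],
    "main": [("0.0.0.0/0", "192.0.2.1")],
}
rules = [(1000, "10.10.20.0/24", "200"), (32766, None, "main")]

print(route(rules, tables, "10.10.20.50", "198.51.100.10"))  # ('200', '10.10.20.1')
print(route(rules, tables, "192.0.2.50", "198.51.100.10"))   # ('main', '192.0.2.1')
```

The same destination gets two different next hops depending on the source, which is why ip route get with an explicit from is the more reliable check.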

Route Lookup and Policy Routing Example

flowchart LR
  Packet[Packet: src, dst, mark, input interface] --> Rules[ip rule priority order]
  Rules --> Local[local table]
  Rules --> Main[main table]
  Rules --> Custom[custom table by source or fwmark]
  Main --> LPM[longest-prefix match]
  Custom --> LPM
  LPM --> NextHop[next hop and egress interface]
  NextHop --> Neigh[ARP/NDP neighbor lookup]
  Neigh --> Output[egress qdisc and interface]

Policy routing lab:

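# Send traffic sourced from 10.10.20.0/24 to table 200.
ip rule add from 10.10.20.0/24 table 200 priority 1000
# Give table 200 its own default route.
ip route add default via 10.10.20.1 dev eth1 table 200
# Ask which route is picked for that source (10.10.20.50 must be a local address; for forwarded traffic add iif).
ip route get 198.51.100.10 from 10.10.20.50
# Confirm the rule and the table contents.
ip rule show
ip route show table 200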

Interpretation:

| Observation | Meaning |
| --- | --- |
| ip route shows one default but ip route get ... from ... uses another | A policy rule selected a non-main table. |
| Forward path works but replies leave a different interface | Source routing, asymmetric routes, or SNAT state is mismatched. |
| Route exists but first packet stalls | Neighbor lookup, ARP/NDP filtering, or gateway reachability is failing. |
| ECMP route exists but only some flows fail | One next hop or hash bucket is bad; vary source ports during tests. |
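The advice to vary source ports follows from how per-flow hashing works. The sketch below assumes a 5-tuple (L4) hash and uses SHA-256 purely for illustration; real routers use their own hash functions, and Linux hashes on L3 fields only unless net.ipv4.fib_multipath_hash_policy selects L4. The name pick_next_hop and the addresses are hypothetical.

```python
import hashlib

def pick_next_hop(next_hops, src, dst, proto, sport, dport):
    """Map a flow's 5-tuple to one next hop; the same flow always gets the same hop."""
    key = f"{src}|{dst}|{proto}|{sport}|{dport}".encode()
    digest = hashlib.sha256(key).digest()
    bucket = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[bucket]

next_hops = ["192.0.2.1", "192.0.2.2"]

# Probe the same destination from a range of source ports and group by chosen hop.
seen = {}
for sport in range(40000, 40016):
    hop = pick_next_hop(next_hops, "10.10.20.50", "198.51.100.10", "tcp", sport, 443)
    seen.setdefault(hop, []).append(sport)

for hop, ports in sorted(seen.items()):
    print(hop, ports)
```

A test that reuses one source port exercises one bucket only. If failures correlate with the ports that land on a single next hop, that hop is the likely suspect.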

Routers often mark the first routed boundary that DHCP has to cross. DHCPv4 discovery starts as a local broadcast, which routers do not forward, so a relay agent (often the router interface itself) must forward requests to DHCP servers on other subnets. The relay records the address of the interface that received the request in giaddr, and the server uses that address to choose the right scope. A routing device can therefore break DHCP even while ordinary routed traffic works.
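Scope selection by giaddr can be sketched as a subnet lookup. This is a simplified model under stated assumptions: select_scope and the scope names are hypothetical, and real DHCP servers may weigh further inputs that are ignored here.

```python
import ipaddress

def select_scope(scopes, giaddr, receiving_interface_ip):
    """Pick the scope containing giaddr, or the receiving interface when giaddr is unset."""
    key = ipaddress.ip_address(giaddr)
    if key == ipaddress.ip_address("0.0.0.0"):
        # No relay was involved: the client is on a directly attached subnet.
        key = ipaddress.ip_address(receiving_interface_ip)
    for name, subnet in scopes.items():
        if key in ipaddress.ip_network(subnet):
            return name
    return None  # no scope matches, so no lease can be offered

scopes = {
    "office": "10.10.20.0/24",
    "lab": "10.10.30.0/24",
    "server-lan": "192.0.2.0/24",
}

print(select_scope(scopes, "10.10.20.1", "192.0.2.10"))  # office
print(select_scope(scopes, "0.0.0.0", "192.0.2.10"))     # server-lan
print(select_scope(scopes, "10.99.0.1", "192.0.2.10"))   # None
```

The last case is the failure mode worth remembering: a relay that stamps an address from a subnet the server has no scope for gets no offer, even though routing between relay and server is fine.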

Router-provided host configuration commonly arrives through DHCP options:

  • router/default gateway option,
  • DNS server option,
  • domain/search suffix,
  • classless static routes,
  • PXE/TFTP boot information.

NAT

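SNAT rewrites the source address or port. DNAT rewrites the destination. Masquerade is SNAT that takes its address from the egress interface when the flow is created, which suits interfaces whose address changes. On Linux, NAT is built on connection tracking: the first packet of a flow creates the mapping, and later packets in both directions are translated from that state.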
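A toy source NAT shows how the mapping and its reverse lookup fit together. This is a sketch, not the netfilter algorithm: the class SnatTable, its port-picking strategy (lowest free port per destination), and the addresses are all assumptions made for illustration.

```python
class SnatTable:
    """Toy source NAT keyed on the full flow tuple, with a reverse map for replies."""

    def __init__(self, public_ip, port_range=(1024, 65535)):
        self.public_ip = public_ip
        self.ports = range(port_range[0], port_range[1] + 1)
        self.forward = {}  # (proto, src, sport, dst, dport) -> translated source port
        self.reverse = {}  # (proto, remote ip, remote port, public port) -> (src, sport)

    def outbound(self, proto, src, sport, dst, dport):
        key = (proto, src, sport, dst, dport)
        if key not in self.forward:
            # A public port only has to be unique per remote endpoint.
            for port in self.ports:
                if (proto, dst, dport, port) not in self.reverse:
                    break
            else:
                raise RuntimeError("no free source port for this destination")
            self.forward[key] = port
            self.reverse[(proto, dst, dport, port)] = (src, sport)
        return (self.public_ip, self.forward[key], dst, dport)

    def inbound(self, proto, src, sport, dport):
        """Translate a reply back to the inside host; None means no state exists."""
        return self.reverse.get((proto, src, sport, dport))

nat = SnatTable("203.0.113.7", port_range=(40000, 40001))
print(nat.outbound("tcp", "10.10.20.50", 51000, "198.51.100.10", 443))
# ('203.0.113.7', 40000, '198.51.100.10', 443)
print(nat.inbound("tcp", "198.51.100.10", 443, 40000))  # ('10.10.20.50', 51000)
print(nat.inbound("tcp", "198.51.100.10", 443, 40005))  # None
```

A reply that reaches a device without the matching entry gets None, which is the toy equivalent of a return path that bypasses the NAT device.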

NAT pitfalls:

  • ephemeral port exhaustion,
  • conntrack table full,
  • idle timeout too short,
  • source IP loss at the application layer,
  • hairpin NAT behavior differs by platform,
  • return path must pass the same stateful device unless routing is designed otherwise.
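Port exhaustion is largely arithmetic: a public address offers a fixed number of source ports toward any one destination endpoint, and each mapping holds its port until it expires. The sketch below is a back-of-envelope estimate; the port range and the 120-second hold time are assumed values, not measurements.

```python
def max_new_flows_per_second(port_count, hold_seconds, public_ips=1):
    """Upper bound on sustained new flows to one destination endpoint."""
    return port_count * public_ips / hold_seconds

ports = 65535 - 1024 + 1  # 64512 usable ports in the assumed range

print(max_new_flows_per_second(ports, 120))                # 537.6
print(max_new_flows_per_second(ports, 120, public_ips=4))  # 2150.4
```

Short-lived connections to a single busy backend hit this ceiling first. Adding public addresses or shortening how long closed flows hold their ports both raise it.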

Firewalls

Stateful firewalls make decisions using connection state, interfaces, addresses, ports, and sometimes marks or zones. When a packet is dropped, the reason may be policy, state mismatch, invalid conntrack state, reverse path filtering, or rate limiting.
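Returning the reason along with the verdict makes the different drop causes visible. The sketch below is a simplified model, not any product's rule engine: the function decide, the packet fields, and the rule format are hypothetical, and only TCP state is modelled.

```python
def decide(packet, established, allow_rules):
    """Return (verdict, reason) for one packet, updating state for accepted new flows."""
    flow = (packet["proto"], packet["src"], packet["sport"], packet["dst"], packet["dport"])
    reply = (packet["proto"], packet["dst"], packet["dport"], packet["src"], packet["sport"])

    # 1. Known flows, in either direction, are accepted from state.
    if flow in established or reply in established:
        return "accept", "established"

    # 2. A TCP packet without state must be a SYN; anything else is invalid.
    if packet["proto"] == "tcp" and packet.get("flags") != "SYN":
        return "drop", "invalid state"

    # 3. Only new flows are checked against policy.
    for rule in allow_rules:
        if (rule["iif"], rule["proto"], rule["dport"]) == (
                packet["iif"], packet["proto"], packet["dport"]):
            established.add(flow)
            return "accept", "policy"
    return "drop", "policy"

established = set()
rules = [{"iif": "eth0", "proto": "tcp", "dport": 443}]

syn = {"proto": "tcp", "src": "198.51.100.10", "sport": 51000,
       "dst": "10.10.20.50", "dport": 443, "iif": "eth0", "flags": "SYN"}
stray_ack = dict(syn, src="203.0.113.9", flags="ACK")

print(decide(syn, established, rules))        # ('accept', 'policy')
print(decide(stray_ack, established, rules))  # ('drop', 'invalid state')
```

The stray ACK targets an allowed port and still drops, because it has no state. This is what an asymmetric return path looks like from the firewall that never saw the SYN.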

Load Balancers

Load balancers often add their own NAT and health-check behavior. Health checks prove only the configured check path, not the full user path. L7 load balancers can fail on Host headers, SNI, HTTP versions, or backend connection reuse even when TCP is fine.

Study Cards

Question

What is longest-prefix match?

Answer

The most specific matching route prefix wins over broader routes.

Question

Why can policy routing make ip route misleading?

Answer

Rules can select routing tables other than the main table based on packet attributes.

Question

Why does stateful NAT care about return paths?

Answer

The return packet must match connection-tracking state or the NAT device cannot translate it correctly.

References