Over the past few months, I have been configuring a replacement multi-WAN NAT router/firewall for work. My colleagues and I decided to use Voyage Linux (a derivative of Debian Linux for embedded devices) on a Soekris net4801 box. See also the pictures on Joe’s post.
Unlike organizations that use their multi-WAN connections for automatic load balancing and traffic shaping, we simply use our extra WAN connection for redundancy. Both connections DNAT to the same internal server, which is therefore reachable through two distinct external IP addresses. The idea is that users can access the server using either IP address, though they might normally prefer one over the other. Users can switch to the other connection should the one they are using give less than optimal results. Automatic load balancing is neither required nor desired.
In simple terms, our network looks something like this:
[Figure: network diagram]
As part of this configuration, we wanted network traffic that came in on one interface to exit again through that same interface. I was able to configure most of the firewall and NAT parts of the router with relative ease using iptables, but was stumped when it came to the routing table and how to route packets in and out of their own respective interfaces.
The Problem Defined
Traditional routing tables generally only allow for one default gateway at a time. Multiple default gateways have to be specified in priority sequence. Thus, there is no guarantee that an incoming packet on one line will receive a reply routed back through that same interface. At best, the return packet will go out the default gateway or some other static gateway defined according to the routing table. Furthermore, traditional routing tables only allow destination based routing. That is, we can create specific routing entries to dictate a route given a destination address but not based on the source address.
After some research, I discovered that iproute2 solves a lot of my problems. Amongst many other things, iproute2 allows for source-based routing and also for routing based on packet marks. More on packet marks later.
Be warned though, iproute2 is a rather complex beast! The user manual is far from friendly, and it took me a few tries to get it to do what I wanted.
Attempt 1: Source based routing
My first attempt at my problem involved source-based routing. Since users will be using WAN connection #1 for this server most of the time, route all traffic originating from the server’s IP address out WAN connection #1. This works, but it requires a manual change to the routing policy when WAN connection #1 goes down. An administrator would have to switch the source-based routing rule so that all traffic originating from the server’s IP address goes out WAN connection #2 instead.
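As a rough illustration of this first attempt, here is a minimal sketch. It assumes the routing tables wan_one and wan_two already exist as set up later in this article. The server address 172.16.1.10, the priority 1000, and the helper names run, route_server_via and switch_server_to are made up for the example. By default the script only prints the commands; setting APPLY=1 as root would run them.

```shell
#!/bin/sh
# Sketch of source-based routing. SERVER_IP is a made-up example address
# on the 172.16.1.0/24 LAN; substitute the real internal server address.
SERVER_IP="${SERVER_IP:-172.16.1.10}"

# Print commands by default; set APPLY=1 (as root) to actually run them.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Send everything that originates from the server to the given table.
# The priority 1000 is an arbitrary choice for this example.
route_server_via() {
    run ip rule add from "$SERVER_IP" table "$1" prio 1000
}

# Manual failover: drop the old rule, then point the server at the other table.
switch_server_to() {
    run ip rule del from "$SERVER_IP" prio 1000
    route_server_via "$1"
}

route_server_via wan_one
```

The switch_server_to helper is exactly the manual step that makes this approach unattractive: somebody has to notice the outage and run it.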
Wouldn’t it be simpler if the router could somehow just remember what connection the packet came in and route subsequent replies through that same interface?
Attempt 2: Packet marking based routing
My second attempt at my problem centered around being able to track connections and routing accordingly. To do this, I discovered iptables’ packet marking and connection marking.
In short, iptables has two types of targets that one can use to mark packets: CONNMARK and MARK. CONNMARK marks a connection. Once marked, packets in the same “conversation” are also marked with the same CONNMARK indicator.
The other marker is the packet mark, set by iptables’ MARK target. (Couldn’t they have come up with better names?!) The MARK target only marks individual packets. These marks are not persistent like the connmark indicators, i.e. they only retain their value for the duration of that one packet’s lifespan inside the router.
Now when I first went diving into this, I erroneously thought that one could simply set the CONNMARK when a packet came in one WAN line, and have the routing tables detect that connmark and route accordingly. As I soon discovered though, iproute2 only recognizes packet MARKs not CONNMARKs. Thus, to do what I wanted, the CONNMARK value had to be copied to the MARK value each time a packet was about to be routed.
Solution Part 1: Configuring the mangle table in iptables
Given the above restrictions with CONNMARK and MARK, I wrote out in plain English the steps I wanted my router to take when marking and routing packets.
- If this is the first packet in a connection (i.e. it doesn’t have a CONNMARK nor a MARK) then, set the MARK of the packet to 1 or 2 depending on which line it came in. Save this MARK to the CONNMARK value and accept the packet for routing.
- If, however, a CONNMARK does exist, then restore that CONNMARK to the MARK value. Check to see what the MARK value is. If it is 1 or 2, then ACCEPT the packet for routing.
Once the packet is accepted for routing, route it based on these rules:
- If the packet has a MARK value of 1 then use the routing table for WAN connection #1.
- Else if the packet has a MARK value of 2, then use the routing table for WAN connection #2.
Now that you understand the English algorithm, I will translate it into pseudocode in the same order in which it must appear in iptables’ mangle table:
- Restore the packet’s CONNMARK to the MARK. (If one doesn’t exist, then no mark is set.)
- If packet MARK is 1, then it means that there is already a connection mark and the original packet came in on WAN #1, so ACCEPT.
- Else, we need to mark the packet. If the packet is incoming on eth1 then set MARK to 1
- If packet MARK is 2, then it means there is already a connection mark and the original packet came in on WAN #2, so ACCEPT.
- Else, we need to mark the packet. If the packet is incoming on eth2 then set MARK to 2
- Save MARK to CONNMARK. This rule is reached only if neither of the ACCEPT rules (2 and 4) matched. Any new mark written by rule 3 or rule 5 is saved here to the connection mark indicator.
Finally, the actual iptables commands:
iptables -A PREROUTING -t mangle -j CONNMARK --restore-mark
iptables -A PREROUTING -t mangle --match mark --mark 1 -j ACCEPT
iptables -A PREROUTING -t mangle -i eth1 -j MARK --set-mark 1
iptables -A PREROUTING -t mangle --match mark --mark 2 -j ACCEPT
iptables -A PREROUTING -t mangle -i eth2 -j MARK --set-mark 2
iptables -A PREROUTING -t mangle -j CONNMARK --save-mark
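To see how the order of these six rules plays out, the following is a pure-shell walk-through that touches no real packets. The function simulate is a made-up helper for illustration. It models only the stored connection mark and the incoming interface, and it assumes eth0 is the LAN side as in the routing tables later in this article.

```shell
#!/bin/sh
# Trace one packet through the six PREROUTING mangle rules.
# $1 = CONNMARK already stored for the connection (0 if none)
# $2 = interface the packet arrived on
simulate() {
    connmark=$1
    iface=$2
    mark=$connmark                         # rule 1: --restore-mark
    if [ "$mark" -ne 1 ]; then             # rule 2: mark 1 means ACCEPT, stop here
        if [ "$iface" = "eth1" ]; then     # rule 3: arrived on WAN #1
            mark=1
        fi
        if [ "$mark" -ne 2 ]; then         # rule 4: mark 2 means ACCEPT, stop here
            if [ "$iface" = "eth2" ]; then # rule 5: arrived on WAN #2
                mark=2
            fi
            connmark=$mark                 # rule 6: --save-mark
        fi
    fi
    echo "mark=$mark connmark=$connmark"
}

simulate 0 eth1   # new connection from WAN #1  -> mark=1 connmark=1
simulate 1 eth0   # server reply, same session  -> mark=1 connmark=1
simulate 0 eth2   # new connection from WAN #2  -> mark=2 connmark=2
simulate 2 eth0   # server reply, same session  -> mark=2 connmark=2
simulate 0 eth0   # new connection from the LAN -> mark=0 connmark=0
```

The second and fourth calls are the interesting ones: the reply arrives on the LAN interface, yet it inherits the mark of the WAN line that the first packet came in on.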
Solution Part 2: Configuring iproute2 to route according to the packet markers
Now that connections and packets are marked as they come in, we need to instruct the router to route according to the mark on each packet. This is done using the routing policy database available in iproute2. In essence, this database holds a list of rules which, when matched, tell the router to consult a specific routing table rather than the default one. In this way, we can define a rule that says: when the packet has a mark value of “1”, use the wan_one routing table. Similarly, if the packet has a mark value of “2”, use the wan_two routing table.
Several things need to be done in order to put all this together:
1. Modify the file /etc/iproute2/rt_tables.
2. Add two custom tables at the bottom of the file. Number the table numbers similar to your packet marker numbers for simplicity.
myrouter:/etc/iproute2# more rt_tables
# reserved values
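A minimal sketch of step 2, assuming the two tables are numbered 1 and 2 to match the packet marks. The names wan_one and wan_two are the ones used throughout this article; the helper add_table and the scratch file are made up for the example. The helper appends an entry only when the name is not already present, so it can be re-run safely.

```shell
#!/bin/sh
# Append "<number> <name>" to an rt_tables file unless the name is already there.
# $1 = table number, $2 = table name, $3 = file (defaults to the real one)
add_table() {
    file="${3:-/etc/iproute2/rt_tables}"
    if ! grep -Eq "[[:space:]]$2([[:space:]]|\$)" "$file" 2>/dev/null; then
        printf '%s\t%s\n' "$1" "$2" >> "$file"
    fi
}

# Try it on a scratch file first; as root, omit the third argument
# to edit the real /etc/iproute2/rt_tables.
scratch="$(mktemp)"
add_table 1 wan_one "$scratch"
add_table 2 wan_two "$scratch"
cat "$scratch"
```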
3. Define each routing table (wan_one and wan_two) by adding the routes specific to that connection. Note, however, that you must also add routes for other traffic (notably packets destined for the local LAN). This is because once a route in the special routing table matches, and its default route always will, the routing process does not consult your default routing table anymore. This is what I have in my two routing tables:
myrouter:/etc/iproute2# ip route show table wan_one
172.16.1.0/24 dev eth0 scope link
default via 184.108.40.206 dev eth1
myrouter:/etc/iproute2# ip route show table wan_two
172.16.1.0/24 dev eth0 scope link
default via 220.127.116.11 dev eth2
These are the commands I entered to get the routing tables above:
ip route add 172.16.1.0/24 dev eth0 table wan_one
ip route add default via 18.104.22.168 dev eth1 table wan_one
ip route add 172.16.1.0/24 dev eth0 table wan_two
ip route add default via 22.214.171.124 dev eth2 table wan_two
4. Next, you must define the iproute2 rules that will tell iproute2 to use the special routing tables. Do this by issuing the following commands:
ip rule add fwmark 1 table wan_one prio 1024
ip rule add fwmark 2 table wan_two prio 1025
Note: the prio (priority) numbers are simply there to ensure that they get placed in the right order and relatively near the top of the rules. You may need to adjust this number if you have other rules in your policy database.
You can verify that the rules were entered correctly by issuing an ip rule show command.
myrouter:/usr/local/sbin# ip rule show
0: from all lookup local
1024: from all fwmark 0x1 lookup wan_one
1025: from all fwmark 0x2 lookup wan_two
32766: from all lookup main
32767: from all lookup default
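If the rules are added from a boot script, it can help to check for an existing rule first, since re-running ip rule add may, depending on the kernel version, either stack a duplicate rule or fail with an error. The sketch below is one way to do that. The helper has_fwmark_rule is made up for the example and simply matches the “fwmark 0x1 lookup wan_one” format shown in the listing above.

```shell
#!/bin/sh
# Succeeds if the rule listing on stdin already maps fwmark $1 to table $2.
# Relies on the "fwmark 0x1 lookup wan_one" format printed by "ip rule show".
has_fwmark_rule() {
    want="$(printf 'fwmark 0x%x lookup %s' "$1" "$2")"
    grep -Eq "$want(\$|[[:space:]])"
}

# Intended use (needs root, so it is left commented out here):
# ip rule show | has_fwmark_rule 1 wan_one || ip rule add fwmark 1 table wan_one prio 1024
# ip rule show | has_fwmark_rule 2 wan_two || ip rule add fwmark 2 table wan_two prio 1025
```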
5. Add a default gateway to the default routing table (the one listed as main in the ip rule show output above) to define the path that unmarked packets must take.
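A sketch of this last step follows. The gateway 192.0.2.1 is a documentation-only placeholder rather than a real ISP gateway, and choosing eth1 (WAN #1) as the preferred exit for unmarked traffic is an assumption. The helper names run and set_main_default are made up. As written the script only prints the command; APPLY=1 as root would run it.

```shell
#!/bin/sh
# Placeholder values: replace with the real ISP gateway and preferred WAN device.
DEFAULT_GW="${DEFAULT_GW:-192.0.2.1}"
DEFAULT_DEV="${DEFAULT_DEV:-eth1}"

# Print commands by default; set APPLY=1 (as root) to actually run them.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

set_main_default() {
    # "replace" adds the default route, or overwrites one that already exists.
    # With no "table" argument the route goes into the main table.
    run ip route replace default via "$DEFAULT_GW" dev "$DEFAULT_DEV"
}

set_main_default
```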
You’re done! Packets coming in on WAN connection #1 should now be marked with 1 and then routed according to table wan_one. The same goes for WAN connection #2 and table wan_two.
A few interesting notes in addition:
- I have not described any of the firewalling or NAT processes here. Obviously you need to have these set up and tested correctly before doing the CONNMARKing and MARKing.
- Packets originating from inside the LAN will not receive a connection mark at first, and thus will fall through to the default routing table. They will route out the default gateway specified there. However, the first ack packet and every subsequent related packet should receive a connection mark, and follow one of the special routing tables.
- Because of this peculiar behaviour for packets originating from inside the LAN, and because of the nature of network address translation, it is necessary to explicitly state the ISP’s gateway in each of the default rules in the special tables. In other words, it is not enough to simply put “ip route add default dev eth2 table wan_two”. Instead, this should be issued: “ip route add default via 126.96.36.199 dev eth2 table wan_two”.
- Debugging the above solution can be a bit of a pain. I found that the iptables (mangling) part of the exercise can be checked relatively easily through logging and the “iptables -L --line-numbers -n -v -t mangle” command, but I found no equivalent functionality in iproute2. This, probably more than anything else, caused grief when things weren’t working.
- I have posted an addendum to this article which includes a few important details left out in this article.
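On the debugging point above, one partial aid on the routing side is ip route get, which asks the kernel which route it would pick for a given destination. This sketch assumes an iproute2 version that accepts the mark keyword on that command. The destination 198.51.100.7 is a documentation-only placeholder, and probe_cmd is a made-up helper that just builds the command line.

```shell
#!/bin/sh
# Build the command that asks the kernel which route a packet
# carrying a given fwmark would take.
# $1 = mark value, $2 = destination address
probe_cmd() {
    echo "ip route get $2 mark $1"
}

# 198.51.100.7 is a placeholder destination, not a real host.
for m in 1 2; do
    cmd="$(probe_cmd "$m" 198.51.100.7)"
    echo "# $cmd"
    # Only query the kernel when the ip tool is present, and ignore failures
    # (for example when this machine has no matching route).
    if command -v ip >/dev/null 2>&1; then
        $cmd 2>/dev/null || true
    fi
done
```

On the router described here, the answer for mark 1 should name eth1 and the answer for mark 2 should name eth2. If both come back with the same device, the fwmark rules or the special tables are the place to look.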