Monday, December 12, 2011

Linux bridging will drop too big packets marked with DF

It should be obvious, but Linux will drop packets on the floor if you try to bridge packets marked Don't Fragment (DF) that are too big for the outgoing interface. If you set an MTU on a bridge, the setting propagates to all attached bridge ports. A Linux bridge is a simple L2 device, even though you can assign an IP address to a virtual host port on the bridge. Since it does not actually route packets, there's no place to generate the ICMP Fragmentation Needed (Destination Unreachable) error messages that Path MTU Discovery needs to converge on the right value. As long as you route instead, you're fine.
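The difference between the two hops can be sketched as follows. This is a conceptual model, not kernel code: it just captures what each kind of device does with an oversized packet depending on the DF bit.

```python
# Conceptual sketch (not kernel code): how an L2 bridge vs. an L3
# router handles a packet larger than the egress MTU.

def forward(packet_len, df_set, egress_mtu, is_router):
    """Return what happens to the packet at this hop."""
    if packet_len <= egress_mtu:
        return "forwarded"
    if not df_set:
        return "fragmented and forwarded"
    if is_router:
        # An L3 hop can tell the sender to shrink its packets,
        # which lets Path MTU Discovery converge.
        return "dropped, ICMP Fragmentation Needed sent"
    # An L2 bridge has no place to source ICMP from,
    # so the packet just vanishes.
    return "dropped silently"

print(forward(1500, True, 1400, is_router=False))  # dropped silently
print(forward(1500, True, 1400, is_router=True))   # dropped, ICMP Fragmentation Needed sent
```

The sender only ever learns the smaller MTU in the router case; in the bridge case its retransmits of full-sized segments are dropped forever.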

TCP streams such as HTTP and SSH data transfers between my laptop and my home workstation have the DF bit set, because RFC 1191 relies on it for proper Path MTU Discovery. As my OpenWrt-based router had a lower MTU than the default of 1500, the connection stalled as soon as there was bulk data to transmit; interactive SSH sessions worked just fine. An easy way to debug this is to run tracepath(8) against the target. It will stop at the first hop that silently drops packets instead of emitting the proper error messages.
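You can also confirm a suspected path MTU by hand with `ping -M do` (which forces DF on Linux), but you have to account for the 28 bytes of IPv4 and ICMP headers when choosing the payload size. A tiny helper, with the MTU values as examples only:

```python
IP_HEADER = 20    # IPv4 header without options
ICMP_HEADER = 8   # ICMP echo header

def ping_payload_for_mtu(mtu):
    """Largest `ping -s` payload that still fits in one frame of `mtu`."""
    return mtu - IP_HEADER - ICMP_HEADER

print(ping_payload_for_mtu(1500))  # 1472
print(ping_payload_for_mtu(1452))  # 1424
```

So `ping -M do -s 1472 host` probes for a full 1500-byte path: if some hop's MTU is lower, a routed hop answers with Fragmentation Needed, while a bridged hop just lets the ping time out.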

Sunday, November 20, 2011

Slots on Cisco ASA 5550

According to the Cisco product description of the ASA 5550, the appliance has a maximum throughput of "up to 1.2 Gbps". It's pretty common for Cisco to measure throughput in interesting ways: mostly they add up RX and TX and quote the combined bandwidth. In this case that means you're likely capped at somewhere around 600 Mbps of bi-directional traffic.
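The back-of-the-envelope math, assuming the quoted figure really is RX + TX combined:

```python
# Cisco's headline "up to 1.2 Gbps" figure, read as RX + TX combined.
combined_gbps = 1.2

# Under symmetric full-duplex load, each direction gets half.
per_direction_mbps = combined_gbps * 1000 / 2
print(per_direction_mbps)  # 600.0
```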

One thing you really have to watch out for, to get more out of that particular appliance, is balancing the traffic across its two slots. You can issue show traffic on the CLI, which shows the current balance at the end of its output. If you're only using one of the two slots and CPU usage is at 100%, you'll experience packet loss despite the fact that there's still free capacity: it seems the CPU spends too much time waiting for queues to drain when it has to send traffic back out the same slot it came in on. With both slots in use, CPU usage went back down to 50%, which is much more reasonable.
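To make "balance" concrete, here is a small sketch that computes each slot's share of the total load. The throughput numbers are made up for illustration; in practice you'd read them off the end of the show traffic output.

```python
def slot_balance(slot0_mbps, slot1_mbps):
    """Fraction of total traffic carried by each slot.

    The inputs are hypothetical per-slot throughput figures, standing
    in for the counters the appliance reports.
    """
    total = slot0_mbps + slot1_mbps
    return slot0_mbps / total, slot1_mbps / total

# Badly skewed: nearly everything lands on slot 0.
print(slot_balance(580.0, 20.0))
# What you want to see after rebalancing.
print(slot_balance(300.0, 300.0))  # (0.5, 0.5)
```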