HowTo - tell tcpdump to filter mixed tagged and untagged VLAN (IEEE 802.1Q) traffic
A week ago I needed to filter VLAN traffic with tcpdump. Everything went well as long as the input
contained *only* tagged or *only* untagged traffic. However, when I tried to filter, say, UDP packets
out of traffic that contains both tagged and untagged packets, tcpdump did not apply my filters the
way I expected. Since other people will probably run into the same situation, here are some notes for
nerds struggling with the same issue in the future.
Example doomed to fail with mixed traffic:
tcpdump -nn -v udp
This simple BPF filter should deliver all UDP packets, regardless of whether the traffic carries a VLAN tag or not. But: it doesn't. The issue is that tagging inserts four more bytes into the Ethernet header: the IEEE 802.1Q tag, which consists of a two-byte tag protocol identifier (0x8100) followed by two bytes of tag control information that include the 12-bit VLAN ID. Unless the BPF filter specifically asks for VLAN traffic, all traffic is parsed as untagged traffic. Thus, the specified filter delivers only untagged UDP packets (i.e., their frames) and drops all tagged traffic.
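To make the four-byte shift visible, here is a minimal Python sketch. It is not what tcpdump does internally; it only builds one untagged and one tagged IPv4/UDP frame by hand and checks them at the fixed offsets of an untagged frame. The MAC addresses, IP addresses, ports and the VLAN ID 100 are made-up sample values, and the function names are mine.

```python
import struct

ETH_P_IP = 0x0800      # EtherType for IPv4
ETH_P_8021Q = 0x8100   # tag protocol identifier that marks an 802.1Q tag
IPPROTO_UDP = 17


def ipv4_udp_payload() -> bytes:
    """Minimal 20-byte IPv4 header (checksum left at 0) plus an 8-byte UDP header."""
    ip = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 28, 0, 0, 64, IPPROTO_UDP, 0,
        bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]),  # sample addresses
    )
    udp = struct.pack("!HHHH", 1234, 53, 8, 0)       # sample ports, length 8
    return ip + udp


def ethernet_frame(vlan_id=None) -> bytes:
    """Build an Ethernet frame, optionally with a single 802.1Q tag."""
    header = bytes.fromhex("ffffffffffff") + bytes.fromhex("020000000001")
    if vlan_id is not None:
        # 4-byte tag: TPID 0x8100 + TCI (priority 0, DEI 0, 12-bit VLAN ID)
        header += struct.pack("!HH", ETH_P_8021Q, vlan_id & 0x0FFF)
    header += struct.pack("!H", ETH_P_IP)
    return header + ipv4_udp_payload()


def naive_udp_match(frame: bytes) -> bool:
    """Test for IPv4/UDP at the offsets that are valid for untagged frames only."""
    ethertype = struct.unpack_from("!H", frame, 12)[0]
    return ethertype == ETH_P_IP and frame[23] == IPPROTO_UDP


untagged = ethernet_frame()
tagged = ethernet_frame(vlan_id=100)

print(len(tagged) - len(untagged))   # 4
print(naive_udp_match(untagged))     # True
print(naive_udp_match(tagged))       # False: offset 12 holds 0x8100, not 0x0800
```

In the tagged frame the EtherType sits at offset 16 and the IP protocol field at offset 27, which is exactly the four-byte shift described above.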
Now watch out: similar things happen if you specify the mysterious 'vlan' keyword in the tcpdump filter. Once the 'vlan' keyword has been specified, all *following* filters are matched against traffic shifted by 4 bytes to the right. Note that this is also true if you specify 'not vlan' as filter. How tcpdump translates the filter expression into BPF code is exposed when calling tcpdump with the -d option:
BPF translation of filter 'not vlan and udp':
[root@vm-fedora ~]# tcpdump -nn -d not vlan and udp
(000) ldh
(001) jeq #0x8100 jt 10 jf 2
(002) ldh
(003) jeq #0x86dd jt 4 jf 6
(004) ldb
(005) jeq #0x11 jt 9 jf 10
(006) jeq #0x800 jt 7 jf 10
(007) ldb
(008) jeq #0x11 jt 9 jf 10
(009) ret #96
(010) ret #0
What do we see here? Although we explicitly asked for untagged traffic, our filter fails: it matches UDP traffic that has no VLAN tag but is shifted by 4 bytes to the right (i.e., it matches nothing). Our mistake was to put the 'vlan' keyword first, so that all following filters ('udp') are matched against shifted traffic. To cope with this issue, be careful about the order in which the filter is put together. If we want to match both tagged and untagged UDP traffic, we have to specify the following filter:
Filter UDP traffic, both VLAN tagged and untagged:
[root@vm-fedora ~]# tcpdump -nn -d "udp or (vlan and udp)"
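Why the order matters can be modelled in a few lines of Python. This is only a sketch under simplifying assumptions, not libpcap's real code generator: it covers the IPv4 branch only, assumes the usual offsets of an untagged frame (EtherType at 12, IP protocol at 23), and treats 'vlan' as "add 4 to every offset used afterwards". All function names are hypothetical.

```python
import struct

ETH_P_IP, ETH_P_8021Q, IPPROTO_UDP = 0x0800, 0x8100, 17


def frame(vlan_id=None) -> bytes:
    """Ethernet + bare IPv4 header with protocol = UDP, optionally 802.1Q tagged."""
    macs = bytes(12)                                   # zeroed dst/src MACs
    tag = b"" if vlan_id is None else struct.pack("!HH", ETH_P_8021Q, vlan_id & 0x0FFF)
    ip = bytearray(20)
    ip[0] = 0x45                                       # IPv4, 20-byte header
    ip[9] = IPPROTO_UDP                                # protocol field
    return macs + tag + struct.pack("!H", ETH_P_IP) + bytes(ip) + bytes(8)


def ethertype_at(pkt: bytes, offset: int) -> int:
    return struct.unpack_from("!H", pkt, offset)[0]


def is_vlan(pkt: bytes) -> bool:
    return ethertype_at(pkt, 12) == ETH_P_8021Q


def is_udp(pkt: bytes, shift: int) -> bool:
    """IPv4/UDP test with every offset moved right by `shift` bytes."""
    return ethertype_at(pkt, 12 + shift) == ETH_P_IP and pkt[23 + shift] == IPPROTO_UDP


def not_vlan_and_udp(pkt: bytes) -> bool:
    # 'vlan' appears first, so 'udp' is evaluated at shifted offsets.
    return (not is_vlan(pkt)) and is_udp(pkt, shift=4)


def udp_or_vlan_and_udp(pkt: bytes) -> bool:
    # The first 'udp' is unshifted; only the 'udp' after 'vlan' is shifted.
    return is_udp(pkt, shift=0) or (is_vlan(pkt) and is_udp(pkt, shift=4))


def udp_and_not_vlan(pkt: bytes) -> bool:
    # 'udp' comes before 'vlan', so it is evaluated at unshifted offsets.
    return is_udp(pkt, shift=0) and not is_vlan(pkt)


for name, pkt in (("untagged", frame()), ("tagged", frame(vlan_id=100))):
    print(name, not_vlan_and_udp(pkt), udp_or_vlan_and_udp(pkt), udp_and_not_vlan(pkt))
```

For these two sample frames the model of 'not vlan and udp' matches neither, the model of 'udp or (vlan and udp)' matches both, and 'udp and not vlan' matches only the untagged one.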
Or, the generic solution:
Generic filter expression that matches VLAN tagged and untagged traffic:
[root@vm-fedora ~]# tcpdump -nn -d "<filter> or (vlan and <filter>)"
If you want to filter only untagged traffic, specify the following:
Generic filter to match only untagged traffic:
[root@vm-fedora ~]# tcpdump -nn -d "<filter> and not vlan"
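If you generate such expressions from a script, the two generic patterns can be wrapped in small helpers. The sketch below is plain string handling; the helper names are made up for illustration, and the extra parentheses around the inner filter are my own precaution so that an 'or' inside the filter cannot change the grouping.

```python
import shlex


def mixed_vlan_filter(expr: str) -> str:
    """Match `expr` in untagged frames and in frames carrying one 802.1Q tag."""
    return f"({expr}) or (vlan and ({expr}))"


def untagged_only_filter(expr: str) -> str:
    """Match `expr` in untagged frames only; 'vlan' stays at the very end."""
    return f"({expr}) and not vlan"


def tcpdump_command(filter_expr: str) -> str:
    """Return a shell command line with the filter safely quoted."""
    return "tcpdump -nn -d " + shlex.quote(filter_expr)


print(tcpdump_command(mixed_vlan_filter("udp")))
# tcpdump -nn -d '(udp) or (vlan and (udp))'
print(tcpdump_command(untagged_only_filter("udp port 53")))
# tcpdump -nn -d '(udp port 53) and not vlan'
```

Quoting matters here because the parentheses would otherwise be interpreted by the shell before tcpdump ever sees them.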
Long story short: when using tcpdump (or libpcap), be careful where you put the 'vlan' keyword in your expression. In general, it's a very bad idea to specify the keyword twice, unless you pack VLAN traffic into VLAN traffic. Maybe these examples explain more than the quote below, taken from the tcpdump manpage: "Note that the first vlan keyword encountered in expression changes the decoding offsets for the remainder of expression on the assumption that the packet is a VLAN packet." Recall that this (admittedly sometimes strange) behavior is not a bug...
Thanks go to Nuno Paiva, who sent me an example of how to match mixed traffic. Thanks to Dan Cox, who spotted missing quotes.