Dissecting DNS Responses With Scapy

Introduction

Packet analysis to support troubleshooting is a big part of my job. In a company with hundreds of discrete applications, it is not reasonable to memorize IP addresses, or even to try to maintain a cheat sheet of IPs. Therefore, when analyzing network traffic in Wireshark, the “Resolve Network Addresses” view option is a lifesaver. At least, it is most of the time.

Wireshark resolves those network addresses by performing reverse lookups through DNS. If you try to inspect a capture file on an offline computer, or one not on the corporate network, network address resolution will fail. In addition, a reverse lookup returns only the host name registered for that address (its PTR record, which normally matches the host’s A record), which means that if the client originally reached that address through SRV or CNAME records, the returned name may not be very helpful.

A perfect example I came across was a client computer attempting to find a server to receive LDAP traffic. The initial DNS query from the client was _ldap._tcp.windowslogon.domain.test, which returned SRV records connecting that service to srv1.domain.test on port 389 and A records connecting srv1.domain.test to an IP address. Using Wireshark’s name resolution, that IP address resolves to what looks like a random server name, and I don’t get the clue that it’s an LDAP connection used for Windows logon. This is especially confusing if the TCP ports used are nonstandard.

Script Requirements

I wanted a solution that would let me take the actual, in situ, DNS queries from the client displayed in the capture and connect those to the IP addresses that show up. Therefore, my script must parse DNS responses that showed up in the packet capture and connect the initial query through any chaining to the final IP address.

To accomplish this, I chose Scapy, the “Python-based interactive packet manipulation program & library,” based on a few blog posts I found. It’s important to note that packet dissection and analysis are not the primary goals of this library; it’s primarily meant for packet crafting. In fact, most of what you can find on StackOverflow or Google about Scapy revolves around using it to perform man-in-the-middle attacks, ARP or DNS poisoning, or other attacks built on packet manipulation. Because of this, the way Scapy stores packets, and the way it wants you to refer to different parts of each packet, is kind of strange.

Scapy’s Peculiarities

Scapy uses a nesting approach to storing packets, which does an admirable job matching the encapsulation that most networking protocols use. If you refer to packet[TCP], the returned data will include the TCP header and everything TCP encapsulates. However, it's not very useful to simply look at a packet with Scapy, because there is no output formatting by default.

In general, Scapy uses angle brackets (< and >) to denote the beginning and end of different sections, with specific fields separated by spaces and displayed as field_name = field_value. Given this storage method, the best way to display a field in the packet is to refer to the section and field name. For example, the sequence number in a captured frame can be returned using packet[TCP].seq. For Scapy’s returned values to make any sense for packet analysis, it’s very important to refer to, and return, individual fields rather than entire headers.

The point at which this becomes very confusing is in DNS responses. A DNS response packet has four primary sections: queries, answers, authoritative nameservers, and additional records. Not all of these are always populated, and each one of those sections can have multiple records in it. In fact, the DNS response header has fields that tell you how many records each one of those sections contains.
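To make those header counters concrete, here is a minimal sketch that unpacks the fixed 12-byte DNS header by hand with Python's struct module rather than with Scapy. The parse_dns_header function, the DNSHeader tuple, and the sample bytes are made up for illustration; the field names mirror the counters Scapy exposes on the DNS layer (qdcount, ancount, nscount, arcount).

```python
import struct
from collections import namedtuple

# Six unsigned 16-bit big-endian values make up the DNS header
DNSHeader = namedtuple("DNSHeader", "id flags qdcount ancount nscount arcount")


def parse_dns_header(payload):
    """Unpack the first 12 bytes of a DNS message into named fields."""
    if len(payload) < 12:
        raise ValueError("a DNS header needs at least 12 bytes")
    return DNSHeader(*struct.unpack("!6H", payload[:12]))


# Hypothetical header: 1 query, 2 answers, 5 nameservers, 3 additional records
sample = bytes.fromhex("1a2b81800001000200050003")
header = parse_dns_header(sample)
print(header.qdcount, header.ancount, header.nscount, header.arcount)
```

The four counters are all you need to know how far to walk into each section, which is what the rest of this post relies on.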

Based on how Scapy nests different protocols, you would expect that packet[DNS] will return the entire DNS section of the packet, and you should see fields that include qd (query), an (answer), ns (nameserver), and ar (additional record). Each one of those fields should contain an array (or list) of records. However, Scapy actually stores them nested, as shown for the nameserver section below:

ns=
    <DNSRR  
        rrname='ns.domain.test.' 
        type=NS 
        rclass=IN 
        ttl=3600 
        rdata='ns1.domain.test.' |
        <DNSRR  
            rrname='ns.domain.test.' 
            type=NS 
            rclass=IN 
            ttl=3600 
            rdata='ns2.domain.test.' |
            <DNSRR  
                rrname='ns.domain.test.' 
                type=NS 
                rclass=IN
                ttl=3600 
                rdata='ns3.domain.test.' |
                <DNSRR  
                    rrname='ns.domain.test.' 
                    type=NS 
                    rclass=IN 
                    ttl=3600 
                    rdata='ns4.domain.test.' |
                    <DNSRR  
                        rrname='ns.domain.test.' 
                        type=NS 
                        rclass=IN 
                        ttl=3600 
                        rdata='ns5.domain.test.' |
                    >
                >
            >
        >
    >

This means, somewhat unbelievably, that packet[DNS].ns[0] will return all the nameserver records, and packet[DNS].ns[4] will only return the last one. To confuse matters further, the section names are tied to the record type rather than to the field, so the DNSRR (DNS resource record) section name doesn’t consistently match every record in a response. A response that includes a SRV record, for example, will have a section named DNSRRSRV. So, despite every other application of Scapy making it very easy to reference fields by packet[section_name].field_name, DNS responses completely break that mold.
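The indexing behavior is easier to see with a toy model. The sketch below is not Scapy code: Layer and chain are hypothetical stand-ins that only imitate the nesting shown above, where each record carries the next one as its payload and an integer index walks that many payloads down.

```python
class Layer:
    """Toy stand-in for one nested record; not a real Scapy class."""

    def __init__(self, rdata, payload=None):
        self.rdata = rdata
        self.payload = payload

    def __getitem__(self, index):
        # Walk 'index' payloads down the chain and return what is left
        layer = self
        for _ in range(index):
            if layer.payload is None:
                raise IndexError("no layer at index %d" % index)
            layer = layer.payload
        return layer

    def __repr__(self):
        inner = "" if self.payload is None else repr(self.payload)
        return "<Layer rdata=%r |%s>" % (self.rdata, inner)


def chain(values):
    """Build a nested chain so the first value ends up outermost."""
    head = None
    for value in reversed(values):
        head = Layer(value, head)
    return head


ns = chain(["ns%d.domain.test." % i for i in range(1, 6)])
print(ns[0])  # the outermost record, which still contains the other four
print(ns[4])  # only the innermost (last) record
records = [ns[x].rdata for x in range(5)]
print(records)
```

Reading a single field at each index, as the last line does, sidesteps the nesting entirely, and that is the idea behind the approach in the next section.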

Consistently Dissecting DNS Responses

My method to dissect DNS responses consistently makes heavy use of indices rather than alphanumeric section names. Because the DNS header reports the length of each of the four major sections, use those values to iterate through the information you need.

To iterate through all the records in the answers section, use:

for x in range(packet[DNS].ancount):

To then connect an IP address to the original query, use:

packet[DNS].an[x].rdata    # to return the IP address
packet[DNS].an[x].rrname   # to return the response record name
packet[DNS].qd.qname       # to return the original query name

Similar references can be used to iterate through the nameservers and additional records.
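As a rough sketch of that idea, the helper below walks any of the three record sections by pairing the section name with its header counter. The iter_section name is hypothetical, and the SimpleNamespace objects are stand-ins for packet[DNS] so the example runs without a capture file; with Scapy you would pass packet[DNS] itself.

```python
from types import SimpleNamespace

SECTIONS = ("an", "ns", "ar")  # answers, nameservers, additional records


def iter_section(dns, name):
    """Yield each record in one section, bounded by the header counter."""
    count = getattr(dns, name + "count")  # ancount, nscount, or arcount
    section = getattr(dns, name)
    for x in range(count):
        yield section[x]


# Stand-in for packet[DNS]; the values are invented for illustration
dns = SimpleNamespace(
    ancount=1,
    an=[SimpleNamespace(rrname=b"srv1.domain.test.", rdata="10.0.0.5")],
    nscount=2,
    ns=[
        SimpleNamespace(rrname=b"domain.test.", rdata=b"ns1.domain.test."),
        SimpleNamespace(rrname=b"domain.test.", rdata=b"ns2.domain.test."),
    ],
    arcount=0,
    ar=None,  # an empty section is never indexed because its count is 0
)

summary = {name: [r.rdata for r in iter_section(dns, name)] for name in SECTIONS}
print(summary)
```

Because the loop is bounded by the counter, an empty section is skipped without ever being indexed.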

Building a Dictionary of All DNS Responses

While my full script can be seen on GitHub, my general process for building a full dictionary mapping IP addresses to A records to DNS queries is as follows:

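import re

# For a given DNS packet, handle the case for an A record
# (ip_address_pattern is assumed to be a regex defined earlier in the script)
if packet[DNS].qd.qtype == 1:
    for x in range(packet[DNS].ancount):
        # Skip records whose data is not an IP address (for example, CNAME targets);
        # str() keeps re.match from failing if rdata comes back as bytes
        if re.match(ip_address_pattern, str(packet[DNS].an[x].rdata)) is None:
            continue
        temp_dict = {packet[DNS].an[x].rdata: [packet[DNS].an[x].rrname, packet[DNS].qd.qname]}
# And repeat the same process for the additional records by substituting ar for an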

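The process for a SRV record (designated by packet[DNS].qd.qtype == 33) is identical, except I don’t even bother with the answers section: the answers only name the target host and port, while the A records that carry the IP addresses show up in the additional records.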
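Putting the A and SRV cases together might look like the sketch below. It is not the script from GitHub: map_response, to_text, and is_ip are hypothetical names, the regex check is swapped for the standard ipaddress module, and SimpleNamespace objects stand in for packet[DNS] so the example runs without a capture. It also uses qd[0] for the first question, on the assumption that this works whether Scapy presents the section as nested layers or as a list.

```python
import ipaddress
from types import SimpleNamespace as NS

A_QUERY, SRV_QUERY = 1, 33  # query type codes used in the text


def to_text(value):
    """Return names as str; values may arrive as bytes on Python 3."""
    return value.decode() if isinstance(value, bytes) else str(value)


def is_ip(value):
    """True when the value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(to_text(value))
        return True
    except ValueError:
        return False


def map_response(dns, mapping):
    """Add {ip: [record name, query name]} entries from one DNS response."""
    query = dns.qd[0]
    if query.qtype == A_QUERY:
        sections = [(dns.an, dns.ancount), (dns.ar, dns.arcount)]
    elif query.qtype == SRV_QUERY:
        sections = [(dns.ar, dns.arcount)]  # skip the answers for SRV
    else:
        return mapping
    for records, count in sections:
        for x in range(count):
            rdata = getattr(records[x], "rdata", None)
            if is_ip(rdata):
                mapping[to_text(rdata)] = [to_text(records[x].rrname), to_text(query.qname)]
    return mapping


# Invented stand-ins for two responses: one A query answered through a CNAME,
# and one SRV query whose address arrives in the additional records
a_response = NS(
    qd=[NS(qname=b"www.domain.test.", qtype=A_QUERY)],
    an=[NS(rrname=b"www.domain.test.", rdata=b"web1.domain.test."),
        NS(rrname=b"web1.domain.test.", rdata="10.0.0.7")],
    ancount=2, ar=None, arcount=0,
)
srv_response = NS(
    qd=[NS(qname=b"_ldap._tcp.windowslogon.domain.test.", qtype=SRV_QUERY)],
    an=[NS(rrname=b"_ldap._tcp.windowslogon.domain.test.", target=b"srv1.domain.test.")],
    ancount=1,
    ar=[NS(rrname=b"srv1.domain.test.", rdata="10.0.0.5")],
    arcount=1,
)

mapping = {}
for response in (a_response, srv_response):
    map_response(response, mapping)
print(mapping)
```

Keeping one shared dictionary across packets means a later IP address seen in the capture can be looked up directly to recover both the record name and the query that led to it.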

Conclusion

Automated packet dissection is a real possibility with Scapy, provided you are willing to spend the time learning how Scapy stores data and effective ways of working around some of its limitations. This example of mapping DNS responses is an excellent introduction to Scapy itself, and I’m excited to see what I can do in the future if I can bake in other libraries that can give me statistical measurements, timing details, or even correlation between multiple packet captures showing the same conversations.