Real-time applications like voice and video are fast becoming mainstream. Data loss or a severe data breach can change the way these apps function. Even a packet loss of less than 2% affects the quality of a video call. PCAP, or packet capture, helps here. It is a powerful technique for capturing network traffic so that networks can be analyzed in depth and performance issues identified. With it, IT/CS teams can detect intrusion attempts, security issues, network misuse, packet loss, and network congestion.
We usually open a <*.pcapng> file in Wireshark, which presents the data in a readable way. The file can contain multiple packets, and each packet carries its own data and its own information.
Below are 2 other ways to read <*.pcapng> files, this time in a console. Each command has options that refine the output.
- tcpdump -r <*.pcapng>
- tshark -r <*.pcapng>
I was asked to decode the data without using third-party libraries. It had a steep learning curve. I enjoyed working on it, and here's how you can do it in a few basic steps.
Ethernet: A Wired Interface
Before we continue, let us cover a few basics:
- An access point (a radio receiver/transmitter used to bridge between wireless and wired networks) can have multiple interfaces
- A user can start PCAP on each interface. This blog mainly looks at the Ethernet (wired) interface.
The basic idea behind Ethernet was to connect two or more devices with a coaxial cable to exchange information. A system that communicates over Ethernet divides the data into frames, or shorter pieces. Each frame contains source and destination addresses, plus error-checking data used to detect and discard damaged frames.
Starting PCAP on a device typically creates a file with the extension .pcap or .pcapng containing thousands of frames. Here, we will take a single frame and extract basic information: the destination MAC, the source MAC, and the ether type.
Each frame can be represented in hex, but encoding it (here with base64) reduces the size for transmission. We will take a base64-encoded frame, decode it, convert it to hex, and then dissect the data with a simple application written in Python.
Prerequisites
The language we are going to use is Python (3.x) because of its power and simplicity.
We are going to use IntelliJ IDEA as the editor, but any editor will do.
Encoded packet data
////////0E3Gwu+KgQAAAY/9vu8EAy4BAwU5duAKAgJsRwABAAAAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA== (wired)
Let's create a simple Python project (PcapngReader) in IntelliJ IDEA.
File | Purpose | Gist |
helper.py | decodes the encoded data into a hex dump | helper.py |
ether.py | retrieves useful wired-interface information from the decoded hex dump or an existing one | ether.py |
ether_types.py | contains a lookup whose key is the hex code and whose value is the corresponding ether type | ether_types.py |
- helper.py returns a list of hex values from the encoded string, i.e.:
['ff', 'ff', 'ff', 'ff', 'ff', 'ff', 'd0', '4d', 'c6', 'c2', 'ef', '8a', '81', '00', '00', '01', '8f', 'fd', 'be', 'ef', '04', '03', '2e', '01', '03', '05', '39', '76', 'e0', '0a', '02', '02', '6c', '47', '00', '01', '00', '00', '00', '10', '00', '00', '00', '00', '00', '00', '00', '00', '00', '00', '00', '00', '00', '00', '00', '00', '00', '00', '00', '00', '00', '00', '00', '00']
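The actual helper.py lives in the linked gist and is not reproduced here. As a minimal sketch of what such a helper could look like, the snippet below uses only the standard library; the function name `decode_to_hex_list` and the constant `ENCODED_FRAME` are assumptions made for illustration, not names taken from the original project.

```python
import base64

# The base64 frame from the "Encoded packet data" section,
# split into chunks only to keep the line short.
ENCODED_FRAME = (
    "////////0E3Gwu+K"
    "gQAAAY/9vu8EAy4B"
    "AwU5duAKAgJsRwAB"
    "AAAAEAAAAAAAAAAA"
    "AAAAAAAAAAAAAAAA"
    "AAAAAA=="
)


def decode_to_hex_list(encoded):
    """Decode a base64 string and return one two-character hex string per byte."""
    raw = base64.b64decode(encoded)
    return [f"{byte:02x}" for byte in raw]


hex_dump = decode_to_hex_list(ENCODED_FRAME)
print(hex_dump[:18])  # the bytes that make up the Ethernet and VLAN header
```

Keeping the result as a list of two-character strings mirrors the output shown above; working directly on the `bytes` object returned by `base64.b64decode` would be an equally valid choice.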
For Ethernet frame (Interface == 1)
Continuing with the Ethernet frame, the frame type we are going to decode here is Ethernet II.
Destination Address (6 bytes) – The MAC address of the destination machine where data has to be sent.
Source Address (6 bytes) – The MAC address of the source machine which is sending the data.
Ether Type Field (2 bytes) – This field identifies the protocol carried in the frame. For wired interfaces it is normally 0x8100 (802.1Q Virtual LAN).
802.1Q Virtual LAN (4 bytes) – Present when the Ethernet frame includes a VLAN (Virtual Local Area Network) tag. The first 2 bytes identify the priority, DEI, and VLAN ID. The last 2 bytes hold the ether type of the payload, usually IPv4, IPv6, etc.
Data – Commonly known as the payload. If the ether type is IPv4 and the transport protocol is TCP or UDP, it contains information such as the source IP, source port, destination IP, and destination port.
Field | Length (bytes) | Hex | Comments |
DA (Destination Address) | 6 | ff:ff:ff:ff:ff:ff | bytes 0-5, the first 6 bytes (colored in blue) |
SA (Source Address) | 6 | d0:4d:c6:c2:ef:8a | bytes 6-11 (colored in green) |
ethernet_proto | 2 | 8100 | bytes 12-13; on Ethernet this is generally a VLAN-tagged frame (colored in red) |
VLAN details | 2 | 0001 | bytes 14-15; to retrieve the VLAN values we need to convert this to binary (colored in brown) |
ether_type | 2 | 8ffd | bytes 16-17; usually one of the ether types described in https://en.wikipedia.org/wiki/EtherType, or a length (colored in red) |
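The byte offsets in the table translate directly into slices. The sketch below is not the ether.py from the gist; it is one possible way to dissect the first 18 bytes, and `dissect_header`, `format_mac`, and the deliberately tiny `ETHER_TYPES` lookup are hypothetical names chosen for this illustration.

```python
# First 18 bytes of the decoded frame, exactly as listed in the table.
HEADER_HEX = "ffffffffffffd04dc6c2ef8a810000018ffd"

# Hypothetical, very small lookup; a real ether_types.py would hold many more entries.
ETHER_TYPES = {
    "0800": "IPv4",
    "0806": "ARP",
    "8100": "802.1Q VLAN",
    "86dd": "IPv6",
}


def format_mac(raw):
    """Render 6 raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def dissect_header(frame):
    """Split an Ethernet II header (optionally 802.1Q tagged) into named fields."""
    fields = {
        "destination_mac": format_mac(frame[0:6]),
        "source_mac": format_mac(frame[6:12]),
        "ethernet_proto": frame[12:14].hex(),
    }
    if fields["ethernet_proto"] == "8100":
        # VLAN-tagged: 2 bytes of VLAN details, then the real ether type.
        fields["vlan_details"] = frame[14:16].hex()
        fields["ether_type"] = frame[16:18].hex()
    else:
        # Untagged: bytes 12-13 already are the ether type.
        fields["ether_type"] = fields["ethernet_proto"]
    fields["ether_type_name"] = ETHER_TYPES.get(fields["ether_type"], "unknown")
    return fields


header = dissect_header(bytes.fromhex(HEADER_HEX))
for name, value in header.items():
    print(f"{name}: {value}")
```

The value 8ffd is not in this small lookup, so the sketch reports it as unknown rather than guessing a protocol name.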
vlan details: (binary information)
vlan_details_bin = bin(int('0001', 16)) => bin(1) => '0b1' => '0b1'[2:].zfill(16) => 0000 0000 0000 0001
priority = int(vlan_details_bin[0:3], 2) => colored in blue, 0 = best effort up to 7 = highest
dei = int(vlan_details_bin[3], 2) => colored in orange, 0 = ineligible, 1 = eligible
vlan_id = int(vlan_details_bin[4:], 2) => (colored in black) = 1
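The same three values can also be obtained with shifts and masks instead of slicing a binary string. This is an alternative technique, not the approach used above, and the function name `parse_vlan_tci` is an assumption made for this sketch.

```python
def parse_vlan_tci(tci_hex):
    """Split the 2 bytes of VLAN details into priority, DEI, and VLAN ID."""
    tci = int(tci_hex, 16)
    return {
        "priority": tci >> 13,      # top 3 bits
        "dei": (tci >> 12) & 0x1,   # next single bit
        "vlan_id": tci & 0x0FFF,    # low 12 bits
    }


print(parse_vlan_tci("0001"))
```

For the frame in this post the result is priority 0, DEI 0, and VLAN ID 1, matching the string-slicing walk-through.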
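ether_type/length:
A notable exception is when the destination address is 01:00:0c:cc:cc:cc (Cisco). In that case this field represents the frame length, and the bytes that follow contain the LLC header. More generally, a value of 1500 (0x05DC) or less is a length, while a value of 1536 (0x0600) or more is an ether type.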
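One way to tell the two cases apart without looking at the destination address is the standard numeric rule: values up to 1500 are a length, and values from 0x0600 (1536) upward are an ether type. The sketch below applies that rule; `classify_type_length` is a hypothetical helper name, not part of the original application.

```python
def classify_type_length(field_hex):
    """Decide whether a 2-byte type/length field is a length or an ether type."""
    value = int(field_hex, 16)
    if value <= 1500:
        return "length"       # 802.3 frame; an LLC header follows
    if value >= 0x0600:
        return "ether_type"   # Ethernet II frame
    return "undefined"        # 1501-1535 is not assigned to either meaning


print(classify_type_length("8ffd"))
```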
Please note that the bytes that follow reveal more information. For example, when the ether type is IPv4, they contain the transport protocol (such as TCP or UDP) and the source and destination IP addresses. They also contain the source and destination ports, and with the help of well-known port numbers we can identify application protocols using https://en.wikipedia.org/wiki/List_of_TCP_and_UDP_port_numbers.
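The frame used in this post has ether type 8ffd, so it carries no IPv4 payload. To illustrate the idea anyway, the sketch below parses a made-up IPv4/UDP packet built in the code itself; the addresses, ports, and the name `parse_ipv4_payload` are all invented for this example.

```python
import ipaddress
import struct


def parse_ipv4_payload(payload):
    """Extract protocol, addresses, and ports from an IPv4 packet."""
    header_len = (payload[0] & 0x0F) * 4  # IHL is counted in 32-bit words
    protocol = payload[9]
    info = {
        "protocol": {6: "TCP", 17: "UDP"}.get(protocol, str(protocol)),
        "source_ip": str(ipaddress.IPv4Address(payload[12:16])),
        "destination_ip": str(ipaddress.IPv4Address(payload[16:20])),
    }
    if protocol in (6, 17):
        # TCP and UDP both start with source port, then destination port.
        ports = struct.unpack("!HH", payload[header_len:header_len + 4])
        info["source_port"], info["destination_port"] = ports
    return info


# Made-up packet: a 20-byte IPv4 header followed by an 8-byte UDP header.
sample = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 28, 0, 0, 64, 17, 0,
    ipaddress.IPv4Address("192.0.2.1").packed,
    ipaddress.IPv4Address("192.0.2.2").packed,
) + struct.pack("!HHHH", 50000, 53, 8, 0)

info = parse_ipv4_payload(sample)
print(info)
```

Destination port 53 in this made-up packet would map to DNS in the port list linked above.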
In short, our application was able to retrieve the destination MAC, source MAC, VLAN details, and ether type from the encoded string.
This approach is most useful when third-party libraries are off limits and values still have to be retrieved from an encoded frame.
In addition, it shows how the IEEE 802.1Q details help us convert the data into a human-readable form.
Conclusion
The direct impact of packet capture is immense. People who work with data in real time experience this all the time. Whether it is monitoring network traffic for security threats or providing forensic clues in an investigation, the right type of packet capture can change the outcome. I hope this blog helps you resolve your packet capture-related issues. Do share your feedback in the comments section.