How Can I Parse a Pcapng File in C#?

Question

I'm new to Pcapng files. I've read the 40+ page whitepaper and I'm still scratching my head and sweating. I understand that the Pcapng file is:

Made up of a Section Header Block - This is the start of every Pcapng file.

Question 1: How large is this?

It appears that it's Block Type (4 bytes) + Block Total Length (4 bytes) + Byte-Order Magic (4 bytes) + Major and Minor Version (4 bytes total, 2 bytes each) + Section Length (4 bytes) + Options (variable) + Block Total Length (again, 4 bytes).

If I'm building a parser, how would I know how many bytes I need to skip to arrive at my first data frame block?

Question 2: Where is the data stored? By data I mean the entire frame that contains Ethernet, IP, and TCP Data, as shown in the picture below (Figure 1).

The documentation states that:

A section includes data delimited by two section header blocks.

When doing a manual inspection (yes, I went byte by byte over a file to see how many bytes lie in between two frames :'( ), I noticed there were 35 bytes in between each message shown in Wireshark. Are these bytes related to a pcapng block?

Once I understand how to get to the first TCP frame, and how many bytes I need to skip to get to the next, I can build my parser.

I'm willing to send Bitcoin/Monero to anyone who can help me understand how I can best parse these pcapng messages. Thanks!

Files consist of blocks. Blocks can be of different types. If you don't care about byte order, you can ignore section headers. Just look for Enhanced Packet Blocks, Simple Packet Blocks and (if dealing with old data) the obsolete Packet Blocks. — NetMage, Mar 12 '20 at 23:47
Checking the spec, there is a minimum of 32 bytes between packet data, unless there are options, and those are always multiples of 4 bytes, so I don't see how you got 35. Can you show a section of the pcapng file? (Preferably from the start.) — NetMage, Mar 12 '20 at 23:51
You could also just use this managed project for parsing [PcapngUtils](https://github.com/ryrychj/PcapngUtils). — NetMage, Mar 13 '20 at 00:07
@NetMage Can I add you on some other platform? How do you know there are a minimum of 32 bytes between packet data without options? I'll pay you to teach me. Indeed it is 36 bytes in between each. I have no idea how I'd know how many options would be in each section. — Tee Zad Awk, Mar 13 '20 at 00:08
From the [pcapng file format](http://xml2rfc.tools.ietf.org/cgi-bin/xml2rfc.cgi?url=https://raw.githubusercontent.com/pcapng/pcapng/master/draft-tuexen-opsawg-pcapng.xml&modeAsFormat=html/ascii&type=ascii#section_epb) you can see in section 4.3 that an Enhanced Packet Block has a fixed 28 bytes before the packet data and a fixed 4 bytes after (the repeated Block Total Length), plus the Options. Section 3.5 says options have 4 fixed bytes and then an optional value that is padded to a multiple of 4 bytes. — NetMage, Mar 13 '20 at 00:12
Also, you determine the size of the options area by taking the Block Total Length, subtracting the beginning overhead (28), the duplicate length (4) and the Packet Data length (Captured Packet Length rounded up to multiple of 4). What's left is the Options block. But if all you care about is the Packet Data, you can just skip to the next block using the Block Total Length. — NetMage, Mar 13 '20 at 00:15
@NetMage Great point... I can skip Options using Block Total Length. — Tee Zad Awk, Mar 13 '20 at 01:07
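
The arithmetic from the comments above can be written down in a few lines. This is only a sketch in Python (chosen for brevity; the same integer math works in C#), and the function name `epb_options_length` is made up for illustration. It assumes the block is an Enhanced Packet Block with the 28-byte fixed header and 4-byte trailer described in the comments.

```python
def epb_options_length(block_total_length: int, captured_len: int) -> int:
    """Bytes occupied by the Options area of an Enhanced Packet Block."""
    fixed_header = 28  # type, total length, interface ID, ts high, ts low, captured len, original len
    trailer = 4        # the repeated Block Total Length
    padded_data = (captured_len + 3) & ~3  # Packet Data is padded to a multiple of 4 bytes
    return block_total_length - fixed_header - trailer - padded_data


# A 66-byte frame is padded to 68 bytes, so a 100-byte block has no options.
print(epb_options_length(100, 66))
```

If only the Packet Data matters, none of this is needed: read the captured length, take that many bytes, and seek forward by Block Total Length from the start of the block.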

Christopher Maynard · Accepted Answer · 2020-03-13T16:42:07.563

I think @tee-zad-awk found an answer that helped over at https://ask.wireshark.org/question/15159/how-can-i-display-as-much-pcapng-information-as-possible/, but for the benefit of anyone else looking for an answer to this question, I've linked it here and have provided my answer below, just in case the link is ever broken someday.

It seems that, after reading the 40 page whitepaper on Pcapng ...

The current PCAP Next Generation (pcapng) Capture File Format draft document is 52 pages, so perhaps you're not looking at the most recent version? Other versions do exist, such as those at https://tools.ietf.org/html/draft-tuexen-opswg-pcapng-00, https://pcapng.github.io/pcapng/ or https://www.tcpdump.org/pcap/pcap.html and probably others, but they're all obsolete.

If you're looking for a pcapng parser to help you decipher the file, then look no further than Wireshark itself. If you've loaded a pcapng file into Wireshark, you can use "View -> Reload as File Format/Capture" (Ctrl+Shift+F) to cause Wireshark to load and display the raw file contents itself rather than to load and display the packets from the file. This should cause you to be able to see the various pcapng blocks and be able to drill down into them. For example:

Frame 1: 184 bytes on wire (1472 bits), 184 bytes captured (1472 bits)
MIME file
PCAPNG File Format
    Block: Section Header Block 1
    Block: Interface Description Block 0
    Block: Enhanced Packet Block 1

You can also have a look at the Wireshark source code, such as the epan/dissectors/file-pcapng.c and wiretap/pcapng.c files.

By the way, if you're looking to support all extensions, the Wireshark [PcapNg wiki page](https://wiki.wireshark.org/Development/PcapNg) has a link to the Augmented PCAP Next Generation Dump File Format page that you might also want to take a look at. I don't know how many other extensions may have been implemented but not included in the main pcapng file format specification, but hopefully not many, as this could quickly become problematic with different projects possibly using the same block type for different blocks. That practice should be highly discouraged.

Hahha, that was my question that I asked on Wireshark after posting here. For doing the research and coming up with this answer, I'll give you the green checkmark. — Tee Zad Awk, Mar 13 '20 at 16:15
I am a little confused by the timestamp though. I see a low and high component, but I'm not sure what to make of it and how to convert it to a UTC timestamp with nanosecond precision. — Tee Zad Awk, Mar 13 '20 at 16:16
@TeeZadAwk Unfortunately, timestamps are 64-bit timestamps in units of 10^-6 seconds unless there is an Interface Description Block with an if_tsresol option, in which case you need to interpret the option. That makes getting timestamps more complicated than it should be. It is all in the file format documents. — NetMage, Mar 13 '20 at 16:22
@NetMage Yep, I read that part, I meant I'm confused about how to interpret it. So I'm at my Interface Description Block and the 64-bit number that contains both a Timestamp Resolution of 9 (10^-9, nanoseconds?) and 6 (10^-6, microseconds). https://i.imgur.com/G8KI47k.png How do I make sense of this? Which one is it? In hex this is `09 00 01 00 06 00 00 00` in little-endian. So the most significant bit is 0. This means `the remaining bits indicates the resolution of the timestamp as a negative power of 10 `. What do I make of this? — Tee Zad Awk, Mar 13 '20 at 16:36
@NetMage, Copy/paste didn't pick up all the links properly, and I missed re-linking to the most recent pcapng capture file format specification. It should be fixed now. Thanks for pointing that out. — Christopher Maynard, Mar 13 '20 at 16:43
There's only 1 resolution specified, namely 6 (i.e., 10^-6 or microseconds). The 9 is just the option code for the time resolution option, which you can see in the Interface Description Block Options table below Figure 10 of the pcapng capture file format specification. More information over at https://ask.wireshark.org/question/15177/what-are-the-units-of-time-referring-to-in-an-enhanced-packet-block/ — Christopher Maynard, Mar 13 '20 at 19:45

score 1 · Answer 2 · answered Mar 15 '20 at 08:42

If I'm building a parser, how would I know how many bytes I need to skip to arrive at my first data frame block?

That's not how you do it.

If you're building a parser, note that a parser must look at more than just the first data frame block.

First of all, it must look at the Section Header Block (SHB), to determine the byte order of the data in all the subsequent blocks by looking at the Byte-Order Magic field.
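
As a rough illustration of that first step, here is a small Python sketch (Python only for brevity; the logic is the same in C#). It relies on the SHB block type `0x0A0D0D0A` and the Byte-Order Magic `0x1A2B3C4D` from the specification; the function name `section_byte_order` and the idea of returning a `struct` prefix are my own choices.

```python
import struct

SHB_TYPE = 0x0A0D0D0A          # palindromic, so it reads the same in either byte order
BYTE_ORDER_MAGIC = 0x1A2B3C4D  # stored at offset 8, after Block Type and Block Total Length


def section_byte_order(header: bytes) -> str:
    """Return '<' (little-endian) or '>' (big-endian) for the section that starts here."""
    if len(header) < 12:
        raise ValueError("need at least the first 12 bytes of a Section Header Block")
    if struct.unpack_from("<I", header, 0)[0] != SHB_TYPE:
        raise ValueError("not a Section Header Block")
    for prefix in ("<", ">"):
        if struct.unpack_from(prefix + "I", header, 8)[0] == BYTE_ORDER_MAGIC:
            return prefix
    raise ValueError("bad Byte-Order Magic")
```

Every multi-byte field in the section, including the Block Total Length of the SHB itself, then has to be read with the byte order this returns.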

After that, you need to look at all subsequent blocks, looking for Interface Description Blocks and Enhanced Packet Blocks (EPBs), Simple Packet Blocks (SPBs), and possibly Packet Blocks (PBs) (those are obsolete, so no program should write them, but programs should be prepared to read them). Each EPB or PB has an interface ID that refers to an IDB, which must have appeared before the EPB or PB in question; an SPB implicitly refers to the first IDB, which, again, must have appeared before the SPB in question.

The format of the packet data in an EPB, SPB, or PB depends on the link-layer type specified by the IDB to which it refers, so you need to have read the IDB in question.

And, as the above indicates, there is no fixed number of bytes between the SHB and the first EPB, SPB, or PB, so there is no simple fixed number of bytes to skip to get to the first data frame block. For one thing, the number of bytes is variable, and you can only determine it by reading all the blocks before the first EPB, SPB, or PB. For another thing, you can't skip those blocks; you have to read them to get enough information to interpret the packet data in the packet blocks that follow.
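
A block-by-block reader along those lines might look like the following. Treat it as a sketch under assumptions, not a complete parser: it is in Python for brevity, `iter_blocks` is a made-up name, and it only splits the file into blocks using Block Total Length, leaving the interpretation of IDBs, EPBs, SPBs, and PBs to the caller.

```python
import io
import struct
from typing import BinaryIO, Iterator, Tuple

SHB_TYPE = 0x0A0D0D0A          # reads the same in either byte order
BYTE_ORDER_MAGIC = 0x1A2B3C4D


def iter_blocks(stream: BinaryIO) -> Iterator[Tuple[int, str, bytes]]:
    """Yield (block_type, byte_order, body) for each block.

    body excludes the 8-byte block header and the trailing 4-byte length.
    """
    order = None
    while True:
        head = stream.read(8)
        if len(head) < 8:
            return  # end of file (or a trailing partial header)
        extra = b""
        if struct.unpack("<I", head[:4])[0] == SHB_TYPE:
            # A new section starts here; its magic decides the byte order.
            extra = stream.read(4)
            if len(extra) < 4:
                raise ValueError("truncated Section Header Block")
            if struct.unpack("<I", extra)[0] == BYTE_ORDER_MAGIC:
                order = "<"
            elif struct.unpack(">I", extra)[0] == BYTE_ORDER_MAGIC:
                order = ">"
            else:
                raise ValueError("bad Byte-Order Magic")
        elif order is None:
            raise ValueError("file does not start with a Section Header Block")
        block_type, total_len = struct.unpack(order + "II", head)
        if total_len < 12 + len(extra) or total_len % 4:
            raise ValueError("bad Block Total Length")
        rest = stream.read(total_len - 8 - len(extra))
        if len(rest) != total_len - 8 - len(extra):
            raise ValueError("truncated block")
        yield block_type, order, (extra + rest)[:-4]


# Hand-built two-block file: an SHB (Section Length is a 64-bit field, -1 = unknown)
# followed by an IDB with link type 1 (Ethernet) and no options.
shb = struct.pack("<IIIHHqI", SHB_TYPE, 28, BYTE_ORDER_MAGIC, 1, 0, -1, 28)
idb = struct.pack("<IIHHII", 1, 20, 1, 0, 0, 20)
for block_type, order, body in iter_blocks(io.BytesIO(shb + idb)):
    print(hex(block_type), order, len(body))
```

A real parser would keep a list of the IDBs seen in the current section (reset at each SHB) so that each packet block's interface ID can be resolved to a link-layer type.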

Where is the data stored? By data I mean the entire frame that contains Ethernet, IP, and TCP Data, as shown in the picture below (Figure 1).

It's stored in EPBs, SPBs, or PBs. See the descriptions of those three block types; frames are in the "Packet Data" fields of those blocks.
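
For the most common case, the EPB, pulling the frame out could be sketched like this in Python (for brevity; it ports directly to C#). The names `EnhancedPacket` and `parse_epb` are hypothetical. The field order follows the EPB description in the specification: a 28-byte fixed part, then Packet Data padded to a multiple of 4 bytes, then options and the repeated length.

```python
import struct
from typing import NamedTuple


class EnhancedPacket(NamedTuple):
    interface_id: int  # index of the IDB this packet refers to
    timestamp: int     # raw 64-bit count, in units of that interface's resolution
    original_len: int  # length on the wire, which may exceed len(data)
    data: bytes        # the captured frame, e.g. Ethernet + IP + TCP


def parse_epb(block: bytes, order: str = "<") -> EnhancedPacket:
    """Parse one complete Enhanced Packet Block (block type 6), header included."""
    if len(block) < 32:
        raise ValueError("too short for an Enhanced Packet Block")
    (block_type, total_len, iface, ts_high, ts_low,
     cap_len, orig_len) = struct.unpack_from(order + "7I", block, 0)
    if block_type != 6:
        raise ValueError("not an Enhanced Packet Block")
    if total_len > len(block) or 28 + cap_len + 4 > total_len:
        raise ValueError("lengths are inconsistent")
    # Slice by the captured length so the padding bytes are not included.
    return EnhancedPacket(iface, (ts_high << 32) | ts_low, orig_len,
                          block[28:28 + cap_len])
```

The bytes in `data` start at the link-layer header, so how to decode them depends on the link type of the IDB named by `interface_id`.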

So I'm at my Interface Description Block and the 64 bit number that contains both a Timestamp Resolution of 9 (10^-9, Nanoseconds?) and 6 (10^-6, Microseconds).

As Christopher Maynard indicated, the 9 isn't a timestamp resolution, it's an option type. Pcapng blocks have both fixed information at the beginning and options; an option begins with an option type and option value length, followed by the option data. An IDB if_tsresol option has:

2 bytes of option type, with the value 9;
2 bytes of option value length, with the value 1;
1 byte of option value, with the value as specified in the description of that option.

A value of 6 means the time stamp resolution is 1/10^6 of a second, which means 1 microsecond.
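
Putting the option layout and the high/low timestamp halves together, a conversion to UTC could be sketched as below. This is Python for brevity, with hypothetical names `ticks_per_second` and `to_utc`. It assumes the options area of the IDB has already been sliced out, that a set most significant bit in the value means a negative power of 2 instead of 10, and that timestamps count from 1970-01-01 00:00:00 UTC, as the specification describes.

```python
import struct

OPT_ENDOFOPT = 0
IF_TSRESOL = 9  # option type, not a resolution


def ticks_per_second(options: bytes, order: str = "<") -> int:
    """Scan an IDB options area for if_tsresol; the default is 10^6 (microseconds)."""
    pos = 0
    while pos + 4 <= len(options):
        code, length = struct.unpack_from(order + "HH", options, pos)
        if code == OPT_ENDOFOPT:
            break
        value = options[pos + 4:pos + 4 + length]
        if code == IF_TSRESOL and len(value) == 1:
            base = 2 if value[0] & 0x80 else 10
            return base ** (value[0] & 0x7F)
        pos += 4 + ((length + 3) & ~3)  # option values are padded to 4 bytes
    return 10 ** 6


def to_utc(ts_high: int, ts_low: int, tps: int):
    """Return (whole seconds since the Unix epoch, leftover nanoseconds)."""
    ticks = (ts_high << 32) | ts_low
    seconds, frac = divmod(ticks, tps)
    return seconds, frac * 1_000_000_000 // tps


# The bytes from the question: type 9, length 1, value 6, then 3 padding bytes.
tps = ticks_per_second(bytes.fromhex("0900010006000000"))
print(tps)
```

Keeping seconds and nanoseconds as separate integers avoids losing precision in a floating-point or microsecond-only date type; with a resolution of 6 the nanosecond part is always a multiple of 1000.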
