I'm Dmitry Popov,
lead developer and director of Infognition.

Known in the interwebs as Dee Mon since 1997. You could see me as thedeemon on reddit or LiveJournal.

Deciphering .grf files
August 10, 2012

In DirectShow we work with graphs of filters. We build them in tools like GraphEdit or GraphEditPlus while experimenting, and then we build them in our own code. Some parts of a graph can be built automatically by DirectShow's Intelligent Connect procedure, which selects filters according to their priorities and their ability to handle the given media types. To see the details of a graph built by our code, we can save the graph to a file and then open it in an editor. Loading a graph from a file is done by calling the IPersistStream::Load() method, which performs all the loading logic and either succeeds or fails; there's not much control over its actions. If the graph was created on a different machine, or on the same machine but in different circumstances, and it mentions filters, devices or even files that are not available at the moment of loading, then loading fails and the graph file is pretty much worthless.

Not anymore! Here is a small utility which reads a .grf file and translates it to plain text containing most of the useful information. Now you can easily see the graph details (filters, connections, media types, including all the basic info for video and audio streams) even if you don't have all the mentioned filters and files. (134 KB)

It's a command-line tool; you run it like this:

grfdump.exe file.grf > plain_text.txt
and get something like:
1 File Source (Async.) 0000 E436EBB5-524F-11CE-9F53-0020AF0BA770 SOURCE: C:\video\Video22.MP4 filter data: 0 bytes.
2 Elecard MP4 Demultiplexer 9A79C4D0-84CC-46F3-824C-BC5793D5596C  filter data: 99 bytes.
3 Elecard AVC Video Decoder 5C122C6D-8FCC-46F9-AAB7-DCFB0841E04D  filter data: 532 bytes.
4 Elecard AAC Audio Decoder 109DF9EC-AEA3-47A3-97EA-DAAF57EC97F0  filter data: 180 bytes.
5 VDFilter 6CD44B99-8406-4E8B-A522-911FCFBEA2F2  filter data: 87 bytes.
6 Video Renderer 70E102B0-5556-11CE-97C0-00AA0055595A  filter data: 0 bytes.
7 Default DirectSound Device 79376820-07D0-11CF-A24D-0020AFD79767  filter data: 40 bytes.

File Source (Async.) 0000.Output --> Elecard MP4 Demultiplexer.Input
fixed size: 1, temporal: 0, sample size: 1
major type: e436eb83-524f-11ce-9f53-0020af0ba770            MEDIATYPE_Stream
sub type: 49952F4C-3EDC-4A9B-8906-1DE02A3D4BC2
format type: 00000000-0000-0000-0000-000000000000
format size: 0

Elecard MP4 Demultiplexer.H.264 Video (Annex B) --> Elecard AVC Video Decoder.In
fixed size: 0, temporal: 1, sample size: 58
major type: 73646976-0000-0010-8000-00AA00389B71  'vids' == MEDIATYPE_Video
sub type: 8D2D71CB-243F-45E3-B2D8-5FD7967EC09B
format type: E06D80E3-DB46-11CF-B4D1-00805F6CBBEA
format size: 194
rcSource: {left: 0, top: 0, right: 704, bottom: 572}
rcTarget: {left: 0, top: 0, right: 704, bottom: 572}
bitrate: 0 AvgTimePerFrame: 1199600
biSize: 0
biWidth: 704
biHeight: 572
biPlanes: 1
biBitCount: 24
biCompression: 875967080
biSizeImage: 0
biXPelsPerMeter: 0
biYPelsPerMeter: 0
biClrUsed: 0
biClrImportant: 0

Elecard MP4 Demultiplexer.AAC Audio --> Elecard AAC Audio Decoder.Input Pin
fixed size: 1, temporal: 0, sample size: 1
major type: 73647561-0000-0010-8000-00AA00389B71  'auds' == MEDIATYPE_Audio
sub type: 000000FF-0000-0010-8000-00AA00389B71
format type: 05589f81-c356-11ce-bf01-00aa0055595a        FORMAT_WaveFormatEx
format size: 23
wFormatTag: 255
nChannels: 2
nSamplesPerSec: 48000
nAvgBytesPerSec: 24000
nBlockAlign: 4
wBitsPerSample: 16
cbSize: 5
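
Two of the numbers in this dump are easier to read once decoded. The sketch below is not part of grfdump; `fourcc` and `framesPerSecond` are hypothetical helpers written for illustration. It assumes that biCompression packs four ASCII characters with the least significant byte first, and that AvgTimePerFrame is measured in 100 ns units.

```d
import std.stdio;

// Hypothetical helper: unpack a biCompression value into its four
// ASCII characters, least significant byte first.
string fourcc(uint v)
{
    char[4] s;
    foreach (i; 0 .. 4)
        s[i] = cast(char)((v >> (8 * i)) & 0xFF);
    return s[].idup;
}

// Hypothetical helper: AvgTimePerFrame is in 100 ns units,
// so one second equals 10_000_000 units.
double framesPerSecond(ulong avgTimePerFrame)
{
    return 10_000_000.0 / avgTimePerFrame;
}

void main()
{
    writeln(fourcc(875967080));                 // prints h264
    writefln("%.2f fps", framesPerSecond(1199600));
}
```

So the biCompression value in the video connection above is simply the FOURCC of the H.264 stream.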

A few thoughts arose while making this tool.

DirectShow Graph file format

The format is kind of described in MSDN. It's a COM storage file containing a stream for which a grammar is given. The guy who wrote this grammar for MSDN clearly didn't understand anything about grammars. For example, here's a line from that "grammar":

<filter list> ::= [<filter> <b>] <filter list>
This line actually says that a filter list always consists of a filter list. And it may contain some filters too, but not necessarily. ;)
What he probably meant is

<filter list> ::= | <filter> [<b> <filter list>]
(filter list is either empty or consists of a filter optionally followed by a filter list)
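
To see why the corrected rule works, here is a toy recursive-descent parser that follows it literally. This is only a sketch under simplifying assumptions: an item is a run of digits instead of a real `<filter>`, a blank is a single space, and `parseList` is a hypothetical name. The empty alternative is what lets the recursion stop, which is exactly what the MSDN rule lacks.

```d
import std.ascii : isDigit;
import std.stdio;

// Follows  <list> ::= | <item> [<b> <list>]
// Toy stand-ins: <item> is a run of digits, <b> is one space.
int[] parseList(string s, ref size_t pos)
{
    // Empty alternative: no item starts here, so the list ends.
    if (pos >= s.length || !isDigit(s[pos]))
        return null;

    // <item>
    int value = 0;
    while (pos < s.length && isDigit(s[pos]))
        value = value * 10 + (s[pos++] - '0');
    int[] result = [value];

    // Optional tail: a blank followed by the rest of the list.
    if (pos < s.length && s[pos] == ' ')
    {
        ++pos;
        result ~= parseList(s, pos);
    }
    return result;
}

void main()
{
    size_t pos = 0;
    writeln(parseList("1 22 333", pos)); // prints [1, 22, 333]
}
```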

Not only is the form botched, the content of that grammar is not correct either! This becomes obvious if you compare the connection description in the grammar with the example below it on that MSDN page: some fields are missing, and some are not in the proper place. If you ever decide to implement parsing of such files, follow the example, not the grammar. But even this will not give you a correct implementation, because after trying to open some real files you'll find that even the simplest description, that of what constitutes a blank character, is wrong: there can also be a zero char.

The data is a mix of Unicode text and binary data. The binary data can have an odd size, so the Unicode strings are not always aligned at 2-byte offsets, and you can't just treat the data as one big Unicode string and use simple string processing functions.
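
One way to cope with both problems is to work on raw bytes and assemble each 2-byte character by hand. The following is a minimal sketch, not the tool's actual code: the helper names are hypothetical, the text is assumed to be little-endian UTF-16, and the set of blanks (space, tab, CR, LF, zero) is an assumption apart from the zero char mentioned above.

```d
import std.stdio;

// Read one UTF-16 code unit (little-endian) at any byte offset, even or odd.
wchar readWchar(const(ubyte)[] data, size_t pos)
{
    return cast(wchar)(data[pos] | (data[pos + 1] << 8));
}

// Assumed blank set, plus the zero char seen in real files.
bool isBlank(wchar c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n' || c == 0;
}

// Advance pos past any run of blank code units.
void skipBlanks(const(ubyte)[] data, ref size_t pos)
{
    while (pos + 1 < data.length && isBlank(readWchar(data, pos)))
        pos += 2;
}

// Collect code units up to the next blank or the end of the data.
wstring readToken(const(ubyte)[] data, ref size_t pos)
{
    wstring s;
    while (pos + 1 < data.length && !isBlank(readWchar(data, pos)))
    {
        s ~= readWchar(data, pos);
        pos += 2;
    }
    return s;
}

void main()
{
    // Three bytes of binary data, then text: a zero char, a space, "AB".
    ubyte[] data = [0xDE, 0xAD, 0xBE,
                    0x00, 0x00, 0x20, 0x00, 0x41, 0x00, 0x42, 0x00];
    size_t pos = 3;                 // the text starts at an odd offset
    skipBlanks(data, pos);
    writeln(readToken(data, pos));  // prints AB
}
```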

Luckily, I made this tool in the D programming language, which let me create a bunch of simple parser combinators and primitive parsers, so the parsing code is pretty concise and looks similar to the grammar. For example, this line from the grammar

  <filter> ::= <n><b>"<name>"<b><class id><b>[<file>]<length><b1><filter data>
became this code in D:
  int n, datalen;
  wstring name, cls, fname;

  auto r = st.b().num(n).b().p_name(name).b().p_clsid(cls).b();
  r = r.opt!((x) { return x.p_file(fname); }).num(datalen).b1();
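
For readers who have not met parser combinators, here is one way such chainable primitives could be built. This is a simplified sketch and not the code from the archive: `State`, `quoted` and the bodies of `b`, `num` and `opt` are assumptions made for illustration, and it parses a plain string rather than the mixed text and binary data of a real .grf stream. Each primitive takes a state and returns a new one, so calls chain through D's uniform function call syntax, and a failure simply propagates to the end of the chain.

```d
import std.stdio;

// Hypothetical parser state: the remaining input plus a success flag.
struct State
{
    string rest;
    bool ok = true;
}

// <b>: skip a run of blanks; never fails on its own.
State b(State s)
{
    if (!s.ok) return s;
    size_t i = 0;
    while (i < s.rest.length && (s.rest[i] == ' ' || s.rest[i] == '\n'))
        ++i;
    return State(s.rest[i .. $]);
}

// <n>: a decimal number, stored into n on success.
State num(State s, ref int n)
{
    if (!s.ok) return s;
    size_t i = 0;
    int v = 0;
    while (i < s.rest.length && s.rest[i] >= '0' && s.rest[i] <= '9')
        v = v * 10 + (s.rest[i++] - '0');
    if (i == 0) return State(s.rest, false);
    n = v;
    return State(s.rest[i .. $]);
}

// "<name>": text between double quotes, stored into name on success.
State quoted(State s, ref string name)
{
    if (!s.ok) return s;
    if (s.rest.length == 0 || s.rest[0] != '"') return State(s.rest, false);
    size_t i = 1;
    while (i < s.rest.length && s.rest[i] != '"') ++i;
    if (i == s.rest.length) return State(s.rest, false);
    name = s.rest[1 .. i];
    return State(s.rest[i + 1 .. $]);
}

// opt: try parser p; if it fails, carry on from the state before the attempt.
State opt(alias p)(State s)
{
    if (!s.ok) return s;
    auto r = p(s);
    return r.ok ? r : s;
}

void main()
{
    int n, len;
    string name, file;
    auto r = State(`3 "Video Renderer" 0`).b().num(n).b().quoted(name).b();
    r = r.opt!((x) { return x.quoted(file).b(); })().num(len);
    writeln(r.ok, " ", n, " ", name, " ", len); // prints true 3 Video Renderer 0
}
```

The optional file name is absent in this input, so `opt` falls back to the state it started from and `num` goes on to read the length.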

Structures, mixins and reflection

Without reflection, outputting the contents of structures is a tedious task. In D we can enjoy compile-time reflection and write a simple generic function:
import std.stdio;

void printRecord(R)(R r)
{
    // allMembers yields the member names at compile time,
    // so this loop is unrolled for each structure type
    foreach(fld; __traits(allMembers, R))
        writeln(fld, ": ", __traits(getMember, r, fld));
}
When applied to different structures, it is automatically unrolled into different functions that write the contents of those structures (including field names).
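
A small usage sketch, with the function repeated so that it compiles on its own. `Wave` is a hypothetical record that borrows a few field names from WAVEFORMATEX; it is declared at module level and contains only plain data fields, which is the case this simple function is meant for.

```d
import std.stdio;

void printRecord(R)(R r)
{
    foreach (fld; __traits(allMembers, R))
        writeln(fld, ": ", __traits(getMember, r, fld));
}

// Hypothetical record with a few WAVEFORMATEX-like fields.
struct Wave
{
    ushort wFormatTag;
    ushort nChannels;
    uint   nSamplesPerSec;
}

void main()
{
    printRecord(Wave(255, 2, 48000));
    // prints:
    // wFormatTag: 255
    // nChannels: 2
    // nSamplesPerSec: 48000
}
```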

In DirectShow there are two very similar structures that are used very often: VIDEOINFOHEADER and VIDEOINFOHEADER2. The latter contains all the fields of the former, but with extra fields inserted before bmiHeader, so the layouts differ and one cannot simply cast one to the other and use the same code for both. In C++ this usually leads to code duplication. In D we can use mixins: write the common code once and include it in both structures.

mixin template print_vih() {
  void print() {
    write("rcSource: "); rcSource.print();
    write("rcTarget: "); rcTarget.print();
    writeln("bitrate: ", dwBitRate, " AvgTimePerFrame: ", AvgTimePerFrame);
  }
}

struct VIDEOINFOHEADER {
    RECT            rcSource;          // The bit we really want to use
    RECT            rcTarget;          // Where the video should go
    DWORD           dwBitRate;         // Approximate bit data rate
    DWORD           dwBitErrorRate;    // Bit error rate for this stream
    ulong           AvgTimePerFrame;   // Average time per frame (100ns units)

    mixin print_vih;
}

struct VIDEOINFOHEADER2 {
    RECT                rcSource;
    RECT                rcTarget;
    DWORD               dwBitRate;
    DWORD               dwBitErrorRate;
    ulong               AvgTimePerFrame;
    DWORD               dwInterlaceFlags;
    DWORD               dwCopyProtectFlags;
    DWORD               dwPictAspectRatioX;
    DWORD               dwPictAspectRatioY;
    union {
        DWORD           dwControlFlags;
        DWORD           dwReserved1;
    }
    DWORD               dwReserved2;
    BITMAPINFOHEADER    bmiHeader;

    mixin print_vih;
}
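
The same idea in a form that compiles without the Windows headers. This is a reduced sketch with hypothetical names: `Header1` and `Header2` stand in for the two DirectShow structures, `tail` stands in for bmiHeader, and `PrintCommon` plays the role of the shared mixin. The mixed-in code refers to fields by name, and the names are resolved inside whichever struct includes it.

```d
import std.stdio;

struct RECT
{
    int left, top, right, bottom;

    void print()
    {
        writefln("{left: %s, top: %s, right: %s, bottom: %s}",
                 left, top, right, bottom);
    }
}

// Shared code, written once.
mixin template PrintCommon()
{
    void print()
    {
        write("rcSource: "); rcSource.print();
        writeln("bitrate: ", dwBitRate);
    }
}

struct Header1
{
    RECT rcSource;
    uint dwBitRate;
    uint tail;          // stands in for bmiHeader
    mixin PrintCommon;
}

struct Header2
{
    RECT rcSource;
    uint dwBitRate;
    uint extra;         // a field the first struct does not have
    uint tail;
    mixin PrintCommon;
}

void main()
{
    Header1(RECT(0, 0, 704, 572), 0, 1).print();
    Header2(RECT(0, 0, 704, 572), 0, 7, 1).print();
    // tail sits at different offsets, which is why a cast between the two is unsafe
    writeln(Header1.tail.offsetof, " vs ", Header2.tail.offsetof);
}
```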

Source code is included in the archive linked above. This is a very small and simple tool, so I don't want to restrict its use with any license; you can do with it whatever you wish.