
I'm trying to parse H.264 frames from a .mov file. I think I've come to the conclusion that mov.c from the AVFormat part of FFmpeg is the way to go. But mov.c is ~2600 lines of next-to-uncommented code. I'm looking for examples of FFmpeg usage, especially for parsing the structure of any file type. It doesn't matter whether it is MPEG-4 or QuickTime Movie, since they are quite similar in structure.

If there are no existing examples (I can't find any), maybe someone has used it and can give me a couple of lines of code, or explain how to get started?

What I'm trying to do: I use AVCaptureSession to capture samples from the video camera. These samples are then encoded in H.264 and written to file with the help of AVAssetWriter, AVAssetWriterInput and AVAssetWriterInputPixelBufferAdaptor. The reason for this is that I can't access the hardware H.264 encoder directly, since Apple won't allow that. What I now need to do (I think, not sure) is parse out:

The "mdat"-atom (Movie data, there might be more than one i think) from the .mov file. then the "vide"-atom and then within the vide-atom (Video data sample, there might be more than one). I think there will be several atoms which i belive is the frames. these will be of type "avc1" (that's the type for H264). Please correct me in this because i'm quite sure that i havn't gotten all of this correctly yet.

My question then is: how do I go about parsing out the single frames? I've been reading the documentation and have looked at iFrameExtractor (which is not very helpful, since it decodes the frames). I think I've understood correctly that I'm supposed to use mov.c from FFmpeg's AVFormat, but I'm not sure.

Edit: I'm now trying it like this:

  1. I run the slightly reduced init function from iFrameExtractor, which finds the video stream in the .mov file.

  2. I get the data for the frame like this:

    AVPacket packet;
    av_read_frame(pFormatCtx, &packet);
    NSData *frame;
    if(packet.stream_index == videoStream){
        frame = [NSData dataWithBytes:packet.data length:packet.size];
    }
    videoStream++;
    av_free_packet(&packet);
    return frame;
    

I then pass it to a subclass of NSOperation, where it is retained while waiting for upload. But I receive an EXC_BAD_ACCESS. Am I doing something wrong when copying the data from the frame? Any ideas? I get the EXC_BAD_ACCESS when I try to set the class variable NSData *frame using its (nonatomic, retain) property (it says EXC_BAD_ACCESS on the @synthesize row).
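For what it's worth, here is a minimal sketch of such an operation subclass under manual reference counting; the class name and initializer are hypothetical. dataWithBytes:length: returns an autoreleased object, so the property setter must retain it; a crash on the @synthesize line usually means the NSData was over-released somewhere else, not that the copy itself failed:

    // Hypothetical upload operation (pre-ARC, manual reference counting).
    @interface FrameUploadOperation : NSOperation {
        NSData *_frame;
    }
    @property (nonatomic, retain) NSData *frame;
    @end

    @implementation FrameUploadOperation
    @synthesize frame = _frame;

    - (id)initWithFrame:(NSData *)frame {
        if ((self = [super init])) {
            self.frame = frame;   // the retain property keeps the data alive
        }
        return self;
    }

    - (void)dealloc {
        [_frame release];
        [super dealloc];
    }
    @end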

  • mov.c is not going to help you with the end goal. If you need MOV/MP4 parsing help, it can be handy. Another handy library for when things get rough is [mp4v2](http://code.google.com/p/mp4v2/). Basically, you are going to have to write this yourself. No library is going to get the job done, for a variety of reasons. – Steve McFarlin Aug 15 '11 at 20:05
  • @Steve McFarlin, thanks. Do you have any tips on reading, apart from the QuickTime documentation, to grasp the whole structure of the .mov file? I'm really having trouble grasping all of it. Is it correct that the atoms don't even need to be in a specific order? Which did you think was easier to work with, MOV or MP4? – Robin Rye Aug 16 '11 at 06:55
  • @Steve McFarlin, I guess you have seen the iFrameExtractor project. It's essentially the same code as in the tutorial by Martin Böhme (for instance on dranger.com). In the next-frame function they only use av_read_frame and then decode the result. Won't the AVPacket filled in by av_read_frame be the encoded H.264 frame? – Robin Rye Aug 16 '11 at 07:22
  • Question 1 - You will need to read ISO 14496-10 section 7.3 and Annex B (14496-15 is also helpful). You need to know what a NALU is. While it may be possible to use mp4v2 or FFmpeg, I do not recommend it. I can guarantee you will have to modify the sources of those libraries to do what you want. The MOV file is a tree structure; parsing is very easy. I suggest you write an atom parser to start with. Just dump the FOURCC code of each atom to the console. There is a simple Java project floating around out there somewhere that should get you started. – Steve McFarlin Aug 16 '11 at 18:40
  • Question 2 - Actually, I have not seen that project. It more than likely is the encoded H.264 frame. If this is the path you are going to take, then you should use movieFragmentInterval. This way the SPS/PPS NALUs are written out to file before the entire movie. These should be in the extradata field, and will most likely be in Annex B format (see the sketch after these comments for telling the two layouts apart). Again, using FFmpeg/mp4v2 is going to be harder than writing this from scratch. It may not even be possible to do in real time with these libraries. You will certainly have to modify them. – Steve McFarlin Aug 16 '11 at 18:50
  • @Steve McFarlin, I've managed to capture the frames using the simple code below. But when I pass the frames to the server and try something simple like `ffmpeg -i file* test.mov`, I get "file00: Invalid data found when processing input". I read that the av_read_frame function doesn't verify that the frame is valid, and that it might contain extra information that helps the decoder. I guess this is the case; any idea how to get rid of it? In other words: what does the mdat atom contain that is not raw frame data? – Robin Rye Aug 18 '11 at 15:30
  • In a MOV file the SPS/PPS NALUs are in the header of the MOV file. They are not written until the MOV file is complete. You can use movieFragmentInterval on iOS to create a 'streaming' QuickTime file, such that this information is written before any sample data. However, the resulting file is much more complex to parse 'by hand'. I am not sure whether FFmpeg supports this file type or not. The mdat atom contains sample data. If you are only storing AVC data, then this will be I and P frames, assuming Baseline 3.x. Again, I highly recommend you write this from scratch. – Steve McFarlin Aug 21 '11 at 20:50
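Picking up on the extradata point from these comments: a quick, hedged sketch for telling the two common extradata layouts apart. Annex B extradata begins with a 00 00 (00) 01 start code; avcC extradata (the box layout from ISO 14496-15) begins with a version byte of 1 and stores the SPS/PPS length-prefixed:

    #include <stdint.h>
    #include <stddef.h>

    // Returns 1 for Annex B-style extradata, 0 for an avcC box,
    // -1 if it looks like neither.
    int h264_extradata_format(const uint8_t *ed, size_t size) {
        if (size >= 4 && ed[0] == 0 && ed[1] == 0 &&
            (ed[2] == 1 || (ed[2] == 0 && ed[3] == 1)))
            return 1;   // start code: SPS/PPS can be sent as-is
        if (size >= 7 && ed[0] == 1)
            return 0;   // avcC: read lengthSizeMinusOne (ed[4] & 3)
                        // and unpack the embedded SPS/PPS first
        return -1;
    }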

3 Answers


I use the following to parse each frame from the .mov file:

-(NSData *)nextFrame {
    AVPacket packet;
    NSData *frame = nil;

    // Keep reading packets until one belonging to the video stream turns up.
    while(!frame && av_read_frame(pFormatCtx, &packet)>=0) {

        if(packet.stream_index == streamNo) {
            // NSData copies the bytes, so the packet can be freed right after.
            frame = [[[NSData alloc] initWithBytes:packet.data length:packet.size] autorelease];
        }
        av_free_packet(&packet);
    }
    return frame;
}

Watch out, though: av_read_frame does not verify the frames; that is done in the decoding step. This means the "frames" returned might contain extra information which is not part of the actual frame.
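More specifically: when FFmpeg demuxes an avc1 track from a MOV/MP4 file, each packet is normally a length-prefixed sample, i.e. every NALU is preceded by a 4-byte big-endian length rather than an Annex B start code, and the SPS/PPS sit in pCodecCtx->extradata instead of in the packets. That is one reason simply dumping packets to a file gives "Invalid data found when processing input". A sketch of the usual in-place conversion, assuming 4-byte prefixes (the common avcC setting):

    #include <stdint.h>
    #include <stddef.h>
    #include <string.h>

    // Replace each 4-byte NALU length prefix with a 00 00 00 01 start
    // code, in place; prefix and start code are the same size.
    void avcc_sample_to_annexb(uint8_t *data, size_t size) {
        static const uint8_t start_code[4] = {0, 0, 0, 1};
        size_t pos = 0;
        while (pos + 4 <= size) {
            uint32_t nalu_len = ((uint32_t)data[pos]     << 24) |
                                ((uint32_t)data[pos + 1] << 16) |
                                ((uint32_t)data[pos + 2] << 8)  |
                                 (uint32_t)data[pos + 3];
            memcpy(data + pos, start_code, 4);
            pos += 4 + nalu_len;
        }
    }

For the resulting dump to decode, the SPS and PPS from the extradata also have to be written out with start codes before the first frame.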

To init the AVFormatContext *pFormatCtx and AVCodecContext *pCodecCtx, I use this code (which I believe is derived from Martin Böhme's example code):

    AVCodec *pCodec;

    // Register all formats and codecs
    av_register_all();

    // Open video file
    if(avformat_open_input(&pFormatCtx, [moviePath cStringUsingEncoding:NSASCIIStringEncoding], NULL, NULL)!=0)
        goto initError; // Couldn't open file

    // Retrieve stream information
    if(avformat_find_stream_info(pFormatCtx,NULL)<0)
        goto initError; // Couldn't find stream information

    // Find the video stream
    streamNo = -1;
    for(int i=0; i<pFormatCtx->nb_streams; i++){
        if(pFormatCtx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO)
        {
            streamNo = i;
            break;
        }
    }
    if(streamNo == -1)
        goto initError; // Didn't find a video stream

    // Get a pointer to the codec context for the video stream
    pCodecCtx=pFormatCtx->streams[streamNo]->codec;

    // Find the decoder for the video stream
    pCodec=avcodec_find_decoder(pCodecCtx->codec_id);
    if(pCodec==NULL)
        goto initError; // Codec not found

    // Open codec
    if(avcodec_open2(pCodecCtx, pCodec, NULL)<0)
        goto initError; // Could not open codec

    return self;

initError:
    NSLog(@"initError in VideoFrameExtractor");
    [self release];
    return nil;

Hope this helps someone in the future.
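Hypothetical usage of the above; the class name matches the NSLog in the init code, but the initializer's name is my assumption:

    // Pull encoded video samples until av_read_frame hits end of file.
    VideoFrameExtractor *extractor =
        [[VideoFrameExtractor alloc] initWithVideoPath:moviePath];
    NSData *frame;
    while ((frame = [extractor nextFrame]) != nil) {
        // hand the still-encoded H.264 sample to the uploader here
    }
    [extractor release];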


There's a pretty good tutorial on using libavcodec/libavformat here. The bit it sounds like you're interested in is the DoSomethingWithTheImage() function they've left unimplemented.
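For completeness, a sketch of the kind of thing DoSomethingWithTheImage() ends up doing in those tutorials: dumping a decoded RGB24 AVFrame to a PPM file. (This is the decode path; for raw H.264 you would keep packet.data instead, as the comments note.)

    #include <stdio.h>

    // Write one RGB24 frame as a binary PPM image, one file per frame.
    static void DoSomethingWithTheImage(AVFrame *pFrame, int width,
                                        int height, int iFrame) {
        char filename[32];
        snprintf(filename, sizeof(filename), "frame%d.ppm", iFrame);
        FILE *f = fopen(filename, "wb");
        if (!f) return;
        fprintf(f, "P6\n%d %d\n255\n", width, height);
        for (int y = 0; y < height; y++)
            fwrite(pFrame->data[0] + y * pFrame->linesize[0], 1, width * 3, f);
        fclose(f);
    }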

  • I want the raw H.264 data, so I can reassemble the frames into a .mov on the server side later on. I've looked at this example before and couldn't quite figure out whether I was supposed to skip the decode step and just keep the `rawData=packet.data`. What happens when I decode? Do I leave the H.264 format then? – Robin Rye Aug 15 '11 at 11:58
  • @yi_H, I'm disassembling the .mov file while it is recording, to send the H.264 frames to a server where I reassemble them again. It's the only way to stream H.264 in real time from iOS, as I understand it. – Robin Rye Aug 15 '11 at 12:41
  • @awoodland, I don't want to decode it; I want the data encoded in H.264. Do you have any experience parsing the QuickTime .mov format? – Robin Rye Aug 15 '11 at 13:58

If you stream H.264 to iOS you need segmented streaming (a.k.a. Apple HTTP Live Streaming).

Here is an open source project: http://code.google.com/p/httpsegmenter/
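For illustration only: current FFmpeg builds ship an HLS muxer that can do the segmenting in one step (these flags are from modern FFmpeg and were not necessarily available at the time):

    ffmpeg -i input.mov -c copy -f hls -hls_time 10 playlist.m3u8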

  • I'm going to stream from iOS, using AVCaptureSession and AVAssetWriter to write from the camera to file. Then I want to parse the file to get the H.264 frames and upload these. I've got everything working, including the HTTP packets for uploading. What I need is a way to access the frames in the .mov file, i.e. access the raw frame data. Maybe it will work with the example posted in the other answer; I'm trying now. If you have another suggestion on how I can make it work, please share it :) – Robin Rye Aug 15 '11 at 13:46
  • You want to drop the audio channel? You want to use a different container? I still don't get it. – Karoly Horvath Aug 15 '11 at 14:19
  • I realized that the other answer won't work, since it decodes the frame, so it will no longer be encoded in H.264. I need to extract the frames straight from the video stream, without decoding. – Robin Rye Aug 15 '11 at 14:31
  • I've updated the question with (I think) a thorough description of my problem. Hope you can help, thanks. – Robin Rye Aug 15 '11 at 14:51