
I am trying to show H.264-encoded RTSP video on an Android device. The stream comes from a Raspberry Pi, which uses vlc to stream /dev/video1, a "Pi NoIR Camera Board".

vlc-wrapper -vvv v4l2:///dev/video1 --v4l2-width $WIDTH --v4l2-height $HEIGHT --v4l2-fps ${FPS}.0 --v4l2-chroma h264 --no-audio --no-osd --sout "#rtp{sdp=rtsp://:8000/pi.sdp}" :demux=h264 > /tmp/vlc-wrapper.log 2>&1

I am using very minimal Android code right now:

final MediaPlayer mediaPlayer = new MediaPlayer();
mediaPlayer.setDisplay(holder);
try {
  mediaPlayer.setDataSource(url);
  mediaPlayer.prepare();

and getting a "Prepare failed.: status=0x1" IOException. When I look at the logs, I see lines like

06-02 16:28:05.566 W/APacketSource(  316): Format:video 0 RTP/AVP 96  / MIME-Type:H264/90000
06-02 16:28:05.566 W/MyHandler(  316): Unsupported format. Ignoring track #1.
06-02 16:28:05.566 I/MyHandler(  316): SETUP(1) completed with result -1010 (Unknown error 1010)

coming from a system process. Grepping for these messages points to the libstagefright/rtsp sources, and seems to mean that the ASessionDescription::getDimensions call in the APacketSource::APacketSource constructor is failing. This doesn't seem like it should be happening, because VLC certainly knows what dimensions to output:

[0x1c993a8] v4l2 demux debug: trying specified size 800x600
[0x1c993a8] v4l2 demux debug: Driver requires at most 262144 bytes to store a complete image
[0x1c993a8] v4l2 demux debug: Interlacing setting: progressive
[0x1c993a8] v4l2 demux debug: added new video es h264 800x600

What seems to be happening is that ASessionDescription::getDimensions is looking for a framesize attribute in the (seemingly well-formed) DESCRIBE results, which contain none:

06-02 16:28:05.566 I/MyHandler(  316): DESCRIBE completed with result 0 (Success)
06-02 16:28:05.566 I/ASessionDescription(  316): v=0
06-02 16:28:05.566 I/ASessionDescription(  316): o=- 15508012299902503225 15508012299902503225 IN IP4 pimple
06-02 16:28:05.566 I/ASessionDescription(  316): s=Unnamed
06-02 16:28:05.566 I/ASessionDescription(  316): i=N/A
06-02 16:28:05.566 I/ASessionDescription(  316): c=IN IP4 0.0.0.0
06-02 16:28:05.566 I/ASessionDescription(  316): t=0 0
06-02 16:28:05.566 I/ASessionDescription(  316): a=tool:vlc 2.0.3
06-02 16:28:05.566 I/ASessionDescription(  316): a=recvonly
06-02 16:28:05.566 I/ASessionDescription(  316): a=type:broadcast
06-02 16:28:05.566 I/ASessionDescription(  316): a=charset:UTF-8
06-02 16:28:05.566 I/ASessionDescription(  316): a=control:rtsp://192.168.1.35:8000/pi.sdp
06-02 16:28:05.566 I/ASessionDescription(  316): m=video 0 RTP/AVP 96
06-02 16:28:05.566 I/ASessionDescription(  316): b=RR:0
06-02 16:28:05.566 I/ASessionDescription(  316): a=rtpmap:96 H264/90000
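One way to make the "missing attribute" observation concrete is to list the a= lines of the video media section and check which ones are present. The following is a minimal sketch, not Stagefright's parser: the class and method names are hypothetical, and the choice to also look for fmtp (where RFC 6184 streams may carry sprop-parameter-sets) is an assumption about what a client might want, not something confirmed by the logs above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists the a= attributes of the first video media section of an SDP body. */
final class SdpVideoAttributes {

    /** Returns the attribute lines (without the "a=" prefix) that follow the first "m=video" line. */
    static List<String> videoAttributes(String sdp) {
        List<String> attrs = new ArrayList<>();
        boolean inVideo = false;
        for (String raw : sdp.split("\r?\n")) {
            String line = raw.trim();
            if (line.startsWith("m=")) {
                if (inVideo) break;                 // the next media section ends the video one
                inVideo = line.startsWith("m=video");
            } else if (inVideo && line.startsWith("a=")) {
                attrs.add(line.substring(2));
            }
        }
        return attrs;
    }

    /** True if any attribute starts with the given name followed by ':'. */
    static boolean has(List<String> attrs, String name) {
        for (String a : attrs) {
            if (a.startsWith(name + ":")) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Session-level and media-level lines copied from the DESCRIBE log above.
        String sdp = "v=0\r\n"
                + "a=control:rtsp://192.168.1.35:8000/pi.sdp\r\n"
                + "m=video 0 RTP/AVP 96\r\n"
                + "b=RR:0\r\n"
                + "a=rtpmap:96 H264/90000\r\n";
        List<String> attrs = videoAttributes(sdp);
        System.out.println("rtpmap:    " + has(attrs, "rtpmap"));
        System.out.println("framesize: " + has(attrs, "framesize"));
        System.out.println("fmtp:      " + has(attrs, "fmtp"));
    }
}
```

Session-level attributes such as a=control are deliberately ignored, since only the lines after m=video describe the track itself.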

This looks like it may be a Stagefright bug: it knows (or should know) that it has an H.264-encoded stream, yet it seems to be expecting an H.263 framesize attribute. Hence my questions:

  1. Am I reading the sources right? Is the problem in the ASessionDescription::getDimensions call? (Does stagefright only actually support H.263 streaming?)
  2. Or is the Pi-side code wrong in some way?
  3. Or am I just missing a key step or two in my client-side code?

Update, 20140606:

The MediaPlayer docs say that -1010 is MEDIA_ERROR_UNSUPPORTED: "Bitstream is conforming to the related coding standard or file spec, but the media framework does not support the feature." This makes me wonder if the problem is the 'standard' progressive download issue. That is, Supported Media Formats says

For video content that is streamed over HTTP or RTSP [in a] MPEG-4 [container] the moov atom must precede any mdat atoms, but must succeed the ftyp atom

while most streams put the moov atom last.

I am not at all sure how to verify this, though!

  • I see no moov or ftyp atoms in the vlc source. (I am told that vlc is just streaming, here; that the actual H264 content is coming out of the camera driver.)
  • I see no moov or ftyp atoms in the https://github.com/raspberrypi linux or userland branches. (Maybe I'm just grepping for the wrong things, though.)
  • When I have vlc save the stream, I get an mp4 file with moov before mdat, but of course vlc could be doing some transcoding, here.
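For the saved mp4 file, the atom order can at least be checked mechanically by walking the top-level boxes. This is a minimal sketch with hypothetical class and method names; it reads the whole file into memory, so it only suits small captures, and it says nothing about what actually travels over RTP.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reports the order of top-level boxes (atoms) in an MP4 file. */
final class Mp4BoxOrder {

    /** Returns the four-character types of the top-level boxes, in file order. */
    static List<String> topLevelBoxes(byte[] file) {
        List<String> types = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(file);          // big-endian by default
        while (buf.remaining() >= 8) {
            int start = buf.position();
            long size = buf.getInt() & 0xFFFFFFFFL;      // 32-bit unsigned box size
            byte[] type = new byte[4];
            buf.get(type);
            types.add(new String(type, StandardCharsets.US_ASCII));
            int headerLen = 8;
            if (size == 0) break;                        // box extends to end of file
            if (size == 1) {                             // 64-bit size follows the type
                if (buf.remaining() < 8) break;
                size = buf.getLong();
                headerLen = 16;
            }
            // Stop on a truncated or corrupt box instead of seeking out of range.
            if (size < headerLen || size > file.length - start) break;
            buf.position((int) (start + size));
        }
        return types;
    }

    /** True if a moov box exists and no mdat box precedes it. */
    static boolean moovBeforeMdat(List<String> types) {
        int moov = types.indexOf("moov");
        int mdat = types.indexOf("mdat");
        return moov >= 0 && (mdat < 0 || moov < mdat);
    }

    public static void main(String[] args) throws IOException {
        List<String> types = topLevelBoxes(Files.readAllBytes(Paths.get(args[0])));
        System.out.println(types + " moov before mdat: " + moovBeforeMdat(types));
    }
}
```

Running it with the saved file's path as the only argument prints the box order, which shows whether ftyp, moov and mdat appear in the order the Supported Media Formats page asks for.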

Update, 20140610:

The GPAC "Osmo4" player can display the stream on an Android 4.3 tablet. Badly (more lag than VLC on a laptop, and prone to lockups) but it can display it.

Update, 20140616:

When I tried grepping the VLC sources again (case-insensitive and without word-orientation, this time) I did find the FOURCC macros defining the moov and ftyp atoms in modules/mux/mp4.c, which quickly led to the --sout-mp4-faststart (and --no-sout-mp4-faststart) switches ... which don't make any difference.

So, it looks like it may actually not be an atom-ordering issue. That's good to know, if it closes off a whole class of dead-ends, but it does leave me banging my head against the wall (which always seems to do more damage to my head than to the wall) without a clue.

Update, 20140702:

I compiled VLC for Android, and it can display the stream generated by VLC on the pi. It puts the image in the top-left of the screen; I tried writing my own skin for their .so, and couldn't find any 'knobs' that would let me zoom-to-surface or whatever. (Plus the .apk came to about 12M!)

So, I found the relevant RFCs and wrote my own RTSP client. Or tried to: I can parse the SDP and generate enough valid RTSP to get RTP and RTCP datagrams, and I can parse the RTP and RTCP headers. But even though the SDP claims to deliver m=video 0 RTP/AVP 96 and a=rtpmap:96 H264/90000, the MediaCodec won't display video on my surface, no matter which of the three H264 codecs on my tablet I pass to MediaCodec.createByCodecName(). When I look at the RTP payloads, I'm not too surprised: I don't see the NAL sync pattern anywhere in any of the packets.

Instead, they all start with either 21 9A __ 22 FF (usually) or occasionally 3C 81 9A __ 22 FF, where the __ seems to always be an even number that goes up by 2 each packet. I don't recognize this pattern - do you?

Update, 20140711:

Turns out that H264 packets don't have to start with the NAL sync pattern - that's only necessary where NAL Units may be embedded in a larger data stream. My RTP packets are in RFC 6184 format.
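Read as RFC 6184 payload headers, the two byte patterns quoted in the previous update decode cleanly. The first payload byte holds F (1 bit), NRI (2 bits) and type (5 bits); type 28 marks an FU-A fragment, whose second byte carries start/end flags and the original NAL type. The sketch below uses hypothetical class and method names and handles only single NAL units and FU-A, not the aggregation packet types.

```java
/** Hypothetical helper: decodes the first bytes of an RFC 6184 RTP payload. */
final class Rfc6184Header {

    /** nal_ref_idc: bits 5-6 of the NAL header byte. */
    static int nri(byte header) {
        return (header >> 5) & 0x03;
    }

    /** NAL unit type (or packet type) of the first payload byte: the low 5 bits. */
    static int nalType(byte[] payload) {
        return payload[0] & 0x1F;
    }

    /** True if the payload is an FU-A fragment (type 28). */
    static boolean isFuA(byte[] payload) {
        return nalType(payload) == 28;
    }

    /** True if this FU-A fragment is the first one of its NAL unit (S bit set). */
    static boolean isFuStart(byte[] payload) {
        return isFuA(payload) && (payload[1] & 0x80) != 0;
    }

    /** True if this FU-A fragment is the last one of its NAL unit (E bit set). */
    static boolean isFuEnd(byte[] payload) {
        return isFuA(payload) && (payload[1] & 0x40) != 0;
    }

    /**
     * The NAL header byte of the unit being carried: the first byte itself for a
     * single NAL unit, or F/NRI from the FU indicator plus type from the FU header.
     */
    static int originalHeader(byte[] payload) {
        if (isFuA(payload)) {
            return (payload[0] & 0xE0) | (payload[1] & 0x1F);
        }
        return payload[0] & 0xFF;
    }

    public static void main(String[] args) {
        byte[] single = {0x21, (byte) 0x9A};
        byte[] fragment = {0x3C, (byte) 0x81, (byte) 0x9A};
        System.out.printf("single:   type=%d nri=%d%n", nalType(single), nri(single[0]));
        System.out.printf("fragment: fuA=%b start=%b header=0x%02X%n",
                isFuA(fragment), isFuStart(fragment), originalHeader(fragment));
    }
}
```

Under this reading, 21 is a single NAL unit of type 1 with nal_ref_idc 01, and 3C 81 is the first FU-A fragment of a NAL unit whose own header would also be 21.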

Jon Shemitz
  • can you read the same stream with another app, e.g. VLC? – Alex Cohn Jun 07 '14 at 06:13
  • Huh - I'm surprised I didn't mention that. Yes, VLC can read the stream ... but, of course, VLC is willing to open a second stream to find the moov atom and Android is not. (Sorry, reference to that is on machine at work.) – Jon Shemitz Jun 07 '14 at 16:54
  • http://bytechunk.net/mobile_progressive_playback/index.php – Jon Shemitz Jun 11 '14 at 00:20
  • Actually, I thought you tried VLC playback on Android… – Alex Cohn Jun 11 '14 at 04:14
  • I have had success encoding video for mobile playback using a combination of ImageMagick (I use this to stitch images together) and avconv (http://manpages.ubuntu.com/manpages/precise/man1/avconv.1.html). Both run nicely on Linux, and as the RPi is Linux-based, neither should be a problem for you to get working. Just a suggestion. Worth looking into, as I have had little success with any other tools – jamesc Jun 18 '14 at 00:29

1 Answer


After an amazing number of dead-ends, I can show an H264 RTSP stream on an Android SurfaceView. This answer is only sort of an answer because I still can't address my original three questions, but even full of bugs and shortcuts as it is, my 75K apk is a lot better than VLC for Android or the Osmo4 player: it has sub-second latency (at least when the sender and the receiver are on the same wifi router!) and fills the SurfaceView.

A few takeaways, to help anyone trying to do anything similar:

  • All input buffers you pass to MediaCodec.queueInputBuffer() must start with the 00 00 01 sync pattern.
  • You can configure() and start() the codec right away - but don't queue any 'normal' input buffers until you've seen both an SPS (NALU code 7) and a PPS (NALU code 8) packet. (Their header bytes might not be 0x67 and 0x68 - the "nal_ref_idc" bits should be non-zero but will not necessarily be 11. Fwiw, vlc seems to always give me 01.)
  • Pass the SPS/PPS packets almost normally - pass the BUFFER_FLAG_CODEC_CONFIG flag to queueInputBuffer(). In particular, don't try to put them in a "csd-0" buffer attached to the MediaFormat!
  • When you see one or more missed frames (i.e. you see a jump in RTP sequence number) do not call codec.flush()! Just skip the partial frame, and don't queue up a buffer until the next full frame.
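The takeaways above can be sketched as plain Java bookkeeping that sits in front of the codec. This is a sketch under stated assumptions, not the code from the apk: the class, enum and method names are hypothetical, and the actual MediaCodec calls are left out so the logic runs without Android.

```java
/** Hypothetical helper: decides what to do with each reassembled NAL unit before queueing it. */
final class NalFeeder {

    /** CONFIG: queue with BUFFER_FLAG_CODEC_CONFIG. FRAME: queue normally. DROP: do not queue. */
    enum Action { CONFIG, FRAME, DROP }

    private boolean seenSps;
    private boolean seenPps;

    /** Returns a copy of the NAL unit prefixed with the 00 00 01 sync pattern. */
    static byte[] withStartCode(byte[] nal) {
        byte[] out = new byte[nal.length + 3];
        out[2] = 1;                                   // out[0] and out[1] are already 0
        System.arraycopy(nal, 0, out, 3, nal.length);
        return out;
    }

    /** Classifies by the low 5 bits of the header, so any nal_ref_idc value is accepted. */
    Action classify(byte[] nal) {
        int type = nal[0] & 0x1F;
        if (type == 7) { seenSps = true; return Action.CONFIG; }   // SPS
        if (type == 8) { seenPps = true; return Action.CONFIG; }   // PPS
        return (seenSps && seenPps) ? Action.FRAME : Action.DROP;
    }

    /**
     * True if seq is not the direct successor of prev in 16-bit RTP sequence
     * arithmetic. Duplicates and reordered packets also count as a gap here.
     */
    static boolean isGap(int prev, int seq) {
        return ((prev + 1) & 0xFFFF) != (seq & 0xFFFF);
    }

    public static void main(String[] args) {
        NalFeeder feeder = new NalFeeder();
        byte[][] units = { {0x21, 0x00}, {0x27, 0x42}, {0x28, 0x01}, {0x21, 0x00} };
        for (byte[] nal : units) {
            System.out.println(feeder.classify(nal) + " -> " + withStartCode(nal).length + " bytes");
        }
        System.out.println("gap 65535 -> 0: " + isGap(65535, 0));
    }
}
```

On a gap, the caller would discard the frame being assembled and resume with the next complete one, without touching codec.flush().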
Jon Shemitz