FFmpeg学习手册

视频编码   2009-06-29 16:18   阅读15   评论0  
字号:    

                     source: http://www.dranger.com/ffmpeg/ffmpegtutorial_all.html

An ffmpeg and SDL Tutorial

Part 1

Part 2

Part 3

Part 4

library for creating video applications or even general purpose utilities. ffmpeg takes care of all the hard work of video processing by doing all the decoding, encoding, muxing and demuxing for you. This can make media applications much simpler to write. It's simple, written in C, fast, and can decode almost any codec you'll find in use today, as well as encode several other formats.

The only prolem is that documentation is basically nonexistent. There is a single tutorial that shows the basics of ffmpeg and auto-generated doxygen documents. That's it. So, when I decided to learn about ffmpeg, and in the process about how digital video and audio applications work, I decided to document the process and present it as a tutorial.

There is a sample program that comes with ffmpeg called ffplay. It is a simple C program that implements a complete video player using ffmpeg. This tutorial will begin with an updated version of the original tutorial, written by Martin Böhme (I have stolen liberally borrowed from that work), and work from there to developing a working video player, based on Fabrice Bellard's ffplay.c. In each tutorial, I'll introduce a new idea (or two) and explain how we implement it. Each tutorial will have a C file so you can download it, compile it, and follow along at home. The source files will show you how the real program works, how we move all the pieces around, as well as showing you the technical details that are unimportant to the tutorial. By the time we are finished, we will have a working video player written in less than 1000 lines of code!

In making the player, we will be using SDL to output the audio and video of the media file. SDL is an excellent cross-platform multimedia library that's used in MPEG playback software, emulators, and many video games. You will need to download and install the SDL development libraries for your system in order to compile the programs in this tutorial.

This tutorial is meant for people with a decent programming background. At the very least you should know C and have some idea about concepts like queues, mutexes, and so on. You should know some basics about multimedia; things like waveforms and such, but you don't need to know a lot, as I explain a lot of those concepts in this tutorial.

There are printable HTML files along the way as well as old school ASCII files. You can also get a tarball of the text files and source code or just the source. You can get a printable page of the full thing in HTML or in text.

UPDATE: I've fixed a code error in Tutorial 7 and 8, as well as adding -lavutil.

Please feel free to email me with bugs, questions, comments, ideas, features, whatever, at dranger at gmail dot com.

>> Proceed with the tutorial!

Tutorial 01: Making Screencaps

Code: tutorial01.c

Overview

Movie files have a few basic components. First, the file itself is called a container, and the type of container determines where the information in the file goes. Examples of containers are AVI and Quicktime. Next, you have a bunch of streams; for example, you usually have an audio stream and a video stream. (A "stream" is just a fancy word for "a succession of data elements made available over time".) The data elements in a stream are called frames. Each stream is encoded by a different kind of codec. The codec defines how the actual data is COded and DECoded - hence the name CODEC. Examples of codecs are DivX and MP3. Packets are then read from the stream. Packets are pieces of data that can contain bits of data that are decoded into raw frames that we can finally manipulate for our application. For our purposes, each packet contains complete frames, or multiple frames in the case of audio.

At its very basic level, dealing with video and audio streams is very easy:

10 OPEN video_stream FROM video.avi20 READ packet FROM video_stream INTO frame30 IF frame NOT COMPLETE GOTO 2040 DO SOMETHING WITH frame50 GOTO 20Handling multimedia with ffmpeg is pretty much as simple as this program, although some programs might have a very complex "DO SOMETHING" step. So in this tutorial, we're going to open a file, read from the video stream inside it, and our DO SOMETHING is going to be writing the frame to a PPM file.

Opening the File

First, let's see how we open a file in the first place. With ffmpeg, you have to first initialize the library. (Note that some systems might have to use <ffmpeg/avcodec.h> and <ffmpeg/avformat.h> instead.)

#include <avcodec.h>#include <avformat.h>...int main(int argc, charg *argv[]) {av_register_all();This registers all available file formats and codecs with the library so they will be used automatically when a file with the corresponding format/codec is opened. Note that you only need to call av_register_all() once, so we do it here in main(). If you like, it's possible to register only certain individual file formats and codecs, but there's usually no reason why you would have to do that.

Now we can actually open the file:

AVFormatContext *pFormatCtx;// Open video fileif(av_open_input_file(&pFormatCtx, argv[1], NULL, 0, NULL)!=0) return -1; // Couldn't open fileWe get our filename from the first argument. This function reads the file header and stores information about the file format in the AVFormatContext structure we have given it. The last three arguments are used to specify the file format, buffer size, and format options, but by setting this to NULL or 0, libavformat will auto-detect these.

This function only looks at the header, so next we need to check out the stream information in the file.:

// Retrieve stream informationif(av_find_stream_info(pFormatCtx)<0) return -1; // Couldn't find stream informationThis function populates pFormatCtx->streams with the proper information. We introduce a handy debugging function to show us what's inside:

// Dump information about file onto standard errordump_format(pFormatCtx, 0, argv[1], 0);Now pFormatCtx->streams is just an array of pointers, of size pFormatCtx->nb_streams, so let's walk through it until we find a video stream.

int i;AVCodecContext *pCodecCtx;// Find the first video streamvideoStream=-1;for(i=0; i<pFormatCtx->nb_streams; i++) if(pFormatCtx->streams[i]->codec->codec_type==CODEC_TYPE_VIDEO) { videoStream=i; break; }if(videoStream==-1) return -1; // Didn't find a video stream// Get a pointer to the codec context for the video streampCodecCtx=pFormatCtx->streams[videoStream]->codec;The stream's information about the codec is in what we call the "codec context." This contains all the information about the codec that the stream is using, and now we have a pointer to it. But we still have to find the actual codec and open it:

AVCodec *pCodec;// Find the decoder for the video streampCodec=avcodec_find_decoder(pCodecCtx->codec_id);if(pCodec==NULL) { fprintf(stderr, "Unsupported codec!\n"); return -1; // Codec not found}// Open codecif(avcodec_open(pCodecCtx, pCodec)<0) return -1; // Could not open codecSome of you might remember from the old tutorial that there were two other parts to this code: adding CODEC_FLAG_TRUNCATED to pCodecCtx->flags and adding a hack to correct grossly incorrect frame rates. These two fixes aren't in ffplay.c anymore, so I have to assume that they are not necessary anymore. There's another difference to point out since we removed that code: pCodecCtx->time_base now holds the frame rate information. time_base is a struct that has the numerator and denominator (AVRational). We represent the frame rate as a fraction because many codecs have non-integer frame rates (like NTSC's 29.97fps).

Storing the Data

Now we need a place to actually store the frame:

AVFrame *pFrame;// Allocate video framepFrame=avcodec_alloc_frame();Since we're planning to output PPM files, which are stored in 24-bit RGB, we're going to have to convert our frame from its native format to RGB. ffmpeg will do these conversions for us. For most projects (including ours) we're going to want to convert our initial frame to a specific format. Let's allocate a frame for the converted frame now.

// Allocate an AVFrame structurepFrameRGB=avcodec_alloc_frame();if(pFrameRGB==NULL) return -1;Even though we've allocated the frame, we still need a place to put the raw data when we convert it. We use avpicture_get_size to get the size we need, and allocate the space manually:

uint8_t *buffer;int numBytes;// Determine required buffer size and allocate buffernumBytes=avpicture_get_size(PIX_FMT_RGB24, pCodecCtx->width, pCodecCtx->height);buffer=(uint8_t *)av_malloc(numBytes*sizeof(uint8_t));av_malloc is ffmpeg's malloc that is just a simple wrapper around malloc that makes sure the memory addresses are aligned and such. It will not protect you from memory leaks, double freeing, or other malloc problems.

Now we use avpicture_fill to associate the frame with our newly allocated buffer. About the AVPicture cast: the AVPicture struct is a subset of the AVFrame struct - the beginning of the AVFrame struct is identical to the AVPicture struct.

// Assign appropriate parts of buffer to image planes in pFrameRGB// Note that pFrameRGB is an AVFrame, but AVFrame is a superset// of AVPictureavpicture_fill((AVPicture *)pFrameRGB, buffer, PIX_FMT_RGB24, pCodecCtx->width, pCodecCtx->height);Finally! Now we're ready to read from the stream!

Reading the Data

What we're going to do is read through the entire video stream by reading in the packet, decoding it into our frame, and once our frame is complete, we will convert and save it.

int frameFinished;AVPacket packet;i=0;while(av_read_frame(pFormatCtx, &packet)>=0) { // Is this a packet from the video stream? if(packet.stream_index==videoStream) { // Decode video frame avcodec_decode_video(pCodecCtx, pFrame, &frameFinished, packet.data, packet.size); // Did we get a video frame? if(frameFinished) { // Convert the image from its native format to RGB img_convert((AVPicture *)pFrameRGB, PIX_FMT_RGB24, (AVPicture*)pFrame, pCodecCtx->pix_fmt, pCodecCtx->width, pCodecCtx->height); // Save the frame to disk if(++i<=5) SaveFrame(pFrameRGB, pCodecCtx->width, pCodecCtx->height, i); } } // Free the packet that was allocated by av_read_frame av_free_packet(&packet);}A note on packets

Technically a packet can contain partial frames or other bits of data, but ffmpeg's parser ensures that the packets we get contain either complete or multiple frames. The process, again, is simple: av_read_frame() reads in a packet and stores it in the AVPacket struct. Note that we've only allocated the packet structure - ffmpeg allocates the internal data for us, which is pointed to by packet.data. This is freed by the av_free_packet() later. avcodec_decode_video() converts the packet to a frame for us. However, we might not have all the information we need for a frame after decoding a packet, so avcodec_decode_video() sets frameFinished for us when we have the next frame. Finally, we use img_convert() to convert from the native format (pCodecCtx->pix_fmt) to RGB. Remember that you can cast an AVFrame pointer to an AVPicture pointer. Finally, we pass the frame and height and width information to our SaveFrame function.

Now all we need to do is make the SaveFrame function to write the RGB information to a file in PPM format. We're going to be kind of sketchy on the PPM format itself; trust us, it works.

void SaveFrame(AVFrame *pFrame, int width, int height, int iFrame) { FILE *pFile; char szFilename[32]; int y; // Open file sprintf(szFilename, "frame%d.ppm", iFrame); pFile=fopen(szFilename, "wb"); if(pFile==NULL) return; // Write header fprintf(pFile, "P6\n%d %d\n255\n", width, height); // Write pixel data for(y=0; y<height; y++) fwrite(pFrame->data[0]+y*pFrame->linesize[0], 1, width*3, pFile); // Close file fclose(pFile);}We do a bit of standard file opening, etc., and then write the RGB data. We write the file one line at a time. A PPM file is simply a file that has RGB information laid out in a long string. If you know HTML colors, it would be like laying out the color of each pixel end to end like #ff0000#ff0000.... would be a red screen. (It's stored in binary and without the separator, but you get the idea.) The header indicated how wide and tall the image is, and the max size of the RGB values.

Now, going back to our main() function. Once we're done reading from the video stream, we just have to clean everything up:

// Free the RGB imageav_free(buffer);av_free(pFrameRGB);// Free the YUV frameav_free(pFrame);// Close the codecavcodec_close(pCodecCtx);// Close the video fileav_close_input_file(pFormatCtx);return 0;You'll notice we use av_free for the memory we allocated with avcode_alloc_frame and av_malloc.

That's it for the code! Now, if you're on Linux or a similar platform, you'll run:

gcc -o tutorial01 tutorial01.c -lavutil -lavformat -lavcodec -lz -lavutil -lmIf you have an older version of ffmpeg, you may need to drop -lavutil:

gcc -o tutorial01 tutorial01.c -lavformat -lavcodec -lz -lmMost image programs should be able to open PPM files. Test it on some movie files.

>> Tutorial 2: Outputting to the Screen

Tutorial 02: Outputting to the Screen

Code: tutorial02.c

SDL and Video

To draw to the screen, we're going to use SDL. SDL stands for Simple Direct Layer, and is an excellent library for multimedia, is cross-platform, and is used in several projects. You can get the library at the official website or you can download the development package for your operating system if there is one. You'll need the libraries to compile the code for this tutorial (and for the rest of them, too).

SDL has many methods for drawing images to the screen, and it has one in particular that is meant for displaying movies on the screen - what it calls a YUV overlay. YUV (technically not YUV but YCbCr) * A note: There is a great deal of annoyance from some people at the convention of calling "YCbCr" "YUV". Generally speaking, YUV is an analog format and YCbCr is a digital format. ffmpeg and SDL both refer to YCbCr as YUV in their code and macros. is a way of storing raw image data like RGB. Roughly speaking, Y is the brightness (or "luma") component, and U and V are the color components. (It's more complicated than RGB because some of the color information is discarded, and you might have only 1 U and V sample for every 2 Y samples.) SDL's YUV overlay takes in a raw array of YUV data and displays it. It accepts 4 different kinds of YUV formats, but YV12 is the fastest. There is another YUV format called YUV420P that is the same as YV12, except the U and V arrays are switched. The 420 means it is subsampled at a ratio of 4:2:0, basically meaning there is 1 color sample for every 4 luma samples, so the color information is quartered. This is a good way of saving bandwidth, as the human eye does not percieve this change. The "P" in the name means that the format is "planar" — simply meaning that the Y, U, and V components are in separate arrays. ffmpeg can convert images to YUV420P, with the added bonus that many video streams are in that format already, or are easily converted to that format.

So our current plan is to replace the SaveFrame() function from Tutorial 1, and instead output our frame to the screen. But first we have to start by seeing how to use the SDL Library. First we have to include the libraries and initalize SDL:

#include <SDL.h>#include <SDL_thread.h>if(SDL_Init(SDL_INIT_VIDEO | SDL_INIT_AUDIO | SDL_INIT_TIMER)) { fprintf(stderr, "Could not initialize SDL - %s\n", SDL_GetError()); exit(1);}SDL_Init() essentially tells the library what features we're going to use. SDL_GetError(), of course, is a handy debugging function.

Creating a Display

Now we need a place on the screen to put stuff. The basic area for displaying images with SDL is called a surface:

SDL_Surface *screen;screen = SDL_SetVideoMode(pCodecCtx->width, pCodecCtx->height, 0, 0);if(!screen) { fprintf(stderr, "SDL: could not set video mode - exiting\n"); exit(1);}This sets up a screen with the given width and height. The next option is the bit depth of the screen - 0 is a special value that means "same as the current display". (This does not work on OS X; see source.)

Now we create a YUV overlay on that screen so we can input video to it:

SDL_Overlay *bmp;bmp = SDL_CreateYUVOverlay(pCodecCtx->width, pCodecCtx->height, SDL_YV12_OVERLAY, screen);As we said before, we are using YV12 to display the image.

Displaying the Image

Well that was simple enough! Now we just need to display the image. Let's go all the way down to where we had our finished frame. We can get rid of all that stuff we had for the RGB frame, and we're going to replace the SaveFrame() with our display code. To display the image, we're going to make an AVPicture struct and set its data pointers and linesize to our YUV overlay:

if(frameFinished) { SDL_LockYUVOverlay(bmp); AVPicture pict; pict.data[0] = bmp->pixels[0]; pict.data[1] = bmp->pixels[2]; pict.data[2] = bmp->pixels[1]; pict.linesize[0] = bmp->pitches[0]; pict.linesize[1] = bmp->pitches[2]; pict.linesize[2] = bmp->pitches[1]; // Convert the image into YUV format that SDL uses img_convert(&pict, PIX_FMT_YUV420P, (AVPicture *)pFrame, pCodecCtx->pix_fmt, pCodecCtx->width, pCodecCtx->height); SDL_UnlockYUVOverlay(bmp); } First, we lock the overlay because we are going to be writing to it. This is a good habit to get into so you don't have problems later. The AVPicture struct, as shown before, has a data pointer that is an array of 4 pointers. Since we are dealing with YUV420P here, we only have 3 channels, and therefore only 3 sets of data. Other formats might have a fourth pointer for an alpha channel or something. linesize is what it sounds like. The analogous structures in our YUV overlay are the pixels and pitches variables. ("pitches" is the term SDL uses to refer to the width of a given line of data.) So what we do is point the three arrays of pict.data at our overlay, so when we write to pict, we're actually writing into our overlay, which of course already has the necessary space allocated. Similarly, we get the linesize information directly from our overlay. We change the conversion format to PIX_FMT_YUV420P, and we use img_convert just like before.

Drawing the Image

But we still need to tell SDL to actually show the data we've given it. We also pass this function a rectangle that says where the movie should go and what width and height it should be scaled to. This way, SDL does the scaling for us, and it can be assisted by your graphics processor for faster scaling:

SDL_Rect rect; if(frameFinished) { /* ... code ... */ // Convert the image into YUV format that SDL uses img_convert(&pict, PIX_FMT_YUV420P, (AVPicture *)pFrame, pCodecCtx->pix_fmt, pCodecCtx->width, pCodecCtx->height); SDL_UnlockYUVOverlay(bmp); rect.x = 0; rect.y = 0; rect.w = pCodecCtx->width; rect.h = pCodecCtx->height; SDL_DisplayYUVOverlay(bmp, &rect); }Now our video is displayed!

Let's take this time to show you another feature of SDL: its event system. SDL is set up so that when you type, or move the mouse in the SDL application, or send it a signal, it generates an event. Your program then checks for these events if it wants to handle user input. Your program can also make up events to send the SDL event system. This is especially useful when multithread programming with SDL, which we'll see in Tutorial 4. In our program, we're going to poll for events right after we finish processing a packet. For now, we're just going to handle the SDL_QUIT event so we can exit:

SDL_Event event; av_free_packet(&packet); SDL_PollEvent(&event); switch(event.type) { case SDL_QUIT: SDL_Quit(); exit(0); break; default: break; }And there we go! Get rid of all the old cruft, and you're ready to compile. If you are using Linux or a variant, the best way to compile using the SDL libs is this:

gcc -o tutorial02 tutorial02.c -lavutil -lavformat -lavcodec -lz -lm \`sdl-config --cflags --libs`sdl-config just prints out the proper flags for gcc to include the SDL libraries properly. You may need to do something different to get it to compile on your system; please check the SDL documentation for your system. Once it compiles, go ahead and run it.

What happens when you run this program? The video is going crazy! In fact, we're just displaying all the video frames as fast as we can extract them from the movie file. We don't have any code right now for figuring out when we need to display video. Eventually (in Tutorial 5), we'll get around to syncing the video. But first we're missing something even more important: sound!

>> Playing Sound

Tutorial 03: Playing Sound

Code: tutorial03.c

Audio

So now we want to play sound. SDL also gives us methods for outputting sound. The SDL_OpenAudio() function is used to open the audio device itself. It takes as arguments an SDL_AudioSpec struct, which contains all the information about the audio we are going to output.

Before we show how you set this up, let's explain first about how audio is handled by computers. Digital audio consists of a long stream of samples. Each sample represents a value of the audio waveform. Sounds are recorded at a certain sample rate, which simply says how fast to play each sample, and is measured in number of samples per second. Example sample rates are 22,050 and 44,100 samples per second, which are the rates used for radio and CD respectively. In addition, most audio can have more than one channel for stereo or surround, so for example, if the sample is in stereo, the samples will come 2 at a time. When we get data from a movie file, we don't know how many samples we will get, but ffmpeg will not give us partial samples - that also means that it will not split a stereo sample up, either.

SDL's method for playing audio is this: you set up your audio options: the sample rate (called "freq" for frequency in the SDL struct), number of channels, and so forth, and we also set a callback function and userdata. When we begin playing audio, SDL will continually call this callback function and ask it to fill the audio buffer with a certain number of bytes. After we put this information in the SDL_AudioSpec struct, we call SDL_OpenAudio(), which will open the audio device and give us back another AudioSpec struct. These are the specs we will actually be using — we are not guaranteed to get what we asked for!

Setting Up the Audio

Keep that all in your head for the moment, because we don't actually have any information yet about the audio streams yet! Let's go back to the place in our code where we found the video stream and find which stream is the audio stream.

// Find the first video streamvideoStream=-1;audioStream=-1;for(i=0; i < pFormatCtx->nb_streams; i++) { if(pFormatCtx->streams[i]->codec->codec_type==CODEC_TYPE_VIDEO && videoStream < 0) { videoStream=i; } if(pFormatCtx->streams[i]->codec->codec_type==CODEC_TYPE_AUDIO && audioStream < 0) { audioStream=i; }}if(videoStream==-1) return -1; // Didn't find a video streamif(audioStream==-1) return -1;From here we can get all the info we want from the AVCodecContext from the stream, just like we did with the video stream:

AVCodecContext *aCodecCtx;aCodecCtx=pFormatCtx->streams[audioStream]->codec;

Contained within this codec context is all the information we need to set up our audio:

wanted_spec.freq = aCodecCtx->sample_rate;wanted_spec.format = AUDIO_S16SYS;wanted_spec.channels = aCodecCtx->channels;wanted_spec.silence = 0;wanted_spec.samples = SDL_AUDIO_BUFFER_SIZE;wanted_spec.callback = audio_callback;wanted_spec.userdata = aCodecCtx;if(SDL_OpenAudio(&wanted_spec, &spec) < 0) { fprintf(stderr, "SDL_OpenAudio: %s\n", SDL_GetError()); return -1;}Let's go through these:

  • freq: The sample rate, as explained earlier.
  • format: This tells SDL what format we will be giving it. The "S" in "S16SYS" stands for "signed", the 16 says that each sample is 16 bits long, and "SYS" means that the endian-order will depend on the system you are on. This is the format that avcodec_decode_audio2 will give us the audio in.
  • channels: Number of audio channels.
  • silence: This is the value that indicated silence. Since the audio is signed, 0 is of course the usual value.
  • samples: This is the size of the audio buffer that we would like SDL to give us when it asks for more audio. A good value here is between 512 and 8192; ffplay uses 1024.
  • callback: Here's where we pass the actual callback function. We'll talk more about the callback function later.
  • userdata: SDL will give our callback a void pointer to any user data that we want our callback function to have. We want to let it know about our codec context; you'll see why.

Finally, we open the audio with SDL_OpenAudio.

If you remember from the previous tutorials, we still need to open the audio codec itself. This is straightforward:

AVCodec *aCodec;aCodec = avcodec_find_decoder(aCodecCtx->codec_id);if(!aCodec) { fprintf(stderr, "Unsupported codec!\n"); return -1;}avcodec_open(aCodecCtx, aCodec);

Queues

There! Now we're ready to start pulling audio information from the stream. But what do we do with that information? We are going to be continuously getting packets from the movie file, but at the same time SDL is going to call the callback function! The solution is going to be to create some kind of global structure that we can stuff audio packets in so our audio_callback has something to get audio data from! So what we're going to do is to create a queue of packets. ffmpeg even comes with a structure to help us with this: AVPacketList, which is just a linked list for packets. Here's our queue structure:

typedef struct PacketQueue { AVPacketList *first_pkt, *last_pkt; int nb_packets; int size; SDL_mutex *mutex; SDL_cond *cond;} PacketQueue;First, we should point out that nb_packets is not the same as sizesize refers to a byte size that we get from packet->size. You'll notice that we have a mutex and a condtion variable in there. This is because SDL is running the audio process as a separate thread. If we don't lock the queue properly, we could really mess up our data. We'll see how in the implementation of the queue. Every programmer should know how to make a queue, but we're including this so you can learn the SDL functions.

First we make a function to initialize the queue:

void packet_queue_init(PacketQueue *q) { memset(q, 0, sizeof(PacketQueue)); q->mutex = SDL_CreateMutex(); q->cond = SDL_CreateCond();}Then we will make a function to put stuff in our queue:

int packet_queue_put(PacketQueue *q, AVPacket *pkt) { AVPacketList *pkt1; if(av_dup_packet(pkt) < 0) { return -1; } pkt1 = av_malloc(sizeof(AVPacketList)); if (!pkt1) return -1; pkt1->pkt = *pkt; pkt1->next = NULL; SDL_LockMutex(q->mutex); if (!q->last_pkt) q->first_pkt = pkt1; else q->last_pkt->next = pkt1; q->last_pkt = pkt1; q->nb_packets++; q->size += pkt1->pkt.size; SDL_CondSignal(q->cond); SDL_UnlockMutex(q->mutex); return 0;}SDL_LockMutex() locks the mutex in the queue so we can add something to it, and then SDL_CondSignal() sends a signal to our get function (if it is waiting) through our condition variable to tell it that there is data and it can proceed, then unlocks the mutex to let it go.

Here's the corresponding get function. Notice how SDL_CondWait() makes the function block (i.e. pause until we get data) if we tell it to.

int quit = 0;static int packet_queue_get(PacketQueue *q, AVPacket *pkt, int block) { AVPacketList *pkt1; int ret; SDL_LockMutex(q->mutex); for(;;) { if(quit) { ret = -1; break; } pkt1 = q->first_pkt; if (pkt1) { q->first_pkt = pkt1->next; if (!q->first_pkt) q->last_pkt = NULL; q->nb_packets--; q->size -= pkt1->pkt.size; *pkt = pkt1->pkt; av_free(pkt1); ret = 1; break; } else if (!block) { ret = 0; break; } else { SDL_CondWait(q->cond, q->mutex); } } SDL_UnlockMutex(q->mutex); return ret;}As you can see, we've wrapped the function in a forever loop so we will be sure to get some data if we want to block. We avoid looping forever by making use of SDL's SDL_CondWait() function. Basically, all CondWait does is wait for a signal from SDL_CondSignal() (or SDL_CondBroadcast()) and then continue. However, it looks as though we've trapped it within our mutex — if we hold the lock, our put function can't put anything in the queue! However, what SDL_CondWait() also does for us is to unlock the mutex we give it and then attempt to lock it again once we get the signal.

In Case of Fire

You'll also notice that we have a global quit variable that we check to make sure that we haven't set the program a quit signal (SDL automatically handles TERM signals and the like). Otherwise, the thread will continue forever and we'll have to kill -9 the program. ffmpeg also has a function for providing a callback to check and see if we need to quit some blocking function: url_set_interrupt_cb.

int decode_interrupt_cb(void) { return quit;}...main() {... url_set_interrupt_cb(decode_interrupt_cb); ... SDL_PollEvent(&event); switch(event.type) { case SDL_QUIT: quit = 1;...This only applies for ffmpeg functions that block, of course, not SDL ones. We make sure to set the quit flag to 1.

Feeding Packets

The only thing left is to set up our queue:

PacketQueue audioq;main() {... avcodec_open(aCodecCtx, aCodec); packet_queue_init(&audioq); SDL_PauseAudio(0);SDL_PauseAudio() finally starts the audio device. It plays silence if it doesn't get data; which it won't right away.

So, we've got our queue set up, now we're ready to start feeding it packets. We go to our packet-reading loop:

while(av_read_frame(pFormatCtx, &packet)>=0) { // Is this a packet from the video stream? if(packet.stream_index==videoStream) { // Decode video frame .... } } else if(packet.stream_index==audioStream) { packet_queue_put(&audioq, &packet); } else { av_free_packet(&packet); }Note that we don't free the packet after we put it in the queue. We'll free it later when we decode it.

Fetching Packets

Now let's finally make our audio_callback function to fetch the packets on the queue. The callback has to be of the form void callback(void *userdata, Uint8 *stream, int len), where userdata of course is the pointer we gave to SDL, stream is the buffer we will be writing audio data to, and len is the size of that buffer. Here's the code:

void audio_callback(void *userdata, Uint8 *stream, int len) { AVCodecContext *aCodecCtx = (AVCodecContext *)userdata; int len1, audio_size; static uint8_t audio_buf[(AVCODEC_MAX_AUDIO_FRAME_SIZE * 3) / 2]; static unsigned int audio_buf_size = 0; static unsigned int audio_buf_index = 0; while(len > 0) { if(audio_buf_index >= audio_buf_size) { /* We have already sent all our data; get more */ audio_size = audio_decode_frame(aCodecCtx, audio_buf, sizeof(audio_buf)); if(audio_size < 0) { /* If error, output silence */ audio_buf_size = 1024; memset(audio_buf, 0, audio_buf_size); } else { audio_buf_size = audio_size; } audio_buf_index = 0; } len1 = audio_buf_size - audio_buf_index; if(len1 > len) len1 = len; memcpy(stream, (uint8_t *)audio_buf + audio_buf_index, len1); len -= len1; stream += len1; audio_buf_index += len1; }}This is basically a simple loop that will pull in data from another function we will write, audio_decode_frame(), store the result in an intermediary buffer, attempt to write len bytes to stream, and get more data if we don't have enough yet, or save it for later if we have some left over. The size of audio_buf is 1.5 times the size of the largest audio frame that ffmpeg will give us, which gives us a nice cushion.

Finally Decoding the Audio

Let's get to the real meat of the decoder, audio_decode_frame:

int audio_decode_frame(AVCodecContext *aCodecCtx, uint8_t *audio_buf, int buf_size) { static AVPacket pkt; static uint8_t *audio_pkt_data = NULL; static int audio_pkt_size = 0; int len1, data_size; for(;;) { while(audio_pkt_size > 0) { data_size = buf_size; len1 = avcodec_decode_audio2(aCodecCtx, (int16_t *)audio_buf, &data_size, audio_pkt_data, audio_pkt_size); if(len1 < 0) { /* if error, skip frame */ audio_pkt_size = 0; break; } audio_pkt_data += len1; audio_pkt_size -= len1; if(data_size <= 0) { /* No data yet, get more frames */ continue; } /* We have data, return it and come back for more later */ return data_size; } if(pkt.data) av_free_packet(&pkt); if(quit) { return -1; } if(packet_queue_get(&audioq, &pkt, 1) < 0) { return -1; } audio_pkt_data = pkt.data; audio_pkt_size = pkt.size; }}This whole process actually starts towards the end of the function, where we call packet_queue_get(). We pick the packet up off the queue, and save its information. Then, once we have a packet to work with, we call avcodec_decode_audio2(), which acts a lot like its sister function, avcodec_decode_video(), except in this case, a packet might have more than one frame, so you may have to call it several times to get all the data out of the packet. * A note: Why avcodec_decode_audio2? Because there used to be an avcodec_decode_audio, but it's now deprecated. The new function reads from the data_size variable to figure out how big audio_buf is. Also, remember the cast to audio_buf, because SDL gives an 8 bit int buffer, and ffmpeg gives us data in a 16 bit int buffer. You should also notice the difference between len1 and data_size. len1 is how much of the packet we've used, and data_size is the amount of raw data returned.

When we've got some data, we immediately return to see if we still need to get more data from the queue, or if we are done. If we still had more of the packet to process, we save it for later. If we finish up a packet, we finally get to free that packet.

So that's it! We've got audio being carried from the main read loop to the queue, which is then read by the audio_callback function, which hands that data to SDL, which SDL beams to your sound card. Go ahead and compile:

gcc -o tutorial03 tutorial03.c -lavutil -lavformat -lavcodec -lz -lm \`sdl-config --cflags --libs`Hooray! The video is still going as fast as possible, but the audio is playing in time. Why is this? That's because the audio information has a sample rate — we're pumping out audio information as fast as we can, but the audio simply plays from that stream at its leisure according to the sample rate.

We're almost ready to start syncing video and audio ourselves, but first we need to do a little program reorganization. The method of queueing up audio and playing it using a separate thread worked very well: it made the code more managable and more modular. Before we start syncing the video to the audio, we need to make our code easier to deal with. Next time: Spawning Threads!

>> Spawning Threads

Tutorial 04: Spawning Threads

Code: tutorial04.c

Overview

Last time we added audio support by taking advantage of SDL's audio functions. SDL started a thread that made callbacks to a function we defined every time it needed audio. Now we're going to do the same sort of thing with the video display. This makes the code more modular and easier to work with - especially when we want to add syncing. So where do we start?

First we notice that our main function is handling an awful lot: it's running through the event loop, reading in packets, and decoding the video. So what we're going to do is split all those apart: we're going to have a thread that will be responsible for decoding the packets; these packets will then be added to the queue and read by the corresponding audio and video threads. The audio thread we have already set up the way we want it; the video thread will be a little more complicated since we have to display the video ourselves. We will add the actual display code to the main loop. But instead of just displaying video every time we loop, we will integrate the video display into the event loop. The idea is to decode the video, save the resulting frame in another queue, then create a custom event (FF_REFRESH_EVENT) that we add to the event system, then when our event loop sees this event, it will display the next frame in the queue. Here's a handy ASCII art illustration of what is going on:

________ audio _______ _____| | pkts | | | | to spkr| DECODE |----->| AUDIO |--->| SDL |-->|________| |_______| |_____| | video _______ | pkts | | +---------->| VIDEO | ________ |_______| _______| | | | || EVENT | +------>| VIDEO | to mon.| LOOP |----------------->| DISP. |-->|_______|<---FF_REFRESH----|_______|The main purpose of moving controlling the video display via the event loop is that using an SDL_Delay thread, we can control exactly when the next video frame shows up on the screen. When we finally sync the video in the next tutorial, it will be a simple matter to add the code that will schedule the next video refresh so the right picture is being shown on the screen at the right time.

Simplifying Code

We're also going to clean up the code a bit. We have all this audio and video codec information, and we're going to be adding queues and buffers and who knows what else. All this stuff is for one logical unit, viz. the movie. So we're going to make a large struct that will hold all that information called the VideoState.

typedef struct VideoState { AVFormatContext *pFormatCtx; int videoStream, audioStream; AVStream *audio_st; PacketQueue audioq; uint8_t audio_buf[(AVCODEC_MAX_AUDIO_FRAME_SIZE * 3) / 2]; unsigned int audio_buf_size; unsigned int audio_buf_index; AVPacket audio_pkt; uint8_t *audio_pkt_data; int audio_pkt_size; AVStream *video_st; PacketQueue videoq; VideoPicture pictq[VIDEO_PICTURE_QUEUE_SIZE]; int pictq_size, pictq_rindex, pictq_windex; SDL_mutex *pictq_mutex; SDL_cond *pictq_cond; SDL_Thread *parse_tid; SDL_Thread *video_tid; char filename[1024]; int quit;} VideoState;Here we see a glimpse of what we're going to get to. First we see the basic information - the format context and the indices of the audio and video stream, and the corresponding AVStream objects. Then we can see that we've moved some of those audio buffers into this structure. These (audio_buf, audio_buf_size, etc.) were all for information about audio that was still lying around (or the lack thereof). We've added another queue for the video, and a buffer (which will be used as a queue; we don't need any fancy queueing stuff for this) for the decoded frames (saved as an overlay). The VideoPicture struct is of our own creations (we'll see what's in it when we come to it). We also notice that we've allocated pointers for the two extra threads we will create, and the quit flag and the filename of the movie.

So now we take it all the way back to the main function to see how this changes our program. Let's set up our VideoState struct:

int main(int argc, char *argv[]) { SDL_Event event; VideoState *is; is = av_mallocz(sizeof(VideoState));av_mallocz() is a nice function that will allocate memory for us and zero it out.

Then we'll initialize our locks for the display buffer (pictq), because since the event loop calls our display function - the display function, remember, will be pulling pre-decoded frames from pictq. At the same time, our video decoder will be putting information into it - we don't know who will get there first. Hopefully you recognize that this is a classic race condition. So we allocate it now before we start any threads. Let's also copy the filename of our movie into our VideoState.

pstrcpy(is->filename, sizeof(is->filename), argv[1]);is->pictq_mutex = SDL_CreateMutex();is->pictq_cond = SDL_CreateCond();pstrcpy is a function from ffmpeg that does some extra bounds checking beyond strncpy.

Our First Thread

Now let's finally launch our threads and get the real work done:

schedule_refresh(is, 40);is->parse_tid = SDL_CreateThread(decode_thread, is);if(!is->parse_tid) { av_free(is); return -1;}schedule_refresh is a function we will define later. What it basically does is tell the system to push a FF_REFRESH_EVENT after the specified number of milliseconds. This will in turn call the video refresh function when we see it in the event queue. But for now, let's look at SDL_CreateThread().

SDL_CreateThread() does just that - it spawns a new thread that has complete access to all the memory of the original process, and starts the thread running on the function we give it. It will also pass that function user-defined data. In this case, we're calling decode_thread() and with our VideoState struct attached. The first half of the function has nothing new; it simply does the work of opening the file and finding the index of the audio and video streams. The only thing we do different is save the format context in our big struct. After we've found our stream indices, we call another function that we will define, stream_component_open(). This is a pretty natural way to split things up, and since we do a lot of similar things to set up the video and audio codec, we reuse some code by making this a function.

The stream_component_open() function is where we will find our codec decoder, set up our audio options, save important information to our big struct, and launch our audio and video threads. This is where we would also insert other options, such as forcing the codec instead of autodetecting it and so forth. Here it is:

int stream_component_open(VideoState *is, int stream_index) { AVFormatContext *pFormatCtx = is->pFormatCtx; AVCodecContext *codecCtx; AVCodec *codec; SDL_AudioSpec wanted_spec, spec; if(stream_index < 0 || stream_index >= pFormatCtx->nb_streams) { return -1; } // Get a pointer to the codec context for the video stream codecCtx = pFormatCtx->streams[stream_index]->codec; if(codecCtx->codec_type == CODEC_TYPE_AUDIO) { // Set audio settings from codec info wanted_spec.freq = codecCtx->sample_rate; /* .... */ wanted_spec.callback = audio_callback; wanted_spec.userdata = is; if(SDL_OpenAudio(&wanted_spec, &spec) < 0) { fprintf(stderr, "SDL_OpenAudio: %s\n", SDL_GetError()); return -1; } } codec = avcodec_find_decoder(codecCtx->codec_id); if(!codec || (avcodec_open(codecCtx, codec) < 0)) { fprintf(stderr, "Unsupported codec!\n"); return -1; } switch(codecCtx->codec_type) { case CODEC_TYPE_AUDIO: is->audioStream = stream_index; is->audio_st = pFormatCtx->streams[stream_index]; is->audio_buf_size = 0; is->audio_buf_index = 0; memset(&is->audio_pkt, 0, sizeof(is->audio_pkt)); packet_queue_init(&is->audioq); SDL_PauseAudio(0); break; case CODEC_TYPE_VIDEO: is->videoStream = stream_index; is->video_st = pFormatCtx->streams[stream_index]; packet_queue_init(&is->videoq); is->video_tid = SDL_CreateThread(video_thread, is); break; default: break; }}This is pretty much the same as the code we had before, except now it's generalized for audio and video. Notice that instead of aCodecCtx, we've set up our big struct as the userdata for our audio callback. We've also saved the streams themselves as audio_st and video_st. We also have added our video queue and set it up in the same way we set up our audio queue. Most of the point is to launch the video and audio threads. These bits do it:

SDL_PauseAudio(0); break;/* ...... */ is->video_tid = SDL_CreateThread(video_thread, is);We remember SDL_PauseAudio() from last time, and SDL_CreateThread() is used as in the exact same way as before. We'll get back to our video_thread() function.

Before that, let's go back to the second half of our decode_thread() function. It's basically just a for loop that will read in a packet and put it on the right queue:

for(;;) { if(is->quit) { break; } // seek stuff goes here if(is->audioq.size > MAX_AUDIOQ_SIZE || is->videoq.size > MAX_VIDEOQ_SIZE) { SDL_Delay(10); continue; } if(av_read_frame(is->pFormatCtx, packet) < 0) { if(url_ferror(&pFormatCtx->pb) == 0) { SDL_Delay(100); /* no error; wait for user input */ continue; } else { break; } } // Is this a packet from the video stream? if(packet->stream_index == is->videoStream) { packet_queue_put(&is->videoq, packet); } else if(packet->stream_index == is->audioStream) { packet_queue_put(&is->audioq, packet); } else { av_free_packet(packet); } }Nothing really new here, except that we now have a max size for our audio and video queue, and we've added a function that will check for read errors. The format context has a ByteIOContext struct inside it called pb. ByteIOContext is the structure that basically keeps all the low-level file information in it. url_ferror checks that structure to see if there was some kind of error reading from our file.

After our for loop, we have all the code for waiting for the rest of the program to end or informing it that we've ended. This code is instructive because it shows us how we push events - something we'll have to later to display the video.

while(!is->quit) { SDL_Delay(100); } fail: if(1){ SDL_Event event; event.type = FF_QUIT_EVENT; event.user.data1 = is; SDL_PushEvent(&event); } return 0;We get values for user events by using the SDL constant SDL_USEREVENT. The first user event should be assigned the value SDL_USEREVENT, the next SDL_USEREVENT + 1, and so on. FF_QUIT_EVENT is defined in our program as SDL_USEREVENT + 2. We can also pass user data if we like, too, and here we pass our pointer to the big struct. Finally we call SDL_PushEvent(). In our event loop switch, we just put this by the SDL_QUIT_EVENT section we had before. We'll see our event loop in more detail; for now, just be assured that when we push the FF_QUIT_EVENT, we'll catch it later and raise our quit flag.

Getting the Frame: video_thread

After we have our codec prepared, we start our video thread. This thread reads in packets from the video queue, decodes the video into frames, and then calls a queue_picture function to put the processed frame onto a picture queue:

int video_thread(void *arg) { VideoState *is = (VideoState *)arg; AVPacket pkt1, *packet = &pkt1; int len1, frameFinished; AVFrame *pFrame; pFrame = avcodec_alloc_frame(); for(;;) { if(packet_queue_get(&is->videoq, packet, 1) < 0) { // means we quit getting packets break; } // Decode video frame len1 = avcodec_decode_video(is->video_st->codec, pFrame, &frameFinished, packet->data, packet->size); // Did we get a video frame? if(frameFinished) { if(queue_picture(is, pFrame) < 0) { break; } } av_free_packet(packet); } av_free(pFrame); return 0;}Most of this function should be familiar by this point. We've moved our avcodec_decode_video function here, just replaced some of the arguments; for example, we have the AVStream stored in our big struct, so we get our codec from there. We just keep getting packets from our video queue until someone tells us to quit or we encounter an error.

Queueing the Frame

Let's look at the function that stores our decoded frame, pFrame in our picture queue. Since our picture queue is an SDL overlay (presumably to allow the video display function to have as little calculation as possible), we need to convert our frame into that. The data we store in the picture queue is a struct of our making:

typedef struct VideoPicture { SDL_Overlay *bmp; int width, height; /* source height & width */ int allocated;} VideoPicture;Our big struct has a buffer of these in it where we can store them. However, we need to allocate the SDL_Overlay ourselves (notice the allocated flag that will indicate whether we have done so or not).

To use this queue, we have two pointers - the writing index and the reading index. We also keep track of how many actual pictures are in the buffer. To write to the queue, we're going to first wait for our buffer to clear out so we have space to store our VideoPicture. Then we check and see if we have already allocated the overlay at our writing index. If not, we'll have to allocate some space. We also have to reallocate the buffer if the size of the window has changed! However, instead of allocating it here, to avoid locking issues. (I'm still not quite sure why; I believe it's to avoid calling the SDL overlay functions in different threads.)

int queue_picture(VideoState *is, AVFrame *pFrame) { VideoPicture *vp; int dst_pix_fmt; AVPicture pict; /* wait until we have space for a new pic */ SDL_LockMutex(is->pictq_mutex); while(is->pictq_size >= VIDEO_PICTURE_QUEUE_SIZE && !is->quit) { SDL_CondWait(is->pictq_cond, is->pictq_mutex); } SDL_UnlockMutex(is->pictq_mutex); if(is->quit) return -1; // windex is set to 0 initially vp = &is->pictq[is->pictq_windex]; /* allocate or resize the buffer! */ if(!vp->bmp || vp->width != is->video_st->codec->width || vp->height != is->video_st->codec->height) { SDL_Event event; vp->allocated = 0; /* we have to do it in the main thread */ event.type = FF_ALLOC_EVENT; event.user.data1 = is; SDL_PushEvent(&event); /* wait until we have a picture allocated */ SDL_LockMutex(is->pictq_mutex); while(!vp->allocated && !is->quit) { SDL_CondWait(is->pictq_cond, is->pictq_mutex); } SDL_UnlockMutex(is->pictq_mutex); if(is->quit) { return -1; } }The event mechanism here is the same one we saw earlier when we wanted to quit. We've defined FF_ALLOC_EVENT as SDL_USEREVENT. We push the event and then wait on the conditional variable for the allocation function to run.

Let's look at how we change our event loop:

for(;;) { SDL_WaitEvent(&event); switch(event.type) {/* ... */ case FF_ALLOC_EVENT: alloc_picture(event.user.data1); break;Remember that event.user.data1 is our big struct. That was simple enough. Let's look at the alloc_picture() function:

void alloc_picture(void *userdata) { VideoState *is = (VideoState *)userdata; VideoPicture *vp; vp = &is->pictq[is->pictq_windex]; if(vp->bmp) { // we already have one make another, bigger/smaller SDL_FreeYUVOverlay(vp->bmp); } // Allocate a place to put our YUV image on that screen vp->bmp = SDL_CreateYUVOverlay(is->video_st->codec->width, is->video_st->codec->height, SDL_YV12_OVERLAY, screen); vp->width = is->video_st->codec->width; vp->height = is->video_st->codec->height; SDL_LockMutex(is->pictq_mutex); vp->allocated = 1; SDL_CondSignal(is->pictq_cond); SDL_UnlockMutex(is->pictq_mutex);}You should recognize the SDL_CreateYUVOverlay function that we've moved from our main loop to this section. This code should be fairly self-explanitory by now. Remember that we save the width and height in the VideoPicture structure because we need to make sure that our video size doesn't change for some reason.

Okay, we're all settled and we have our YUV overlay allocated and ready to receive a picture. Let's go back to queue_picture and look at the code to copy the frame into the overlay. You should recognize that part of it:

int queue_picture(VideoState *is, AVFrame *pFrame) { /* Allocate a frame if we need it... */ /* ... */ /* We have a place to put our picture on the queue */ if(vp->bmp) { SDL_LockYUVOverlay(vp->bmp); dst_pix_fmt = PIX_FMT_YUV420P; /* point pict at the queue */ pict.data[0] = vp->bmp->pixels[0]; pict.data[1] = vp->bmp->pixels[2]; pict.data[2] = vp->bmp->pixels[1]; pict.linesize[0] = vp->bmp->pitches[0]; pict.linesize[1] = vp->bmp->pitches[2]; pict.linesize[2] = vp->bmp->pitches[1]; // Convert the image into YUV format that SDL uses img_convert(&pict, dst_pix_fmt, (AVPicture *)pFrame, is->video_st->codec->pix_fmt, is->video_st->codec->width, is->video_st->codec->height); SDL_UnlockYUVOverlay(vp->bmp); /* now we inform our display thread that we have a pic ready */ if(++is->pictq_windex == VIDEO_PICTURE_QUEUE_SIZE) { is->pictq_windex = 0; } SDL_LockMutex(is->pictq_mutex); is->pictq_size++; SDL_UnlockMutex(is->pictq_mutex); } return 0;}The majority of this part is simply the code we used earlier to fill the YUV overlay with our frame. The last bit is simply "adding" our value onto the queue. The queue works by adding onto it until it is full, and reading from it as long as there is something on it. Therefore everything depends upon the is->pictq_size value, requiring us to lock it. So what we do here is increment the write pointer (and rollover if necessary), then lock the queue and increase its size. Now our reader will know there is more information on the queue, and if this makes our queue full, our writer will know about it.

Displaying the Video

That's it for our video thread! Now we've wrapped up all the loose threads except for one — remember that we called the schedule_refresh() function way back? Let's see what that actually did:

/* schedule a video refresh in 'delay' ms */static void schedule_refresh(VideoState *is, int delay) { SDL_AddTimer(delay, sdl_refresh_timer_cb, is);}SDL_AddTimer() is an SDL function that simply makes a callback to the user-specfied function after a certain number of milliseconds (and optionally carrying some user data). We're going to use this function to schedule video updates - every time we call this function, it will set the timer, which will trigger an event, which will have our main() function in turn call a function that pulls a frame from our picture queue and displays it! Phew!

But first thing's first. Let's trigger that event. That sends us over to:

static Uint32 sdl_refresh_timer_cb(Uint32 interval, void *opaque) { SDL_Event event; event.type = FF_REFRESH_EVENT; event.user.data1 = opaque; SDL_PushEvent(&event); return 0; /* 0 means stop timer */}Here is the now-familiar event push. FF_REFRESH_EVENT is defined here as SDL_USEREVENT + 1. One thing to notice is that when we return 0, SDL stops the timer so the callback is not made again.

Now we've pushed an FF_REFRESH_EVENT, we need to handle it in our event loop:

for(;;) { SDL_WaitEvent(&event); switch(event.type) { /* ... */ case FF_REFRESH_EVENT: video_refresh_timer(event.user.data1); break;and that sends us to this function, which will actually pull the data from our picture queue:

void video_refresh_timer(void *userdata) { VideoState *is = (VideoState *)userdata; VideoPicture *vp; if(is->video_st) { if(is->pictq_size == 0) { schedule_refresh(is, 1); } else { vp = &is->pictq[is->pictq_rindex]; /* Timing code goes here */ schedule_refresh(is, 80); /* show the picture! */ video_display(is); /* update queue for next picture! */ if(++is->pictq_rindex == VIDEO_PICTURE_QUEUE_SIZE) { is->pictq_rindex = 0; } SDL_LockMutex(is->pictq_mutex); is->pictq_size--; SDL_CondSignal(is->pictq_cond); SDL_UnlockMutex(is->pictq_mutex); } } else { schedule_refresh(is, 100); }}For now, this is a pretty simple function: it pulls from the queue when we have something, sets our timer for when the next video frame should be shown, calls video_display to actually show the video on the screen, then increments the counter on the queue, and decreases its size. You may notice that we don't actually do anything with vp in this function, and here's why: we will. Later. We're going to use it to access timing information when we start syncing the video to the audio. See where it says "timing code here"? In that section, we're going to figure out how soon we should show the next video frame, and then input that value into the schedule_refresh() function. For now we're just putting in a dummy value of 80. Technically, you could guess and check this value, and recompile it for every movie you watch, but 1) it would drift after a while and 2) it's quite silly. We'll come back to it later, though.

We're almost done; we just have one last thing to do: display the video! Here's that video_display function:

void video_display(VideoState *is) { SDL_Rect rect; VideoPicture *vp; AVPicture pict; float aspect_ratio; int w, h, x, y; int i; vp = &is->pictq[is->pictq_rindex]; if(vp->bmp) { if(is->video_st->codec->sample_aspect_ratio.num == 0) { aspect_ratio = 0; } else { aspect_ratio = av_q2d(is->video_st->codec->sample_aspect_ratio) * is->video_st->codec->width / is->video_st->codec->height; } if(aspect_ratio <= 0.0) { aspect_ratio = (float)is->video_st->codec->width / (float)is->video_st->codec->height; } h = screen->h; w = ((int)rint(h * aspect_ratio)) & -3; if(w > screen->w) { w = screen->w; h = ((int)rint(w / aspect_ratio)) & -3; } x = (screen->w - w) / 2; y = (screen->h - h) / 2; rect.x = x; rect.y = y; rect.w = w; rect.h = h; SDL_DisplayYUVOverlay(vp->bmp, &rect); }}Since our screen can be of any size (we set ours to 640x480 and there are ways to set it so it is resizable by the user), we need to dynamically figure out how big we want our movie rectangle to be. So first we need to figure out our movie's aspect ratio, which is just the width divided by the height. Some codecs will have an odd sample aspect ratio, which is simply the width/height radio of a single pixel, or sample. Since the height and width values in our codec context are measured in pixels, the actual aspect ratio is equal to the aspect ratio times the sample aspect ratio. Some codecs will show an aspect ratio of 0, and this indicates that each pixel is simply of size 1x1. Then we scale the movie to fit as big in our screen as we can. The & -3 bit-twiddling in there simply rounds the value to the nearest multiple of 4. Then we center the movie, and call SDL_DisplayYUVOverlay().

So is that it? Are we done? Well, we still have to rewrite the audio code to use the new VideoStruct, but those are trivial changes, and you can look at those in the sample code. The last thing we have to do is to change our callback for ffmpeg's internal "quit" callback function:

VideoState *global_video_state;int decode_interrupt_cb(void) { return (global_video_state && global_video_state->quit);}We set global_video_state to the big struct in main().

So that's it! Go ahead and compile it:

gcc -o tutorial04 tutorial04.c -lavutil -lavformat -lavcodec -lz -lm \`sdl-config --cflags --libs`and enjoy your unsynced movie! Next time we'll finally build a video player that actually works!

>> Syncing Video

评论(?)
阅读(?)
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
网易公司版权所有 ©1997-2009