Basic Syntax of FFmpeg

Oct 16, 2021

In the previous essay, we covered the basic concepts of media files. Now we can start solving some real-world problems with the command-line tool ffmpeg.

Overview

Before we get our hands on the actual commands, let us review how FFmpeg processes media files:

 _______              ______________
|       |            |              |
| input |  demuxer   | encoded data |   decoder
| file  | ---------> | packets      | -----+
|_______|            |______________|      |
                                           v
                                       _________
                                      |         |
                                      | decoded |
                                      | frames  |
                                      |_________|
 ________             ______________       |
|        |           |              |      |
| output | <-------- | encoded data | <----+
| file   |   muxer   | packets      |   encoder
|________|           |______________|

The following description is quoted from FFmpeg - Documentation:

ffmpeg calls the libavformat library (containing demuxers) to read input files and get packets containing encoded data from them. When there are multiple input files, ffmpeg tries to keep them synchronized by tracking lowest timestamp on any active input stream.

Encoded packets are then passed to the decoder (unless streamcopy is selected for the stream, see further for a description).

The decoder produces uncompressed frames (raw video/PCM audio/…) which can be processed further by filtering (see next section).

After filtering, the frames are passed to the encoder, which encodes them and outputs encoded packets.

Finally those are passed to the muxer, which writes the encoded packets to the output file.

Synopsis

An ffmpeg command can be roughly divided into three parts: global options, input files with their options, and output files with their options. The template looks like this:

ffmpeg [global_options] \
	{[input_file_options] -i input_url} ... \
	{[output_file_options] output_url} ...

In the template, the brackets ([, ], {, }) are only notation and do not appear in actual commands: square brackets mark optional parts, and curly braces group a part that may be repeated. The backslash \ at the end of a line tells the shell (the program that users interact with through commands in the terminal window) that the command continues on the next line.
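To make the template concrete, here is a minimal sketch that fills in each of the three parts. The file names, the timestamps and the particular options (-hide_banner, -y, -ss, -t) are assumptions chosen only for illustration. The words of the command are stored with set -- so that they can be counted and printed before anything runs.

```shell
# Hypothetical file names and timestamps, chosen only for illustration.
input=input.mp4
output=clip.mkv

# set -- stores the words of the command in "$@" so they can be inspected.
#   global options:           -hide_banner -y
#   input options + input:    -ss 5 -i "$input"        (start reading at 5 seconds)
#   output options + output:  -t 10 -c copy "$output"  (write 10 seconds, no re-encoding)
set -- ffmpeg -hide_banner -y \
    -ss 5 -i "$input" \
    -t 10 -c copy "$output"

# The backslash-newline pairs vanish: the shell sees one command of 12 words.
echo "$# words: $*"

# Run the command only when ffmpeg and the input file are actually present.
if command -v ffmpeg >/dev/null 2>&1 && [ -f "$input" ]; then
    "$@"
fi
```

Note that the same option can mean different things depending on its position: placed before -i it applies to that input, placed before an output name it applies to that output.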

Explanation

The ... in the command template indicates that there can be any number of input or output files, not just one.
- Although the command syntax allows an arbitrary number of input and output files, the output container format may limit the number and types of streams it can hold. (e.g. a WAV file can hold exactly one audio stream and no video.)
A “file” here can be one of several types, including regular files, pipes, network streams, grabbing devices and so on. Every input file is specified with -i.
Anything on the command line that cannot be interpreted as an option is considered to be an output file (shown as output_url in the template).
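The points above can be sketched with two small shell functions, one with several inputs and one with several outputs. The function names and file names are hypothetical. The -map option is used here as one way to pick streams explicitly; without it, ffmpeg chooses streams automatically.

```shell
# Two inputs, one output: take the video from input 0 and the audio from input 1.
# Usage: merge_av VIDEO AUDIO OUTPUT
merge_av() {
    ffmpeg -i "$1" -i "$2" -map 0:v -map 1:a -c copy "$3"
}

# One input, two outputs: output options apply only to the output that
# follows them, so -vn (no video) affects only the second file.
# Usage: copy_and_extract_audio INPUT FULL_OUTPUT AUDIO_OUTPUT
copy_and_extract_audio() {
    ffmpeg -i "$1" \
        -c copy "$2" \
        -vn -c:a copy "$3"
}

# Hypothetical file names; run only when ffmpeg and both inputs exist.
if command -v ffmpeg >/dev/null 2>&1 && [ -f video.mp4 ] && [ -f audio.aac ]; then
    merge_av video.mp4 audio.aac merged.mkv
    copy_and_extract_audio merged.mkv full_copy.mkv audio_only.mka
fi
```

Every name that follows -i is read as an input, and every remaining non-option word (merged.mkv, full_copy.mkv, audio_only.mka) is treated as an output.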

Container vs. Encoding vs. Postfix

Container
- The file format that stores the media streams. The “mux” step in the flow chart writes the encoded streams into the container. The “demux” step in the flow chart reads the encoded streams out of the container.
- Example: mp4, mkv, aac, mp3
Encoding
- The algorithm used to compress the media data. The “encode” step in the flow chart compresses the decoded frames with this algorithm. The “decode” step in the flow chart decompresses the encoded packets back into frames.
- Example: H.264, H.265, VP9, AAC, ALAC, FLAC
Postfix
- The suffix of the file name (the file extension), which tells the operating system what file format to expect. It is only a hint, not a guarantee, since users can change the “Open with” application used to open a file.
- Example: .txt, .jpg, .mp4, .mkv

As we can see, the names used for these three terms may overlap, but what they actually mean is very different.

For instance, an audio stream encoded with AAC can be saved in an aac container, in an mp4 container, or in an mkv container.

Furthermore, users can change the postfix of that audio file to whatever they like (even to .txt), but this will not change the actual content of the file. As long as users open the file with the right application, they can still listen to the audio stored in it.
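One way to see the difference for yourself is to ask ffprobe, the inspection tool shipped with FFmpeg, what is actually inside a file. The sketch below is only an illustration: the function name show_format and the file names are hypothetical. It prints the detected container format and the codec of each stream as key=value lines such as codec_name=aac.

```shell
# Print the container format and the codec of each stream in a file.
# ffprobe inspects the content of the file rather than trusting its name.
# Usage: show_format FILE
show_format() {
    ffprobe -v error \
        -show_entries format=format_name:stream=codec_type,codec_name \
        -of default=noprint_wrappers=1 "$1"
}

# Hypothetical file: the copy gets a misleading postfix, but its content
# is identical, so both calls should report the same container and codec.
if command -v ffprobe >/dev/null 2>&1 && [ -f song.mp4 ]; then
    cp song.mp4 renamed_song.txt
    show_format song.mp4
    show_format renamed_song.txt
fi
```

Here format_name describes the container, codec_name describes the encoding, and the postfix plays no part in either value.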

Quick Start

Now that we know the difference between container, encoding and postfix, the commands below will be easier to understand. If we want to use an mp4 container to store audio that is currently stored in an aac container, the command looks like this:

ffmpeg -i original_audio.aac -c copy audio_in_new_container.mp4

The -c copy here stands for “Don’t change the encoding of the original file, just copy the encoded data directly into the new container”.

Consequently, in this scenario, we only performed the “demux” and “mux” steps, without any transcoding work.
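For contrast, the sketch below puts the stream copy next to a command that does go through the decoder and encoder. The function names and the output name audio_transcoded.flac are hypothetical, and FLAC is picked only as an example target encoding. It assumes an ffmpeg build that includes the FLAC encoder.

```shell
# Remux: copy the encoded packets unchanged into a new container.
# Usage: remux INPUT OUTPUT
remux() {
    ffmpeg -i "$1" -c copy "$2"
}

# Transcode: decode the audio and re-encode it, here with the FLAC encoder.
# Usage: transcode_to_flac INPUT OUTPUT
transcode_to_flac() {
    ffmpeg -i "$1" -c:a flac "$2"
}

# Run only when ffmpeg and the input file are actually present.
if command -v ffmpeg >/dev/null 2>&1 && [ -f original_audio.aac ]; then
    remux original_audio.aac audio_in_new_container.mp4
    transcode_to_flac original_audio.aac audio_transcoded.flac
fi
```

The first function skips the “decode” and “encode” boxes of the flow chart entirely, so it is fast and lossless with respect to the already encoded data. The second one walks through every box.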

Reference

FFmpeg - Documentation