Skip to content

FFmpeg Audio/Video Format Deep Dive: Savoring Media Like a Gourmet

Tip: If you have FFmpeg installed on your computer, you can use it directly from the command line. If not, download a pre-compiled FFmpeg toolkit. For example, the Windows pre-packaged FFmpeg directory in pyVideoTrans already includes the ffmpeg.exe file. You can navigate to this folder, type cmd in the address bar, and press Enter to execute ffmpeg commands. For instance, type ffmpeg -h to view the FFmpeg help information.

Furthermore, for easier access, add the FFmpeg installation directory to your system's environment variables. To do this, right-click "This PC," select "Properties," and then click "Advanced system settings." In the "System Properties" window, click the "Environment Variables" button. In the "System variables" section, find the variable named "Path," select it, and click "Edit." In the pop-up window, click "New," then browse to and add the FFmpeg installation directory. This allows you to open a CMD or PowerShell window from any location and directly execute ffmpeg commands, including the examples below.

I. Video Formats: The "Common Languages" and "Dialects" of the Video World

Video formats are like different languages; they all convey video information, but their encoding and compression methods vary. Choosing the right video format is like choosing the right language to communicate, enabling you to express your ideas more effectively.

  1. MP4 (.mp4): The "Mandarin" of Video, King of Compatibility

    • Characteristics: MP4 is currently the most widely used and compatible video format, supported by almost all devices and platforms. Like Mandarin, it allows you to communicate smoothly everywhere. Therefore, MP4 is the preferred format for video sharing.

    • Encoding: Common video codecs for MP4 are H.264 (AVC) and H.265 (HEVC). H.264 is well-established with excellent compatibility, while H.265 is a more advanced codec that can compress files smaller while maintaining the same quality.

    • FFmpeg Usage:

      • Convert other video formats to H.264 encoded MP4:

        bash
        ffmpeg -i input.avi -c:v libx264 output.mp4
      • Convert other video formats to H.265 encoded MP4:

        bash
        ffmpeg -i input.avi -c:v libx265 output.mp4
        • -c:v: This parameter specifies the video encoder. libx264 is the H.264 encoder, and libx265 is the H.265 encoder. FFmpeg supports various encoders, allowing you to choose the appropriate one based on your needs.
        • -preset slow: This parameter controls the encoding speed and quality. Slower encoding speeds usually result in better video quality. slow is a choice that balances quality and speed. Other options include ultrafast (fastest, but lower quality) to veryslow (slowest, but highest quality). Generally, medium and slow are common choices.
        • -crf 23: This parameter controls the video quality. It typically ranges from 18-28; the smaller the value, the better the video quality, but the larger the file size. This parameter is crucial when using libx264 and libx265 encoders, helping you balance video quality and file size.
    • Suitable Scenarios: Online video, mobile devices, general storage. MP4 is suitable for almost all video-related scenarios. Whether uploading to video websites or watching on mobile phones, MP4 is a good choice.

  2. AVI (.avi): An Old "Dialect," Historically Significant but Slightly Outdated

    • Characteristics: AVI is a historically significant video format, but its compression efficiency is relatively low, resulting in larger file sizes. Like photos taken with old cameras, they are classic but may lack the clarity of modern cameras.

    • Encoding: AVI can contain various video codecs, such as DivX, Xvid, etc. AVI is like a container that can load different encoding methods.

    • FFmpeg Usage:

      • Convert to other formats: Due to AVI's low compression efficiency, it is often converted to more modern formats like MP4.

        bash
        ffmpeg -i input.avi -c:v libx264 output.mp4
      • Repair AVI files: Sometimes, AVI files may be corrupted, and FFmpeg can be used to attempt repair.

        bash
        ffmpeg -i input.avi -c copy -copyts output.avi
        • -c copy: This parameter directly copies the video and audio streams without re-encoding, making it very fast. This method is suitable when only the container is damaged, but the encoding itself is fine.
  3. MKV (.mkv): A Powerful "Universal Container," All-Encompassing

    • Characteristics: MKV is a very flexible video format that can accommodate multiple video, audio, and subtitle tracks. It is like a powerful "shipping container" that can package various elements together. Therefore, MKV is also known as the "universal container."

    • Encoding: MKV supports various video codecs, such as H.264, H.265, etc., and also supports various audio codecs, such as AAC, MP3, etc. MKV is highly inclusive, supporting almost all common encoding methods.

    • FFmpeg Usage:

      • Extract video or audio from MKV files: Extract video or audio streams from MKV format.

        bash
        ffmpeg -i input.mkv -c:v copy -an video.mp4   # Extract video
        ffmpeg -i input.mkv -c:a copy -vn audio.aac   # Extract audio
        • -an: This parameter removes audio.
        • -vn: This parameter removes video.
      • Merge video and audio into an MKV file: Merge video and audio into a single MKV file.

        bash
        ffmpeg -i video.mp4 -i audio.aac -c copy output.mkv
  4. MOV (.mov): Apple's "Official Language," A High-Quality Choice

    • Characteristics: MOV is a video format developed by Apple, commonly used in QuickTime Player and macOS. MOV typically offers high video quality, like Apple phones, known for their sophistication and high quality.

    • Encoding: The common video codec for MOV is H.264.

    • FFmpeg Usage:

      • Convert between other formats: Similar to MP4, MOV can be easily converted to other formats.

        bash
        ffmpeg -i input.mov -c:v libx264 output.mp4
  5. WebM (.webm): The "Common Language" of the Internet, Open-Source and Free

    • Characteristics: WebM is an open-source, free video format designed specifically for online video, suitable for playback in HTML5 web pages. With WebM, you can play videos on web pages without installing any plugins.

    • Encoding: Common video codecs for WebM are VP8 and VP9. VP9 is a more efficient compression method, making it the recommended codec for WebM.

    • FFmpeg Usage:

      • Convert other video formats to VP9 encoded WebM format:

        bash
        ffmpeg -i input.mp4 -c:v libvpx-vp9 -crf 30 -b:v 0 output.webm
        • -b:v 0: This parameter uses quality-based variable bitrate (VBR) encoding. This encoding method automatically adjusts the bitrate based on the complexity of the video content, minimizing file size while ensuring video quality. The -crf parameter is also applicable to this encoding method.

II. Audio Formats: Let Your Ears Enjoy "Heavenly Music," Choose the Most Suitable Sound

Audio formats determine the sound quality, file size, and compatibility. Like choosing different audio equipment, different audio formats can provide different auditory experiences.

  1. MP3 (.mp3): The "Mandarin" of Audio, Can Be Heard Anywhere

    • Characteristics: MP3 is a highly compressed, small-sized, and compatible audio format, supported by almost all devices. Like the "Mandarin" of audio, it can be played on any device. However, MP3 format sacrifices some sound quality.

    • FFmpeg Usage:

      • Convert other audio formats to MP3 format:

        bash
        ffmpeg -i input.wav -c:a libmp3lame -q:a 4 output.mp3
        • -c:a: This parameter specifies the audio encoder. libmp3lame is a commonly used MP3 encoder.
        • -q:a: This parameter specifies the audio quality, ranging from 0-9. The smaller the value, the higher the audio quality, but the larger the file size. This parameter controls the compression level of the MP3.
  2. AAC (.aac): An Upgraded Version of MP3, Better Sound Quality, Smaller Size

    • Characteristics: AAC is an audio format with better sound quality and higher compression efficiency than MP3. It is a common audio encoding method for MP4 videos. Like a high-definition photo, it has richer details and more vivid colors.

    • FFmpeg Usage:

      • Convert other audio formats to AAC format:

        bash
        ffmpeg -i input.wav -c:a aac -b:a 128k output.aac
        • -b:a: This parameter specifies the audio bitrate. 128kbps is a common choice. The higher the bitrate, the better the sound quality, but the larger the file size.
  3. WAV (.wav): Preserves "Original Flavor," Sound Quality is Paramount

    • Characteristics: WAV is a lossless format that retains all audio information, resulting in the best sound quality. Like an unprocessed original photo, it retains all details, but the file size is also the largest.

    • FFmpeg Usage:

      • Convert between other formats: WAV is often used as high-quality original audio for converting to other formats.

        bash
        ffmpeg -i input.mp3 output.wav
  4. FLAC (.flac): A "Master" of Lossless Compression, Balancing Size and Quality

    • Characteristics: FLAC is a lossless compression format that guarantees sound quality while reducing file size. Therefore, FLAC is the first choice for audiophiles. Like a losslessly compressed photo, it retains all details while reducing file size.

    • FFmpeg Usage:

      • Convert other audio formats to FLAC format:

        bash
        ffmpeg -i input.wav -c:a flac output.flac
  5. OGG (.ogg): An Open-Source "Rising Star," Ideal Choice for Network Streaming

    • Characteristics: OGG is an open-source, free audio format with good sound quality, suitable for network streaming. OGG can be used without any authorization. Like open-source software, it is free and powerful. The commonly used OGG codec is Vorbis.

    • FFmpeg Usage:

      • Convert other audio formats to OGG format, using Vorbis encoding:

        bash
        ffmpeg -i input.wav -c:a libvorbis -q:a 5 output.ogg

III. Common FFmpeg Parameters: Simplify Complexity, Master the Core of Audio and Video Processing

FFmpeg's power lies in its flexibility, allowing you to control the audio and video processing through various parameters. Mastering these parameters is like being a skilled chef, cooking up various delicious audio and video dishes.

  • -i input.xxx: Specify the input file. input.xxx is the path to the audio or video file you want to process.
  • output.xxx: Specify the output file. output.xxx is the path to the file generated after FFmpeg processing.
  • -c:v: Specify the video encoder, such as libx264, libx265, libvpx-vp9, etc. Choosing the appropriate encoder can control the video's compression efficiency and compatibility.
  • -c:a: Specify the audio encoder, such as libmp3lame, aac, libvorbis, etc. Choosing the appropriate audio encoder can control the audio's quality and file size.
  • -b:v: Specify the video bitrate, such as 2000k (2000kbps), 5M (5Mbps), etc. The higher the bitrate, the better the video quality, but the larger the file size.
  • -b:a: Specify the audio bitrate, such as 128k (128kbps), 192k (192kbps), etc. The higher the bitrate, the better the audio quality, but the larger the file size.
  • -r: Specify the frame rate (frames per second), controlling the smoothness of the video. Generally, movies have a frame rate of 24fps, and TV shows have a frame rate of 30fps.
  • -s: Specify the resolution, such as 1280x720 (720p), 640x480 (480p), etc. The higher the resolution, the clearer the video.
  • -ss: Specify the start time, such as 00:00:10 to start processing from the 10th second. Can be used to trim video clips.
  • -t: Specify the duration, such as 00:00:05 for a duration of 5 seconds. Can be used to trim video clips.
  • -vn: Disable video, processing only audio.
  • -an: Disable audio, processing only video.
  • -filter:v: Add video filters, such as scale=640:480 to change the resolution. FFmpeg provides rich filters for various video processing.
  • -filter:a: Add audio filters for various audio processing, such as adjusting volume, noise reduction, etc.
  • -threads: Specify the number of threads to increase encoding speed. Multi-threading can fully utilize CPU performance and accelerate encoding speed.

IV. -c copy: The Magic of Copying Streams, Experience the Speed of "Instant Transfer"

  • -c copy: This parameter directly copies the original video or audio stream without any re-encoding. Like copying files, it is very fast and does not lose any quality. However, using -c copy requires meeting specific conditions.

When can -c copy be used? Like playing a jigsaw puzzle, you can only complete it successfully if the shapes and sizes of the pieces match.

The core of -c copy is "copying," so it can only be used when the input and output formats are compatible, and you don't need to change the encoding method. Specifically:

  1. Change the container format, but the internal encoding remains the same: Like changing a packaging box, but the contents inside remain the same.

    • For example, extracting a video stream from input.mp4 to output.mkv. If the video encoding in input.mp4 is already H.264, and you only want to put the H.264 video into the MKV container, you can use -c copy.

    • Another example is converting an MP4 file to a TS file, which only changes the encapsulation format.

      bash
      ffmpeg -i input.mp4 -c copy output.ts
  2. Extract video or audio streams: Like cutting a piece from a large cake, you only take out a portion without changing its shape.

    • Extract audio from a video file, and the audio encoding does not need to be changed (e.g., both are AAC).

    • For example, extracting H.264 encoded video from input.mkv to output.mp4, and the output.mp4 container format can also well support H.264 encoding.

      bash
      ffmpeg -i input.mkv -c:v copy -an output.mp4   # Extract video
      ffmpeg -i input.mkv -c:a copy -vn output.aac   # Extract audio
  3. Repair files: Like repairing a broken packaging box, the contents inside are not damaged and only need to be repackaged. (In some cases, file corruption is only a container-level issue).

    bash
    ffmpeg -i input.avi -c copy -copyts output.avi

    The -copyts parameter can copy timestamps, which helps fix some issues with timeline disorder. Timestamps are like a video's "diary," recording the playback time of each frame.

When can -c copy not be used? Like trying to fit a square block into a round hole, it won't fit no matter what.

  1. Need to change the encoding method: Like needing to translate one language into another, a conversion must be performed.

    • For example, converting H.264 encoded video to H.265 encoded video.
    • Converting MP3 audio to AAC audio.
  2. Change parameters such as resolution and frame rate: Like needing to enlarge or reduce a photo, it must be reprocessed.

    • As long as modifications to video or audio content are involved, -c copy cannot be used, and the corresponding encoder and parameters need to be specified.
  3. Input and output formats are incompatible: Like needing to plug different types of plugs together, a converter must be used.

    • Although this situation is relatively rare, some formats may not be able to directly copy stream data to each other.

Common Usage Combinations of -c copy:

  • Video:

    • MP4 <-> MOV (if both are H.264 encoded) Like using the same language, you can communicate freely.
    • MKV <-> MP4/MOV/AVI (extract or encapsulate, encoding remains unchanged) Like putting different things into or taking them out of the same container.
    • TS (MPEG Transport Stream) <-> MP4/MKV (encoding remains unchanged, commonly used for live stream processing) TS format is often used in live broadcasts and can be easily converted to other formats.
  • Audio:

    • AAC <-> MP4/MOV/MKV (extract or encapsulate)
    • MP3 <-> MP4/MOV/MKV (extract or encapsulate)
    • WAV -> WAV (repair)

Examples:

  • Correct use of -c copy:

    bash
    # Extract H.264 video from MKV to MP4
    ffmpeg -i input.mkv -c:v copy -an output.mp4
    
    # Encapsulate AAC audio into MP4 container
    ffmpeg -i input.aac -vn -c:a copy output.mp4
  • Incorrect use of -c copy:

    bash
    # Incorrect! Trying to convert H.264 to H.265, cannot use copy
    ffmpeg -i input.mp4 -c:v libx265 output.mp4 -c copy #Incorrect, -c copy should not be added

V. Practical Exercises: Several Common FFmpeg Tasks to Enhance Your Audio and Video Processing Skills

  1. Video Format Conversion: Convert video from one format to another, like translating languages, allowing more people to understand.

    bash
    ffmpeg -i input.mov -c:v libx264 -c:a aac output.mp4
  2. Video Trimming: Extract exciting clips from the video, like editing a movie to highlight the most exciting parts.

    bash
    ffmpeg -i input.mp4 -ss 00:00:10 -t 00:00:05 -c copy output.mp4
  3. Audio Extraction: Extract audio from video, like extracting the accompaniment from a song, making it easier for you to create secondary content.

    bash
    ffmpeg -i input.mp4 -vn -c:a copy output.aac
  4. Merge Video and Audio: Merge video and audio into one file, like mixing dishes and rice together for easy consumption.

    bash
    ffmpeg -i video.mp4 -i audio.aac -c copy output.mkv
  5. Adjust Video Resolution: Change the size of the video image, like adjusting the size of a photo to fit different devices and platforms.

    bash
    ffmpeg -i input.mp4 -vf scale=640:480 output.mp4
    • -vf is equivalent to -filter:v