Ffcms – FFmpeg's -filter_complex made simple

Category: IT Technology · Published: 5 years ago



This article is not meant to dive into a complete solution or tool. It's going to describe a concept and a simple script that I'm working on and I think will make my relationship with FFmpeg easier. Maybe yours too? (:

(Yes, I know that the FFmpeg project provides more than just the ffmpeg binary. But in this article, when I say FFmpeg I mean the ffmpeg binary only.)

FFmpeg

FFmpeg is a very useful tool. It comes with a variety of options, codecs and generally stuff that you may need while operating on media files.

For simple tasks, the tool is very straightforward. You want to convert a video? Beng: ffmpeg -i input.mov output.mp4. You want to remove audio from a video? Beng beng: ffmpeg -i input.mov -an output.mov.

Simplicity is very important but there are cases when you need to do more complex stuff. E.g. you want to put a video on top of another, bigger one, then you want to adjust colors a little and then...

Yee.. The wishlist can escalate quickly. But fear no more! FFmpeg is right here for you. Well, you just need to know how to use it. So you google it: FFmpeg put video on another. You quickly find out that it is called overlaying and that you must use a filter named, you guessed it, overlay. Fortunately, FFmpeg's community is helpful and you find a similar problem on Stack Overflow. The command is:

ffmpeg -i input.mov -i overlay.mov \
-filter_complex "[1:v]setpts=PTS-10/TB[a]; \
                 [0:v][a]overlay=enable=gte(t\,5):shortest=1[out]" \
-map [out] -map 0:a \
-c:v libx264 -crf 18 -pix_fmt yuv420p \
-c:a copy \
output.mov

Back in the day, when I found an answer like this, I was like:

[Image: reaction picture]

In the end, it's not THAT bad. You can actually guess what some of these mean:

input.mov and overlay.mov are probably videos that I want to use.
-i is used twice, which is strange, but it's always followed by a video that I have on disk, so it probably means input.
-filter_complex is followed by a string which.. ee.. has square brackets.. oh, there is Python slicing! Is it a Python script? It computes 10 / terabyte, subtracts it from PTS and assigns it to a variable called setpts. Hmm, why is there a semicolon? Python doesn't need them. I'll get back to it later..
-map so we map [out], whatever that is, and then we -map another thing. Why not? I'll take that.
-c:v .. I refuse.

Of course, that's an exaggeration. But, there is a grain of truth in it, don't you think?

ffcms

The -filter_complex argument is what this article is about. I want to describe an idea for a tool that simplifies constructing an FFmpeg command that uses -filter_complex.

(Yep, I know that other arguments understand filtergraph syntax too, but let's not bother with that here.)

The idea

The concept is actually very simple. Instead of writing an FFmpeg command in one go in the terminal, you write a JSON file that describes what the command is supposed to do. Then you pass the JSON to the ffcms tool and it prints a complete ffmpeg command that just can't wait to be run. If you're a visual person:

[Diagram: JSON file -> ffcms -> ffmpeg command]

Examples

Original 2x2 grid

FFmpeg provides an example of -filter_complex usage to create a grid out of one video. It can be found here: https://trac.ffmpeg.org/wiki/FilteringGuide#Multipleinputoverlayin2x2grid .

The command there is:

ffmpeg -f lavfi -i testsrc -f lavfi -i testsrc -f lavfi -i testsrc -f lavfi -i testsrc -filter_complex \
"[1:v]negate[a]; \
 [2:v]hflip[b]; \
 [3:v]edgedetect[c]; \
 [0:v][a]hstack=inputs=2[top]; \
 [b][c]hstack=inputs=2[bottom]; \
 [top][bottom]vstack=inputs=2[out]" -map "[out]" -c:v ffv1 -t 5 multiple_input_grid.avi

Let's break it down:

ffmpeg - the FFmpeg binary you want to run.
-f lavfi -i testsrc - declaration of an input. The input format is lavfi (libavfilter's virtual input device) and the input is testsrc, a generated test pattern rather than a file on disk. This is repeated four times to imitate four different inputs.
-filter_complex - that's where the magic happens.
- [1:v]negate[a] - take the video from the input at index 1, apply the negate filter to it and 'save' the output of the filter to a link named a.
- [2:v]hflip[b] - take the video from the input at index 2, apply the hflip filter to it and 'save' the output of the filter to a link named b.
- [3:v]edgedetect[c] - take the video from the input at index 3, apply the edgedetect filter to it and 'save' the output of the filter to a link named c.
- [0:v][a]hstack=inputs=2[top] - take two inputs: the video from the input at index 0 and whatever is in the link a (which is the output of the [1:v]negate filtering). Then, on these two inputs, apply the filter named hstack with one parameter, inputs, set to 2. Save the result in a link named top.
- [b][c]hstack=inputs=2[bottom] - similar to the above. Take two inputs from the links b and c and apply the hstack filter with the parameter inputs set to 2. Save the result in a link named bottom.
- [top][bottom]vstack=inputs=2[out] - take two inputs from the links top and bottom, apply the vstack filter with the parameter inputs set to 2 and save the result in a link named out.
-map "[out]" - select the link named out as the stream that goes into the output file.
-c:v ffv1 - use the FFV1 codec for video.
-t 5 - limit the output to 5 seconds.
multiple_input_grid.avi - the name of the output video.

Whoa, that's a lot. The result is something like this (source: https://trac.ffmpeg.org/attachment/wiki/FilteringGuide/multiple input overlay.jpg ):

[Image: 2x2 grid of the test source: original and negated on top, flipped and edge-detected at the bottom]
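To make the anatomy from the breakdown above a bit more tangible, here is a minimal Python sketch that splits such a -filter_complex string back into its parts: input links, filter name, arguments and output links. It is not part of ffcms; parse_filter and parse_filtergraph are hypothetical helper names, and the regular expression only covers the simple shape used in this example, not FFmpeg's full filtergraph grammar (no filter chains with commas, no quoting or escaping).

```python
import re

# Deliberately narrow pattern: [in-links] name[=args] [out-links].
# This is an illustration only, not FFmpeg's real filtergraph grammar.
FILTER_RE = re.compile(
    r"^(?P<ins>(?:\[[^\]]+\])*)"
    r"(?P<name>[A-Za-z0-9_]+)"
    r"(?:=(?P<args>[^\[]*))?"
    r"(?P<outs>(?:\[[^\]]+\])*)$"
)


def link_names(text):
    """Return the names found inside consecutive [square brackets]."""
    return re.findall(r"\[([^\]]+)\]", text)


def parse_filter(text):
    """Split one filter like '[0:v][a]hstack=inputs=2[top]' into its parts."""
    match = FILTER_RE.match(text.strip())
    if match is None:
        raise ValueError(f"cannot parse filter: {text!r}")
    return {
        "in": link_names(match["ins"]),
        "filter": match["name"],
        "args": match["args"] or "",
        "out": link_names(match["outs"]),
    }


def parse_filtergraph(graph):
    """Parse a ';'-separated list of simple filters."""
    return [parse_filter(part) for part in graph.split(";") if part.strip()]


if __name__ == "__main__":
    graph = (
        "[1:v]negate[a]; [2:v]hflip[b]; [3:v]edgedetect[c]; "
        "[0:v][a]hstack=inputs=2[top]; [b][c]hstack=inputs=2[bottom]; "
        "[top][bottom]vstack=inputs=2[out]"
    )
    for step in parse_filtergraph(graph):
        print(step)
```

Each printed dictionary corresponds to one bullet of the breakdown: where the filter reads from, what it does, and which link it writes to.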

ffcms style

testsrc is a generated test pattern, not a video file, so for something more interesting I'll use part of the timelapse video that I made for my fiancée. Here's a gif made out of the original video:

[GIF: excerpt from the original timelapse video]

(BTW, she paints the best mugs in the world, check it out: ig/fingers.in.paint )

To be 99% accurate, let's change the above ffmpeg command to use my timelapse.mov (the 1% is the removed -f lavfi arguments):

ffmpeg -i timelapse.mov -i timelapse.mov -i timelapse.mov -i timelapse.mov -filter_complex \
"[1:v]negate[a]; \
 [2:v]hflip[b]; \
 [3:v]edgedetect[c]; \
 [0:v][a]hstack=inputs=2[top]; \
 [b][c]hstack=inputs=2[bottom]; \
 [top][bottom]vstack=inputs=2[out]" -map "[out]" -c:v ffv1 -t 5 timelapse_grid.avi

Now let's write a JSON file that ffcms would convert to a command like the above one. Here's the whole JSON:

{
  "in": [
    {
      "id": "first_input",
      "file": "timelapse.mov"
    },
    {
      "id": "second_input",
      "file": "timelapse.mov"
    },
    {
      "id": "third_input",
      "file": "timelapse.mov"
    },
    {
      "id": "fourth_input",
      "file": "timelapse.mov"
    }
  ],

  "out": "timelapse_grid.avi",

  "filters": [
    {
      "in": "second_input",
      "filter": "negate",
      "out": "a"
    },
    {
      "in": "third_input",
      "filter": "hflip",
      "out": "b"
    },
    {
      "in": "fourth_input",
      "filter": "edgedetect",
      "out": "c"
    },
    {
      "in": [
        "first_input",
        "a"
      ],
      "filter": "hstack",
      "out": "top"
    },
    {
      "in": [
        "b",
        "c"
      ],
      "filter": "hstack",
      "out": "bottom"
    },
    {
      "in": [
        "top",
        "bottom"
      ],
      "filter": "vstack",
      "out": "out"
    }
  ]
}

Let's break it down.

"in"

The top-level "in" is an array of the inputs that you'd pass after the -i argument to the ffmpeg binary. They are in the same order as you'd write them on the command line. It lets ffcms know what the input media files are and assigns a meaningful "id" to each file. Thanks to that, in "filters" you can refer to them using e.g. first_input, not 0:v.
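A rough sketch of how this mapping could work in Python is shown below. It is my illustration, not the actual ffcms code: stream_specifiers and input_args are hypothetical names, and the sketch assumes that every id refers to the video stream of its input (the ":v" suffix), which is what the commands in this article use.

```python
def stream_specifiers(inputs):
    """Map each input "id" to a video stream specifier such as "0:v".

    The index is simply the position of the entry in the "in" array,
    which matches the order of the -i arguments on the command line.
    """
    mapping = {}
    for index, entry in enumerate(inputs):
        if entry["id"] in mapping:
            raise ValueError(f"duplicate input id: {entry['id']}")
        mapping[entry["id"]] = f"{index}:v"  # assumption: video stream only
    return mapping


def input_args(inputs):
    """Build the '-i file' pairs in the same order as the "in" array."""
    args = []
    for entry in inputs:
        args += ["-i", entry["file"]]
    return args


if __name__ == "__main__":
    inputs = [
        {"id": "first_input", "file": "timelapse.mov"},
        {"id": "second_input", "file": "timelapse.mov"},
    ]
    print(stream_specifiers(inputs))
    print(input_args(inputs))
```

Rejecting duplicate ids is a design choice of this sketch: two inputs sharing an id would make a reference in "filters" ambiguous.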

"out"

The top-level "out" is just the name of the output video.

"filters"

The top-level "filters" are the filters that you'd write in the string passed to the -filter_complex argument.

E.g. the first filter:

{
    "in": "second_input",
    "filter": "negate",
    "out": "a"
}

is equivalent to the [1:v]negate[a] filter in the -filter_complex string.

Mind the hstack and vstack filters. In the -filter_complex string you need to pass the number of inputs, like: [b][c]hstack=inputs=2[bottom]. In the ffcms JSON you write:

{
    "in": [
        "b",
        "c"
    ],
    "filter": "hstack",
    "out": "bottom"
}

The number of inputs is already known: it's the number of elements in the "in" array, so ffcms adds inputs=2 to the resulting command on its own.
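The rule can be sketched in a few lines of Python. Again, this is only an illustration under assumptions: render_filter is a hypothetical name, STACK_FILTERS is my guess at which filters get the automatic inputs=N (the article only shows hstack and vstack), and the specifiers argument is assumed to be a dictionary from input ids to stream specifiers such as "1:v".

```python
# Assumption: only these filters get an automatic inputs=N argument.
STACK_FILTERS = {"hstack", "vstack"}


def render_filter(entry, specifiers):
    """Turn one entry of "filters" into its -filter_complex text.

    "in" may be a single id or a list of ids. Ids of input files are
    replaced by their stream specifier; any other name is kept as a link.
    """
    sources = entry["in"]
    if isinstance(sources, str):
        sources = [sources]
    links = "".join(f"[{specifiers.get(name, name)}]" for name in sources)
    name = entry["filter"]
    # The input count is just the length of the "in" array.
    args = f"=inputs={len(sources)}" if name in STACK_FILTERS else ""
    return f"{links}{name}{args}[{entry['out']}]"


if __name__ == "__main__":
    specifiers = {"first_input": "0:v", "second_input": "1:v"}
    print(render_filter({"in": "second_input", "filter": "negate", "out": "a"}, specifiers))
    print(render_filter({"in": ["first_input", "a"], "filter": "hstack", "out": "top"}, specifiers))
    print(render_filter({"in": ["b", "c"], "filter": "hstack", "out": "bottom"}, specifiers))
```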

Renaming ids

Actually, I don't like the names of the ids. a, b and c? Let's make them more meaningful:

"filters": [
    {
        "in": "second_input",
        "filter": "negate",
        "out": "negated"
    },
    {
        "in": "third_input",
        "filter": "hflip",
        "out": "flipped"
    },
    {
        "in": "fourth_input",
        "filter": "edgedetect",
        "out": "edged"
    },
    {
        "in": [
            "first_input",
            "negated"
        ],
        "filter": "hstack",
        "out": "top"
    },
    {
        "in": [
            "flipped",
            "edged"
        ],
        "filter": "hstack",
        "out": "bottom"
    },
    {
        "in": [
            "top",
            "bottom"
        ],
        "filter": "vstack",
        "out": "out"
    }
]

FFmpeg command

After creating the JSON, you can pass it to ffcms. With the JSON above, it prints the following FFmpeg command:

ffmpeg -y -i timelapse.mov -i timelapse.mov -i timelapse.mov -i timelapse.mov -filter_complex \
"[1:v]negate[negated]; \
 [2:v]hflip[flipped]; \
 [3:v]edgedetect[edged]; \
 [0:v][negated]hstack=inputs=2[top]; \
 [flipped][edged]hstack=inputs=2[bottom]; \
 [top][bottom]vstack=inputs=2[out]" \
 -map "[out]" -c:v ffv1 timelapse_grid.avi

Now you can just grab it and run it in a terminal.

Here you can see a gif created out of the result video:

[GIF: the resulting 2x2 grid video]
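Putting the pieces of such a command together could look roughly like the sketch below. It takes already rendered filter strings and builds the argument list in the same order as the printed command. assemble_command is a hypothetical name, and the hard-coded -y and -c:v ffv1 simply mirror the output shown above; whether ffcms makes them configurable is not something this article states.

```python
import shlex


def assemble_command(input_files, filters, final_link, output_file):
    """Build the ffmpeg argument list from its already rendered parts."""
    argv = ["ffmpeg", "-y"]  # assumption: -y is always added, as in the output above
    for path in input_files:
        argv += ["-i", path]
    # Individual filters are separated by ';' inside one -filter_complex string.
    argv += ["-filter_complex", ";".join(filters)]
    # Assumption: the codec is fixed to ffv1, mirroring the printed command.
    argv += ["-map", f"[{final_link}]", "-c:v", "ffv1", output_file]
    return argv


if __name__ == "__main__":
    argv = assemble_command(
        ["timelapse.mov"] * 4,
        [
            "[1:v]negate[negated]",
            "[2:v]hflip[flipped]",
            "[3:v]edgedetect[edged]",
            "[0:v][negated]hstack=inputs=2[top]",
            "[flipped][edged]hstack=inputs=2[bottom]",
            "[top][bottom]vstack=inputs=2[out]",
        ],
        "out",
        "timelapse_grid.avi",
    )
    # shlex.join (Python 3.8+) quotes each argument for a POSIX shell.
    print(shlex.join(argv))
```

Building a list and quoting it at the very end keeps the square brackets and semicolons of the filter string away from the shell until the command is actually printed.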

Cameraman

Now let's go further and make our wishlist a little more complex. Let's create a grid, like in the previous example. Additionally, let's overlay an image on top of the edged part of the video. We want to overlay it at a given x and y. Plus, we want to make it visible for the first three seconds only.

First off, the most important part. The Cameraman:

[Image: the Cameraman]

Now let's adjust the JSON to handle the additional requirements.

Adding input media

We need to let ffcms know that there is another input media file. Let's add it to the top-level "in" array:

"in": [
    {
        "id": "cameraman",
        "file": "cameraman.png"
    }
]

We need to add one more filter to the top-level "filters" list: overlay. It uses edged and cameraman as its inputs:

"filters": [
    {
        "in": [
            "edged",
            "cameraman"
        ]
    }
]

Next we define the filter:

"filters": [
    {
        "in": [
            "edged",
            "cameraman"
        ],
        "filter": {
            "name": "overlay",
            "params": {
                "x": 200,
                "y": 100,
                "enable": "'between(t,0,3)'"
            }
        }
    }
]

x and y are obvious, but enable not so much. The enable parameter tells when the cameraman input should be visible on top of the edged one. Here we specify that it should be visible when time (t) is between 0 and 3 seconds.
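One possible way to turn such a "filter" object into filtergraph text is sketched below in Python. render_filter_spec is a hypothetical name and this is not necessarily how ffcms does it. The sketch joins the parameters as key=value pairs separated by colons and sorts them by key, which is my assumption and yields the order enable, x, y. The single quotes inside the enable value are passed through untouched; they keep the commas in between(t,0,3) from being read as filter separators.

```python
def render_filter_spec(spec):
    """Render a "filter" value, which is either a plain name or an object
    with "name" and optional "params", as filtergraph text."""
    if isinstance(spec, str):
        return spec  # e.g. "negate"
    params = spec.get("params", {})
    if not params:
        return spec["name"]
    # Assumption: parameters are emitted in sorted key order.
    rendered = ":".join(f"{key}={params[key]}" for key in sorted(params))
    return f"{spec['name']}={rendered}"


if __name__ == "__main__":
    overlay = {
        "name": "overlay",
        "params": {"x": 200, "y": 100, "enable": "'between(t,0,3)'"},
    }
    print(render_filter_spec(overlay))
    print(render_filter_spec("negate"))
```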

Now, the last thing is to specify the output link id:

"filters": [
    {
        "in": [
            "edged",
            "cameraman"
        ],
        "filter": {
            "name": "overlay",
            "params": {
                "x": 200,
                "y": 100,
                "enable": "'between(t,0,3)'"
            }
        },
        "out": "cameraman_on_edges"
    }
]

Cool. Now we just need to use cameraman_on_edges in one of the hstack filters:

"filters": [
    {
        "in": [
            "flipped",
            "cameraman_on_edges"
        ],
        "filter": "hstack",
        "out": "bottom"
    }
]

And that's it. The full JSON looks like this:

{
    "in": [
        {
            "id": "first_input",
            "file": "timelapse.mov"
        },
        {
            "id": "second_input",
            "file": "timelapse.mov"
        },
        {
            "id": "third_input",
            "file": "timelapse.mov"
        },
        {
            "id": "fourth_input",
            "file": "timelapse.mov"
        },
        {
            "id": "cameraman",
            "file": "cameraman.png"
        }
    ],

    "out": "timelapse_grid.avi",

    "filters": [
        {
            "in": [
                "second_input"
            ],
            "filter": "negate",
            "out": "negated"
        },
        {
            "in": [
                "third_input"
            ],
            "filter": "hflip",
            "out": "flipped"
        },
        {
            "in": [
                "fourth_input"
            ],
            "filter": "edgedetect",
            "out": "edged"
        },
        {
            "in": [
                "edged",
                "cameraman"
            ],
            "filter": {
                "name": "overlay",
                "params": {
                    "x": 200,
                    "y": 100,
                    "enable": "'between(t,0,3)'"
                }
            },
            "out": "cameraman_on_edges"
        },
        {
            "in": [
                "first_input",
                "negated"
            ],
            "filter": "hstack",
            "out": "top"
        },
        {
            "in": [
                "flipped",
                "cameraman_on_edges"
            ],
            "filter": "hstack",
            "out": "bottom"
        },
        {
            "in": [
                "top",
                "bottom"
            ],
            "filter": "vstack",
            "out": "out"
        }
    ]
}

ffcms outputs the following command:

ffmpeg -i timelapse.mov -i timelapse.mov -i timelapse.mov -i timelapse.mov -i cameraman.png -filter_complex \
"[1:v]negate[negated]; \
 [2:v]hflip[flipped]; \
 [3:v]edgedetect[edged]; \
 [edged][4:v]overlay=enable='between(t,0,3)':x=200:y=100[cameraman_on_edges]; \
 [0:v][negated]hstack=inputs=2[top]; \
 [flipped][cameraman_on_edges]hstack=inputs=2[bottom]; \
 [top][bottom]vstack=inputs=2[out]" \
 -map "[out]" -c:v ffv1 timelapse_grid.avi

and the result video looks like this:

[Image: the resulting grid with the Cameraman overlaid on the edge-detected part]

Conclusion

Please note that ffcms is not a finished tool. There is still much work to do. Contributions are welcome (: Nevertheless, it's usable enough for me to help my fiancée with her stuff, so that's a good start for sure.
