Editing Gameplay Videos without Re-encoding using FFmpeg

Tuesday, December 26th 2017

I recently worked on a YouTube let’s play with my brother. It was recorded using Dxtory with the x264vfw codec, meaning that the saved recordings are H.264 streams in an AVI container¹. Our recordings are 1920x1080 at 60fps. Audio commentary was recorded separately in Audacity.

When it came time to edit the videos, I fired up Adobe Premiere², but quickly ran into a problem. Rendering the 1080p/60fps videos was taking upwards of an hour. Furthermore, the output videos had significantly lower quality due to the re-encoding. I knew that after YouTube transcoded the videos into its internal format, the final output would look even worse. To fix this, I did some experimenting with FFmpeg, a command-line video processor, and found a workflow for editing our let’s play without re-encoding.

These tips require the command line, but if you can get past that barrier, you’ll be able to edit your videos without long rendering times or quality loss. Also, you can do this for free without purchasing any software!

Trimming Footage

You can use FFmpeg to trim footage off the beginning and end of your video. Below is an example that cuts a 20-second clip starting at the 100-second mark ³.

ffmpeg -ss 100 -i input.avi -t 20 -c copy output.avi

If you run this, you might find that your video isn’t exactly 20 seconds, but a little bit longer. This is because a stream copy can only start on a keyframe, so FFmpeg begins the clip at the nearest keyframe before the provided timestamp rather than at the timestamp itself. Unfortunately, there’s no way to get around this without re-encoding.
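To see in advance where a cut would actually begin, you can list the keyframes in a recording. The following is a minimal sketch, assuming ffprobe is installed alongside FFmpeg; the function names list_keyframes and keyframe_before are made up for illustration, and some containers may report timestamps differently.

```shell
# List keyframe timestamps (in seconds) of the first video stream, one per line.
# Assumption: ffprobe is on the PATH (it normally ships with FFmpeg).
list_keyframes() {
    ffprobe -v error -select_streams v:0 -skip_frame nokey \
        -show_entries frame=best_effort_timestamp_time -of csv=p=0 "$1"
}

# Read keyframe timestamps on stdin and print the latest one that is
# at or before the target time given as $1. Prints nothing if none qualifies.
keyframe_before() {
    awk -F, -v target="$1" '
        $1 + 0 <= target + 0 && (best == "" || $1 + 0 > best + 0) { best = $1 }
        END { if (best != "") print best }
    '
}

# Usage (not run here): list_keyframes input.avi | keyframe_before 100
```

If the printed timestamp is well before the point you asked for, the trimmed clip will be longer by roughly that difference.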

Concatenating Footage

Now let’s say you have two separate gameplay recordings that you want to concatenate. FFmpeg lets you do this without having to re-encode. The interface for this is a bit strange: you have to create a text file containing a line of the form file '{VIDEO_FILE_NAME}' for each clip, in playback order. Here’s a snippet for cutting two clips out of a source file and concatenating them:

ffmpeg -ss 100 -i input.avi -t 20 -c copy output_1.avi
ffmpeg -ss 200 -i input.avi -t 20 -c copy output_2.avi
printf "file '%s'\n" output_1.avi output_2.avi > concat_list.txt
ffmpeg -f concat -safe 0 -i concat_list.txt -c copy concatenated.avi
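With more than a couple of clips, writing the list by hand gets tedious. Here is a small sketch of a helper that writes it for any number of files; make_concat_list is a hypothetical name, and it assumes the file names contain no single quotes. It uses printf because echo treats "\n" differently from shell to shell.

```shell
# Write an FFmpeg concat list to the file named by $1, with one
# "file '...'" line for each remaining argument, in the order given.
make_concat_list() {
    list_file="$1"
    shift
    printf "file '%s'\n" "$@" > "$list_file"
}

make_concat_list concat_list.txt output_1.avi output_2.avi
cat concat_list.txt
```

The resulting concat_list.txt can then be passed to the same ffmpeg -f concat command shown above.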

Adding in Audio Commentary

This is the snippet that I use to mix in audio commentary with game audio, assuming that the audio commentary is already synced with the video. It’s a bit complicated, as it uses FFmpeg’s filtergraph functionality. Note that the audio is encoded as MP3. I keep my audio in FLAC form up to this point, so this is the first place where the audio goes through a lossy encode.

ffmpeg -i input.avi -i commentary.flac \
    -filter_complex "[0:a]volume=0.5[volumeadj];[volumeadj]aformat=sample_fmts=s16:channel_layouts=stereo[volumeadj_fmt];[volumeadj_fmt][1:a]amerge=inputs=2[merged];[merged]volume=2.0[aout]" \
    -map 0:v -map "[aout]" \
    -c:v copy \
    -c:a libmp3lame -ac 2 \
    output.avi
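The filtergraph is easier to read and tweak when it is built up one filter per line. The sketch below assembles the same string in a shell variable; the variable names game_volume, output_gain, and filtergraph are my own, not anything FFmpeg requires.

```shell
game_volume=0.5   # volume applied to the game audio before merging
output_gain=2.0   # volume applied to the merged result

# Each line appends one filter, along with its input and output labels.
filtergraph="[0:a]volume=${game_volume}[volumeadj];"
filtergraph="${filtergraph}[volumeadj]aformat=sample_fmts=s16:channel_layouts=stereo[volumeadj_fmt];"
filtergraph="${filtergraph}[volumeadj_fmt][1:a]amerge=inputs=2[merged];"
filtergraph="${filtergraph}[merged]volume=${output_gain}[aout]"

printf '%s\n' "$filtergraph"

# Usage (not run here): pass it as -filter_complex "$filtergraph"
# in place of the literal string in the command above.
```

Changing the balance between game audio and commentary then only means editing the two variables at the top.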

However, this assumes that the audio is already synced. To sync it, you have two options:

Pad silence at the beginning of your recording in Audacity.
Concatenate silence to the beginning of the audio in FFmpeg, which is shown in the snippet below.

ffmpeg \
  -t 10 -f lavfi -i anullsrc=channel_layout=stereo:sample_rate=44100 \
  -i commentary.flac \
  -filter_complex "[0:a][1:a]concat=n=2:v=0:a=1[out]" \
  -map "[out]" \
  -c:a flac -ac 2 \
  output.flac

To find sync points, I move the cursor up and down in the menu while saying the words “up” and “down.” This gives me a point in both the commentary and the gameplay recording to sync up ⁴.
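Once you have noted the time of the same cue in both recordings, the amount of silence to add is just the difference between the two. A rough sketch, with a hypothetical helper name and made-up timestamps:

```shell
# Print how far the commentary must be shifted, in seconds, given the time
# of the sync cue in the video ($1) and in the commentary ($2).
# Positive: pad that much silence at the start of the commentary.
# Negative: the commentary started first, so trim that much from it instead.
sync_offset() {
    awk -v video="$1" -v audio="$2" 'BEGIN { printf "%.3f\n", video - audio }'
}

# Cue heard at 12.5 s in the video and 2.5 s in the commentary.
sync_offset 12.500 2.500
```

For these example numbers it prints 10.000, which would be the value to give -t in the silence-padding command above.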

Putting it All Together

You shouldn’t use this process for videos with a lot of edits, as it’s much easier to use an NLE such as Premiere or Vegas ⁵. However, if you’re creative about how you split up and sync your videos, I think this approach is worth it. The output videos are the same quality as your raw recordings, and the encoding process is often faster than real time.

More Complicated Editing

Here’s an example of a more complicated edit involving multiple overlays and zoom levels that show up at different times. The entire edit, including trimming, audio syncing, and audio merging is done in one FFmpeg command. Since overlays are applied to the video, it must be re-encoded.

ffmpeg \
    -framerate 60 -loop 1 -i "zoomed-out-frame.png" \
    -i "gameplay-recording.ts" \
    -framerate 60 -loop 1 -i ".zoomed-in-frame.png" \
    -i "commentary.flac" \
    -t 4.11 -f lavfi -i anullsrc=channel_layout=stereo:sample_rate=44100 \
    -filter_complex "[1]scale=1024:640:flags=neighbor[scaled];[0][1]overlay=25:144:shortest=1[zoomedout];[2][scaled]overlay=50:38:shortest=1[zoomedin];[zoomedout][zoomedin]overlay=shortest=1:enable='gte(t,105)'[out];[3:a]atrim=19.131[trimmedaudio];[4][trimmedaudio]concat=n=2:v=0:a=1[adjustedaudio];[1:a]volume=0.35[volumeadj];[volumeadj][adjustedaudio]amerge=inputs=2[aout]" \
    -map "[out]" -map "[aout]" \
    -ss 00:00:21 -to 00:25:07 \
    -c:v libx264 -crf 18 \
    -c:a libmp3lame -ac 2 \
    -y output.mp4

For this kind of video, I highly recommend using an NLE!

¹ The CRF is set to 23 (the default). I also had to increase the encoding speed, at the expense of having larger files.
² I’m using CS5.5.
³ More info on trimming videos with FFmpeg.
⁴ Once I forgot to do this and had to sync up by trying to match button press sounds with in-game actions!
⁵ The reality is that, right now, there aren’t any good open-source NLEs so you’ll have to open your wallet for that kind of editing. The most promising one that I’ve looked at is OpenShot, and it just got an update that’s supposed to improve stability.

#video-editing #ffmpeg