Goal of the script

Remove ads from TV recordings with optimal cuts, at single-frame precision.

When I record a movie from TV, I sometimes want to archive it on my NAS. In that case, I want to remove at least the beginning and the end of the recording if the movie was broadcast on an advertisement-free channel; worse, I may have to split it into numerous parts to remove the advertisements. I do not want to re-encode the entire movie, since that is a really slow process on my NAS and the movie is already broadcast as H.264 (DVB-T).

Doing this by hand is really painful, because most tools, such as ffmpeg or mkvmerge, can only cut a movie without re-encoding at a boundary in the video stream that corresponds to a reference frame (a so-called I-frame). Such frames occur only roughly every 10-20 frames, which corresponds to a fairly long duration (on the order of a second).
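To see how sparse the I-frames of a given recording are, their timestamps can be listed with ffprobe. The sketch below is only an illustration, not an excerpt from the script: the function names `probe_frames` and `i_frame_times` are hypothetical, and the `pts_time` field name is the one reported by recent FFmpeg releases.

```python
import json
import subprocess


def probe_frames(path):
    """Ask ffprobe for the timestamp and picture type of every video frame.

    Returns the raw JSON report as a string. Decoding a whole movie this
    way is slow, so in practice one would restrict it to a short interval.
    """
    cmd = [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-show_entries", "frame=pts_time,pict_type",
        "-of", "json",
        path,
    ]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


def i_frame_times(ffprobe_json):
    """Extract the timestamps (in seconds) of the I-frames from the report."""
    report = json.loads(ffprobe_json)
    return [
        float(frame["pts_time"])
        for frame in report.get("frames", [])
        if frame.get("pict_type") == "I" and "pts_time" in frame
    ]
```

A call such as `i_frame_times(probe_frames("movie.mkv"))` then gives the only positions at which a cut without re-encoding is possible.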

I really want to cut the movie with better precision, so I have written a Python script that leverages ffmpeg, ffprobe, mkvmerge and vobsubocr to do the job with the required precision.

Parameters

How does it work?

The processing follows a quite long pipeline:

  1. The original .ts file is first converted into an .mp4 file using ffmpeg, to correct the timestamps.

  2. The .mp4 file is then converted into a Matroska container (which is the default container), still using ffmpeg.

  3. The movie is then cut according to the parts passed as parameters. As many parts as needed can be given.

Each part is treated with the same algorithm. Find the timestamp of the closest 'I' frame after the start of the part, and the timestamp of the closest 'I' frame before the end of the part. The part then looks like this:

    start ------- frame --------- frame ------- end
    'B/P' 'B/P'*   'I'             'I'  'B/P'* 'B/P'

If the start frame is already an 'I' frame, there is nothing to do (the same goes for the end). Otherwise, the 'B' or 'P' frames are extracted from the start up to, but not including, the 'I' frame.
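The boundary search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the script's actual code: `find_boundaries` and `PartBoundaries` are hypothetical names, the frames are assumed to be given as `(timestamp, type)` pairs sorted by timestamp, and only the handling of the start of the part is covered, as in the description.

```python
from bisect import bisect_left, bisect_right
from typing import List, NamedTuple, Optional, Sequence, Tuple

# (timestamp in seconds, picture type: 'I', 'P' or 'B')
Frame = Tuple[float, str]


class PartBoundaries(NamedTuple):
    first_i: Optional[float]  # closest I-frame at or after the start
    last_i: Optional[float]   # closest I-frame at or before the end
    head: List[Frame]         # B/P frames from the start up to first_i, excluded


def find_boundaries(frames: Sequence[Frame], start: float, end: float) -> PartBoundaries:
    """Locate the I-frames that delimit the part [start, end]."""
    i_times = [t for t, kind in frames if kind == "I"]

    # First I-frame whose timestamp is >= start (and still inside the part).
    lo = bisect_left(i_times, start)
    first_i = i_times[lo] if lo < len(i_times) and i_times[lo] <= end else None

    # Last I-frame whose timestamp is <= end (and still inside the part).
    hi = bisect_right(i_times, end) - 1
    last_i = i_times[hi] if hi >= 0 and i_times[hi] >= start else None

    # Frames that precede the first I-frame cannot be cut by stream copy alone.
    # If the part starts on an I-frame, this list is empty: nothing to do.
    limit = first_i if first_i is not None else float("inf")
    head = [f for f in frames if start <= f[0] < limit and f[0] <= end]
    return PartBoundaries(first_i, last_i, head)
```

When `head` is empty the part already starts on an I-frame; otherwise `head` holds exactly the 'B' and 'P' frames that need separate treatment.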

  4. The parts obtained previously are then merged using mkvmerge.

  5. The image-based subtitles are then extracted using mkvextract.

  6. These images are then processed with vobsubocr to create SRT files.

  7. The SRT files are then remuxed into the Matroska container using mkvmerge.
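The container-level steps of the pipeline (the ffmpeg remuxes, the merge of the parts and the final subtitle remux) can be sketched as command lines built in Python. The exact options used by the script are not shown here, so everything below is an assumption: the function names are hypothetical, the ffmpeg call is a plain stream copy, and any extra option the script may pass to correct timestamps is left out. The subtitle extraction and OCR steps are not covered.

```python
def remux_cmd(src, dst):
    """ffmpeg stream copy: change the container (chosen from the extension
    of dst, e.g. .mp4 or .mkv) without re-encoding audio or video."""
    return ["ffmpeg", "-i", src, "-c", "copy", dst]


def merge_cmd(dst, parts):
    """mkvmerge command that appends the parts one after another.

    mkvmerge appends a file to the previous one when it is preceded by '+'.
    """
    if not parts:
        raise ValueError("at least one part is required")
    cmd = ["mkvmerge", "-o", dst, parts[0]]
    for part in parts[1:]:
        cmd += ["+", part]
    return cmd


def add_subtitles_cmd(dst, movie, srt_files):
    """mkvmerge command that muxes SRT files next to the existing tracks."""
    return ["mkvmerge", "-o", dst, movie, *srt_files]
```

Each list can be handed to `subprocess.run(cmd, check=True)`. Building the commands as lists rather than shell strings avoids quoting problems with file names that contain spaces.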

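How to determine where to cut

Play the recording with mpv, keeping the current timestamp on screen with millisecond precision, and note the timestamps at which each part starts and ends:

mpv --osd-fractions --osd-level=3 ./movie.ts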
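With `--osd-fractions`, mpv displays times such as `00:12:34.567`. The small helper below converts such a timestamp into milliseconds, which is convenient for comparing it with frame timestamps. It is only a sketch: the name `to_millis` is hypothetical, and the format in which the script itself expects the cut points is not specified here.

```python
import re

# Accepts [H:]MM:SS[.fraction], as displayed by mpv with --osd-fractions.
_TIME_RE = re.compile(r"^(?:(\d+):)?(\d{1,2}):(\d{1,2})(?:\.(\d+))?$")


def to_millis(stamp):
    """Convert a timestamp such as '00:12:34.567' into integer milliseconds."""
    match = _TIME_RE.match(stamp.strip())
    if match is None:
        raise ValueError(f"unrecognised timestamp: {stamp!r}")
    hours, minutes, seconds, fraction = match.groups()
    # Pad or truncate the fractional part to exactly three digits.
    millis = int((fraction or "0").ljust(3, "0")[:3])
    total_seconds = (int(hours or 0) * 60 + int(minutes)) * 60 + int(seconds)
    return total_seconds * 1000 + millis
```

Integers are used instead of floating-point seconds so that two timestamps can be compared exactly.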
