Compare commits

...

60 Commits

Author SHA1 Message Date
Frédéric Tronel
cb019413cc More linting: no more camelcase variables, functions, arguments or methods. 2025-10-30 16:33:15 +01:00
Frédéric Tronel
40ca3e136b More linting: no more camelcase for function names. 2025-10-29 14:54:42 +01:00
Frédéric Tronel
367cb440d8 Even more linting: function names with snake case, remove unused variables, wrong format f-strings, masked parameters. 2025-10-29 12:37:51 +01:00
Frédéric Tronel
c565699875 Even more linting: no more variable with camel case. 2025-10-29 10:04:47 +01:00
Frédéric Tronel
75f227786f Still more linting (encoding for open, variables with snake case). 2025-10-28 10:55:47 +01:00
Frédéric Tronel
960de11b1b More linting. 2025-10-27 15:48:41 +01:00
Frédéric Tronel
e192c66157 More linting: camel case for variable names, f-format strings. 2025-10-26 21:01:50 +01:00
Frédéric Tronel
23f1db5ffa Even more linting (long lines, trailing spaces, module importation order, variable names). 2025-10-26 17:14:29 +01:00
Frédéric Tronel
362844f8a1 Improve linting by removing superfluous parentheses. 2025-10-25 16:55:05 +02:00
Frédéric Tronel
6ad2c3b50a Improve linting by removing bad indentations. 2025-10-25 16:47:28 +02:00
Frédéric Tronel
ddec8633e3 Improve linting by removing trailing spaces. 2025-10-25 16:45:20 +02:00
Frédéric Tronel
c3943ff70e Remove trailing spaces. 2025-10-25 16:33:29 +02:00
Frédéric Tronel
926ee16433 Improve pylint score and fix most errors. 2025-10-25 16:09:11 +02:00
Frédéric Tronel
489435a87f Improve pylint score and fix most errors. 2025-10-25 16:05:25 +02:00
Frédéric Tronel
efceec0e48 Huge improvement in the merging of the different video parts using different encoding profiles, since reproducing the H264 profile of broadcast movies is nearly impossible (some features do not have corresponding options in ffmpeg). 2025-09-19 16:41:16 +02:00
Frédéric Tronel
10234d67da Improve the README with details about the processing workflow of the script. 2025-09-19 16:32:11 +02:00
Frédéric Tronel
7e5a500279 The clean target of the Makefile removes all intermediate files created by the script when used with the --keep option. 2025-09-19 16:31:03 +02:00
Frédéric Tronel
8aca12c422 We ignore mpeg TS and Matroska files. 2025-09-19 16:29:47 +02:00
Frédéric Tronel
b94f865831 We handle the case where subtitle tracks end up empty after processing. 2023-12-24 16:52:40 +01:00
Frédéric Tronel
48cc4f8a27 hexdump package is required. 2023-12-24 14:31:20 +01:00
Frédéric Tronel
889b8dd6dc Subtitles extracted through OCR can be remuxed with the final cut movie (in addition to image based ones). 2023-12-24 14:29:42 +01:00
Frédéric Tronel
ffce9aecdf Handling of OCR to generate subtitles files is working. 2023-12-22 14:57:25 +01:00
Frédéric Tronel
4dbf9d9c03 Suppress SRT files for cleaning. 2023-12-22 14:56:05 +01:00
Frédéric Tronel
03922a76d2 Add dependency on the iso639 library, which supports the normalized names of languages. 2023-12-22 10:42:34 +01:00
Frédéric Tronel
f23423ca8d Code to take into account the potential change of length field when modifying the EBML tree structure. 2023-12-20 10:46:54 +01:00
Frédéric Tronel
3681ff33f3 Remove code that was here to debug the codec private data changes. 2023-12-20 10:05:52 +01:00
Frédéric Tronel
59b55bac6c Make mkvmerge speak English for its outputs so that the code is neutral with respect to locally installed languages. 2023-12-20 09:56:39 +01:00
Frédéric Tronel
2bf9b467bb We handle the cases where the old codec private data size is larger, smaller or equal to the new one. 2023-12-19 14:12:23 +01:00
Frédéric Tronel
6959e83327 Add a new option to not take into account sequences that are shorter than a certain threshold. 2023-12-18 16:14:57 +01:00
Frédéric Tronel
2f425aa9cf Adding a bunch of functions to modify codec private data inside video tracks, correct mkv binary representation after such changes. 2023-12-18 16:14:08 +01:00
Frédéric Tronel
556d88d73a mkvinfo command is now mandatory. 2023-12-18 16:11:46 +01:00
Frédéric Tronel
af52c80a8e Positioning inside files using lseek is made uniformly. 2023-12-18 16:11:05 +01:00
Frédéric Tronel
04d23ca1b2 The language used by commands cannot be set using locales module. 2023-12-18 16:09:48 +01:00
Frédéric Tronel
88d9d15496 If we only try to convert from .ts to .mp4 or .mkv, without any cut, do not remove output file. 2023-12-15 09:38:47 +01:00
Frédéric Tronel
b1c58fc53a Fix a bug where the ffmpeg path was not passed to the ffmpegConvert function. 2023-12-12 12:07:36 +01:00
Frédéric Tronel
4070f34a60 Add a large part of the code needed to extract subtitles via OCR. 2023-12-12 11:57:03 +01:00
Frédéric Tronel
cb600b920d Fix message display. 2023-12-02 21:17:16 +01:00
Frédéric Tronel
bb5333ffca Add some details to message about extraction of video pictures and audio packets. 2023-12-02 21:15:39 +01:00
Frédéric Tronel
1ed4bbf6df Add a function to retrieve packet duration compatible with multiple ffmpeg versions. 2023-12-02 21:11:59 +01:00
Frédéric Tronel
9a8f97a278 Fix missing calls to getTSFrame. 2023-12-02 21:06:25 +01:00
Frédéric Tronel
650724c966 Fix a typo. 2023-12-02 21:04:07 +01:00
Frédéric Tronel
40592dcec2 Add a function to retrieve timestamp of a frame (with multiple ffmpeg version). 2023-12-02 21:03:15 +01:00
Frédéric Tronel
da13f3e9c8 Missing a float conversion. 2023-12-02 20:57:59 +01:00
Frédéric Tronel
b4e304d9ab Fix a typo. 2023-12-02 20:56:49 +01:00
Frédéric Tronel
44d47a564c Make the script compatible with older version of ffmpeg. 2023-12-02 20:53:49 +01:00
Frédéric Tronel
b8394069fb Correct the name of an optional tool: vobsubocr. 2023-12-02 18:10:18 +01:00
Frédéric Tronel
124772aaeb Close memory file descriptors right after their usage (to save memory). 2023-12-02 17:29:36 +01:00
Frédéric Tronel
076e3c990b Better performances and simplification by removing pipes and using memory file descriptors. 2023-12-02 17:25:55 +01:00
Frédéric Tronel
d549311e20 We don't need pygame as a dependency. 2023-12-02 17:23:54 +01:00
Frédéric Tronel
4a1bf64bda A makefile to clean all temp files. 2023-12-01 16:49:59 +01:00
Frédéric Tronel
58ec094cfc A comment to remember how to extract SPS and PPS from a file with ffmpeg. 2023-12-01 16:48:27 +01:00
Frédéric Tronel
0f46dc9fda New functions to extract subtitles at the end of processing. A new option to extract them. 2023-12-01 16:48:01 +01:00
Frédéric Tronel
cf4850c8dc The progress bar for picture extraction is now counted in number of pictures. 2023-12-01 16:47:15 +01:00
Frédéric Tronel
ed0494b540 Tools that are searched for at startup are now categorized as required or optional. 2023-12-01 16:46:16 +01:00
Frédéric Tronel
22592214bb Add a .gitignore to help having a better git status output. 2023-12-01 16:44:09 +01:00
Frédéric Tronel
0678716c1c Add a .gitignore to help having a better git status output. 2023-12-01 16:43:49 +01:00
Frédéric Tronel
685365388e Adding an option to dump images of trailers and headers of each part for debugging purpose. 2023-11-30 21:06:50 +01:00
Frédéric Tronel
4bd294e26b A better README. 2023-11-30 16:14:20 +01:00
Frédéric Tronel
840aa5c41f The tqdm library is necessary. 2023-11-30 16:13:58 +01:00
Frédéric Tronel
9f9e17e6ca Numerous improvements. The base code is fully functional. 2023-11-30 16:13:11 +01:00
5 changed files with 3485 additions and 390 deletions

6
.gitignore vendored Normal file

@@ -0,0 +1,6 @@
*.pcm
*.ppm
*.ts
*.mkv
part*
venv/

2
Makefile Normal file

@@ -0,0 +1,2 @@
clean:
	rm -f *.ppm *.pcm part* *.srt *-ts.txt *-full.h264 *-novideo.mkv fre.*


@@ -1,3 +1,51 @@
-# removeads
-Remove ads from TV recordings with optimal cuts.
+# Goal of the script
+Remove ads from TV recordings with optimal cuts with single video frame precision.
When I record a movie on TV, I sometimes want to archive it on my NAS.
In that case, I want to remove (at least) the beginning and the end of the recording
if the movie was broadcast on an advertisement-free channel, or, even worse, I may
have to split it into numerous parts so as to remove the advertisements.
I do not want to re-encode the entire movie, since that is a really slow process (on my NAS)
and the movie is already broadcast in H.264 (DVB-T).
Doing this by hand is really painful, because most tools, such as ffmpeg or mkvmerge, can only
cut a movie (without re-encoding) at a boundary within the video stream that corresponds
to a reference frame (a so-called I-frame). These frames are only present roughly every 10-20 frames,
which corresponds to a fairly long duration (on the order of a second).
I really want to cut the movie with better precision, so I have written a Python script
that leverages _ffmpeg_, _ffprobe_, _mkvmerge_ and _vobsubocr_ to do the job with the required precision.
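
To make the I-frame constraint concrete, here is a minimal sketch (not the script's actual code) that lists the timestamps of the I-frames of a recording from _ffprobe_'s JSON output. The function names are hypothetical. The fallback across `pts_time`, `pkt_pts_time` and `best_effort_timestamp_time` is an assumption about which field a given ffmpeg version reports.

```python
import json
import subprocess


def probe_frames(path, ffprobe="ffprobe"):
    """Return ffprobe's JSON description of every frame of the first video stream."""
    cmd = [ffprobe, "-v", "error", "-select_streams", "v:0",
           "-show_frames", "-of", "json", path]
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def frame_time(frame):
    """Timestamp of a frame in seconds, or None if ffprobe reported none."""
    # Assumption: depending on the ffmpeg version, one of these keys is present.
    for key in ("pts_time", "pkt_pts_time", "best_effort_timestamp_time"):
        value = frame.get(key)
        if value not in (None, "N/A"):
            return float(value)
    return None


def keyframe_times(ffprobe_json):
    """Sorted timestamps (in seconds) of all I-frames found in the JSON text."""
    frames = json.loads(ffprobe_json).get("frames", [])
    times = [frame_time(f) for f in frames if f.get("pict_type") == "I"]
    return sorted(t for t in times if t is not None)
```

Calling `keyframe_times(probe_frames("movie.ts"))` would give the only positions at which a cut without re-encoding is possible.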
# Parameters
# How does it work?
The processing follows a fairly long pipeline:
1. The original .ts file is first transformed into an .mp4 file using _ffmpeg_ to correct timestamps:
2. The .mp4 is then transformed into a Matroska container (which is the default container) still using _ffmpeg_:
3. The movie is then cut using the cut points passed as parameters. It is possible to give as many parts as needed.
Each part is treated with the same algorithm.
Find the timestamp of the 'I' frame closest to (but after) the start of the part.
Find the timestamp of the 'I' frame closest to (but before) the end of the part.
We then have
start ----- frame --------- frame --------- end.
'B/P' 'B/P'* 'I' 'I' 'B/P'* 'B/P'
If the start frame is already an 'I' frame, there is nothing to do (likewise for the end).
Otherwise, extract the 'B' or 'P' frames from the start up to, but not including, the 'I' frame
4. The parts obtained previously are then merged using _mkvmerge_:
5. The subtitles (image based) are then extracted using _mkvextract_:
6. These images are then processed using _vobsubocr_ to create SRT files:
7. The SRT files are then remuxed inside the Matroska container using _mkvmerge_:
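
The boundary search described in step 3 can be sketched as follows. This is an illustration under assumptions, not the script's implementation: `split_part` is a hypothetical name, frames are assumed to be given in presentation order as `(timestamp, picture type)` pairs, and what is done with the head and tail frames afterwards is left out.

```python
from bisect import bisect_left, bisect_right


def split_part(frames, start, end):
    """Split the part [start, end] around its enclosing I-frames.

    frames: list of (timestamp_seconds, pict_type) tuples sorted by timestamp.
    Returns (head, first_i, last_i, tail): head and tail are the frames that
    lie outside the I-frame-aligned core, and first_i / last_i are the
    timestamps of the I-frames bounding that core.
    Returns None if the part contains no I-frame at all.
    """
    times = [t for t, _ in frames]
    lo = bisect_left(times, start)    # first frame at or after start
    hi = bisect_right(times, end)     # one past the last frame at or before end
    i_frames = [k for k in range(lo, hi) if frames[k][1] == "I"]
    if not i_frames:
        return None
    first, last = i_frames[0], i_frames[-1]
    head = frames[lo:first]           # B/P frames before the first I-frame
    tail = frames[last + 1:hi]        # B/P frames after the last I-frame
    return head, times[first], times[last], tail
```

When the part already starts (or ends) on an 'I' frame, `head` (or `tail`) is empty, which matches the "nothing to do" case above.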
# How to determine where to cut
Use `mpv --osd-fractions --osd-level=3 ./movie.ts`
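
The timestamps read from the mpv on-screen display then have to be turned into numbers. A small sketch, assuming the display shows `HH:MM:SS.mmm` and assuming a hypothetical `start-end` notation for a part (the format the script actually expects is not shown here):

```python
def parse_timestamp(text):
    """Convert 'HH:MM:SS.mmm' (or 'MM:SS.mmm', or 'SS.mmm') into seconds."""
    seconds = 0.0
    for field in text.strip().split(":"):
        seconds = seconds * 60 + float(field)
    return seconds


def parse_part(text):
    """Convert a hypothetical 'start-end' description into (start, end) seconds."""
    start_text, end_text = text.split("-", 1)
    start, end = parse_timestamp(start_text), parse_timestamp(end_text)
    if end <= start:
        raise ValueError(f"empty or reversed part: {text!r}")
    return start, end
```

For example, `parse_part("00:01:00.000-00:02:30.500")` describes a part of one and a half minutes starting one minute into the recording.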

File diff suppressed because it is too large


@@ -1,4 +1,6 @@
 xmltodict
 requests
-pygame
 coloredlogs
+tqdm
+iso639-lang
+hexdump