Compare commits

50 Commits

Author SHA1 Message Date
Frédéric Tronel
cb019413cc More linting: no more camelcase variables, functions, arguments or methods. 2025-10-30 16:33:15 +01:00
Frédéric Tronel
40ca3e136b More linting: no more camelcase for function names. 2025-10-29 14:54:42 +01:00
Frédéric Tronel
367cb440d8 Even more linting: function names with snake case, remove unused variables, wrong format f-strings, masked parameters. 2025-10-29 12:37:51 +01:00
Frédéric Tronel
c565699875 Even more linting: no more variable with camel case. 2025-10-29 10:04:47 +01:00
Frédéric Tronel
75f227786f Still more linting (encoding for open, variables with snake case). 2025-10-28 10:55:47 +01:00
Frédéric Tronel
960de11b1b More linting. 2025-10-27 15:48:41 +01:00
Frédéric Tronel
e192c66157 More linting: camel case for variable names, f-format strings. 2025-10-26 21:01:50 +01:00
Frédéric Tronel
23f1db5ffa Even more linting (long lines, trailing spaces, module importation order, variable names). 2025-10-26 17:14:29 +01:00
Frédéric Tronel
362844f8a1 Improve linting by remove superfluous parenthesis. 2025-10-25 16:55:05 +02:00
Frédéric Tronel
6ad2c3b50a Improve linting by remove bad indentations. 2025-10-25 16:47:28 +02:00
Frédéric Tronel
ddec8633e3 Improve linting by remove trailing spaces. 2025-10-25 16:45:20 +02:00
Frédéric Tronel
c3943ff70e Remove trailing spaces. 2025-10-25 16:33:29 +02:00
Frédéric Tronel
926ee16433 Improve pylint score and fix most errors. 2025-10-25 16:09:11 +02:00
Frédéric Tronel
489435a87f Improve pylint score and fix most errors. 2025-10-25 16:05:25 +02:00
Frédéric Tronel
efceec0e48 Huge improvement in the merging of the different video parts using different encoding profiles, since reproducing the H264 profile of broadcast movies in nearly impossible (some features do not have corresponding options in ffmpeg). 2025-09-19 16:41:16 +02:00
Frédéric Tronel
10234d67da Improve the README with details about the processing workflow of the script. 2025-09-19 16:32:11 +02:00
Frédéric Tronel
7e5a500279 The clean target of the Makefile removes all intermediate files created by the script when used with the --keep option. 2025-09-19 16:31:03 +02:00
Frédéric Tronel
8aca12c422 We ignore mpeg TS and Matroska files. 2025-09-19 16:29:47 +02:00
Frédéric Tronel
b94f865831 We handle the case where subtitles track are eventually empty after processing. 2023-12-24 16:52:40 +01:00
Frédéric Tronel
48cc4f8a27 hexdump package is required. 2023-12-24 14:31:20 +01:00
Frédéric Tronel
889b8dd6dc Subtitles extracted through OCR can be remuxed with the final cut movie (in addition to image based ones). 2023-12-24 14:29:42 +01:00
Frédéric Tronel
ffce9aecdf Handling of OCR to generate subtitles files is working. 2023-12-22 14:57:25 +01:00
Frédéric Tronel
4dbf9d9c03 Suppress SRT files for cleaning. 2023-12-22 14:56:05 +01:00
Frédéric Tronel
03922a76d2 Add dependancy to library iso639 that supports the normalized names of languages. 2023-12-22 10:42:34 +01:00
Frédéric Tronel
f23423ca8d Code to take into account the potential change of length field when modifying the EBML tree structure. 2023-12-20 10:46:54 +01:00
Frédéric Tronel
3681ff33f3 Remove code that was here to debug the codec private data changes. 2023-12-20 10:05:52 +01:00
Frédéric Tronel
59b55bac6c Make mkvmerge speaks english for its outputs so that the code is neutral with respect to locally installed languages. 2023-12-20 09:56:39 +01:00
Frédéric Tronel
2bf9b467bb We handle the cases where the old codec private data size is larger, smaller or equal to the new one. 2023-12-19 14:12:23 +01:00
Frédéric Tronel
6959e83327 Add a new option to not take into account sequences that are shorter than a certain threshold. 2023-12-18 16:14:57 +01:00
Frédéric Tronel
2f425aa9cf Adding a bunch of functions to modify codec private data inside video tracks, correct mkv binary representation after such changes. 2023-12-18 16:14:08 +01:00
Frédéric Tronel
556d88d73a mkvinfo command is now mandatory. 2023-12-18 16:11:46 +01:00
Frédéric Tronel
af52c80a8e Positioning inside files using lseek is made uniformly. 2023-12-18 16:11:05 +01:00
Frédéric Tronel
04d23ca1b2 The langage used by commands cannot be set using locales module. 2023-12-18 16:09:48 +01:00
Frédéric Tronel
88d9d15496 If we only try to convert from .ts to .mp4 or .mkv, without any cut, do not remove output file. 2023-12-15 09:38:47 +01:00
Frédéric Tronel
b1c58fc53a Fix a bug where the ffmpeg path was not passed to the ffmpegConvert function. 2023-12-12 12:07:36 +01:00
Frédéric Tronel
4070f34a60 Add a large part of the code needed to extract subtitles via OCR. 2023-12-12 11:57:03 +01:00
Frédéric Tronel
cb600b920d Fix message display. 2023-12-02 21:17:16 +01:00
Frédéric Tronel
bb5333ffca Add some details to message about extraction of video pictures and audio packets. 2023-12-02 21:15:39 +01:00
Frédéric Tronel
1ed4bbf6df Add a function to retrieve packet duration compatible with multiple ffmpeg versions. 2023-12-02 21:11:59 +01:00
Frédéric Tronel
9a8f97a278 Fix missing calls to getTSFrame. 2023-12-02 21:06:25 +01:00
Frédéric Tronel
650724c966 Fix a typo. 2023-12-02 21:04:07 +01:00
Frédéric Tronel
40592dcec2 Add a function to retrieve timestamp of a frame (with multiple ffmpeg version). 2023-12-02 21:03:15 +01:00
Frédéric Tronel
da13f3e9c8 Missing a float conversion. 2023-12-02 20:57:59 +01:00
Frédéric Tronel
b4e304d9ab Fix a typo. 2023-12-02 20:56:49 +01:00
Frédéric Tronel
44d47a564c Make the script compatible with older version of ffmpeg. 2023-12-02 20:53:49 +01:00
Frédéric Tronel
b8394069fb Correct the name of an optional tool: vobsubocr. 2023-12-02 18:10:18 +01:00
Frédéric Tronel
124772aaeb Closing of memory filedescriptor right after their usage (to save memory). 2023-12-02 17:29:36 +01:00
Frédéric Tronel
076e3c990b Better performances and simplification by removing pipes and using memory file descriptors. 2023-12-02 17:25:55 +01:00
Frédéric Tronel
d549311e20 We don't need pygame as a dependency. 2023-12-02 17:23:54 +01:00
Frédéric Tronel
4a1bf64bda A makefile to clean all temp files. 2023-12-01 16:49:59 +01:00
5 changed files with 3272 additions and 590 deletions

2
.gitignore

@@ -1,4 +1,6 @@
*.pcm
*.ppm
*.ts
*.mkv
part*
venv/

2
Makefile Normal file

@@ -0,0 +1,2 @@
clean:
	rm -f *.ppm *.pcm part* *.srt *-ts.txt *-full.h264 *-novideo.mkv fre.*


@@ -15,12 +15,37 @@ to a reference frame (so called I-frames). These frames are only present roughly
which corresponds to quite a long duration (on the order of a second).
I really want to cut the movie with better precision. So I have written a Python script
that leverages _ffmpeg_, _ffprobe_ and _mkvmerge_ to do the job with the required precision.
that leverages _ffmpeg_, _ffprobe_, _mkvmerge_ and _vobsubocr_ to do the job with the required precision.
# Parameters
# How does it work?
The processing follows a fairly long pipeline:
1. The original .ts file is first transformed into an .mp4 file using _ffmpeg_ to correct timestamps:
2. The .mp4 is then transformed into a Matroska container (which is the default container) still using _ffmpeg_:
3. The movie is then cut at the positions passed as parameters. As many parts as needed can be given.
Each part is processed with the same algorithm.
Find the timestamp of the 'I' frame closest to (but after) the start of the part.
Find the timestamp of the 'I' frame closest to (but before) the end of the part.
We then have:
start ----- frame --------- frame --------- end.
'B/P' 'B/P'* 'I' 'I' 'B/P'* 'B/P'
If the start frame is already an 'I' frame, there is nothing to do (likewise for the end).
Otherwise, the 'B' or 'P' frames are extracted from the start up to, but not including, the 'I' frame.
4. The parts obtained in the previous step are then merged using _mkvmerge_:
5. The image-based subtitles are then extracted using _mkvextract_:
6. These images are then processed using _vobsubocr_ to create SRT files:
7. The SRT files are then remuxed inside the Matroska container using _mkvmerge_:
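The boundary search described in step 3 can be sketched as follows. This is only an illustration under stated assumptions, not the script's actual code: the function name `split_part` is hypothetical, and the I-frame timestamps are assumed to be already available as a sorted list of seconds. How the script treats the frames between the last 'I' frame and the end of the part is not spelled out above, so the tail interval returned here is an assumption as well.

```python
from bisect import bisect_left, bisect_right


def split_part(iframes, start, end):
    """Split the part [start, end] around its first and last I-frames.

    iframes: sorted I-frame timestamps in seconds (assumed already known).
    Returns (head, middle, tail); each is a (t0, t1) tuple or None.
    """
    i = bisect_left(iframes, start)      # first I-frame at or after start
    j = bisect_right(iframes, end) - 1   # last I-frame at or before end
    if i > j:
        # No I-frame inside the part: the whole part is one piece.
        return (start, end), None, None
    first_i, last_i = iframes[i], iframes[j]
    # Nothing to do at a boundary that already sits on an I-frame.
    head = (start, first_i) if first_i > start else None
    tail = (last_i, end) if last_i < end else None
    middle = (first_i, last_i) if last_i > first_i else None
    return head, middle, tail


if __name__ == "__main__":
    print(split_part([0.0, 1.0, 2.0, 3.0], 0.4, 2.6))
```

With I-frames every second and a part going from 0.4 s to 2.6 s, the head covers 0.4 to 1.0, the middle 1.0 to 2.0 and the tail 2.0 to 2.6.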
# How to determine where to cut
Use `mpv --osd-fractions --osd-level=3 ./movie.ts`
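With these options, mpv displays the playback position with fractional seconds. If the position is needed as a plain number of seconds, a small helper such as the one below can convert it. This is a sketch: the name `osd_to_seconds` is hypothetical, the `HH:MM:SS.mmm` input format is an assumption about what is read off the screen, and the format the script itself expects for cut points is not stated here.

```python
def osd_to_seconds(stamp):
    """Convert an 'HH:MM:SS.mmm' position (assumed format) to seconds."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


if __name__ == "__main__":
    print(osd_to_seconds("00:01:30.500"))
```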

File diff suppressed because it is too large


@@ -1,5 +1,6 @@
xmltodict
requests
pygame
coloredlogs
tqdm
iso639-lang
hexdump