Compare commits

28 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Frédéric Tronel | cb019413cc | More linting: no more camelcase variables, functions, arguments or methods. | 2025-10-30 16:33:15 +01:00 |
| Frédéric Tronel | 40ca3e136b | More linting: no more camelcase for function names. | 2025-10-29 14:54:42 +01:00 |
| Frédéric Tronel | 367cb440d8 | Even more linting: function names with snake case, remove unused variables, wrong format f-strings, masked parameters. | 2025-10-29 12:37:51 +01:00 |
| Frédéric Tronel | c565699875 | Even more linting: no more variables with camel case. | 2025-10-29 10:04:47 +01:00 |
| Frédéric Tronel | 75f227786f | Still more linting (encoding for open, variables with snake case). | 2025-10-28 10:55:47 +01:00 |
| Frédéric Tronel | 960de11b1b | More linting. | 2025-10-27 15:48:41 +01:00 |
| Frédéric Tronel | e192c66157 | More linting: camel case for variable names, f-format strings. | 2025-10-26 21:01:50 +01:00 |
| Frédéric Tronel | 23f1db5ffa | Even more linting (long lines, trailing spaces, module import order, variable names). | 2025-10-26 17:14:29 +01:00 |
| Frédéric Tronel | 362844f8a1 | Improve linting by removing superfluous parentheses. | 2025-10-25 16:55:05 +02:00 |
| Frédéric Tronel | 6ad2c3b50a | Improve linting by removing bad indentation. | 2025-10-25 16:47:28 +02:00 |
| Frédéric Tronel | ddec8633e3 | Improve linting by removing trailing spaces. | 2025-10-25 16:45:20 +02:00 |
| Frédéric Tronel | c3943ff70e | Remove trailing spaces. | 2025-10-25 16:33:29 +02:00 |
| Frédéric Tronel | 926ee16433 | Improve pylint score and fix most errors. | 2025-10-25 16:09:11 +02:00 |
| Frédéric Tronel | 489435a87f | Improve pylint score and fix most errors. | 2025-10-25 16:05:25 +02:00 |
| Frédéric Tronel | efceec0e48 | Huge improvement in the merging of the different video parts using different encoding profiles, since reproducing the H264 profile of broadcast movies is nearly impossible (some features do not have corresponding options in ffmpeg). | 2025-09-19 16:41:16 +02:00 |
| Frédéric Tronel | 10234d67da | Improve the README with details about the processing workflow of the script. | 2025-09-19 16:32:11 +02:00 |
| Frédéric Tronel | 7e5a500279 | The clean target of the Makefile removes all intermediate files created by the script when used with the --keep option. | 2025-09-19 16:31:03 +02:00 |
| Frédéric Tronel | 8aca12c422 | We ignore MPEG-TS and Matroska files. | 2025-09-19 16:29:47 +02:00 |
| Frédéric Tronel | b94f865831 | We handle the case where subtitle tracks turn out to be empty after processing. | 2023-12-24 16:52:40 +01:00 |
| Frédéric Tronel | 48cc4f8a27 | hexdump package is required. | 2023-12-24 14:31:20 +01:00 |
| Frédéric Tronel | 889b8dd6dc | Subtitles extracted through OCR can be remuxed with the final cut movie (in addition to image-based ones). | 2023-12-24 14:29:42 +01:00 |
| Frédéric Tronel | ffce9aecdf | Handling of OCR to generate subtitle files is working. | 2023-12-22 14:57:25 +01:00 |
| Frédéric Tronel | 4dbf9d9c03 | Delete SRT files when cleaning. | 2023-12-22 14:56:05 +01:00 |
| Frédéric Tronel | 03922a76d2 | Add dependency on the iso639 library, which supports the normalized names of languages. | 2023-12-22 10:42:34 +01:00 |
| Frédéric Tronel | f23423ca8d | Code to take into account the potential change of length field when modifying the EBML tree structure. | 2023-12-20 10:46:54 +01:00 |
| Frédéric Tronel | 3681ff33f3 | Remove code that was here to debug the codec private data changes. | 2023-12-20 10:05:52 +01:00 |
| Frédéric Tronel | 59b55bac6c | Make mkvmerge speak English for its outputs so that the code is neutral with respect to locally installed languages. | 2023-12-20 09:56:39 +01:00 |
| Frédéric Tronel | 2bf9b467bb | We handle the cases where the old codec private data size is larger, smaller or equal to the new one. | 2023-12-19 14:12:23 +01:00 |
5 changed files with 2969 additions and 708 deletions

.gitignore (2 lines changed)

@@ -1,4 +1,6 @@
 *.pcm
 *.ppm
+*.ts
+*.mkv
 part*
 venv/

Makefile

@@ -1,2 +1,2 @@
 clean:
-	rm -f *.ppm *.pcm part*
+	rm -f *.ppm *.pcm part* *.srt *-ts.txt *-full.h264 *-novideo.mkv fre.*


@@ -15,12 +15,37 @@ to a reference frame (so called I-frames). These frames are only present roughly
which corresponds to a fairly long duration (on the order of a second).
I want to cut the movie with better precision, so I have written a Python script
-that leverages _ffmpeg_, _ffprobe_ and _mkvmerge_ to do the job with the required precision.
+that leverages _ffmpeg_, _ffprobe_, _mkvmerge_ and _vobsubocr_ to do the job with the required precision.
# Parameters
# How does it work?
The processing follows a fairly long pipeline:
1. The original .ts file is first converted into an .mp4 file using _ffmpeg_ to correct the timestamps.
2. The .mp4 file is then converted into a Matroska container (which is the default container), again using _ffmpeg_.
3. The movie is then cut according to the positions passed as parameters. Any number of parts can be given.
Each part is processed with the same algorithm:
Find the timestamp of the 'I' frame closest to (but after) the start of the part.
Find the timestamp of the 'I' frame closest to (but before) the end of the part.
We then have
start ----- frame --------- frame --------- end
'B/P' 'B/P'* 'I' 'I' 'B/P'* 'B/P'
If the start frame is already an 'I' frame, there is nothing to do (likewise for the end).
Otherwise, the 'B' or 'P' frames are extracted from the start up to, but not including, the 'I' frame.
4. The parts obtained in the previous step are then merged using _mkvmerge_.
5. The (image-based) subtitles are then extracted using _mkvextract_.
6. These images are then processed using _vobsubocr_ to create SRT files.
7. The SRT files are then remuxed into the Matroska container using _mkvmerge_.
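The I-frame search described in step 3 can be illustrated with a small sketch. This is not the script's actual code: the names `plan_cut` and `CutPlan` are hypothetical, and the sketch assumes the I-frame timestamps have already been collected (for example with _ffprobe_) into a sorted list of seconds. It only decides which time ranges lie between I-frames and which leading or trailing ranges contain 'B'/'P' frames that need separate handling.

```python
from bisect import bisect_left, bisect_right
from typing import NamedTuple, Optional, Sequence, Tuple

Range = Tuple[float, float]


class CutPlan(NamedTuple):
    """Hypothetical result type: time ranges (in seconds) for one part."""
    head: Optional[Range]  # from the start up to (not including) the first I-frame
    body: Optional[Range]  # from the first I-frame up to the last I-frame
    tail: Optional[Range]  # from the last I-frame up to the end


def plan_cut(iframe_times: Sequence[float], start: float, end: float) -> CutPlan:
    """Split [start, end] around the enclosing I-frames.

    iframe_times must be sorted in ascending order.
    """
    lo = bisect_left(iframe_times, start)      # index of first I-frame >= start
    hi = bisect_right(iframe_times, end) - 1   # index of last I-frame <= end
    if lo > hi:
        # No I-frame inside the part: everything needs separate handling.
        return CutPlan(head=(start, end), body=None, tail=None)
    first_i, last_i = iframe_times[lo], iframe_times[hi]
    # If the part already starts (or ends) on an I-frame, there is nothing to do.
    head = (start, first_i) if first_i > start else None
    tail = (last_i, end) if last_i < end else None
    body = (first_i, last_i) if last_i > first_i else None
    return CutPlan(head, body, tail)
```

For instance, `plan_cut([0.0, 1.0, 2.0, 3.0, 4.0], 0.4, 3.6)` yields a head range before the I-frame at 1.0, a body between the I-frames at 1.0 and 3.0, and a tail after the I-frame at 3.0.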
# How to determine where to cut
Use `mpv --osd-fractions --osd-level=3 ./movie.ts`
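With these options, _mpv_ shows the playback position with millisecond precision. If the position is noted down as `HH:MM:SS.mmm`, a small helper can convert it to seconds. This is only a sketch: the function name `timestamp_to_seconds` is hypothetical, and the format actually expected by the script's parameters is not described here.

```python
def timestamp_to_seconds(text: str) -> float:
    """Convert an 'HH:MM:SS.mmm' timestamp into a number of seconds.

    Assumes exactly three colon-separated fields, as noted down from the mpv OSD.
    """
    hours, minutes, seconds = text.strip().split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)
```

For example, `timestamp_to_seconds("00:01:30.500")` gives `90.5`.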

File diff suppressed because it is too large


@@ -2,3 +2,5 @@ xmltodict
 requests
 coloredlogs
 tqdm
+iso639-lang
+hexdump