Compare commits

21 Commits

Author SHA1 Message Date
Frédéric Tronel
cb019413cc More linting: no more camelcase variables, functions, arguments or methods. 2025-10-30 16:33:15 +01:00
Frédéric Tronel
40ca3e136b More linting: no more camelcase for function names. 2025-10-29 14:54:42 +01:00
Frédéric Tronel
367cb440d8 Even more linting: function names with snake case, remove unused variables, wrong format f-strings, masked parameters. 2025-10-29 12:37:51 +01:00
Frédéric Tronel
c565699875 Even more linting: no more variable with camel case. 2025-10-29 10:04:47 +01:00
Frédéric Tronel
75f227786f Still more linting (encoding for open, variables with snake case). 2025-10-28 10:55:47 +01:00
Frédéric Tronel
960de11b1b More linting. 2025-10-27 15:48:41 +01:00
Frédéric Tronel
e192c66157 More linting: camel case for variable names, f-format strings. 2025-10-26 21:01:50 +01:00
Frédéric Tronel
23f1db5ffa Even more linting (long lines, trailing spaces, module importation order, variable names). 2025-10-26 17:14:29 +01:00
Frédéric Tronel
362844f8a1 Improve linting by remove superfluous parenthesis. 2025-10-25 16:55:05 +02:00
Frédéric Tronel
6ad2c3b50a Improve linting by remove bad indentations. 2025-10-25 16:47:28 +02:00
Frédéric Tronel
ddec8633e3 Improve linting by remove trailing spaces. 2025-10-25 16:45:20 +02:00
Frédéric Tronel
c3943ff70e Remove trailing spaces. 2025-10-25 16:33:29 +02:00
Frédéric Tronel
926ee16433 Improve pylint score and fix most errors. 2025-10-25 16:09:11 +02:00
Frédéric Tronel
489435a87f Improve pylint score and fix most errors. 2025-10-25 16:05:25 +02:00
Frédéric Tronel
efceec0e48 Huge improvement in the merging of the different video parts using different encoding profiles, since reproducing the H264 profile of broadcast movies in nearly impossible (some features do not have corresponding options in ffmpeg). 2025-09-19 16:41:16 +02:00
Frédéric Tronel
10234d67da Improve the README with details about the processing workflow of the script. 2025-09-19 16:32:11 +02:00
Frédéric Tronel
7e5a500279 The clean target of the Makefile removes all intermediate files created by the script when used with the --keep option. 2025-09-19 16:31:03 +02:00
Frédéric Tronel
8aca12c422 We ignore mpeg TS and Matroska files. 2025-09-19 16:29:47 +02:00
Frédéric Tronel
b94f865831 We handle the case where subtitles track are eventually empty after processing. 2023-12-24 16:52:40 +01:00
Frédéric Tronel
48cc4f8a27 hexdump package is required. 2023-12-24 14:31:20 +01:00
Frédéric Tronel
889b8dd6dc Subtitles extracted through OCR can be remuxed with the final cut movie (in addition to image based ones). 2023-12-24 14:29:42 +01:00
5 changed files with 2856 additions and 747 deletions

.gitignore (2 changes)

@@ -1,4 +1,6 @@
*.pcm
*.ppm
*.ts
*.mkv
part*
venv/

@@ -1,2 +1,2 @@
clean:
-	rm -f *.ppm *.pcm part* *.srt
+	rm -f *.ppm *.pcm part* *.srt *-ts.txt *-full.h264 *-novideo.mkv fre.*

@@ -15,12 +15,37 @@ to a reference frame (so called I-frames). These frames are only present roughly
which corresponds to a quite long duration (on the order of a second).
I really want to cut the movie with better precision. So I have written a Python script
-that leverages _ffmpeg_, _ffprobe_ and _mkvmerge_ to do the job with the required precision.
+that leverages _ffmpeg_, _ffprobe_, _mkvmerge_ and _vobsubocr_ to do the job with the required precision.
# Parameters
# How does it work?
The processing follows a fairly long pipeline:
1. The original .ts file is first converted into an .mp4 file using _ffmpeg_ to correct the timestamps.
2. The .mp4 file is then converted into a Matroska container (which is the default container), still using _ffmpeg_.
3. The movie is then cut according to the parts passed as parameters. As many parts as needed can be given.
Each part is processed with the same algorithm:
Find the timestamp of the 'I' frame nearest to, but after, the start of the part.
Find the timestamp of the 'I' frame nearest to, but before, the end of the part.
We then have
start ----- frame --------- frame --------- end.
'B/P' 'B/P'* 'I' 'I' 'B/P'* 'B/P'
If the start frame is already an 'I' frame, there is nothing to do (likewise for the end).
Otherwise, the 'B' or 'P' frames are extracted from the start up to, but not including, the 'I' frame.
4. The parts obtained in the previous step are then merged using _mkvmerge_.
5. The image-based subtitles are then extracted using _mkvextract_.
6. These images are then processed using _vobsubocr_ to create SRT files.
7. The SRT files are then remuxed into the Matroska container using _mkvmerge_.
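
The boundary search described in step 3 can be sketched as follows. This is only an illustration, not the script's actual code: the function name `split_part`, the returned dictionary and the idea of passing a pre-computed, sorted list of I-frame timestamps are all assumptions made for the example. It also assumes that the head and tail segments (the ones that do not start on an I-frame) are the ones needing special handling, while the body between the two I-frames can be kept as is.

```python
from bisect import bisect_left, bisect_right


def split_part(start, end, iframe_times):
    """Split the part [start, end] around its first and last I-frames.

    iframe_times is a sorted list of I-frame timestamps in seconds
    (hypothetical input, e.g. collected beforehand with ffprobe).
    Returns a dict with 'head', 'body' and 'tail' (start, end) tuples,
    or None for a segment that is empty.
    """
    # Index of the first I-frame at or after the start of the part.
    i = bisect_left(iframe_times, start)
    # Index of the last I-frame at or before the end of the part.
    j = bisect_right(iframe_times, end) - 1

    # No I-frame inside the part: nothing can be kept as is.
    if i >= len(iframe_times) or j < 0 or iframe_times[i] > iframe_times[j]:
        return {"head": (start, end), "body": None, "tail": None}

    first_i, last_i = iframe_times[i], iframe_times[j]
    return {
        # B/P frames from the start up to the first I-frame (not included).
        "head": (start, first_i) if first_i > start else None,
        # Span between the two I-frames.
        "body": (first_i, last_i) if last_i > first_i else None,
        # Frames from the last I-frame up to the end of the part.
        "tail": (last_i, end) if last_i < end else None,
    }


iframes = [0.0, 1.0, 2.0, 3.0, 4.0]
print(split_part(0.4, 3.6, iframes))
# {'head': (0.4, 1.0), 'body': (1.0, 3.0), 'tail': (3.0, 3.6)}
print(split_part(1.0, 3.0, iframes))
# {'head': None, 'body': (1.0, 3.0), 'tail': None}
```

When the part already starts or ends on an I-frame, the corresponding head or tail is `None`, which matches the "nothing to do" case above.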
# How to determine where to cut
Use `mpv --osd-fractions --osd-level=3 ./movie.ts`
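
With these options, _mpv_ displays the playback position with fractional seconds. If the position is noted down as `HH:MM:SS.mmm` (an assumption about how the value is written down, not something the script is known to require), a small helper such as the hypothetical `timestamp_to_seconds` below converts it to seconds for further computation:

```python
def timestamp_to_seconds(text):
    """Convert an 'HH:MM:SS.mmm' string into a number of seconds.

    Hypothetical helper for illustration; raises ValueError when the
    string does not have exactly three colon-separated fields.
    """
    hours, minutes, seconds = text.strip().split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


print(timestamp_to_seconds("01:00:00.000"))
# 3600.0
```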

File diff suppressed because it is too large.

@@ -3,3 +3,4 @@ requests
coloredlogs
tqdm
iso639-lang
hexdump