6e42088656
Merge remote-tracking branch 'upstream/master' into prompt
2024-09-04 17:48:06 +08:00
Mahmoud Ashraf
d57c5b40b0
Remove the usage of transformers.pipeline from BatchedInferencePipeline and fix word timestamps for batched inference ( #921 )
...
* fix word timestamps for batched inference
* remove hf pipeline
2024-07-27 09:02:58 +07:00
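For reference, a minimal sketch of how the batched pipeline touched by these commits is typically driven; the model name, audio path, and batch size are illustrative assumptions rather than values from the commit:

```python
# Minimal sketch of batched inference with faster-whisper (illustrative values).
from faster_whisper import WhisperModel, BatchedInferencePipeline

# Load a CTranslate2 Whisper model, then wrap it in the batched pipeline.
model = WhisperModel("large-v3", device="cuda", compute_type="float16")
batched_model = BatchedInferencePipeline(model=model)

# word_timestamps=True exercises the word-timestamp path fixed in #921.
segments, info = batched_model.transcribe(
    "audio.wav", batch_size=16, word_timestamps=True
)
for segment in segments:
    for word in segment.words:
        print(f"[{word.start:.2f} -> {word.end:.2f}] {word.word}")
```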
zh-plus
83a368e98a
Make VAD-related parameters configurable for batched inference. ( #923 )
2024-07-24 09:00:32 +07:00
Jilt Sebastian
eb8390233c
New PR for Faster Whisper: Batching Support, Speed Boosts, and Quality Enhancements ( #856 )
...
Batching Support, Speed Boosts, and Quality Enhancements
---------
Co-authored-by: Hargun Mujral <83234565+hargunmujral@users.noreply.github.com>
Co-authored-by: MahmoudAshraf97 <hassouna97.ma@gmail.com>
2024-07-18 16:48:52 +07:00
4a59bb011d
Merge remote-tracking branch 'upstream/master' into prompt
2024-07-10 10:16:35 +08:00
trungkienbkhn
fbcf58bf98
Fix language detection with non-speech audio ( #895 )
2024-07-05 14:43:45 +07:00
Jordi Mas
1195359984
Filter out non_speech_tokens in suppressed tokens ( #898 )
...
* Filter out non_speech_tokens in suppressed tokens
2024-07-05 14:43:11 +07:00
ABen
8862bee1f8
Improve language detection when using clip_timestamps ( #867 )
2024-07-01 16:12:45 +07:00
Napuh
f53be1e811
Add distil models to WhisperModel init and download_model docstrings ( #847 )
...
* chore: add distil models to WhisperModel init docstring and download_model docstring
2024-05-20 08:51:22 +07:00
Natanael Tan
4acdb5c619
Fix #839 incorrect clip_timestamps being used in model ( #842 )
...
* Fix #839
Changed the code to update the options object instead of the TranscriptionOptions class, which was likely the cause of the unexpected behaviour
2024-05-17 16:35:07 +07:00
Keating Reid
49a80eb8a8
Clarify documentation for hotwords ( #817 )
...
* Clarify documentation for hotwords
* Remove redundant type specifications
2024-05-06 08:52:59 +07:00
trungkienbkhn
8d5e6d56d9
Support initializing more whisper model args ( #807 )
2024-05-04 15:12:59 +07:00
jax
847fec4492
Feature/add hotwords ( #731 )
...
* add hotword params
---------
Co-authored-by: jax <jax_builder@gamil.com>
2024-05-04 15:11:52 +07:00
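A brief sketch of how the hotwords parameter added here is typically used; the model size, audio path, and hotword string are illustrative assumptions:

```python
# Bias decoding toward domain-specific terms via the hotwords parameter (illustrative).
from faster_whisper import WhisperModel

model = WhisperModel("small", device="cpu", compute_type="int8")
segments, info = model.transcribe(
    "meeting.wav",
    hotwords="CTranslate2 faster-whisper SYSTRAN",  # terms to bias decoding toward
)
print(" ".join(segment.text for segment in segments))
```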
4ee1d54c14
Merge branch 'master' into prompt
2024-04-08 20:56:49 +08:00
Purfview
b024972a56
Foolproof: Disable VAD if clip_timestamps is in use ( #769 )
...
* Foolproof: Disable VAD if clip_timestamps is in use
Prevent silly things from happening.
2024-04-02 18:20:34 +02:00
Purfview
8ae82c8372
Bugfix: code breaks if audio is empty ( #768 )
...
* Bugfix: code breaks if audio is empty
Regression introduced by PR https://github.com/SYSTRAN/faster-whisper/pull/732
2024-04-02 18:18:12 +02:00
trungkienbkhn
1eb9a8004c
Improve language detection ( #732 )
2024-03-12 15:44:49 +01:00
e50d82c18c
Merge remote-tracking branch 'upstream/master' into prompt
2024-03-10 11:53:58 +08:00
Purfview
5090cc9d0d
Fix window end heuristic for hallucination_silence_threshold ( #706 )
...
Removes the wishful heuristic that was causing more issues than it fixed.
Same as https://github.com/openai/whisper/pull/2043
Example of the issue: https://github.com/openai/whisper/pull/1838#issuecomment-1960041500
2024-02-29 17:59:32 +01:00
trungkienbkhn
16141e65d9
Add pad_or_trim function to handle segment before encoding ( #705 )
2024-02-29 17:08:28 +01:00
4b64ef1f70
Merge branch 'master' into prompt
2024-02-23 10:52:53 +08:00
Purfview
30d6043e90
Prevent infinite loop for out-of-bound timestamps in clip_timestamps ( #697 )
...
Same as https://github.com/openai/whisper/pull/2005
2024-02-22 09:48:35 +01:00
trungkienbkhn
092067208b
Add clip_timestamps and hallucination_silence_threshold options ( #646 )
2024-02-20 17:34:54 +01:00
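A minimal sketch of the two options added in #646 as they are typically passed to transcribe(); the specific values below are assumptions for illustration:

```python
from faster_whisper import WhisperModel

model = WhisperModel("medium", device="cpu", compute_type="int8")
segments, info = model.transcribe(
    "audio.wav",
    # Only decode these regions (start,end pairs in seconds).
    clip_timestamps="0,30,60,90",
    # When a possible hallucination is detected, skip ahead if the
    # surrounding silence is longer than this many seconds.
    hallucination_silence_threshold=2.0,
    word_timestamps=True,  # the hallucination heuristic relies on word timings
)
```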
d04e685ca2
Merge branch 'master' into prompt
2024-02-19 17:31:58 +08:00
Purfview
3aec421849
Add: More clarity about what "max_new_tokens" does ( #658 )
...
* Add: More clarity about what "max_new_tokens" does
2024-01-28 21:40:33 +01:00
Purfview
00efce1696
Bugfix: Illogical "Avoid computing higher temperatures on no_speech" ( #652 )
2024-01-24 11:54:43 +01:00
metame
ad3c83045b
support distil-whisper ( #557 )
2024-01-24 10:17:12 +01:00
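A sketch of loading one of the distil-whisper conversions by its short alias; the alias and settings below are illustrative assumptions:

```python
from faster_whisper import WhisperModel

# distil-whisper checkpoints are English-only, so pin the language; disabling
# conditioning on previous text is commonly recommended for these checkpoints.
model = WhisperModel("distil-large-v2", device="cuda", compute_type="float16")
segments, info = model.transcribe(
    "audio.wav", language="en", condition_on_previous_text=False
)
```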
b835bdaaf1
Merge remote-tracking branch 'upstream/master' into prompt
2023-12-25 17:56:50 +08:00
Purfview
ebcfd6b964
Fix broken prompt_reset_on_temperature ( #604 )
...
* Fix broken prompt_reset_on_temperature
Fixing: https://github.com/SYSTRAN/faster-whisper/issues/603
Broken because `generate_with_fallback()` doesn't return the final temperature.
Regression since PR356 -> https://github.com/SYSTRAN/faster-whisper/pull/356
2023-12-13 13:14:39 +01:00
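For context, a hedged sketch of how the option fixed here is typically passed; the model size, audio path, and threshold value are illustrative:

```python
from faster_whisper import WhisperModel

model = WhisperModel("base", device="cpu", compute_type="int8")
segments, info = model.transcribe(
    "audio.wav",
    condition_on_previous_text=True,
    # Drop the previous-text prompt whenever decoding had to fall back to a
    # temperature at or above this value.
    prompt_reset_on_temperature=0.5,
)
```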
trungkienbkhn
19329a3611
Word timing tweaks ( #616 )
2023-12-13 12:38:44 +01:00
Oscaarjs
3084409633
Add V3 Support ( #578 )
...
* Add V3 Support
* update conversion example
---------
Co-authored-by: oscaarjs <oscar.johansson@conversy.se>
2023-11-24 23:16:12 +01:00
Guillaume Klein
e94711bb5c
Add property WhisperModel.supported_languages ( #476 )
...
* Expose function supported_languages
* Make it a method
2023-09-14 17:42:02 +02:00
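A quick sketch of the property introduced here; the model size is an illustrative assumption:

```python
from faster_whisper import WhisperModel

# English-only checkpoints report a single code; multilingual ones list all of them.
model = WhisperModel("tiny.en", device="cpu", compute_type="int8")
print(model.supported_languages)  # expected: ["en"]
```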
Guillaume Klein
a49097e655
Add some missing typing annotations in transcribe.py
2023-09-12 15:45:54 +02:00
Guillaume Klein
81086f6d33
Always run the encoder at the beginning of the loop ( #468 )
2023-09-12 14:44:37 +02:00
Guillaume Klein
4a41746e55
Log a warning when the model is English-only but the language is set to something else ( #454 )
2023-09-04 11:55:40 +02:00
Guillaume Klein
1e6eb967c9
Add "large" alias for "large-v2" model ( #453 )
2023-09-04 11:54:42 +02:00
Guillaume Klein
f0ff12965a
Expose generation parameter no_repeat_ngram_size ( #449 )
2023-09-01 17:31:30 +02:00
MinorJinx
e87fbf8a49
Added audio duration after VAD to TranscriptionInfo object ( #445 )
...
* Added the VAD-removed audio duration to the TranscriptionInfo object
Along with the duration of the original audio, this commit adds the seconds of audio removed by the VAD to the returned info object.
* Changed the naming to duration_after_vad
Instead of returning the audio duration that was removed, the property now returns the final duration after VAD.
If vad_filter is False or the VAD does not remove any audio, the original duration is returned.
2023-08-31 17:19:48 +02:00
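A minimal sketch of reading the two duration fields described above; the model size and audio path are illustrative assumptions:

```python
from faster_whisper import WhisperModel

model = WhisperModel("base", device="cpu", compute_type="int8")
segments, info = model.transcribe("audio.wav", vad_filter=True)
# duration is the length of the original audio; duration_after_vad is what
# remains once the VAD has dropped non-speech. They are equal when
# vad_filter is False or nothing is removed.
print(f"original: {info.duration:.1f}s, after VAD: {info.duration_after_vad:.1f}s")
```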
Aisu Wata
1562b02345
added repetition_penalty to TranscriptionOptions ( #403 )
...
Co-authored-by: Aisu Wata <aisu.wata0@gmail.com>
2023-08-06 10:08:24 +02:00
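A short sketch combining this repetition_penalty option ( #403 ) with the no_repeat_ngram_size option exposed in #449 above; the values shown are illustrative assumptions:

```python
from faster_whisper import WhisperModel

model = WhisperModel("base", device="cpu", compute_type="int8")
segments, info = model.transcribe(
    "audio.wav",
    repetition_penalty=1.1,   # penalize tokens that were already generated
    no_repeat_ngram_size=3,   # forbid exact repeats of 3-token sequences
)
print(" ".join(segment.text for segment in segments))
```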
Purfview
1ce16652ee
Adds DEBUG log message for prompt_reset_on_temperature ( #399 )
...
Produce a DEBUG log message when the prompt_reset_on_temperature threshold is met.
2023-08-04 09:06:17 +02:00
Purfview
857be6f621
Rename clear_previous_text_on_temperature argument ( #398 )
...
`prompt_reset_on_temperature` makes it clearer what the argument does.
2023-08-03 18:44:37 +02:00
KH
1a1eb1a027
Add clear_previous_text_on_temperature parameter ( #397 )
...
* Add clear_previous_text_on_temperature parameter
* Add a description
2023-08-03 15:40:58 +02:00
Guillaume Klein
0f55c436fe
Invalidate the cached encoder output when no_speech threshold is met ( #376 )
2023-07-24 10:57:15 +02:00
KH
e786e26f75
Return result with best log prob when all temperature fallbacks failed ( #356 )
...
* Resolve Inference Selection Bug
* Refactor for better readability
* Filter out results with compression_ratio
* Refactor to avoid variable repetition
* Fix incorrect index and perform minor refactoring
* Remove final_temperature variable
2023-07-20 16:13:11 +02:00
KH
687db319e0
Remove duplicate code ( #359 )
2023-07-18 16:03:01 +02:00
Guillaume Klein
0e051a5b77
Prepend prefix tokens with the initial timestamp token ( #358 )
2023-07-18 15:22:39 +02:00
Hoon
3b4a6aa1c2
Improve timestamp heuristics ( #336 )
...
* Improve timestamp heuristics
* Chore
2023-07-05 15:16:53 +02:00
zh-plus
c7cb2aa8d4
Add support for using Whisper models from Hugging Face by specifying the model ID. ( #334 )
...
* Add support for downloading CTranslate2-converted models from Hugging Face.
* Update utils.py to pass Flake8.
* Update utils.py to pass black.
* Remove redundant usage instructions.
* Apply suggestions from code review
Co-authored-by: Guillaume Klein <guillaumekln@users.noreply.github.com>
---------
Co-authored-by: Guillaume Klein <guillaumekln@users.noreply.github.com>
2023-07-03 17:40:10 +02:00
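A sketch of loading a CTranslate2-converted model directly by its Hugging Face model ID; the repository name below illustrates the naming convention and is not taken from the commit:

```python
from faster_whisper import WhisperModel

# Any Hub repository containing a CTranslate2 conversion of a Whisper model
# can be referenced by its ID; it is downloaded and cached automatically.
model = WhisperModel("Systran/faster-whisper-large-v2", device="cpu", compute_type="int8")
segments, info = model.transcribe("audio.wav")
```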
Guillaume Klein
c0d93d0829
Avoid computing higher temperatures on no_speech segments ( #225 )
...
Port commit e334ff141d
2023-07-03 10:20:36 +02:00
Guillaume Klein
19c294f978
Squash long words at window and sentence boundaries ( #226 )
...
Port commit 255887f219
2023-07-03 10:20:20 +02:00