Keating Reid
49a80eb8a8
Clarify documentation for hotwords ( #817 )
...
* Clarify documentation for hotwords
* Remove redundant type specifications
2024-05-06 08:52:59 +07:00
trungkienbkhn
8d5e6d56d9
Support initializing more whisper model args ( #807 )
2024-05-04 15:12:59 +07:00
jax
847fec4492
Feature/add hotwords ( #731 )
...
* add hotword params
---------
Co-authored-by: jax <jax_builder@gamil.com >
2024-05-04 15:11:52 +07:00
otakutyrant
91c8307aa6
make faster_whisper.assets as a valid python package to distribute ( #772 ) ( #774 )
2024-04-02 18:22:22 +02:00
Purfview
b024972a56
Foolproof: Disable VAD if clip_timestamps is in use ( #769 )
...
* Foolproof: Disable VAD if clip_timestamps is in use
Prevent silly things from happening.
2024-04-02 18:20:34 +02:00
Purfview
8ae82c8372
Bugfix: code breaks if audio is empty ( #768 )
...
* Bugfix: code breaks if audio is empty
Regression since PR https://github.com/SYSTRAN/faster-whisper/pull/732
2024-04-02 18:18:12 +02:00
trungkienbkhn
e0c3a9ed34
Update project github link to SYSTRAN ( #746 )
2024-03-27 08:31:17 +01:00
Sanchit Gandhi
a67e0e47ae
Add support for distil-large-v3 ( #755 )
...
* add distil-large-v3
* Update README.md
* use fp16 weights from Systran
2024-03-26 14:58:39 +01:00
trungkienbkhn
1eb9a8004c
Improve language detection ( #732 )
2024-03-12 15:44:49 +01:00
trungkienbkhn
a342b028b7
Bump version to 1.0.1 ( #725 )
2024-03-01 11:32:12 +01:00
Purfview
5090cc9d0d
Fix window end heuristic for hallucination_silence_threshold ( #706 )
...
Removes the wishful heuristic causing more issues than it's fixing.
Same as https://github.com/openai/whisper/pull/2043
Example of the issue: https://github.com/openai/whisper/pull/1838#issuecomment-1960041500
2024-02-29 17:59:32 +01:00
trungkienbkhn
16141e65d9
Add pad_or_trim function to handle segment before encoding ( #705 )
2024-02-29 17:08:28 +01:00
trungkienbkhn
06d32bf0c1
Bump version to 1.0.0 ( #696 )
2024-02-22 09:49:01 +01:00
Purfview
30d6043e90
Prevent infinite loop for out-of-bound timestamps in clip_timestamps ( #697 )
...
Same as https://github.com/openai/whisper/pull/2005
2024-02-22 09:48:35 +01:00
trungkienbkhn
092067208b
Add clip_timestamps and hallucination_silence_threshold options ( #646 )
2024-02-20 17:34:54 +01:00
Purfview
3aec421849
Add: More clarity of what "max_new_tokens" does ( #658 )
...
* Add: More clarity of what "max_new_tokens" does
2024-01-28 21:40:33 +01:00
Purfview
00efce1696
Bugfix: Illogical "Avoid computing higher temperatures on no_speech" ( #652 )
2024-01-24 11:54:43 +01:00
metame
ad3c83045b
support distil-whisper ( #557 )
2024-01-24 10:17:12 +01:00
Purfview
ebcfd6b964
Fix broken prompt_reset_on_temperature ( #604 )
...
* Fix broken prompt_reset_on_temperature
Fixing: https://github.com/SYSTRAN/faster-whisper/issues/603
Broken because `generate_with_fallback()` doesn't return final temperature.
Regression since PR356 -> https://github.com/SYSTRAN/faster-whisper/pull/356
2023-12-13 13:14:39 +01:00
trungkienbkhn
19329a3611
Word timing tweaks ( #616 )
2023-12-13 12:38:44 +01:00
Clayton Yochum
9641d5f56a
Force read-mode in av.open ( #566 )
...
The `av.open` function checks input metadata to determine the mode to open with ("r" or "w"). If an input to `decode_audio` is detected as write-mode, it cannot be read without this change. Forcing read mode solves this.
2023-11-27 10:43:35 +01:00
Dang Chuan Nguyen
e1a218fab1
Bump version to 0.10.0
2023-11-24 23:19:47 +01:00
Oscaarjs
3084409633
Add V3 Support ( #578 )
...
* Add V3 Support
* update conversion example
---------
Co-authored-by: oscaarjs <oscar.johansson@conversy.se >
2023-11-24 23:16:12 +01:00
Guillaume Klein
5a0541ea7d
Bump version to 0.9.0
2023-09-18 16:21:37 +02:00
Guillaume Klein
e94711bb5c
Add property WhisperModel.supported_languages ( #476 )
...
* Expose function supported_languages
* Make it a method
2023-09-14 17:42:02 +02:00
Guillaume Klein
0048844f54
Expose function available_models ( #475 )
...
* Expose function available_models
* Add test case
2023-09-14 17:17:01 +02:00
Guillaume Klein
a49097e655
Add some missing typing annotations in transcribe.py
2023-09-12 15:45:54 +02:00
Guillaume Klein
81086f6d33
Always run the encoder at the beginning of the loop ( #468 )
2023-09-12 14:44:37 +02:00
Guillaume Klein
727ab81f31
Improve error message for invalid task and language parameters ( #466 )
2023-09-12 10:02:23 +02:00
Guillaume Klein
ad388cd394
Bump version to 0.8.0
2023-09-04 11:56:48 +02:00
Guillaume Klein
4a41746e55
Log a warning when the model is English-only but the language is set to something else ( #454 )
2023-09-04 11:55:40 +02:00
Guillaume Klein
1e6eb967c9
Add "large" alias for "large-v2" model ( #453 )
2023-09-04 11:54:42 +02:00
Guillaume Klein
f0ff12965a
Expose generation parameter no_repeat_ngram_size ( #449 )
2023-09-01 17:31:30 +02:00
Guillaume Klein
5871858a5f
Force the garbage collector to run after decoding the audio with PyAV ( #448 )
2023-09-01 15:25:13 +02:00
MinorJinx
e87fbf8a49
Added audio duration after VAD to TranscriptionInfo object ( #445 )
...
* Added VAD-removed audio duration to TranscriptionInfo object
Along with the duration of the original audio, this commit adds the seconds of audio removed by the VAD to the returned info object.
* Changing naming for duration_after_vad
Instead of returning the audio duration removed, the property now returns the final duration after the VAD.
If vad_filter is False or if it doesn't remove any audio, the original duration is returned.
2023-08-31 17:19:48 +02:00
Aisu Wata
1562b02345
added repetition_penalty to TranscriptionOptions ( #403 )
...
Co-authored-by: Aisu Wata <aisu.wata0@gmail.com >
2023-08-06 10:08:24 +02:00
Purfview
1ce16652ee
Adds DEBUG log message for prompt_reset_on_temperature ( #399 )
...
Produce DEBUG log message if prompt_reset_on_temperature threshold is met.
2023-08-04 09:06:17 +02:00
Purfview
857be6f621
Rename clear_previous_text_on_temperature argument ( #398 )
...
The name `prompt_reset_on_temperature` makes it clearer what the argument does.
2023-08-03 18:44:37 +02:00
KH
1a1eb1a027
Add clear_previous_text_on_temperature parameter ( #397 )
...
* Add clear_previous_text_on_temperature parameter
* Add a description
2023-08-03 15:40:58 +02:00
Guillaume Klein
5c17de1771
Bump version to 0.7.1
2023-07-24 11:10:12 +02:00
Guillaume Klein
0f55c436fe
Invalidate the cached encoder output when no_speech threshold is met ( #376 )
2023-07-24 10:57:15 +02:00
KH
e786e26f75
Return result with best log prob when all temperature fallbacks failed ( #356 )
...
* Resolve Inference Selection Bug
* Refactor for better readability
* Filter out results with compression_ratio
* Refactor to avoid variable repetition
* Fix incorrect index and perform minor refactoring
* Remove final_temperature variable
2023-07-20 16:13:11 +02:00
KH
687db319e0
Remove duplicate code ( #359 )
2023-07-18 16:03:01 +02:00
Guillaume Klein
171d90dd1f
Bump version to 0.7.0
2023-07-18 15:23:47 +02:00
Guillaume Klein
0e051a5b77
Prepend prefix tokens with the initial timestamp token ( #358 )
2023-07-18 15:22:39 +02:00
Hoon
3b4a6aa1c2
Improve timestamp heuristics ( #336 )
...
* Improve timestamp heuristics
* Chore
2023-07-05 15:16:53 +02:00
zh-plus
c7cb2aa8d4
Add support for using whisper models from Huggingface by specifying the model id. ( #334 )
...
* Add support for downloading CTranslate2-converted models from Huggingface.
* Update utils.py to pass Flake8.
* Update utils.py to pass black.
* Remove redundant usage instructions.
* Apply suggestions from code review
Co-authored-by: Guillaume Klein <guillaumekln@users.noreply.github.com >
---------
Co-authored-by: Guillaume Klein <guillaumekln@users.noreply.github.com >
2023-07-03 17:40:10 +02:00
Guillaume Klein
c0d93d0829
Avoid computing higher temperatures on no_speech segments ( #225 )
...
Port commit e334ff141d
2023-07-03 10:20:36 +02:00
Guillaume Klein
19c294f978
Squash long words at window and sentence boundaries ( #226 )
...
Port commit 255887f219
2023-07-03 10:20:20 +02:00
FlippFuzz
fee52c9229
Allow users to input an Iterable of token ids into initial_prompt ( #306 )
...
* Allow users to input an Iterable of token ids into initial_prompt
* Check for a string first, because a string is also an Iterable
2023-06-21 14:46:20 +02:00