225 Commits

Author SHA1 Message Date
Guillaume Klein
a150adcc19 Enable onnxruntime dependency for Python 3.11 (#260) 2023-05-24 16:07:54 +02:00
Guillaume Klein
ae1e6d9883 Remove reference to the VAD function from the README 2023-05-24 15:56:21 +02:00
Guillaume Klein
cf7c021573 Export __version__ at the module level (#258) 2023-05-24 15:50:37 +02:00
Guillaume Klein
4db549b800 Make get_speech_timestamps backward compatible with the previous usage (#259) 2023-05-24 15:49:36 +02:00
Guillaume Klein
c99feb22dc Include requirements files in sdist (#240) 2023-05-24 12:55:15 +02:00
Guillaume Klein
723cb97483 Fix occasional IndexError on empty segments (#227) 2023-05-24 12:55:04 +02:00
Guillaume Klein
6a2da9a95c Also catch client-side network exceptions when synchronizing models (#228) 2023-05-11 15:07:15 +02:00
Guillaume Klein
6a1d331d66 Add CONTRIBUTING.md (#229) 2023-05-11 15:06:46 +02:00
Guillaume Klein
2d7c984bfc Reformat function download_model for clarity 2023-05-11 14:47:22 +02:00
Guillaume Klein
8e5c747ab5 Reformat list of community integrations 2023-05-11 12:15:41 +02:00
Purfview
32b962bed8 Adds: whisper-standalone-win (#216) 2023-05-09 20:20:41 +02:00
David Axelrod
53d247b0bb retry model download locally if huggingface throws an http error. (#215)
* retry model download locally if huggingface throws an http error.

* appease the linter

* key error fix

* use non internal lib error

---------

Co-authored-by: Guillaume Klein <guillaumekln@users.noreply.github.com>
2023-05-09 17:20:22 +02:00
Ozan Caglayan
91f948b0d6 transcribe: return all language probabilities if requested (#210)
* transcribe: return all language probabilities if requested

If return_all_language_probs is True, TranscriptionInfo structure
will have a list of tuples reflecting all language probabilities
as returned by the model.

* transcribe: fix docstring

* transcribe: remove return_all_lang_probs parameter
2023-05-09 14:53:47 +02:00
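
The list of `(language, probability)` tuples described in this commit can be consumed roughly as in the sketch below. The sample data and the `top_languages` helper are hypothetical and exist only for illustration. The commit message does not name the field on `TranscriptionInfo` that holds the list, so the sketch works on a plain Python list instead.

```python
from typing import List, Tuple

# Hypothetical sample shaped like the list of tuples described above;
# these values are made up and were not produced by any model.
language_probs: List[Tuple[str, float]] = [
    ("en", 0.91),
    ("de", 0.05),
    ("fr", 0.03),
    ("nl", 0.01),
]


def top_languages(probs: List[Tuple[str, float]], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k most probable (language, probability) pairs."""
    return sorted(probs, key=lambda pair: pair[1], reverse=True)[:k]


best_language, best_prob = top_languages(language_probs, k=1)[0]
print(f"Most likely language: {best_language} ({best_prob:.2f})")
```
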
FlippFuzz
5d8f3e2d90 Implement VadOptions (#198)
* Implement VadOptions

* Fix line too long

./faster_whisper/transcribe.py:226:101: E501 line too long (111 > 100 characters)

* Reformatted files with black

* black .\faster_whisper\vad.py
* black .\faster_whisper\transcribe.py

* Fix import order with isort

* isort .\faster_whisper\vad.py
* isort .\faster_whisper\transcribe.py

* Made recommended changes

Recommended in https://github.com/guillaumekln/faster-whisper/pull/198

* Fix typing of vad_options argument

---------

Co-authored-by: Guillaume Klein <guillaumekln@users.noreply.github.com>
2023-05-09 12:47:02 +02:00
Mahmoud Ashraf
d889345e07 added whisper-diarize (#193) 2023-04-28 10:56:13 +02:00
Jordi Mas
5d203d2757 Update Github link to community project (#187) 2023-04-27 14:53:28 +02:00
Guillaume Klein
a3dcb90081 Bump version to 0.5.1 2023-04-26 17:38:16 +02:00
Guillaume Klein
89a4c7f1f0 Update docstring to clarify download_root and output_dir 2023-04-26 17:37:51 +02:00
Guillaume Klein
6f9d68dd6b Fix typing of local_files_only 2023-04-26 17:36:24 +02:00
Jordi Mas
68df3214ba Use cache_dir instead of local_dir (#182)
* Use cache_dir instead of local_dir

* Fix unit test

* Use cache_dir and preserve local_dir parameter

* Remove blank line at the end

* Disable ut

* Implement download_root suggestion

* Use cache_dir=download_root
2023-04-26 16:35:18 +02:00
Guillaume Klein
67cce3f552 Bump version to 0.5.0 2023-04-25 17:00:41 +02:00
Guillaume Klein
8340e04dc6 Assign words to the speech chunk with the greatest coverage (#180) 2023-04-25 15:54:31 +02:00
Guillaume Klein
8cf5d5a4b3 Increase the default value of speech_pad_ms to 400 ms (#179) 2023-04-25 15:54:22 +02:00
Guillaume Klein
32dc625f11 Update README.md 2023-04-25 15:47:38 +02:00
Guillaume Klein
e06511f96b Rename AudioInfo to TranscriptionInfo (#174) 2023-04-24 16:29:17 +02:00
Anthony
338a725ff8 fix where the tokens are reset (#175) 2023-04-24 16:28:47 +02:00
Amar Sood
f893113759 Align segment structure with openai/whisper (#154)
* Align segment structure with openai/whisper

* Update code to apply requested changes

* Move increment below the segment filtering

---------

Co-authored-by: Guillaume Klein <guillaumekln@users.noreply.github.com>
2023-04-24 15:04:42 +02:00
FlippFuzz
2b51a97e61 Add transcription_options to AudioInfo (#170)
* Add transcription_options to AudioInfo

It would be great if we can include the transcription_options in AudioInfo.

My application is only making a few changes but leaving the rest as default.
However, I would like to record all settings (including those that I did not specify) so that the audio can be transcribed again identically in the future if need be.

* Make TranscriptionOptions appear before AudioInfo

* Remove unnecessary whitespace
2023-04-24 15:02:19 +02:00
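
The use case in this commit, recording every setting so that a transcription can be repeated later, could look like the sketch below. `ExampleOptions` is a hypothetical stand-in with made-up fields, not the library's `TranscriptionOptions`. The sketch only assumes that the options object is a named tuple exposing `_asdict()`.

```python
import json
from typing import NamedTuple, Optional


class ExampleOptions(NamedTuple):
    """Hypothetical stand-in for a transcription options object."""

    beam_size: int = 5
    temperature: float = 0.0
    initial_prompt: Optional[str] = None


def dump_options(options: NamedTuple) -> str:
    """Serialize every option, including untouched defaults, to JSON."""
    return json.dumps(options._asdict(), sort_keys=True)


def load_options(payload: str) -> ExampleOptions:
    """Rebuild the options object from its JSON form."""
    return ExampleOptions(**json.loads(payload))


# Only one setting is changed; the defaults are still recorded.
options = ExampleOptions(beam_size=1)
saved = dump_options(options)
restored = load_options(saved)
```

Storing the full dictionary rather than only the overridden values keeps the record valid even if the defaults change in a later release.
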
Jordi Mas
358d373691 Allow specifying local_files_only to prevent checking the Internet every time (#166) 2023-04-20 14:26:06 +02:00
9a646b69e6 format code 2023-04-20 02:00:57 +08:00
49af9564ab Ignore repeated prompt 2023-04-20 01:49:10 +08:00
Guillaume Klein
3adcc12d0f Clarify that the returned segments value is a generator (#144)
* Clarify that the returned segments value is a generator

* Update README.md
2023-04-13 09:50:53 +02:00
Ewald Enzinger
2b53dee6b6 Expose download location in WhisperModel constructor (#126)
This increases compatibility with OpenAI Whisper's whisper.load_model() and is useful for downstream integrations
2023-04-08 10:02:36 +02:00
Bekir Bakar
06d24056e9 Configure ignore for more files. (#122) 2023-04-06 19:13:09 +02:00
Guillaume Klein
e9a082dcf2 Keep segment timestamps aligned with words timestamps after VAD (#119) 2023-04-06 11:54:40 +02:00
Guillaume Klein
051b3350e5 Add some info and debug logs (#113) 2023-04-05 16:57:59 +02:00
Guillaume Klein
746f2698db Bump version to 0.4.1 2023-04-04 12:16:23 +02:00
Guillaume Klein
a5d03e55fa Prevent out of range error in method split_tokens_on_unicode (#111) 2023-04-04 10:51:14 +02:00
Guillaume Klein
9fa1989073 Revert "Prevent out of range error in method split_tokens_on_unicode"
This reverts commit 36160c1e7e.
2023-04-04 10:25:41 +02:00
Guillaume Klein
36160c1e7e Prevent out of range error in method split_tokens_on_unicode 2023-04-04 10:17:56 +02:00
Guillaume Klein
2f266eb844 Fix VAD index error when a predicted timestamp is too large (#107) 2023-04-03 19:34:54 +02:00
Guillaume Klein
8c36ac1be8 Bump version to 0.4.0 2023-04-03 17:24:49 +02:00
Guillaume Klein
19698c95f8 Support VAD filter (#95)
* Support VAD filter

* Generalize function collect_samples

* Define AudioSegment class

* Only pass prompt and prefix to the first chunk

* Add dict argument vad_parameters

* Fix isort format

* Rename method

* Update README

* Add shortcut when the chunk offset is 0

* Reword readme

* Fix end property

* Concatenate the speech chunks

* Cleanup diff

* Increase default speech pad

* Update README

* Increase default speech pad
2023-04-03 17:22:48 +02:00
palladium123
b4c1c57781 Added retrieval mechanism (avg_log_prob/no_speech_prob) (#103)
* Added retrieval mechanism

Added retrieval mechanism to retrieve avg_log_prob and no_speech_prob from the Transcribe method.

* Update transcribe.py

* Update transcribe.py

* Initial commit
2023-04-03 16:56:35 +02:00
Guillaume Klein
f20bb258de Support separating the left and right audio channels (#97) 2023-04-03 11:22:43 +02:00
Guillaume Klein
1a968a4323 Pass prefix only to the first window 2023-04-01 09:27:20 +02:00
Guillaume Klein
def70d8496 Update headings in the Usage section 2023-03-31 18:54:55 +02:00
mayeaux
7301df7f8b Update README.md (#101)
* Update README.md

* Update README.md

* Update README.md

---------

Co-authored-by: Guillaume Klein <guillaumekln@users.noreply.github.com>
2023-03-31 17:06:44 +02:00
Guillaume Klein
d03383f902 Simplify reuse of the encoder output 2023-03-30 15:58:27 +02:00
Guillaume Klein
39fddba886 Suppress some special tokens when the default set is not used 2023-03-30 12:42:29 +02:00