faster-whisper

heimoshuiyu/faster-whisper

a150adcc19 Enable onnxruntime dependency for Python 3.11 (#260) Guillaume Klein 2023-05-24 16:07:54 +02:00
ae1e6d9883 Remove reference to the VAD function from the README Guillaume Klein 2023-05-24 15:56:03 +02:00
cf7c021573 Export __version__ at the module level (#258) Guillaume Klein 2023-05-24 15:50:37 +02:00
4db549b800 Make get_speech_timestamps backward compatible with the previous usage (#259) Guillaume Klein 2023-05-24 15:49:36 +02:00
c99feb22dc Include requirements files in sdist (#240) Guillaume Klein 2023-05-24 12:55:15 +02:00
723cb97483 Fix occasional IndexError on empty segments (#227) Guillaume Klein 2023-05-24 12:55:04 +02:00
6a2da9a95c Also catch client-side network exceptions when synchronizing models (#228) Guillaume Klein 2023-05-11 15:07:15 +02:00
6a1d331d66 Add CONTRIBUTING.md (#229) Guillaume Klein 2023-05-11 15:06:46 +02:00
2d7c984bfc Reformat function download_model for clarity Guillaume Klein 2023-05-11 14:47:22 +02:00
8e5c747ab5 Reformat list of community integrations Guillaume Klein 2023-05-11 12:15:41 +02:00
32b962bed8 Adds: whisper-standalone-win (#216) Purfview 2023-05-09 19:20:41 +01:00
53d247b0bb retry model download locally if huggingface throws an http error. (#215) David Axelrod 2023-05-09 11:20:22 -04:00
91f948b0d6 transcribe: return all language probabilities if requested (#210) Ozan Caglayan 2023-05-09 13:53:47 +01:00
5d8f3e2d90 Implement VadOptions (#198) FlippFuzz 2023-05-09 18:47:02 +08:00
d889345e07 added whisper-diarize (#193) Mahmoud Ashraf 2023-04-28 10:56:13 +02:00
5d203d2757 Update Github link to community project (#187) Jordi Mas 2023-04-27 14:53:28 +02:00
a3dcb90081 Bump version to 0.5.1 Guillaume Klein 2023-04-26 17:38:16 +02:00
89a4c7f1f0 Update docstring to clarify download_root and output_dir Guillaume Klein 2023-04-26 17:37:51 +02:00
6f9d68dd6b Fix typing of local_files_only Guillaume Klein 2023-04-26 17:36:24 +02:00
68df3214ba Use cache_dir instead of local_dir (#182) Jordi Mas 2023-04-26 16:35:18 +02:00
67cce3f552 Bump version to 0.5.0 Guillaume Klein 2023-04-25 17:00:41 +02:00
8340e04dc6 Assign words to the speech chunk with the greatest coverage (#180) Guillaume Klein 2023-04-25 15:54:31 +02:00
8cf5d5a4b3 Increase the default value of speech_pad_ms to 400 ms (#179) Guillaume Klein 2023-04-25 15:54:22 +02:00
32dc625f11 Update README.md Guillaume Klein 2023-04-25 15:47:38 +02:00
e06511f96b Rename AudioInfo to TranscriptionInfo (#174) Guillaume Klein 2023-04-24 16:29:17 +02:00
338a725ff8 fix where the tokens are reset (#175) Anthony 2023-04-24 16:28:47 +02:00
f893113759 Align segment structure with openai/whisper (#154) Amar Sood 2023-04-24 09:04:42 -04:00
2b51a97e61 Add transcription_options to AudioInfo (#170) FlippFuzz 2023-04-24 21:02:19 +08:00
358d373691 Allow specifying local_files_only to prevent checking the Internet everytime (#166) Jordi Mas 2023-04-20 14:26:06 +02:00
9a646b69e6 format code heimoshuiyu 2023-04-20 02:00:57 +08:00
49af9564ab Ignore repeated prompt heimoshuiyu 2023-04-20 01:49:10 +08:00
3adcc12d0f Clarify that the returned segments value is a generator (#144) Guillaume Klein 2023-04-13 09:50:53 +02:00
2b53dee6b6 Expose download location in WhisperModel constructor (#126) Ewald Enzinger 2023-04-08 10:02:36 +02:00
06d24056e9 Configure ignore for more files. (#122) Bekir Bakar 2023-04-06 20:13:09 +03:00
e9a082dcf2 Keep segment timestamps aligned with words timestamps after VAD (#119) Guillaume Klein 2023-04-06 11:54:40 +02:00
051b3350e5 Add some info and debug logs (#113) Guillaume Klein 2023-04-05 16:57:59 +02:00
746f2698db Bump version to 0.4.1 Guillaume Klein 2023-04-04 12:16:23 +02:00
a5d03e55fa Prevent out of range error in method split_tokens_on_unicode (#111) Guillaume Klein 2023-04-04 10:51:14 +02:00
9fa1989073 Revert "Prevent out of range error in method split_tokens_on_unicode" Guillaume Klein 2023-04-04 10:25:41 +02:00
36160c1e7e Prevent out of range error in method split_tokens_on_unicode Guillaume Klein 2023-04-04 10:17:56 +02:00
2f266eb844 Fix VAD index error when a predicted timestamps is too large (#107) Guillaume Klein 2023-04-03 19:34:54 +02:00
8c36ac1be8 Bump version to 0.4.0 Guillaume Klein 2023-04-03 17:24:49 +02:00
19698c95f8 Support VAD filter (#95) Guillaume Klein 2023-04-03 17:22:48 +02:00
b4c1c57781 Added retrieval mechanism (avg_log_prob/no_speech_prob) (#103) palladium123 2023-04-03 22:56:35 +08:00
f20bb258de Support separating the left and right audio channels (#97) Guillaume Klein 2023-04-03 11:22:43 +02:00
1a968a4323 Pass prefix only to the first window Guillaume Klein 2023-04-01 09:26:42 +02:00
def70d8496 Update headings in the Usage section Guillaume Klein 2023-03-31 18:54:55 +02:00
7301df7f8b Update README.md (#101) mayeaux 2023-03-31 17:06:44 +02:00
d03383f902 Simplify reuse of the encoder output Guillaume Klein 2023-03-30 15:58:27 +02:00
39fddba886 Suppress some special tokens when the default set is not used Guillaume Klein 2023-03-30 12:42:29 +02:00
eda840f8ff Always disable the progress bar specific to snapshot_download Guillaume Klein 2023-03-29 12:11:24 +02:00
0224400584 Add large-v1 model Guillaume Klein 2023-03-28 14:36:10 +02:00
8246479fda Ignore the invalid audio frames (#82) Guillaume Klein 2023-03-27 10:19:22 +02:00
e2705d11c9 Raise an explicit error message if the model size is invalid Guillaume Klein 2023-03-26 16:29:11 +02:00
f8d2fb169f Fix variable name reference (#77) Jordi Mas 2023-03-25 10:00:59 +01:00
a10732c74a Only download the required model files Guillaume Klein 2023-03-24 17:59:11 +01:00
7808eddf06 Bump version to 0.3.0 Guillaume Klein 2023-03-24 10:56:42 +01:00
de7682a2f0 Automatically download converted models from the Hugging Face Hub (#70) Guillaume Klein 2023-03-24 10:55:55 +01:00
523ae2180f Run the encoder only once for each 30-second window (#73) Guillaume Klein 2023-03-24 10:53:49 +01:00
2b7be47041 Update README.md Guillaume Klein 2023-03-24 09:15:05 +01:00
3f02c53610 Add .gitignore file Guillaume Klein 2023-03-23 20:52:46 +01:00
e663186a4b Add some badges at the top of the README Guillaume Klein 2023-03-23 20:33:19 +01:00
e44a8c7ba0 Update the README following the PyPI release Guillaume Klein 2023-03-22 21:07:27 +01:00
33f41d84e3 Add job to push a package for each new Git tag Guillaume Klein 2023-03-22 21:01:53 +01:00
c910ec0293 Bump version to 0.2.0 Guillaume Klein 2023-03-22 20:54:07 +01:00
e9dfe23eaa Complete the package metadata Guillaume Klein 2023-03-22 20:53:51 +01:00
66efd02bd0 Run some automatic tests with GitHub Actions (#68) Guillaume Klein 2023-03-22 20:50:03 +01:00
52264f2277 Fix typing for device_index argument Guillaume Klein 2023-03-22 13:51:12 +01:00
c27c010f96 Ignore Unicode errors in input file metadata Guillaume Klein 2023-03-21 17:13:37 +01:00
0ab8db2b37 Remove debug prints Guillaume Klein 2023-03-18 09:48:02 +01:00
a70aac18ae Remove unused import Guillaume Klein 2023-03-18 09:47:02 +01:00
d82be59d5f Fix unset attribute when using English-only models Guillaume Klein 2023-03-17 18:33:16 +01:00
58f4447964 Update benchmark results with latest openai/whisper and faster-whisper Guillaume Klein 2023-03-17 16:44:07 +01:00
cce6b53e45 Fix incorrect attribute access Guillaume Klein 2023-03-16 10:32:36 +01:00
2007adf0b5 Fix typing of words attribute Guillaume Klein 2023-03-15 17:49:07 +01:00
ae9898f0d8 Include duration in AudioInfo structure Guillaume Klein 2023-03-15 15:30:29 +01:00
c5f6b91b7d Port utility function format_timestamp Guillaume Klein 2023-03-15 15:27:20 +01:00
eafb2c79a3 Add more typing annotations Guillaume Klein 2023-03-15 15:22:53 +01:00
8bd013ea99 Add word-level timestamps (#43) Guillaume Klein 2023-03-15 15:02:28 +01:00
b41fd05948 Update python_requires to >=3.8 Guillaume Klein 2023-03-10 11:15:58 +01:00
3301dd9273 Make get_input a free function Guillaume Klein 2023-03-09 12:54:41 +01:00
c52adaca90 Create a helper class Tokenizer Guillaume Klein 2023-03-09 12:53:49 +01:00
f0a21ea916 Use a dict to represent intermediate segments Guillaume Klein 2023-03-09 11:53:55 +01:00
6a84df400f Fix all_tokens handling Guillaume Klein 2023-03-09 10:02:25 +01:00
4176da0d68 Rename offset to seek to match the OpenAI implementation Guillaume Klein 2023-03-09 09:58:58 +01:00
6b16b8a69c Pad the audio instead of the spectrogram Guillaume Klein 2023-03-08 10:50:46 +01:00
2646906596 Fix error in decode_audio for long audio inputs Guillaume Klein 2023-03-07 10:15:36 +01:00
01ef12a6a0 Do not ignore last segment ending with one timestamp Guillaume Klein 2023-03-07 10:05:04 +01:00
469244a57d Update CTranslate2 to 3.8.0 Guillaume Klein 2023-03-06 16:21:48 +01:00
4a18adc382 Load the tokenizer from the model directory if it exists Guillaume Klein 2023-03-01 15:47:16 +01:00
873992623c Accept the audio waveform as an input to transcribe() (#21) Guillaume Klein 2023-02-28 19:01:31 +01:00
ed32002aea Add instructions to install without git clone Guillaume Klein 2023-02-27 12:21:54 +01:00
a4f1cc8f11 Add prefix parameter Guillaume Klein 2023-02-27 12:09:40 +01:00
528aa3e784 Make threshold parameters optional Guillaume Klein 2023-02-27 11:32:03 +01:00
f0add58bdc Add typing to constructor and transcribe method Guillaume Klein 2023-02-27 11:22:02 +01:00
b1c69927f8 Update code snippet to be consistent with the conversion example Guillaume Klein 2023-02-24 15:52:23 +01:00
ef71be09ed Update CTranslate2 to 3.7.0 Guillaume Klein 2023-02-23 11:18:58 +01:00
f5c0e44935 Update README.md Guillaume Klein 2023-02-22 14:59:29 +01:00
d91365e321 Minor code simplification Guillaume Klein 2023-02-22 11:02:11 +01:00
4b8237da1b Strip the leading space before computing the compression ratio Guillaume Klein 2023-02-22 10:28:04 +01:00
