Commit Graph

224 Commits

Author SHA1 Message Date
6e42088656 Merge remote-tracking branch 'upstream/master' into prompt 2024-09-04 17:48:06 +08:00
Mahmoud Ashraf
d57c5b40b0 Remove the usage of transformers.pipeline from BatchedInferencePipeline and fix word timestamps for batched inference (#921)
* fix word timestamps for batched inference

* remove hf pipeline
2024-07-27 09:02:58 +07:00
zh-plus
83a368e98a Make vad-related parameters configurable for batched inference. (#923) 2024-07-24 09:00:32 +07:00
Jilt Sebastian
eb8390233c New PR for Faster Whisper: Batching Support, Speed Boosts, and Quality Enhancements (#856)
Batching Support, Speed Boosts, and Quality Enhancements

---------

Co-authored-by: Hargun Mujral <83234565+hargunmujral@users.noreply.github.com>
Co-authored-by: MahmoudAshraf97 <hassouna97.ma@gmail.com>
2024-07-18 16:48:52 +07:00
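PR #856 above introduces BatchedInferencePipeline, which wraps a WhisperModel and transcribes VAD-segmented chunks in batches. A minimal usage sketch, assuming the API shape of recent faster-whisper releases (the constructor and the batch_size / vad_parameters keywords may have looked different at the time of this PR):

```python
from faster_whisper import BatchedInferencePipeline, WhisperModel

# Load the base model once, then wrap it for batched inference.
model = WhisperModel("large-v3", device="cuda", compute_type="float16")
batched_model = BatchedInferencePipeline(model=model)

# VAD-related parameters became configurable for batched inference in #923 (listed above).
segments, info = batched_model.transcribe(
    "audio.wav",
    batch_size=16,
    vad_parameters=dict(min_silence_duration_ms=500),
)
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```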
4a59bb011d Merge remote-tracking branch 'upstream/master' into prompt 2024-07-10 10:16:35 +08:00
trungkienbkhn
fbcf58bf98 Fix language detection with non-speech audio (#895) 2024-07-05 14:43:45 +07:00
Jordi Mas
1195359984 Filter out non_speech_tokens in suppressed tokens (#898)
* Filter out non_speech_tokens in suppressed tokens
2024-07-05 14:43:11 +07:00
trungkienbkhn
c22db5125d Bump version to 1.0.3 (#887) 2024-07-01 16:36:12 +07:00
ABen
8862bee1f8 Improve language detection when using clip_timestamps (#867) 2024-07-01 16:12:45 +07:00
Ki Hoon Kim
8d400e9870 Upgrade to Silero-Vad V5 (#884)
* Fix window_size_samples to 512

* Update SileroVADModel

* Replace ONNX file with V5 version
2024-07-01 15:40:37 +07:00
Fedir Zadniprovskyi
bced5f04c0 docs: add 'faster-whisper-server' community integration (#861)
Co-authored-by: Fedir Zadniprovskyi <github.g1k56@simplelogin.com>
2024-06-05 22:27:41 +07:00
Fedir Zadniprovskyi
65551c081f Docker file improvements (#848)
Docker file improvements

Co-authored-by: Fedir Zadniprovskyi <github.g1k56@simplelogin.com>
2024-05-20 09:13:19 +07:00
Napuh
f53be1e811 Add distil models to WhisperModel init and download_model docstrings (#847)
* chore: add distil models to WhisperModel init docstring and download_model docstring
2024-05-20 08:51:22 +07:00
Natanael Tan
4acdb5c619 Fix #839 incorrect clip_timestamps being used in model (#842)
* Fix #839

Changed the code so that it updates the options object instead of the TranscriptionOptions class, which was likely the cause of the unexpected behaviour
2024-05-17 16:35:07 +07:00
Peter Krantz
a1c3583c96 Update README.md (#841)
Spelling correction for copy/pasters
2024-05-17 15:24:47 +07:00
trungkienbkhn
2036d12634 Add Dockerfile example (#828) 2024-05-13 16:33:09 +07:00
trungkienbkhn
2f6913efc8 Bump version to 1.0.2 (#816) 2024-05-06 09:02:54 +07:00
ddorian
e11d58599d Allow av to include version 12. (#819) 2024-05-06 08:57:35 +07:00
Keating Reid
49a80eb8a8 Clarify documentation for hotwords (#817)
* Clarify documentation for hotwords

* Remove redundant type specifications
2024-05-06 08:52:59 +07:00
trungkienbkhn
8d5e6d56d9 Support initializing more whisper model args (#807) 2024-05-04 15:12:59 +07:00
trungkienbkhn
6eec07739e Add benchmarking logic for memory, WER and speed (#773) 2024-05-04 15:12:43 +07:00
jax
847fec4492 Feature/add hotwords (#731)
* add hotword params

---------

Co-authored-by: jax <jax_builder@gamil.com>
2024-05-04 15:11:52 +07:00
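The hotwords feature from #731 above lets callers pass hint phrases that bias decoding toward expected terms. A minimal sketch, assuming the hotwords keyword of WhisperModel.transcribe as it exists in current releases:

```python
from faster_whisper import WhisperModel

model = WhisperModel("small", device="cpu", compute_type="int8")

# Hotwords are hint phrases fed to the decoder as part of the prompt.
segments, info = model.transcribe(
    "meeting.wav",
    hotwords="CTranslate2 faster-whisper",
)
print(" ".join(segment.text.strip() for segment in segments))
```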
Keating Reid
46080e584e Loosening tokenizers version constraint (#804) 2024-05-04 15:10:24 +07:00
Sidharth Rajaram
3d1de60ef3 CUDA version and updated installation instructions (#785)
* CUDA version note and updated instructions in README

* ctranslate2 downgrade note, cuDNN v9 consideration

* clearer note on cuDNN v9 package
2024-05-04 15:09:59 +07:00
4ee1d54c14 Merge branch 'master' into prompt 2024-04-08 20:56:49 +08:00
otakutyrant
91c8307aa6 Make faster_whisper.assets a valid Python package to distribute (#772) (#774) 2024-04-02 18:22:22 +02:00
Purfview
b024972a56 Foolproof: Disable VAD if clip_timestamps is in use (#769)
* Foolproof: Disable VAD if clip_timestamps is in use

Prevent silly things from happening.
2024-04-02 18:20:34 +02:00
Purfview
8ae82c8372 Bugfix: code breaks if audio is empty (#768)
* Bugfix: code breaks if audio is empty

Regression since https://github.com/SYSTRAN/faster-whisper/pull/732 PR
2024-04-02 18:18:12 +02:00
trungkienbkhn
e0c3a9ed34 Update project github link to SYSTRAN (#746) 2024-03-27 08:31:17 +01:00
Sanchit Gandhi
a67e0e47ae Add support for distil-large-v3 (#755)
* add distil-large-v3

* Update README.md

* use fp16 weights from Systran
2024-03-26 14:58:39 +01:00
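For the distil-large-v3 support added in #755 above, a sketch along the lines of the README usage for distil models (English-only decoding with condition_on_previous_text disabled, as the README recommends):

```python
from faster_whisper import WhisperModel

model = WhisperModel("distil-large-v3", device="cuda", compute_type="float16")

# Distil models are English-only; disabling condition_on_previous_text
# follows the README's recommendation for them.
segments, info = model.transcribe(
    "audio.mp3",
    beam_size=5,
    language="en",
    condition_on_previous_text=False,
)
```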
trungkienbkhn
1eb9a8004c Improve language detection (#732) 2024-03-12 15:44:49 +01:00
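Commit #732 above reworks language detection; in the current transcribe signature the related knobs appear as language_detection_segments and language_detection_threshold (these names are taken from today's API and may postdate this particular commit):

```python
from faster_whisper import WhisperModel

model = WhisperModel("base", device="cpu", compute_type="int8")

# Consider several 30-second windows for language detection instead of only
# the first one, and require the detected probability to clear a threshold.
segments, info = model.transcribe(
    "multilingual.wav",
    language_detection_segments=4,
    language_detection_threshold=0.7,
)
print(info.language, info.language_probability)
```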
e50d82c18c Merge remote-tracking branch 'upstream/master' into prompt 2024-03-10 11:53:58 +08:00
trungkienbkhn
a342b028b7 Bump version to 1.0.1 (#725) 2024-03-01 11:32:12 +01:00
Purfview
5090cc9d0d Fix window end heuristic for hallucination_silence_threshold (#706)
Removes the wishful heuristic causing more issues than it's fixing.

Same as https://github.com/openai/whisper/pull/2043

Example of the issue: https://github.com/openai/whisper/pull/1838#issuecomment-1960041500
2024-02-29 17:59:32 +01:00
Gabriel F
09cd57e7f3 Fix typo 'ditil' (#721) 2024-02-29 17:08:58 +01:00
trungkienbkhn
16141e65d9 Add pad_or_trim function to handle segments before encoding (#705) 2024-02-29 17:08:28 +01:00
4b64ef1f70 Merge branch 'master' into prompt 2024-02-23 10:52:53 +08:00
trungkienbkhn
06d32bf0c1 Bump version to 1.0.0 (#696) 2024-02-22 09:49:01 +01:00
Purfview
30d6043e90 Prevent infinite loop for out-of-bound timestamps in clip_timestamps (#697)
Same as https://github.com/openai/whisper/pull/2005
2024-02-22 09:48:35 +01:00
BBC-Esq
22c75d0cc3 Update README.md (#672)
Add Faster-Whisper-Transcriber to community integrations.
2024-02-21 10:18:11 +01:00
trungkienbkhn
092067208b Add clip_timestamps and hallucination_silence_threshold options (#646) 2024-02-20 17:34:54 +01:00
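The two options added in #646 above mirror the corresponding openai/whisper features. A minimal sketch, assuming the current parameter shapes (clip_timestamps as a comma-separated seconds string or a list of floats, the hallucination threshold in seconds and used together with word timestamps):

```python
from faster_whisper import WhisperModel

model = WhisperModel("medium", device="cpu", compute_type="int8")

# Transcribe only the 0-30s and 60-90s clips, and treat long silent gaps
# as a hint that a segment is likely a hallucination.
segments, info = model.transcribe(
    "podcast.wav",
    clip_timestamps="0,30,60,90",
    word_timestamps=True,
    hallucination_silence_threshold=2.0,
)
```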
Jordi Mas
6ffcbdfbc2 Fix typos in README.md (#668) 2024-02-20 17:33:17 +01:00
Purfview
52695567c9 Bumps up PyAV version to support Python 3.12.x (#679) 2024-02-20 17:31:07 +01:00
IlianP
c6b28ed3a0 Update README.md (#685)
I'm surprised that WhisperX hasn't made it into this list yet, as it has more stars than faster-whisper itself 🚀
2024-02-20 17:28:00 +01:00
trungkienbkhn
4ab646035f Upgrade ctranslate2 version to support CUDA 12 (#694) 2024-02-20 17:26:55 +01:00
d04e685ca2 Merge branch 'master' into prompt 2024-02-19 17:31:58 +08:00
Purfview
f144e4c83d Expands the note for distil-whisper (#659) 2024-01-28 21:48:40 +01:00
Purfview
3aec421849 Add: More clarity of what "max_new_tokens" does (#658)
* Add: More clarity of what "max_new_tokens" does
2024-01-28 21:40:33 +01:00
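The docstring clarified in #658 above concerns max_new_tokens, which caps how many tokens the model may generate per 30-second window. A minimal sketch, assuming the current transcribe signature:

```python
from faster_whisper import WhisperModel

model = WhisperModel("base", device="cpu", compute_type="int8")

# Cap generation at 128 new tokens per 30-second window; the prompt tokens
# still count toward the model's overall context length.
segments, info = model.transcribe(
    "audio.wav",
    initial_prompt="Glossary: CTranslate2, faster-whisper.",
    max_new_tokens=128,
)
```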
Dominik Macháček
64b9f244bd Whisper-Streaming mention (#656)
under community integrations
2024-01-25 18:27:27 +01:00
Purfview
00efce1696 Bugfix: Illogical "Avoid computing higher temperatures on no_speech" (#652) 2024-01-24 11:54:43 +01:00