whisper

heimoshuiyu/whisper

62e34f0c03 format code [prompt] heimoshuiyu 2023-04-20 02:02:11 +08:00
67fac2a4ce Check multiple prompts heimoshuiyu 2023-04-18 18:58:22 +08:00
6fc4e6f230 fix id start heimoshuiyu 2023-04-18 15:13:07 +08:00
55756284ac Ignore repeated prompt heimoshuiyu 2023-04-18 12:15:21 +08:00
c09a7ae299 Update decoding.py (#1219) [main] Jong Wook Kim 2023-04-11 18:13:13 -04:00
b0022b3283 Update decoding.py (#1155) Fernando O. Gallego 2023-04-12 00:06:03 +02:00
76c901ab8d Update README.md to reference tiktoken (#1105) Arseniy Bushyn 2023-04-11 03:39:17 +03:00
43940fc978 Implement max line width and max line count, and make word highlighting optional (#1184) ryanheise 2023-04-11 10:28:35 +10:00
255887f219 Squash long words at window and sentence boundaries. (#1114) ryanheise 2023-04-11 10:23:53 +10:00
a151816b6b python-publish.yml: bump actions version to fix node warning (#1211) K.B.Dharun Krishna 2023-04-11 02:24:09 +05:30
b5851c6c40 Update tokenizer.py (#1163) Jong Wook Kim 2023-03-29 16:12:36 -04:00
6dea21fd7f Release 20230314 Jong Wook Kim 2023-03-15 00:39:05 -07:00
79c43e4859 abort find_alignment on empty input (#1090) Jong Wook Kim 2023-03-14 15:47:58 -04:00
5f9ac653b7 Fix truncated words list when the replacement character is decoded (#1089) Guillaume Klein 2023-03-14 17:32:41 +01:00
ba88b8e1b3 fix github language stats getting dominated by jupyter notebook (#1076) Akash Mahajan 2023-03-14 00:07:09 -07:00
671ac5a4ce Fix alignment between the segments and the list of words (#1087) Guillaume Klein 2023-03-14 00:34:09 +01:00
839639a223 Use tiktoken (#1044) Jong Wook Kim 2023-03-13 05:34:16 -04:00
ad3250a846 Release 20230308 Jong Wook Kim 2023-03-08 15:48:57 -08:00
c4b50c0824 kwargs in decode() for convenience (#1061) Jong Wook Kim 2023-03-08 18:46:38 -05:00
38f2f4d99d fix all_tokens handling that caused more repetitions and discrepancy in JSON (#1060) Jong Wook Kim 2023-03-08 18:34:07 -05:00
aac47c9834 fix typo Jong Wook Kim 2023-03-07 20:43:49 -08:00
26807ec6d3 Release 20230307 Jong Wook Kim 2023-03-07 20:36:29 -08:00
919a713499 attempt to fix the repetition/hallucination issue identified in #1046 (#1052) Jong Wook Kim 2023-03-07 23:08:45 -05:00
38e990d853 Use triton==2.0.0 (#1053) Jong Wook Kim 2023-03-07 19:56:31 -05:00
924e1f8e06 Try installing triton only if linux & x86_64 (#1051) Jong Wook Kim 2023-03-07 14:31:40 -05:00
4b0d5e58d0 Update setup.py Jong Wook Kim 2023-03-07 04:47:46 -08:00
8180fde939 Release 20230306 Jong Wook Kim 2023-03-06 18:50:41 -08:00
c6e4e5efb3 remove auxiliary audio extension (#1021) Local State 2023-03-06 20:48:14 -05:00
b80bcf610d apply formatting with black (#1038) Jong Wook Kim 2023-03-06 18:50:37 -05:00
500d0fe966 word-level timestamps in transcribe() (#869) Jong Wook Kim 2023-03-06 17:00:49 -05:00
eab8d920ed Decoding improvements (#1033) Jong Wook Kim 2023-03-06 14:32:32 -05:00
3e1780fd37 Update README.md (#894) Roman Vasilenko 2023-03-03 19:41:59 -05:00
7858aa9c08 Fix infinite loop caused by incorrect timestamp tokens prediction (#914) Andrey Chernykh 2023-02-02 06:46:51 +07:00
5c1a8c10e7 clarify that 3.11 is not supported Jong Wook Kim 2023-01-27 00:01:49 -08:00
4e635c6644 Update README.md about Python 3.8+ requirement Jong Wook Kim 2023-01-24 14:45:56 -08:00
a6b36ede1f drop python 3.7 support (#889) Jong Wook Kim 2023-01-24 14:05:57 -08:00
55f690af79 Release 20230124 Jong Wook Kim 2023-01-24 11:11:08 -08:00
7f1ef223ab handle printing even if sys.stdout.buffer is not available (#887) Jong Wook Kim 2023-01-24 10:12:04 -08:00
f5bfe004ec Add TSV formatted output in transcript, using integer start/end times in milliseconds. (#228) Niels Mayer 2023-01-22 00:27:17 -08:00
da600abd2b Added --output_format option (#333) Aaryan YVS 2023-01-22 13:28:38 +05:30
9f7aba6099 Handle XDG_CACHE_HOME properly for download_root (#864) zer0-x 2023-01-21 12:09:39 +03:00
12e1089462 use stdout for printing transcription progress (#867) Jong Wook Kim 2023-01-20 00:54:05 -08:00
ea1c266709 Fix bug where mm is mistakenly replaced with hmm in e.g. 20mm (#659) Markus Hennerbichler 2023-01-18 18:41:11 +00:00
8135a7c31c verbose outputs from pytest Jong Wook Kim 2023-01-18 10:30:18 -08:00
9d646db9d8 print '?' if a letter can't be encoded using the system default encoding (#859) Jong Wook Kim 2023-01-17 23:28:36 -08:00
37a4f1be6d Release 20230117 Jong Wook Kim 2023-01-17 16:08:28 -08:00
b9f9b433ae Add github action to automatically push to pypi on Release x.y.z commit (#681) Romain Beaumont 2023-01-18 00:50:26 +01:00
f0083e7eb2 Use ndimage.median_filter instead of signal.medfilter (#812) Umar Farooqi 2023-01-17 17:43:05 -05:00
a84191faae rename GitHub workflow Jong Wook Kim 2023-01-17 13:54:40 -08:00
b1d213c0c7 allow test_transcribe to run on CPU when CUDA is not available Jong Wook Kim 2023-01-17 13:43:36 -08:00
493dfffa37 add github action to run pytest Jong Wook Kim 2023-01-17 13:35:48 -08:00
0f39c89d92 Update README.md (#804) Mikko Vedru 2023-01-17 09:46:42 +02:00
6df3ea1fb5 Support batch-dimension in log_mel_spectogram (#839) Markus Hennerbichler 2023-01-17 07:46:15 +00:00
70861c7ce3 Fix tiny transcribe() docstring typo (#857) adamreis 2023-01-16 22:42:01 -08:00
f82bc59f5e torch.concatenate -> torch.cat for compatibility Jong Wook Kim 2023-01-10 10:53:18 -08:00
28769fcfe5 word-level timestamps in Multilingual_ASR notebook Jong Wook Kim 2022-12-31 10:03:42 -07:00
53807677fe MultiHeadAttention to return qk as well Jong Wook Kim 2022-12-30 01:53:06 -07:00
9323b2526c Revert "saving the qk matrix in the attention module for convenience" Jong Wook Kim 2022-12-29 23:53:31 -07:00
68e44bd83c saving the qk matrix in the attention module for convenience Jong Wook Kim 2022-12-29 23:02:52 -07:00
0b5dcfdef7 large-v2 figure and arxiv url update Jong Wook Kim 2022-12-09 00:12:39 -05:00
b9265e5796 Update Hebrew language code to he per IANA registry (#401) altryne 2022-12-07 11:45:31 -07:00
fd8f80c8b8 Explicitly closing model file after reading it (#630) Paul Harter 2022-12-06 17:07:19 +00:00
4179ed2475 add large-v2 model Jong Wook Kim 2022-12-05 11:07:14 -05:00
ec1b34bb90 fix compression ratio function (#561) jumon 2022-12-05 08:27:42 +09:00
eff383b27b invoking __call__ instead of forward() Jong Wook Kim 2022-11-16 04:18:50 -08:00
02aa851a49 fix to return only the text token ids Jong Wook Kim 2022-11-15 16:25:11 -08:00
76148a56c5 suppress generating non-timestamp tokens at the beginning (#532) jumon 2022-11-16 04:44:36 +09:00
9f70a352f9 Fix attention caching to make it actually work (#370) Vicki Anand 2022-10-19 19:44:03 -04:00
7f3e408e09 Add package metadata to setup.py (#315) Sumana Harihareswara 2022-10-17 16:51:16 -04:00
f680570016 Fix bug (#305) Michael Monashev 2022-10-17 21:38:20 +03:00
d18e9ea5dd transcribe() on English-only model won't complain when language="en" is not given Jong Wook Kim 2022-10-09 02:40:12 -07:00
82725cea9c infer download_root from XDG_CACHE_HOME if avail (#257) David Marx 2022-10-09 02:14:03 -07:00
35713c66e0 Add --threads option to transcribe (#278) eudoxos 2022-10-09 11:11:15 +02:00
9e653bd0ea Fixed CoW RuntimeError in DecodingTask.run() (#240) Corentin Jemine 2022-10-04 17:49:31 +02:00
02b74308ff Fix timestamps and strip extraneous whitespace in WebVTT output (#219) Tom Stuart 2022-10-03 22:51:07 +01:00
0b1ba3d46e Add model_dir to arguments (#202) Jibin Mathew 2022-09-30 14:45:51 -07:00
60132ade70 Use , character instead of . for SRT output. (#197) Caleb McQuillin 2022-09-29 23:44:12 -04:00
7cb4cc21bf allowing nonzero initial temperature Jong Wook Kim 2022-09-29 18:05:12 -07:00
30dc5c581b pointer to the show and tell section Jong Wook Kim 2022-09-29 14:57:12 -07:00
5905e503b8 Update README.md (#161) Szabolcs Pasztor 2022-09-29 23:18:54 +02:00
0457aac342 Adds missing command for install (mac) (#90) Fabiano 2022-09-30 10:08:58 +13:00
deafef05f3 Update audio.py (#178) sawadata 2022-09-30 04:34:04 +09:00
2b0c2971af Don't update duration if last timestamp is same as begin (#191) Vicki Anand 2022-09-29 15:27:48 -04:00
62fe7f1009 patience definition to match the paper Jong Wook Kim 2022-09-27 19:00:41 -07:00
b4308c4782 fix: transcribe verbosity (#140) Nick Konovalchuk 2022-09-26 21:46:21 +03:00
9c8183a179 Use PyTorch as logits transpose for ONNX support (#141) Michael Goin 2022-09-26 13:54:26 -04:00
2037b65f3f Context prompt (#128) VulumeCode 2022-09-26 14:22:33 +02:00
fc0f40981d Write each sentence as a separate line for the txt output (#101) EliEron 2022-09-26 13:52:28 +02:00
520796a34c fix token suppression (#123) VulumeCode 2022-09-26 13:35:21 +02:00
ead77fab97 add srt subtitle export utility (#102) fatih 2022-09-26 13:50:26 +03:00
5485428c81 arch linux ffmpeg install (#93) Ashutosh Tripathi 2022-09-26 15:54:47 +05:30
9e7e418ff1 add progress bar for transcribe loop (#100) fatih 2022-09-26 13:24:13 +03:00
5d8d3e75a4 add --condition_on_previous_text Jong Wook Kim 2022-09-25 05:16:08 -07:00
2d3032de01 improved warning message for English-only models Jong Wook Kim 2022-09-25 02:10:36 -07:00
8cf36f3508 allow hyphens and single quotes between words Jong Wook Kim 2022-09-23 20:11:27 +09:00
15ab548263 nocaptions -> nospeech to match the paper figure Jong Wook Kim 2022-09-23 15:45:32 +09:00
61989529b7 Fix possible mistake when loading model to device (#57) mj-kh 2022-09-23 09:51:47 +03:30
f296bcd3fa Avoid keeping redundant copies of model weights in memory during load (#42) Niklas K 2022-09-23 05:57:39 +02:00
a4fe05aa71 Add conda environment.yml (and fix requirements.txt) (#8) Sidney Radcliffe 2022-09-23 04:30:45 +01:00
957ffc77de Add rust as a dependency (#30) Giovanni Lanzani 2022-09-23 05:26:38 +02:00
