Use tiktoken (#1044)

* use tiktoken==0.3.0

* formatting

* tuple should be safer

* Update whisper/tokenizer.py

Co-authored-by: Ruhollah Majdoddin <r.majdodin@gmail.com>

* use tiktoken 0.3.1

* reflecting suggestions

* cleanup

* bypassing load_tiktoken_bpe to avoid blobfile dep

---------

Co-authored-by: Ruhollah Majdoddin <r.majdodin@gmail.com>
This commit is contained in:
Jong Wook Kim
2023-03-13 05:34:16 -04:00
committed by GitHub
parent ad3250a846
commit 839639a223
15 changed files with 100601 additions and 100096 deletions

View File

@@ -3,5 +3,5 @@ numpy
torch
tqdm
more-itertools
transformers>=4.19.0
tiktoken==0.3.1
ffmpeg-python==0.2.0