faster-whisper

Author	SHA1	Message	Date
Jordi Mas	1195359984	Filter out non_speech_tokens in suppressed tokens (#898)	2024-07-05 14:43:11 +07:00
Oscaarjs	3084409633	Add V3 Support (#578) * Add V3 Support * update conversion example Co-authored-by: oscaarjs <oscar.johansson@conversy.se>	2023-11-24 23:16:12 +01:00
Guillaume Klein	727ab81f31	Improve error message for invalid task and language parameters (#466)	2023-09-12 10:02:23 +02:00
Guillaume Klein	a5d03e55fa	Prevent out of range error in method split_tokens_on_unicode (#111)	2023-04-04 10:51:14 +02:00
Guillaume Klein	9fa1989073	Revert "Prevent out of range error in method split_tokens_on_unicode" This reverts commit `36160c1e7e`.	2023-04-04 10:25:41 +02:00
Guillaume Klein	36160c1e7e	Prevent out of range error in method split_tokens_on_unicode	2023-04-04 10:17:56 +02:00
Guillaume Klein	39fddba886	Suppress some special tokens when the default set is not used	2023-03-30 12:42:29 +02:00
Guillaume Klein	d82be59d5f	Fix unset attribute when using English-only models	2023-03-17 18:33:16 +01:00
Guillaume Klein	8bd013ea99	Add word-level timestamps (#43) * Add word-level timestamps * Fix alignment between the segments and the lists of words * Fix truncated words list when the replacement character is decoded * Check for empty text_tokens * Add usage example in the readme * Update ctranslate2 to 3.9 * Skip empty segment * Set typing for the new methods	2023-03-15 15:02:28 +01:00
Guillaume Klein	c52adaca90	Create a helper class Tokenizer	2023-03-09 12:53:49 +01:00

10 Commits