Add word-level timestamps (#43)

* Add word-level timestamps

* Fix alignment between the segments and the lists of words

* Fix truncated words list when the replacement character is decoded

* Check for empty text_tokens

* Add usage example in the readme

* Update ctranslate2 to 3.9

* Skip empty segment

* Set typing for the new methods
Guillaume Klein
2023-03-15 15:02:28 +01:00
committed by GitHub
parent b41fd05948
commit 8bd013ea99
4 changed files with 314 additions and 8 deletions

@@ -99,6 +99,16 @@ for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
```
#### Word-level timestamps
```python
segments, _ = model.transcribe("audio.mp3", word_timestamps=True)

for segment in segments:
    for word in segment.words:
        print("[%.2fs -> %.2fs] %s" % (word.start, word.end, word.word))
```
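
One possible use of the per-word `start` and `end` values is to split a transcript into phrases wherever the speaker pauses. The following is a minimal sketch, not part of the library: `Word` is a hypothetical stand-in that only mirrors the three attributes read in the example above, and `split_on_pauses` and its `max_gap` threshold are illustrative names chosen here.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Word:
    # Hypothetical stand-in: mirrors only the attributes used in the README
    # example (start, end, word). It is not the library's own class.
    start: float
    end: float
    word: str


def split_on_pauses(words: Iterable[Word], max_gap: float = 0.5) -> List[List[Word]]:
    """Group consecutive words, starting a new group whenever the silence
    between one word's end and the next word's start exceeds max_gap seconds."""
    groups: List[List[Word]] = []
    current: List[Word] = []
    for word in words:
        if current and word.start - current[-1].end > max_gap:
            groups.append(current)
            current = []
        current.append(word)
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        Word(0.00, 0.40, " Hello"),
        Word(0.45, 0.90, " world"),
        Word(2.00, 2.30, " Again"),
    ]
    for group in split_on_pauses(sample):
        text = "".join(w.word for w in group).strip()
        print("[%.2fs -> %.2fs] %s" % (group[0].start, group[-1].end, text))
```

With real output, the words of each segment (`segment.words`) could be passed in place of the sample list, assuming they expose the same three attributes.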
See more model and transcription options in the [`WhisperModel`](https://github.com/guillaumekln/faster-whisper/blob/master/faster_whisper/transcribe.py) class implementation.
## Comparing performance against other implementations