ros2_whisper repository

Repository Summary

Checkout URI: https://github.com/ros-ai/ros2_whisper.git
VCS Type: git
VCS Version: main
Last Updated: 2024-12-13
Dev Status: UNMAINTAINED
CI Status: No Continuous Integration
Released: UNRELEASED
Tags: No category tags.
Contributing: Help Wanted (0), Good First Issues (0), Pull Requests to Review (0)

Packages

README

ROS 2 Whisper

ROS 2 inference for whisper.cpp.

Example

This example shows live transcription of the first minute of the sixth chapter of Harry Potter and the Philosopher’s Stone from Audible:

[Demo recording: harry_potter_sample]

Build

mkdir -p ros-ai/src && cd ros-ai/src && \
git clone https://github.com/ros-ai/ros2_whisper.git && cd .. && \
colcon build --symlink-install --cmake-args -DGGML_CUDA=On --no-warn-unused-cli

Demos

Configure whisper parameters in whisper.yaml.

Whisper On Key

Run the inference action server (this will download models to $HOME/.cache/whisper.cpp):

ros2 launch whisper_bringup bringup.launch.py
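
To check which models have already been downloaded, the cache directory mentioned above can be listed. The following is a minimal sketch, assuming the models are stored as regular files directly inside $HOME/.cache/whisper.cpp (the layout is an assumption, and list_cached_models is a hypothetical helper, not part of ros2_whisper):

```python
from pathlib import Path

# Cache location taken from the note above; the flat file layout is assumed.
DEFAULT_CACHE = Path.home() / ".cache" / "whisper.cpp"


def list_cached_models(cache_dir: Path = DEFAULT_CACHE) -> list[str]:
    """Return the sorted names of regular files in the cache directory."""
    if not cache_dir.is_dir():
        # Nothing has been downloaded yet, or the cache lives elsewhere.
        return []
    return sorted(p.name for p in cache_dir.iterdir() if p.is_file())


if __name__ == "__main__":
    for name in list_cached_models():
        print(name)
```

An empty list simply means that no files were found at that path; it does not by itself indicate a failed download.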

Run a client node (activated on space bar press):

ros2 run whisper_demos whisper_on_key

Stream

Bring up whisper:

ros2 launch whisper_bringup bringup.launch.py

Launch the live transcription stream:

ros2 run whisper_demos stream

Parameters

To enable/disable inference, you can set the active parameter from the command line with:

ros2 param set /whisper/inference active false # false/true

  • Audio will still be saved in the buffer but whisper will not be run.
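
For scripted use, the same command can be wrapped in a small helper. This is a sketch only: param_set_command and set_active are hypothetical names, and the helper does nothing beyond shelling out to the ros2 param set command shown above, so it assumes a sourced ROS 2 environment with the inference node running.

```python
import subprocess

NODE = "/whisper/inference"  # node name taken from the command above


def param_set_command(active: bool, node: str = NODE) -> list[str]:
    """Build the argv for `ros2 param set <node> active <true|false>`."""
    return ["ros2", "param", "set", node, "active", "true" if active else "false"]


def set_active(active: bool) -> bool:
    """Run the command; return True if the ros2 CLI exited successfully."""
    try:
        result = subprocess.run(param_set_command(active), check=False)
    except FileNotFoundError:
        # The ros2 executable is not on PATH (environment not sourced).
        return False
    return result.returncode == 0


if __name__ == "__main__":
    set_active(False)  # pause inference; audio is still buffered
```

Passing the arguments as a list avoids shell quoting issues and keeps the command identical to the one typed by hand.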

Available Actions

An action server of type Inference.action is available under the name inference.

  • The feedback message regularly publishes the actively changing portion of the transcript.

  • The final result contains stale and active portions from the start of the inference.
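
One way to picture the relationship between feedback and the final result is a transcript split into a stale part (no longer changing) and an active part (still being revised). The sketch below is a conceptual model only: TranscriptModel and its methods are hypothetical, and they reflect neither the fields of Inference.action nor the node's internal update logic.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptModel:
    stale: list[str] = field(default_factory=list)   # words no longer expected to change
    active: list[str] = field(default_factory=list)  # words that may still be revised

    def update_active(self, words: list[str]) -> list[str]:
        """Replace the active portion; the return value plays the role of feedback."""
        self.active = list(words)
        return self.active

    def commit(self, count: int) -> None:
        """Move the first `count` active words into the stale portion."""
        self.stale.extend(self.active[:count])
        self.active = self.active[count:]

    def result(self) -> str:
        """Stale followed by active, as in the final result."""
        return " ".join(self.stale + self.active)


model = TranscriptModel()
model.update_active(["the", "boy", "who"])  # feedback: the active portion only
model.commit(2)                             # "the boy" becomes stale
model.update_active(["who", "lived"])       # the active portion is revised
print(model.result())  # the boy who lived
```

In this model a client that only reads feedback sees the changing tail of the transcript, while the result joins both portions from the start of the inference.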

Published Topics

Messages of type AudioTranscript.msg are published on /whisper/transcript_stream whenever the transcript is updated. Each message contains the entire transcript (stale and active).

Internally, the topic /whisper/tokens of type WhisperTokens.msg is used to transfer the model output between nodes.

Troubleshoot

  • Encoder inference time: https://github.com/ggerganov/whisper.cpp/issues/10#issuecomment-1302462960

CONTRIBUTING

No CONTRIBUTING.md found.
