ros_speech_recognition package from jsk_3rdparty repo

aques_talk assimp_devel downward ff ffha google_cloud_texttospeech julius libcmt libsiftfast lpg_planner mini_maxwell nlopt osqp slic voice_text zdepth bayesian_belief_networks chaplus_ros dialogflow_task_executive gdrive_ros google_chat_ros influxdb_store jsk_3rdparty collada_urdf_jsk_patch laser_filters_jsk_patch julius_ros nfc_ros opt_camera pgm_learner respeaker_ros ros_google_cloud_language ros_speech_recognition rospatlite rosping rostwitter sesame_ros switchbot_ros webrtcvad_ros zdepth_image_transport

Package Summary

Tags	No category tags.
Version	2.1.29
License	BSD
Build type	CATKIN
Use	RECOMMENDED

Repository Summary

Checkout URI	https://github.com/jsk-ros-pkg/jsk_3rdparty.git
VCS Type	git
VCS Version	master
Last Updated	2025-01-09
Dev Status	DEVELOPED
CI status	No Continuous Integration
Released	RELEASED
Tags	No category tags.
Contributing	Help Wanted (0) Good First Issues (0) Pull Requests to Review (0)

Package Description

ROS wrapper for Python SpeechRecognition library

Additional Links

Website

Maintainers

Yuki Furuta

Authors

Yuki Furuta

ros_speech_recognition/README.md

ros_speech_recognition

A ROS package for speech-to-text services.
This package uses Python package SpeechRecognition as a backend.

Tutorials

Normal tutorial

Install this package and SpeechRecognition

  sudo apt install ros-${ROS_DISTRO}-ros-speech-recognition
  

Launch speech recognition node

  roslaunch ros_speech_recognition speech_recognition.launch

Echo /speech_to_text

  rostopic echo /speech_to_text
  # you can get the recognition result
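To consume the results from Python rather than `rostopic echo`, a subscriber along the following lines may work. This is a minimal sketch, assuming the `transcript` and `confidence` array fields of speech_recognition_msgs/SpeechRecognitionCandidates; the helper names `best_candidate` and `main` and the node name are hypothetical and not part of this package.

```python
def best_candidate(transcripts, confidences):
    """Return (text, confidence) for the highest-confidence candidate.

    If the confidence list does not match the transcripts in length, the
    first transcript is returned with confidence None. Returns None when
    there are no transcripts at all.
    """
    if not transcripts:
        return None
    if len(confidences) != len(transcripts):
        return transcripts[0], None
    index = max(range(len(transcripts)), key=lambda i: confidences[i])
    return transcripts[index], confidences[index]


def main():
    # ROS imports are kept local so the helper above can be used without ROS.
    import rospy
    from speech_recognition_msgs.msg import SpeechRecognitionCandidates

    def callback(msg):
        best = best_candidate(list(msg.transcript), list(msg.confidence))
        if best is not None:
            rospy.loginfo("heard: %s (confidence: %s)", best[0], best[1])

    rospy.init_node("speech_to_text_listener")  # hypothetical node name
    rospy.Subscriber("/speech_to_text", SpeechRecognitionCandidates, callback)
    rospy.spin()
```

Calling `main()` in a sourced ROS environment while speech_recognition.launch is running should log one line per recognition result.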
  

Parrotry tutorial

Parrotry means オウム返し (repeating back what was heard, like a parrot) in Japanese.

# english
roslaunch ros_speech_recognition parrotry.launch
# japanese
roslaunch ros_speech_recognition parrotry.launch language:=ja-JP

`speech_recognition_node.py` Interface

Publishing Topics

~voice_topic (speech_recognition_msgs/SpeechRecognitionCandidates)

Speech recognition candidates topic name.

Topic name is set by parameter ~voice_topic, and default value is speech_to_text.
sound_play (sound_play/SoundRequestAction)

Action client to play sound on events. If the action server is not available or ~enable_sound_effect is False, no sound is played.

Subscribing Topics

~audio_topic (audio_common_msgs/AudioData)

Audio stream data to be recognized.

Topic name is set by parameter ~audio_topic, and default value is audio.

Advertising Services

speech_recognition (speech_recognition_msgs/SpeechRecognition)

Service for speech recognition
speech_recognition/start (std_srvs/Empty)

Start service for speech recognition

This service is available when parameter ~continuous is True.
speech_recognition/stop (std_srvs/Empty)

Stop service for speech recognition

This service is available when parameter ~continuous is True.

Parameters

~voice_topic (String, default: speech_to_text)

Publishing voice topic name
~audio_topic (String, default: audio)

Subscribing audio topic name
~enable_sound_effect (Bool, default: True)

Flag to enable or disable sound effects played on recognition events.
~language (String, default: en-US)

Language to be recognized
~engine (Enum[String], default: Google)

Speech-to-text engine (To see full options use dynamic_reconfigure)
~energy_threshold (Double, default: 300)

Threshold for Voice activity detection
~dynamic_energy_threshold (Bool, default: True)

Adaptive estimation for energy_threshold
~dynamic_energy_adjustment_damping (Double, default: 0.15)

Damping threshold for dynamic VAD
~dynamic_energy_ratio (Double, default: 1.5)

Energy ratio for dynamic VAD
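The dynamic-energy parameters above share their names with attributes of the SpeechRecognition library's Recognizer. As a rough sketch of how such an adaptive threshold can behave, the snippet below moves the threshold toward `energy * dynamic_energy_ratio` with a damping factor scaled to the buffer duration. The update rule is an assumption about the backend, not something this README confirms, and the function name `update_energy_threshold` and the 0.064 s buffer duration are hypothetical.

```python
def update_energy_threshold(threshold, energy, seconds_per_buffer,
                            damping=0.15, ratio=1.5):
    """Move the threshold toward energy * ratio with time-scaled damping."""
    # Damping is expressed per second; scale it to the buffer duration.
    scaled_damping = damping ** seconds_per_buffer
    target = energy * ratio
    return threshold * scaled_damping + target * (1.0 - scaled_damping)


# Steady ambient noise at energy 100 pulls the default threshold of 300
# toward the target of 100 * 1.5 = 150.
threshold = 300.0
for _ in range(50):
    threshold = update_energy_threshold(threshold, energy=100.0,
                                        seconds_per_buffer=0.064)
print(round(threshold, 1))
```

A damping value closer to 1 makes the threshold adapt more slowly; a larger ratio keeps the threshold further above the ambient level.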
~pause_threshold (Double, default: 0.8)

Seconds of non-speaking audio before a phrase is considered complete
~operation_timeout (Double, default: 0.0)

Seconds after an internal operation (e.g., an API request) starts before it times out
~listen_timeout (Double, default: 0.0)

The maximum number of seconds that this will wait for a phrase to start before giving up
~phrase_time_limit (Double, default: 10.0)

The maximum number of seconds that this will allow a phrase to continue before stopping and returning the part of the phrase processed before the time limit was reached
~phrase_threshold (Double, default: 0.3)

Minimum seconds of speaking audio before we consider the speaking audio a phrase
~non_speaking_duration (Double, default: 0.5)

Seconds of non-speaking audio to keep on both sides of the recording
~duration (Double, default: 10.0)

Seconds of waiting for speech
~depth (Int, default: 16)

Depth of audio signal
~n_channel (Int, default: 1)

Total number of channels in audio data (e.g. 1: mono, 2: stereo)
~sample_rate (Int, default: 16000)

Sample rate of audio signal
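Together, ~depth, ~n_channel, and ~sample_rate describe how the raw bytes on the audio topic are interpreted. The arithmetic can be sketched as follows, assuming the topic carries uncompressed interleaved PCM; the function names are hypothetical.

```python
def bytes_per_second(sample_rate=16000, depth=16, n_channel=1):
    """Raw PCM data rate implied by the audio format parameters."""
    return sample_rate * (depth // 8) * n_channel


def chunk_duration(n_bytes, sample_rate=16000, depth=16, n_channel=1):
    """Seconds of audio contained in n_bytes of raw PCM data."""
    return n_bytes / bytes_per_second(sample_rate, depth, n_channel)


print(bytes_per_second())      # 32000 with the default parameters
print(chunk_duration(32000))   # 1.0
```

If these parameters do not match what the microphone actually delivers, the audio is interpreted at the wrong speed or with the wrong channel layout, so it is worth checking them against the hardware.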
~buffer_size (Int, default: 10240)

Maximum buffer size to store audio data for speech recognition
~start_signal (String, default: /usr/share/sounds/freedesktop/stereo/bell.ogg)

Path to the sound file played at the start of audio capture
~recognized_signal (String, default: /usr/share/sounds/freedesktop/stereo/message.ogg)

Path to the sound file played at the end of audio capture
~success_signal (String, default: /usr/share/sounds/freedesktop/stereo/message-new-instant.ogg)

Path to sound file for bell on getting successful recognition result
~timeout_signal (String, default: /usr/share/sounds/freedesktop/stereo/network-connectivity-lost.ogg)

Path to sound file for bell on timeout for recognition
~continuous (Bool, default: False)

Selects whether results are delivered via the topic (True) or the service (False). By default, the service is used.
~auto_start (Bool, default: True)

Whether speech recognition starts automatically at launch.
~self_cancellation (Bool, default: True)

If True, the node does not recognize audio heard while any action in ~tts_action_names is running.

This option keeps the robot's own voice out of the recognition results.
~tts_action_names (List[String], default: ['sound_play'])

Text-to-speech action names used for self cancellation.

The node ignores audio heard while any of these text-to-speech actions is running.
~tts_tolerance (Float, default: 1.0)

Tolerance in seconds for self cancellation.

The node keeps ignoring audio for this many seconds after the actions in ~tts_action_names finish running.
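The interplay of ~self_cancellation, ~tts_action_names, and ~tts_tolerance can be sketched as a simple gate on incoming audio. This is an illustration of the described behavior, not the node's actual code; the function name and arguments are hypothetical, and treating audio at exactly the tolerance boundary as accepted is an assumption.

```python
def should_ignore_audio(now, tts_active, last_tts_end, tts_tolerance=1.0):
    """Return True if audio captured at time `now` should be dropped.

    now and last_tts_end are timestamps in seconds; last_tts_end is None
    when no text-to-speech action has finished yet.
    """
    if tts_active:
        # The robot is speaking right now.
        return True
    if last_tts_end is None:
        return False
    # Still inside the tolerance window after speech ended.
    return (now - last_tts_end) < tts_tolerance


print(should_ignore_audio(now=10.5, tts_active=False, last_tts_end=10.0))
print(should_ignore_audio(now=11.5, tts_active=False, last_tts_end=10.0))
```

The tolerance window exists because echoes and audio pipeline latency can deliver the tail of the robot's own speech slightly after the action reports completion.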
~google_key (String, default: None)

Auth key for the Google API. If None, a public key is used (with no guarantee that it will not be blocked).
This is valid only if ~engine is Google.
~google_cloud_credentials_json (String, default: None)

Path to credential json file. For JSK users, you can download from Google Drive link. This is valid only if ~engine is GoogleCloud.
~google_cloud_preferred_phrases ([String], default: None)

Preferred phrases parameters. This is valid only if ~engine is GoogleCloud.
~bing_key (String, default: None)

Auth key for Bing API.
This is valid only if ~engine is Bing.
~vosk_model_path (String, default: None)

Path to trained model for Vosk API. This is valid only if ~engine is Vosk.

If en-US or ja is selected as ~language, you do not need to specify the path. To load other models, please download them from Model list.

Author

Yuki Furuta «furushchev@jsk.imi.i.u-tokyo.ac.jp»

CHANGELOG

Changelog for package ros_speech_recognition

2.1.29 (2025-01-05)

[doc] fix typo in jsk_3rdparty/ros_speech_recognition/README.md (#499)
Contributors: Yukina Iwata

2.1.28 (2023-07-24)

2.1.27 (2023-06-24)

fix package.xml/CMakeLists.txt to supress catkin_lint errors (#479)
Contributors: Kei Okada

2.1.26 (2023-06-14)

add LICENSE files (#476)
Contributors: Kei Okada

2.1.25 (2023-06-08)

[ros_speech_recognition] Add vosk engine (#474)
Pr/use sound themes freedesktop (#472)
add test to check if ros node is loadable (#463)
add self.conf_thresh in __init_ function (#457)
[ros_speech_recognition] add ubuntu-sounds dependency (#453)
[ros_speech_recognition] Return if result is empty (#443)
[ros_speece_recognition] Set confidence value of google (#434)
[ros_speech_recognition] add parrotry.launch (#414)
[ros_ speech_recognition] update default arg for speech_recognition.launch (#412)
[ros_speech_recogniton, respeaker_ros] add confidence field (#411)
[ros_speech_recognition] add self cancellation for speech recogntion (#413)
[#405 and #410] Fix CI (#415)
add ROS interface for https://cloud.google.com/natural-language (#304)
GithubAction: add test for aarch64(melodic) / indigo (arm64) (#365)
- pgm_learner/respeaker_ros/ros_speech_recognition/rosping: increase time-limit/wait-time
Explicit python interpreter in catkin_virtualenv (#367)
.github/workflow: integrate all yaml to one (#338)
[ros_speech_recognition] Fixed the behavior of launch file (#336)
[ros_speech_recognition] add auto_start in speech_recognition_node.py (#301)
[ros_speech_recognition] add SpeechRecognitionCandidatesToString node (#303)
Enable sound play flag (#315)
Contributors: Aiko Ichikura, Aoi Nakane, Kei Okada, Koki Shinjo, Naoto Tsukamoto, Naoya Yamaguchi, Shingo Kitagawa, Yoshiki Obinata, Iory Yanokura

2.1.24 (2021-07-26)

2.1.23 (2021-07-21)

2.1.22 (2021-06-10)

enable to change topic name from speech_recognition.launch (#254)
support SpeakerDiarization, see https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize#SpeechRecognitionAlternative (#244)
- [ros_speech_recognition] Add doc to speech_recognition.launch add doc to args, and we need to use rosparm for device, not param. because 'device: ' causes load_parameters: unable to set parameters (last param was [/speech_recognition/depth=16]): cannot marshal None unless allow_none is enabled error
- more exception message for self.recognize
Use PYTHON_INTERPRETER python3 in ros_speech_recognition (#225)
Contributors: Kei Okada, Naoya Yamaguchi, Shingo Kitagawa

2.1.21 (2020-08-19)

add missing packages, closes https://github.com/ros/rosdistro/pull/26216 (#211)
Contributors: Kei Okada

2.1.20 (2020-08-07)

2.1.19 (2020-07-21)

Fixed issue #201 as requested, see https://github.com/jsk-ros-pkg/jsk_3rdparty/pull/202
Contributors: MrMarshy

2.1.18 (2020-07-20)

Fix for noetic (#200)
- fix 2to3, with print, raise, exception
[ros_speech_recognition] Enable multi channel audio recognition (#198)
- adjust type code to the CPU platform
- replace rosparam name: channels -> n_channel
- add rosparam description to README
- enable multi channel audio recognition
Add args to ros_speech_recognition (#197)
- Add flac as run_depend for SpeechRecognition pip package
- Use catkin_virtualenv to use SpeechRecognition pip package
- Add arguments and params to pass rostest
- Add test for ros_speech_recognition
- add args to launch
- add pip install to tutorials
- add param description to README
Contributors: Kei Okada, Naoya Yamaguchi

2.1.17 (2020-04-16)

2.1.16 (2020-04-16)

2.1.15 (2019-12-12)

2.1.14 (2019-11-21)

set SoundRequest.volume for kinetic (#173)
Contributors: Kei Okada

2.1.13 (2019-07-10)

2.1.12 (2019-05-25)

fixes GoogleCloud auth (#158)
Contributors: jonasius

2.1.11 (2018-08-29)

2.1.10 (2018-04-25)

2.1.9 (2018-04-24)

2.1.8 (2018-04-17)

2.1.7 (2018-04-09)

2.1.6 (2017-11-21)

2.1.5 (2017-11-20)

ros_speech_recognition: add continuous mode (#127)
ros_speech_recognition: add README (#123)
add ros_speech_recognition package (#121)
Contributors: Yuki Furuta

2.1.4 (2017-07-16)

2.1.3 (2017-07-07)

2.1.2 (2017-07-06)

2.1.1 (2017-07-05)

2.1.0 (2017-07-02)

2.0.20 (2017-05-09)

2.0.19 (2017-02-22)

2.0.18 (2016-10-28)

2.0.17 (2016-10-22)

2.0.16 (2016-10-17)

2.0.15 (2016-10-16)

2.0.14 (2016-03-20)

2.0.13 (2015-12-15)

2.0.12 (2015-11-26)

2.0.11 (2015-10-07 14:16)

2.0.10 (2015-10-07 12:47)

2.0.9 (2015-09-26)

2.0.8 (2015-09-15)

2.0.7 (2015-09-14)

2.0.6 (2015-09-08)

2.0.5 (2015-08-23)

2.0.4 (2015-08-18)

2.0.3 (2015-08-01)

2.0.2 (2015-06-29)

2.0.1 (2015-06-19 21:21)

2.0.0 (2015-06-19 10:41)

1.0.71 (2015-05-17)

1.0.70 (2015-05-08)

1.0.69 (2015-05-05 12:28)

1.0.68 (2015-05-05 09:49)

1.0.67 (2015-05-03)

1.0.66 (2015-04-03)

1.0.65 (2015-04-02)

1.0.64 (2015-03-29)

1.0.63 (2015-02-19)

1.0.62 (2015-02-17)

1.0.61 (2015-02-11)

1.0.60 (2015-02-03 10:12)

1.0.59 (2015-02-03 04:05)

1.0.58 (2015-01-07)

1.0.57 (2014-12-23)

1.0.56 (2014-12-17)

1.0.55 (2014-12-09)

1.0.54 (2014-11-15)

1.0.53 (2014-11-01)

1.0.52 (2014-10-23)

1.0.51 (2014-10-20 16:01)

1.0.50 (2014-10-20 01:50)

1.0.49 (2014-10-13)

1.0.48 (2014-10-12)

1.0.47 (2014-10-08)

1.0.46 (2014-10-03)

1.0.45 (2014-09-29)

1.0.44 (2014-09-26 09:17)

1.0.43 (2014-09-26 01:08)

1.0.42 (2014-09-25)

1.0.41 (2014-09-23)

1.0.40 (2014-09-19)

1.0.39 (2014-09-17)

1.0.38 (2014-09-13)

1.0.37 (2014-09-08)

1.0.36 (2014-09-01)

1.0.35 (2014-08-16)

1.0.34 (2014-08-14)

1.0.33 (2014-07-28)

1.0.32 (2014-07-26)

1.0.31 (2014-07-23)

1.0.30 (2014-07-15)

1.0.29 (2014-07-02)

1.0.28 (2014-06-24)

1.0.27 (2014-06-10)

1.0.26 (2014-05-30)

1.0.25 (2014-05-26)

1.0.24 (2014-05-24)

1.0.23 (2014-05-23)

1.0.22 (2014-05-22)

1.0.21 (2014-05-20)

1.0.20 (2014-05-09)

1.0.19 (2014-05-06)

1.0.18 (2014-05-04)

1.0.17 (2014-04-20)

1.0.16 (2014-04-19 23:29)

1.0.15 (2014-04-19 20:19)

1.0.14 (2014-04-19 12:52)

1.0.13 (2014-04-19 11:06)

1.0.12 (2014-04-18 16:58)

1.0.11 (2014-04-18 08:18)

1.0.10 (2014-04-17)

1.0.9 (2014-04-12)

1.0.8 (2014-04-11)

1.0.7 (2014-04-10)

1.0.6 (2014-04-07)

1.0.5 (2014-03-31)

1.0.4 (2014-03-29)

1.0.3 (2014-03-19)

1.0.2 (2014-03-12)

1.0.1 (2014-03-07)

1.0.0 (2014-03-05)

Wiki Tutorials

This package does not provide any links to tutorials in its rosindex metadata. You can check on the ROS Wiki Tutorials page for the package.

Package Dependencies

Deps	Name
	catkin_virtualenv
	dynamic_reconfigure
	jsk_data
	speech_recognition_msgs
	catkin
	audio_capture
	audio_common_msgs
	sound_play
	rostest
	roslaunch

System Dependencies

Name
g++-static
flac
sound-theme-freedesktop
python3-pycryptodome
python3-gnupg

Dependant Packages

Name	Deps
jsk_3rdparty

Launch files

launch/parrotry.launch
  - use_google [default: true]
  - language [default: en-US]
  - confidence_threshold [default: 0.8]
launch/speech_recognition.launch
  - launch_sound_play [default: true]: Launch the sound_play node to speak
  - launch_audio_capture [default: true]: Launch the audio_capture node to publish the audio topic from the microphone
  - audio_topic [default: /audio]: Name of the audio topic captured from the microphone
  - voice_topic [default: /speech_to_text]: Name of the text topic for recognized speech
  - n_channel [default: 1]: Number of channels of the audio topic and microphone. Run '$ pactl list short sinks' to check your hardware
  - depth [default: 16]: Bit depth of the audio topic and microphone. Run '$ pactl list short sinks' to check your hardware
  - sample_rate [default: 16000]: Frame rate of the audio topic and microphone. Run '$ pactl list short sinks' to check your hardware
  - device [default: ]: Card and device number of the microphone (e.g. hw:0,0). Check the card and device numbers with '$ arecord -l', then use hw:[card number],[device number]
  - engine [default: Google]: Speech-to-text engine. Options: Google, GoogleCloud, Sphinx, Wit, Bing, Houndify, IBM
  - language [default: en-US]: Speech-to-text language. For Japanese, set ja-JP.
  - continuous [default: true]: If false, the /speech_recognition service is advertised. If true, the /speech_to_text topic is published.
  - auto_start [default: true]: Whether speech recognition starts automatically. This parameter takes effect when continuous is true
  - self_cancellation [default: true]: If true, audio is not recognized while the robot is speaking.
  - tts_tolerance [default: 1.0]: Tolerance in seconds for deciding whether the robot is still speaking
  - tts_action_names [default: ['sound_play']]: TTS action names. Output from these action servers is ignored by speech recognition
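The confidence_threshold argument of parrotry.launch (default 0.8) suggests that low-confidence results are not echoed back. How the launch file applies it is not documented here, so the following is only a sketch of one plausible filter, assuming candidates arrive as parallel transcript and confidence lists; the function name is hypothetical.

```python
def filter_by_confidence(transcripts, confidences, threshold=0.8):
    """Keep only the transcripts whose confidence reaches the threshold."""
    return [text for text, conf in zip(transcripts, confidences)
            if conf >= threshold]


print(filter_by_confidence(["hello", "yellow"], [0.93, 0.41]))
```

Raising the threshold makes the robot repeat fewer misrecognized phrases at the cost of staying silent more often.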

Messages

No message files found.

Services

No service files found

Plugins

No plugins found.


