Repository Summary
Checkout URI | https://github.com/thinking-machines-rl/openrobotgpt.git |
VCS Type | git |
VCS Version | main |
Last Updated | 2025-01-20 |
Dev Status | UNMAINTAINED |
CI status | No Continuous Integration |
Released | UNRELEASED |
Tags | No category tags. |
Contributing | Help Wanted (0), Good First Issues (0), Pull Requests to Review (0) |
Packages
Name | Version |
---|---|
code_bot | 0.0.0 |
panda_env | 0.0.0 |
robot_api_layer | 0.0.0 |
robotgpt_interfaces | 0.0.0 |
README
OpenRobotGPT
Open-source implementation of RobotGPT
Abstract: We present RobotGPT, an innovative decision framework for robotic manipulation that prioritizes stability and safety. The execution code generated by ChatGPT cannot guarantee the stability and safety of the system. ChatGPT may provide different answers for the same task, leading to unpredictability. This instability prevents the direct integration of ChatGPT into the robot manipulation loop. Although setting the temperature to 0 can generate more consistent outputs, it may cause ChatGPT to lose diversity and creativity. Our objective is to leverage ChatGPT’s problem-solving capabilities in robot manipulation and train a reliable agent. The framework includes an effective prompt structure and a robust learning model. Additionally, we introduce a metric for measuring task difficulty to evaluate ChatGPT’s performance in robot manipulation. Furthermore, we evaluate RobotGPT in both simulation and real-world environments. Compared to directly using ChatGPT to generate code, our framework significantly improves task success rates, with an average increase from 38.5% to 92.5%. Therefore, training a RobotGPT by utilizing ChatGPT as an expert is a more stable approach compared to directly using ChatGPT as a task planner.
ROS2 structure:
The ROS nodes are organized into three modular components.
- LLM component: calls an LLM through the API described in the robot_api_node node. The node then converts the LLM's request into a conventional action for the environment. This component can also be replaced by a traditional RL agent.
- ENV component: manages the connection with an LLM or RL agent, which can ask for an action to be executed and receives the associated next_state. Actions take the form of an end-effector final position, orientation, and grip_status; how these poses are reached is handled by the trajectory_generator node. The underlying physics environment is PyBullet.
- Trajectory generator: the component responsible for generating trajectories. You can use MoveIt or other custom packages.
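The separation between the three components can be sketched in plain Python, without ROS. This is only an illustration under stated assumptions: the names EndEffectorAction, TrajectoryGenerator, LinearTrajectory and Environment are hypothetical and do not come from this repository, the units and quaternion convention are assumed, and the straight-line planner merely stands in for MoveIt or a custom trajectory node.

```python
from dataclasses import dataclass
from typing import List, Protocol, Tuple


@dataclass(frozen=True)
class EndEffectorAction:
    """Hypothetical action layout: final position, orientation, grip status."""
    position: Tuple[float, float, float]            # x, y, z (metres, assumed)
    orientation: Tuple[float, float, float, float]  # quaternion x, y, z, w (assumed)
    grip_closed: bool                                # True means gripper closed


class TrajectoryGenerator(Protocol):
    """Anything that turns a start and goal pose into intermediate waypoints."""
    def plan(self, start: EndEffectorAction,
             goal: EndEffectorAction) -> List[EndEffectorAction]: ...


class LinearTrajectory:
    """Stand-in planner: interpolates position along a straight line."""
    def __init__(self, steps: int = 5) -> None:
        self.steps = steps

    def plan(self, start: EndEffectorAction,
             goal: EndEffectorAction) -> List[EndEffectorAction]:
        waypoints = []
        for i in range(1, self.steps + 1):
            t = i / self.steps
            pos = tuple(s + t * (g - s)
                        for s, g in zip(start.position, goal.position))
            # Orientation and grip are taken from the goal (no interpolation).
            waypoints.append(
                EndEffectorAction(pos, goal.orientation, goal.grip_closed))
        return waypoints


class Environment:
    """Stand-in for the ENV component: takes an action, returns the next state."""
    def __init__(self, planner: TrajectoryGenerator,
                 initial: EndEffectorAction) -> None:
        self.planner = planner
        self.state = initial

    def step(self, action: EndEffectorAction) -> EndEffectorAction:
        for waypoint in self.planner.plan(self.state, action):
            self.state = waypoint  # a real env would step the simulator here
        return self.state


# The agent (LLM or RL) only needs to produce an EndEffectorAction.
home = EndEffectorAction((0.0, 0.0, 0.5), (0.0, 0.0, 0.0, 1.0), False)
goal = EndEffectorAction((0.4, 0.0, 0.2), (0.0, 0.0, 0.0, 1.0), True)
env = Environment(LinearTrajectory(steps=4), home)
next_state = env.step(goal)
```

Because Environment depends only on the plan method, the planner can be swapped without touching the agent or the environment, which mirrors the modularity described above.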
How to run
Build an image from the Dockerfile
docker build -t <image_name> .
Run the container with GPU access, the SSH port exposed, and X11 forwarding
docker run --gpus all -it --rm \
-p 2222:22 \
-e DISPLAY=$DISPLAY \
-v /tmp/.X11-unix:/tmp/.X11-unix:rw \
-v $(pwd)/workspace:/root/workspace:rw \
<image_name>
From another terminal, connect to the container over SSH
ssh -X -p 2222 root@localhost
The password is “password”
The container's SSH host key changes between sessions, so reconnecting triggers a host key mismatch error. Clear the stale entry with the following command, replacing the known_hosts path with your own
ssh-keygen -f "/home/nicola/.ssh/known_hosts" -R "[localhost]:2222"
To let the container display windows on an Ubuntu host, run the following command on the host (it disables X server access control, so any client can connect)
xhost +
Testing the pipeline:
Launch the nodes in the following order:
- GymNode
- the trajectory node (e.g. MoveNodebasic, MoveIt)
- robot_api_node
- code_node