This repository contains the essential files for the paper "Intuitive Control of a Social Robot Using Natural Language with Large Language Model and Error Correction Capabilities". The files are made available for research and informational purposes only and may not be replicated or downloaded for commercial purposes.
The repository allows reviewers and users to inspect the main files involved in the project. The repository is structured as follows:
- GPT_error_correction.py: the script used to trigger the Error Correction explained in Sec. 6 (Error Correction) of the paper.
- GPT_prompt_chat_client.py: the script used to run the framework that controls the Nao Robot through voice, without the Error Correction.
- main_server.py: this is the script to activate the Local Server used to mediate between the Nao Robot Controller and the OpenAI Service.
- server.py: this script contains the class and methods referenced in the main_server.py script.
- nao_file_transfer.py: this file contains the class used to transfer audio files between the laptop and the Nao Robot via FTP.
- nao_robot_wrapper.py: this file contains the wrapper class to convert GPT code into NAOqi interpretable code for the Robot Controller.
- performance_results.ipynb: this notebook contains the code used to compute the results and graphs shown in Sec. 7 (Performance Evaluation) of the paper.
- prompt_manager.py: this file contains the class to manipulate the Instruction Prompt.
- PromptTable_eval_with_Score.csv: this csv file collects the tests of the Performance Evaluation.
- demo.py: this file contains a demonstration of the Error Correction pipeline used in the project, as well as a demo of how ChatGPT tries to satisfy the user's request by issuing commands to the robot.
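To make the role of the wrapper more concrete, here is a minimal sketch of how a class exposing the method names seen in the generated code (`say`, `setAngle`, `openHand`, `closeHand`, `moveForward`) could forward them to NAOqi. It is not the code in nao_robot_wrapper.py: the class names `NaoWrapperSketch` and `RecordingProxy`, the `speed` default, and the assumption that `moveForward` takes a distance in meters are all hypothetical. The proxies are injected so the sketch runs without the naoqi SDK; on a real robot they would be `ALMotion` and `ALTextToSpeech` proxies.

```python
class NaoWrapperSketch(object):
    """Hypothetical wrapper exposing the method names used by the generated code."""

    def __init__(self, motion, tts, speed=0.2):
        # motion and tts stand in for NAOqi ALMotion and ALTextToSpeech proxies.
        self.motion = motion
        self.tts = tts
        self.speed = speed  # fraction of maximum joint speed (assumed default)

    def say(self, text):
        self.tts.say(text)

    def setAngle(self, joints, angles):
        # ALMotion.setAngles takes joint names, target angles (rad), and a speed fraction.
        self.motion.setAngles(joints, angles, self.speed)

    def openHand(self, hand):
        self.motion.openHand(hand)

    def closeHand(self, hand):
        self.motion.closeHand(hand)

    def moveForward(self, distance):
        # Assumption: distance along the robot's x axis; negative values move backward.
        self.motion.moveTo(distance, 0.0, 0.0)


class RecordingProxy(object):
    """Test double that records calls instead of talking to a robot."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes that do not exist, i.e. the proxied methods.
        def record(*args):
            self.calls.append((name, args))
        return record


motion, tts = RecordingProxy(), RecordingProxy()
nao = NaoWrapperSketch(motion, tts)
nao.setAngle(['RShoulderPitch'], [1.5])
nao.closeHand('RHand')
print(motion.calls)
```

Injecting the proxies keeps the sketch testable on a machine without a robot; the recorded calls show what would be sent to NAOqi.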
A virtual environment is recommended.
- Python 3.7+
- OpenAI Account
- OpenAI API Key
Clone the repository into a dedicated directory and use the package manager pip to install the required Python libraries.
pip install -r requirements.txt
- Set up your OpenAI API KEY as an environment variable called "OPENAI_API_KEY"
- Open a terminal and launch the Local Server running the Python script main_server.py:
C:/location/on/laptop/python.exe main_server.py
- Open another terminal and run the demo.py script:
C:/location/on/laptop/python.exe demo.py
- Write a request in the demo.py terminal as if you were asking the robot
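Before launching the scripts, it can help to confirm that the "OPENAI_API_KEY" variable is actually visible to Python. The snippet below is a small sketch of such a check; the helper name `get_openai_api_key` is hypothetical and is not part of the repository.

```python
import os


def get_openai_api_key(env=os.environ):
    """Return the key stored in OPENAI_API_KEY, or raise a clear error if it is missing."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; define it before launching the server"
        )
    return key
```

Calling `get_openai_api_key()` in the same terminal used for main_server.py fails early with a readable message instead of a later authentication error.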
This video shows an interaction with the chatbot that controls the robot, during which the error correction algorithm is activated. The user asks the robot to move closer. After an initial unsatisfactory response, the chatbot is corrected. In the subsequent interaction, the chatbot no longer makes the same error.
NaoLLMDemo.mp4
- Python 3.7+
- OpenAI Account
- OpenAI API Key
- naoqi SDK for Python 2.7
Make sure to have in a Python 2.7 project directory:
- GPT_prompt_chat_client_new.py
- nao_file_transfer.py
- naorobot_wrapper.py
- utils.py
- record_speech.py
- requirements_py27.txt
Make sure to have in a Python 3 project directory:
- server.py
- main.py
- requirements_server_py3.txt
- speech_engine.py
- system_prompts/initial_setup.txt
- Define a Python 2.7 environment in which the naoqi SDK is installed. Then install the required packages:
pip install -r requirements_py27.txt
- Define a separate Python project. Create a conda environment with Python 3 and install the required packages:
pip3 install -r requirements_server_py3.txt
- In the Python 3 environment, set up your OpenAI API KEY as an environment variable called "OPENAI_API_KEY"
In the Python 3 project, launch:
C:/conda/location/on/laptop/of/python3/python.exe main.py
In the Python 2.7 project, launch:
C:/location/on/laptop/python.exe GPT_prompt_chat_client_new.py --robot_ip your_nao_robot_ip
Wait a few seconds until NAO deactivates its autonomous life and says "I'm ready".
To ask NAO to do something, say the word 'Gino' and wait for NAO to ask for your command. You then have 7 seconds to make your request. You can also say "listen" to make NAO listen to your request.
This video shows a doctor making a request to the robot that was not previously mapped in the system. The framework generates actions to satisfy the doctor's request, and the robot executes them correctly.
gpt_control._final.mp4
This video shows the simulation of a complete visit with the LLM-enabled control.
gpt_control_visit_subtitled.mp4
These videos show some movements performed by NAO based on the user requests:
NAO_first_row_interaction.mp4
NAO_second_row_interaction.mp4
If you encounter any issues related to installation or usage with a real NAO Robot, please report them through the Issues tab in this repository or contact me directly via email at federico.biagi@unimore.it. This project has been tested on a limited set of devices, so you may experience technical difficulties when attempting to replicate it on different setups. If that happens, please let me know and I'll do my best to help and correct any mistakes in the repository!
The sentences spoken by the robot and the users' requests saved in the .csv file are in Italian, as the research project was carried out at the "Policlinico di Modena" Hospital in Italy.
This table provides some interesting examples of the code produced by ChatGPT for the task completion, along with the difficulty score.
| Request | Code | Score |
|---|---|---|
| Show me where the heart is | nao.setAngle(['RShoulderPitch'], [1.5])\nnao.setAngle(['RElbowRoll'], [-1.0])\nnao.setAngle(['RElbowYaw'], [1.0])\nnao.openHand('RHand')\nnao.closeHand('RHand') | 5.93 |
| Let's play rock paper scissors, I choose rock | nao.say("Okay, let's play! I choose rock!")\nnao.setAngle(['RShoulderPitch'], [1.5])\nnao.closeHand('RHand') | 2.7 |
| Give me your hand and guide me along the corridor | nao.setAngle(['RShoulderPitch'], [1.8])\nnao.openHand('RHand')\nnao.moveForward(2.0)\nnao.setAngle(['RShoulderPitch'], [1.8])\nnao.closeHand('RHand')\nnao.moveForward(2.0)\nnao.say("Of course, let's start the visit.") | 3.3 |
| Assist the patient during the cardiologic examination | nao.setAngle(['RShoulderPitch'], [1.5])\nnao.setAngle(['RElbowRoll'], [-1.0])\nnao.setAngle(['RElbowYaw'], [1.0])\nnao.openHand('RHand')\nnao.say("Here is where the heart is.")\nnao.closeHand('RHand')\nnao.say("Now breathe deeply while we listen to your heart.")\nnao.moveForward(0.5)\nnao.say("Alright, now move a little back.")\nnao.moveForward(-0.5)\nnao.say("The visit is over!") | 6.42 |
| Give me a High Five! | nao.setAngle(['RElbowRoll'], [-0.0349])\nnao.setAngle(['RShoulderPitch'], [2.0])\nnao.setAngle(['RElbowRoll'], [0.0])\nnao.setAngle(['RWristYaw'], [1.0])\nnao.openHand('RHand')\nnao.setAngle(['RShoulderPitch'], [1.8]) | 3.2 |
| Scratch your head | nao.setAngle(['RElbowRoll'], [-1.57])\nnao.setAngle(['RWristYaw'], [1.0])\nnao.openHand('RHand')\nnao.closeHand('RHand') | 3.9 |
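The Code column stores each snippet on a single line, with line breaks written as a literal `\n`. As an illustration of how such a snippet could be inspected before being sent to the robot, the sketch below splits it into statements and checks that each one is a method call on `nao`. The helper `list_nao_calls` is hypothetical and is not part of the repository; it relies only on Python's standard `ast` module.

```python
import ast


def list_nao_calls(generated_code):
    """Return the method names of the nao.<method>(...) calls in a generated snippet.

    Raises ValueError if any statement is not a plain call on `nao`.
    """
    # The table shows line breaks as a literal backslash-n; turn them into real newlines.
    source = generated_code.replace("\\n", "\n")
    calls = []
    for node in ast.parse(source).body:
        call = node.value if isinstance(node, ast.Expr) else None
        is_nao_call = (
            isinstance(call, ast.Call)
            and isinstance(call.func, ast.Attribute)
            and isinstance(call.func.value, ast.Name)
            and call.func.value.id == "nao"
        )
        if not is_nao_call:
            raise ValueError("unexpected statement on line %d" % node.lineno)
        calls.append(call.func.attr)
    return calls


# Second row of the table, copied as a raw string so the \n markers stay literal.
snippet = r"""nao.say("Okay, let's play! I choose rock!")\nnao.setAngle(['RShoulderPitch'], [1.5])\nnao.closeHand('RHand')"""
print(list_nao_calls(snippet))  # ['say', 'setAngle', 'closeHand']
```

Parsing the snippet rather than executing it directly means that anything other than a `nao` call, such as an import, is rejected before it reaches the robot controller.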