Watch ChatGPT-powered humanoid robot pose funny, play terrifying ghost

In a fusion of cutting-edge technologies, a highly advanced humanoid robot has gained unparalleled capabilities by leveraging artificial intelligence (AI).

Meet Alter3, a humanoid robot that can generate spontaneous motion using a Large Language Model (LLM), namely GPT-4.

Developed by a University of Tokyo team, Alter3 employs OpenAI's latest tool to dynamically assume various poses, from a selfie stance to mimicking a ghost, all without the need for pre-programmed entries in its database.

Alter3's "response to conversational content through facial expressions and gestures represents a significant advancement in humanoid robotics, easily adaptable to other androids with minimal modifications," the team wrote in its research paper.

The details of the team's research were published on the preprint server arXiv.

From Text to Motion: Grounding GPT-4 in a Humanoid Robot "Alter3"

paper page: https://t.co/QKIKfWKyPZ

report the development of Alter3, a humanoid robot capable of generating spontaneous motion using a Large Language Model (LLM), specifically GPT-4. This achievement was…

— AK (@_akhaliq) December 12, 2023

LLMs in robots 

In the realm of integrating LLMs with robots, the emphasis has been on enhancing basic communication and simulating lifelike responses. Researchers are also exploring the capacity of LLMs to let robots comprehend and execute complex instructions, thereby enhancing their autonomy and functionality.

Traditionally, low-level robot control is tied to hardware and lies beyond the purview of LLM corpora, posing difficulties for direct LLM-driven robot control. In tackling this challenge, the Japanese team devised a method to convert human movement expressions into code comprehensible to the android.

Alter3 striking a selfie pose and pretending to be a ghost. (Image: University of Tokyo)

This implies that the robot can autonomously generate action sequences over time without the need for developers to individually program each body part. Users can modify poses or clarify distinctions, such as between a dab and an infinite dab. 

During interactions, a human can instruct Alter3 with commands like "Take a selfie with your iPhone." Subsequently, the robot initiates a series of queries to OpenAI's GPT-4, seeking guidance on the steps involved. GPT-4 then translates this into Python code, enabling the robot to comprehend and execute the necessary movements, according to Tom's Guide. 
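To make that flow concrete, here is a minimal sketch of the command-to-code loop. The OpenAI chat API call is real, but everything robot-side is a hypothetical stand-in: the set_axis() helper, the axis numbering, and the system prompt are illustrative assumptions, since Alter3's actual control interface has not been published.

```python
# Minimal sketch of the command-to-code loop: a natural-language
# instruction goes to GPT-4, which replies with Python motion code
# that the robot-side runtime then executes.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM_PROMPT = (
    "You control a humanoid robot with 43 numbered axes (0-42). "
    "Answer only with Python calls of the form set_axis(axis, value), "
    "one per line, where value is a float between 0.0 and 1.0."
)

def set_axis(axis: int, value: float) -> None:
    """Hypothetical low-level command; a real controller would drive
    the corresponding pneumatic actuator instead of printing."""
    print(f"axis {axis:2d} -> {value:.2f}")

def perform(instruction: str) -> None:
    """Ask GPT-4 to translate an instruction into axis-level motion
    code, then run that code."""
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": instruction},
        ],
    )
    code = reply.choices[0].message.content
    # exec() on model output is acceptable for a trusted demo; a real
    # deployment would validate or sandbox the generated code first.
    exec(code, {"set_axis": set_axis})

perform("Take a selfie with your iPhone.")
```

In the published pipeline, the generated code drives Alter3's actuators rather than a console, and verbal feedback such as "raise your arm a bit more" can be sent as a follow-up message to refine the pose.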

This innovation empowers Alter3 to produce spontaneous upper-body motion, although its lower body remains stationary, affixed to a stand, limiting its capabilities for now. 

Advanced proposition 

Alter3 is the third iteration in the Alter humanoid robot series, which dates back to 2016. It boasts 43 actuators driven by compressed air, covering facial expressions and limb movements, a configuration that enables a diverse range of expressive gestures. Notably, Alter3 cannot walk but can simulate walking and running motions. 

In earlier studies, Alter3 showcased its ability to copy human poses using a camera and the OpenPose framework. The robot adjusts its joints to mimic observed poses and stores successful imitations for future use. The effectiveness is measured by increased transfer entropy, showing information flow from humans to the robot.
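For readers unfamiliar with the measure, the sketch below shows one way a transfer-entropy estimate can be computed from two pose sequences. The histogram-based estimator, history length of one, and toy data are illustrative assumptions, not the estimator the team used.

```python
# Self-contained sketch of transfer entropy TE(X -> Y): how much
# knowing the human's previous pose x[t] improves prediction of the
# robot's next pose y[t+1], beyond the robot's own previous pose y[t].
from collections import Counter
from math import log2

def transfer_entropy(x, y):
    """Estimate TE(X -> Y) in bits for two equal-length sequences of
    discrete symbols (e.g. binned joint angles), history length 1."""
    triples = Counter(zip(y[1:], y[:-1], x[:-1]))  # (y_next, y_prev, x_prev)
    yy = Counter(zip(y[1:], y[:-1]))               # (y_next, y_prev)
    yx = Counter(zip(y[:-1], x[:-1]))              # (y_prev, x_prev)
    y_prev = Counter(y[:-1])
    n = len(x) - 1
    te = 0.0
    for (yn, yp, xp), c in triples.items():
        p_joint = c / n                      # p(y_next, y_prev, x_prev)
        p_full = c / yx[(yp, xp)]            # p(y_next | y_prev, x_prev)
        p_self = yy[(yn, yp)] / y_prev[yp]   # p(y_next | y_prev)
        te += p_joint * log2(p_full / p_self)
    return te

# Toy data: the "robot" sequence copies the "human" sequence with a
# one-step lag, so information flows from x to y and TE(X -> Y) > 0.
x = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1]
y = [0] + x[:-1]
print(f"TE(human -> robot) = {transfer_entropy(x, y):.3f} bits")
```

Higher values indicate stronger information flow from the human's movements to the robot's, which is how the team quantified successful imitation.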

In experiments, interaction with humans resulted in more diverse poses, supporting the idea that various movements come from human imitation, similar to how newborns learn by imitating. These findings opened the door to exploring the imitative skills of robots, particularly with advanced LLMs like those integrated into Alter3.

Before the advent of LLMs, researchers had to meticulously control all 43 axes in a specific sequence to replicate a person's pose or simulate behaviors like serving tea or playing chess, which typically involved numerous manual refinements. With the LLM, the team was able to do away with this repetitive manual labor. 

"We expect Alter3 to effectively engage in dialogue, manifesting contextually relevant facial expressions and gestures. Notably, it has shown capability in mirroring emotions, such as displaying sadness or happiness in response to corresponding narratives, thereby sharing emotions with us," said a research paper. 

Study Abstract

We report the development of Alter3, a humanoid robot capable of generating spontaneous motion using a Large Language Model (LLM), specifically GPT-4. This achievement was realized by integrating GPT-4 into our proprietary android, Alter3, thereby effectively grounding the LLM with Alter's bodily movement. Typically, low-level robot control is hardware-dependent and falls outside the scope of LLM corpora, presenting challenges for direct LLM-based robot control. However, in the case of humanoid robots like Alter3, direct control is feasible by mapping the linguistic expressions of human actions onto the robot's body through program code. Remarkably, this approach enables Alter3 to adopt various poses, such as a 'selfie' stance or 'pretending to be a ghost,' and generate sequences of actions over time without explicit programming for each body part. This demonstrates the robot's zero-shot learning capabilities. Additionally, verbal feedback can adjust poses, obviating the need for fine-tuning.
