
This new simulation platform trains robots to hear their environment




Today, there are many platforms to train robots before they can be deployed in real-world environments. However, the vast majority of them miss a crucial step: they do not consider the sounds that robots might detect and interact with in the real world.


To address this lack of adequate training platforms, a team of researchers at Stanford University recently created Sonicverse, a simulated environment that includes both visual and auditory elements, according to a report by TechXplore published on Friday.

"While we humans perceive the world by both looking and listening, very few prior works tackled embodied learning with audio," Ruohan Gao, one of the researchers who carried out the study, told the news outlet.

 "Existing embodied AI simulators either assume that environments are silent and the agents unable to detect sound, or deploy audio-visual agents only in simulation. Our goal was to introduce a new multisensory simulation platform with realistic integrated audio-visual simulation for training household agents that can both see and hear."

Sonicverse recreates both the visual components of any given environment and its sounds, offering robots more "realistic" virtual spaces and thus improving their performance in the real world.

"Unlike prior work, we hope to demonstrate that agents trained in simulation can successfully perform audio-visual navigation in challenging real-world environments," Gao explained. 

"Sonicverse is a new multisensory simulation platform that models continuous audio rendering in 3D environments in real time, It can serve as a testbed for many embodied AI and human-robot interaction tasks that need audio-visual perception, such as audio-visual navigation."

The researchers tested their new simulation platform on TurtleBot, a robot created by Willow Garage.

"We demonstrated Sonicverse's realism via sim-to-real transfer, which has not been achieved by other audio-visual simulators," Gao said. 

"In other words, we showed that an agent trained in our simulator can successfully perform audio-visual navigation in real-world environments, such as in an office kitchen."

From this experiment, the researchers concluded that their simulation platform could train robots to tackle real-world tasks more effectively by using both visual and auditory stimuli. 

"Embodied learning with multiple modalities has great potential to unlock many new applications for future household robots," Gao told TechXplore

"In our next studies, we plan to integrate multisensory object assets, such as those in our recent work ObjectFolder into the simulator, so that we can model the multisensory signals at both the space level and the object level, and also incorporate other sensory modalities such as tactile sensing."

The study is published on the arXiv preprint server.

Study abstract:

Developing embodied agents in simulation has been a key research topic in recent years. Exciting new tasks, algorithms, and benchmarks have been developed in various simulators. However, most of them assume deaf agents in silent environments, while we humans perceive the world with multiple senses. We introduce Sonicverse, a multisensory simulation platform with integrated audio-visual simulation for training household agents that can both see and hear. Sonicverse models realistic continuous audio rendering in 3D environments in real-time. Together with a new audio-visual VR interface that allows humans to interact with agents with audio, Sonicverse enables a series of embodied AI tasks that need audio-visual perception. For semantic audio-visual navigation in particular, we also propose a new multi-task learning model that achieves state-of-the-art performance. In addition, we demonstrate Sonicverse's realism via sim-to-real transfer, which has not been achieved by other simulators: an agent trained in Sonicverse can successfully perform audio-visual navigation in real-world environments.
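
The abstract mentions a multi-task learning model for semantic audio-visual navigation. As a simplified illustration of how visual and audio observations can be fused into a single navigation policy, the PyTorch sketch below encodes each modality separately and concatenates the features before predicting an action. It is an arbitrary baseline under assumed shapes and layer sizes, not the model proposed in the paper.

```python
# Minimal audio-visual fusion policy sketch (PyTorch). Illustrative only:
# layer sizes, input shapes, and the action space are assumptions, not the
# multi-task model from the Sonicverse paper.
import torch
import torch.nn as nn

class AudioVisualPolicy(nn.Module):
    def __init__(self, num_actions: int = 4):
        super().__init__()
        # Visual branch: encodes an RGB frame (3 x 128 x 128).
        self.visual = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=4), nn.ReLU(),
            nn.Flatten(),
            nn.LazyLinear(128), nn.ReLU(),
        )
        # Audio branch: encodes a binaural spectrogram (2 x 65 x 26).
        self.audio = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2), nn.ReLU(),
            nn.Flatten(),
            nn.LazyLinear(128), nn.ReLU(),
        )
        # Concatenated features are mapped to discrete action logits.
        self.head = nn.Linear(256, num_actions)

    def forward(self, rgb, audio):
        feats = torch.cat([self.visual(rgb), self.audio(audio)], dim=-1)
        return self.head(feats)

policy = AudioVisualPolicy()
rgb = torch.zeros(1, 3, 128, 128)   # dummy camera frame
audio = torch.zeros(1, 2, 65, 26)   # dummy spectrogram
logits = policy(rgb, audio)         # one score per navigation action
```

Fusing the two encoders this way is the simplest possible design choice; the paper's multi-task model is more sophisticated, but the sketch conveys why an agent that hears as well as sees has more information to navigate with.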
