📌 MAROKO133 Eksklusif ai: ByteDance Introduces Astra: A Dual-Model Architecture for Autonomous Robot Navigation

The increasing integration of robots across various sectors, from industrial manufacturing to daily life, highlights a growing need for advanced navigation systems. However, contemporary robot navigation systems face significant challenges in diverse and complex indoor environments, exposing the limitations of traditional approaches. Addressing the fundamental questions of “Where am I?”, “Where am I going?”, and “How do I get there?”, ByteDance has developed Astra, an innovative dual-model architecture designed to overcome these traditional navigation bottlenecks and enable general-purpose mobile robots.

Traditional navigation systems typically consist of multiple, smaller, and often rule-based modules to handle the core challenges of target localization, self-localization, and path planning. Target localization involves understanding natural language or image cues to pinpoint a destination on a map. Self-localization requires a robot to determine its precise position within a map, especially challenging in repetitive environments like warehouses where traditional methods often rely on artificial landmarks (e.g., QR codes). Path planning further divides into global planning for rough route generation and local planning for real-time obstacle avoidance and reaching intermediate waypoints.

While foundation models have shown promise in integrating smaller models to tackle broader tasks, the optimal number of models and their effective integration for comprehensive navigation remained an open question.

ByteDance’s Astra, detailed in the paper “Astra: Toward General-Purpose Mobile Robots via Hierarchical Multimodal Learning” (website: https://astra-mobility.github.io/), addresses these limitations. Following the System 1/System 2 paradigm, Astra features two primary sub-models: Astra-Global and Astra-Local. Astra-Global handles low-frequency tasks like target and self-localization, while Astra-Local manages high-frequency tasks such as local path planning and odometry estimation. This division of labor lets a single architecture span both map-level reasoning and real-time control in complex indoor spaces.
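For intuition about how these two loops might interact, here is a toy Python sketch that runs a low-frequency global-localization loop alongside a high-frequency local loop. The 1 Hz and 20 Hz rates and the localize/step interfaces are illustrative assumptions; the paper only specifies the low- versus high-frequency division of labor.

```python
# Toy sketch of Astra's dual-frequency split: a slow "global" loop for
# relocalization and a fast "local" loop for planning and odometry.
# Loop rates and model interfaces are assumptions, not Astra's API.

def run_navigation(global_model, local_model,
                   hz_global: float = 1.0, hz_local: float = 20.0,
                   steps: int = 100) -> None:
    next_global = 0.0
    pose_estimate = None
    for k in range(steps):
        now = k / hz_local                       # simulated clock: one tick per local step
        if now >= next_global:                   # Astra-Global: low-frequency localization
            pose_estimate = global_model.localize()
            next_global = now + 1.0 / hz_global
        local_model.step(pose_estimate)          # Astra-Local: high-frequency planning/odometry
```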

Astra-Global: The Intelligent Brain for Global Localization

Astra-Global serves as the intelligent core of the Astra architecture, responsible for critical low-frequency tasks: self-localization and target localization. It functions as a Multimodal Large Language Model (MLLM), adept at processing both visual and linguistic inputs to achieve precise global positioning within a map. Its strength lies in utilizing a hybrid topological-semantic graph as contextual input, allowing the model to accurately locate positions based on query images or text prompts.

The construction of this robust localization system begins with offline mapping. The research team developed an offline method to build a hybrid topological-semantic graph G = (V, E, L); a minimal data-structure sketch follows the list:

  • V (Nodes): Keyframes, obtained by temporal downsampling of input video and SfM-estimated 6-Degrees-of-Freedom (DoF) camera poses, act as nodes encoding camera poses and landmark references.
  • E (Edges): Undirected edges establish connectivity based on relative node poses, crucial for global path planning.
  • L (Landmarks): Semantic landmark information is extracted by Astra-Global from visual data at each node, enriching the map’s semantic understanding. These landmarks store semantic attributes and are connected to multiple nodes via co-visibility relationships.
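A minimal Python sketch of this map structure might look like the following; every class and field name here is an illustrative assumption rather than Astra's actual implementation.

```python
from dataclasses import dataclass, field

# Minimal sketch of the hybrid topological-semantic graph G = (V, E, L).
# All names are illustrative assumptions, not Astra's actual implementation.

@dataclass
class Node:                          # V: a keyframe node
    node_id: int
    pose: tuple                      # SfM-estimated 6-DoF pose (x, y, z, roll, pitch, yaw)
    landmark_ids: set = field(default_factory=set)

@dataclass
class Landmark:                      # L: a semantic landmark shared across nodes
    landmark_id: int
    attributes: dict                 # semantic attributes, e.g. {"type": "door"}
    covisible_nodes: set = field(default_factory=set)

@dataclass
class TopoSemanticMap:
    nodes: dict = field(default_factory=dict)       # node_id -> Node
    edges: set = field(default_factory=set)         # E: undirected pairs (i, j)
    landmarks: dict = field(default_factory=dict)   # landmark_id -> Landmark

    def connect(self, i: int, j: int) -> None:
        """Add an undirected edge between two keyframe nodes (for global planning)."""
        self.edges.add((min(i, j), max(i, j)))

    def attach_landmark(self, node_id: int, lm_id: int) -> None:
        """Record a co-visibility link between a node and a landmark."""
        self.nodes[node_id].landmark_ids.add(lm_id)
        self.landmarks[lm_id].covisible_nodes.add(node_id)
```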

At runtime, Astra-Global performs visual-language localization through a coarse-to-fine, two-stage process that supports both self-localization and target localization. The coarse stage analyzes the input image and localization prompt, detects landmarks, establishes correspondence with a pre-built landmark map, and filters candidates based on visual consistency. The fine stage then uses the query image and the coarse output to sample reference map nodes from the offline map, comparing their visual and positional information to directly output the predicted pose.
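Sketched in Python, the two-stage flow could look roughly like this, where each callable stands in for an MLLM call or map query whose real interface the paper does not expose.

```python
from typing import Any, Callable, List, Tuple

Pose = Tuple[float, float, float, float, float, float]  # 6-DoF pose

def coarse_to_fine_localize(
    query_image: Any,
    prompt: str,
    detect_landmarks: Callable[[Any, str], List[str]],
    match_to_map: Callable[[List[str]], List[int]],
    is_consistent: Callable[[Any, int], bool],
    sample_nodes: Callable[[List[int]], List[int]],
    predict_pose: Callable[[Any, List[int]], Pose],
) -> Pose:
    # Coarse stage: detect landmarks in the query, match them against the
    # pre-built landmark map, and keep only visually consistent candidates.
    names = detect_landmarks(query_image, prompt)
    candidates = [c for c in match_to_map(names) if is_consistent(query_image, c)]

    # Fine stage: sample reference nodes from the offline map around the
    # candidates, then regress the final 6-DoF pose directly.
    refs = sample_nodes(candidates)
    return predict_pose(query_image, refs)
```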

For language-based target localization, the model interprets natural language instructions, identifies relevant landmarks using their functional descriptions within the map, and then leverages landmark-to-node association mechanisms to locate relevant nodes, retrieving target images and 6-DoF poses.
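A possible shape for that landmark-to-node lookup, reusing the TopoSemanticMap sketch above, is given below; matching landmarks by a plain description attribute is an assumption, since Astra-Global actually resolves functional descriptions with the MLLM.

```python
# Illustrative landmark-to-node lookup for language-based target
# localization, built on the TopoSemanticMap sketch above.

def locate_target(described_landmarks: set, topo_map) -> list:
    """Return candidate 6-DoF target poses for landmarks named in an instruction."""
    hits = [lm for lm in topo_map.landmarks.values()
            if lm.attributes.get("description") in described_landmarks]
    node_ids = {n for lm in hits for n in lm.covisible_nodes}   # co-visibility links
    return [topo_map.nodes[n].pose for n in node_ids]           # candidate targets
```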

To empower Astra-Global with robust localization abilities, the team employed a meticulous training methodology. Using Qwen2.5-VL as the backbone, they combined Supervised Fine-Tuning (SFT) with Group Relative Policy Optimization (GRPO). SFT involved diverse datasets for various tasks, including coarse and fine localization, co-visibility detection, and motion trend estimation. In the GRPO phase, a rule-based reward function (including format, landmark extraction, map matching, and extra landmark rewards) was used to train for visual-language localization. Experiments showed GRPO significantly improved Astra-Global’s zero-shot generalization, achieving 99.9% localization accuracy in unseen home environments, surpassing SFT-only methods.
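A toy version of such a rule-based reward might combine the four named terms as follows; the weights, the exact checks, and the semantics of the extra-landmark term are all assumptions.

```python
# Toy rule-based reward combining the four terms named in the paper:
# format, landmark extraction, map matching, and an extra-landmark term.
# Weights and the extra-landmark semantics (here: a small penalty for
# landmarks absent from the map annotation) are assumptions.

def grpo_reward(response: dict, ground_truth: dict) -> float:
    reward = 0.0
    if response.get("well_formatted", False):                 # format reward
        reward += 0.1
    extracted = set(response.get("landmarks", []))
    target = set(ground_truth["landmarks"])
    if target:                                                # landmark extraction reward
        reward += 0.4 * len(extracted & target) / len(target)
    if response.get("matched_node") == ground_truth["node"]:  # map matching reward
        reward += 0.4
    reward -= 0.1 * len(extracted - target)                   # extra-landmark term (assumed penalty)
    return reward
```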

Astra-Local: The Intelligent Assistant for Local Planning

Astra-Local handles Astra’s high-frequency tasks: it is a multi-task network that efficiently generates local paths and accurately estimates odometry from sensor data. Its architecture comprises three core components: a 4D spatio-temporal encoder, a planning head, and an odometry head.

The 4D spatio-temporal encoder replaces traditional mobile stack perception and prediction modules. It begins with a 3D spatial encoder that processes N omnidirectional images through a Vision Transformer (ViT) and Lift-Splat-Shoot to convert 2D image features into 3D voxel features. This 3D encoder is trained using self-supervised learning via 3D volumetric differentiable neural rendering. The 4D spatio-temporal encoder then builds upon the 3D encoder, taking past voxel features and future timestamps as input to predict future voxel features through ResNet and DiT modules, providing current and future environmental representations for planning and odometry.
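In heavily simplified PyTorch, the encoder's data flow might resemble the toy module below, where a small convolution stands in for the ViT, a naive broadcast stands in for Lift-Splat-Shoot, and an MLP stands in for the ResNet and DiT temporal modules; all shapes and dimensions are assumptions.

```python
import torch
import torch.nn as nn

# Heavily simplified stand-in for the 4D spatio-temporal encoder.
# Shapes, layer choices, and the lifting scheme are assumptions.

class Toy4DEncoder(nn.Module):
    def __init__(self, feat_dim: int = 32, grid=(16, 16, 4)):
        super().__init__()
        self.grid = grid
        self.backbone = nn.Conv2d(3, feat_dim, 3, padding=1)   # stand-in for the ViT
        self.temporal = nn.Sequential(                          # stand-in for ResNet + DiT
            nn.Linear(feat_dim + 1, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim),
        )

    def lift(self, images: torch.Tensor) -> torch.Tensor:
        """images: (N_cams, 3, H, W) omnidirectional views -> (X, Y, Z, feat_dim) voxels."""
        f = self.backbone(images).mean(dim=(0, 2, 3))           # pooled feature (feat_dim,)
        x, y, z = self.grid
        return f.expand(x, y, z, -1)                            # broadcast into the voxel grid

    def forward(self, images: torch.Tensor, future_t: float) -> torch.Tensor:
        """Predict voxel features at a future timestamp from current views."""
        vox = self.lift(images)
        t = torch.full((*self.grid, 1), future_t)               # timestamp channel
        return self.temporal(torch.cat([vox, t], dim=-1))       # predicted future voxels
```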

The planning head, based on pre-trained 4D features, robot speed, and task information, generates executable trajectories using Transformer-based flow matching. To prevent collisions, the planning head incorporates a masked Euclidean Signed Distance Field (ESDF) loss. This loss calculates the ESDF of a 3D occupancy map and applies a 2D ground-truth trajectory mask, significantly reducing collision rates. Experiments demonstrate its superior performance in collision rate and overall score on out-of-distribution (OOD) datasets compared to other methods.
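For intuition, here is a 2D toy version of a masked ESDF penalty built on SciPy's Euclidean distance transform; the safety margin, the masking semantics, and the exact loss form are assumptions (Astra computes the ESDF from a full 3D occupancy map).

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

# 2D toy version of a masked ESDF collision penalty. Margin, masking
# semantics, and loss form are assumptions for illustration.

def masked_esdf_loss(occupancy: np.ndarray, traj_cells: np.ndarray,
                     gt_mask: np.ndarray, margin: float = 3.0) -> float:
    """occupancy: (H, W) bool, True = obstacle; traj_cells: (T, 2) int grid
    cells of the planned trajectory; gt_mask: (H, W) bool, True where the
    ground-truth trajectory passes."""
    esdf = distance_transform_edt(~occupancy)        # distance to nearest obstacle
    dist = esdf[traj_cells[:, 0], traj_cells[:, 1]]  # clearance at each waypoint
    allowed = gt_mask[traj_cells[:, 0], traj_cells[:, 1]]
    # Penalize waypoints closer than `margin` to an obstacle, except where
    # the ground-truth trajectory mask marks the region as traversable.
    penalty = np.clip(margin - dist, 0.0, None) * (~allowed)
    return float(penalty.mean())
```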

The odometry head predicts the robot’s relative pose using current and past 4D features together with additional sensor data (e.g., IMU and wheel readings). It trains a Transformer model to fuse information from the different sensors: each modality is processed by a dedicated tokenizer and combined with modality embeddings and temporal positional embeddings before the Transformer fuses the resulting tokens.
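A toy PyTorch rendering of this fusion scheme is sketched below; the token dimension, per-sensor feature sizes, and the mean-pooled pose readout are assumptions.

```python
import torch
import torch.nn as nn

# Toy multi-sensor fusion for the odometry head: per-modality tokenizers,
# learned modality and temporal embeddings, and a Transformer encoder.
# All dimensions and the pooling scheme are assumptions.

class ToyOdometryHead(nn.Module):
    def __init__(self, d: int = 64, n_steps: int = 8):
        super().__init__()
        self.tok_voxel = nn.Linear(32, d)    # 4D-feature tokens (assumed dim 32)
        self.tok_imu = nn.Linear(6, d)       # accel + gyro
        self.tok_wheel = nn.Linear(2, d)     # left/right wheel speeds
        self.modality = nn.Embedding(3, d)   # one embedding per sensor type
        self.time = nn.Embedding(n_steps, d) # temporal positional embedding
        layer = nn.TransformerEncoderLayer(d, nhead=4, batch_first=True)
        self.fuse = nn.TransformerEncoder(layer, num_layers=2)
        self.pose = nn.Linear(d, 6)          # relative 6-DoF pose

    def forward(self, vox, imu, wheel):
        # Each input: (B, T, feat), with T time steps per modality.
        B, T, _ = imu.shape
        t = self.time(torch.arange(T))                           # (T, d)
        tokens = torch.cat([
            self.tok_voxel(vox) + self.modality.weight[0] + t,
            self.tok_imu(imu) + self.modality.weight[1] + t,
            self.tok_wheel(wheel) + self.modality.weight[2] + t,
        ], dim=1)                                                # (B, 3T, d)
        fused = self.fuse(tokens)
        return self.pose(fused.mean(dim=1))                      # (B, 6) relative pose
```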

Content automatically shortened.

🔗 Source: syncedreview.com


📌 MAROKO133 Eksklusif ai: When You Hear What Happens When They Put Lab Mice in Nature, You Might Rethink Your Entire Life

Lab mice are subjected to countless horrors during their short lives, from being injected with cancerous cells to getting exposed to microplastics or ending up dosed with cocaine.

It’s a controversial research standard that has long been criticized by animal rights activists. Even some in the scientific community argue they deserve better treatment, much like their human counterparts in clinical trials.

And things get even worse when you learn about the lives they could be living if they weren’t confined inside a lab. As detailed in a new paper published in the journal Current Biology, researchers at Cornell University found that lab mice released into a large, enclosed field near the institution’s campus almost immediately became less anxious, a lesson we might all learn from.

These “rewilded” mice behaved in dramatically different ways, even when they had an established record of anxiety.

“We release the mice into these very large, enclosed fields where they can run around and touch grass and dirt for the first time in their lives,” said senior author and Cornell associate professor of neurobiology and behavior Michael Sheehan in a statement about the research.

Beyond the positive effects of having lab mice live healthier lives, scientists suggest the data gleaned from them might become more reliable and generalizable. Scientists have long debated whether we can or should apply what we learn from lab mouse-based experiments to other animals or humans, especially when it comes to health-related research.

“It’s a new approach to understanding more about how experiences shape subsequent responses to the world, and the hope is that what we learn from these mice will have more generalizability to other animals and to ourselves as well,” Sheehan added.

To gauge anxiety in the test mice, the researchers used the most common technique out there, the “elevated plus maze.” The cross-shaped maze has two open arms and two closed, sheltered arms. More anxious rodents seek the shelter of the closed arms, while less anxious ones generally spend more time in the exposed open arms.

In an experiment, the Cornell team introduced lab mice to the maze before releasing them into the field. Once they returned from their odyssey, their behavior in the maze changed significantly.

“The rewilded mice show either no fear response or a much, much weaker response,” said first author and Cornell postdoctoral researcher Matthew Zipple in the statement.

Even mice that were reintroduced several times to the maze before being rewilded reversed their anxiety after their transition to a more natural environment.

“We put them in the field for a week, and they returned to their original levels of anxiety behavior,” Zipple explained. “Living in this naturalistic environment both blocks the formation of the initial fear response, and it can reset a fear response that’s already been developed in these animals in the lab.”

The team suggests mice may learn from having a wider range of experiences, findings that echo previous studies into the psychology of humans.

“We think this change in behavior is about agency, at its core,” Zipple said. “What I mean by agency is the ability of an animal to change its experiences in an environment through its own behavior.”

The more the mice experienced, like finding their own food or escaping from predators, the more they were able to cope with adversity.

“If you experience lots of different things that happen to you every day, you have a better way to calibrate whether or not something is scary or threatening,” Sheehan said in the statement. “But if you’ve only had five experiences, you come across your sixth experience, and it’s quite different from everything you’ve done before, that’s going to invoke anxiety.”

There might even be a lesson for humans in there somewhere: that it can’t hurt to touch grass every once in a while.

“One of the things that could be causing a rise in anxiety in young people is that they’re living more sheltered lives,” Sheehan explained. “There are conversations around modernity and our own lives that are echoed in this research that make it really interesting.”

More on lab mice: Lab Mice Exposed to Microplastics Show Signs of Dementia


🔗 Source: futurism.com


🤖 MAROKO133 Note

This article is an automatic summary drawn from several trusted sources. We pick trending topics so you always stay up to date.

✅ Next update in 30 minutes: a random theme awaits!

Author: timuna