MAROKO133 Breaking ai: Amateurs Using AI to “Vibe Code” Are Now Begging Real Programmers to Fix Their Botched Software

📌 MAROKO133 Breaking ai: Amateurs Using AI to “Vibe Code” Are Now Begging Real Programmers to Fix Their Botched Software

Welcome to the future, where the vibes are bad in almost every meaningful respect — but where you do, at the very least, get to “vibe code,” or use an AI model to write code and even build entire pieces of software.

But the process rarely goes smoothly enough for prime time. The jury’s still out on whether experienced programmers actually benefit from AI coding assistants, and the tech’s shortcomings are even more obvious when it’s relied on by untrained amateurs who openly embrace the whole shtick of working mainly off “vibes.”

Nothing illustrates that last point better than the fact that some veteran programmers are apparently now making a killing by fixing these AI-hallucinated disasters, as spotlighted by 404 Media, which interviewed a few of these canny opportunists.

“I started fixing vibe-coded projects because I noticed a growing number of developers and small teams struggling to refine AI-generated code that was functional but lacked the polish or ‘vibe’ needed to align with their vision,” Hamid Siddiqi, a programmer who offers to “fix your vibe code” on Fiverr, told the outlet.

Siddiqi added that these clients need help with everything from horrendously optimized code to botched AI-generated UIs. 

And business is booming.

“I’ve been offering vibe coding fixer services for about two years now, starting in late 2023,” Siddiqi told 404. “Currently, I work with around 15-20 clients regularly, with additional one-off projects throughout the year.”

AI models are notorious for hallucinating and generally not doing what you intend them to do. One man found this out the hard way after his vibe-coding AI wiped out his business’s entire database. Nonetheless, even the largest tech firms have embraced using AI coding assistants. Google CEO Sundar Pichai claimed that as much as 25 percent of the company’s code is now AI-generated; Microsoft chief Satya Nadella went one better and claimed that it was 30 percent at his company.

Some research has suggested that relying on the tech does the opposite of making workflows more efficient, as programmers have to constantly double and triple check the AI’s error-laden outputs. One recent study found that programmers who used tools like Anthropic’s Claude were a whopping 19 percent slower, and ended up using less than half of the AI’s suggestions.

It’s no surprise, then, that Siddiqi is far from alone. Searching “vibe code fixer” on Fiverr, just one of many popular gig work platforms, returns over 230 results, and many of those programmers explicitly mention fixing “vibe code,” or some permutation thereof, when describing their services.

Some companies are getting in on the scene, too. 404 cited one software firm, Ulam Labs, which says on its website that “we clean up after vibe coding. Literally.”

There’s even an entire website dedicated to the niche: VibeCodeFixers.com. Its founder Swatantra Sohni told 404 that over 300 veteran programmers have already signed up. He bought the domain immediately after Andrej Karpathy, a prominent computer scientist and a former director of AI at Tesla, coined the term in February. The writing on the wall was that obvious.

“Most of these vibe coders, either they are product managers or they are sales guys, or they are small business owners, and they think that they can build something,” Sohni told 404.

Often, he found that vibe coders burn money on AI usage fees in the final stages of development when they try to add new features that break the app, at which point it would be cheaper to just start from scratch.

Luckily for Siddiqi and company, they often don’t.

More on AI: Programmers Using AI Create Way More Glaring Security Issues, Data Shows


🔗 Source: futurism.com


📌 MAROKO133 Update ai: ByteDance Introduces Astra: A Dual-Model Architecture for Autonomous Robot Navigation

The increasing integration of robots across various sectors, from industrial manufacturing to daily life, highlights a growing need for advanced navigation systems. However, contemporary robot navigation systems face significant challenges in diverse and complex indoor environments, exposing the limitations of traditional approaches. Addressing the fundamental questions of “Where am I?”, “Where am I going?”, and “How do I get there?”, ByteDance has developed Astra, an innovative dual-model architecture designed to overcome these traditional navigation bottlenecks and enable general-purpose mobile robots.

Traditional navigation systems typically consist of multiple, smaller, and often rule-based modules to handle the core challenges of target localization, self-localization, and path planning. Target localization involves understanding natural language or image cues to pinpoint a destination on a map. Self-localization requires a robot to determine its precise position within a map, especially challenging in repetitive environments like warehouses where traditional methods often rely on artificial landmarks (e.g., QR codes). Path planning further divides into global planning for rough route generation and local planning for real-time obstacle avoidance and reaching intermediate waypoints.

While foundation models have shown promise in integrating smaller models to tackle broader tasks, the optimal number of models and their effective integration for comprehensive navigation remained an open question.

ByteDance’s Astra, detailed in their paper “Astra: Toward General-Purpose Mobile Robots via Hierarchical Multimodal Learning” (website: https://astra-mobility.github.io/), addresses these limitations. Following the System 1/System 2 paradigm, Astra features two primary sub-models: Astra-Global and Astra-Local. Astra-Global handles low-frequency tasks like target and self-localization, while Astra-Local manages high-frequency tasks such as local path planning and odometry estimation. This architecture promises to revolutionize how robots navigate complex indoor spaces.
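To make that division of labor concrete, here is a minimal Python sketch, not ByteDance’s code, of how the two sub-models could be composed; every class and method name (AstraGlobal, AstraLocal, localize, plan, step) is hypothetical.

```python
# Minimal sketch of the System 1 / System 2 split described above.
# Class and method names are hypothetical, not from the Astra codebase.
from dataclasses import dataclass

@dataclass
class Pose:
    """6-DoF pose: position plus orientation."""
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0
    roll: float = 0.0
    pitch: float = 0.0
    yaw: float = 0.0

class AstraGlobal:
    """System 2: low-frequency self-localization and target localization (MLLM-based)."""
    def localize(self, image, prompt, map_graph) -> Pose:
        ...  # visual-language localization against the offline map

    def locate_target(self, instruction, map_graph) -> Pose:
        ...  # resolve "go to the coffee machine" into a goal pose

class AstraLocal:
    """System 1: high-frequency local path planning and odometry."""
    def plan(self, sensor_frames, goal: Pose, speed):
        ...  # produce an executable local trajectory

class Astra:
    def __init__(self, map_graph):
        self.global_model = AstraGlobal()  # queried only occasionally
        self.local_model = AstraLocal()    # runs at control rate
        self.map_graph = map_graph
        self.pose = None
        self.goal = None

    def step(self, image, sensor_frames, speed, instruction, relocalize=False):
        # Low-frequency branch: refresh global pose and goal only when needed.
        if relocalize or self.goal is None:
            self.pose = self.global_model.localize(image, "Where am I?", self.map_graph)
            self.goal = self.global_model.locate_target(instruction, self.map_graph)
        # High-frequency branch: always emit a fresh local trajectory.
        return self.local_model.plan(sensor_frames, self.goal, speed)
```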

Astra-Global: The Intelligent Brain for Global Localization

Astra-Global serves as the intelligent core of the Astra architecture, responsible for critical low-frequency tasks: self-localization and target localization. It functions as a Multimodal Large Language Model (MLLM), adept at processing both visual and linguistic inputs to achieve precise global positioning within a map. Its strength lies in utilizing a hybrid topological-semantic graph as contextual input, allowing the model to accurately locate positions based on query images or text prompts.

The construction of this robust localization system begins with offline mapping. The research team developed an offline method to build a hybrid topological-semantic graph G = (V, E, L), whose components are listed below (a minimal data-structure sketch follows the list):

  • V (Nodes): Keyframes, obtained by temporal downsampling of input video and SfM-estimated 6-Degrees-of-Freedom (DoF) camera poses, act as nodes encoding camera poses and landmark references.
  • E (Edges): Undirected edges establish connectivity based on relative node poses, crucial for global path planning.
  • L (Landmarks): Semantic landmark information is extracted by Astra-Global from visual data at each node, enriching the map’s semantic understanding. These landmarks store semantic attributes and are connected to multiple nodes via co-visibility relationships.
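The following is an illustrative data-structure sketch of that graph; the field names and types are assumptions for explanation, not the paper’s actual schema.

```python
# Illustrative structures for the hybrid topological-semantic graph G = (V, E, L).
# Field names and types are assumptions, not Astra's real schema.
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

@dataclass
class Node:                                  # V: a keyframe node
    node_id: int
    pose: Tuple[float, ...]                  # SfM-estimated 6-DoF camera pose
    image_path: str = ""                     # the keyframe image itself
    landmark_ids: List[int] = field(default_factory=list)  # co-visible landmarks

@dataclass
class Landmark:                              # L: a semantic landmark
    landmark_id: int
    name: str                                # e.g. "red fire extinguisher"
    attributes: Dict[str, str] = field(default_factory=dict)  # semantic attributes
    node_ids: List[int] = field(default_factory=list)         # co-visibility links to nodes

@dataclass
class MapGraph:
    nodes: Dict[int, Node]
    edges: Set[Tuple[int, int]]              # E: undirected connectivity between nodes
    landmarks: Dict[int, Landmark]

    def neighbors(self, node_id: int):
        """Connectivity used for global path planning over the topology."""
        for a, b in self.edges:
            if a == node_id:
                yield b
            elif b == node_id:
                yield a
```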

In practical localization, Astra-Global’s self-localization and target localization capabilities leverage a coarse-to-fine two-stage process for visual-language localization. The coarse stage analyzes input images and localization prompts, detects landmarks, establishes correspondence with a pre-built landmark map, and filters candidates based on visual consistency. The fine stage then uses the query image and coarse output to sample reference map nodes from the offline map, comparing their visual and positional information to directly output the predicted pose.
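As a rough illustration of that two-stage flow, the sketch below strings the stages together; mllm stands in for Astra-Global’s multimodal model, and every helper name (detect_landmarks, visually_consistent, predict_pose) is an assumption made for clarity.

```python
# Hedged sketch of the coarse-to-fine visual-language localization flow.
# All helper names are illustrative; `mllm` stands in for Astra-Global.
def localize(query_image, prompt, map_graph, mllm):
    # Coarse stage: detect landmarks in the query, match them against the
    # pre-built landmark map, and keep only visually consistent candidates.
    detected = mllm.detect_landmarks(query_image, prompt)
    candidates = [lm for lm in map_graph.landmarks.values()
                  if lm.name in detected]
    candidates = [lm for lm in candidates
                  if mllm.visually_consistent(query_image, lm)]

    # Fine stage: sample reference nodes linked to the surviving candidates
    # and let the model compare visual/positional context to emit a pose.
    reference_nodes = [map_graph.nodes[nid]
                       for lm in candidates for nid in lm.node_ids]
    return mllm.predict_pose(query_image, reference_nodes)
```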

For language-based target localization, the model interprets natural language instructions, identifies relevant landmarks using their functional descriptions within the map, and then leverages landmark-to-node association mechanisms to locate relevant nodes, retrieving target images and 6-DoF poses.
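A similarly hedged sketch of the language-driven path, reusing the hypothetical structures above:

```python
# Illustrative flow: instruction -> landmark (matched by functional
# description) -> associated nodes -> goal pose and reference keyframe.
def locate_target(instruction, map_graph, mllm):
    landmark = mllm.match_landmark(instruction, map_graph.landmarks)   # e.g. "the coffee machine"
    nodes = [map_graph.nodes[nid] for nid in landmark.node_ids]        # landmark-to-node association
    best = max(nodes, key=lambda n: mllm.score_view(n.image_path, landmark))
    return best.pose, best.image_path                                  # 6-DoF goal pose + target image
```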

To empower Astra-Global with robust localization abilities, the team employed a meticulous training methodology. Using Qwen2.5-VL as the backbone, they combined Supervised Fine-Tuning (SFT) with Group Relative Policy Optimization (GRPO). SFT involved diverse datasets for various tasks, including coarse and fine localization, co-visibility detection, and motion trend estimation. In the GRPO phase, a rule-based reward function (including format, landmark extraction, map matching, and extra landmark rewards) was used to train for visual-language localization. Experiments showed GRPO significantly improved Astra-Global’s zero-shot generalization, achieving 99.9% localization accuracy in unseen home environments, surpassing SFT-only methods.
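The rule-based reward lends itself to a small sketch. The four terms below mirror the components named above (format, landmark extraction, map matching, extra landmarks), but the weights and per-term scoring rules are assumptions rather than values from the paper.

```python
# Illustrative composite reward for the GRPO phase. Weights and scoring
# rules are assumptions; only the four term names come from the article.
def localization_reward(response, truth, weights=(0.1, 0.3, 0.5, 0.1)):
    w_fmt, w_lmk, w_match, w_extra = weights
    pred = set(response["landmarks"])
    true = set(truth["landmarks"])

    r_format = 1.0 if response["well_formed"] else 0.0            # output follows the required format
    r_landmark = len(pred & true) / max(len(true), 1)             # landmark-extraction recall
    r_match = 1.0 if response["node"] == truth["node"] else 0.0   # matched the correct map node
    r_extra = len(pred - true) / max(len(pred), 1)                # landmarks beyond the ground-truth set
    # How the "extra landmark" term is actually scored (bonus vs. penalty)
    # is not specified here; it is treated as a separate weighted term.

    return (w_fmt * r_format + w_lmk * r_landmark
            + w_match * r_match + w_extra * r_extra)
```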

Astra-Local: The Intelligent Assistant for Local Planning

Astra-Local acts as the intelligent assistant for Astra’s high-frequency tasks, a multi-task network capable of efficiently generating local paths and accurately estimating odometry from sensor data. Its architecture comprises three core components: a 4D spatio-temporal encoder, a planning head, and an odometry head.

The 4D spatio-temporal encoder replaces traditional mobile stack perception and prediction modules. It begins with a 3D spatial encoder that processes N omnidirectional images through a Vision Transformer (ViT) and Lift-Splat-Shoot to convert 2D image features into 3D voxel features. This 3D encoder is trained using self-supervised learning via 3D volumetric differentiable neural rendering. The 4D spatio-temporal encoder then builds upon the 3D encoder, taking past voxel features and future timestamps as input to predict future voxel features through ResNet and DiT modules, providing current and future environmental representations for planning and odometry.
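The encoder’s data flow can be sketched schematically; the stand-in layers below (a strided convolution for the ViT, a linear layer for Lift-Splat-Shoot, a 3D convolution for the ResNet/DiT predictor) are deliberate simplifications, not the modules Astra actually uses.

```python
# Schematic of the 4D encoder's data flow: multi-view images -> per-view
# features -> lifted 3D voxel features -> predicted future voxel features.
# Every layer below is a lightweight stand-in, not the real Astra module.
import torch
import torch.nn as nn

class SpatioTemporalEncoder4D(nn.Module):
    def __init__(self, feat_dim=64, voxel_grid=(32, 32, 8)):
        super().__init__()
        self.voxel_grid = voxel_grid
        self.vit = nn.Conv2d(3, feat_dim, kernel_size=16, stride=16)  # stand-in for the ViT backbone
        self.lift_splat = nn.Linear(feat_dim, feat_dim)               # stand-in for Lift-Splat-Shoot
        self.temporal = nn.Conv3d(feat_dim, feat_dim, 3, padding=1)   # stand-in for the ResNet + DiT predictor

    def encode_3d(self, images):
        """images: (n_views, 3, H, W) omnidirectional views -> (C, X, Y, Z) voxel features."""
        feats = self.vit(images).flatten(2).mean(-1)   # (n_views, C): crude per-view pooling
        fused = self.lift_splat(feats).mean(0)         # (C,): fuse views (the real model lifts to 3D properly)
        x, y, z = self.voxel_grid
        return fused.view(-1, 1, 1, 1).expand(-1, x, y, z).contiguous()

    def predict_future(self, past_voxels):
        """past_voxels: (C, X, Y, Z) -> future voxels; the real predictor is also conditioned on timestamps."""
        return self.temporal(past_voxels.unsqueeze(0)).squeeze(0)

# Example: four surround-view images at 224x224.
enc = SpatioTemporalEncoder4D()
voxels_now = enc.encode_3d(torch.randn(4, 3, 224, 224))
voxels_future = enc.predict_future(voxels_now)   # fed to the planning and odometry heads
```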

The planning head, based on pre-trained 4D features, robot speed, and task information, generates executable trajectories using Transformer-based flow matching. To prevent collisions, the planning head incorporates a masked ESDF loss (Euclidean Signed Distance Field). This loss calculates the ESDF of a 3D occupancy map and applies a 2D ground truth trajectory mask, significantly reducing collision rates. Experiments demonstrate its superior performance in collision rate and overall score on out-of-distribution (OOD) datasets compared to other methods.
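The masked ESDF idea can be illustrated with a simplified 2D version. The real loss operates on a 3D occupancy map inside the learned planner; the SciPy distance transform, the corridor mask around the ground-truth trajectory, and the hinge on clearance below are all assumptions made to keep the example small.

```python
# Simplified 2D illustration of a masked-ESDF collision term.
import numpy as np
from scipy.ndimage import distance_transform_edt

def masked_esdf_loss(pred_traj, gt_traj, occupancy, resolution=0.05, margin=0.3):
    # Distance (in metres) from each free cell to the nearest obstacle.
    esdf = distance_transform_edt(~occupancy.astype(bool)) * resolution
    h, w = esdf.shape

    # 2D mask: only penalize cells in a corridor around the ground-truth trajectory.
    mask = np.zeros_like(esdf, dtype=bool)
    for x, y in gt_traj:
        i, j = int(x / resolution), int(y / resolution)
        mask[max(i - 5, 0):i + 6, max(j - 5, 0):j + 6] = True

    # Hinge on clearance: predicted waypoints closer than `margin` to an obstacle add loss.
    loss = 0.0
    for x, y in pred_traj:
        i, j = int(x / resolution), int(y / resolution)
        if 0 <= i < h and 0 <= j < w and mask[i, j]:
            loss += max(0.0, margin - esdf[i, j])
    return loss / max(len(pred_traj), 1)
```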

The odometry head predicts the robot’s relative pose using current and past 4D features and additional sensor data (e.g., IMU, wheel data). It trains a Transformer model to fuse information from different sensors. Each sensor modality is processed by a specific tokenizer, combined with modality embeddings and temporal positional embeddings…
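The article is cut off at this point, but the surviving description (per-modality tokenizers, modality embeddings, Transformer fusion, relative-pose output) already suggests a rough shape. The sketch below fills in dimensions, pooling, and layer choices that are purely assumptions.

```python
# Rough sketch of the odometry head: one tokenizer per sensor modality,
# learned modality embeddings, Transformer fusion, relative 6-DoF pose out.
import torch
import torch.nn as nn

class OdometryHead(nn.Module):
    def __init__(self, visual_dim=64, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        self.tok_visual = nn.Linear(visual_dim, d_model)  # pooled 4D voxel features
        self.tok_imu = nn.Linear(6, d_model)              # accelerometer + gyroscope
        self.tok_wheel = nn.Linear(2, d_model)            # wheel encoder readings
        self.modality_emb = nn.Parameter(torch.zeros(3, d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.fuse = nn.TransformerEncoder(layer, n_layers)
        self.pose_out = nn.Linear(d_model, 6)             # relative (x, y, z, roll, pitch, yaw)

    def forward(self, visual_feat, imu, wheel):
        # One token per modality; the real model also adds temporal positional embeddings.
        tokens = torch.stack([
            self.tok_visual(visual_feat) + self.modality_emb[0],
            self.tok_imu(imu) + self.modality_emb[1],
            self.tok_wheel(wheel) + self.modality_emb[2],
        ], dim=1)                                         # (batch, 3, d_model)
        fused = self.fuse(tokens).mean(dim=1)             # pool the fused tokens
        return self.pose_out(fused)                       # predicted relative pose

# Example: a batch of 8 pooled visual features plus IMU and wheel readings.
head = OdometryHead()
delta_pose = head(torch.randn(8, 64), torch.randn(8, 6), torch.randn(8, 2))
```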

Content shortened automatically.

🔗 Source: syncedreview.com


🤖 MAROKO133 Note

This article is an automated summary drawn from several trusted sources. We pick trending topics so you always stay up to date.

✅ Next update in 30 minutes: a random theme awaits!

Author: timuna