📌 MAROKO133 Update ai: Hidden in plain sight: Open-source maps track America’s pow
A recent report from 404 Media highlights how a small research team is quietly mapping the rapid expansion of America’s datacenter infrastructure, using publicly available information and satellite imagery to track facilities often missing from public debate.
The work comes from Epoch AI, a non-profit research institute focused on understanding the scale and pace of artificial intelligence development.
Its researchers use open-source intelligence to identify, analyze, and document datacenters rising across the United States.
By reviewing satellite images, construction permits, and local regulatory filings, the team builds an interactive map that estimates cost, ownership, and power consumption.
The project offers rare visibility into an industry that is expanding faster than public scrutiny can keep pace.
Mapping hidden infrastructure
Datacenter construction has become a major flashpoint across the country. The facilities demand vast amounts of electricity and water.
Many communities only learn about them after construction begins.
Epoch AI’s map places visual markers over known sites, each linking to satellite views and project details. One green circle appears over New Albany, Ohio, marking Meta’s “Prometheus” datacenter complex, which Epoch AI estimates has cost $18 billion so far and draws 691 megawatts of power.
“A combination of weatherproof tents, colocation facilities and Meta’s traditional datacenter buildings, this datacenter shows Meta’s pivot towards AI,” Epoch said in its notes.
Users can scroll through a timeline and watch the complex grow. Satellite images show new buildings and cooling systems added over time.
How the estimates work
Much of Epoch AI’s analysis focuses on cooling infrastructure. Modern AI systems generate extreme heat. Datacenters often place cooling units outside buildings or across rooftops.
“Modern AI data centers generate so much heat that the cooling equipment extends outside the buildings,” Epoch AI explained on its website.
The team counts fans, measures their size, and analyzes their placement.
It feeds those details into a custom model to estimate energy use. That power estimate then helps infer compute capacity and construction cost.
“We focus on cooling because it’s a very useful clue for figuring out the power consumption,” Jean-Stanislas Denain, a senior researcher at Epoch AI, told 404 Media.
The model carries uncertainty. Fan speed and configuration vary widely.
Epoch AI says real cooling capacity could be as much as double, or as little as half, its estimates.
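To make that chain of inference concrete, here is a minimal sketch of how an estimate might flow from visible cooling hardware to power, compute, and cost. Epoch AI has not published its model, so every constant below (airflow per fan, heat removed per unit of airflow, watts per accelerator, cost per megawatt) is an illustrative assumption, not a figure from the project.

```python
import math


def estimate_datacenter_power(num_fans, fan_diameter_m,
                              airflow_per_m2=4.0,        # m^3/s of air per m^2 of fan area (assumed)
                              heat_per_airflow=12_000.0  # watts of heat rejected per m^3/s (assumed)
                              ):
    """Rough, illustrative estimate of IT power from visible cooling fans.

    All constants are placeholders; Epoch AI's actual model is not public,
    and the group itself notes real capacity may be roughly 2x higher or lower.
    """
    fan_area = math.pi * (fan_diameter_m / 2) ** 2         # m^2 per fan
    total_airflow = num_fans * fan_area * airflow_per_m2   # m^3/s moved by all fans
    cooling_watts = total_airflow * heat_per_airflow       # heat the fans can reject
    return cooling_watts / 1e6                             # assume cooling ~ IT load, in MW


def downstream_estimates(it_power_mw,
                         watts_per_gpu=1_200.0,  # per-accelerator draw incl. overhead (assumed)
                         capex_per_mw=25e6):     # construction dollars per MW (assumed)
    """Turn a power estimate into rough compute and cost figures."""
    gpus = it_power_mw * 1e6 / watts_per_gpu
    cost_usd = it_power_mw * capex_per_mw
    return gpus, cost_usd


if __name__ == "__main__":
    power = estimate_datacenter_power(num_fans=400, fan_diameter_m=1.8)
    gpus, cost = downstream_estimates(power)
    print(f"~{power:.0f} MW, ~{gpus:,.0f} accelerators, ~${cost / 1e9:.1f}B")
```

The point of the sketch is the structure, not the numbers: each visible quantity feeds an assumed conversion factor, which is why the published estimates carry the factor-of-two uncertainty noted above.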
What remains unseen
The map remains incomplete. State and local disclosure laws vary. Some projects avoid publicity. Smaller facilities often escape detection.
Epoch AI estimates the current dataset represents about 15 percent of global AI compute delivered by chipmakers as of November 2025.
Zooming out reveals markers across the country. One near Memphis, Tennessee, points to xAI’s Colossus 2 project.
Epoch AI notes the company installed natural gas turbines across the Mississippi border, likely to secure faster approval.
“Based on this, and on earlier tweets from Elon Musk, 110,000 NVIDIA GB200 GPUs are operational,” Epoch AI wrote.
Even detailed mapping leaves blind spots.
“Even if we have a perfect analysis of a data center, we may still be in the dark about who uses it, and how much they use,” the group said.
Epoch AI plans to expand its search globally.
The project aims to shed light on infrastructure shaping the future economy, often without public visibility.
🔗 Source: interestingengineering.com
📌 MAROKO133 Update ai: Adobe Research Unlocking Long-Term Memory in Video World Models
Video world models, which predict future frames conditioned on actions, hold immense promise for artificial intelligence, enabling agents to plan and reason in dynamic environments. Recent advancements, particularly with video diffusion models, have shown impressive capabilities in generating realistic future sequences. However, a significant bottleneck remains: maintaining long-term memory. Current models struggle to remember events and states from far in the past due to the high computational cost associated with processing extended sequences using traditional attention layers. This limits their ability to perform complex tasks requiring sustained understanding of a scene.
A new paper, “Long-Context State-Space Video World Models” by researchers from Stanford University, Princeton University, and Adobe Research, proposes an innovative solution to this challenge. They introduce a novel architecture that leverages State-Space Models (SSMs) to extend temporal memory without sacrificing computational efficiency.
The core problem lies in the quadratic computational complexity of attention mechanisms with respect to sequence length. As the video context grows, the resources required for attention layers explode, making long-term memory impractical for real-world applications. This means that after a certain number of frames, the model effectively “forgets” earlier events, hindering its performance on tasks that demand long-range coherence or reasoning over extended periods.
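As a rough illustration of that scaling (the numbers below are arbitrary assumptions, not figures from the paper), the attention score matrix grows with the square of the token count, so doubling the retained video context roughly quadruples the attention cost:

```python
def full_attention_cost(num_frames, tokens_per_frame=256, head_dim=64):
    """Illustrative O(n^2) cost of full self-attention over a video.

    tokens_per_frame and head_dim are assumed values for the sketch;
    the quadratic growth in n is the point.
    """
    n = num_frames * tokens_per_frame
    score_entries = n * n                  # entries in the attention matrix
    flops_per_head = 2 * n * n * head_dim  # QK^T multiply for one head
    return score_entries, flops_per_head


for frames in (16, 64, 256, 1024):
    entries, flops = full_attention_cost(frames)
    print(f"{frames:5d} frames -> {entries:.2e} score entries, ~{flops:.2e} FLOPs/head")
```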
The authors’ key insight is to leverage the inherent strengths of State-Space Models (SSMs) for causal sequence modeling. Unlike previous attempts that retrofitted SSMs for non-causal vision tasks, this work fully exploits their advantages in processing sequences efficiently.
The proposed Long-Context State-Space Video World Model (LSSVWM) incorporates several crucial design choices:
- Block-wise SSM Scanning Scheme: This is central to their design. Instead of processing the entire video sequence with a single SSM scan, they employ a block-wise scheme. This strategically trades off some spatial consistency (within a block) for significantly extended temporal memory. By breaking down the long sequence into manageable blocks, they can maintain a compressed “state” that carries information across blocks, effectively extending the model’s memory horizon.
- Dense Local Attention: To compensate for the potential loss of spatial coherence introduced by the block-wise SSM scanning, the model incorporates dense local attention. This ensures that consecutive frames within and across blocks maintain strong relationships, preserving the fine-grained details and consistency necessary for realistic video generation. This dual approach of global (SSM) and local (attention) processing allows them to achieve both long-term memory and local fidelity (a minimal sketch combining the two components follows this list).
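The sketch below is hypothetical and is not the paper’s implementation: a toy diagonal linear recurrence stands in for a Mamba-style selective SSM, and vanilla multi-head attention stands in for the dense local attention. The class name, block length, and all dimensions are invented for illustration. The structural idea it shows is that only the compressed recurrent state crosses block boundaries, while attention stays confined to each block.

```python
import torch
import torch.nn as nn


class BlockwiseSSMWithLocalAttention(nn.Module):
    """Hypothetical sketch: block-wise SSM scan plus dense local attention."""

    def __init__(self, dim: int, block_len: int, num_heads: int = 4):
        super().__init__()
        self.block_len = block_len
        # Diagonal SSM parameters: h_t = a * h_{t-1} + b * x_t ; y_t = c * h_t
        self.a = nn.Parameter(torch.rand(dim) * 0.9)   # per-channel decay
        self.b = nn.Parameter(torch.ones(dim))
        self.c = nn.Parameter(torch.ones(dim))
        self.local_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, dim), where time = frames * tokens_per_frame, flattened
        B, T, D = x.shape
        state = torch.zeros(B, D, device=x.device)      # carried across blocks
        outputs = []
        for start in range(0, T, self.block_len):
            block = x[:, start:start + self.block_len]   # (B, L, D)
            # 1) Sequential SSM scan inside the block; state is NOT reset,
            #    so information flows across block boundaries in compressed form.
            ssm_out = []
            for t in range(block.shape[1]):
                state = self.a * state + self.b * block[:, t]
                ssm_out.append(self.c * state)
            ssm_out = torch.stack(ssm_out, dim=1)
            # 2) Dense (bidirectional) attention restricted to this block,
            #    restoring fine-grained consistency locally.
            attn_out, _ = self.local_attn(ssm_out, ssm_out, ssm_out)
            outputs.append(self.norm(ssm_out + attn_out))
        return torch.cat(outputs, dim=1)


if __name__ == "__main__":
    model = BlockwiseSSMWithLocalAttention(dim=32, block_len=16)
    video_tokens = torch.randn(2, 64, 32)   # 2 clips, 64 tokens each
    print(model(video_tokens).shape)        # torch.Size([2, 64, 32])
```

Because the per-block cost is fixed, total compute grows linearly with the number of blocks rather than quadratically with the full token count, which is what makes the extended memory horizon affordable.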
The paper also introduces two key training strategies to further improve long-context performance:
- Diffusion Forcing: This training strategy encourages the model to generate frames conditioned on a prefix of the input, forcing it to maintain consistency over longer durations. When no prefix is sampled and all tokens remain noised, training reduces to standard diffusion forcing, which the authors treat as the special case of long-context training with a prefix length of zero. This pushes the model to generate coherent sequences even from minimal initial context.
- Frame Local Attention: For faster training and sampling, the authors implemented a “frame local attention” mechanism that uses FlexAttention to achieve significant speedups over a fully causal mask. Frames are grouped into chunks (e.g., chunks of 5 with a frame window size of 10); frames within a chunk attend to each other bidirectionally and also attend to frames in the previous chunk, giving an effective receptive field while reducing computational load (a sketch of such a mask appears after this list).
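Below is a small, hypothetical sketch of the attention pattern this describes, written as a plain boolean mask rather than as the paper’s FlexAttention kernel. The chunk size of 5 frames comes from the example above; the function name and token layout are assumptions made for illustration.

```python
import torch


def frame_local_attention_mask(num_frames: int,
                               tokens_per_frame: int,
                               chunk_frames: int = 5) -> torch.Tensor:
    """Boolean mask for a chunked 'frame local attention' pattern.

    True marks allowed query/key pairs. Frames inside a chunk attend to each
    other bidirectionally; every chunk may also attend to the previous chunk,
    giving a sliding receptive field of two chunks (e.g., 10 frames when
    chunk_frames=5). Illustrative reimplementation, not the paper's kernel.
    """
    T = num_frames * tokens_per_frame
    token_frame = torch.arange(T) // tokens_per_frame   # frame index of each token
    token_chunk = token_frame // chunk_frames            # chunk index of each token
    q_chunk = token_chunk[:, None]                       # (T, 1)
    kv_chunk = token_chunk[None, :]                      # (1, T)
    same_chunk = q_chunk == kv_chunk
    prev_chunk = q_chunk == kv_chunk + 1
    return same_chunk | prev_chunk                       # (T, T) boolean


if __name__ == "__main__":
    mask = frame_local_attention_mask(num_frames=15, tokens_per_frame=2, chunk_frames=5)
    # True marks allowed positions; check the sign convention of whichever
    # attention API consumes the mask before passing it in.
    print(mask.shape, mask.float().mean().item())
```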
The researchers evaluated their LSSVWM on challenging datasets, including Memory Maze and Minecraft, which are specifically designed to test long-term memory capabilities through spatial retrieval and reasoning tasks.
The experiments demonstrate that their approach substantially surpasses baselines in preserving long-range memory. Qualitative results, as shown in supplementary figures (e.g., S1, S2, S3), illustrate that LSSVWM can generate more coherent and accurate sequences over extended periods compared to models relying solely on causal attention or even Mamba2 without frame local attention. For instance, on reasoning tasks for the maze dataset, their model maintains better consistency and accuracy over long horizons. Similarly, for retrieval tasks, LSSVWM shows improved ability to recall and utilize information from distant past frames. Crucially, these improvements are achieved while maintaining practical inference speeds, making the models suitable for interactive applications.
The paper “Long-Context State-Space Video World Models” is available on arXiv.
🔗 Source: syncedreview.com
🤖 MAROKO133 Note
This article is an automated summary drawn from several trusted sources. We pick trending topics so you stay up to date without missing anything.
✅ The next update arrives in 30 minutes, with a random theme waiting!
