MAROKO133 Hot ai: How a math theory born in the Cold War might hold clues to when humanity disappears

📌 MAROKO133 Breaking ai: How a math theory born in the Cold War might hold clues to when humanity disappears

Imagine you’re flipping through the history of humanity like a very long movie. You pause at one random frame. You don’t know exactly when you’ve stopped. Perhaps early, perhaps late, perhaps somewhere in the middle. The idea behind the Carter catastrophe says that, because our moment is effectively one such random frame, we are probably somewhere in the middle of the film, not right at the beginning, and (unless we’re very unlucky) not at the very end.

Science writer James Felton explains in an IFLScience article that this concept “rests on us being average observers through time.” In other words, there’s nothing special about our moment (we assume), so our “now” offers a rough gauge of how long humanity has existed and how long it might continue to exist.

How the idea works: a gentle walkthrough

Here’s a way to picture it. Suppose the history of humanity is like a long journey from point A (beginning) to point B (end). If we assume we are a “random” traveler somewhere along this path, then the distance we’ve traveled so far gives a rough hint about how much road remains ahead. If you’ve already traveled a long way, you might not have an extremely long road left, though the exact distance remaining is uncertain.

That’s the essence of the argument. Astrophysicist J. Richard Gott, as quoted by IFLScience, explains it like this: if something has existed for a certain amount of time already, and we have no reason to believe we’re observing it at a special moment in its lifespan (not unusually early, not unusually late), then it’s reasonable to expect that its remaining time will be roughly comparable to the time that has already passed.

In other words, if you come across a structure, a civilization, or even a species that’s been around for a while, the safest assumption is that you’re encountering it somewhere around the middle of its timeline. The past duration gives you a rough clue to its likely future duration, though this is just a probability, not a rule.
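
Put a little more formally (a minimal sketch of Gott’s reasoning as it is usually presented, not a quotation from the article): if the moment of observation is assumed to fall uniformly at random within the total lifespan, then with 95% probability it lands in the middle 95% of that lifespan, which pins the remaining time to within a factor of 39 of the elapsed time.

```latex
r = \frac{t_{\mathrm{past}}}{t_{\mathrm{past}} + t_{\mathrm{future}}} \sim \mathrm{Uniform}(0,1)
\;\;\Longrightarrow\;\;
P(0.025 < r < 0.975) = 0.95
\;\;\Longrightarrow\;\;
\frac{t_{\mathrm{past}}}{39} \;<\; t_{\mathrm{future}} \;<\; 39\, t_{\mathrm{past}} .
```

This interval is presumably the “equation (1)” referenced in the Berlin Wall quote below.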

To demonstrate how this works, Gott applied it to a simple real-world example: the Berlin Wall. When he visited the Wall in 1969, it had stood for about eight years. He reasoned that his visit was a “random moment” in the Wall’s lifespan. The Wall fell twenty years later, giving it a remaining lifetime about 2.5 times the duration it had already existed. That outcome fit comfortably within the range predicted by the logic. As IFLScience reports: “The Wall fell 20 years later, giving t_future = 2.5 t_past, within the 95% confidence limits predicted by equation (1).”
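
As a quick numerical sanity check (a minimal sketch; the figures are the ones quoted above, and the factor-of-39 bounds come from the 95% interval sketched earlier):

```python
# Check the Berlin Wall example against the 95% confidence interval above.
t_past = 8.0        # years the Wall had stood when Gott visited in 1969
t_future = 20.0     # years it actually remained standing afterwards

lower, upper = t_past / 39, t_past * 39   # roughly 0.21 to 312 years

print(f"95% interval for remaining lifetime: {lower:.2f} to {upper:.0f} years")
print(f"Observed: {t_future:.0f} years ({t_future / t_past:.1f} x the past duration); "
      f"inside the interval: {lower < t_future < upper}")
```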

This kind of example helps show how the logic is intended to work. But of course, applying it to humanity is far more fraught.

What it predicts for humanity

The claim becomes bold when that same reasoning is turned to the human species. Using estimates of how many humans have already lived, Gott suggested that the number of humans yet to be born might lie somewhere between about 1.8 billion and 2.7 trillion (in his 1993 estimate), assuming that we are a typical human in the timeline. 
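
Those bounds are what you get by applying the same factor-of-39 interval to the number of people born so far, roughly 70 billion in the figure usually attached to Gott’s 1993 calculation (an assumption here; the article above does not state the number):

```latex
\frac{7\times 10^{10}}{39} \approx 1.8\times 10^{9}
\qquad\text{and}\qquad
39 \times 7\times 10^{10} \approx 2.7\times 10^{12} .
```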

He further noted that if the current birth rate continued (then roughly 145 million births per year), the remaining span for humanity could be on the order of tens of thousands of years, rather than millions or billions: “Combining … with the current rate … we find tf < 19,000 years unless the rate of births drops.”
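
That upper bound is simple division of the figures quoted above:

```latex
t_{\mathrm{future}} \;\lesssim\; \frac{2.7\times 10^{12}\ \text{future births}}{1.45\times 10^{8}\ \text{births per year}} \;\approx\; 1.9\times 10^{4}\ \text{years} \;\approx\; 19{,}000\ \text{years}.
```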

That is eye-opening. It suggests that, under these assumptions, humanity’s remaining time might be far less than many intuitively hope for. But it is important to stress that this is not a precise forecast. It is a probabilistic estimate, built on strong assumptions.

Why treat this idea with caution

There are several reasons why the Carter catastrophe concept should be considered thought-provoking rather than definitive. First, the assumption that we are a “random observer” in human history might not hold. Perhaps we live in a special epoch, one of rapid technological change or explosive population growth.

Second, Felton notes that the definition of “observer” matters. Are only current humans counted? Would future machine-intelligent beings count? If so, the numbers change drastically. Many philosophers and statisticians have also critiqued the logic and assumptions behind the argument, particularly the choice of reference class (which observers the calculation should include).

Third, the method assumes that things like the birth rate, the death rate, and the kind of lives we live remain roughly comparable to the past, but all of that is open to change. For instance, we could find a way to enable artificial reproduction, a medical breakthrough could increase lifespans dramatically, or survival in general might be threatened by forces beyond our control.

Why it matters

Why should you care about this rather abstract-looking idea? Because it encourages us to think differently about our place in time. Rather than assuming humanity will last for eons simply because it has existed for a while, it invites the possibility that our remaining stretch might not be as long as we assume, or at least that our assumptions should be examined.

It is a conversation starter rather than a prophecy, linking simple statistical ideas with deep questions about existence, longevity, and meaning.

The Carter catastrophe argument doesn’t tell us when humanity will end. It doesn’t supply a date or an event. What it offers is a lens: look at how far we’ve come, assume we’re not at an extreme point, and accept that we may not have infinite time ahead. Maybe our remaining time is of the same magnitude as our past time. That doesn’t mean doom is imminent, but it does mean the notion that “we’ll last forever” deserves scrutiny.

🔗 Source: interestingengineering.com


📌 MAROKO133 Update ai: Adobe Research Unlocking Long-Term Memory in Video World Models with State-Space Models

Video world models, which predict future frames conditioned on actions, hold immense promise for artificial intelligence, enabling agents to plan and reason in dynamic environments. Recent advancements, particularly with video diffusion models, have shown impressive capabilities in generating realistic future sequences. However, a significant bottleneck remains: maintaining long-term memory. Current models struggle to remember events and states from far in the past due to the high computational cost associated with processing extended sequences using traditional attention layers. This limits their ability to perform complex tasks requiring sustained understanding of a scene.

A new paper, “Long-Context State-Space Video World Models” by researchers from Stanford University, Princeton University, and Adobe Research, proposes an innovative solution to this challenge. They introduce a novel architecture that leverages State-Space Models (SSMs) to extend temporal memory without sacrificing computational efficiency.

The core problem lies in the quadratic computational complexity of attention mechanisms with respect to sequence length. As the video context grows, the resources required for attention layers explode, making long-term memory impractical for real-world applications. This means that after a certain number of frames, the model effectively “forgets” earlier events, hindering its performance on tasks that demand long-range coherence or reasoning over extended periods.
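
In rough complexity terms (a standard back-of-the-envelope comparison, not figures taken from the paper), the per-layer cost of the two mechanisms scales very differently with the context length L:

```latex
\text{self-attention: } \mathcal{O}(L^{2} d)
\qquad\text{vs.}\qquad
\text{SSM scan: } \mathcal{O}(L\, d\, N),
```

where d is the model width and N is the fixed state dimension, so an SSM carries its memory of the past in a constant-size state rather than in an ever-growing key-value cache.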

The authors’ key insight is to leverage the inherent strengths of State-Space Models (SSMs) for causal sequence modeling. Unlike previous attempts that retrofitted SSMs for non-causal vision tasks, this work fully exploits their advantages in processing sequences efficiently.

The proposed Long-Context State-Space Video World Model (LSSVWM) incorporates several crucial design choices:

  1. Block-wise SSM Scanning Scheme: This is central to their design. Instead of processing the entire video sequence with a single SSM scan, they employ a block-wise scheme. This strategically trades off some spatial consistency (within a block) for significantly extended temporal memory. By breaking down the long sequence into manageable blocks, they can maintain a compressed “state” that carries information across blocks, effectively extending the model’s memory horizon (a minimal toy sketch of this state-carrying idea appears after this list).
  2. Dense Local Attention: To compensate for the potential loss of spatial coherence introduced by the block-wise SSM scanning, the model incorporates dense local attention. This ensures that consecutive frames within and across blocks maintain strong relationships, preserving the fine-grained details and consistency necessary for realistic video generation. This dual approach of global (SSM) and local (attention) processing allows them to achieve both long-term memory and local fidelity.
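
To make the state-carrying idea concrete, here is a minimal, hedged sketch in plain NumPy. It is a 1-D toy, not the paper’s architecture (no video frames, no dense local attention, and the function names are ours); it only demonstrates that scanning a linear SSM block by block, carrying just the compressed state across block boundaries, reproduces the full scan while keeping per-block work bounded.

```python
import numpy as np

def ssm_scan(x, A, B, C, h0):
    """Minimal diagonal linear SSM: h_t = A * h_{t-1} + B * x_t, y_t = C . h_t."""
    h = h0.copy()
    ys = []
    for x_t in x:                # x is a 1-D sequence of scalar inputs
        h = A * h + B * x_t      # elementwise update (diagonal transition), h has shape (N,)
        ys.append(C @ h)         # scalar readout
    return np.array(ys), h

def blockwise_ssm_scan(x, A, B, C, block_size):
    """Scan the sequence block by block, carrying only the compressed state h
    across block boundaries -- the core idea behind block-wise scanning."""
    h = np.zeros_like(B)
    outputs = []
    for start in range(0, len(x), block_size):
        y_block, h = ssm_scan(x[start:start + block_size], A, B, C, h)
        outputs.append(y_block)
    return np.concatenate(outputs)

rng = np.random.default_rng(0)
N = 16                                 # state dimension: constant memory per step
A = rng.uniform(0.8, 0.99, size=N)     # stable diagonal transition
B = rng.normal(size=N)
C = rng.normal(size=N)
x = rng.normal(size=1000)              # a "long" toy input sequence

full, _ = ssm_scan(x, A, B, C, np.zeros(N))
blocked = blockwise_ssm_scan(x, A, B, C, block_size=50)
print("max |full - blocked|:", np.abs(full - blocked).max())   # 0.0: results are identical
```

Because the recurrence is linear, the blocked scan matches the full scan exactly; the interesting trade-offs in the paper come from combining this with attention and from how blocks are laid out over space and time.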

The paper also introduces two key training strategies to further improve long-context performance:

  • Diffusion Forcing: This technique encourages the model to generate frames conditioned on a prefix of the input, effectively forcing it to learn to maintain consistency over longer durations. By sometimes not sampling a prefix and keeping all tokens noised, the training becomes equivalent to diffusion forcing, which is highlighted as a special case of long-context training where the prefix length is zero. This pushes the model to generate coherent sequences even from minimal initial context.
  • Frame Local Attention: For faster training and sampling, the authors implemented a “frame local attention” mechanism. This utilizes FlexAttention to achieve significant speedups compared to a fully causal mask. By grouping frames into chunks (e.g., chunks of 5 with a frame window size of 10), frames within a chunk maintain bidirectionality while also attending to frames in the previous chunk. This allows for an effective receptive field while optimizing computational load (the masking pattern is sketched after this list).
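
The attention pattern itself is easy to picture. Below is a hedged sketch that only builds the boolean frame-level mask implied by the description above (chunk size 5, window 10); it does not use the paper’s FlexAttention implementation, and the helper name is ours.

```python
import numpy as np

def frame_local_attention_mask(num_frames: int, chunk_size: int = 5,
                               window: int = 10) -> np.ndarray:
    """Boolean (num_frames, num_frames) mask: True = query frame may attend to key frame.

    Frames are grouped into chunks of `chunk_size`; attention is bidirectional
    inside a chunk, and each frame may also attend causally to the previous
    `window` frames (itself included), which with these defaults reaches back
    into the previous chunk.
    """
    q = np.arange(num_frames)[:, None]          # query frame index
    k = np.arange(num_frames)[None, :]          # key frame index
    same_chunk = (q // chunk_size) == (k // chunk_size)
    causal_window = (k <= q) & (q - k < window)
    return same_chunk | causal_window

mask = frame_local_attention_mask(20)
print(mask.astype(int))   # block-diagonal chunks plus a causal band into the previous chunk
```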

The researchers evaluated their LSSVWM on challenging datasets, including Memory Maze and Minecraft, which are specifically designed to test long-term memory capabilities through spatial retrieval and reasoning tasks.

The experiments demonstrate that their approach substantially surpasses baselines in preserving long-range memory. Qualitative results, as shown in supplementary figures (e.g., S1, S2, S3), illustrate that LSSVWM can generate more coherent and accurate sequences over extended periods compared to models relying solely on causal attention or even Mamba2 without frame local attention. For instance, on reasoning tasks for the maze dataset, their model maintains better consistency and accuracy over long horizons. Similarly, for retrieval tasks, LSSVWM shows improved ability to recall and utilize information from distant past frames. Crucially, these improvements are achieved while maintaining practical inference speeds, making the models suitable for interactive applications.

The paper “Long-Context State-Space Video World Models” is available on arXiv.

The post Adobe Research Unlocking Long-Term Memory in Video World Models with State-Space Models first appeared on Synced.

🔗 Source: syncedreview.com


🤖 MAROKO133 Note

This article is an automated summary drawn from several trusted sources. We pick trending topics so you always stay up to date without missing anything.

✅ Next update in 30 minutes: a random topic awaits!

Author: timuna