Google Unveils Veo 3.1 to Rival OpenAI’s Sora 2 with Realistic AI Videos and Sound
The AI war is reaching new heights. With every announcement, a new model emerges: more daring, more immersive, more… expensive. In this battle of innovations, Google did not want to remain a spectator. With Veo 3.1, it unveils a video AI armed with sound, dialogue, and new editing capabilities. Facing the viral popularity of Sora 2, the Mountain View firm is playing a different card: narrative precision and creative control.
In brief
- Veo 3.1 integrates audio, dialogue, and sound effects to enrich AI-generated scenes.
- The tool targets serious creators, with editing options and professional formats.
- Three key modules: image composition, creative transitions, and smooth clip extension.
- Google’s AI favors visual coherence, sometimes at the expense of action speed.
Technological duel: Google takes on the reigning champions of AI video
When OpenAI, valued at $500 billion without an IPO, launched Sora 2 on September 30, the success was immediate. The app was downloaded more than one million times in just five days, climbing to the top of the App Store. Its approach? A “TikTok-ized” interface, designed for sharing and remixing.
Google did not choose this path. With Veo 3.1, the goal is clear: address creators, not influencers. The model generates videos at 1080p resolution, in horizontal or vertical format, with ambient sound, synchronized voices, and realistic effects. Accessible via Flow, Vertex AI, and the Gemini API, it comes in two tiers: a fast version at $0.15 per second and a standard one at $0.40 per second.
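For developers, access through the Gemini API takes the form of a long-running job that is submitted and then polled. The Python sketch below follows the general pattern of Google's google-genai SDK; the model identifier and config fields are assumptions made for illustration, not a verified reference for Veo 3.1.

```python
# Minimal sketch of text-to-video through the Gemini API with the google-genai SDK.
# The model id "veo-3.1-generate-preview" and the config fields are assumptions made
# for illustration; check the current API reference before relying on them.
import time

from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

operation = client.models.generate_videos(
    model="veo-3.1-generate-preview",  # assumed model identifier
    prompt=(
        "Night market in the rain, handheld camera. "
        "Audio: rain on tarpaulins, sizzling oil, low crowd murmur."
    ),
    config=types.GenerateVideosConfig(
        aspect_ratio="16:9",           # horizontal; "9:16" for vertical clips
        number_of_videos=1,
    ),
)

# Video generation runs as a long-running job: poll the operation until it finishes.
while not operation.done:
    time.sleep(20)
    operation = client.operations.get(operation)

# Download and save the first generated clip.
video = operation.response.generated_videos[0].video
client.files.download(file=video)
video.save("veo_clip.mp4")
```

At the published rates, an eight-second clip (the length Veo typically produces by default) would come to roughly $1.20 on the fast tier and $3.20 on the standard tier.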
The firm emphasizes the audio capabilities, now present in all modules, and promises unprecedented rendering: according to Google, Veo 3.1's lip synchronization surpasses that of every other model.
Where Sora favors visual dynamism, Veo chooses coherence. Movements are slower, but elements remain stable; that is the price of precision. This positioning contrasts with the ambitions of Meta and Luma Labs, which focus more on speed and wow factor.
Stories that speak: Google’s AI wants to tell them
One of Veo 3.1’s major bets is narrative immersion. Adding sound lets Google take a step forward: no longer just illustrating, but telling stories with images and voices. Three features stand out, with a rough code sketch after the list:
- Ingredients to Video: you combine several reference images, and the AI generates a scene with objects and characters;
- Frames to Video: you provide a starting image and an ending one, and the AI produces a coherent transition;
- Extend: the AI extends a clip by generating the continuation from the last second.
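By way of illustration, the “Frames to Video” idea maps naturally onto a first-frame/last-frame call. The sketch below reuses the same assumed SDK pattern as the earlier example; the image and last_frame parameters in particular are assumptions about how the feature might be exposed, not confirmed API fields.

```python
# Hedged sketch of "Frames to Video": supply a starting and an ending image and ask
# the model to interpolate a coherent transition between them. The image/last_frame
# parameters are assumptions about how the feature is exposed programmatically.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

def load_image(path: str, mime: str = "image/png") -> types.Image:
    """Wrap a local file as an inline image payload."""
    with open(path, "rb") as f:
        return types.Image(image_bytes=f.read(), mime_type=mime)

operation = client.models.generate_videos(
    model="veo-3.1-generate-preview",           # assumed model identifier
    prompt="Slow push-in from the empty street toward the lit doorway as rain picks up",
    image=load_image("shot_start.png"),         # starting frame
    config=types.GenerateVideosConfig(
        last_frame=load_image("shot_end.png"),  # assumed field for the ending frame
        aspect_ratio="16:9",
    ),
)
# Poll the operation as in the earlier sketch, then download the generated clip.
```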
The tool also allows adding or removing elements while taking shadows and lighting into account. This level of detail is the strength of the approach: a film studio inside an artificial intelligence interface.
But not everything is perfect. When instructions stray too far from visual logic, the AI goes off track: some scenes jump from one shot to another, lose characters, or completely change atmosphere. It remains a technology under development.
As Google explained in its official blog:
We’re also introducing Veo 3.1, which brings richer audio, more narrative control, and enhanced realism that captures true-to-life textures.
Veo 3.1 does not just want to entertain: it wants to move the viewer. And that is probably where it differs most radically from its competitors.
Demanding UX, stunning results: when artificial intelligence becomes a creative tool
The user experience of Veo 3.1 is not that of a social network. It is not a product to consume but a tool to master. Creators must learn to speak the AI’s language: a poorly written prompt, or one too far removed from the reference images, can produce an incoherent result.
Some tips are already circulating among users: generating a faithful initial image with Seedream before importing it into Veo, or writing audio-aware prompts that explicitly mention the desired sounds.
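As an example of that second tip, an audio-aware prompt spells out the dialogue and soundscape alongside the visuals so the model has concrete cues to synchronize; the wording below is purely illustrative, not an official guideline.

```python
# Purely illustrative audio-aware prompt: visuals, dialogue, and sound design are
# named explicitly so the generated audio has concrete cues to follow.
prompt = (
    "Medium shot, a street vendor under a tarpaulin in a rainy night market. "
    "Dialogue: she says, 'Last batch of the night, still warm.' "
    "Audio: rain drumming on the tarpaulin, oil sizzling in a wok, "
    "distant traffic, low crowd murmur."
)
```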
Here are some concrete figures:
- Veo has generated more than 275 million videos since the launch of Flow;
- Three creative modules are available: Ingredients, Frames, Extend;
- The usage cost is up to half that of Sora 2 Pro;
- Videos can last up to one minute, with integrated sound;
- Only three models handle spoken voices: Sora, Grok, and now Veo.
The tool is not easily tamed. But once understood, it delivers videos of rare realism, with accurate intonations and credible characters. It just requires patience, skill… and some credits.
Google no longer hides its ambition to dominate generative AI. Veo 3.1 shows that the firm does not just want to follow; it wants to set the tempo. And as if to confirm this appetite, one of its AI systems has just solved a math problem long considered impossible. The message is clear: the AI giant is only starting to speak.