We Recreated the Ghibli-Style "Game" Video That Drew 770,000 Views, Using Three Chinese AI Tools

2025-06-19 11:31:47  Source: Jiqizhixin Pro (機器之心Pro)

Reported by Jiqizhixin (機器之心)

Editor: Yang Wen (楊文)

A Ghibli-style "game" video that went viral on Reddit.

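A while back, GPT-4o went viral as users turned all sorts of photos into Ghibli-style images. Now someone has made a Ghibli-style game video, which at one point climbed onto Reddit's hot list.

a16z partner Justine Moore also reposted the video on X, where it drew more than 770,000 views within a single day.

In her caption she wrote that it would be stunning to create your own virtual world from a prompt and interact with other characters driven by large language models and voice models. The remark hints at AI's potential in game development, especially in generating dynamic, immersive virtual environments.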

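Commenters weighed in below the post:

In fact, making a video like this is not hard. The creator shared the full workflow and even compiled the prompts.

He first generated the images with Midjourney and the videos with Kling 2.1, then used Joystick png to add HUD elements such as on-screen buttons and a minimap, and finally layered in ASMR audio to make it more vivid.

Next, we recreate it with Chinese AI tools.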

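Step 1: Generate the images.

In our earlier review, 《實測完即夢 3.0，我后悔大學(xué)選了設(shè)計專業(yè)……》 ("After testing Jimeng 3.0, I regret majoring in design..."), we pitted Jimeng (即夢) 3.0 against GPT-4o and Ideogram 3.0, and it held its own. Compared with earlier versions, Jimeng 3.0 has improved substantially: its color, layout, and aesthetics are solid, and it renders Chinese and English text correctly almost on the first try, without repeated rerolls.

This time we try Jimeng 3.0's text-to-image feature again.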

Prompt: First-person POV video game screenshot, playing as a young anime protagonist in a slightly oversized white t-shirt and knee-length blue shorts. Visible hands pushing open a sun-faded wooden door, forearms resting on the frame. In a dusty hallway mirror reflection: character's soft Ghibli-style face with windblown hair. Inside a cozy coastal cottage: slanted sunlight through lace curtains, pastel walls with watercolor seascapes, overstuffed bookshelf spilling seashells. Foreground: 'E: Rest' prompt over a quilted sofa. Background: steaming teacup on a driftwood table, open window revealing distant lighthouse and Miyazaki fluffy clouds. Soft painterly textures, slight fisheye lens, identical HUD (minimap corner, health bar)

Prompt: First-person POV video game screenshot, playing as a young anime protagonist in a slightly oversized white t-shirt and knee-length blue shorts. View includes visible hands gripping a steering wheel, sunlit arms resting on car door, and rearview mirror showing character's soft Ghibli-style face with windblown hair. Driving through a vibrant coastal town: cobblestone streets, pastel houses with flower boxes, distant lighthouse. Soft painterly textures, Miyazaki skies with fluffy clouds, slight fisheye lens effect, HUD elements (minimap corner, health bar).

Prompt: First-person POV video game screenshot, playing as a young protagonist in a loose white t-shirt and faded denim shorts. Visible arms holding a woven basket, sneakers stepping on rain-damp cobblestones. Walking through a chaotic Ghibli street market: cramped stalls selling glowing mushrooms, floating lanterns, and spiral-cut fruits. Fishmonger shouts. Soft painterly lighting, depth of field, subtle HUD (minimap corner, health bar). Studio Ghibli meets Grand Theft Auto. (Note: the original prompt was better suited to motion, so we simplified it.)

Prompt: First-person POV video game screenshot, playing as a young anime protagonist in a slightly oversized white t-shirt (salt-stained sleeves) and knee-length blue shorts, visible hands gripping a bamboo fishing rod. Kneeling on a mossy dock pier at sunset, arms resting on knees. Foreground: 'E: Reel In' prompt as line pulls taut. Background: pastel fishing boats, distant lighthouse under Miyazaki’s fluffy clouds. Glowing koi fish breaching turquoise water. A school of fish swims gracefully through crystal-clear water. Identical soft painterly textures, fisheye lens effect, HUD (minimap corner, health bar). (Note: the original prompt was better suited to motion, so we simplified it.)
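The four image prompts above share a fixed skeleton (first-person framing, the protagonist's outfit, the painterly Ghibli look, and HUD elements) and differ mainly in the scene. A minimal Python sketch of that pattern follows. The function name `build_image_prompt` and its parameters are illustrative assumptions of ours, not part of Jimeng or Midjourney, and the output is just a string to paste into whichever tool you use.

```python
# Hypothetical helper: composes a scene-specific prompt from the shared
# skeleton seen in the prompts above. It is not an API of any image tool.

SHARED_OPENING = "First-person POV video game screenshot"
SHARED_STYLE = "Soft painterly textures, slight fisheye lens"
SHARED_HUD = "HUD elements (minimap corner, health bar)"


def build_image_prompt(protagonist: str, scene: str) -> str:
    """Join the fixed opening, the variable parts, and the fixed style/HUD tail."""
    parts = [
        f"{SHARED_OPENING}, playing as {protagonist}",
        scene,
        f"{SHARED_STYLE}, {SHARED_HUD}",
    ]
    # Strip stray whitespace and trailing periods so the join stays consistent.
    cleaned = [p.strip().rstrip(".") for p in parts if p.strip()]
    return ". ".join(cleaned) + "."


if __name__ == "__main__":
    print(build_image_prompt(
        protagonist=(
            "a young anime protagonist in a slightly oversized white t-shirt "
            "and knee-length blue shorts"
        ),
        scene=(
            "Driving through a vibrant coastal town: cobblestone streets, "
            "pastel houses with flower boxes, distant lighthouse"
        ),
    ))
```

Keeping the protagonist and style text constant across scenes is what makes the four stills look like frames from one game; only the `scene` argument should change between calls.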

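Step 2: Generate the videos.

Spurred by Google's Veo 3, Chinese AI video generation models are racing each other again.

Kling (可靈) 2.1 officially launched on May 29; on June 11, ByteDance released the video generation model Seedance 1.0 pro, which is Jimeng Video 3.0 Pro; and early yesterday morning MiniMax released its new video generation model, Hailuo 02.

Both Jimeng and Kling have also launched an AI sound-effects feature: clicking the corresponding button on a generated video automatically produces 3-4 sound effects. Hailuo AI has not yet released this feature.

We put the three video models side by side to see which holds up best.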

Prompt 1: The black-haired boy strides from the rustic house toward the ocean, the camera tracking his movement in a GTA-style third-person perspective as coastal winds flutter white curtains and sunlight glimmers on distant sailboats, blending warm interior details with expanding seaside horizons under a tranquil sky.

Prompt 2: The brown-haired boy drives a vintage blue convertible along the coastal cobblestone street, colorful flower-adorned buildings passing by as the camera follows the car's journey toward the sunlit ocean horizon, sea breeze gently tousling his hair under a serene sky.

Prompt 3: The young boy navigates the bustling cobblestone market, basket of oranges in arm, as vibrant stalls and fluttering awnings frame his journey, the camera tracking his focused stride through chattering crowds under swaying traditional lanterns.

Prompt 4: A school of fish swims gracefully through crystal-clear water, sunlight filtering through the surface, coral reefs swaying gently, creating a serene underwater scene with the camera stationary.
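The comparison amounts to running each of the four video prompts on each of the three models, twelve clips in total. A minimal Python sketch of laying out that grid is below. `plan_jobs` and the job-dictionary layout are hypothetical bookkeeping of our own, the prompt texts are abbreviated to their openings, and nothing here calls any model's actual API.

```python
from itertools import product

MODELS = ["Jimeng 3.0 Pro", "Kling 2.1", "Hailuo 02"]

# Prompt texts abbreviated for illustration; use the full prompts in practice.
PROMPTS = {
    1: "The black-haired boy strides from the rustic house toward the ocean",
    2: "The brown-haired boy drives a vintage blue convertible",
    3: "The young boy navigates the bustling cobblestone market",
    4: "A school of fish swims gracefully through crystal-clear water",
}


def plan_jobs(models, prompts):
    """Return one job per (prompt, model) pair, grouped by prompt number."""
    jobs = []
    for (prompt_id, text), model in product(sorted(prompts.items()), models):
        jobs.append({"prompt_id": prompt_id, "model": model, "prompt": text})
    return jobs


if __name__ == "__main__":
    for job in plan_jobs(MODELS, PROMPTS):
        print(f"Prompt {job['prompt_id']} -> {job['model']}")
```

Grouping by prompt rather than by model keeps the three outputs for the same scene next to each other, which is the order in which they are easiest to judge.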

Finally, here are the finished results:

Jimeng 3.0 Pro:

Kling 2.1:

Hailuo 02:

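Traditional game development cycles are usually long and expensive, especially for high-quality scenes, art assets, and animation, which demand large amounts of labor and time. Take last year's hit AAA title Black Myth: Wukong (《黑神話·悟空》): development cost 15 million to 20 million yuan per hour of gameplay, and the project's total development cost is conservatively estimated at 400 million yuan.

The steady progress of video generation models, meanwhile, opens up disruptive possibilities for the games industry. They could generate story developments and visual styles in real time to match each player, based on text or even on the player's conversational style, choices, and play habits.

For example, Google's GameNGen model uses a diffusion model and reinforcement learning to predict frame sequences without a game engine, generating game visuals dynamically, while GameGen-O can generate storylines in real time based on player choices. This not only changes the game development workflow but also redefines the player experience. Players will no longer be confined to the plot and maps preset by developers; with AI's help they enter a world that can expand at any time, differs from person to person, and is truly "open."

In addition, AI could lower the barrier to making games, encouraging more independent developers and even non-professional users to create. Last year, for instance, the startup BuildBox AI released Buildbox 4 Alpha, an AI game engine in which users add assets and animations to a game simply by entering prompts. This frees up creativity to some extent and may eventually give rise to entirely new business models.

Of course, plenty of technical challenges remain. Real-time content generation requires enormous compute, and balancing quality against response speed is still a major problem: Google's GameNGen supports only simple games such as the 1993 Doom, and with only 3 seconds of history it is prone to visual glitches in complex scenes. Questions such as copyright ownership of AI-generated content and behavioral norms for virtual characters also still need to be settled.

Even so, we are willing to believe that AI + games has great potential. As NVIDIA CEO Jensen Huang has predicted, within the next 5-10 years we may really see games generated entirely by AI.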

Reference links:

https://www.reddit.com/r/midjourney/comments/1lbblfq/ghibli_style_game_guide_included/

https://x.com/venturetwins/status/1934634448244654174
