[AI Industry Weekly] ComfyUI, Pika, and Other Generative AI Tools Release Major Upgrades

The AI field has seen a number of important updates recently. Here is a roundup of this week's key developments:

Major upgrades to image and animation generation tools

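ComfyUI shipped a major update in 2025, with interface improvements that make its tabs and log features more intuitive to use. The closely watched Pika has also released version 2.1, now open to all users, with support for 1080p high-definition video output and smoother, more natural character motion.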

https://github.com/comfyanonymous/ComfyUI

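Hailuo AI, meanwhile, introduced its new T2V-01-Director model, which gives creators more precise control over camera movement and greatly reduces randomness in motion, a welcome development for game and animation creators.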

Getting really real with you…

Pika 2.1 is HERE—crystal-clear 1080p resolution, razor-sharp details, seamless motion, lifelike human characters, and more. If you see it, you’ll believe it.

Try it out now at pika dot art. pic.twitter.com/VIhypQPFjw
— Pika (@pika_labs) January 27, 2025

Competition among large language models heats up

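Alibaba's Qwen 2.5-Max, trained on more than 20 trillion tokens, outperforms DeepSeek V3 on several benchmarks. Google is not standing still either: a new Gemini 2.0 Pro exp 01.28 version has been spotted in Google AI Studio, though it is not yet available in the model selector. Notably, OpenAI has launched "ChatGPT Gov", a version deployed on the Azure platform and aimed specifically at the needs of the U.S. government.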

Qwen2.5-Max: https://qwenlm.github.io/blog/qwen2.5-max

BREAKING 🚨: Gemini 2.0 Pro exp 01.28 is spotted in the wild on Google AI Studio in the Starter apps section. It is not yet available in the model selector 👀

h/t @BartokGabi17 pic.twitter.com/3C2y7i6Z5T
— TestingCatalog News 🗞 (@testingcatalog) January 28, 2025

ChatGPT Gov: https://openai.com/global-affairs/introducing-chatgpt-gov

New tools for game developers

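Good news for game developers: the Seamless PBR Material Generator has been released, supporting generation of complete PBR materials at 1024×1024 resolution. Stable Flow, meanwhile, gives developers more flexible scene editing, enabling a variety of image manipulations by identifying "vital layers".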

Seamless pbr texture generator with all maps: https://civitai.com/articles/11045

The anime-style Animagine XL 4.0 is equally noteworthy. Trained on 8.4 million images over 2,650 GPU hours, it opens up more possibilities for game art design.

Real-world AI applications and controversies

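Tesla showed a scene straight out of a science-fiction film: new cars driving themselves from the factory to their designated loading dock lanes without human intervention, a sign of how far its self-driving technology has matured. The H1 "Fuxi" humanoid robot from China's Unitree Robotics also made its debut at the Spring Festival Gala, a big step closer to the world of "The Robot and I" (《機器人與我》).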

Teslas now drive themselves from their birthplace at the factory to their designated loading dock lanes without human intervention

One step closer to large-scale unsupervised FSD pic.twitter.com/Aj6dHsLaRO
— Tesla AI (@Tesla_AI) January 29, 2025

Unitree H1: Humanoid Robot Makes Its Debut at the Spring Festival Gala 🥰
Hello everyone, let me introduce myself again. I am Unitree H1 "Fuxi".
I am now a comedian at the Spring Festival Gala, hoping to bring joy to everyone.
Let’s push boundaries every day and shape the future… pic.twitter.com/MsFuIo6BL0
— Unitree (@UnitreeRobotics) January 28, 2025

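The AI field has also seen controversy. Chinese startup DeepSeek has been accused of possibly using OpenAI's data improperly for training. OpenAI and Microsoft are currently investigating, and the matter could affect the company's prospects in the United States.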

DeepSeek suspected of improper data use; OpenAI and Microsoft investigating (Nikkei): https://www.nikkei.com/article/DGXZQOGN293P40Z20C25A1000000

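Finally, the low-latency voice interaction demonstrated with Gemini 2.0 lets users quickly create game content with voice commands, opening up new possibilities for game development.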

Check out this real-time screen recording that demonstrates voice interaction with low latency and interruptions using Gemini 2.0. Try it using a simple prompt and create your own game 👾 → https://t.co/ddXNLFkrFd pic.twitter.com/XGTV0K9QUD
— Google AI Developers (@googleaidevs) January 28, 2025

Breakthroughs in fashion and image editing tools

Introducing Model Studio from FASHN—a brand-new suite of tools to create, edit, and manage AI fashion model images.

No more time-consuming re-shoots or limited visuals. Get ready to refresh your on-model photos faster and more flexibly!

More details in the thread below 🧵👇 pic.twitter.com/19XdaPzY1c
— FASHN AI (@fashn_ai) January 27, 2025

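FASHN's Model Studio brings a fresh approach to the fashion industry, letting creators easily create and edit AI fashion model images. It can swap in different models while keeping the same garment and the same pose, and it can freely replace the background, with polished results.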

Innovative applications of imaging technology

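Stable Flow opens a new chapter for image editing. Built on Diffusion Transformer (DiT) models, the system automatically identifies "vital layers" and injects features from other images into those layers, enabling non-rigid editing, object addition and removal, and whole-scene editing. To find the layers that matter most for image generation, it measures how much the image content changes when each layer is skipped.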

Stable Flow: Vital Layers for Training-Free Image Editing: https://arxiv.org/abs/2411.14430
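The layer-skipping measurement described above can be illustrated with a toy sketch. This is not the Stable Flow implementation: the function names (`run_layers`, `rank_vital_layers`), the three toy layers, and the use of plain Euclidean distance as a stand-in for an image-similarity measure are all assumptions made for illustration. The idea shown is only the general one from the text: run the full stack once, rerun it with one layer bypassed at a time, and rank layers by how much the output changes.

```python
import math

def run_layers(x, layers, skip=None):
    """Apply residual layers in order; the layer at index `skip` is bypassed."""
    for i, layer in enumerate(layers):
        if i == skip:
            continue  # skipped layer: its input passes through unchanged
        x = [xi + di for xi, di in zip(x, layer(x))]
    return x

def distance(a, b):
    """Euclidean distance (an assumed stand-in for an image-similarity metric)."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def rank_vital_layers(x, layers):
    """Return layer indices ordered from most to least impact when skipped."""
    reference = run_layers(x, layers)
    scores = []
    for i in range(len(layers)):
        skipped = run_layers(x, layers, skip=i)
        scores.append((distance(reference, skipped), i))
    return [i for _, i in sorted(scores, reverse=True)]

# Hypothetical toy "layers": each returns a residual update for a 2-element input.
toy_layers = [
    lambda x: [0.01 * v for v in x],   # tiny change
    lambda x: [x[1], -x[0]],           # large, direction-changing update
    lambda x: [0.1 * v for v in x],    # moderate change
]

print(rank_vital_layers([1.0, 2.0], toy_layers))  # → [1, 2, 0]
```

In this toy setup, layer 1 ranks first because bypassing it changes the output far more than bypassing either of the others. In a real DiT, the inputs would be latent image tokens and the comparison would be made between generated images rather than short vectors.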

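As AI technology advances rapidly, creative work in these areas can be expected to become more convenient, letting creators focus more on crafting the consumer experience. At the same time, the compliance and legality of data use will remain an issue the industry must handle with care.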

Source: [Generative AI News] "ComfyUI2025", "Pika2.1", "Hailuo T2V-01-Director", "Qwen2.5-Max", "Gemini 2.0 Pro exp 01.28", "ChatGPT Gov", "SeamlessPbrMaterialGenerator", "Gemini 2.0 voice interaction", "Model Studio", "Stable Flow", "Animagine4.0", "Tesla self-driving", "UnitreeH1Robots", "DeepSeek suspected of improper data use"


吹著魔笛的浮士德
