Hallo2 – High-resolution lip-sync tool. GitHub – fudan-generative-vision/hallo2: Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
RF-Inversion – Image editing without ControlNet (includes a ComfyUI node). GitHub – LituRout/RF-Inversion: Rectified Flow Inversion (RF-Inversion)
Llama 3.2 Vision Instruct – Detailed installation tutorial. YouTube video: Llama 3.2 Vision Instruct - Installation & Usage Tutorial
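Open-LLM-VTuber – Talk to any LLM with hands-free voice interaction, voice interruption, a Live2D talking face, and long-term memory, running locally across platforms. The LLM inference backend, speech recognition, and speech synthesizer are all designed to be swappable. The project can be configured to run offline on macOS, Linux, and Windows; online LLM/ASR/TTS options are also supported.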
Speech-to-speech – Open-source, modular speech-to-speech pipeline. GitHub – huggingface/speech-to-speech: Speech To Speech: an effort for an open-sourced and modular GPT4-o
kotaemon – An open-source RAG-based tool. GitHub – Cinnamon/kotaemon: An open-source RAG-based tool for chatting with your documents.
CogVideo – Text-to-video generation. GitHub – THUDM/CogVideo: Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)