programming Category

AI audio editing

Found 6 AI tools

Primary Category: programming

Subcategory: AI audio editing

Related AI Tools

Podcastfy

Podcastfy is an open-source Python package that uses generative AI to turn web content, PDF files, and text into engaging multilingual audio conversations. Unlike traditional UI-based tools, Podcastfy is built for programmatic use: it generates conversational audio and transcripts from multiple text sources, and the output can be customized and produced at scale.

gradio, huggingface-spaces, genai, +2 more

seed-vc

seed-vc is a voice conversion model based on the SEED-TTS architecture. It performs zero-shot voice conversion, that is, it can convert speech to a target voice without being trained on that specific speaker. It performs well on audio quality and timbre similarity, and is of interest for both research and applications.

machine learning, audio processing, zero-shot learning, +1 more

whisper-diarization

Whisper-diarization is an open-source project that combines Whisper's automatic speech recognition (ASR) with voice activity detection (VAD) and speaker embeddings. It first extracts the vocals from the audio to improve speaker embedding accuracy, then uses Whisper to generate the transcript and WhisperX to correct and align the timestamps, which reduces diarization errors caused by time offsets. Next, MarbleNet performs VAD and segmentation to exclude silence, and TitaNet extracts speaker embeddings to identify the speaker of each segment. Finally, the speaker segments are matched against the WhisperX timestamps to determine the speaker of each word, and the result is realigned using a punctuation model to compensate for small time shifts.
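The final step described above, matching each word to a speaker by timestamp, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual code: `Word`, `SpeakerTurn`, and `assign_speakers` are hypothetical names, and the rule used here (pick the speaker turn with the largest time overlap, falling back to the nearest turn) is one simple way to do the matching.

```python
from dataclasses import dataclass


@dataclass
class Word:
    # A transcribed word with aligned start/end times in seconds.
    text: str
    start: float
    end: float


@dataclass
class SpeakerTurn:
    # A contiguous span of audio attributed to one speaker.
    speaker: str
    start: float
    end: float


def assign_speakers(words, turns):
    """Label each word with the speaker whose turn overlaps it most.

    Hypothetical helper for illustration; words that fall in a gap
    between turns are given to the turn whose edge is nearest.
    """
    labeled = []
    for w in words:
        best, best_overlap = None, 0.0
        for t in turns:
            overlap = min(w.end, t.end) - max(w.start, t.start)
            if overlap > best_overlap:
                best, best_overlap = t.speaker, overlap
        if best is None and turns:
            # No turn overlaps the word: use the closest turn boundary.
            mid = (w.start + w.end) / 2
            nearest = min(
                turns,
                key=lambda t: min(abs(mid - t.start), abs(mid - t.end)),
            )
            best = nearest.speaker
        labeled.append((w.text, best))
    return labeled


if __name__ == "__main__":
    words = [Word("hello", 0.0, 0.4), Word("there", 0.5, 0.9), Word("hi", 1.2, 1.5)]
    turns = [SpeakerTurn("A", 0.0, 1.0), SpeakerTurn("B", 1.1, 2.0)]
    for text, speaker in assign_speakers(words, turns):
        print(f"{speaker}: {text}")
```

A real pipeline would then group consecutive words by speaker into sentences, which is where a punctuation-based realignment step like the one described above would apply.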

speech recognition, automatic transcription, speaker diarization, +1 more

ElevenLabs Audio Isolation API

Audio Isolation is an online audio processing service from ElevenLabs that separates vocals from background music and noise in an audio file. It is useful in fields such as music production and video post-production, where it can speed up audio editing and improve its quality. The service is offered as an API and can be called from multiple programming languages. Usage is charged per minute of audio processed, counted against the account's character quota; the page does not state a specific price.

audio processing, API service, vocal isolation, +1 more

AudioSeal

AudioSeal is a localized watermarking technique for AI-generated speech, with state-of-the-art robustness and very fast detection. It jointly trains a generator that embeds the watermark and a detector, so it can locate watermarked segments inside longer audio even after the audio has been edited. Its single-pass detector is two orders of magnitude faster than existing models, which makes it well suited to large-scale and real-time applications.

AI-generated, audio editing, robustness, +2 more

LookOnceToHear

LookOnceToHear is a smart headphone interaction system that lets users select the target speaker they want to hear simply by looking at them. The work received a Best Paper honorable mention at CHI 2024. It performs real-time target speech extraction, using audio mixtures synthesized with head-related transfer functions (HRTFs) and binaural room impulse responses (BRIRs).

speech recognition, real-time processing, smart headphones, +1 more

Related Subcategories

Explore other subcategories under programming

Development and Tools: 768 tools

AI model: 465 tools

Code assistant: 368 tools

AI development assistant: 294 tools

Model training and deployment: 140 tools

AI code assistant: 85 tools

Development platform: 66 tools

Research tools: 61 tools
