TOPIC

Voice & Speech AI

Voice cloning, real-time speech recognition (Whisper), executive voice assistants, and the audio-intelligence stack reshaping the private-banking client experience.

OpenVoice · Voice Cloning Technology · AI Synthetic Speech

OpenVoice: Leading Innovation in Voice Cloning Technology

Apr 1, 2024 · Sebastien Rousseau

OpenVoice, from MIT, Tsinghua and MyShell, delivers production-grade voice cloning with fine-grained control over tone, accent and emotion. Here is what it does well, and the trade-offs worth knowing.

OpenAI Whisper · Metal Performance Shaders · macOS Speech Recognition

Fast Real-Time Speech Recognition on macOS: OpenAI Whisper

Mar 12, 2024 · Sebastien Rousseau

How OpenAI Whisper, accelerated with Metal Performance Shaders, runs real-time speech recognition on macOS, and what that means for transcription speed and accuracy.

Àkàndé · OpenAI GPT-4 · Whisper STT

Àkàndé: GPT-Powered Voice Assistant for Executives

Feb 12, 2024 · Sebastien Rousseau

Àkàndé is an open-source Python voice assistant that chains OpenAI Whisper speech recognition, GPT-4 chat completions, and a local SQLite response cache into a voice-driven workflow — generating PDF summaries from conversation history and keeping all stored data local.

Azure Cognitive Services · Speech-to-Text · Neural Acoustic Model

Audio Analyser: Azure Speech, NLP, and Translation Pipeline

Jan 29, 2024 · Sebastien Rousseau

Audio Analyser uses Azure Cognitive Services speech-to-text neural models, Text Analytics NLP, and CherryPy to convert audio recordings into searchable transcripts with sentiment scores, keyword extraction, and multilingual translations.
