SNAPSHOT TESTING WITH SYRUPY IN AWS CDK PYTHON
In 2023, I wrote a post on snapshot testing in AWS CDK Python. Since then, I’ve switched from pytest-snapshot to syrupy. At Defiance Digital, we use the AWS CDK for almost everything. Generally we use TypeScript because it’s the original language for the CDK, every other language binding is generated from the TypeScript source via JSII, and it has the most compatibility with the CDK. However, we have a few projects that use Python, and on those I’ve really been missing Jest snapshot testing.
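To make that concrete, here is a minimal sketch of what a syrupy snapshot test for a CDK stack can look like. The `MyStack` construct and its module path are hypothetical placeholders; the syrupy `snapshot` fixture and the CDK `Template` assertions module are the real, documented APIs.

```python
import aws_cdk as cdk
import pytest
from aws_cdk.assertions import Template
from syrupy.extensions.json import JSONSnapshotExtension

from my_project.my_stack import MyStack  # hypothetical stack under test


@pytest.fixture
def snapshot_json(snapshot):
    # Store snapshots as readable JSON rather than syrupy's default format
    return snapshot.use_extension(JSONSnapshotExtension)


def test_stack_matches_snapshot(snapshot_json):
    app = cdk.App()
    stack = MyStack(app, "TestStack")
    # Synthesize the stack to a CloudFormation template and diff it
    # against the stored snapshot
    assert Template.from_stack(stack).to_json() == snapshot_json
```

The first run writes the snapshot to disk; later runs fail on any template drift, and `pytest --snapshot-update` refreshes it, much like Jest's `--updateSnapshot`.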
TTS OPTIONS FOR OPENVOICEOS
OpenVoiceOS, or OVOS, provides a flexible plugin system for text-to-speech (TTS) synthesis. This system allows you to choose the TTS engine that best suits your needs, whether you prioritize voice quality, speed, privacy, or language support.

Running Plugins Directly

OVOS maintains several TTS plugins that can run directly on your voice assistant hardware. Here are the currently maintained plugins, categorized by their primary use case:

| Hardware/Environment | Model | GitHub Link | Notes |
| --- | --- | --- | --- |
| GPU-Optimized | Coqui TTS | ovos-tts-plugin-coqui | Local TTS using Coqui models; best performance with GPU, but some models work on CPU |
| CPU-Capable | Edge TTS | ovos-tts-plugin-edge-tts | Microsoft Edge TTS voices; requires internet connection but provides fast, high-quality output and streaming capability |
| CPU-Capable | Piper | ovos-tts-plugin-piper | Local TTS with multiple voice options; good performance on modern hardware (optimized for Raspberry Pi) |
| CPU-Capable | Mimic | ovos-tts-plugin-mimic | Classic Mycroft offline TTS engine; lightweight but robotic sounding |
| CPU-Capable | ESpeak-NG | ovos-tts-plugin-espeakNG | Lightweight offline TTS; supports many languages but robotic sounding |
| CPU-Capable | Pico | ovos-tts-plugin-pico | Very lightweight offline TTS engine |
| API-Based/Cloud | Azure | ovos-tts-plugin-azure | Microsoft Azure Cognitive Services TTS; paid service with high quality but no guarantee of privacy |
| API-Based/Cloud | Polly | ovos-tts-plugin-polly | Amazon Polly TTS service; paid service with high quality but no guarantee of privacy |
| API-Based/Cloud | MaryTTS | ovos-tts-plugin-marytts | Connects to a MaryTTS server, which can be self-hosted; considered outdated and not well maintained |
| Language-Specific | Nòs | ovos-tts-plugin-nos | Galician language TTS |
| Language-Specific | Cotovia | ovos-tts-plugin-cotovia | Alternative Galician language TTS |
| Language-Specific | Matxa | ovos-tts-plugin-matxa-multispeaker-cat | Catalan language multi-speaker TTS |
| Novelty | BeepSpeak | ovos-tts-plugin-beepspeak | R2D2-style robotic sounds; works well but requires subtitles to be understood |
| Novelty | SAM | ovos-tts-plugin-SAM | Software Automatic Mouth, retro-style speech synthesis; too digitized for regular use |

Note: While Mimic2 and Mimic3-server plugins exist in archived form, they are no longer supported.
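Whichever engine you pick, enabling it follows the same pattern: install the plugin, then point the `tts` section of your configuration at it. A minimal `mycroft.conf` sketch, assuming Piper; plugin-specific options such as voice selection generally live in a sub-section named after the plugin:

```json
{
  "tts": {
    "module": "ovos-tts-plugin-piper"
  }
}
```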
STT OPTIONS FOR OPENVOICEOS
OpenVoiceOS, or OVOS, is the spiritual successor (and fork) to the Mycroft voice assistant. It is a privacy-focused, open-source voice assistant that you can run on your own hardware. OVOS has a plugin system that allows you to swap out the default speech-to-text (STT) engine for one that you prefer. The plugin system, while powerful, can be confusing due to the sheer number of options available. This post will cover some of the STT options available to you and when you might use them.
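As a preview of how lightweight the swap is, here is a minimal `mycroft.conf` sketch that switches the STT engine; the plugin name is just an example, and it must be installed before it can be selected:

```json
{
  "stt": {
    "module": "ovos-stt-plugin-server"
  }
}
```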
BUILDING VOICE ASSISTANT CONFIGURATIONS: ADVANCED OVOS SETUPS
After exploring each component of OVOS and Neon assistants, let’s examine how to mix and match these components to create custom configurations. The modular nature of OVOS allows for setups ranging from minimal text-only systems to complex distributed networks.

Text-Only Assistants

The simplest possible configuration requires just two components:

- Message bus
- Core/skills service

This minimal setup can be useful for:

- Development and testing
- Accessibility (vision/hearing impaired users)
- Integration with existing text interfaces
- Command-line or web-based interaction

In this configuration, the message flow is straightforward: a text utterance goes in over the bus, the core/skills service handles it, and the response comes back as a speak message.
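Here is a minimal sketch of that round trip using the `ovos-bus-client` library, assuming the message bus and core/skills service are already running locally:

```python
import time

from ovos_bus_client import Message, MessageBusClient

client = MessageBusClient()  # connects to the local bus (port 8181 by default)
client.run_in_thread()

# Print whatever the assistant would have spoken aloud
client.on("speak", lambda message: print(message.data["utterance"]))

# Inject a typed utterance, exactly as the listener service would
client.emit(Message("recognizer_loop:utterance",
                    {"utterances": ["what time is it"], "lang": "en-us"}))

time.sleep(5)  # give the skills service time to answer before exiting
```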
THE VOICE ASSISTANT'S BODY: UNDERSTANDING OVOS HARDWARE INTEGRATION
While previous articles covered how your assistant listens, thinks, and speaks, now we’ll explore how it interacts with physical hardware through the Platform Hardware Abstraction Layer (PHAL) system.

Evolution of Hardware Support

The PHAL system’s history helps explain its design. Originally, Mycroft AI’s code was tightly coupled to their Mark 1 device, which included:

- An LED panel for eyes and mouth
- A custom Arduino-controlled circuit board
- A specific audio hardware configuration

When Mycroft developed their Mark 2 device, they discovered this tight coupling made supporting new hardware difficult.
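To see how plugins avoid that coupling, here is a minimal sketch of a PHAL plugin based on the `PHALPlugin` template from ovos-plugin-manager; the class name and LED behavior are hypothetical:

```python
from ovos_plugin_manager.templates.phal import PHALPlugin


class ExampleLedPlugin(PHALPlugin):
    def __init__(self, bus=None, config=None):
        super().__init__(bus=bus, name="phal-plugin-example-led", config=config)
        # Hardware reactions are wired to bus events, not to a specific device
        self.bus.on("recognizer_loop:wakeword", self.handle_wakeword)

    def handle_wakeword(self, message):
        # e.g., pulse an LED ring while the assistant is listening
        pass
```

Because each device's quirks live in a plugin like this rather than in core, supporting new hardware means writing a new plugin instead of patching the assistant itself.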
THE VOICE ASSISTANT'S BRAIN: UNDERSTANDING OVOS SKILLS
After exploring how your assistant communicates, speaks, listens, and controls hardware, let’s examine how it processes and responds to commands through the core/skills service.

Core Service Overview

The core service (called either “core” or “skills” depending on your OVOS implementation) coordinates two main components:

- Intent Engine
- Skills System

These components work together to understand user requests and execute appropriate actions.

Intent Engine

The intent engine matches user requests with the appropriate skill.
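For a feel of how the two halves meet, here is a minimal sketch of a skill whose handler is bound to an intent, using the documented ovos-workshop classes; the skill itself and its intent file are hypothetical:

```python
from ovos_workshop.decorators import intent_handler
from ovos_workshop.skills import OVOSSkill


class HelloWorldSkill(OVOSSkill):
    # The intent engine matches utterances against hello.world.intent
    # (a Padatious intent file shipped in the skill's locale folder)
    # and invokes this handler on a match
    @intent_handler("hello.world.intent")
    def handle_hello_world(self, message):
        self.speak("Hello, world!")
```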
THE VOICE ASSISTANT'S EARS: UNDERSTANDING OVOS LISTENER SERVICES
Continuing our exploration of OVOS and Neon.AI components, let’s examine how your assistant hears and understands spoken commands. The listener service (ovos-dinkum-listener) is like your assistant’s ears and early speech processing - it handles everything from detecting wake words to converting speech to text.

Listener Architecture

The listener service coordinates four critical components:

- Microphone input
- Wake word detection
- Voice Activity Detection (VAD)
- Speech-to-Text (STT)

These components communicate through the message bus we discussed in the previous article, working together to turn audio into text commands your assistant can process.
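The last three of these are themselves plugins chosen in configuration. A minimal `mycroft.conf` sketch, with the plugin names given purely as examples:

```json
{
  "listener": {
    "wake_word": "hey_mycroft",
    "VAD": {
      "module": "ovos-vad-plugin-silero"
    }
  },
  "hotwords": {
    "hey_mycroft": {
      "module": "ovos-ww-plugin-precise-lite"
    }
  },
  "stt": {
    "module": "ovos-stt-plugin-server"
  }
}
```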
THE VOICE ASSISTANT'S MOUTH: UNDERSTANDING OVOS AUDIO SERVICES
After exploring how your assistant listens in our previous article, let’s look at how it speaks and plays audio. The audio service (ovos-audio) handles all sound output, from spoken responses to music playback.

Audio Service Overview

Just as the listener service coordinates multiple components for hearing, the audio service manages two main components:

- Text-to-Speech (TTS)
- Audio playback

These components communicate through the message bus we covered in part 1, responding to requests from skills and other services.
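Both halves can be exercised directly over the bus. A minimal sketch, assuming a running local stack; the track URL is a hypothetical placeholder, and the playback message type is the legacy Mycroft audio API:

```python
from ovos_bus_client import Message, MessageBusClient

client = MessageBusClient()
client.run_in_thread()

# TTS: ovos-audio synthesizes this with the configured TTS plugin
client.emit(Message("speak", {"utterance": "Hello from the audio service"}))

# Playback: queue a track on the configured audio backend
client.emit(Message("mycroft.audio.service.play",
                    {"tracks": ["https://example.com/stream.mp3"]}))
```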
THE VOICE ASSISTANT'S NERVOUS SYSTEM: UNDERSTANDING THE OVOS MESSAGE BUS
OpenVoiceOS (OVOS) and Neon AI offer powerful options for creating private, local voice assistants. While most users can get started with ovos-installer or a Neon image, understanding how these assistants work internally helps you customize them effectively. Let’s start with the most fundamental component: the message bus.

What is a Message Bus?

Think of the message bus as your assistant’s nervous system - it’s how all the different parts communicate. Just as your nervous system carries signals between your brain, ears, and mouth, the message bus carries messages between your assistant’s components.
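In code, the nervous-system metaphor boils down to publish/subscribe. A minimal sketch using `ovos-bus-client`, with a made-up message type to show that any component can define its own:

```python
import time

from ovos_bus_client import Message, MessageBusClient

client = MessageBusClient()
client.run_in_thread()

# Any component can subscribe to a message type...
client.on("my.custom.ping", lambda msg: print("got ping:", msg.data))

# ...and any component can publish one; the bus fans it out to subscribers
client.emit(Message("my.custom.ping", {"from": "example"}))

time.sleep(1)  # wait for the round trip before exiting
```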
GUNICORN IN CONTAINERS
There is a dearth of information on how to run a Flask/Django app on Kubernetes using gunicorn (mostly for good reason!), and what information is available is often conflicting and confusing. Based on issues I’ve seen with my customers at Defiance Digital over the last year or so, I built a test repository to experiment with different configurations and see which works best.

tl;dr

The conventional wisdom of running multiple workers in a containerized instance of Flask/Django/anything served by gunicorn is incorrect - you should run only one or two workers per container; otherwise you’re not properly using the resources allocated to your application.
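Here is what that looks like in practice - a minimal `gunicorn.conf.py` sketch for a containerized deployment; the thread count and port are assumptions to tune for your app:

```python
# One worker per container: let the orchestrator scale replicas instead
workers = 1

# Threads provide in-process concurrency for I/O-bound request handling
threads = 4
worker_class = "gthread"

bind = "0.0.0.0:8000"
```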