SSH OVER AWS SYSTEMS MANAGER: DITCHING KEY PAIRS FOR IAM AUTHENTICATION

If you’ve ever managed SSH keys for EC2 instances at scale, you know the pain. Keys get lost, rotated, shared inappropriately, or worse - committed to Git repos. What if I told you that you could use all your familiar SSH tooling - ssh, scp, rsync, port forwarding, and more - without ever touching an SSH key pair? Enter AWS Systems Manager Session Manager as an SSH proxy.

Why SSH Over SSM?

The traditional SSH model requires:
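By contrast, the SSM approach boils down to a single SSH ProxyCommand that tunnels the connection through an SSM session. A minimal sketch of the ~/.ssh/config entry, assuming the AWS CLI and the session-manager-plugin are installed locally and the SSM agent is running on the instance:

```
# ~/.ssh/config - route instance-id hosts through an SSM session instead of port 22
Host i-* mi-*
    ProxyCommand sh -c "aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters 'portNumber=%p'"
```

With that entry in place, `ssh ec2-user@i-0123456789abcdef0` works with no inbound port 22 and no bastion host, and scp, rsync, and port forwarding ride the same tunnel; who can open the session is governed by IAM policy instead of key possession.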


CUSTOM RESPONSE SOUNDS FOR OVOS/NEON

Neon.AI and OpenVoiceOS (OVOS) both offer out-of-the-box smart speaker/voice assistant platforms, with Neon using OVOS as the base for its own implementation. Both platforms are designed to be privacy-respecting and run on low-power hardware, making them ideal for home automation and personal voice assistant projects. They are also both designed to be extensible, allowing you to add your own customizations and features. One of the most fun customizations you can make to your OVOS/Neon voice assistant is changing the response sounds. By default, OVOS and Neon use a set of standard sounds for various events, but you can easily replace these with your own. It’s a simple way to personalize your voice assistant and make it feel more like your own.
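The sounds are mapped to events in the assistant’s configuration file, so a swap is just a config override. A minimal sketch of a mycroft.conf override, assuming the stock sound keys and with the file paths as hypothetical placeholders:

```json
{
  "sounds": {
    "start_listening": "/home/ovos/sounds/chime-up.wav",
    "end_listening": "/home/ovos/sounds/chime-down.wav",
    "acknowledge": "/home/ovos/sounds/ok.mp3"
  }
}
```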


SNAPSHOT TESTING WITH SYRUPY IN AWS CDK PYTHON

In 2023, I wrote a post on snapshot testing in AWS CDK Python. Since then, I’ve switched from pytest-snapshot to syrupy. At Defiance Digital, we use the AWS CDK for almost everything. Generally we use TypeScript because it’s the original language for the CDK, everything using JSII transpiles back to TypeScript, and it has the most compatibility with the CDK. However, we have a few projects that use Python, and on those I’ve really been missing Jest snapshot testing.
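Here’s roughly what that workflow looks like with syrupy; `MyStack` and its import path are hypothetical stand-ins for your own stack:

```python
# test_my_stack.py - snapshot-test a synthesized CloudFormation template.
import aws_cdk as cdk
from aws_cdk.assertions import Template

from my_project.my_stack import MyStack  # hypothetical stack under test


def test_stack_matches_snapshot(snapshot):
    """Fail if the synthesized template drifts from the stored snapshot."""
    app = cdk.App()
    stack = MyStack(app, "TestStack")
    template = Template.from_stack(stack).to_json()
    # syrupy's `snapshot` fixture serializes the dict and compares it
    # against the snapshot file saved alongside the tests.
    assert template == snapshot
```

Run `pytest --snapshot-update` once to record the snapshot; after that, any unintended template change fails the test, much like Jest’s workflow.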


TTS OPTIONS FOR OPENVOICEOS

OpenVoiceOS, or OVOS, provides a flexible plugin system for text-to-speech (TTS) synthesis. This system allows you to choose the TTS engine that best suits your needs, whether you prioritize voice quality, speed, privacy, or language support.

Running Plugins Directly

OVOS maintains several TTS plugins that can run directly on your voice assistant hardware. Here are the currently maintained plugins, categorized by their primary use case:

| Hardware/Environment | Model | GitHub Link | Notes |
| --- | --- | --- | --- |
| GPU-Optimized | Coqui TTS | ovos-tts-plugin-coqui | Local TTS using Coqui models, best performance with GPU but some models work on CPU |
| CPU-Capable | Edge TTS | ovos-tts-plugin-edge-tts | Microsoft Edge TTS voices, requires internet connection but provides fast, high-quality output and streaming capability |
| CPU-Capable | Piper | ovos-tts-plugin-piper | Local TTS with multiple voice options, good performance on modern hardware (optimized for Raspberry Pi) |
| CPU-Capable | Mimic | ovos-tts-plugin-mimic | Classic Mycroft offline TTS engine, lightweight but robotic sounding |
| CPU-Capable | ESpeak-NG | ovos-tts-plugin-espeakNG | Lightweight offline TTS, supports many languages but robotic sounding |
| CPU-Capable | Pico | ovos-tts-plugin-pico | Very lightweight offline TTS engine |
| API-Based/Cloud | Azure | ovos-tts-plugin-azure | Microsoft Azure Cognitive Services TTS, paid service with high quality but no guarantee of privacy |
| API-Based/Cloud | Polly | ovos-tts-plugin-polly | Amazon Polly TTS service, paid service with high quality but no guarantee of privacy |
| API-Based/Cloud | MaryTTS | ovos-tts-plugin-marytts | Connect to a MaryTTS server, can be self-hosted, considered outdated and not well maintained |
| Language-Specific | Nòs | ovos-tts-plugin-nos | Galician language TTS |
| Language-Specific | Cotovia | ovos-tts-plugin-cotovia | Alternative Galician language TTS |
| Language-Specific | Matxa | ovos-tts-plugin-matxa-multispeaker-cat | Catalan language multi-speaker TTS |
| Novelty | BeepSpeak | ovos-tts-plugin-beepspeak | R2D2-style robotic sounds, works well but requires subtitles to be understood |
| Novelty | SAM | ovos-tts-plugin-SAM | Software Automatic Mouth, retro-style speech synthesis, too digitized for regular use |

Note: While Mimic2 and Mimic3-server plugins exist in archived form, they are no longer supported. Users are recommended to use Piper or other maintained alternatives for better performance and ongoing support.
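Whichever engine you choose, selecting it is a configuration change rather than a code change. A minimal sketch for Piper in mycroft.conf, with the voice name as a hypothetical placeholder:

```json
{
  "tts": {
    "module": "ovos-tts-plugin-piper",
    "ovos-tts-plugin-piper": {
      "voice": "alan-low"
    }
  }
}
```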


STT OPTIONS FOR OPENVOICEOS

OpenVoiceOS, or OVOS, is the spiritual successor to (and fork of) the Mycroft voice assistant. It is a privacy-focused, open-source voice assistant that you can run on your own hardware. OVOS has a plugin system that allows you to swap out the default speech-to-text (STT) engine for one that you prefer. The plugin system, while powerful, can be confusing due to the sheer number of options available. This post will cover some of the STT options available to you and when you might use them.
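As with the TTS plugins covered above, switching STT engines is a small configuration change. A minimal sketch in mycroft.conf, assuming the ovos-stt-plugin-server plugin and a hypothetical server URL:

```json
{
  "stt": {
    "module": "ovos-stt-plugin-server",
    "ovos-stt-plugin-server": {
      "url": "https://stt.example.com/stt"
    }
  }
}
```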

BUILDING VOICE ASSISTANT CONFIGURATIONS: ADVANCED OVOS SETUPS

After exploring each component of OVOS and Neon assistants, let’s examine how to mix and match these components to create custom configurations. The modular nature of OVOS allows for setups ranging from minimal text-only systems to complex distributed networks.

Text-Only Assistants

The simplest possible configuration requires just two components:

- Message bus
- Core/skills service

This minimal setup can be useful for:

- Development and testing
- Accessibility (vision/hearing impaired users)
- Integration with existing text interfaces
- Command-line or web-based interaction

In this configuration, the message flow is straightforward:
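To make that flow concrete, here’s a minimal sketch that drives a text-only assistant by injecting an utterance directly onto the bus, assuming the ovos-bus-client package, the default bus port, and an installed skill that can answer the question:

```python
import time

from ovos_bus_client import MessageBusClient, Message

client = MessageBusClient()  # default bus at ws://127.0.0.1:8181/core
client.run_in_thread()
client.connected_event.wait()

# Print whatever the skills service would have the audio service speak.
client.on("speak", lambda message: print("Assistant:", message.data["utterance"]))

# Inject a typed command - the same message the listener emits after STT.
client.emit(Message("recognizer_loop:utterance", {"utterances": ["what time is it"]}))

time.sleep(5)  # give the skill a moment to respond before exiting
```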

THE VOICE ASSISTANT'S BODY: UNDERSTANDING OVOS HARDWARE INTEGRATION

While previous articles covered how your assistant listens, thinks, and speaks, now we’ll explore how it interacts with physical hardware through the Platform Hardware Abstraction Layer (PHAL) system.

Evolution of Hardware Support

The PHAL system’s history helps explain its design. Originally, Mycroft AI’s code was tightly coupled to their Mark 1 device, which included:

- LED panel for eyes and mouth
- Custom Arduino-controlled circuit board
- Specific audio hardware configuration

When Mycroft developed their Mark 2 device, they discovered this tight coupling made supporting new hardware difficult. OVOS solved this by abstracting hardware controls into plugins, making the assistant hardware-agnostic.
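To give a flavor of that abstraction, here’s a minimal sketch of a PHAL plugin, assuming the PHALPlugin base class from ovos-plugin-manager; the plugin name and LED behavior are hypothetical:

```python
from ovos_plugin_manager.phal import PHALPlugin


class ExampleLedPlugin(PHALPlugin):
    """Hypothetical plugin that reacts to the wake word by driving an LED."""

    def __init__(self, bus=None, config=None):
        super().__init__(bus=bus, name="phal-plugin-example-led", config=config)
        # PHAL plugins listen on the message bus and translate events
        # into device-specific hardware actions.
        self.bus.on("recognizer_loop:wakeword", self.handle_wakeword)

    def handle_wakeword(self, message):
        # Swap this print for real GPIO/LED control on your board.
        print("Wake word detected - blink the LED here")
```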

THE VOICE ASSISTANT'S BRAIN: UNDERSTANDING OVOS SKILLS

After exploring how your assistant communicates, speaks, listens, and controls hardware, let’s examine how it processes and responds to commands through the core/skills service.

Core Service Overview

The core service (called either “core” or “skills” depending on your OVOS implementation) coordinates two main components:

- Intent Engine
- Skills System

These components work together to understand user requests and execute appropriate actions.

Intent Engine

The intent engine matches user requests with the appropriate skill. Currently, OVOS primarily uses Padatious, an intent parser created by Mycroft AI that generates models based on example phrases.
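For a sense of how the pieces fit together, here’s a minimal sketch of a skill with a Padatious intent, assuming the OVOSSkill base class from ovos-workshop; the skill and intent file names are hypothetical:

```python
from ovos_workshop.decorators import intent_handler
from ovos_workshop.skills import OVOSSkill


class HelloWorldSkill(OVOSSkill):
    # Padatious trains a model on the example phrases listed in
    # locale/en-us/hello.world.intent and routes matches here.
    @intent_handler("hello.world.intent")
    def handle_hello_world(self, message):
        # speak() publishes a "speak" message for the audio service.
        self.speak("Hello from OVOS!")
```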

THE VOICE ASSISTANT'S EARS: UNDERSTANDING OVOS LISTENER SERVICES

Continuing our exploration of OVOS and Neon.AI components, let’s examine how your assistant hears and understands spoken commands. The listener service (ovos-dinkum-listener) is like your assistant’s ears and early speech processing - it handles everything from detecting wake words to converting speech to text.

Listener Architecture

The listener service coordinates four critical components:

- Microphone input
- Wake word detection
- Voice Activity Detection (VAD)
- Speech-to-Text (STT)

These components communicate through the message bus we discussed in the previous article, working together to turn audio into text commands your assistant can process.
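Each of those components is a swappable plugin selected in configuration. A minimal sketch of the relevant mycroft.conf sections, with the module names as plausible but hypothetical choices:

```json
{
  "listener": {
    "VAD": {
      "module": "ovos-vad-plugin-silero"
    }
  },
  "hotwords": {
    "hey_mycroft": {
      "module": "ovos-ww-plugin-precise-lite",
      "listen": true
    }
  },
  "stt": {
    "module": "ovos-stt-plugin-server"
  }
}
```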

THE VOICE ASSISTANT'S MOUTH: UNDERSTANDING OVOS AUDIO SERVICES

After exploring how your assistant listens in our previous article, let’s look at how it speaks and plays audio. The audio service (ovos-audio) handles all sound output, from spoken responses to music playback.

Audio Service Overview

Just as the listener service coordinates multiple components for hearing, the audio service manages two main components:

- Text-to-Speech (TTS)
- Audio playback

These components communicate through the message bus we covered in part 1, responding to requests from skills and other services.
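As a quick illustration, anything on the message bus can ask the mouth to speak. A minimal sketch using ovos-bus-client, assuming a running audio service with a configured TTS plugin:

```python
from ovos_bus_client import MessageBusClient, Message

client = MessageBusClient()  # default bus at ws://127.0.0.1:8181/core
client.run_in_thread()
client.connected_event.wait()

# "speak" is the bus message type the audio service turns into TTS output.
client.emit(Message("speak", {"utterance": "The audio service is working."}))
```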