This post is part of the Voice Assistant on Raspberry Pi series.
Both Pis are configured. Time to write some code. The goal: validate the full audio pipeline on the pi-client, from button press to spoken response, without a real LLM. We hardcode a reply for now. The LLM comes in article #3.
The complete code for this article is available on GitHub.
Project structure
AudioAssistant/
├── AudioAssistant.csproj
├── Program.cs
├── Worker.cs
├── AssistantOptions.cs
└── Services/
├── GpioService.cs
├── AudioRecorderService.cs
├── WhisperTranscriptionService.cs
└── PiperSpeechService.cs
Each service has a single responsibility and plugs into .NET's DI container. Swapping components later is easy: replace WhisperTranscriptionService with a different engine, for instance, without touching Worker.
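To make that swap concrete, here's a hypothetical FakeTranscriptionService (the name is mine, not part of the repo) that returns a canned transcription, handy for exercising the rest of the pipeline without a working microphone:
// Hypothetical stand-in for debugging: returns a fixed transcription
// so GPIO, recording, and TTS can be tested in isolation.
public class FakeTranscriptionService : ITranscriptionService
{
    public Task<string> TranscribeAsync(string audioFilePath, CancellationToken cancellationToken)
        => Task.FromResult("bonjour, ceci est un test");
}

// In Program.cs, swap a single registration:
builder.Services.AddSingleton<ITranscriptionService, FakeTranscriptionService>();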
Step 1: Create the project
SSH into the pi-client:
mkdir -p ~/projects && cd ~/projects
dotnet new worker -n AudioAssistant
cd AudioAssistant
mkdir -p Services
NuGet packages:
# GPIO
dotnet add package System.Device.Gpio
# Whisper.net + universal runtime (Linux ARM64, ARM, x64)
dotnet add package Whisper.net
dotnet add package Whisper.net.Runtime
Whisper.net.Runtime is the universal package. There's no separate Whisper.net.Runtime.Linux.Arm64. One package covers all architectures.
Step 2: Configuration
Replace appsettings.json:
{
"Assistant": {
"GpioButtonPin": 17,
"AudioDevice": "hw:1,0",
"RecordingDurationSeconds": 10,
"WhisperModel": "ggml-base.bin",
"PiperBinary": "/home/gabriel/piper/piper/piper",
"PiperVoice": "/home/gabriel/piper-voices/fr_FR-siwis-low.onnx",
"AudioOutputDevice": "hw:2,0"
},
"Logging": {
"LogLevel": {
"Default": "Information"
}
}
}
GpioButtonPin: 17 maps to GPIO 17 (physical pin 11). Wire the button between GPIO 17 and GND.
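Any of these values can also be overridden with environment variables using .NET's standard double-underscore convention, which will come in handy for the systemd unit in article #4. For example:
Assistant__AudioDevice=hw:2,0 dotnet run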
Step 3: Program.cs
using AudioAssistant;
using AudioAssistant.Services;
var builder = Host.CreateApplicationBuilder(args);
builder.Services.Configure<AssistantOptions>(
builder.Configuration.GetSection("Assistant"));
builder.Services.AddSingleton<IGpioService, GpioService>();
builder.Services.AddSingleton<IAudioRecorderService, AudioRecorderService>();
builder.Services.AddSingleton<ITranscriptionService, WhisperTranscriptionService>();
builder.Services.AddSingleton<ISpeechService, PiperSpeechService>();
builder.Services.AddHostedService<Worker>();
var host = builder.Build();
host.Run();
Step 4: AssistantOptions.cs
namespace AudioAssistant;
public class AssistantOptions
{
public int GpioButtonPin { get; set; } = 17;
public string AudioDevice { get; set; } = "hw:1,0";
public int RecordingDurationSeconds { get; set; } = 10;
public string WhisperModel { get; set; } = "ggml-base.bin";
public string PiperBinary { get; set; } = "/home/gabriel/piper/piper/piper";
public string PiperVoice { get; set; } = "/home/gabriel/piper-voices/fr_FR-siwis-low.onnx";
public string AudioOutputDevice { get; set; } = "hw:2,0";
}
Step 5: The services
Services/GpioService.cs
using System.Device.Gpio;
using Microsoft.Extensions.Options;
namespace AudioAssistant.Services;
public interface IGpioService : IDisposable
{
bool IsButtonPressed();
void WaitForButtonPress(CancellationToken cancellationToken);
}
public class GpioService : IGpioService
{
private readonly GpioController _gpio;
private readonly int _buttonPin;
private readonly ILogger<GpioService> _logger;
public GpioService(IOptions<AssistantOptions> options, ILogger<GpioService> logger)
{
_buttonPin = options.Value.GpioButtonPin;
_logger = logger;
_gpio = new GpioController();
_gpio.OpenPin(_buttonPin, PinMode.InputPullUp);
_logger.LogInformation("GPIO initialized on pin {Pin}", _buttonPin);
}
public bool IsButtonPressed()
=> _gpio.Read(_buttonPin) == PinValue.Low;
public void WaitForButtonPress(CancellationToken cancellationToken)
{
_logger.LogInformation("Waiting for button press...");
while (!cancellationToken.IsCancellationRequested)
{
if (IsButtonPressed())
return;
Thread.Sleep(50);
}
}
public void Dispose() => _gpio.Dispose();
}
InputPullUp: the pin is HIGH at rest and LOW when the button is pressed (wired between GPIO 17 and GND).
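The 50 ms polling loop is simple and good enough here. If you'd rather not busy-wait, System.Device.Gpio can also block on a hardware edge event; a minimal alternative sketch of WaitForButtonPress:
// Alternative: block until the pin falls (HIGH to LOW = press, with pull-up wiring)
// instead of polling every 50 ms.
public void WaitForButtonPress(CancellationToken cancellationToken)
    => _gpio.WaitForEvent(_buttonPin, PinEventTypes.Falling, cancellationToken);
Either version works; the polling one is what the rest of the article assumes.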
Services/AudioRecorderService.cs
using System.Diagnostics;
using Microsoft.Extensions.Options;
namespace AudioAssistant.Services;
public interface IAudioRecorderService
{
Task<string> RecordAsync(CancellationToken cancellationToken);
}
public class AudioRecorderService : IAudioRecorderService
{
private readonly AssistantOptions _options;
private readonly ILogger<AudioRecorderService> _logger;
public AudioRecorderService(IOptions<AssistantOptions> options, ILogger<AudioRecorderService> logger)
{
_options = options.Value;
_logger = logger;
}
public async Task<string> RecordAsync(CancellationToken cancellationToken)
{
var outputFile = Path.Combine(Path.GetTempPath(), $"audio_{Guid.NewGuid()}.wav");
_logger.LogInformation("Recording started ({Duration}s)...", _options.RecordingDurationSeconds);
var psi = new ProcessStartInfo
{
FileName = "arecord",
Arguments = $"-D {_options.AudioDevice} -f cd -t wav -d {_options.RecordingDurationSeconds} {outputFile}",
RedirectStandardError = true,
UseShellExecute = false
};
using var process = Process.Start(psi)!;
await process.WaitForExitAsync(cancellationToken);
_logger.LogInformation("Recording done: {File}", outputFile);
return outputFile;
}
}
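Before relying on the service, it's worth checking those flags by hand (assuming your mic is still hw:1,0 and your speaker hw:2,0 from appsettings.json; adjust to your cards):
# Record 3 seconds in the exact format Whisper wants: 16-bit PCM, 16 kHz, mono
arecord -D hw:1,0 -f S16_LE -r 16000 -c 1 -t wav -d 3 /tmp/mic-test.wav
aplay -D hw:2,0 /tmp/mic-test.wav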
Services/WhisperTranscriptionService.cs
using Microsoft.Extensions.Options;
using Whisper.net;
using Whisper.net.Ggml;
namespace AudioAssistant.Services;
public interface ITranscriptionService
{
Task<string> TranscribeAsync(string audioFilePath, CancellationToken cancellationToken);
}
public class WhisperTranscriptionService : ITranscriptionService
{
private readonly string _modelPath;
private readonly ILogger<WhisperTranscriptionService> _logger;
public WhisperTranscriptionService(IOptions<AssistantOptions> options, ILogger<WhisperTranscriptionService> logger)
{
_modelPath = options.Value.WhisperModel;
_logger = logger;
}
public async Task<string> TranscribeAsync(string audioFilePath, CancellationToken cancellationToken)
{
// Download model if missing
if (!File.Exists(_modelPath))
{
_logger.LogInformation("Downloading Whisper model...");
var downloader = new WhisperGgmlDownloader(new HttpClient());
using var modelStream = await downloader.GetGgmlModelAsync(GgmlType.Base);
using var fileStream = File.OpenWrite(_modelPath);
await modelStream.CopyToAsync(fileStream, cancellationToken);
}
_logger.LogInformation("Transcribing...");
using var factory = WhisperFactory.FromPath(_modelPath);
using var processor = factory.CreateBuilder()
.WithLanguage("fr")
.Build();
var result = new System.Text.StringBuilder();
// ProcessAsync takes a Stream, not a file path
using var audioStream = File.OpenRead(audioFilePath);
await foreach (var segment in processor.ProcessAsync(audioStream, cancellationToken))
{
result.Append(segment.Text);
}
var text = result.ToString().Trim();
_logger.LogInformation("Transcription: \"{Text}\"", text);
return text;
}
}
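One optimization left out above for clarity: WhisperFactory.FromPath reloads the GGML model from disk on every call. Since the service is registered as a singleton, the factory can be cached in a field instead. A sketch (the class name is mine, and the model-download step is elided):
// Hypothetical variant: keeps the loaded model in a field across calls.
public class CachedWhisperTranscriptionService : ITranscriptionService, IDisposable
{
    private readonly string _modelPath;
    private WhisperFactory? _factory;

    public CachedWhisperTranscriptionService(IOptions<AssistantOptions> options)
        => _modelPath = options.Value.WhisperModel;

    public async Task<string> TranscribeAsync(string audioFilePath, CancellationToken cancellationToken)
    {
        // Load the GGML model once; every later call reuses it
        _factory ??= WhisperFactory.FromPath(_modelPath);
        using var processor = _factory.CreateBuilder().WithLanguage("fr").Build();

        var result = new System.Text.StringBuilder();
        using var audioStream = File.OpenRead(audioFilePath);
        await foreach (var segment in processor.ProcessAsync(audioStream, cancellationToken))
            result.Append(segment.Text);
        return result.ToString().Trim();
    }

    public void Dispose() => _factory?.Dispose();
}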
Services/PiperSpeechService.cs
using System.Diagnostics;
using Microsoft.Extensions.Options;
namespace AudioAssistant.Services;
public interface ISpeechService
{
Task SpeakAsync(string text, CancellationToken cancellationToken);
}
public class PiperSpeechService : ISpeechService
{
private readonly AssistantOptions _options;
private readonly ILogger<PiperSpeechService> _logger;
public PiperSpeechService(IOptions<AssistantOptions> options, ILogger<PiperSpeechService> logger)
{
_options = options.Value;
_logger = logger;
}
public async Task SpeakAsync(string text, CancellationToken cancellationToken)
{
_logger.LogInformation("Speaking: \"{Text}\"", text);
// Piper reads text on stdin and writes raw PCM to stdout; we pipe that into aplay
var piperPsi = new ProcessStartInfo
{
FileName = _options.PiperBinary,
Arguments = $"--model {_options.PiperVoice} --output_raw",
RedirectStandardInput = true,
RedirectStandardOutput = true,
UseShellExecute = false
};
var aplayPsi = new ProcessStartInfo
{
FileName = "aplay",
Arguments = $"-D {_options.AudioOutputDevice} -r 22050 -f S16_LE -t raw -",
RedirectStandardInput = true,
UseShellExecute = false
};
using var piper = Process.Start(piperPsi)!;
using var aplay = Process.Start(aplayPsi)!;
await piper.StandardInput.WriteLineAsync(text);
piper.StandardInput.Close();
await piper.StandardOutput.BaseStream.CopyToAsync(
aplay.StandardInput.BaseStream, cancellationToken);
aplay.StandardInput.Close();
await Task.WhenAll(
piper.WaitForExitAsync(cancellationToken),
aplay.WaitForExitAsync(cancellationToken));
}
}
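One thing to watch with the aplay arguments: -r 22050 must match the voice's native sample rate, which Piper stores in the .onnx.json file shipped next to the model. If the voice sounds pitched up or slowed down, check it:
grep sample_rate /home/gabriel/piper-voices/fr_FR-siwis-low.onnx.json
If the value differs from 22050, change the rate in the aplay arguments (or promote it to AssistantOptions alongside the voice path).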
Step 6: Worker.cs
using AudioAssistant.Services;
namespace AudioAssistant;
public class Worker : BackgroundService
{
private readonly IGpioService _gpio;
private readonly IAudioRecorderService _recorder;
private readonly ITranscriptionService _transcription;
private readonly ISpeechService _speech;
private readonly ILogger<Worker> _logger;
public Worker(
IGpioService gpio,
IAudioRecorderService recorder,
ITranscriptionService transcription,
ISpeechService speech,
ILogger<Worker> logger)
{
_gpio = gpio;
_recorder = recorder;
_transcription = transcription;
_speech = speech;
_logger = logger;
}
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
_logger.LogInformation("Assistant started. Press the button to speak.");
while (!stoppingToken.IsCancellationRequested)
{
_gpio.WaitForButtonPress(stoppingToken);
if (stoppingToken.IsCancellationRequested) break;
string? audioFile = null;
try
{
    audioFile = await _recorder.RecordAsync(stoppingToken);
    var text = await _transcription.TranscribeAsync(audioFile, stoppingToken);
    // Hardcoded response — real LLM comes in article #3
    var response = string.IsNullOrWhiteSpace(text)
        ? "I didn't catch that. Could you repeat?"
        : $"You said: {text}. I'm still under construction, check back soon!";
    await _speech.SpeakAsync(response, stoppingToken);
}
catch (Exception ex) when (!stoppingToken.IsCancellationRequested)
{
    _logger.LogError(ex, "Error in audio pipeline");
    await _speech.SpeakAsync("An error occurred.", stoppingToken);
}
finally
{
    // Delete the temp recording even if transcription or TTS failed
    if (audioFile is not null && File.Exists(audioFile))
        File.Delete(audioFile);
}
}
}
}
Step 7: Wire the GPIO button
Pi GPIO 17 (Pin 11) ──── [Button] ──── GND (Pin 9)
No resistor is needed; the Pi provides a built-in pull-up (InputPullUp).
Pin 1: 3.3V     Pin 2: 5V
Pin 9: GND      Pin 11: GPIO 17  ← button here
Classic 4-pin tactile button gotcha: the two wires must go on opposite sides of the button (top and bottom), not on the same pair of legs. Same pair means pressing does nothing.
GPIO 17 wire → top-left leg [1]
GND wire     → bottom-left leg [3]  ← opposite side
Validate the wiring before running the Worker Service:
python3
import time
import RPi.GPIO as GPIO

GPIO.setmode(GPIO.BCM)
GPIO.setup(17, GPIO.IN, pull_up_down=GPIO.PUD_UP)

while True:
    print(GPIO.input(17))
    time.sleep(0.3)
You should see 1 at rest and 0 when pressed. Ctrl+C to stop.
Finding the right audio devices
aplay -l # output devices (speakers, headphones)
arecord -l # input devices (microphones)
The device format is always hw:<card>,<device>. If arecord -l shows card 3: Device [USB Audio Device], device 0, the device string is hw:3,0. Update appsettings.json accordingly.
Step 8: Build and test
cd ~/projects/AudioAssistant
dotnet build
dotnet run
The logs should show:
info: AudioAssistant.Worker[0] Assistant started. Press the button to speak.
info: AudioAssistant.Worker[0] Waiting for button press...
Press the button. If you hear a spoken response, the pipeline works.
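If GpioController throws a permission error at startup instead, check that your user belongs to the gpio group (the default user on Raspberry Pi OS does). Audio errors at this stage usually mean the hw: strings in appsettings.json no longer match your card numbers; re-run aplay -l and arecord -l.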
Auto-boot with systemd comes in article #4, with a more complete setup (After=sound.target, environment variables).
The v1 pipeline is complete: [GPIO Button] → [arecord] → [Whisper.net] → [Hardcoded response] → [Piper TTS] → [aplay]. Pure .NET 10, no Python, no external dependencies beyond arecord, aplay, and piper.
The complete code for this article is available on GitHub.
Series articles
- Setting Up Both Raspberry Pis
- .NET 10 Worker Service and Audio Pipeline (this article)
- Ollama Integration and Home Context
- Memory, Silence Detection, and systemd
- Real-Time Weather and Swapping to the Claude API
- Function Calling: Teaching Tools to the Assistant
- Retrospective, Lessons Learned, and v2 Roadmap
In article #3, the hardcoded response gets replaced by a real HTTP call to Ollama on the pi-cerveau. The pipeline starts to feel like an actual assistant.
This post was written with AI assistance and edited by me.