.NET 10 Worker Service and audio pipeline

This article is part of the series Voice Assistant on Raspberry Pi.

Both Pis are configured. Now we write code. The goal: validate the complete audio pipeline on the pi-client, from the button press to the spoken response, without an LLM. The response is hardcoded for now; the LLM arrives in article #3.

The complete code for this article is available on GitHub.

Project architecture

AudioAssistant/
├── AudioAssistant.csproj
├── Program.cs
├── Worker.cs
├── AssistantOptions.cs
└── Services/
    ├── GpioService.cs
    ├── AudioRecorderService.cs
    ├── WhisperTranscriptionService.cs
    └── PiperSpeechService.cs

Each service has a single responsibility and is injected through the .NET DI container. That makes it easy to swap components later: for example, replacing WhisperTranscriptionService with another engine without touching the Worker.

Step 1: Create the project

Connect to the pi-client over SSH:
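
As an illustration of that swap, here is a minimal sketch of a stand-in transcription service. FixedTranscriptionService is a hypothetical class, not part of the project; the interface is repeated only so the snippet compiles on its own and mirrors the ITranscriptionService defined later in this article.

```csharp
using System.Threading;
using System.Threading.Tasks;

// Same shape as the ITranscriptionService interface shown in Step 5.
public interface ITranscriptionService
{
    Task<string> TranscribeAsync(string audioFilePath, CancellationToken cancellationToken);
}

// Hypothetical stand-in: ignores the audio file and returns a canned sentence.
// Handy for testing the rest of the pipeline without a microphone or a model.
public sealed class FixedTranscriptionService : ITranscriptionService
{
    private readonly string _text;

    public FixedTranscriptionService(string text) => _text = text;

    public Task<string> TranscribeAsync(string audioFilePath, CancellationToken cancellationToken)
        => Task.FromResult(_text);
}
```

In Program.cs, the only line that would change is the registration, for example builder.Services.AddSingleton<ITranscriptionService>(_ => new FixedTranscriptionService("bonjour")); the Worker keeps depending on the interface and is untouched.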

mkdir -p ~/projects && cd ~/projects
dotnet new worker -n AudioAssistant
cd AudioAssistant
mkdir -p Services

NuGet packages:

# GPIO
dotnet add package System.Device.Gpio

# Whisper.net + universal runtime (Linux ARM64, ARM, x64)
dotnet add package Whisper.net
dotnet add package Whisper.net.Runtime

Whisper.net.Runtime is the universal package. There is no separate Whisper.net.Runtime.Linux.Arm64 package; a single package covers all architectures.

Step 2: Configuration

Replace appsettings.json:

{
  "Assistant": {
    "GpioButtonPin": 17,
    "AudioDevice": "hw:1,0",
    "RecordingDurationSeconds": 10,
    "WhisperModel": "ggml-base.bin",
    "PiperBinary": "/home/gabriel/piper/piper/piper",
    "PiperVoice": "/home/gabriel/piper-voices/fr_FR-siwis-low.onnx",
    "AudioOutputDevice": "hw:2,0"
  },
  "Logging": {
    "LogLevel": {
      "Default": "Information"
    }
  }
}

GpioButtonPin: 17 refers to GPIO 17 in BCM numbering (physical pin 11). The button is wired between GPIO 17 and GND.

Step 3: Program.cs

using AudioAssistant;
using AudioAssistant.Services;

var builder = Host.CreateApplicationBuilder(args);

builder.Services.Configure<AssistantOptions>(
    builder.Configuration.GetSection("Assistant"));

builder.Services.AddSingleton<IGpioService, GpioService>();
builder.Services.AddSingleton<IAudioRecorderService, AudioRecorderService>();
builder.Services.AddSingleton<ITranscriptionService, WhisperTranscriptionService>();
builder.Services.AddSingleton<ISpeechService, PiperSpeechService>();
builder.Services.AddHostedService<Worker>();

var host = builder.Build();
host.Run();

Step 4: AssistantOptions.cs

namespace AudioAssistant;

public class AssistantOptions
{
    public int GpioButtonPin { get; set; } = 17;
    public string AudioDevice { get; set; } = "hw:1,0";
    public int RecordingDurationSeconds { get; set; } = 10;
    public string WhisperModel { get; set; } = "ggml-base.bin";
    public string PiperBinary { get; set; } = "/home/gabriel/piper/piper/piper";
    public string PiperVoice { get; set; } = "/home/gabriel/piper-voices/fr_FR-siwis-low.onnx";
    public string AudioOutputDevice { get; set; } = "hw:2,0";
}

Step 5: The services

Services/GpioService.cs

using System.Device.Gpio;
using Microsoft.Extensions.Options;

namespace AudioAssistant.Services;

public interface IGpioService : IDisposable
{
    bool IsButtonPressed();
    void WaitForButtonPress(CancellationToken cancellationToken);
}

public class GpioService : IGpioService
{
    private readonly GpioController _gpio;
    private readonly int _buttonPin;
    private readonly ILogger<GpioService> _logger;

    public GpioService(IOptions<AssistantOptions> options, ILogger<GpioService> logger)
    {
        _buttonPin = options.Value.GpioButtonPin;
        _logger = logger;
        _gpio = new GpioController();
        _gpio.OpenPin(_buttonPin, PinMode.InputPullUp);
        _logger.LogInformation("GPIO initialisé sur le pin {Pin}", _buttonPin);
    }

    public bool IsButtonPressed()
        => _gpio.Read(_buttonPin) == PinValue.Low;

    public void WaitForButtonPress(CancellationToken cancellationToken)
    {
        _logger.LogInformation("En attente d'une pression sur le bouton...");
        while (!cancellationToken.IsCancellationRequested)
        {
            if (IsButtonPressed())
                return;
            Thread.Sleep(50);
        }
    }

    public void Dispose() => _gpio.Dispose();
}

InputPullUp: the pin reads HIGH at rest and LOW while the button is pressed (wired between GPIO 17 and GND).
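
Mechanical buttons tend to bounce for a few milliseconds, so a single raw LOW reading is not always a real press. The sketch below shows one possible software debounce: a press is reported only after several consecutive "pressed" samples, and only once per press. ButtonDebouncer and its requiredSamples parameter are hypothetical names, not part of the project or of System.Device.Gpio.

```csharp
using System;

// Hypothetical helper: feed it one raw reading per poll (for example the
// result of IsButtonPressed() every 10 to 50 ms).
public sealed class ButtonDebouncer
{
    private readonly int _requiredSamples;
    private int _consecutivePressed;
    private bool _reported;

    public ButtonDebouncer(int requiredSamples = 3)
    {
        if (requiredSamples < 1)
            throw new ArgumentOutOfRangeException(nameof(requiredSamples));
        _requiredSamples = requiredSamples;
    }

    // Returns true exactly once per stable press; false otherwise.
    public bool Update(bool rawPressed)
    {
        if (!rawPressed)
        {
            // Released (or a bounce): start counting again and re-arm.
            _consecutivePressed = 0;
            _reported = false;
            return false;
        }

        if (_consecutivePressed < _requiredSamples)
            _consecutivePressed++;

        if (_consecutivePressed >= _requiredSamples && !_reported)
        {
            _reported = true;
            return true;
        }

        return false;
    }
}
```

With the 50 ms polling interval used in WaitForButtonPress, three samples would mean roughly 150 ms of stable contact before the press counts; the right value depends on the button.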

Services/AudioRecorderService.cs

using System.Diagnostics;
using Microsoft.Extensions.Options;

namespace AudioAssistant.Services;

public interface IAudioRecorderService
{
    Task<string> RecordAsync(CancellationToken cancellationToken);
}

public class AudioRecorderService : IAudioRecorderService
{
    private readonly AssistantOptions _options;
    private readonly ILogger<AudioRecorderService> _logger;

    public AudioRecorderService(IOptions<AssistantOptions> options, ILogger<AudioRecorderService> logger)
    {
        _options = options.Value;
        _logger = logger;
    }

    public async Task<string> RecordAsync(CancellationToken cancellationToken)
    {
        var outputFile = Path.Combine(Path.GetTempPath(), $"audio_{Guid.NewGuid()}.wav");
        _logger.LogInformation("Enregistrement démarré ({Duration}s)...", _options.RecordingDurationSeconds);

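        var psi = new ProcessStartInfo
        {
            FileName = "arecord",
            // Whisper.net only accepts 16 kHz WAV input, so record 16-bit mono at 16 kHz
            // instead of "-f cd" (44.1 kHz stereo). If the card rejects this format on
            // hw:X,Y, set AudioDevice to plughw:X,Y so ALSA converts on the fly.
            Arguments = $"-D {_options.AudioDevice} -f S16_LE -r 16000 -c 1 -t wav -d {_options.RecordingDurationSeconds} \"{outputFile}\"",
            RedirectStandardError = true,
            UseShellExecute = false
        };

        using var process = Process.Start(psi)!;
        // Drain stderr so arecord never blocks on a full pipe, and keep it for diagnostics
        var stderr = await process.StandardError.ReadToEndAsync(cancellationToken);
        await process.WaitForExitAsync(cancellationToken);

        if (process.ExitCode != 0)
            throw new InvalidOperationException($"arecord failed (exit code {process.ExitCode}): {stderr}");

        _logger.LogInformation("Enregistrement terminé : {File}", outputFile);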
        return outputFile;
    }
}
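
The recorder builds the arecord command line as a single string, which is fragile if a path ever contains a space. One alternative, sketched here under assumptions, is ProcessStartInfo.ArgumentList, where each argument is a separate entry and no quoting is needed. ArecordCommand.Build and its sampleRate parameter are hypothetical; the flags are standard arecord options.

```csharp
using System.Diagnostics;
using System.Globalization;

// Hypothetical helper (not part of the article's code): builds the arecord
// invocation with one ArgumentList entry per argument.
public static class ArecordCommand
{
    public static ProcessStartInfo Build(string device, int seconds, int sampleRate, string outputFile)
    {
        var psi = new ProcessStartInfo
        {
            FileName = "arecord",
            RedirectStandardError = true,
            UseShellExecute = false
        };

        var args = new[]
        {
            "-D", device,                                              // ALSA capture device
            "-f", "S16_LE",                                            // 16-bit little-endian samples
            "-r", sampleRate.ToString(CultureInfo.InvariantCulture),   // sample rate in Hz
            "-c", "1",                                                 // mono
            "-t", "wav",                                               // WAV container
            "-d", seconds.ToString(CultureInfo.InvariantCulture),      // duration in seconds
            outputFile                                                 // passed as-is, spaces included
        };

        foreach (var arg in args)
            psi.ArgumentList.Add(arg);

        return psi;
    }
}
```

RecordAsync could then call Process.Start(ArecordCommand.Build(_options.AudioDevice, _options.RecordingDurationSeconds, 16000, outputFile)) in place of the interpolated Arguments string.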

Services/WhisperTranscriptionService.cs

using Microsoft.Extensions.Options;
using Whisper.net;
using Whisper.net.Ggml;

namespace AudioAssistant.Services;

public interface ITranscriptionService
{
    Task<string> TranscribeAsync(string audioFilePath, CancellationToken cancellationToken);
}

public class WhisperTranscriptionService : ITranscriptionService
{
    private readonly string _modelPath;
    private readonly ILogger<WhisperTranscriptionService> _logger;

    public WhisperTranscriptionService(IOptions<AssistantOptions> options, ILogger<WhisperTranscriptionService> logger)
    {
        _modelPath = options.Value.WhisperModel;
        _logger = logger;
    }

    public async Task<string> TranscribeAsync(string audioFilePath, CancellationToken cancellationToken)
    {
        // Download the model if it is missing
        if (!File.Exists(_modelPath))
        {
            _logger.LogInformation("Téléchargement du modèle Whisper...");
            var downloader = new WhisperGgmlDownloader(new HttpClient());
            using var modelStream = await downloader.GetGgmlModelAsync(GgmlType.Base);
            using var fileStream = File.OpenWrite(_modelPath);
            await modelStream.CopyToAsync(fileStream, cancellationToken);
        }

        _logger.LogInformation("Transcription en cours...");

        using var factory = WhisperFactory.FromPath(_modelPath);
        using var processor = factory.CreateBuilder()
            .WithLanguage("fr")
            .Build();

        var result = new System.Text.StringBuilder();

        // ProcessAsync takes a Stream, not a file path
        using var audioStream = File.OpenRead(audioFilePath);
        await foreach (var segment in processor.ProcessAsync(audioStream, cancellationToken))
        {
            result.Append(segment.Text);
        }

        var text = result.ToString().Trim();
        _logger.LogInformation("Transcription : \"{Text}\"", text);
        return text;
    }
}
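
Whisper.net expects 16 kHz WAV input, so a quick look at the file header before transcription can turn an obscure failure into a clear message. The sketch below reads the channel count and sample rate from a WAV header. WavHeader is a hypothetical helper, and it assumes the canonical layout where the "fmt " chunk immediately follows the RIFF/WAVE preamble; files with extra chunks before "fmt " would be rejected.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;

// Hypothetical helper: reads only the first 28 bytes of a canonical WAV file.
// Layout assumed: "RIFF" at 0, "WAVE" at 8, "fmt " at 12,
// channel count at 22 (uint16 LE), sample rate at 24 (int32 LE).
public static class WavHeader
{
    public static (int Channels, int SampleRate) Read(Stream stream)
    {
        Span<byte> header = stackalloc byte[28];
        stream.ReadExactly(header); // throws EndOfStreamException on a truncated file

        if (!header.Slice(0, 4).SequenceEqual("RIFF"u8) ||
            !header.Slice(8, 4).SequenceEqual("WAVE"u8) ||
            !header.Slice(12, 4).SequenceEqual("fmt "u8))
        {
            throw new InvalidDataException("Not a canonical RIFF/WAVE header.");
        }

        int channels = BinaryPrimitives.ReadUInt16LittleEndian(header.Slice(22, 2));
        int sampleRate = BinaryPrimitives.ReadInt32LittleEndian(header.Slice(24, 4));
        return (channels, sampleRate);
    }
}
```

A possible use in TranscribeAsync: open the file, call WavHeader.Read, log a warning or throw if SampleRate is not 16000, then rewind the stream to position 0 before handing it to ProcessAsync.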

Services/PiperSpeechService.cs

using System.Diagnostics;
using Microsoft.Extensions.Options;

namespace AudioAssistant.Services;

public interface ISpeechService
{
    Task SpeakAsync(string text, CancellationToken cancellationToken);
}

public class PiperSpeechService : ISpeechService
{
    private readonly AssistantOptions _options;
    private readonly ILogger<PiperSpeechService> _logger;

    public PiperSpeechService(IOptions<AssistantOptions> options, ILogger<PiperSpeechService> logger)
    {
        _options = options.Value;
        _logger = logger;
    }

    public async Task SpeakAsync(string text, CancellationToken cancellationToken)
    {
        _logger.LogInformation("Synthèse vocale : \"{Text}\"", text);

        // Piper produces raw audio, which we pipe into aplay
        var piperPsi = new ProcessStartInfo
        {
            FileName = _options.PiperBinary,
            Arguments = $"--model {_options.PiperVoice} --output_raw",
            RedirectStandardInput = true,
            RedirectStandardOutput = true,
            UseShellExecute = false
        };

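        // Assumption to verify: "low" Piper voices such as fr_FR-siwis-low are generated at
        // 16 kHz, while medium/high voices use 22.05 kHz. The -r value must match
        // audio.sample_rate in the voice's .onnx.json, otherwise playback is pitched up or down.
        var aplayPsi = new ProcessStartInfo
        {
            FileName = "aplay",
            Arguments = $"-D {_options.AudioOutputDevice} -r 16000 -f S16_LE -c 1 -t raw -",
            RedirectStandardInput = true,
            UseShellExecute = false
        };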

        using var piper = Process.Start(piperPsi)!;
        using var aplay = Process.Start(aplayPsi)!;

        await piper.StandardInput.WriteLineAsync(text);
        piper.StandardInput.Close();

        await piper.StandardOutput.BaseStream.CopyToAsync(
            aplay.StandardInput.BaseStream, cancellationToken);
        aplay.StandardInput.Close();

        await Task.WhenAll(
            piper.WaitForExitAsync(cancellationToken),
            aplay.WaitForExitAsync(cancellationToken));
    }
}
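
The sample rate passed to aplay has to match the rate the voice was trained at. Rather than hardcoding it, one option is to read it from the JSON file that ships next to the voice model. The sketch below assumes the config exposes the rate under audio.sample_rate, which is how Piper voice configs are usually laid out; PiperVoiceConfig is a hypothetical helper, so check the key names against your own .onnx.json.

```csharp
using System.Text.Json;

// Hypothetical helper: extracts audio.sample_rate from the content of a
// Piper voice config (the <voice>.onnx.json file next to the model).
public static class PiperVoiceConfig
{
    public static int ReadSampleRate(string json)
    {
        using var doc = JsonDocument.Parse(json);

        // Throws KeyNotFoundException if the expected keys are missing,
        // which is preferable to silently playing at the wrong speed.
        return doc.RootElement
            .GetProperty("audio")
            .GetProperty("sample_rate")
            .GetInt32();
    }
}
```

In SpeakAsync this could be called once with File.ReadAllText(_options.PiperVoice + ".json"), and the result interpolated into the -r argument of aplay.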

Step 6: Worker.cs

using AudioAssistant.Services;

namespace AudioAssistant;

public class Worker : BackgroundService
{
    private readonly IGpioService _gpio;
    private readonly IAudioRecorderService _recorder;
    private readonly ITranscriptionService _transcription;
    private readonly ISpeechService _speech;
    private readonly ILogger<Worker> _logger;

    public Worker(
        IGpioService gpio,
        IAudioRecorderService recorder,
        ITranscriptionService transcription,
        ISpeechService speech,
        ILogger<Worker> logger)
    {
        _gpio = gpio;
        _recorder = recorder;
        _transcription = transcription;
        _speech = speech;
        _logger = logger;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        _logger.LogInformation("Assistant démarré. Appuyez sur le bouton pour parler.");

        while (!stoppingToken.IsCancellationRequested)
        {
            _gpio.WaitForButtonPress(stoppingToken);
            if (stoppingToken.IsCancellationRequested) break;

            string? audioFile = null;
            try
            {
                audioFile = await _recorder.RecordAsync(stoppingToken);
                var texte = await _transcription.TranscribeAsync(audioFile, stoppingToken);

                // Hardcoded response; the LLM arrives in article #3
                var reponse = string.IsNullOrWhiteSpace(texte)
                    ? "Je n'ai pas bien entendu. Pouvez-vous répéter?"
                    : $"Vous avez dit : {texte}. Je suis encore en construction, revenez bientôt!";

                await _speech.SpeakAsync(reponse, stoppingToken);
            }
            catch (Exception ex) when (!stoppingToken.IsCancellationRequested)
            {
                _logger.LogError(ex, "Erreur dans le pipeline audio");
                try
                {
                    await _speech.SpeakAsync("Une erreur s'est produite.", stoppingToken);
                }
                catch (Exception speechEx)
                {
                    // The failure may come from Piper or aplay itself: log it and keep the loop alive
                    _logger.LogError(speechEx, "Impossible d'annoncer l'erreur");
                }
            }
            finally
            {
                // Always remove the temporary recording, even when a step failed
                if (audioFile is not null && File.Exists(audioFile))
                    File.Delete(audioFile);
            }
        }
    }
}
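
WaitForButtonPress polls with Thread.Sleep, which holds a thread for the whole wait while the rest of ExecuteAsync is asynchronous. A non-blocking variant could look like the sketch below. ButtonPolling.WaitForPressAsync is a hypothetical helper that takes the button read as a delegate, so it stays independent of the GPIO library.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: polls isPressed until it returns true or the token
// is cancelled, yielding the thread between polls.
public static class ButtonPolling
{
    // Returns true when a press was seen, false when cancelled first.
    public static async Task<bool> WaitForPressAsync(
        Func<bool> isPressed,
        TimeSpan pollInterval,
        CancellationToken cancellationToken)
    {
        try
        {
            while (!isPressed())
                await Task.Delay(pollInterval, cancellationToken);

            return true;
        }
        catch (OperationCanceledException)
        {
            // Task.Delay throws when the token is cancelled: report "no press".
            return false;
        }
    }
}
```

In the Worker loop, the call would become something like if (!await ButtonPolling.WaitForPressAsync(_gpio.IsButtonPressed, TimeSpan.FromMilliseconds(50), stoppingToken)) break;, which also replaces the separate cancellation check.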

Step 7: Wire the GPIO button

Pi GPIO 17 (Pin 11) ──── [Button] ──── GND (Pin 9)

No resistor is needed: the Pi has an internal pull-up (InputPullUp).

Pin  1 : 3.3V          Pin  2 : 5V
Pin  9 : GND           Pin 11 : GPIO 17  ← button here

A classic pitfall with 4-leg tactile buttons: the two wires must sit on opposite sides of the button (top/bottom), not on the same pair of legs. The legs of a pair are connected internally, so if both wires are on the same pair, pressing changes nothing.
GPIO 17 wire → top-left leg    [1]
GND wire     → bottom-left leg [3]  ← opposite side

Before launching the Worker Service, validate the wiring with Python:

python3

import RPi.GPIO as GPIO
GPIO.setmode(GPIO.BCM)
GPIO.setup(17, GPIO.IN, pull_up_down=GPIO.PUD_UP)
import time
while True:
    print(GPIO.input(17))
    time.sleep(0.3)

You should see 1 at rest and 0 while you press. Ctrl+C to stop.

Finding the right audio devices

aplay -l    # output devices
arecord -l  # input devices

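The device name always has the form hw:<card>,<device>. If arecord -l shows card 3: Device [USB Audio Device], device 0, the device is hw:3,0. Update appsettings.json accordingly. If a device refuses the requested sample rate or channel count, plughw:<card>,<device> addresses the same hardware through ALSA's conversion plugin.

Step 8: Build and test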
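
The mapping from an arecord -l or aplay -l line to a device name is mechanical, so it can be sketched in a few lines. AlsaDevices.ToAlsaDevice is a hypothetical helper; it assumes the usual "card N: ..., device M: ..." line format and returns null for any other line.

```csharp
#nullable enable
using System.Text.RegularExpressions;

// Hypothetical helper: turns one line of `arecord -l` / `aplay -l` output
// into an ALSA device name such as "hw:3,0".
public static class AlsaDevices
{
    private static readonly Regex CardLine =
        new(@"^card (\d+):.*?, device (\d+):", RegexOptions.Compiled);

    // prefix is "hw" for direct access or "plughw" for automatic conversion.
    public static string? ToAlsaDevice(string line, string prefix = "hw")
    {
        var match = CardLine.Match(line);
        return match.Success
            ? $"{prefix}:{match.Groups[1].Value},{match.Groups[2].Value}"
            : null;
    }
}
```

For the example above, ToAlsaDevice("card 3: Device [USB Audio Device], device 0: USB Audio [USB Audio]") yields hw:3,0.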

cd ~/projects/AudioAssistant
dotnet build
dotnet run

The logs should include:

info: AudioAssistant.Worker[0] Assistant démarré. Appuyez sur le bouton pour parler.
info: AudioAssistant.Services.GpioService[0] En attente d'une pression sur le bouton...

Press the button. If you hear a spoken response, the pipeline is validated.

Automatic startup at boot with systemd arrives in article #4, with a more complete configuration (After=sound.target, environment variables).

The v1 pipeline is complete: [GPIO button] → [arecord] → [Whisper.net] → [Hardcoded response] → [Piper TTS] → [aplay]. Everything runs on .NET 10, without Python and without external dependencies other than arecord, aplay and piper.


Articles in the series

Setting up the two Raspberry Pis
.NET 10 Worker Service and audio pipeline (this article)
Ollama integration and home context
Memory, silence detection, and systemd
Real-time weather and switching to the Claude API
Function Calling: teaching the assistant tools
Wrap-up, lessons learned, and v2 outlook

In article #3, we replace the hardcoded response with a real HTTP call to Ollama on the pi-cerveau. The pipeline starts to look like something.

This article was written with the help of AI and reviewed by me.