Audio Output Design — Full 5-Channel Mixer + cpal Backend
Overview
Add real audio output to the desktop NES emulator client. This involves two independent pieces of work:
- Full APU mixer — replace the current DMC-only mixer with proper 5-channel mixing (Pulse 1, Pulse 2, Triangle, Noise, DMC) using NES hardware-accurate formulas.
- cpal audio backend — replace the stub AudioSink in the desktop client with real audio output using cpal, connected via a lock-free ring buffer. Add a volume slider to the GTK4 header bar.
1. Full APU Mixer
Current State
AudioMixer::push_cycles() in src/runtime/audio.rs reads only apu_regs[0x11] (DMC output level) and generates a single-channel signal. All other channels are ignored.
Design
1.1 Channel Outputs Struct
Add to src/native_core/apu/:
#[derive(Debug, Clone, Copy, Default)]
pub struct ChannelOutputs {
    pub pulse1: u8,   // 0–15
    pub pulse2: u8,   // 0–15
    pub triangle: u8, // 0–15
    pub noise: u8,    // 0–15
    pub dmc: u8,      // 0–127
}
1.2 New APU Internal State
The current Apu struct lacks timer counters and sequencer state needed to compute channel outputs. The following fields must be added:
Pulse channels (×2):
- pulse_timer_counter: [u16; 2] — countdown timer, clocked every other CPU cycle
- pulse_duty_step: [u8; 2] — position in 8-step duty cycle sequence (0–7)
Triangle channel:
- triangle_timer_counter: u16 — countdown timer, clocked every CPU cycle
- triangle_step: u8 — position in 32-step triangle sequence (0–31)
Noise channel:
- noise_timer_counter: u16 — countdown timer, clocked every other CPU cycle
- noise_lfsr: u16 — 15-bit linear feedback shift register, initialized to 1
These must be clocked in Apu::clock_cpu_cycle():
- Pulse and noise timers decrement every 2 CPU cycles (APU rate, tracked via existing cpu_cycle_parity)
- Triangle timer decrements every 1 CPU cycle
- When a timer reaches 0, it reloads from the period register and advances the corresponding sequencer
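The clocking rules above can be sketched as follows. This is a minimal illustration, not the project's Apu code: PulseTimer, clock_cpu_cycle, step_lfsr, odd_cycle, and mode_short are hypothetical names, and the real implementation would keep these fields on Apu and use the existing cpu_cycle_parity. The LFSR taps (bit 0 XOR bit 1, or bit 0 XOR bit 6 in short mode) follow the nesdev wiki description of the noise channel.

```rust
/// Hypothetical stand-in for one pulse channel's timer and sequencer state.
#[derive(Debug, Default)]
pub struct PulseTimer {
    pub period: u16,   // reload value taken from the channel's period registers
    pub counter: u16,  // countdown timer
    pub duty_step: u8, // position in the 8-step duty sequence (0..=7)
}

impl PulseTimer {
    /// Clock once per APU cycle: reload and advance the sequencer at 0,
    /// otherwise count down.
    pub fn clock(&mut self) {
        if self.counter == 0 {
            self.counter = self.period;
            self.duty_step = (self.duty_step + 1) & 7;
        } else {
            self.counter -= 1;
        }
    }
}

/// Run one CPU cycle. `odd_cycle` plays the role of cpu_cycle_parity, so the
/// pulse timer is clocked on every second call.
pub fn clock_cpu_cycle(timer: &mut PulseTimer, odd_cycle: &mut bool) {
    if *odd_cycle {
        timer.clock();
    }
    *odd_cycle = !*odd_cycle;
}

/// Advance the 15-bit noise LFSR by one step and return the new value.
/// `mode_short` selects the bit-6 tap instead of the bit-1 tap.
pub fn step_lfsr(lfsr: u16, mode_short: bool) -> u16 {
    let tap = if mode_short { 6 } else { 1 };
    let feedback = (lfsr ^ (lfsr >> tap)) & 1;
    (lfsr >> 1) | (feedback << 14)
}
```

With this reload rule, a period of p advances the sequencer once every p + 1 APU cycles. The triangle timer would use the same clock() logic but be called on every CPU cycle.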
1.3 APU Method
Add Apu::channel_outputs(&self) -> ChannelOutputs that computes the current output level of each channel:
- Pulse 1/2: Output is 0 if length counter is 0, or sweep mutes the channel, or duty cycle sequencer output is 0. Otherwise output is the envelope volume (0–15).
- Triangle: Output is the value from the 32-step triangle waveform lookup at
triangle_step. Muted (output 0) if length counter or linear counter is 0. - Noise: Output is 0 if length counter is 0 or LFSR bit 0 is 1. Otherwise output is the envelope volume (0–15).
- DMC: Output is
dmc_output_level(0–127), already tracked.
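A minimal sketch of the per-channel rules listed above, written as free functions so it stays self-contained. The function names and parameters are hypothetical; the real channel_outputs() would read these values from Apu fields. The duty waveforms are the four patterns listed on the nesdev wiki, and indexing them directly by duty_step is an assumption about sequencer ordering.

```rust
/// Pulse duty waveforms: 12.5%, 25%, 50%, and 25% negated.
const DUTY_WAVEFORMS: [[u8; 8]; 4] = [
    [0, 1, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 1, 1, 1],
];

/// Pulse output: 0 when silenced by the length counter, the sweep unit, or a
/// low duty step; otherwise the envelope volume (0..=15).
pub fn pulse_output(
    length_counter: u8,
    sweep_muted: bool,
    duty: u8,
    duty_step: u8,
    envelope: u8,
) -> u8 {
    let high = DUTY_WAVEFORMS[usize::from(duty & 3)][usize::from(duty_step & 7)] != 0;
    if length_counter == 0 || sweep_muted || !high {
        0
    } else {
        envelope & 0x0F
    }
}

/// Triangle output: the 32-step sequence 15, 14, ..., 0, 0, 1, ..., 15,
/// muted when either counter is 0 (as this design specifies).
pub fn triangle_output(length_counter: u8, linear_counter: u8, step: u8) -> u8 {
    if length_counter == 0 || linear_counter == 0 {
        return 0;
    }
    let step = step & 31;
    if step < 16 { 15 - step } else { step - 16 }
}

/// Noise output: 0 when the length counter is 0 or LFSR bit 0 is set;
/// otherwise the envelope volume (0..=15).
pub fn noise_output(length_counter: u8, lfsr: u16, envelope: u8) -> u8 {
    if length_counter == 0 || (lfsr & 1) == 1 {
        0
    } else {
        envelope & 0x0F
    }
}
```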
1.4 Save-State Compatibility
Adding new fields to Apu changes the save-state binary format. The save_state_tail() and load_state_tail() methods must be updated to serialize/deserialize the new fields. This is a breaking change to the save-state format — old save states will not be compatible. Since the project is pre-1.0, this is acceptable without a migration strategy.
1.5 Bus Exposure
Add NativeBus::apu_channel_outputs(&self) -> ChannelOutputs to expose channel outputs alongside the existing apu_registers().
1.6 Mixer Update
Change AudioMixer::push_cycles() signature:
// Before:
pub fn push_cycles(&mut self, cpu_cycles: u8, apu_regs: &[u8; 0x20], out: &mut Vec<f32>)
// After:
pub fn push_cycles(&mut self, cpu_cycles: u8, channels: ChannelOutputs, out: &mut Vec<f32>)
Mixing formula (nesdev wiki linear approximation):
pulse_out = 0.00752 * (pulse1 + pulse2)
tnd_out = 0.00851 * triangle + 0.00494 * noise + 0.00335 * dmc
output = pulse_out + tnd_out
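With every channel at its maximum, the formula yields about 0.853 (0.00752 × 30 + 0.00851 × 15 + 0.00494 × 15 + 0.00335 × 127), so output lies in roughly [0.0, 0.85]. Map it to a signed sample with sample = output * 2.0 - 1.0, which gives values in about [-1.0, 0.71].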
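The formula and normalization can be sketched as two small functions. The names mix_linear and to_sample are hypothetical; in the actual mixer this logic would live inside AudioMixer::push_cycles(). ChannelOutputs is repeated here only so the example compiles on its own.

```rust
#[derive(Debug, Clone, Copy, Default)]
pub struct ChannelOutputs {
    pub pulse1: u8,   // 0..=15
    pub pulse2: u8,   // 0..=15
    pub triangle: u8, // 0..=15
    pub noise: u8,    // 0..=15
    pub dmc: u8,      // 0..=127
}

/// Linear-approximation mix of the five channel levels.
pub fn mix_linear(ch: ChannelOutputs) -> f32 {
    let pulse_out = 0.00752 * (f32::from(ch.pulse1) + f32::from(ch.pulse2));
    let tnd_out = 0.00851 * f32::from(ch.triangle)
        + 0.00494 * f32::from(ch.noise)
        + 0.00335 * f32::from(ch.dmc);
    pulse_out + tnd_out
}

/// Map the unsigned mixer output onto a signed sample.
pub fn to_sample(output: f32) -> f32 {
    output * 2.0 - 1.0
}
```

Note that all-zero channel levels map to -1.0 rather than 0.0, so the stream carries a DC offset until some form of high-pass filtering is added.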
Known simplifications:
- This uses the linear approximation, not the more accurate nonlinear lookup tables from real NES hardware. Nonlinear mixing can be added later as an enhancement.
- The current repeat_n resampling approach (nearest-neighbor) produces aliasing. A low-pass filter or bandlimited interpolation can be added later.
- Real NES hardware applies two first-order high-pass filters (~90Hz and ~440Hz). Without these, channel enable/disable will cause audible pops. Deferred for a future iteration.
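For reference when the deferred filtering is picked up, a first-order high-pass filter can be sketched as below. This is a generic RC-style discretization, not something the current design commits to; HighPass and its fields are hypothetical names.

```rust
/// First-order high-pass filter (hypothetical helper, not part of this design).
pub struct HighPass {
    alpha: f32,
    prev_in: f32,
    prev_out: f32,
}

impl HighPass {
    /// Build a filter with the given cutoff for a stream at `sample_rate_hz`.
    pub fn new(cutoff_hz: f32, sample_rate_hz: f32) -> Self {
        let rc = 1.0 / (2.0 * std::f32::consts::PI * cutoff_hz);
        let dt = 1.0 / sample_rate_hz;
        Self { alpha: rc / (rc + dt), prev_in: 0.0, prev_out: 0.0 }
    }

    /// Filter one sample: y[n] = alpha * (y[n-1] + x[n] - x[n-1]).
    pub fn process(&mut self, input: f32) -> f32 {
        let out = self.alpha * (self.prev_out + input - self.prev_in);
        self.prev_in = input;
        self.prev_out = out;
        out
    }
}
```

A step in the input passes through almost unchanged at first and then decays toward zero, which is how such a filter removes the DC offset left by a channel level change.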
1.7 Runtime Integration
Update NesRuntime::run_until_frame_complete_with_audio() in src/runtime/core.rs to pass ChannelOutputs (from self.bus.apu_channel_outputs()) instead of the register slice to the mixer.
2. Lock-Free Ring Buffer
Location
New file: src/runtime/ring_buffer.rs.
Design
SPSC (single-producer, single-consumer) ring buffer using AtomicUsize for head/tail indices:
- Capacity: 4096 f32 samples (~85ms at 48kHz) — enough to absorb frame timing jitter
- Producer: emulation thread writes samples after each frame via push()
- Consumer: cpal audio callback reads samples via pop()
- Underrun (buffer empty): consumer outputs silence (0.0)
- Overrun (buffer full): producer drops new samples (standard SPSC behavior — only the consumer moves the tail pointer)
pub struct RingBuffer {
    buffer: Box<[f32]>,
    capacity: usize,
    head: AtomicUsize, // write position (producer only)
    tail: AtomicUsize, // read position (consumer only)
}
impl RingBuffer {
    pub fn new(capacity: usize) -> Self;
    pub fn push(&self, samples: &[f32]) -> usize; // returns samples actually written
    pub fn pop(&self, out: &mut [f32]) -> usize;  // returns samples actually read
    pub fn clear(&self); // reset both pointers (call when no concurrent access)
}
Thread safety: RingBuffer is Send + Sync. Shared via Arc<RingBuffer>.
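One way to implement this interface without unsafe code is sketched below. It departs from the struct above in two labeled ways: each slot is an AtomicU32 holding the f32 bit pattern (so push() can write through &self), and head/tail are running counters reduced modulo the capacity. It also assumes the capacity is a power of two, which holds for 4096. Treat it as an illustration of the SPSC protocol rather than the final implementation.

```rust
use std::sync::atomic::{AtomicU32, AtomicUsize, Ordering};

pub struct RingBuffer {
    slots: Box<[AtomicU32]>, // assumption: f32 bits per slot instead of Box<[f32]>
    head: AtomicUsize,       // total samples written (producer only)
    tail: AtomicUsize,       // total samples read (consumer only)
}

impl RingBuffer {
    /// `capacity` is assumed to be a power of two so counter wraparound is harmless.
    pub fn new(capacity: usize) -> Self {
        let slots: Box<[AtomicU32]> = (0..capacity).map(|_| AtomicU32::new(0)).collect();
        Self { slots, head: AtomicUsize::new(0), tail: AtomicUsize::new(0) }
    }

    /// Write as many samples as fit; the rest are dropped (overrun).
    pub fn push(&self, samples: &[f32]) -> usize {
        let head = self.head.load(Ordering::Relaxed);
        let tail = self.tail.load(Ordering::Acquire);
        let free = self.slots.len() - head.wrapping_sub(tail);
        let n = samples.len().min(free);
        for (i, s) in samples[..n].iter().enumerate() {
            let idx = head.wrapping_add(i) % self.slots.len();
            self.slots[idx].store(s.to_bits(), Ordering::Relaxed);
        }
        // Release publishes the slot writes to the consumer.
        self.head.store(head.wrapping_add(n), Ordering::Release);
        n
    }

    /// Read up to `out.len()` samples; the remainder of `out` is filled with silence.
    pub fn pop(&self, out: &mut [f32]) -> usize {
        let tail = self.tail.load(Ordering::Relaxed);
        let head = self.head.load(Ordering::Acquire);
        let n = out.len().min(head.wrapping_sub(tail));
        for (i, o) in out[..n].iter_mut().enumerate() {
            let idx = tail.wrapping_add(i) % self.slots.len();
            *o = f32::from_bits(self.slots[idx].load(Ordering::Relaxed));
        }
        out[n..].fill(0.0); // underrun: pad with 0.0
        self.tail.store(tail.wrapping_add(n), Ordering::Release);
        n
    }

    /// Reset both positions. Only call when neither thread is using the buffer.
    pub fn clear(&self) {
        self.head.store(0, Ordering::Relaxed);
        self.tail.store(0, Ordering::Relaxed);
    }
}
```

At 48 kHz and roughly 60 frames per second the producer pushes about 800 samples per frame, so a 4096-sample buffer holds around five frames of audio.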
3. Desktop cpal Audio Backend
Dependencies
Add to crates/nesemu-desktop/Cargo.toml:
cpal = "0.15"
CpalAudioSink
pub struct CpalAudioSink {
_stream: cpal::Stream, // keeps the audio stream alive
ring: Arc<RingBuffer>,
volume: Arc<AtomicU32>, // f32 bits stored atomically
}
- Implements nesemu::AudioOutput — push_samples() writes to ring buffer
- Created when a ROM is loaded; the ring buffer is cleared on ROM change to prevent stale samples
- cpal callback: reads from ring buffer, multiplies each sample by volume, writes to output buffer
- On pause: emulation stops producing samples → callback outputs silence (underrun behavior)
- On ROM change: old stream is dropped, ring buffer cleared, new stream created
Error Handling
If no audio device is available, or the requested format is unsupported, or the stream fails to build:
- Log the error to stderr
- Fall back to NullAudio behavior (discard samples silently)
- The emulator continues to work without sound
The cpal error callback also logs errors to stderr without crashing.
Stream Configuration
- Sample rate: 48,000 Hz
- Channels: 1 (mono — NES is mono)
- Sample format: f32
- Buffer size: let cpal choose (typically 256–1024 frames)
Volume
- Arc<AtomicU32> shared between UI and cpal callback
- Stored as f32::to_bits() / f32::from_bits()
- Default: 0.75 (75%)
- Applied in cpal callback: sample * volume
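The bit-cast approach can be sketched with a few helpers. The function names are hypothetical, and clamping to [0.0, 1.0] is an assumption based on the slider range; Relaxed ordering is used because the volume is an independent value with no other data to synchronize.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;

/// Create the shared volume cell, storing the f32 as its bit pattern.
pub fn new_volume(initial: f32) -> Arc<AtomicU32> {
    Arc::new(AtomicU32::new(initial.to_bits()))
}

/// UI side: called from the slider's value-changed handler.
pub fn set_volume(volume: &AtomicU32, value: f32) {
    volume.store(value.clamp(0.0, 1.0).to_bits(), Ordering::Relaxed);
}

/// Audio side: read the current volume.
pub fn get_volume(volume: &AtomicU32) -> f32 {
    f32::from_bits(volume.load(Ordering::Relaxed))
}

/// Audio side: scale a buffer in place, loading the volume once per callback.
pub fn apply_volume(volume: &AtomicU32, buf: &mut [f32]) {
    let gain = get_volume(volume);
    for sample in buf.iter_mut() {
        *sample *= gain;
    }
}
```

Loading the volume once per callback rather than once per sample keeps the gain constant within a buffer and avoids repeated atomic loads in the audio thread.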
4. UI — Volume Slider
Widget
gtk::Scale (horizontal) added to the header bar:
- Range: 0.0 to 1.0 (displayed as 0–100%)
- Default: 0.75
- connect_value_changed → atomically update volume
Placement
In the header bar, after the existing control buttons (open, pause, reset), with a small speaker icon label.
5. Threading Model
- GTK main thread: runs emulation via glib::timeout_add_local (~16ms tick), UI events, volume slider updates
- cpal OS thread: audio callback reads from ring buffer — this is the only cross-thread boundary
- The ring buffer (Arc<RingBuffer>) and volume (Arc<AtomicU32>) are the only shared state between threads
6. Data Flow
CPU instruction step (GTK main thread)
→ APU.clock_cpu_cycle() [updates internal channel state]
→ AudioMixer.push_cycles(cycles, apu.channel_outputs())
→ mix 5 channels → f32 sample
→ append to frame audio buffer (Vec<f32>)
Per frame (GTK main thread):
→ FrameExecutor collects audio_buffer
→ CpalAudioSink.push_samples(audio_buffer)
→ write to Arc<RingBuffer>
cpal callback (separate OS thread):
→ read from Arc<RingBuffer>
→ multiply by volume (Arc<AtomicU32>)
→ write to hardware audio buffer
7. Files Changed
| File | Change |
|---|---|
| src/native_core/apu/types.rs | Add ChannelOutputs struct, new timer/sequencer fields to Apu and ApuStateTail |
| src/native_core/apu/api.rs | Add channel_outputs() method, update save_state_tail/load_state_tail |
| src/native_core/apu/timing.rs | Clock new timer/sequencer fields in clock_cpu_cycle() |
| src/native_core/bus.rs | Add apu_channel_outputs() |
| src/runtime/audio.rs | Rewrite mixer with 5-channel formula |
| src/runtime/ring_buffer.rs (new) | Lock-free SPSC ring buffer |
| src/runtime/core.rs | Pass channel_outputs() to mixer in run_until_frame_complete_with_audio() |
| src/runtime/mod.rs | Export ring_buffer, ChannelOutputs |
| crates/nesemu-desktop/Cargo.toml | Add cpal dependency |
| crates/nesemu-desktop/src/main.rs | Replace stub AudioSink with CpalAudioSink, add volume slider |
8. Testing
- Existing tests in tests/public_api.rs must continue to pass (they use NullAudio). Note: the regression hash test (public_api_regression_hashes_for_reference_rom) will produce a different audio hash due to the mixer change — the expected hash must be updated.
- Unit test for ring buffer: push/pop, underrun, overrun, clear
- Unit test for mixer: known channel outputs → expected sample values
- Manual test: load a ROM, verify audible sound through speakers