From 9db17e007e4aac59d8d5195196d3ecc4f3eaf2b3 Mon Sep 17 00:00:00 2001
From: Andrew Hutchings
Date: Sat, 4 Jul 2026 12:33:27 +0100
Subject: [PATCH 1/3] agnus: model the DDF sequencer start/stop flip-flops for FMODE=0 fetches

Replace the value-range DDF window with the hardware's flop model: DDFSTRT/DDFSTOP comparator matches set/clear flip-flops, a stop request drains through one final fetch unit (which applies the per-plane modulos), the hardwired window ($18/$D8, HARDDIS $E0) gates starts and forces stops, and the sequencer state carries across line boundaries.

The flop semantics are transcribed from vAmiga 4.4's Sequencer (OCS and ECS variants) as a free-standing module (chipset/ddf_sequencer.rs) plus a per-line walk (bus/ddf_line.rs) that drives the slot arbiter and the DMA capture loop. Register writes rebuild the line table at their hardware commit clocks: DDF writes reach the comparators four colour clocks after the write slot, an old DDFSTOP still fires on its commit clock while an old DDFSTRT does not, DMACON/BPLCON0 keep their 2/3-cck delays as strobes. Wide-FMODE (AGA quantum > 1) fetches keep the value-window plan.

Missed or invalid comparators now produce the hardware behaviours a value range cannot express: a missed DDFSTOP runs to the hardware-stop drain, a DDFSTRT match past $D8 wraps the run through horizontal blanking into the next line, an early-blanked DDFSTRT ($10) never starts on OCS, and ECS's latched BPHSTART restarts runs at the hard window. Word addressing is unit-based so late-enabled planes keep their word positions.

Ground truth: the vAmigaTS Agnus/DDF/DDF/oldhwstop1-4 A500 photos. The bottom swatch band there is a cumulative hash of every preceding row's fetched word count via the bitplane pointer progression, and the photos match the flop model's output (vAmiga's render) exactly while mismatching the old value-window model.

Scores vs vAmiga 4.4 refs: oldhwstop1 51.5%->0.3%, oldhwstop2 50.2%->2.7%, oldhwstop3 59.8%->16.7%, oldhwstop4 53.4%->14.2% (the 3/4 residual is render-side placement of the wrapped rows, a follow-up); Agnus/DDF bucket mean 6.9%->4.2%. 18 value-window unit tests re-derived to the flop semantics with rule comments; new FSM rule suite and bus-level table tests. STATE_VERSION 15 (Bus gained the serialized sequencer flop state). A 15s KS1.3 boot runs ~12% faster than main (the per-cck table lookup replaces the window math and plan-cache probes) and its screenshot is byte-identical. --- docs/internals/chipset.md | 26 +- docs/internals/timing.md | 37 +- src/bus.rs | 20 ++ src/bus/custom_regs.rs | 30 ++ src/bus/ddf_line.rs | 581 ++++++++++++++++++++++++++++++++ src/bus/dma_slots.rs | 12 + src/bus/frame_capture.rs | 4 + src/bus/tests.rs | 117 +++++-- src/chipset/ddf_sequencer.rs | 631 +++++++++++++++++++++++++++++++++++ src/chipset/mod.rs | 1 + src/savestate.rs | 5 +- 11 files changed, 1404 insertions(+), 60 deletions(-) create mode 100644 src/bus/ddf_line.rs create mode 100644 src/chipset/ddf_sequencer.rs diff --git a/docs/internals/chipset.md b/docs/internals/chipset.md index 0bb301d..a5710d4 100644 --- a/docs/internals/chipset.md +++ b/docs/internals/chipset.md @@ -10,20 +10,20 @@ specification of the modelled behaviour. Agnus owns the beam: `vpos`/`hpos` counters advanced per colour clock, PAL (313 lines, 227 CCK/line) and NTSC (263 lines with long/short line alternation) geometry, the long-field flag for interlace, and VPOSR/VHPOSR. -It also owns DMACON and the display-fetch machinery: the bitplane fetch -plan for the current line is computed from DDFSTRT/DDFSTOP, the plane -count, resolution, and FMODE, producing the per-slot fetch pattern the -[arbitration model](timing) consumes. The fetch sequencer is anchored by -the DDFSTRT comparator, then each fetch block/unit uses the BPLCON0 value -visible at that block's first cycle. 
A mid-row BPLCON0 plane-count change -therefore cannot retroactively fetch earlier words, but it can add or -remove planes for later blocks in the same row. The sequence completes -whole fetch units: the DDF register value is first masked to the Agnus +It also owns DMACON and the display-fetch machinery: for FMODE=0 fetches +the per-line fetch table comes from the DDF sequencer flop model +(`src/chipset/ddf_sequencer.rs`; see the [arbitration model](timing) for +the flop semantics - comparator edges, stop drain through a final +modulo-applying unit, cross-line run carry, OCS/ECS rule differences). +Each fetch unit uses the BPLCON0 value the sequencer sees at that point, +so a mid-row plane-count change cannot retroactively fetch earlier words, +but it can add or remove planes for later units in the same row; word +addressing stays unit-based. The DDF register value is masked to the Agnus revision's comparator precision (OCS keeps 4-CCK precision; ECS/AGA keep -2-CCK precision), then a DDFSTOP landing mid-unit extends the fetch through -the unit starting at-or-after it (`agnus::bitplane_fetch_blocks`; the CDTV -trademark screen's hi-res $64/$A8 window fetches 20 words per row, not the -truncated 18). In lo-res OCS, bit 2 of DDFSTRT and DDFSTOP remains visible +2-CCK precision), and a DDFSTOP landing mid-unit extends the fetch through +the unit starting at-or-after it plus the drain unit (the CDTV trademark +screen's hi-res $64/$A8 window fetches 20 words per row, not the truncated +18). In lo-res OCS, bit 2 of DDFSTRT and DDFSTOP remains visible to the 8-CCK fetch-unit count: $34/$D4 fetches 21 words, $28/$D4 fetches 23, and $4A/$B6 fetches 15. Wide-FMODE units (16/32 CCK) use the same rule rather than moving DDFSTRT down to an absolute grid. In diff --git a/docs/internals/timing.md b/docs/internals/timing.md index 101121b..6ebea0d 100644 --- a/docs/internals/timing.md +++ b/docs/internals/timing.md @@ -28,19 +28,30 @@ order (`fixed_dma_owner_at`, `src/bus.rs`): 7. 
**Blitter** -- any remaining slots its schedule claims. 8. **CPU** -- whatever is left. -The arbiter consults the owner several times per colour clock, so the -bitplane decision (step 5) memoizes its line-invariant part: the effective -DDF window, FMODE fetch cadence, and per-plane fetch-order mask live in a -`BitplaneSlotPlan` keyed on exactly the register inputs that feed them -(`BitplaneSlotKey`, `src/bus.rs`) and are recomputed only when a register -write or a write-delay expiry changes the key. The vpos-dependent gates -(vertical display window, DDFSTRT write miss) are still evaluated live, so -the memoization cannot change behaviour. Once a line reaches DDFSTRT, the -arbiter keeps the fetch sequence anchored there but evaluates BPLCON0 at -each fetch block's first cycle. A later BPLCON0 plane-count increase does -not claim slots for earlier blocks or advance newly enabled plane pointers -for those words, but it can claim the matching slots and start advancing -those pointers on later blocks of the same row. +For FMODE=0 fetches (OCS/ECS, and AGA with a 1-word fetch quantum) the +bitplane decision (step 5) is driven by the Agnus DDF sequencer flop model +(`src/chipset/ddf_sequencer.rs`, walked per line by `src/bus/ddf_line.rs`): +DDFSTRT and DDFSTOP are comparator EDGES that set/clear flip-flops, not a +value range. A stop request drains through one final fetch unit (which +applies the modulos per plane), the hardwired window ($18/$D8, HARDDIS +relaxes the stop to $E0) gates starts and forces stops, and the flop state +carries across line boundaries. Missed comparators therefore produce the +hardware's "old stop" behaviours: a rewritten-too-late DDFSTOP lets the run +continue to the hardware-stop drain, a DDFSTRT match past $D8 starts a run +that wraps through horizontal blanking into the next line, and an +early-blanked DDFSTRT ($10 with SHW still down) never starts on OCS. 
The +per-line fetch table is rebuilt when DDFSTRT/DDFSTOP/BPLCON0/DMACON/DIW +writes land (DDF writes commit to the comparators four colour clocks after +the write slot; an old DDFSTOP still fires on its commit clock, an old +DDFSTRT does not - vAmiga's sequencer semantics, hardware-verified in +aggregate by the vAmigaTS Agnus/DDF/DDF/oldhwstop1-4 A500 photos). A +mid-row BPLCON0 change switches the fetch-unit slot layout from its commit +clock; word addressing is unit-based, so late-enabled planes keep their +word positions and earlier words stay zero. +Wide-FMODE (quantum > 1) fetches keep the memoized value-window plan: the +effective DDF window, fetch cadence, and per-plane fetch-order mask live in +a `BitplaneSlotPlan` keyed on the register inputs (`BitplaneSlotKey`, +`src/bus.rs`). Wide-FMODE lo-res slots are packed into the first eight CCKs of each 16/32-CCK fetch unit; the rest of the unit remains available to later arbitration priorities. diff --git a/src/bus.rs b/src/bus.rs index 9b01ee6..904ff19 100644 --- a/src/bus.rs +++ b/src/bus.rs @@ -884,6 +884,20 @@ pub struct Bus { /// on each hit while keeping the `&self` owner-selection call graph intact. #[serde(skip)] bitplane_slot_plan_cache: BitplaneSlotPlanCache, + /// Bitplane DDF sequencer flop state at the start of the current line + /// (see src/bus/ddf_line.rs); carried across lines by the flop walk. + ddf_seq_line_initial: std::cell::Cell, + /// DDFSTRT/DDFSTOP values as of the start of the current line (mid-line + /// rewrites replay through `ddf_seq_writes`). + ddf_seq_line_start_regs: std::cell::Cell<(u16, u16)>, + /// (bmapen, bplcon0) as of the start of the current line; used only when + /// mid-line DMACON/BPLCON0 writes are in the log. + ddf_seq_line_start_ctl: std::cell::Cell<(bool, u16)>, + /// Register writes that reached the sequencer during the current line. + ddf_seq_writes: std::cell::RefCell>, + /// The current line's walked fetch table (rebuilt on demand). 
+ #[serde(skip)] + ddf_seq_line: std::cell::RefCell>, bus_accounting: BusAccounting, /// Latches once BEAMCON0.DUAL (A2024/UHRES) is first seen set, so the /// "not emulated" warning is logged a single time, not per write. @@ -1993,6 +2007,11 @@ impl Bus { bitplane_bplcon0_delay: None, bitplane_ddfstart_miss: None, bitplane_slot_plan_cache: BitplaneSlotPlanCache::new(), + ddf_seq_line_initial: std::cell::Cell::new(Default::default()), + ddf_seq_line_start_regs: std::cell::Cell::new((0, 0)), + ddf_seq_line_start_ctl: std::cell::Cell::new((false, 0)), + ddf_seq_writes: std::cell::RefCell::new(Vec::new()), + ddf_seq_line: std::cell::RefCell::new(None), bus_accounting: BusAccounting::from_env(), uhres_dual_warned: false, dbg_ext_cck_x100: external_access_cck_x100_setting(), @@ -6809,6 +6828,7 @@ fn palette_event_sequences_equivalent(a: &[BeamRegisterWrite], b: &[BeamRegister mod collisions; mod custom_regs; +mod ddf_line; mod dma_slots; mod frame_capture; diff --git a/src/bus/custom_regs.rs b/src/bus/custom_regs.rs index 8d9eead..9e559ab 100644 --- a/src/bus/custom_regs.rs +++ b/src/bus/custom_regs.rs @@ -350,6 +350,7 @@ impl Bus { ); } self.denise.diwstrt = val; + self.ddf_seq_invalidate_line(); self.ocs_same_line_diw_start_blocked_vpos = None; // ECS DIWHIGH only supplies the window MSBs when it is written // *after* DIWSTRT/DIWSTOP (HRM p.306). 
A later DIWSTRT/DIWSTOP @@ -382,6 +383,7 @@ impl Bus { ); } self.denise.diwstop = val; + self.ddf_seq_invalidate_line(); self.denise.diwhigh_written = false; if self.ocs_same_line_diw_start_blocked_vpos == Some(self.agnus.vpos) && !display_window_contains_vpos( @@ -415,8 +417,14 @@ impl Bus { val ); } + let previous = self.denise.ddfstrt; self.denise.ddfstrt = val; self.record_ddfstrt_write_match_miss(val); + self.ddf_seq_record_ddf_write( + super::ddf_line::DdfSeqWriteKind::Ddfstrt(val), + previous, + 4, + ); false } 0x094 => { @@ -429,7 +437,13 @@ impl Bus { val ); } + let previous = self.denise.ddfstop; self.denise.ddfstop = val; + self.ddf_seq_record_ddf_write( + super::ddf_line::DdfSeqWriteKind::Ddfstop(val), + previous, + 4, + ); false } 0x080 => { @@ -487,6 +501,19 @@ impl Bus { } if self.agnus.dmacon != previous { self.record_bitplane_dmacon_write(previous); + let en = DMACON_DMAEN | DMACON_BPLEN; + let was = previous & en == en; + let is = self.agnus.dmacon & en == en; + if was != is { + self.ddf_seq_record_write( + if is { + super::ddf_line::DdfSeqWriteKind::BmapenSet + } else { + super::ddf_line::DdfSeqWriteKind::BmapenClr + }, + 2, + ); + } } false } @@ -798,6 +825,7 @@ impl Bus { 0x1DC => { if self.blitter_ecs_registers_enabled() { self.agnus.write_beamcon0(val); + self.ddf_seq_invalidate_line(); self.refresh_paula_audio_min_period(); if val & BEAMCON0_DUAL != 0 && !self.uhres_dual_warned { log::warn!( @@ -938,6 +966,7 @@ impl Bus { self.agnus.set_ersy(val & 0x0002 != 0); if self.denise.bplcon0 != previous { self.record_bitplane_bplcon0_write(previous); + self.ddf_seq_record_bplcon0_write(self.denise.bplcon0, previous, 3); } false } @@ -1099,6 +1128,7 @@ impl Bus { if self.denise_ecs_registers() { self.denise.diwhigh = val; self.denise.diwhigh_written = true; + self.ddf_seq_invalidate_line(); } false } diff --git a/src/bus/ddf_line.rs b/src/bus/ddf_line.rs new file mode 100644 index 0000000..b0ea070 --- /dev/null +++ b/src/bus/ddf_line.rs @@ -0,0 
+1,581 @@ +//! Per-line bitplane DDF sequencer tracking: walks the +//! [`crate::chipset::ddf_sequencer`] flop model once per scanline and serves +//! the resulting fetch table to the slot arbiter and the DMA capture loop. +//! +//! The walked table replaces the value-range window logic for FMODE=0 +//! fetches: missed or invalid DDFSTRT/DDFSTOP comparators, stop drains +//! through the final fetch unit, and runs carried across line boundaries all +//! fall out of the flop walk. Wide-FMODE (AGA quantum > 1) fetches keep the +//! value-window plan; vAmiga (the flop model's hardware-verified source) has +//! no AGA counterpart to transcribe. + +use super::*; +use crate::chipset::ddf_sequencer::{self as seq, DdfSignal, DdfState}; + +/// Widest line the fetch table covers (PAL 227, NTSC long 228). +pub(super) const DDF_SEQ_MAX_LINE_CCKS: usize = 232; + +/// A DDFSTRT/DDFSTOP/BPLCON0/DMACON write that reached the sequencer during +/// the current line, at the colour clock where it takes effect. +#[derive(Clone, Copy, Debug, serde::Serialize, serde::Deserialize)] +pub(super) struct DdfSeqWrite { + pub effect_cck: u16, + pub kind: DdfSeqWriteKind, +} + +#[derive(Clone, Copy, Debug, serde::Serialize, serde::Deserialize)] +pub(super) enum DdfSeqWriteKind { + Ddfstrt(u16), + Ddfstop(u16), + Bplcon0(u16), + BmapenSet, + BmapenClr, +} + +/// One line's walked bitplane fetch table. +#[derive(Clone)] +pub(super) struct DdfSeqLine { + pub vpos: u32, + /// Plane index + 1 fetching at each colour clock; 0 = no bitplane slot. + pub plane_at: [u8; DDF_SEQ_MAX_LINE_CCKS], + /// The plane's modulo applies after the fetch at this colour clock + /// (final-unit slot). + pub modulo_at: [bool; DDF_SEQ_MAX_LINE_CCKS], + /// Words each plane fetches over the whole line. 
+ pub words_per_plane: [u16; 8], + /// The fetching plane's word ordinal at each slot colour clock (holes + /// keep their position when DMA enables mid-line: the ordinal counts + /// table slots, and unfetched earlier slots stay zero words). + pub word_idx_at: [u16; DDF_SEQ_MAX_LINE_CCKS], + /// First fetch colour clock of the line, if any. + pub first_fetch_cck: Option, + /// Sequencer state after the line's walk (becomes the next line's + /// initial state). + pub end_state: DdfState, +} + +impl Bus { + /// Whether the flop-walked fetch table drives bitplane DMA for the + /// current display settings (FMODE=0-style single-word fetches). + pub(super) fn ddf_seq_active(&self) -> bool { + crate::chipset::agnus::bitplane_fetch_quantum(self.agnus.fmode()) == 1 + } + + fn ddf_seq_ecs_rules(&self) -> bool { + !matches!(self.agnus.revision(), AgnusRevision::Ocs) + } + + /// The DDF comparator strobe positions for one register over the line, + /// honouring mid-line rewrites: within each written-value's reign, the + /// comparator fires if its position falls inside that span. The edge + /// semantics mirror vAmiga's sequencer pokes: a rewritten value only + /// fires strictly after its commit colour clock, and an old DDFSTOP + /// value still fires ON the commit clock (`invalidate(posh + 1)`) while + /// an old DDFSTRT does not (`invalidate(posh)`). 
+ fn comparator_strobes( + line_start_value: u16, + writes: &[(u16, u16)], + line_ccks: u16, + old_fires_on_commit_cck: bool, + out: &mut Vec<u16>, + ) { + let mut active = line_start_value; + let mut span_start = 0u16; + for &(effect_cck, value) in writes { + let span_end = if old_fires_on_commit_cck { + effect_cck.saturating_add(1) + } else { + effect_cck + } + .min(line_ccks); + if active >= span_start && active < span_end { + out.push(active); + } + span_start = span_start.max(effect_cck.saturating_add(1)); + active = value; + } + if active >= span_start && active < line_ccks { + out.push(active); + } + } + + /// Build (or rebuild) the fetch table for the current line from the + /// carried sequencer state and this line's register-write log. + pub(super) fn ddf_seq_build_line(&self) -> DdfSeqLine { + let vpos = self.agnus.vpos; + let line_ccks = self.agnus.current_line_cck() as u16; + let revision = self.agnus.revision(); + let mask = crate::chipset::agnus::ddf_register_mask(revision); + + let mut state = self.ddf_seq_line_initial.get(); + + // Line-granular vertical flop and DMA/control refresh: the vertical + // display window opens at DIWSTRT.V and closes at DIWSTOP.V. + state.bpv = display_window_contains_vpos( + self.denise.diwstrt, + self.denise.diwstop, + self.effective_diwhigh(), + vpos, + ); + + let writes = self.ddf_seq_writes.borrow(); + // Runtime writes always land in the log, so when the log carries no + // DMACON/BPLCON0 strobes the live values equal the line-start values; + // reading them live also keeps direct register pokes in unit tests + // coherent without a rollover.
+ let has_bmapen_write = writes.iter().any(|w| { + matches!( + w.kind, + DdfSeqWriteKind::BmapenSet | DdfSeqWriteKind::BmapenClr + ) + }); + let has_con_write = writes + .iter() + .any(|w| matches!(w.kind, DdfSeqWriteKind::Bplcon0(_))); + let start_ctl = self.ddf_seq_line_start_ctl.get(); + state.bmapen = if has_bmapen_write { + start_ctl.0 + } else { + self.agnus.dmacon & (DMACON_DMAEN | DMACON_BPLEN) == (DMACON_DMAEN | DMACON_BPLEN) + }; + state.bplcon0 = if has_con_write { + start_ctl.1 + } else { + self.denise.bplcon0 + }; + let mut strt_writes: Vec<(u16, u16)> = Vec::new(); + let mut stop_writes: Vec<(u16, u16)> = Vec::new(); + let mut extra: Vec<DdfSignal> = Vec::new(); + for w in writes.iter() { + match w.kind { + DdfSeqWriteKind::Ddfstrt(v) => strt_writes.push((w.effect_cck, v & mask)), + DdfSeqWriteKind::Ddfstop(v) => stop_writes.push((w.effect_cck, v & mask)), + DdfSeqWriteKind::Bplcon0(v) => extra.push(DdfSignal { + cck: w.effect_cck.min(line_ccks.saturating_sub(1)), + bits: seq::sig::CON, + bplcon0: v, + }), + DdfSeqWriteKind::BmapenSet => extra.push(DdfSignal { + cck: w.effect_cck.min(line_ccks.saturating_sub(1)), + bits: seq::sig::BMAPEN_SET, + bplcon0: 0, + }), + DdfSeqWriteKind::BmapenClr => extra.push(DdfSignal { + cck: w.effect_cck.min(line_ccks.saturating_sub(1)), + bits: seq::sig::BMAPEN_CLR, + bplcon0: 0, + }), + } + } + // Runtime writes always land in the log (custom_regs hooks), so a + // register with no logged write is unchanged since line start; use + // the live value then. This also keeps direct register pokes in + // unit tests coherent without a line rollover. First writes snapshot + // the pre-write value into ddf_seq_line_start_regs.
+ let logged = self.ddf_seq_line_start_regs.get(); + let start_regs = ( + if strt_writes.is_empty() { + self.denise.ddfstrt + } else { + logged.0 + }, + if stop_writes.is_empty() { + self.denise.ddfstop + } else { + logged.1 + }, + ); + let mut strt_strobes = Vec::new(); + let mut stop_strobes = Vec::new(); + Self::comparator_strobes( + start_regs.0 & mask, + &strt_writes, + line_ccks, + false, + &mut strt_strobes, + ); + Self::comparator_strobes( + start_regs.1 & mask, + &stop_writes, + line_ccks, + true, + &mut stop_strobes, + ); + for cck in strt_strobes { + extra.push(DdfSignal { + cck, + bits: seq::sig::BPHSTART, + bplcon0: 0, + }); + } + for cck in stop_strobes { + extra.push(DdfSignal { + cck, + bits: seq::sig::BPHSTOP, + bplcon0: 0, + }); + } + drop(writes); + + // The static strt/stop strobes are already covered by the log-based + // reconstruction above, so pass never-matching values to the default + // list builder and merge everything through `extra`. BEAMCON0.HARDDIS + // relaxes the hardwired stop position. 
+ let (_, hard_stop) = crate::chipset::agnus::ddf_hard_bounds(self.harddis_active()); + let signals = + seq::line_signals_with_hard_stop(0xFFFF, 0xFFFF, hard_stop, line_ccks, &extra); + let fetches = seq::walk_line( + self.aga_enabled(), + self.ddf_seq_ecs_rules(), + &signals, + &mut state, + ); + + let mut line = DdfSeqLine { + vpos, + plane_at: [0; DDF_SEQ_MAX_LINE_CCKS], + modulo_at: [false; DDF_SEQ_MAX_LINE_CCKS], + words_per_plane: [0; 8], + word_idx_at: [0; DDF_SEQ_MAX_LINE_CCKS], + first_fetch_cck: None, + end_state: state, + }; + let shres = crate::chipset::agnus::bitplane_shres(state.bplcon0); + let hires = crate::chipset::agnus::bitplane_hires(state.bplcon0); + for f in &fetches { + let idx = usize::from(f.cck); + if idx >= DDF_SEQ_MAX_LINE_CCKS { + continue; + } + let plane = usize::from(f.plane).min(7); + line.plane_at[idx] = f.plane + 1; + line.modulo_at[idx] = f.apply_modulo; + // Word addressing is unit-based: a plane enabled mid-line keeps + // fetching into its unit's word position, leaving earlier words + // zero (matching the hardware's per-unit pointer cadence). + line.word_idx_at[idx] = if shres { + f.unit_ord * 4 + u16::from(f.counter >> 1) + } else if hires { + f.unit_ord * 2 + u16::from(f.counter >= 4) + } else { + f.unit_ord + }; + line.words_per_plane[plane] = + line.words_per_plane[plane].max(line.word_idx_at[idx] + 1); + if line.first_fetch_cck.is_none() { + line.first_fetch_cck = Some(f.cck); + } + } + line + } + + /// The walked table for the current line, building it on first use. 
+ pub(super) fn ddf_seq_line_table(&self) -> std::cell::Ref<'_, DdfSeqLine> { + { + let cached = self.ddf_seq_line.borrow(); + if cached + .as_ref() + .is_some_and(|line| line.vpos == self.agnus.vpos) + { + return std::cell::Ref::map(cached, |line| line.as_ref().unwrap()); + } + } + let built = self.ddf_seq_build_line(); + *self.ddf_seq_line.borrow_mut() = Some(built); + std::cell::Ref::map(self.ddf_seq_line.borrow(), |line| line.as_ref().unwrap()) + } + + /// Invalidate the current line's table (a register write changed the + /// remaining signals). Already-consumed word counters are preserved by + /// the capture loop keying on colour clocks, not indices. + pub(super) fn ddf_seq_invalidate_line(&self) { + *self.ddf_seq_line.borrow_mut() = None; + } + + /// Record a register write reaching the sequencer this line. + pub(super) fn ddf_seq_record_write(&self, kind: DdfSeqWriteKind, delay_cck: u16) { + let effect_cck = (self.agnus.hpos as u16).saturating_add(delay_cck); + { + let mut writes = self.ddf_seq_writes.borrow_mut(); + // First control write of the line: snapshot the pre-write value + // as the line-start state (the log-empty fast path reads live + // registers, which just changed). + match kind { + DdfSeqWriteKind::BmapenSet | DdfSeqWriteKind::BmapenClr => { + if !writes.iter().any(|w| { + matches!( + w.kind, + DdfSeqWriteKind::BmapenSet | DdfSeqWriteKind::BmapenClr + ) + }) { + let mut ctl = self.ddf_seq_line_start_ctl.get(); + ctl.0 = matches!(kind, DdfSeqWriteKind::BmapenClr); + self.ddf_seq_line_start_ctl.set(ctl); + } + } + DdfSeqWriteKind::Bplcon0(_) => {} + DdfSeqWriteKind::Ddfstrt(_) | DdfSeqWriteKind::Ddfstop(_) => {} + } + writes.push(DdfSeqWrite { effect_cck, kind }); + } + self.ddf_seq_invalidate_line(); + } + + /// Record a BPLCON0 write, snapshotting the pre-write value on the first + /// control write of the line. 
+ pub(super) fn ddf_seq_record_bplcon0_write(&self, value: u16, previous: u16, delay_cck: u16) { + { + let writes = self.ddf_seq_writes.borrow(); + if !writes + .iter() + .any(|w| matches!(w.kind, DdfSeqWriteKind::Bplcon0(_))) + { + let mut ctl = self.ddf_seq_line_start_ctl.get(); + ctl.1 = previous; + self.ddf_seq_line_start_ctl.set(ctl); + } + } + self.ddf_seq_record_write(DdfSeqWriteKind::Bplcon0(value), delay_cck); + } + + /// Record a DDFSTRT/DDFSTOP write, snapshotting the pre-write values on + /// the first DDF write of the line. + pub(super) fn ddf_seq_record_ddf_write( + &self, + kind: DdfSeqWriteKind, + previous: u16, + delay_cck: u16, + ) { + { + let writes = self.ddf_seq_writes.borrow(); + let (had_strt, had_stop) = writes.iter().fold((false, false), |acc, w| match w.kind { + DdfSeqWriteKind::Ddfstrt(_) => (true, acc.1), + DdfSeqWriteKind::Ddfstop(_) => (acc.0, true), + _ => acc, + }); + let mut regs = self.ddf_seq_line_start_regs.get(); + match kind { + DdfSeqWriteKind::Ddfstrt(_) if !had_strt => regs.0 = previous, + DdfSeqWriteKind::Ddfstop(_) if !had_stop => regs.1 = previous, + _ => {} + } + if !had_strt && !had_stop { + // The other register was untouched this line: its line-start + // value is the live one. + if matches!(kind, DdfSeqWriteKind::Ddfstrt(_)) { + regs.1 = self.denise.ddfstop; + } else { + regs.0 = self.denise.ddfstrt; + } + } + self.ddf_seq_line_start_regs.set(regs); + } + self.ddf_seq_record_write(kind, delay_cck); + } + + /// FMODE=0 bitplane DMA capture driven by the walked fetch table: + /// fetches the assigned plane's word at each table slot, feeds Denise, + /// advances the plane pointer, and applies the plane's modulo at its + /// final-unit fetch. Replaces the value-window capture loop wholesale + /// when the sequencer table is active. 
+ pub(super) fn capture_bitplane_dma_words_fsm( + &mut self, + vpos: u32, + old_hpos: u32, + new_hpos: u32, + old_emulated_cck: u64, + ) { + if self.ocs_same_line_diw_start_blocked_vpos == Some(vpos) { + return; + } + if self.mem.chip_ram.is_empty() { + return; + } + let display_bplcon0 = self.effective_bitplane_bplcon0_at(old_emulated_cck); + let display_planes = + BitplaneMode::from_bplcon0(display_bplcon0, self.aga_enabled()).display_planes(); + let Some(fb_y) = visible_framebuffer_y( + vpos, + self.current_frame_visible_start_vpos, + self.current_frame_geometry.visible_lines, + ) else { + // Lines outside the captured framebuffer advance no pointers, + // matching the pre-FSM capture (and the vAmiga reference dumps: + // diwv3/diwv4 pin this - a DIWSTRT.V inside vertical blanking + // must not skew the visible rows' pointer progression). + return; + }; + let (plane_at, modulo_at, word_idx_at, words_per_row) = { + let table = self.ddf_seq_line_table(); + let wpr = table.words_per_plane.iter().copied().max().unwrap_or(0) as usize; + (table.plane_at, table.modulo_at, table.word_idx_at, wpr) + }; + if words_per_row == 0 { + return; + } + let addr_mask = self.chip_dma_mask; + let end = new_hpos.min(DDF_SEQ_MAX_LINE_CCKS as u32); + let mut slots = 0usize; + let mut rows_started = 0usize; + for hpos in old_hpos..end { + let slot = plane_at[hpos as usize]; + if slot == 0 { + continue; + } + let plane = usize::from(slot - 1); + if plane == 0 { + self.record_sprite_display_enable_for_bitplane_dma(vpos); + } + let word_idx = usize::from(word_idx_at[hpos as usize]); + let addr = self.display_dma_bplpt[plane] & addr_mask; + let fetched = read_chip_word_wrapping(&self.mem.chip_ram, addr); + self.data_bus = fetched; + let dma_planes = plane_count_from_table(&plane_at); + if self.capture_bitplane_fetch_word( + fb_y, + display_planes, + dma_planes, + words_per_row, + plane, + word_idx.min(words_per_row.saturating_sub(1)), + fetched, + ) { + rows_started += 1; + } + 
self.denise.write_bpldat(plane, fetched); + self.display_dma_bplpt[plane] = + self.display_dma_bplpt[plane].wrapping_add(2) & addr_mask; + if modulo_at[hpos as usize] { + let modulo = self.display_dma_modulo_for_plane(plane, vpos); + self.display_dma_bplpt[plane] = + ((self.display_dma_bplpt[plane] as i64).wrapping_add(modulo as i64) as u32) + & addr_mask; + } + slots += 1; + } + if slots != 0 { + self.record_bitplane_fetch_timing(slots, rows_started, 0, None); + } + } + + /// Line rollover: finalize the ending line's walk, carry the sequencer + /// state, and reset the per-line write log. `ended_vpos` is the line + /// that just finished. + pub(super) fn ddf_seq_on_line_rollover(&mut self, ended_vpos: u32) { + let end_state = { + let cached = self.ddf_seq_line.borrow(); + match cached.as_ref() { + Some(line) if line.vpos == ended_vpos => Some(line.end_state), + _ => None, + } + }; + let end_state = end_state.unwrap_or_else(|| { + // The table was never built (or was invalidated) for the ended + // line: walk it now, against the ended line's vpos, so the + // carried state stays exact. + let vpos_backup = self.agnus.vpos; + self.agnus.vpos = ended_vpos; + let line = self.ddf_seq_build_line(); + self.agnus.vpos = vpos_backup; + line.end_state + }); + self.ddf_seq_line_initial.set(end_state); + self.ddf_seq_line_start_regs + .set((self.denise.ddfstrt, self.denise.ddfstop)); + self.ddf_seq_line_start_ctl.set(( + self.agnus.dmacon & (DMACON_DMAEN | DMACON_BPLEN) == (DMACON_DMAEN | DMACON_BPLEN), + self.denise.bplcon0, + )); + self.ddf_seq_writes.borrow_mut().clear(); + self.ddf_seq_invalidate_line(); + } +} + +/// DMA plane count implied by the walked table (highest plane with a slot). 
+fn plane_count_from_table(plane_at: &[u8; DDF_SEQ_MAX_LINE_CCKS]) -> usize { + plane_at.iter().copied().max().unwrap_or(0) as usize +} + +#[cfg(test)] +mod tests { + use super::super::tests::empty_bus; + use super::*; + + #[test] + fn standard_window_table_matches_value_model() { + let mut bus = empty_bus(); + bus.agnus.dmacon = DMACON_DMAEN | DMACON_BPLEN; + bus.denise.diwstrt = 0x2C81; + bus.denise.diwstop = 0x2CC1; + bus.denise.ddfstrt = 0x0038; + bus.denise.ddfstop = 0x00D0; + bus.denise.bplcon0 = 0x4200; // 4 planes lores + bus.agnus.vpos = 0x50; + bus.ddf_seq_on_line_rollover(0x4F); + + let table = bus.ddf_seq_line_table(); + assert_eq!(table.first_fetch_cck, Some(0x39)); + assert_eq!(table.words_per_plane[0], 20); + assert_eq!(table.words_per_plane[3], 20); + assert_eq!(table.words_per_plane[4], 0); + // Plane 1 fetches at the end of each unit ($3F, $47, ...). + assert_eq!(table.plane_at[0x3F], 1); + assert_eq!(table.plane_at[0x38], 0); + } + + #[test] + fn invalid_stop_extends_the_run_to_the_hard_stop() { + let mut bus = empty_bus(); + bus.agnus.dmacon = DMACON_DMAEN | DMACON_BPLEN; + bus.denise.diwstrt = 0x2C81; + bus.denise.diwstop = 0x2CC1; + bus.denise.ddfstrt = 0x0060; + bus.denise.ddfstop = 0x00FF; // never matches ($FC-masked to $FC, still past the line's RHW drain) + bus.denise.bplcon0 = 0x4200; + bus.agnus.vpos = 0x50; + bus.ddf_seq_on_line_rollover(0x4F); + + let table = bus.ddf_seq_line_table(); + // Run from $60 to the hard-stop drain: ($D8 - $60) / 8 = 15 units, + // then the $D8 unit drains as the final unit: 16 words. 
+ assert_eq!(table.words_per_plane[0], 16); + assert!(table.plane_at[0xDF] != 0); + } + + #[test] + fn vertical_window_gates_the_walk() { + let mut bus = empty_bus(); + bus.agnus.dmacon = DMACON_DMAEN | DMACON_BPLEN; + bus.denise.diwstrt = 0x2C81; + bus.denise.diwstop = 0x2CC1; + bus.denise.ddfstrt = 0x0038; + bus.denise.ddfstop = 0x00D0; + bus.denise.bplcon0 = 0x4200; + bus.agnus.vpos = 0x10; // above DIWSTRT.V + bus.ddf_seq_on_line_rollover(0x0F); + + let table = bus.ddf_seq_line_table(); + assert_eq!(table.first_fetch_cck, None); + } + + #[test] + fn mid_line_stop_rewrite_reaches_the_walk() { + let mut bus = empty_bus(); + bus.agnus.dmacon = DMACON_DMAEN | DMACON_BPLEN; + bus.denise.diwstrt = 0x2C81; + bus.denise.diwstop = 0x2CC1; + bus.denise.ddfstrt = 0x0038; + bus.denise.ddfstop = 0x00D0; + bus.denise.bplcon0 = 0x4200; + bus.agnus.vpos = 0x50; + bus.ddf_seq_on_line_rollover(0x4F); + // Beam early in the line: rewrite DDFSTOP to $60. + bus.agnus.hpos = 0x20; + bus.denise.ddfstop = 0x0060; + bus.ddf_seq_record_write(DdfSeqWriteKind::Ddfstop(0x0060), 4); + + let table = bus.ddf_seq_line_table(); + // Stop at $60: units $38..$60 = 5, plus the $60 drain unit: 6 words. + assert_eq!(table.words_per_plane[0], 6); + } +} diff --git a/src/bus/dma_slots.rs b/src/bus/dma_slots.rs index 1d76a1d..450a5ef 100644 --- a/src/bus/dma_slots.rs +++ b/src/bus/dma_slots.rs @@ -200,6 +200,10 @@ impl Bus { if tick.new_lines != 0 || tick.new_frames != 0 { self.bitplane_ddfstart_miss = None; self.ocs_same_line_diw_start_blocked_vpos = None; + // Carry the DDF sequencer flops into the new line. Quanta are at + // most a few colour clocks, so exactly one line boundary can be + // crossed per advance. 
+ self.ddf_seq_on_line_rollover(old_vpos); } let display_start = self.display_start_vpos_for_current_control(); if tick.new_frames == 0 && old_vpos < display_start && self.agnus.vpos >= display_start { @@ -674,6 +678,14 @@ impl Bus { } pub(super) fn bitplane_slot_active_at(&self, vpos: u32, hpos: u32) -> bool { + if self.ddf_seq_active() { + // FMODE=0: the walked DDF sequencer table owns the decision + // (vertical window, comparator flops, stop drains, carried runs). + let _ = vpos; + let table = self.ddf_seq_line_table(); + return (hpos as usize) < super::ddf_line::DDF_SEQ_MAX_LINE_CCKS + && table.plane_at[hpos as usize] != 0; + } // Bitplane DMA only runs inside the vertical display window (set at // DIWSTRT.V, cleared at DIWSTOP.V), so the top-border and vertical- // blank lines are free for the blitter/CPU. Rejecting this before the diff --git a/src/bus/frame_capture.rs b/src/bus/frame_capture.rs index bb6266f..3daff0f 100644 --- a/src/bus/frame_capture.rs +++ b/src/bus/frame_capture.rs @@ -1265,6 +1265,10 @@ impl Bus { if self.ocs_same_line_diw_start_blocked_vpos == Some(vpos) { return; } + if self.ddf_seq_active() { + self.capture_bitplane_dma_words_fsm(vpos, old_hpos, new_hpos, old_emulated_cck); + return; + } let display_bplcon0 = self.effective_bitplane_bplcon0_at(old_emulated_cck); let mode = BitplaneMode::from_bplcon0(display_bplcon0, self.aga_enabled()); let display_planes = mode.display_planes(); diff --git a/src/bus/tests.rs b/src/bus/tests.rs index 4d8b7c4..b13fd84 100644 --- a/src/bus/tests.rs +++ b/src/bus/tests.rs @@ -205,7 +205,7 @@ impl AudioSink for CollectAudio { fn flush(&mut self) {} } -fn empty_bus() -> Bus { +pub(super) fn empty_bus() -> Bus { empty_bus_with_chip_ram(512 * 1024) } @@ -2714,10 +2714,11 @@ fn bitplane_dma_capture_clips_ddfstart_to_hard_fetch_window() { bus.advance_chipset(10); - let row = bus.frame_captured_bitplane_rows()[0].as_ref().unwrap(); - assert_eq!(row.words_per_row, 1); - assert_eq!(row.planes[0], vec![0x1111]); 
- assert_eq!(bus.display_dma_bplpt[0], 0x0102); + // Flop model: the DDFSTRT comparator at $10 fires while the hardware + // start window (SHW, $18) is still down, so OCS never starts a run; + // the old value-window model clamped the start to $18 and fetched. + assert!(bus.frame_captured_bitplane_rows()[0].is_none()); + assert_eq!(bus.display_dma_bplpt[0], 0x0100); } #[test] @@ -2785,7 +2786,7 @@ fn wide_fmode_dma_capture_packs_lores_slots_in_fetch_units() { } #[test] -fn ecs_bitplane_dma_capture_stops_equal_ddf_window_after_one_fetch_cycle() { +fn ecs_bitplane_dma_capture_extends_equal_ddf_window_to_hard_stop() { let mut bus = empty_bus(); bus.set_agnus_revision(AgnusRevision::Ecs8372Rev4); bus.agnus.dmacon = DMACON_DMAEN | DMACON_BPLEN; @@ -2803,10 +2804,15 @@ fn ecs_bitplane_dma_capture_stops_equal_ddf_window_after_one_fetch_cycle() { bus.advance_chipset(0x00E0 - 0x003E); + // Flop model: the merged equal DDFSTRT/DDFSTOP strobe starts the run + // with no stop pending on ECS too (the stop flop only latches while a + // run is up), so the fetch extends to the hardware-stop drain: + // 21 units at $38..$DF. 
let row = bus.frame_captured_bitplane_rows()[0].as_ref().unwrap(); - assert_eq!(row.words_per_row, 1); - assert_eq!(row.planes[0], vec![0xCAFE]); - assert_eq!(bus.display_dma_bplpt[0], 0x0102); + assert_eq!(row.words_per_row, 21); + assert_eq!(row.planes[0][0], 0xCAFE); + assert_eq!(row.planes[0][1], 0xBEEF); + assert_eq!(bus.display_dma_bplpt[0], 0x0100 + 21 * 2); } #[test] @@ -2829,10 +2835,12 @@ fn bitplane_dmacon_enable_reaches_fetcher_after_two_cck() { assert!(!bus.write_custom_word_from(0x096, 0x8000 | DMACON_BPLEN, BeamWriteSource::Cpu)); bus.advance_chipset(0x0048 - 0x003E); - let row = bus.frame_captured_bitplane_rows()[0].as_ref().unwrap(); - assert_eq!(row.words_per_row, 2); - assert_eq!(row.planes[0], vec![0x0000, 0x1111]); - assert_eq!(bus.display_dma_bplpt[0], 0x0102); + // The enable reaches the sequencer two colour clocks after the write, + // at $40 - the same strobe as the DDFSTOP match, which clears the + // latched BPHSTART before the BMAPEN logic evaluates: no run starts + // (flop model; the earlier $3F slot stayed idle either way). + assert!(bus.frame_captured_bitplane_rows()[0].is_none()); + assert_eq!(bus.display_dma_bplpt[0], 0x0100); } #[test] @@ -2855,9 +2863,12 @@ fn bitplane_dmacon_clear_reaches_fetcher_after_two_cck() { assert!(!bus.write_custom_word_from(0x096, DMACON_BPLEN, BeamWriteSource::Cpu)); bus.advance_chipset(0x0048 - 0x003E); + // The clear reaches the sequencer two colour clocks after the write + // ($40) and drops BPRUN immediately: the DDFSTOP drain unit does not + // run, so only the $3F fetch of the first unit happened. 
let row = bus.frame_captured_bitplane_rows()[0].as_ref().unwrap(); - assert_eq!(row.words_per_row, 2); - assert_eq!(row.planes[0], vec![0x1111, 0x0000]); + assert_eq!(row.words_per_row, 1); + assert_eq!(row.planes[0], vec![0x1111]); assert_eq!(bus.display_dma_bplpt[0], 0x0102); } @@ -2907,9 +2918,12 @@ fn bitplane_bplcon0_clear_reaches_fetcher_after_three_cck() { assert!(!bus.write_custom_word_from(0x100, 0x0000, BeamWriteSource::Cpu)); bus.advance_chipset(0x0048 - 0x003D); + // The BPLCON0 clear reaches the sequencer three colour clocks after the + // write ($40): the DDFSTOP drain unit still runs, but with zero planes + // it carries no fetch slots, so only the $3F fetch happened. let row = bus.frame_captured_bitplane_rows()[0].as_ref().unwrap(); - assert_eq!(row.words_per_row, 2); - assert_eq!(row.planes[0], vec![0x1111, 0x0000]); + assert_eq!(row.words_per_row, 1); + assert_eq!(row.planes[0], vec![0x1111]); assert_eq!(bus.display_dma_bplpt[0], 0x0102); } @@ -2988,13 +3002,18 @@ fn bitplane_ddfstrt_write_before_match_starts_current_line() { write_chip_word(&mut bus, 0x0100, 0x1111); write_chip_word(&mut bus, 0x0102, 0x2222); - assert!(!bus.write_custom_word_from(0x092, 0x0038, BeamWriteSource::Cpu)); - bus.advance_chipset(0x0048 - 0x0037); + // A DDFSTRT write commits to the comparator four colour clocks after + // the write slot (vAmiga's DMA_CYCLES(4) model): written at $37 it is + // live from $3B, so a match position of $44 still fires this line. The + // run then misses the already-passed $40 stop and extends to the + // hardware-stop drain. 
+ assert!(!bus.write_custom_word_from(0x092, 0x0044, BeamWriteSource::Cpu)); + bus.advance_chipset(0x004C - 0x0037); let row = bus.frame_captured_bitplane_rows()[0].as_ref().unwrap(); - assert_eq!(row.words_per_row, 2); - assert_eq!(row.planes[0], vec![0x1111, 0x2222]); - assert_eq!(bus.display_dma_bplpt[0], 0x0104); + assert_eq!(row.words_per_row, 19); + assert_eq!(row.planes[0][0], 0x1111); + assert_eq!(bus.display_dma_bplpt[0], 0x0102); } #[test] @@ -3020,7 +3039,10 @@ fn bitplane_dma_capture_scans_fetch_window_independent_of_owner_hint() { assert_eq!(cck, 1); assert_eq!(tick.new_lines, 0); let row = bus.frame_captured_bitplane_rows()[0].as_ref().unwrap(); - assert_eq!(row.planes[0], vec![0xCAFE]); + // Equal DDFSTRT/DDFSTOP plans to the hardware-stop drain (21 units); + // the $3F fetch executed despite the Idle owner hint. + assert_eq!(row.words_per_row, 21); + assert_eq!(row.planes[0][0], 0xCAFE); assert_eq!(bus.display_dma_bplpt[0], 0x0102); } @@ -3050,7 +3072,10 @@ fn bitplane_dma_capture_maps_early_vertical_overscan_to_first_framebuffer_row() RENDER_MIN_OVERSCAN_START_VPOS ); let row = bus.frame_captured_bitplane_rows()[0].as_ref().unwrap(); - assert_eq!(row.planes[0], vec![0xCAFE]); + // Equal DDFSTRT/DDFSTOP plans to the hardware-stop drain; one unit has + // fetched so far. + assert_eq!(row.words_per_row, 21); + assert_eq!(row.planes[0][0], 0xCAFE); assert_eq!(bus.display_dma_bplpt[0], 0x0102); } @@ -3316,7 +3341,11 @@ fn bitplane_dma_capture_keeps_pal_overscan_bottom_rows() { let row = bus.frame_captured_bitplane_rows()[last_overscan_line] .as_ref() .unwrap(); - assert_eq!(row.planes[0], vec![0xFACE]); + // Equal DDFSTRT/DDFSTOP: the merged strobe starts the run without a + // pending stop, so the plan extends to the hardware-stop drain (21 + // units); only the first unit's fetch has executed at this point. 
+ assert_eq!(row.words_per_row, 21); + assert_eq!(row.planes[0][0], 0xFACE); assert_eq!(bus.display_dma_bplpt[0], 0x0102); } @@ -3340,14 +3369,18 @@ fn bitplane_dma_capture_preserves_words_when_ddfstop_extends_same_line() { bus.write_custom_word_from(0x094, 0x0040, BeamWriteSource::Cpu); bus.advance_chipset(8); + // The equal start/stop strobe already started a run with no stop + // pending, so the plan runs to the hardware-stop drain; the new $40 + // stop commits at $44, after $40 has passed, and never matches. let row = bus.frame_captured_bitplane_rows()[0].as_ref().unwrap(); - assert_eq!(row.words_per_row, 2); - assert_eq!(row.planes[0], vec![0x1111, 0x2222]); + assert_eq!(row.words_per_row, 21); + assert_eq!(row.planes[0][0], 0x1111); + assert_eq!(row.planes[0][1], 0x2222); assert_eq!(bus.display_dma_bplpt[0], 0x0104); } #[test] -fn bitplane_dma_capture_leaves_unfetched_words_zero_when_ddfstop_shrinks_same_line() { +fn bitplane_ddfstop_shrink_write_commits_too_late_to_cancel_the_match() { let mut bus = empty_bus(); bus.set_agnus_revision(AgnusRevision::Ecs8372Rev4); bus.agnus.dmacon = DMACON_DMAEN | DMACON_BPLEN; @@ -3364,13 +3397,16 @@ fn bitplane_dma_capture_leaves_unfetched_words_zero_when_ddfstop_shrinks_same_li write_chip_word(&mut bus, 0x0102, 0x2222); bus.advance_chipset(2); + // Written at $40, the new stop commits at $44 - after the old $40 + // value already matched, so the stop request stands and the drain + // unit fetches the second word regardless of the rewrite. 
bus.write_custom_word_from(0x094, 0x0038, BeamWriteSource::Cpu); bus.advance_chipset(8); let row = bus.frame_captured_bitplane_rows()[0].as_ref().unwrap(); assert_eq!(row.words_per_row, 2); - assert_eq!(row.planes[0], vec![0x1111, 0x0000]); - assert_eq!(bus.display_dma_bplpt[0], 0x0102); + assert_eq!(row.planes[0], vec![0x1111, 0x2222]); + assert_eq!(bus.display_dma_bplpt[0], 0x0104); } #[test] @@ -8269,10 +8305,12 @@ fn bitplane_dma_ownership_clips_ddfstart_to_hard_fetch_window() { bus.denise.ddfstop = 0x0018; bus.agnus.vpos = 0x40; // inside the default vertical display window + // Flop model: the DDFSTRT comparator at $10 fires while the hardware + // start window (SHW, $18) is still down; OCS starts no run at all. bus.agnus.hpos = 0x017; assert_eq!(bus.scheduled_dma_owner(false), ChipBusOwner::Idle); bus.agnus.hpos = 0x01F; - assert_eq!(bus.scheduled_dma_owner(false), ChipBusOwner::Bitplane); + assert_eq!(bus.scheduled_dma_owner(false), ChipBusOwner::Idle); bus.agnus.hpos = 0x020; assert_eq!(bus.scheduled_dma_owner(false), ChipBusOwner::Idle); } @@ -8307,6 +8345,9 @@ fn bitplane_dma_ownership_matches_revision_for_equal_ddf_window() { assert_eq!(ocs.scheduled_dma_owner(false), ChipBusOwner::Bitplane); ocs.agnus.hpos = 0x0E0; assert_eq!(ocs.scheduled_dma_owner(false), ChipBusOwner::Idle); + // Flop model: ECS behaves like OCS here - the merged equal-value + // strobe starts a run with no stop pending, which only the + // hardware-stop drain ends. 
let mut ecs = empty_bus(); ecs.set_agnus_revision(AgnusRevision::Ecs8372Rev4); @@ -8317,6 +8358,10 @@ fn bitplane_dma_ownership_matches_revision_for_equal_ddf_window() { ecs.agnus.vpos = 0x40; // inside the default vertical display window ecs.agnus.hpos = 0x047; + assert_eq!(ecs.scheduled_dma_owner(false), ChipBusOwner::Bitplane); + ecs.agnus.hpos = 0x0DF; + assert_eq!(ecs.scheduled_dma_owner(false), ChipBusOwner::Bitplane); + ecs.agnus.hpos = 0x0E0; assert_eq!(ecs.scheduled_dma_owner(false), ChipBusOwner::Idle); } @@ -8329,11 +8374,17 @@ fn hires_bitplane_dma_ownership_uses_four_cck_fetch_cadence() { bus.denise.ddfstop = 0x0040; bus.agnus.vpos = 0x40; // inside the default vertical display window + // Flop model (vAmiga fetch tables): a hires unit carries H4 H2 H3 H1 + // twice over its 8 colour clocks, so a single-plane display only + // reserves the H1 slots at unit offsets 3 and 7; the other clocks are + // free for the copper/CPU. bus.agnus.hpos = 0x038; - assert_eq!(bus.scheduled_dma_owner(false), ChipBusOwner::Bitplane); - bus.agnus.hpos = 0x03A; assert_eq!(bus.scheduled_dma_owner(false), ChipBusOwner::Idle); + bus.agnus.hpos = 0x03B; + assert_eq!(bus.scheduled_dma_owner(false), ChipBusOwner::Bitplane); bus.agnus.hpos = 0x03C; + assert_eq!(bus.scheduled_dma_owner(false), ChipBusOwner::Idle); + bus.agnus.hpos = 0x03F; assert_eq!(bus.scheduled_dma_owner(false), ChipBusOwner::Bitplane); } diff --git a/src/chipset/ddf_sequencer.rs b/src/chipset/ddf_sequencer.rs new file mode 100644 index 0000000..ba57e93 --- /dev/null +++ b/src/chipset/ddf_sequencer.rs @@ -0,0 +1,631 @@ +//! Agnus bitplane DDF sequencer: the per-colour-clock start/stop flop model. +//! +//! The bitplane fetch window is NOT a simple [DDFSTRT, DDFSTOP] value range. +//! Agnus runs a small synchronous state machine on comparator EDGES: DDFSTRT +//! and DDFSTOP matches set/clear flip-flops, the hardwired window ($18/$D8) +//! 
gates and force-stops runs, a stop request drains through one final fetch +//! unit (which applies the modulos), and the whole state carries across line +//! boundaries. Missed comparators (values rewritten too late, or values that +//! never match) therefore produce fetch runs that a value-range model cannot +//! express: runs to the hardware stop, runs that wrap through horizontal +//! blanking into the next line, and lines with no run at all. +//! +//! The flop semantics are transcribed from vAmiga 4.4's Sequencer (OCS and +//! ECS variants, hardware-verified by the vAmigaTS Agnus/DDF suite). The +//! aggregate behaviour is pinned to real hardware by the +//! Agnus/DDF/DDF/oldhwstop1-4 A500 photos: the colour-swatch band below the +//! experiment rows encodes every preceding row's fetched word count through +//! the bitplane pointer progression, and the photos match this model's +//! output. +//! +//! This module is deliberately free-standing (no Bus/Agnus state): callers +//! feed a signal list for one line and the carried [`DdfState`], and receive +//! the per-cck fetch events. + +/// One horizontal line's colour-clock count is supplied by the caller (PAL +/// 227; programmable modes differ). The hardwired fetch window is fixed. +pub const DDF_HARD_START_CCK: u16 = 0x18; +pub const DDF_HARD_STOP_CCK: u16 = 0xD8; + +/// Signal bits, mirroring the hardware comparator strobes. Multiple signals +/// can coincide on one colour clock (e.g. DDFSTRT == DDFSTOP), which the +/// flop logic decodes as a distinct case. +pub mod sig { + pub const SHW: u32 = 1 << 0; + pub const RHW: u32 = 1 << 1; + pub const BPHSTART: u32 = 1 << 2; + pub const BPHSTOP: u32 = 1 << 3; + pub const BMAPEN_CLR: u32 = 1 << 4; + pub const BMAPEN_SET: u32 = 1 << 5; + pub const VFLOP_SET: u32 = 1 << 6; + pub const VFLOP_CLR: u32 = 1 << 7; + pub const CON: u32 = 1 << 8; + pub const DONE: u32 = 1 << 9; +} + +/// A signal strobe at a colour clock. 
`bplcon0` carries the new control +/// value for `sig::CON` strobes (BPLCON0 writes reaching Agnus). +#[derive(Clone, Copy, Debug, PartialEq, Eq)] +pub struct DdfSignal { + pub cck: u16, + pub bits: u32, + pub bplcon0: u16, +} + +/// The sequencer flip-flops. Carried across line boundaries; a line's walk +/// starts from the previous line's final state. +#[derive(Clone, Copy, Debug, Default, PartialEq, Eq, serde::Serialize, serde::Deserialize)] +pub struct DdfState { + /// Vertical DIW flip-flop (bitplane DMA enabled vertically). + pub bpv: bool, + /// DMACON master+bitplane DMA enable. + pub bmapen: bool, + /// Past the hardwired start ($18). OCS: cleared when a fetch unit + /// completes; ECS: cleared at end of line. + pub shw: bool, + /// Past the hardwired stop ($D8). + pub rhw: bool, + /// DDFSTRT comparator flip-flop. + pub bphstart: bool, + /// DDFSTOP comparator flip-flop. + pub bphstop: bool, + /// Bitplane fetch running. + pub bprun: bool, + /// The final fetch unit (modulos apply) is in progress. + pub last_fu: bool, + /// A stop was requested; honoured at the next fetch-unit boundary. + pub stopreq: bool, + /// Fetch-unit position counter (2 colour clocks per step, 4 steps). + pub cnt: u8, + /// The BPLCON0 value the sequencer currently sees. + pub bplcon0: u16, +} + +/// One bitplane fetch slot produced by the walk. +#[derive(Clone, Copy, Debug, PartialEq, Eq)] +pub struct DdfFetch { + pub cck: u16, + /// 0-based plane index. + pub plane: u8, + /// Fetch-unit ordinal within the line's run(s): which 8-cck unit this + /// slot belongs to, counting units the sequencer actually ran. Word + /// addressing is unit-based on the hardware, so a plane enabled + /// mid-line keeps fetching at its unit's word position. + pub unit_ord: u16, + /// Unit offset of the slot (0..7), for sub-unit word placement in + /// hires/SHRES units. + pub counter: u8, + /// This is the plane's fetch in the final unit: the plane's modulo is + /// added after the word (BPLxMOD). 
+ pub apply_modulo: bool, +} + +fn plane_count(bplcon0: u16, aga: bool) -> u8 { + crate::chipset::agnus::bitplane_dma_planes(bplcon0, aga) as u8 +} + +/// Per-unit fetch layout: `slots[counter]` = Some(plane) when a DMA slot for +/// that plane sits at unit offset `counter` (0..7 colour clocks; hires +/// fetches two words per unit per plane; SHRES four). +fn unit_slot(bplcon0: u16, aga: bool, counter: u8) -> Option<u8> { + let planes = plane_count(bplcon0, aga); + let has = |p: u8| -> Option<u8> { (planes >= p).then_some(p - 1) }; + if crate::chipset::agnus::bitplane_shres(bplcon0) { + match counter & 1 { + 0 => has(2), + _ => has(1), + } + } else if crate::chipset::agnus::bitplane_hires(bplcon0) { + match counter & 3 { + 0 => has(4), + 1 => has(2), + 2 => has(3), + _ => has(1), + } + } else { + match counter { + 1 => has(4), + 2 => has(6), + 3 => has(2), + 5 => has(3), + 6 => has(5), + 7 => has(1), + _ => None, + } + } +} + +/// Whether a fetch at unit offset `counter` in the final unit is the plane's +/// last of that unit (modulo applies). Lores planes fetch once per unit +/// (always last); hires planes fetch twice (second half is last); SHRES four +/// times (last quarter). +fn modulo_slot(bplcon0: u16, counter: u8) -> bool { + if crate::chipset::agnus::bitplane_shres(bplcon0) { + counter >= 6 + } else if crate::chipset::agnus::bitplane_hires(bplcon0) { + counter >= 4 + } else { + true + } +} + +/// Emulate the flop updates for one signal strobe. Transcribed from +/// vAmiga's `Sequencer::processSignal` (OCS and ECS variants).
+fn process_signal(ecs: bool, signal: &DdfSignal, state: &mut DdfState) { + let bits = signal.bits; + + if bits & sig::CON != 0 { + state.bplcon0 = signal.bplcon0; + } + + if ecs { + process_signal_ecs(bits, state); + } else { + process_signal_ocs(bits, state); + } +} + +fn process_signal_ocs(bits: u32, state: &mut DdfState) { + match bits & (sig::BMAPEN_CLR | sig::BMAPEN_SET) { + x if x == sig::BMAPEN_CLR => { + state.bmapen = false; + state.bprun = false; + state.cnt = 0; + } + x if x == sig::BMAPEN_SET => { + state.bmapen = true; + } + _ => {} + } + match bits & (sig::VFLOP_SET | sig::VFLOP_CLR) { + x if x == sig::VFLOP_SET => { + state.bpv = true; + } + x if x == sig::VFLOP_CLR => { + state.bpv = false; + state.bprun = false; + state.cnt = 0; + } + _ => {} + } + match bits & (sig::SHW | sig::RHW) { + x if x == sig::SHW => { + state.shw = true; + } + x if x == sig::RHW => { + state.rhw |= state.bprun; + state.stopreq |= state.bprun; + } + _ => {} + } + match bits & (sig::BPHSTART | sig::BPHSTOP) { + x if x == sig::BPHSTART | sig::BPHSTOP => { + if state.bprun { + state.bphstart &= !state.bprun; + state.bphstop |= state.bprun; + state.stopreq |= state.bprun; + } else { + state.bphstart = state.bphstart || state.shw; + state.bprun = (state.bprun || state.shw) && state.bpv && state.bmapen; + } + } + x if x == sig::BPHSTART => { + state.bphstart |= state.shw && state.bmapen; + state.bprun = (state.bprun || state.shw) && state.bpv && state.bmapen; + } + x if x == sig::BPHSTOP => { + state.bphstart &= !state.bprun; + state.bphstop |= state.bprun; + state.stopreq |= state.bprun; + } + _ => {} + } + if bits & sig::DONE != 0 { + state.rhw = false; + state.stopreq = false; + } +} + +fn process_signal_ecs(bits: u32, state: &mut DdfState) { + match bits & (sig::VFLOP_SET | sig::VFLOP_CLR) { + x if x == sig::VFLOP_SET => { + state.bpv = true; + } + x if x == sig::VFLOP_CLR => { + state.bpv = false; + state.bprun = false; + state.cnt = 0; + } + _ => {} + } + match bits & 
(sig::SHW | sig::RHW) { + x if x == sig::SHW => { + state.shw = true; + state.bprun |= state.bphstart && bits & sig::BPHSTOP == 0; + } + x if x == sig::RHW => { + state.rhw = true; + state.stopreq |= state.bprun; + } + _ => {} + } + match bits & (sig::BPHSTART | sig::BPHSTOP | sig::SHW | sig::RHW) { + x if x == sig::BPHSTART | sig::BPHSTOP | sig::SHW => { + state.bphstart = true; + state.bprun = (state.bprun || state.shw) && state.bpv && state.bmapen; + } + x if x == sig::BPHSTART | sig::BPHSTOP | sig::RHW => { + state.bphstop |= state.bprun; + state.stopreq |= state.bprun; + state.bphstart = true; + } + x if x == sig::BPHSTART | sig::BPHSTOP => { + state.bphstop |= state.bprun; + state.stopreq |= state.bprun; + // vAmiga: "likely fix for test case arosddf2 and arosddf4". + state.bphstart = state.bpv; + state.bprun = (state.bprun || state.shw) && state.bpv && state.bmapen; + } + x if x == sig::BPHSTART + || x == sig::BPHSTART | sig::SHW + || x == sig::BPHSTART | sig::RHW => + { + state.bphstart = true; + state.bprun = (state.bprun || state.shw) && state.bpv && state.bmapen; + } + x if x == sig::BPHSTOP || x == sig::BPHSTOP | sig::SHW || x == sig::BPHSTOP | sig::RHW => { + state.bphstart = false; + state.bphstop |= state.bprun; + state.stopreq |= state.bprun; + } + _ => {} + } + match bits & (sig::BMAPEN_CLR | sig::BMAPEN_SET) { + x if x == sig::BMAPEN_CLR => { + state.bmapen = false; + state.bprun = false; + state.cnt = 0; + } + x if x == sig::BMAPEN_SET => { + state.bmapen = true; + state.bprun = (state.bprun || state.shw) && state.bpv && state.bphstart; + } + _ => {} + } + if bits & sig::DONE != 0 { + state.rhw = false; + state.shw = false; + state.bphstop = false; + } +} + +/// Emulate the fetch logic for colour clocks `[start, stop)`, appending +/// produced DMA slots. Transcribed from vAmiga's +/// `Sequencer::computeBplEvents`. 
+fn walk_span( + aga: bool, + ecs: bool, + start: u16, + stop: u16, + state: &mut DdfState, + unit_ord: &mut Option<u16>, + fetches: &mut Vec<DdfFetch>, +) { + for j in start..stop { + let counter = (state.cnt << 1) | (j & 1) as u8; + + if counter == 0 { + if state.last_fu { + state.bprun = false; + state.last_fu = false; + state.bphstop = false; + if !ecs { + state.shw = false; + } + } + if state.stopreq { + state.stopreq = false; + state.last_fu = true; + } + if state.bprun { + *unit_ord = Some(match *unit_ord { + Some(ord) => ord.saturating_add(1), + None => 0, + }); + } + } + + if state.bprun { + if let Some(plane) = unit_slot(state.bplcon0, aga, counter) { + fetches.push(DdfFetch { + cck: j, + plane, + unit_ord: unit_ord.unwrap_or(0), + counter, + apply_modulo: state.last_fu && modulo_slot(state.bplcon0, counter), + }); + } + if j & 1 == 1 { + state.cnt = (state.cnt + 1) & 3; + } + } else { + state.cnt = 0; + } + } +} + +/// Build the default signal list for a line with static register values. +/// Mid-line register writes append extra signals via `extra` (already +/// positioned at the colour clock where the write reaches the sequencer); +/// same-cck signals are merged like the hardware strobes. +pub fn line_signals( + ddfstrt: u16, + ddfstop: u16, + line_ccks: u16, + extra: &[DdfSignal], +) -> Vec<DdfSignal> { + line_signals_with_hard_stop(ddfstrt, ddfstop, DDF_HARD_STOP_CCK, line_ccks, extra) +} + +/// [`line_signals`] with a caller-supplied hardware-stop position +/// (BEAMCON0.HARDDIS relaxes the hardwired stop).
+pub fn line_signals_with_hard_stop( + ddfstrt: u16, + ddfstop: u16, + hard_stop_cck: u16, + line_ccks: u16, + extra: &[DdfSignal], +) -> Vec<DdfSignal> { + let mut signals: Vec<DdfSignal> = Vec::with_capacity(5 + extra.len()); + let mut push = |cck: u16, bits: u32, bplcon0: u16| { + if let Some(existing) = signals.iter_mut().find(|s| s.cck == cck) { + existing.bits |= bits; + if bits & sig::CON != 0 { + existing.bplcon0 = bplcon0; + } + return; + } + signals.push(DdfSignal { cck, bits, bplcon0 }); + }; + push(DDF_HARD_START_CCK, sig::SHW, 0); + if ddfstrt < line_ccks { + push(ddfstrt, sig::BPHSTART, 0); + } + if ddfstop < line_ccks { + push(ddfstop, sig::BPHSTOP, 0); + } + push(hard_stop_cck, sig::RHW, 0); + for s in extra { + push(s.cck, s.bits, s.bplcon0); + } + push(line_ccks, sig::DONE, 0); + signals.sort_by_key(|s| s.cck); + signals +} + +/// Walk one full line. `state` carries across lines: pass the previous +/// line's final state (with `bpv`/`bmapen`/`bplcon0` refreshed by the caller +/// for line-granular changes) and receive this line's final state in place.
+pub fn walk_line( + aga: bool, + ecs: bool, + signals: &[DdfSignal], + state: &mut DdfState, +) -> Vec<DdfFetch> { + let mut fetches = Vec::new(); + let mut cycle = 0u16; + let mut unit_ord: Option<u16> = None; + for signal in signals { + walk_span( + aga, + ecs, + cycle, + signal.cck, + state, + &mut unit_ord, + &mut fetches, + ); + process_signal(ecs, signal, state); + if signal.bits & sig::DONE != 0 { + break; + } + cycle = signal.cck; + } + fetches +} + +#[cfg(test)] +mod tests { + use super::*; + + const LINE: u16 = 227; + + fn lores4() -> u16 { + 0x4200 // 4 planes, lores, COLOR on + } + + fn ready_state(bplcon0: u16) -> DdfState { + DdfState { + bpv: true, + bmapen: true, + bplcon0, + ..DdfState::default() + } + } + + fn words_for_plane(fetches: &[DdfFetch], plane: u8) -> usize { + fetches.iter().filter(|f| f.plane == plane).count() + } + + fn walk_static(ecs: bool, ddfstrt: u16, ddfstop: u16, state: &mut DdfState) -> Vec<DdfFetch> { + let signals = line_signals(ddfstrt, ddfstop, LINE, &[]); + walk_line(false, ecs, &signals, state) + } + + #[test] + fn standard_lores_window_fetches_twenty_words_after_stop_drain() { + // $38/$D0: the stop request lands at $D0 (a unit boundary), the + // final unit drains with modulos: 20 words per plane, plane 1's + // last fetch at $D7.
+ for ecs in [false, true] { + let mut state = ready_state(lores4()); + let fetches = walk_static(ecs, 0x38, 0xD0, &mut state); + assert_eq!(words_for_plane(&fetches, 0), 20, "ecs={ecs}"); + assert_eq!( + fetches.first().map(|f| f.cck), + Some(0x39), + "first slot (plane 4) one cck into the unit; ecs={ecs}" + ); + assert_eq!(fetches.last().map(|f| (f.cck, f.plane)), Some((0xD7, 0))); + assert!(!state.bprun, "run stops before the line ends; ecs={ecs}"); + let mods: Vec<_> = fetches.iter().filter(|f| f.apply_modulo).collect(); + assert_eq!(mods.len(), 4, "each plane takes its modulo once"); + assert!(mods.iter().all(|f| f.cck >= 0xD0)); + } + } + + #[test] + fn missed_stop_runs_to_the_hardware_stop() { + // DDFSTOP that never matches ($FF is beyond the line): RHW at $D8 + // requests the stop, one further unit drains with modulos. + for ecs in [false, true] { + let mut state = ready_state(lores4()); + let fetches = walk_static(ecs, 0x38, 0xFF, &mut state); + // Units at $38..$D0 = 20, then the $D8 unit drains as the final + // unit (the RHW strobe lands exactly on its boundary): 21 words. + assert_eq!(words_for_plane(&fetches, 0), 21, "ecs={ecs}"); + assert_eq!(fetches.last().map(|f| f.cck), Some(0xDF), "ecs={ecs}"); + assert!(!state.bprun, "ecs={ecs}"); + } + } + + #[test] + fn missed_start_produces_no_fetch_on_ocs() { + let mut state = ready_state(lores4()); + let fetches = walk_static(false, 0xFF, 0xA0, &mut state); + assert!(fetches.is_empty()); + assert!(!state.bprun); + } + + #[test] + fn ecs_latched_start_restarts_at_the_hard_window() { + // ECS: BPHSTART is a latch surviving the line end. A start that + // matched on an earlier line keeps starting runs at SHW ($18) even + // when DDFSTRT never matches again. 
+ let mut state = ready_state(lores4()); + state.bphstart = true; + let fetches = walk_static(true, 0xFF, 0xA0, &mut state); + assert!(!fetches.is_empty()); + assert_eq!( + fetches.first().map(|f| f.cck), + Some(0x19), + "run starts at the hard window start" + ); + // The $A0 stop still matches and drains the run. + assert!(fetches.last().map(|f| f.cck).unwrap() < 0xB0); + } + + #[test] + fn ocs_start_flop_does_not_restart_without_a_match() { + // Same scenario on OCS: the latched BPHSTART flop alone does not + // start a run; OCS needs the comparator edge. + let mut state = ready_state(lores4()); + state.bphstart = true; + let fetches = walk_static(false, 0xFF, 0xA0, &mut state); + assert!(fetches.is_empty()); + } + + #[test] + fn late_start_past_the_hard_stop_wraps_into_the_next_line_on_ocs() { + // A DDFSTRT match after $D8 starts a run that the missed RHW can no + // longer stop: fetching continues through the line end, wraps into + // the next line (through horizontal blanking) and stops after the + // next line's $D8 drain. + let mut state = ready_state(lores4()); + let fetches = walk_static(false, 0xE0, 0xFF, &mut state); + assert!(!fetches.is_empty()); + assert!(state.bprun, "run carries across the line boundary"); + + let next = walk_static(false, 0xE0, 0xFF, &mut state); + assert_eq!( + next.first().map(|f| f.cck), + Some(0x01), + "the carried run fetches from the start of the next line" + ); + // ($E0 matches again on this line while the run is already up, and + // the next-line stop drain repeats; the run keeps cycling.) 
+ assert!(words_for_plane(&next, 0) > 20); + } + + #[test] + fn stop_without_running_fetch_is_ignored() { + for ecs in [false, true] { + let mut state = ready_state(lores4()); + let fetches = walk_static(ecs, 0xFF, 0xA0, &mut state); + assert!(!state.bphstop, "stop flop only latches while running"); + let _ = fetches; + } + } + + #[test] + fn dma_disabled_produces_no_fetches() { + for ecs in [false, true] { + let mut state = ready_state(lores4()); + state.bmapen = false; + let fetches = walk_static(ecs, 0x38, 0xD0, &mut state); + assert!(fetches.is_empty(), "ecs={ecs}"); + } + } + + #[test] + fn vertical_flop_off_produces_no_fetches() { + for ecs in [false, true] { + let mut state = ready_state(lores4()); + state.bpv = false; + let fetches = walk_static(ecs, 0x38, 0xD0, &mut state); + assert!(fetches.is_empty(), "ecs={ecs}"); + } + } + + #[test] + fn hires_window_fetches_two_words_per_unit() { + // Standard hires $3C/$D4: 8-cck units carry two words per plane. + let mut state = ready_state(0x8200 | 0x4000); // hires, 4 planes + let fetches = walk_static(true, 0x3C, 0xD4, &mut state); + let plane0 = words_for_plane(&fetches, 0); + assert_eq!(plane0 % 2, 0); + assert_eq!(plane0, 40, "20 units, two words per unit"); + } + + #[test] + fn equal_start_and_stop_stops_a_running_fetch_from_a_prior_line() { + // DDFSTRT == DDFSTOP: the combined strobe requests a stop when a + // run is up, and starts one otherwise (OCS). + let mut state = ready_state(lores4()); + let signals = line_signals(0x60, 0x60, LINE, &[]); + let fetches = walk_line(false, false, &signals, &mut state); + // Starts at $60 (no run was up), then RHW stops it. + assert!(!fetches.is_empty()); + assert_eq!(fetches.first().map(|f| f.cck), Some(0x61)); + } + + #[test] + fn mid_line_stop_rewrite_before_match_moves_the_stop() { + // A DDFSTOP rewrite landing before the old value matches replaces + // the stop position: the walk uses the merged signal list.
+ let mut state = ready_state(lores4()); + let extra = [DdfSignal { + cck: 0x80, + bits: sig::BPHSTOP, + bplcon0: 0, + }]; + // Old stop $D0 removed by the caller; new stop $80 as an extra. + let signals = line_signals(0x38, 0xFF, LINE, &extra); + let fetches = walk_line(false, false, &signals, &mut state); + let last = fetches.last().unwrap().cck; + assert!( + last < 0x90, + "run drains at the rewritten stop, got {last:#x}" + ); + } +} diff --git a/src/chipset/mod.rs b/src/chipset/mod.rs index e58618a..83e0499 100644 --- a/src/chipset/mod.rs +++ b/src/chipset/mod.rs @@ -4,6 +4,7 @@ pub mod agnus; pub mod blitter; pub mod cia; pub mod copper; +pub mod ddf_sequencer; pub mod denise; pub mod keyboard; pub mod paula; diff --git a/src/savestate.rs b/src/savestate.rs index ef973fc..3a34125 100644 --- a/src/savestate.rs +++ b/src/savestate.rs @@ -71,7 +71,10 @@ const STATE_MAGIC: &[u8; 8] = b"CLSSTATE"; // and mmu_write_suppress (the RTE DF-cleared completion protocol, // pending across one instruction boundary) and pending_fault_wdata // (the frame's data output buffer) -pub const STATE_VERSION: u32 = 14; +// 15: Bus gained the bitplane DDF sequencer flop state (ddf_seq_line_initial, +// ddf_seq_line_start_regs, ddf_seq_writes) - the per-line flop walk that +// replaces the value-range DDF window for FMODE=0 fetches +pub const STATE_VERSION: u32 = 15; /// Default state file name, timestamped like the screenshot/recorder names. 
pub fn auto_filename() -> std::path::PathBuf { From 08786ccf2c9c7b8c3c629c227b5af00ad68ec399 Mon Sep 17 00:00:00 2001 From: Andrew Hutchings Date: Sat, 4 Jul 2026 14:45:51 +0100 Subject: [PATCH 2/3] denise: place captured bitplane rows by their sequencer run origin Rows whose DMA fetch diverges from the register-derived DDF window (the sequencer's missed-stop drains to the hardware stop, late starts) were still painted with the register-derived geometry: word plans, word count, and picture origin all disagreed with the words the capture actually fetched. The capture now records the run's first fetch-unit boundary in CapturedBitplaneRow (STATE_VERSION 16), and the renderer synthesizes the row's DDFSTRT/DDFSTOP from the captured origin and word count, so every register-derived derivation agrees with the DMA. Rows whose registers already match (all sane screens) synthesize nothing and stay byte-identical; runs wrapping through horizontal blanking (origin inside the hardware-blanked area) keep the register view for now. vAmigaTS Agnus/DDF (vs vAmiga 4.4 refs): oldhwstop3 16.7%->9.5%, oldhwstop4 14.2%->7.5%, single4 16.0%->3.1%, single5 8.1%->2.4%, hwstop2 5.2%->1.7%, hwstop4/5 10.0%->6.4%, hwstop6 7.3%->3.7%. KS1.3 boot, Inside the Machine, and Zool screenshots stay byte-identical to main. 
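The synthesis described above can be illustrated with a small standalone sketch. It is not the patch's code: `synth_ddf_window`, `DDF_HARD_START`, and the `cck_per_word` parameter are hypothetical names chosen for illustration, and the sketch assumes FMODE=0 with 8-cck fetch units (one word per plane per unit in lo-res, two in hi-res).

```rust
/// Hardwired DDF start position ($18), as named in the series description.
/// The constant name is an assumption made for this sketch.
const DDF_HARD_START: u16 = 0x18;

/// Hypothetical helper: derive a (DDFSTRT, DDFSTOP) pair from a captured run
/// origin and the number of words fetched per plane. `cck_per_word` is assumed
/// to be 8 in lo-res and 4 in hi-res at FMODE=0.
fn synth_ddf_window(origin: u16, words: usize, cck_per_word: usize) -> Option<(u16, u16)> {
    // Nothing to synthesize for empty rows, or for runs that wrapped through
    // horizontal blanking (origin before the hardware start window).
    if words == 0 || cck_per_word == 0 || origin < DDF_HARD_START {
        return None;
    }
    // Words carried by one 8-cck fetch unit, per plane.
    let words_per_unit = (8 / cck_per_word).max(1);
    // Number of fetch units needed to carry `words`, rounding up.
    let units = words.div_ceil(words_per_unit);
    // The stop comparator matches at the start of the final unit.
    let span = u16::try_from((units - 1).checked_mul(8)?).ok()?;
    let stop = origin.checked_add(span)?;
    Some((origin, stop))
}
```

Under these assumptions, a lo-res row with origin $38 and 20 words yields the window $38/$D0, and a hi-res row with origin $3C and 40 words yields $3C/$D4, matching the standard windows used by the tests in this series.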
--- src/bus.rs | 5 ++++ src/bus/ddf_line.rs | 18 ++++++++++++-- src/bus/frame_capture.rs | 1 + src/bus/tests.rs | 26 ++++++++++++++++++++ src/savestate.rs | 4 +++- src/video/bitplane.rs | 48 +++++++++++++++++++++++++++++++++++++ src/video/bitplane/tests.rs | 1 + 7 files changed, 100 insertions(+), 3 deletions(-) diff --git a/src/bus.rs b/src/bus.rs index 904ff19..c3c0cab 100644 --- a/src/bus.rs +++ b/src/bus.rs @@ -272,6 +272,11 @@ pub struct CapturedBitplaneRow { pub nplanes: usize, pub words_per_row: usize, pub planes: [Vec<u16>; 8], + /// Colour clock of the row's first fetch-unit boundary when the DDF + /// sequencer's run diverges from the register-derived window (missed + /// stops draining to the hardware stop, late starts). None when the + /// register-derived geometry already matches (and for wide-FMODE rows). + pub fetch_origin_cck: Option<u16>, } #[derive(Clone, Copy, Debug, serde::Serialize, serde::Deserialize)] diff --git a/src/bus/ddf_line.rs b/src/bus/ddf_line.rs index b0ea070..7aea95b 100644 --- a/src/bus/ddf_line.rs +++ b/src/bus/ddf_line.rs @@ -49,6 +49,9 @@ pub(super) struct DdfSeqLine { pub word_idx_at: [u16; DDF_SEQ_MAX_LINE_CCKS], /// First fetch colour clock of the line, if any. pub first_fetch_cck: Option<u16>, + /// The run's first fetch-unit boundary (first fetch minus its unit + /// offset): the position that anchors word 0 on the display. + pub run_origin_cck: Option<u16>, /// Sequencer state after the line's walk (becomes the next line's /// initial state). 
pub end_state: DdfState, @@ -238,6 +241,7 @@ impl Bus { words_per_plane: [0; 8], word_idx_at: [0; DDF_SEQ_MAX_LINE_CCKS], first_fetch_cck: None, + run_origin_cck: None, end_state: state, }; let shres = crate::chipset::agnus::bitplane_shres(state.bplcon0); @@ -264,6 +268,7 @@ impl Bus { line.words_per_plane[plane].max(line.word_idx_at[idx] + 1); if line.first_fetch_cck.is_none() { line.first_fetch_cck = Some(f.cck); + line.run_origin_cck = Some(f.cck.saturating_sub(u16::from(f.counter))); } } line @@ -405,10 +410,16 @@ impl Bus { // must not skew the visible rows' pointer progression). return; }; - let (plane_at, modulo_at, word_idx_at, words_per_row) = { + let (plane_at, modulo_at, word_idx_at, words_per_row, run_origin) = { let table = self.ddf_seq_line_table(); let wpr = table.words_per_plane.iter().copied().max().unwrap_or(0) as usize; - (table.plane_at, table.modulo_at, table.word_idx_at, wpr) + ( + table.plane_at, + table.modulo_at, + table.word_idx_at, + wpr, + table.run_origin_cck, + ) }; if words_per_row == 0 { return; @@ -454,6 +465,9 @@ impl Bus { slots += 1; } if slots != 0 { + if let Some(row) = self.current_frame_bitplane_rows[fb_y].as_mut() { + row.fetch_origin_cck = run_origin; + } self.record_bitplane_fetch_timing(slots, rows_started, 0, None); } } diff --git a/src/bus/frame_capture.rs b/src/bus/frame_capture.rs index 3daff0f..46a3a26 100644 --- a/src/bus/frame_capture.rs +++ b/src/bus/frame_capture.rs @@ -1548,6 +1548,7 @@ impl Bus { nplanes: display_planes, words_per_row, planes: std::array::from_fn(|_| vec![0; words_per_row]), + fetch_origin_cck: None, }; for plane in dma_planes..display_planes { row.planes[plane].fill(self.denise.bpldat[plane]); diff --git a/src/bus/tests.rs b/src/bus/tests.rs index b13fd84..8363dfe 100644 --- a/src/bus/tests.rs +++ b/src/bus/tests.rs @@ -295,6 +295,7 @@ fn captured_row( CapturedBitplaneRow { nplanes, words_per_row, + fetch_origin_cck: None, planes, } } @@ -2014,6 +2015,7 @@ fn 
copper_move_palette_write_affects_pixels_after_second_dma_slot() { bus.current_frame_bitplane_rows[0] = Some(CapturedBitplaneRow { nplanes: 1, words_per_row, + fetch_origin_cck: None, planes: [ vec![0xFFFF; words_per_row], vec![0; words_per_row], @@ -3457,6 +3459,7 @@ fn beam_timed_display_window_changes_clip_later_bitplane_rows() { bus.current_frame_bitplane_rows[y] = Some(CapturedBitplaneRow { nplanes: 1, words_per_row: 3, + fetch_origin_cck: None, planes: [ vec![0x4000, 0, 0], vec![0; 3], @@ -3539,6 +3542,7 @@ fn beam_timed_diwstrt_rewrite_after_window_open_does_not_reclip_line() { bus.current_frame_bitplane_rows[0] = Some(CapturedBitplaneRow { nplanes: 1, words_per_row: 3, + fetch_origin_cck: None, planes: [ vec![0xFFFF, 0xFFFF, 0xFFFF], vec![0; 3], @@ -3584,6 +3588,7 @@ fn beam_timed_diwstrt_clips_hidden_bitplane_pixels_without_rebasing_fetch_origin bus.current_frame_bitplane_rows[0] = Some(CapturedBitplaneRow { nplanes: 1, words_per_row: 3, + fetch_origin_cck: None, planes: [ vec![0x0400, 0, 0], vec![0; 3], @@ -3627,6 +3632,7 @@ fn beam_timed_diwstrt_extends_later_bitplane_pixels_left_on_same_line() { bus.current_frame_bitplane_rows[0] = Some(CapturedBitplaneRow { nplanes: 1, words_per_row: 3, + fetch_origin_cck: None, planes: [ vec![0xFFFF, 0xFFFF, 0xFFFF], vec![0; 3], @@ -3674,6 +3680,7 @@ fn beam_timed_diwstrt_can_enable_current_bitplane_line() { bus.current_frame_bitplane_rows[0] = Some(CapturedBitplaneRow { nplanes: 1, words_per_row: 3, + fetch_origin_cck: None, planes: [ vec![0xFFFF, 0xFFFF, 0xFFFF], vec![0; 3], @@ -3715,6 +3722,7 @@ fn beam_timed_diwstop_can_enable_current_bitplane_line() { bus.current_frame_bitplane_rows[0] = Some(CapturedBitplaneRow { nplanes: 1, words_per_row: 3, + fetch_origin_cck: None, planes: [ vec![0xFFFF, 0xFFFF, 0xFFFF], vec![0; 3], @@ -3759,6 +3767,7 @@ fn beam_timed_diwstop_extends_later_bitplane_pixels_on_same_line() { bus.current_frame_bitplane_rows[0] = Some(CapturedBitplaneRow { nplanes: 1, words_per_row: 3, + 
fetch_origin_cck: None, planes: [ vec![0xFFFF, 0xFFFF, 0xFFFF], vec![0; 3], @@ -3929,6 +3938,7 @@ fn bpl1dat_write_triggers_output_while_bitplane_dma_enabled() { bus.current_frame_bitplane_rows[0] = Some(CapturedBitplaneRow { nplanes: 1, words_per_row: 2, + fetch_origin_cck: None, planes: [ vec![0, 0], vec![0, 0], @@ -4167,6 +4177,7 @@ fn same_line_ham_enable_does_not_retime_earlier_playfield_color() { bus.current_frame_bitplane_rows[0] = Some(CapturedBitplaneRow { nplanes: 6, words_per_row: 1, + fetch_origin_cck: None, planes: [ vec![0xC000], vec![0x0000], @@ -4365,6 +4376,7 @@ fn same_line_bplcon2_killehb_changes_later_extra_half_brite_pixels() { bus.current_frame_bitplane_rows[0] = Some(CapturedBitplaneRow { nplanes: 6, words_per_row: 1, + fetch_origin_cck: None, planes: [ vec![0xFFFF], vec![0], @@ -4406,6 +4418,7 @@ fn beam_timed_bplcon0_hires_narrows_later_bitplane_pixels() { bus.current_frame_bitplane_rows[0] = Some(CapturedBitplaneRow { nplanes: 1, words_per_row: 4, + fetch_origin_cck: None, planes: [ vec![0x0000, 0x0000, 0x2000, 0x0000], vec![0; 4], @@ -4451,6 +4464,7 @@ fn beam_timed_bplcon0_lowres_widens_later_bitplane_pixels() { bus.current_frame_bitplane_rows[0] = Some(CapturedBitplaneRow { nplanes: 1, words_per_row: 2, + fetch_origin_cck: None, planes: [ vec![0x0000, 0x4000], vec![0; 2], @@ -4497,6 +4511,7 @@ fn same_line_bplcon2_priority_change_reveals_later_sprite_pixels() { bus.current_frame_bitplane_rows[0] = Some(CapturedBitplaneRow { nplanes: 1, words_per_row: 1, + fetch_origin_cck: None, planes: [ vec![0xFFFF], vec![0], @@ -4593,6 +4608,7 @@ fn same_line_bplcon3_pf2of_changes_later_dual_playfield_pixels() { bus.current_frame_bitplane_rows[0] = Some(CapturedBitplaneRow { nplanes: 2, words_per_row: 1, + fetch_origin_cck: None, planes: [ vec![0], vec![0xFFFF], @@ -5648,6 +5664,7 @@ fn state_load_resets_transient_video_latches() { bus.last_frame_bitplane_rows[0] = Some(CapturedBitplaneRow { nplanes: 1, words_per_row: 1, + fetch_origin_cck: None, 
planes: std::array::from_fn(|_| vec![0xFFFF]), }); bus.current_frame_sprite_lines.push(CapturedSpriteLine { @@ -5730,6 +5747,7 @@ fn state_load_after_display_start_suppresses_partial_render_frame() { bus.current_frame_bitplane_rows[0] = Some(CapturedBitplaneRow { nplanes: 1, words_per_row: 1, + fetch_origin_cck: None, planes: std::array::from_fn(|_| vec![0xFFFF]), }); bus.current_frame_sprite_lines.push(CapturedSpriteLine { @@ -5758,6 +5776,7 @@ fn state_load_after_display_start_suppresses_partial_render_frame() { bus.current_frame_bitplane_rows[0] = Some(CapturedBitplaneRow { nplanes: 1, words_per_row: 1, + fetch_origin_cck: None, planes: std::array::from_fn(|_| vec![0xAAAA]), }); bus.current_frame_sprite_lines.push(CapturedSpriteLine { @@ -6551,6 +6570,7 @@ fn manual_sprite_data_writes_accumulate_live_sprite_playfield_clxdat() { bus.current_frame_bitplane_rows[0] = Some(CapturedBitplaneRow { nplanes: 1, words_per_row: 1, + fetch_origin_cck: None, planes: [ vec![0x4000], vec![0], @@ -6591,6 +6611,7 @@ fn attached_manual_sprite_data_writes_accumulate_live_sprite_playfield_clxdat() bus.current_frame_bitplane_rows[0] = Some(CapturedBitplaneRow { nplanes: 1, words_per_row: 1, + fetch_origin_cck: None, planes: [ vec![0x8000], vec![0], @@ -6830,6 +6851,7 @@ fn shifted_horizontal_diw_offsets_live_playfield_clxdat_fetch_origin() { let row = CapturedBitplaneRow { nplanes: 2, words_per_row: 2, + fetch_origin_cck: None, planes: [ vec![0, 0x1000], vec![0, 0x1000], @@ -6884,6 +6906,7 @@ fn denise_horizontal_delay_aligns_sprite_playfield_collision_domain() { let row = CapturedBitplaneRow { nplanes: 1, words_per_row: 1, + fetch_origin_cck: None, planes: [ vec![0x8000], vec![0], @@ -7085,6 +7108,7 @@ fn live_sprite_playfield_clxdat_skips_already_latched_bits() { let row = CapturedBitplaneRow { nplanes: 1, words_per_row: 1, + fetch_origin_cck: None, planes: [ vec![0x8000], vec![0], @@ -9406,6 +9430,7 @@ fn render_input_refill_from_bus_matches_fresh_snapshot() { 
bus.current_frame_bitplane_rows[0] = Some(CapturedBitplaneRow { nplanes: 1, words_per_row, + fetch_origin_cck: None, planes: [ vec![0xFFFF; words_per_row], vec![0; words_per_row], @@ -9566,6 +9591,7 @@ fn debug_plane_mask_hides_pixels_but_not_collisions() { bus.current_frame_bitplane_rows[0] = Some(CapturedBitplaneRow { nplanes: 1, words_per_row, + fetch_origin_cck: None, planes: [ vec![0xFFFF; words_per_row], vec![0; words_per_row], diff --git a/src/savestate.rs b/src/savestate.rs index 3a34125..9e92bab 100644 --- a/src/savestate.rs +++ b/src/savestate.rs @@ -74,7 +74,9 @@ const STATE_MAGIC: &[u8; 8] = b"CLSSTATE"; // 15: Bus gained the bitplane DDF sequencer flop state (ddf_seq_line_initial, // ddf_seq_line_start_regs, ddf_seq_writes) - the per-line flop walk that // replaces the value-range DDF window for FMODE=0 fetches -pub const STATE_VERSION: u32 = 15; +// 16: CapturedBitplaneRow gained fetch_origin_cck (the sequencer run origin +// for rows whose fetch diverges from the register-derived window) +pub const STATE_VERSION: u32 = 16; /// Default state file name, timestamped like the screenshot/recorder names. pub fn auto_filename() -> std::path::PathBuf { diff --git a/src/video/bitplane.rs b/src/video/bitplane.rs index 73bcbca..c82f0ea 100644 --- a/src/video/bitplane.rs +++ b/src/video/bitplane.rs @@ -2406,6 +2406,41 @@ fn line_control_at_x( control } +/// Rewrite a row's DDFSTRT/DDFSTOP so the register-derived fetch geometry +/// matches the captured sequencer run (origin colour clock + word count). +/// FMODE=0 only; runs that wrap through horizontal blanking (origin before +/// the hardware start window) keep the register view. 
+fn apply_captured_fetch_geometry(control: &mut ControlState, origin: u16, words: usize) { + if control.fetch_quantum() != 1 || words == 0 { + return; + } + if origin < BITPLANE_DDF_HARD_START { + return; + } + let words_per_unit = (8 / control.fetch_cck_per_word() as usize).max(1); + let units = words.div_ceil(words_per_unit); + let synth_stop = origin + ((units.saturating_sub(1)) as u16) * 8; + if control.ddfstrt == origin && control.ddfstop == synth_stop { + return; + } + let native_w = native_frame_width_for_control(*control); + let current_start = effective_ddf_start_hpos( + control.agnus_revision, + control.hires() || control.shres(), + control.ddfstrt, + ); + let current_words = if control.has_valid_ddf_window() { + control.words_per_row(native_w) + } else { + 0 + }; + if current_start == origin && current_words == words { + return; + } + control.ddfstrt = origin; + control.ddfstop = synth_stop; +} + fn line_words_per_row(base_control: ControlState, control_segments: &[ControlSegment]) -> usize { let base_native_w = native_frame_width_for_control(base_control); let mut words = if base_control.has_valid_ddf_window() { @@ -3431,6 +3466,19 @@ pub fn render_from_input(input: &RenderInput, fb: &mut [u32]) -> RenderResult { let frame_ram = input.chip_ram.as_slice(); let mut ram = TimedChipRam::new(frame_ram, input.chip_ram_writes.as_slice()); let captured_bitplane_rows = input.captured_bitplane_rows.as_slice(); + // Rows whose DMA capture recorded a fetch run diverging from the + // register-derived DDF window (the sequencer's missed-stop drains and + // late starts) carry the run origin and true word count. Synthesize the + // row's DDFSTRT/DDFSTOP from them so every register-derived fetch and + // paint derivation (word plans, words-per-row, picture origin) agrees + // with what the DMA actually did. 
+ for (y, control) in base_controls.iter_mut().enumerate() { + if let Some(row) = captured_bitplane_rows.get(y).and_then(Option::as_ref) { + if let Some(origin) = row.fetch_origin_cck { + apply_captured_fetch_geometry(control, origin, row.words_per_row); + } + } + } let has_captured_bitplane_rows = captured_bitplane_rows.iter().any(Option::is_some); let captured_sprite_lines = input.captured_sprite_lines.as_slice(); let sprite_display_enable_x_by_y = input.sprite_display_enable_x_by_y.as_slice(); diff --git a/src/video/bitplane/tests.rs b/src/video/bitplane/tests.rs index 5c64378..a28d837 100644 --- a/src/video/bitplane/tests.rs +++ b/src/video/bitplane/tests.rs @@ -6288,6 +6288,7 @@ fn manual_bpl1dat_snapshots_dma_updated_bpldat_latches() { captured_rows[0] = Some(CapturedBitplaneRow { nplanes: 3, words_per_row: 1, + fetch_origin_cck: None, planes, }); let events = [beam_event(PAL_VISIBLE_LINE0 as u32, hpos, 0x0110, 0x0000)]; From 4e8842831b0891afdbef56c8c494194c6905bccf Mon Sep 17 00:00:00 2001 From: Andrew Hutchings Date: Sat, 4 Jul 2026 15:09:07 +0100 Subject: [PATCH 3/3] denise: FMODE=0 picture placement rounds up to the shifter reload grid Denise's shifter reloads on a fixed grid (8 colour clocks in lo-res, 4 in hi-res, 2 in SHRES at FMODE=0); a fetch unit starting off that grid has its data wait for the NEXT reload slot. The placement quantization therefore rounds UP, not down. Hardware-verified on the arosddf1 A500 ECS photo: the DDFSTRT $3C lo-res picture sits at the $40 reload slot relative to the copper-anchored ruler dashes, with both ruler ends agreeing on the position (framebuffer 252-259 measured against vAmiga's 254 and the old floor placement's 222). Every on-grid start - all previously calibrated cases including the Kickstart insert-disk art and the wide-FMODE gulp grids - is unchanged; wide FMODE keeps its calibrated floor alignment. vAmigaTS Agnus/DDF: arosddf1-3 12.8% -> 0.008%, arosddf4 12.7% -> 0.07%, ddf3/ddf4/ddf7/ddf8 1.6% -> 0.1%. 
AROS boot, KS1.3 boot, and Zool screenshots stay byte-identical to main. --- src/bus/tests.rs | 12 +++++++----- src/video/bitplane.rs | 37 +++++++++++++++++++++---------------- 2 files changed, 28 insertions(+), 21 deletions(-) diff --git a/src/bus/tests.rs b/src/bus/tests.rs index 8363dfe..9d39688 100644 --- a/src/bus/tests.rs +++ b/src/bus/tests.rs @@ -4487,11 +4487,13 @@ fn beam_timed_bplcon0_lowres_widens_later_bitplane_pixels() { let mut fb = vec![0; FB_PIXELS]; bitplane::render(&mut bus, &mut fb); - // Content columns sit 2 fb px right of the hardware window edge - // (bitmap positions are beam-anchored; STANDARD_VISIBLE_X0 moved to 62). - assert_eq!(fb[STANDARD_VISIBLE_X0 + 32], rgb12_to_rgba8(0x0000)); - assert_eq!(fb[STANDARD_VISIBLE_X0 + 34], rgb12_to_rgba8(0x0F00)); - assert_eq!(fb[STANDARD_VISIBLE_X0 + 35], rgb12_to_rgba8(0x0F00)); + // The lo-res reinterpretation places the picture on the lo-res shifter + // reload grid: DDFSTRT $3C rounds UP to the $40 slot (hardware-verified + // on the arosddf1 ECS photo), so the widened word-1 bit sits one 8-cck + // unit right of a floor-aligned placement. + assert_eq!(fb[STANDARD_VISIBLE_X0 + 64], rgb12_to_rgba8(0x0000)); + assert_eq!(fb[STANDARD_VISIBLE_X0 + 66], rgb12_to_rgba8(0x0F00)); + assert_eq!(fb[STANDARD_VISIBLE_X0 + 67], rgb12_to_rgba8(0x0F00)); } #[test] diff --git a/src/video/bitplane.rs b/src/video/bitplane.rs index c82f0ea..8228d97 100644 --- a/src/video/bitplane.rs +++ b/src/video/bitplane.rs @@ -1166,26 +1166,31 @@ impl ControlState { }; // The displayed picture position is quantized to the fetch-period // grid (one FMODE gulp per plane). The DMA sequencer itself starts at - // the revision-masked DDFSTRT comparator value, but the shifter - // consumes data in whole 1/2/4-word gulps, so a DDFSTRT moved within - // one gulp changes how much tail data is fetched without necessarily - // moving the visible picture. 
With - // FMODE=0 the gulp equals the DDF granularity and nothing changes - // (boot-screen insert-disk art is drawn for the continuous placement: - // its negative modulos overlap rows so the hand/disk's right edge - // lives in the next row's first bytes - the calibrated FMODE=0 anchors - // must stay). With wide FMODE fetches system software programs DDFSTRT - // $38 or $3C interchangeably (same 16-cck gulp slot, BPLCON1=0), and - // its interleaved-bitmap modulos expect exactly the visible row width - // in the window - without the placement quantization, the fetch overrun - // displayed inside the window's right edge as the next plane's row - // start. The placement grid anchor is the colour-clock origin, not the - // hard DDF start $18. + // the revision-masked DDFSTRT comparator value, but Denise's shifter + // reloads on its own fixed grid, so data fetched off-grid waits for + // the NEXT reload slot: FMODE=0 placement rounds UP to the gulp grid. + // Hardware-verified on the arosddf1 A500 ECS photo: the DDFSTRT $3C + // lo-res picture sits at the $40 reload slot relative to the + // copper-anchored ruler dashes (both ruler ends agree), one full + // 8-cck unit right of a floor-aligned placement. On-grid starts + // ($30/$38 lo-res, $3C hi-res - every previously calibrated case, + // including the boot-screen insert-disk art) are unchanged by + // rounding direction. With wide FMODE fetches system software + // programs DDFSTRT $38 or $3C interchangeably (same 16-cck gulp + // slot, BPLCON1=0), and its interleaved-bitmap modulos expect + // exactly the visible row width in the window; the wide grids keep + // their calibrated floor alignment. The placement grid anchor is the + // colour-clock origin, not the hard DDF start $18. 
let align = |hpos: i32| -> i32 { let gulp = self.fetch_period() as i32; + let aligned = if self.fetch_quantum() == 1 { + hpos.div_euclid(gulp) * gulp + if hpos.rem_euclid(gulp) != 0 { gulp } else { 0 } + } else { + hpos.div_euclid(gulp) * gulp + }; // Clamped to the DDF hard start: placement before the first usable // fetch position is not visible. - (hpos.div_euclid(gulp) * gulp).max(BITPLANE_DDF_HARD_START as i32) + aligned.max(BITPLANE_DDF_HARD_START as i32) }; let ddf_native_shift = (align(effective_ddf_start_hpos( self.agnus_revision,