Microplate: Introduce Coordinates-aware WellMap to eliminate string key overhead #71

@simbig

Problem

The Microplate module stores wells in Collection<string, TWell|null> where keys are coordinate strings like "A1", "H12". This means every consumer that needs actual Coordinates objects must reconstruct them via Coordinates::fromString($key, $coordinateSystem).

This creates several categories of friction that grow with adoption.

Pain Point 1: Constant fromString() reconstruction

Every filledWells() consumer immediately parses string keys back into objects:

foreach ($microplate->filledWells() as $coordinateFromKey => $well) {
    $coordinates = Coordinates::fromString($coordinateFromKey, $microplate->coordinateSystem);
    $well->coordinates = $coordinates;
    $well->save();
}

In projects consuming this library, this pattern repeats in many places — run builders, sample sheets, worklist generators, scanner parsers, etc.

Pain Point 2: Lost CoordinateSystem — silent type erasure

String keys carry no coordinate system information. Consumers must thread the CoordinateSystem separately, and when the microplate reference isn't available, they hardcode it:

Coordinates::fromString($coordinatesString, new CoordinateSystem12x8());

If the coordinate system doesn't match the actual plate, this fails silently or produces wrong results.

Pain Point 3: Internal regex overhead on every sort/filter/check

Methods inside AbstractMicroplate itself parse strings back to objects:

  • sortedWells(COLUMN): calls fromString() on every well just to rearrange row and column into a sort key
  • matchRow() / matchColumn(): call fromString() on every tested well inside the filter callback
  • isConsecutive(): calls fromString() on every filled well to get its position

Each fromString() runs a regex match with validation — non-trivial work for data that was already a valid Coordinates object moments before.

Pain Point 4: Round-trip waste

Some downstream code serializes then immediately deserializes (or vice versa):

// string → object → string, just for validation/normalization
Coordinates::fromString($coordinatesString, new CoordinateSystem12x8())->toString();

// position → object → string
Coordinates::fromPosition($position, FlowDirection::COLUMN(), new CoordinateSystem12x8())->toString();

Pain Point 5: Coordinate strings don't sort in plate order

Coordinate strings don't sort correctly ("A10" < "A2" lexicographically). In projects, this forces workarounds like SQL FIELD() clauses that manually supply the correct order:

$list = $microplate->freeWells()->keys()->join("','");
$query->orderByRaw("FIELD(coordinates_column, '{$list}')");
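
The ordering problem can be reproduced without the library. The following is a minimal sketch: splitKey() is a hypothetical helper written for this illustration (it assumes single-letter rows) and is not part of the Microplate API.

```php
<?php
declare(strict_types=1);

$keys = ['A2', 'A10', 'B1', 'A1'];

// Plain string comparison: "A10" sorts before "A2" because '1' < '2'.
$lexicographic = $keys;
sort($lexicographic, SORT_STRING);
// ['A1', 'A10', 'A2', 'B1']

// Hypothetical helper: split a key into [row letter, column number].
// Assumes a single-letter row, which holds for plates up to 26 rows.
function splitKey(string $key): array
{
    return [$key[0], (int) substr($key, 1)];
}

// Row-wise plate order: compare the row first, then the column numerically.
$rowOrder = $keys;
usort($rowOrder, fn (string $a, string $b): int => splitKey($a) <=> splitKey($b));
// ['A1', 'A2', 'A10', 'B1']

// Column-wise plate order: compare the column first, then the row.
$columnOrder = $keys;
usort(
    $columnOrder,
    fn (string $a, string $b): int => array_reverse(splitKey($a)) <=> array_reverse(splitKey($b))
);
// ['A1', 'B1', 'A2', 'A10']
```

Both plate orders require parsing the key first, which is the work a Coordinates object has already done.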

Pain Point 6: toWellWithCoordinatesMapper() exists as a band-aid

The existence of this helper method documents the problem — it reconstructs Coordinates from string keys to pair them back with well contents. In practice it sees little adoption, suggesting the ergonomics aren't compelling enough.


Proposed Solution: WellMap<TWell, TCoordinateSystem>

Introduce a WellMap class that stores wells alongside their Coordinates objects internally, eliminating all fromString() calls for consumers.

Core idea

/**
 * @template TWell
 * @template TCoordinateSystem of CoordinateSystem
 * @implements \IteratorAggregate<Coordinates<TCoordinateSystem>, TWell|null>
 */
class WellMap implements \IteratorAggregate, \Countable
{
    // Dual internal storage: string key → content, string key → Coordinates object
    // String key is an implementation detail, never exposed to consumers
}
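
To make the dual-storage idea concrete, here is a rough sketch of how such a class could yield objects as iteration keys. SketchCoordinates and SketchWellMap are hypothetical stand-ins invented for this example; the real Coordinates class and the final WellMap API may look different.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the library's Coordinates class.
final class SketchCoordinates
{
    public string $row;
    public int $column;

    public function __construct(string $row, int $column)
    {
        $this->row = $row;
        $this->column = $column;
    }

    public function toString(): string
    {
        return $this->row . $this->column;
    }
}

// Hypothetical minimal WellMap: two arrays that share the same string key.
final class SketchWellMap implements IteratorAggregate, Countable
{
    /** @var array<string, mixed> string key → well content */
    private array $contents = [];

    /** @var array<string, SketchCoordinates> string key → coordinates object */
    private array $coordinates = [];

    /** @param mixed $content */
    public function set(SketchCoordinates $coordinates, $content): void
    {
        $key = $coordinates->toString();
        $this->contents[$key] = $content;
        $this->coordinates[$key] = $coordinates;
    }

    // A generator may yield an object as the key, which a PHP array cannot hold.
    public function getIterator(): Generator
    {
        foreach ($this->contents as $key => $content) {
            yield $this->coordinates[$key] => $content;
        }
    }

    public function count(): int
    {
        return count($this->contents);
    }
}

$map = new SketchWellMap();
$map->set(new SketchCoordinates('A', 1), 'sample-1');
$map->set(new SketchCoordinates('A', 2), null);

$seen = [];
foreach ($map as $coordinates => $content) {
    // $coordinates is an object here, no string parsing involved.
    $seen[] = [$coordinates->toString(), $content];
}
```

The string key never leaves the class; consumers only see the object that was stored with it.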

Consumer-facing API

// Iteration yields Coordinates objects as keys (generators have allowed object keys since PHP 5.5)
foreach ($microplate->wellMap()->filled() as $coordinates => $well) {
    $well->coordinates = $coordinates;  // no fromString() needed
}

// Fluent operations with Coordinates in every callback
$wellMap->filled()->sorted(FlowDirection::ROW())->map(
    fn ($content, Coordinates $coordinates): string => $coordinates->toPaddedString() . ',' . $content->name
);

// Direct filtering without regex per well
$wellMap->filterByRow('A');
$wellMap->filterByColumn(1);

// Legacy bridge for backward compat
$wellMap->toLegacyCollection();    // Collection<string, TWell|null>
$wellMap->toStringKeyedArray();    // array<string, TWell|null>
$wellMap->coordinateStrings();     // list<string>

Non-breaking integration

  • wellMap() is added as a concrete method on AbstractMicroplate (not abstract) with a default fallback that builds from wells().
  • All existing methods (wells(), filledWells(), freeWells(), sortedWells(), matchRow(), matchColumn(), isConsecutive(), toWellWithCoordinatesMapper()) stay and work unchanged, marked @deprecated.
  • Microplate overrides wellMap() efficiently by caching Coordinates objects during clearWells() / setWell() — zero fromString() calls.
  • SectionedMicroplate and any third-party subclasses use the default fallback automatically.

Migration phases

  1. Phase 1 (minor version): Add WellMap class + wellMap() method + @deprecated annotations. No existing code breaks.
  2. Phase 2 (consumers): Downstream projects migrate at their own pace.
  3. Phase 3 (next major): Make wellMap() abstract, remove deprecated methods.

Pros

  • Eliminates all consumer fromString() boilerplate — the primary pain point
  • CoordinateSystem always preserved in the Coordinates object — no more hardcoded coordinate system guessing
  • Internal performance improvement: sortedWells(), matchRow(), matchColumn(), isConsecutive() stop running regex on every well
  • Fully non-breaking in Phase 1 — additive only, existing code untouched
  • Negligible memory overhead — 96 extra object references for a 96-well plate
  • PHP 7.4 compatible — generators support object keys since PHP 5.5
  • Migration is mechanical: no logic changes, just filledWells() → wellMap()->filled() etc.
  • toWellWithCoordinatesMapper() and WellWithCoordinates DTO become unnecessary — simplifies the API surface

Cons

  • New class to maintain: WellMap adds ~200 LOC with its own method surface
  • Dual storage during Phase 1: Microplate keeps both Collection and coordinate cache until Phase 3
  • Re-implements some Collection methods: map(), filter(), each() etc. exist on both Collection and WellMap, but with different callback signatures (Coordinates vs string)
  • Generator iteration quirk: iterator_to_array() doesn't work with object keys when keys are preserved, which is the default (PHP limitation). Consumers must use foreach or ->toStringKeyedArray() for array conversion
  • Deprecated methods remain during Phase 1+2 — the old API stays "tempting" to use until the next major version removes it
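
The iteration quirk listed above can be shown in isolation. This is a small sketch using a plain generator with stdClass keys; objectKeyed() is a hypothetical function for the example and stands in for whatever WellMap::getIterator() would return.

```php
<?php
declare(strict_types=1);

// Hypothetical generator that yields objects as keys.
function objectKeyed(): Generator
{
    $first = new stdClass();
    $first->label = 'A1';
    $second = new stdClass();
    $second->label = 'A2';

    yield $first => 'sample-1';
    yield $second => 'sample-2';
}

// foreach handles object keys without complaint.
$labels = [];
foreach (objectKeyed() as $key => $value) {
    $labels[$key->label] = $value;
}

// Discarding the keys also works, because nothing has to become an array key.
$values = iterator_to_array(objectKeyed(), false);

// iterator_to_array(objectKeyed()) with the default preserve_keys = true
// fails instead: an object cannot be used as an array key. Depending on the
// PHP version this surfaces as a warning or as a TypeError.
```

A string-keyed conversion method on WellMap would cover the cases where a consumer really needs an array.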

Alternatives Considered

Alternative A: SplObjectStorage

PHP's built-in object-to-data map. Rejected because:

  • Uses strict object identity (spl_object_id), not value equality — two Coordinates('A', 1, $sys) instances are treated as different keys
  • Incompatible with Illuminate\Support\Collection
  • Awkward iteration semantics, poor PHPStan support, not serializable
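
The identity problem is easy to demonstrate. In this sketch, SketchPoint is a hypothetical value class standing in for Coordinates; the SplObjectStorage behavior shown is the default one, without a custom getHash() override.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a Coordinates-like value object.
final class SketchPoint
{
    public string $row;
    public int $column;

    public function __construct(string $row, int $column)
    {
        $this->row = $row;
        $this->column = $column;
    }
}

$first = new SketchPoint('A', 1);
$second = new SketchPoint('A', 1); // same values, different instance

$storage = new SplObjectStorage();
$storage[$first] = 'sample-1';

$foundByIdentity = isset($storage[$first]);    // true: same instance
$foundByEqualValue = isset($storage[$second]); // false: equal values do not match

// Storing under the second instance creates a second entry for the "same" well.
$storage[$second] = 'sample-2';
$entryCount = count($storage); // 2
```

A lookup would therefore only succeed with the exact instance that was stored, which does not fit coordinates created on the fly.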

Alternative B: Just add filledWellsWithCoordinates() returning list<WellWithCoordinates>

Narrower fix — only addresses iteration, doesn't fix sortedWells(), matchRow(), matchColumn(), isConsecutive() internal overhead or provide a fluent API.

Alternative C: Change Collection key type to store Coordinates directly

Not possible — PHP arrays only support int|string keys.

Alternative D: Extend Collection with a WellCollection subclass

Maintains Collection API compatibility but can't change the key type in callbacks. Would need method overrides that still return string keys. Doesn't solve the fundamental problem.


Example: Before vs After

Iterating filled wells (most common pattern)

// Before
foreach ($microplate->filledWells() as $coordinateFromKey => $well) {
    $coordinates = Coordinates::fromString($coordinateFromKey, $microplate->coordinateSystem);
    $well->coordinates = $coordinates;
    $well->save();
}

// After
foreach ($microplate->wellMap()->filled() as $coordinates => $well) {
    $well->coordinates = $coordinates;
    $well->save();
}

Generating sample sheets / worklists

// Before
$body = $microplate->sortedWells(FlowDirection::ROW())
    ->map(function ($well, string $coordinateString) use ($microplate): string {
        $coordinates = Coordinates::fromString($coordinateString, $microplate->coordinateSystem);
        return $coordinates->toPaddedString() . ',' . $well->name();
    })
    ->join("\n");

// After
$body = implode("\n",
    $microplate->wellMap()->sorted(FlowDirection::ROW())
        ->map(fn ($well, Coordinates $coordinates): string =>
            $coordinates->toPaddedString() . ',' . $well->name()
        )
);

Filtering by row or column

// Before
$rowA = $microplate->filledWells()->filter($microplate->matchRow('A'));

// After
$rowA = $microplate->wellMap()->filled()->filterByRow('A');

Looking forward to hearing thoughts on whether this direction makes sense and if there are concerns about the approach.
