Merged
34 changes: 32 additions & 2 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -6,6 +6,36 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).


## [1.7.9] - 2026-04-01

### Added

- **`Bindings.get_connectors_for_resource(name)`** returns an ordered list of connectors (unique by hash) for an ingestion resource, supporting **1→n** resource–connector wiring.
- **`BoundSourceKind`** enum (`file`, `sql_table`, `sparql`) and **`ResourceConnector.bound_source_kind()`** describe the physical source modality of a connector (replacing the old “resource type” wording).
- **`Resource.drop_trivial_input_fields`** (default `false`): when `true`, removes **top-level** keys whose value is `null` or `""` from each input record before the actor pipeline runs—useful for wide, sparse rows without custom transforms. Does not recurse into nested objects.
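The shallow-drop behavior described above can be sketched as a plain dict filter (illustrative only, not the library's implementation):

```python
def drop_trivial_input_fields(record: dict) -> dict:
    """Shallow filter: remove top-level keys whose value is None or ""."""
    return {k: v for k, v in record.items() if v is not None and v != ""}

row = {"id": 7, "name": "", "note": None, "tags": {"a": None}}
# Nested dicts are left untouched; only top-level trivial keys are dropped.
# Falsy-but-meaningful values (0, False) survive the filter.
assert drop_trivial_input_fields(row) == {"id": 7, "tags": {"a": None}}
```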

### Changed

- **`DBWriter`**: No longer calls `Schema.finish_init()` or `IngestionModel.finish_init()` on every `write()`. The orchestrator (e.g. **`Caster.ingest`**) is responsible for initializing schema and ingestion model for the target DB before writes. This avoids redundant work on each batch and prevents the writer from resetting ingestion flags (`strict_references`, `allowed_vertex_names`) that **`Caster`** had already applied.
- **`DBWriter`**: Reuses a cached **`SchemaDBAware`** projection for a given connection DB type instead of rebuilding it on every `write()`.
- **Ingestion caps**: `IngestionParams.max_items` is documented and validated (`>= 1` when set). **`SparqlEndpointDataSource.iter_batches`** paginates without loading the full endpoint result into memory, adds **`ORDER BY ?s`** when the query has no `ORDER BY`, and honors **`limit`** as a subject count. **`SQLDataSource`** and offset/page **API** pagination shrink the per-request page size once the remaining cap is smaller than a full page, so fewer rows/items are over-fetched.
- **`RegistryBuilder`** registers **every** connector bound to each resource and dispatches on **`connector.bound_source_kind()`**; SQL registration uses the connector’s own table/schema fields instead of a resource-level table lookup.
- **Auto-join** (`_vertex_table_info`) resolves table metadata via the list API and **raises** if more than one `TableConnector` is bound to the same vertex/resource key used for disambiguation.

### Breaking


- **`DBWriter`**: The **`dynamic_edges`** constructor argument was removed (it only drove the redundant `finish_init` call). Configure dynamic edge behavior via **`Caster`** / **`IngestionParams.dynamic_edges`** and ingestion **`finish_init`** as before.
- **`ResourceType`** removed in favor of **`BoundSourceKind`**; **`get_resource_type()`** removed in favor of **`bound_source_kind()`** on connectors (update imports and call sites).
- **`Bindings`**: **`get_connector_for_resource`**, **`get_resource_type`**, and **`get_table_info`** removed; use **`get_connectors_for_resource`** and connector fields / `bound_source_kind()` instead.
- **`connector_connection` / internal connector refs**: resolution allows only **connector `name`** or **canonical `hash`**. Using an ingestion **resource name** as a `connector` reference is no longer supported (resource names are no longer 1:1 with connectors).
- **`bind_resource`** and manifest **`resource_connector`** validation: additional rows for the same `resource` now append connectors instead of replacing the existing binding or raising a conflict.
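The name-or-hash-only resolution rule above can be sketched as follows (field names are illustrative, not the library's internals):

```python
def resolve_connector(ref: str, connectors: list[dict]) -> dict:
    """A connector ref must be a declared name or canonical hash.

    Resource names are deliberately NOT consulted: one resource may now
    map to several connectors, so a resource alias would be ambiguous.
    """
    for c in connectors:
        if ref in (c.get("name"), c["hash"]):
            return c
    raise KeyError(f"no connector with name or hash {ref!r}")

conns = [
    {"name": "users", "hash": "abc123", "resource_name": "users_resource"},
    {"name": "follows", "hash": "def456", "resource_name": "users_resource"},
]
assert resolve_connector("abc123", conns)["name"] == "users"
try:
    resolve_connector("users_resource", conns)  # resource name: rejected
except KeyError:
    pass
```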

### Documentation

- **Examples / docs**: `examples/9-connector-connection-proxy` and manifest guides updated for explicit connector names in `connector_connection`. Concepts and README clarify 1→n bindings and proxy wiring.
- **`Resource.drop_trivial_input_fields`**: described in [Concepts](docs/concepts/index.md) (DataSources vs Resources) and [Documentation home — Resource](docs/index.md#resource).

## [1.7.7] - 2026-03-27

### Changed
@@ -20,8 +50,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- **`Bindings.connector_connection_bindings`** (typed view), **`get_conn_proxy_for_connector`**, and **`bind_connector_to_conn_proxy`**: API aligned with HQ loaders (`ResourceMapper`, `GraphEngine`) for proxy-based source wiring.

### Changed
- **Connector reference resolution**: `connector_connection` entries may reference a connector by canonical **hash**, declared **`name`**, or a **`resource` name** when that resource is already mapped to the connector (mirrors validation in `Bindings`).
- **`Bindings` validation**: duplicate connector `name` values, conflicting resource→connector mappings, and conflicting `conn_proxy` for the same connector hash now fail fast with explicit errors.
- **Connector reference resolution**: `connector_connection` entries may reference a connector by canonical **hash**, declared **`name`**, or a **`resource` name** when that resource is already mapped to the connector (mirrors validation in `Bindings`). **Update (1.7.8):** resource-name aliasing for `connector` refs was removed; use **connector `name` or `hash`** only.
- **`Bindings` validation**: duplicate connector `name` values and conflicting `conn_proxy` for the same connector hash now fail fast with explicit errors. **Update (1.7.8):** many connectors may attach to the same ingestion resource (1→n); overlapping resource rows no longer raise “conflicting resource binding” for distinct connectors.

### Breaking
- **`Bindings.from_dict` / manifest validation**: legacy top-level keys `postgres_connections`, `table_connectors`, `file_connectors`, and `sparql_connectors` are rejected. Migrate to the unified `connectors` + `resource_connector` (+ optional `connector_connection`) shape.
2 changes: 1 addition & 1 deletion README.md
@@ -76,7 +76,7 @@ ArangoDB, Neo4j, TigerGraph, FalkorDB, Memgraph, NebulaGraph — same API for al
- **Schema inference** — Generate graph schemas from PostgreSQL 3NF databases (PK/FK heuristics) or from OWL/RDFS ontologies (`owl:Class` → vertices, `owl:ObjectProperty` → edges, `owl:DatatypeProperty` → vertex fields).
- **Typed fields** — Vertex fields and edge weights carry types (`INT`, `FLOAT`, `STRING`, `DATETIME`, `BOOL`) for validation and database-specific optimisation.
- **Parallel batch processing** — Configurable batch sizes and multi-core execution.
- **Credential-free source contracts** — `Bindings.connector_connection` maps each `TableConnector` / `SparqlConnector` (by name, hash, or resource alias) to a `conn_proxy` label. Manifests stay free of secrets; a runtime `ConnectionProvider` resolves each proxy to concrete `GeneralizedConnConfig` (for example PostgreSQL or SPARQL endpoint settings).
- **Credential-free source contracts** — `Bindings.connector_connection` maps each `TableConnector` / `SparqlConnector` (by **connector name** or **hash**) to a `conn_proxy` label. Manifests stay free of secrets; a runtime `ConnectionProvider` resolves each proxy to concrete `GeneralizedConnConfig` (for example PostgreSQL or SPARQL endpoint settings). Ingestion resource names are separate and may map to multiple connectors.

## Documentation
Full documentation is available at: [growgraph.github.io/graflo](https://growgraph.github.io/graflo)
8 changes: 5 additions & 3 deletions docs/concepts/index.md
@@ -46,7 +46,7 @@ flowchart LR
- **GraphManifest** — the canonical top-level contract that composes `schema`, `ingestion_model`, and `bindings`.
- **Schema** — the declarative logical graph model (`Schema`): vertex/edge definitions, identities, typed fields, and DB profile.
- **IngestionModel** — reusable resources and transforms used to map records into graph entities.
- **Bindings** — named `FileConnector` / `TableConnector` / `SparqlConnector` list plus `resource_connector` (resource→connector) and optional `connector_connection` (connector→`conn_proxy` for runtime `ConnectionProvider` resolution without secrets in the manifest).
- **Bindings** — named `FileConnector` / `TableConnector` / `SparqlConnector` list plus `resource_connector` (many rows per resource allowed: resource→0..n connectors) and optional `connector_connection` (connector **name** or **hash**→`conn_proxy` for runtime `ConnectionProvider` resolution without secrets in the manifest). Each connector exposes a **bound source modality** (`BoundSourceKind`: file, SQL table, SPARQL) for dispatch, distinct from the abstract ingestion **Resource**.
- **Database-Independent Graph Representation** — a `GraphContainer` of vertices and edges, independent of any target database.
- **Graph DB** — the target LPG store (ArangoDB, Neo4j, TigerGraph, FalkorDB, Memgraph, NebulaGraph).

@@ -94,9 +94,9 @@ flowchart LR
Res --> Ex --> Asm --> GC --> DBW
```

- **Bindings** (`FileConnector`, `TableConnector`, `SparqlConnector`) describe *where* data comes from (file paths, SQL tables, SPARQL endpoints). Optional **`connector_connection`** entries assign each SQL/SPARQL connector a **`conn_proxy`** label; the `ConnectionProvider` turns that label into real connection config at runtime so manifests stay credential-free.
- **Bindings** (`FileConnector`, `TableConnector`, `SparqlConnector`) describe *where* data comes from (file paths, SQL tables, SPARQL endpoints). Multiple connectors may attach to the same ingestion resource name; optional **`connector_connection`** entries assign each SQL/SPARQL connector a **`conn_proxy`** by **connector `name` or `hash`** (not by resource name). The `ConnectionProvider` turns that label into real connection config at runtime so manifests stay credential-free.
- **DataSources** (`AbstractDataSource` subclasses) handle *how* to read data in batches. Each carries a `DataSourceType` and is registered in the `DataSourceRegistry`.
- **Resources** define *what* to extract — each `Resource` is a reusable actor pipeline (descend → transform → vertex → edge) that maps raw records to graph elements.
- **Resources** define *what* to extract — each `Resource` is a reusable actor pipeline (descend → transform → vertex → edge) that maps raw records to graph elements. Set **`drop_trivial_input_fields`: `true`** on a resource to strip top-level `null` / `""` fields from each row before the pipeline (optional, default `false`).
- **GraphContainer** (covariant graph representation) collects the resulting vertices and edges in a database-independent format.
- **DBWriter** pushes the graph data into the target LPG store (ArangoDB, Neo4j, TigerGraph, FalkorDB, Memgraph, NebulaGraph).

@@ -176,6 +176,7 @@ classDiagram
+connectors: list~ResourceConnector~
+resource_connector: list~ResourceConnectorBinding~
+connector_connection: list~ConnectorConnectionBinding~
+get_connectors_for_resource(name) list
+get_conn_proxy_for_connector(connector) str?
+bind_connector_to_conn_proxy(connector, conn_proxy)
}
@@ -479,6 +480,7 @@ These are the two key abstractions that decouple *data retrieval* from *graph tr
- **DataSources** (`AbstractDataSource` subclasses) — handle *where* and *how* data is read. Each carries a `DataSourceType` (`FILE`, `SQL`, `SPARQL`, `API`, `IN_MEMORY`). Many DataSources can bind to the same Resource by name via the `DataSourceRegistry`.

- **Resources** (`Resource`) — handle *what* the data becomes in the LPG. Each Resource is a reusable actor pipeline (descend → transform → vertex → edge) that maps raw records to graph elements. Because DataSources bind to Resources by name, the same transformation logic applies regardless of whether data arrives from a file, an API, or a SPARQL endpoint.
- Optional **`drop_trivial_input_fields`** (default `false` on the model): when `true`, each record is preprocessed by dropping **top-level** keys whose value is `null` or the empty string `""` before actors run. This trims sparse wide rows (many unused columns) without extra transforms; nested dicts and lists are not walked.

## Core Components

4 changes: 2 additions & 2 deletions docs/examples/example-9.md
@@ -8,7 +8,7 @@ The manifest stays credential-free: `bindings.connector_connection` only contain

## Manifest: what `connector_connection` looks like

Inside `bindings` you explicitly map each connector to a proxy label:
Inside `bindings` you explicitly map each connector to a proxy label. The `connector` field must be a **connector `name`** or **canonical hash**, not an ingestion resource name (a resource may be bound to several connectors).

```yaml
bindings:
@@ -23,7 +23,7 @@ bindings:
conn_proxy: postgres_source
```

In the code, connectors omit `connector.name` and use `connector.resource_name` (so the manifest references are stable and human-readable).
In the companion script, each `TableConnector` sets `name` to match those references (here they match the table/resource names only for readability).

## Runtime: how the proxy label becomes a real DB config

7 changes: 4 additions & 3 deletions docs/getting_started/creating_manifest.md
@@ -74,6 +74,7 @@ Defines ingestion behavior.

- `resources`: named pipelines (`name`) with ordered actor steps
- `transforms`: reusable named transforms as a **list** (each entry must define `name`) and referenced from resources via `transform.call.use`
- Optional per-resource flags include **`drop_trivial_input_fields`** (default `false`): when `true`, top-level `null` or `""` fields are removed from each row before the pipeline—handy for sparse wide tables without extra transforms (shallow only; nested objects are unchanged).

Use `ingestion_model` for **how source records become vertices/edges**.

@@ -82,10 +83,10 @@ Use `ingestion_model` for **how source records become vertices/edges**.
Defines source wiring (`Bindings`).

- **`connectors`**: list of `FileConnector`, `TableConnector`, or `SparqlConnector` entries (where each row points at paths, tables, or RDF/SPARQL sources).
- **`resource_connector`**: list of `{"resource": "<ingestion resource name>", "connector": "<connector name or reference>"}` rows linking `IngestionModel.resources[*].name` to a connector.
- **`connector_connection`** (optional): list of `{"connector": "<name|hash|resource alias>", "conn_proxy": "<label>"}` rows. This keeps manifests **non-secret**: only proxy *names* appear in YAML; runtime code registers each `conn_proxy` on a `ConnectionProvider` with the real `GeneralizedConnConfig` (PostgreSQL, SPARQL, etc.).
- **`resource_connector`**: list of `{"resource": "<ingestion resource name>", "connector": "<connector name or hash>"}` rows linking `IngestionModel.resources[*].name` to a connector. The same `resource` may appear on **multiple rows** with different `connector` values (several physical sources for one pipeline).
- **`connector_connection`** (optional): list of `{"connector": "<connector name or hash>", "conn_proxy": "<label>"}` rows. This keeps manifests **non-secret**: only proxy *names* appear in YAML; runtime code registers each `conn_proxy` on a `ConnectionProvider` with the real `GeneralizedConnConfig` (PostgreSQL, SPARQL, etc.).

Connector references in `resource_connector` / `connector_connection` must match a connector `name` (or resolve via hash / resource alias as documented in `Bindings`). Duplicate connector names and conflicting resource or proxy mappings are rejected at validation time.
Connector references in `resource_connector` / `connector_connection` must match a connector’s declared **`name`** or canonical **`hash`**. Ingestion **resource names** are not connector references (they can map 1→*n*). Duplicate connector `name` values and conflicting `conn_proxy` mappings for the same connector hash are rejected at validation time.
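The 1→*n* lookup this implies can be sketched as follows (illustrative; mirrors `Bindings.get_connectors_for_resource`, with assumed field names):

```python
def get_connectors_for_resource(rows, connectors_by_ref, resource):
    """Ordered connectors for a resource, de-duplicated by canonical hash."""
    seen, out = set(), []
    for row in rows:
        if row["resource"] != resource:
            continue
        conn = connectors_by_ref[row["connector"]]
        if conn["hash"] not in seen:   # unique by hash, first occurrence wins
            seen.add(conn["hash"])
            out.append(conn)
    return out

rows = [
    {"resource": "users", "connector": "users_main"},
    {"resource": "users", "connector": "users_backfill"},
    {"resource": "users", "connector": "users_main"},  # duplicate row: ignored
]
by_ref = {
    "users_main": {"name": "users_main", "hash": "h1"},
    "users_backfill": {"name": "users_backfill", "hash": "h2"},
}
result = get_connectors_for_resource(rows, by_ref, "users")
assert [c["hash"] for c in result] == ["h1", "h2"]
```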

The block can be left empty in-file (`bindings: {}`) and supplied at runtime for env-specific deployments.

2 changes: 1 addition & 1 deletion docs/getting_started/quickstart.md
@@ -128,7 +128,7 @@ engine.define_and_ingest(

Here `schema` defines the logical graph, while `ingestion_model` defines resources/transforms and `bindings` maps resources to physical data sources. See [Creating a Manifest](creating_manifest.md) and [Concepts — Schema](../concepts/index.md#schema) for details.

`Bindings` maps resource names (from `IngestionModel`) to their physical data sources:
`Bindings` maps resource names (from `IngestionModel`) to one or more physical data sources (the same resource may list several connectors):
- **FileConnector**: For file-based resources with `regex` for matching filenames and `sub_path` for the directory to search
- **TableConnector**: For PostgreSQL table resources (table/schema/view metadata on the connector; connection URLs and secrets are **not** stored in the manifest when using **`connector_connection`** — see below)
- **SparqlConnector**: RDF class / SPARQL endpoint wiring (same proxy pattern as SQL when needed)
2 changes: 2 additions & 0 deletions docs/index.md
@@ -57,6 +57,8 @@ Resources and transforms are part of `IngestionModel`, not `Schema`.

A `Resource` is the central abstraction that bridges data sources and the graph schema. Each Resource defines a reusable pipeline of actors (descend, transform, vertex, edge) that maps raw records to graph elements. Data sources bind to Resources by name via the `DataSourceRegistry`, so the same transformation logic applies regardless of whether data arrives from a file, an API, or a SPARQL endpoint.

For wide rows with many empty or null columns, **`drop_trivial_input_fields`** (default `false`) removes only **top-level** keys whose value is `null` or `""` before the pipeline runs—no recursion into nested structures.

### DataSourceRegistry

The `DataSourceRegistry` manages `AbstractDataSource` adapters, each carrying a `DataSourceType`:
1 change: 1 addition & 0 deletions examples/9-connector-connection-proxy/README.md
@@ -6,6 +6,7 @@ This example demonstrates the non-secret runtime indirection:

Key points:
- The manifest stores only `conn_proxy` labels inside `bindings.connector_connection`.
- Each `connector` row references a connector by **`name` or `hash`** (not by ingestion resource name).
- The runtime script registers the real `PostgresConfig` under that proxy label
via `InMemoryConnectionProvider`.
- `provider.bind_from_bindings(bindings=...)` connects manifest connectors
@@ -48,25 +48,30 @@ def _load_mock_postgres_schema(*, postgres_conf: PostgresConfig) -> None:

def make_explicit_postgres_bindings(conn_proxy: str) -> Bindings:
"""Create manifest bindings with explicit connector_connection proxy labels."""
# In this example we keep `connector.name` omitted and rely on
# connector.resource_name as the stable manifest alias.
# Each connector has an explicit `name` so `connector_connection.connector`
# can reference it. Ingestion resource names still come from `resource_name`
# (or `resource_connector`); those names are not valid connector refs.
connectors = [
TableConnector(
name="users",
table_name="users",
schema_name="public",
resource_name="users",
),
TableConnector(
name="products",
table_name="products",
schema_name="public",
resource_name="products",
),
TableConnector(
name="purchases",
table_name="purchases",
schema_name="public",
resource_name="purchases",
),
TableConnector(
name="follows",
table_name="follows",
schema_name="public",
resource_name="follows",
4 changes: 2 additions & 2 deletions graflo/__init__.py
@@ -47,8 +47,8 @@
GraphModel,
Index,
IngestionModel,
BoundSourceKind,
ResourceConnector,
ResourceType,
Resource,
SparqlConnector,
Schema,
@@ -135,8 +135,8 @@
"FileConnector",
"Bindings",
"JoinClause",
"BoundSourceKind",
"ResourceConnector",
"ResourceType",
"SparqlConnector",
"TableConnector",
]
4 changes: 2 additions & 2 deletions graflo/architecture/__init__.py
@@ -17,8 +17,8 @@
JoinClause,
ProtoTransform,
Resource,
BoundSourceKind,
ResourceConnector,
ResourceType,
SparqlConnector,
TableConnector,
Transform,
@@ -54,8 +54,8 @@
"JoinClause",
"ProtoTransform",
"Resource",
"BoundSourceKind",
"ResourceConnector",
"ResourceType",
"Schema",
"SchemaDBAware",
"SparqlConnector",