Skip to content

[Model] RootedTreeStorageAssignment #409

@isPANN

Description

@isPANN

Motivation

ROOTED TREE STORAGE ASSIGNMENT (P153) from Garey & Johnson, A4 SR5. An NP-complete problem from the Storage and Retrieval category. Given a finite set X and a collection of subsets of X, the goal is to find a directed rooted tree on X and extend each subset (by adding extra elements) so that each extended subset forms a directed path in the tree, while minimizing the total number of added elements. This models the problem of organizing data in a hierarchical (tree-structured) storage system where related data items (subsets) must be stored along contiguous paths in the hierarchy. NP-completeness is proved by Gavril (1977) via reduction from ROOTED TREE ARRANGEMENT.

Associated rules:

  • R099: Rooted Tree Arrangement → Rooted Tree Storage Assignment (as target)

Definition

Name: RootedTreeStorageAssignment
Canonical name: ROOTED TREE STORAGE ASSIGNMENT
Reference: Garey & Johnson, Computers and Intractability, A4 SR5, p.227

Mathematical definition:

INSTANCE: Finite set X, collection C = {X_1, X_2, ..., X_n} of subsets of X, positive integer K.
QUESTION: Is there a collection C' = {X_1', X_2', ..., X_n'} of subsets of X such that X_i is a subset of X_i' for 1 <= i <= n, such that sum_{i=1}^{n} |X_i' - X_i| <= K, and such that there is a directed rooted tree T = (X, A) in which the elements of each X_i', 1 <= i <= n, form a directed path?

Variables

  • Count: |X| variables (one per element of X), encoding the tree topology as a parent array.
  • Per-variable domain: Each variable takes values in {0, 1, ..., |X|-1}.
  • Encoding: config[i] = parent of node i in the rooted tree T. The root r satisfies config[r] = r. This gives dims() = vec![universe_size; universe_size].
  • Meaning: The variable assignment encodes a hierarchical storage layout (the rooted tree T). Once the tree is fixed, the optimal extensions for each subset are determined greedily: for each subset X_i, find the shortest directed path in T containing all elements of X_i. The extension cost |X_i' - X_i| equals the number of intermediate nodes added. The evaluate() function checks: (1) the parent array forms a valid rooted tree, (2) for each subset, all elements lie on a common root-to-descendant path, (3) the total extension cost (sum of intermediate nodes across all subsets) does not exceed K. If the parent array is not a valid rooted tree (e.g., contains cycles or multiple roots), evaluate returns false.

Schema (data type)

Type name: RootedTreeStorageAssignment
Variants: none (no type parameters)

Field Type Description
universe_size usize Size of the finite set X (elements labeled 0..
subsets Vec<Vec<usize>> Collection C of subsets of X (each subset is a Vec of element indices)
bound usize Maximum total extension cost K

Size fields:

  • universe_size() — number of elements |X|
  • num_subsets() — number of subsets |C|

Notes:

  • This is a satisfaction (decision) problem: Metric = bool, implementing SatisfactionProblem.
  • A "directed path" in T means a sequence of nodes x_1, x_2, ..., x_k where each x_{i+1} is a child of x_i (i.e., the path goes from ancestor to descendant along the tree).
  • The tree T is part of the solution, not the instance. The solver must find T; extensions are computed deterministically once T is fixed.

Complexity

  • Best known exact algorithm: No specialized sub-exponential exact algorithm is known. Brute-force enumerates all parent-array configurations (|X|^|X| total), filters valid rooted trees, and for each valid tree computes the greedy extension cost.
  • Complexity string for declare_variants!: "universe_size^universe_size"
  • NP-completeness: NP-complete [Gavril, 1977a], via reduction from ROOTED TREE ARRANGEMENT (GT45).
  • Special cases: If all subsets have size <= 2 (pairs), the problem relates directly to tree arrangement. If the tree T is given as part of the input, finding optimal extensions can be done in polynomial time.
  • References:
    • F. Gavril (1977). "Some NP-complete problems on graphs." In: Proceedings of the 11th Conference on Information Sciences and Systems, pp. 91-95. Johns Hopkins University.

Extra Remark

Full book text:

INSTANCE: Finite set X, collection C = {X_1, X_2, ..., X_n} of subsets of X, positive integer K.
QUESTION: Is there a collection C' = {X_1', X_2', ..., X_n'} of subsets of X such that X_i is a subset of X_i' for 1 <= i <= n, such that sum_{i=1}^{n} |X_i' - X_i| <= K, and such that there is a directed rooted tree T = (X, A) in which the elements of each X_i', 1 <= i <= n, form a directed path?
Reference: [Gavril, 1977a]. Transformation from ROOTED TREE ARRANGEMENT.

How to solve

  • It can be solved by (existing) bruteforce — the parent-array encoding maps directly to the Problem trait: dims() = vec![universe_size; universe_size] enumerates all parent arrays. evaluate() checks (1) valid rooted tree, (2) computes greedy extension costs per subset, (3) checks total ≤ bound. Most configurations are invalid trees, but BruteForce filters these naturally via evaluate() = false.
  • It can be solved by reducing to integer programming.
  • Other: When the tree T is fixed (given as input), the extension problem for each subset reduces to finding the shortest directed path in T that contains all elements of the subset. This is solvable in O(|X|) per subset by finding the lowest common ancestor and verifying that all subset elements lie on the path from the LCA to the deepest element.

Example Instance

Instance 1 (YES):
Universe X = {0, 1, 2, 3, 4} (|X| = 5).
Collection C:

  • X_1 = {0, 2}
  • X_2 = {1, 3}
  • X_3 = {0, 4}
  • X_4 = {2, 4}

Bound K = 1.

Solution: Use tree T rooted at 0 with children 1 and 2; node 1 has child 3; node 2 has child 4:

      0
     / \
    1   2
    |   |
    3   4

Extended subsets (each must form a directed path in T):

  • X_1' = {0, 2}: path 0→2. No extension needed. Cost = 0.
  • X_2' = {1, 3}: path 1→3. No extension needed. Cost = 0.
  • X_3' = {0, 2, 4}: path 0→2→4, extending {0, 4} by adding {2}. Cost = 1.
  • X_4' = {2, 4}: path 2→4. No extension needed. Cost = 0.
  • Total cost = 0 + 0 + 1 + 0 = 1 ≤ K = 1. ✓

Alternative satisfying tree: Root at 0, child 2; node 2 has children 1 and 4; node 1 has child 3:

      0
      |
      2
     / \
    1   4
    |
    3
  • X_1' = {0, 2}: path 0→2. Cost = 0.
  • X_2' = {2, 1, 3}: path 2→1→3, extending {1, 3} by adding {2}. Cost = 1.
  • X_3' = {0, 2, 4}: path 0→2→4, extending {0, 4} by adding {2}. Cost = 1.
  • X_4' = {2, 4}: path 2→4. Cost = 0.
  • Total cost = 0 + 1 + 1 + 0 = 2 > K = 1. ✗ (This tree does NOT satisfy the bound.)

Non-satisfying configuration: Chain tree 0→1→2→3→4:

  • X_1' = {0, 1, 2}: extend {0, 2} by adding {1}. Cost = 1.
  • X_2' = {1, 2, 3}: extend {1, 3} by adding {2}. Cost = 1.
  • X_3' = {0, 1, 2, 3, 4}: extend {0, 4} by adding {1, 2, 3}. Cost = 3.
  • X_4' = {2, 3, 4}: extend {2, 4} by adding {3}. Cost = 1.
  • Total cost = 1 + 1 + 3 + 1 = 6 > K = 1. ✗

Infeasible tree: Star tree (root 0, children 1, 2, 3, 4):

  • X_2 = {1, 3}: nodes 1 and 3 are siblings (both children of 0). No directed path in the tree contains both 1 and 3, since directed paths go strictly from ancestor to descendant. No extension of {1, 3} can form a directed path in a star tree. This tree is infeasible regardless of K.

Instance 2 (NO):
Universe X = {0, 1, 2, 3, 4} (|X| = 5).
Same collection C = {{0, 2}, {1, 3}, {0, 4}, {2, 4}}.
Bound K = 0.

For K = 0, every subset must already form a directed path with no extensions (X_i' = X_i). This means each pair of elements in a subset must be in a parent-child relationship in T. The subsets {0, 2}, {0, 4}, and {2, 4} require edges 0–2, 0–4, and 2–4 in the tree. These three edges form a triangle (3-cycle), but a tree is acyclic — it cannot contain a cycle. Therefore no rooted tree satisfies all constraints with K = 0. Answer: NO.

Expected Outcome

  • Instance 1: Satisfying. The tree rooted at 0 with edges 0→1, 0→2, 1→3, 2→4 achieves total extension cost 1 ≤ K = 1.
  • Instance 2: Not satisfying. No tree can achieve extension cost 0 due to the triangle constraint among {0, 2, 4}.

Metadata

Metadata

Assignees

No one assigned

    Labels

    GoodAn issue passed all checks.modelA model problem to be implemented.

    Type

    No type

    Projects

    Status

    Done

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions