
bug: unrestricted concurrent task creation causes resource exhaustion vulnerability #2

@hartym

Summary

TL;DR: The Container's dependency resolution creates an unbounded number of concurrent asyncio tasks, so an attacker who can register services can exhaust system resources through deep or wide dependency graphs. This vulnerability affects both Container and ScopedContainer and can cause denial of service in production applications.

Context

The current _create_instance method in src/hdmi/containers/default.py:137 resolves all dependencies concurrently without any limit on the number of concurrent tasks. When a service has many dependencies (or a deep dependency tree), this creates numerous asyncio tasks simultaneously:

# Current implementation at line 189-195
dependency_tasks[param_name] = asyncio.create_task(self.get(dependency_type))
# ...
await asyncio.gather(*dependency_tasks.values())  # No limit on concurrent tasks

This design enables efficient concurrent resolution (as tested in test_concurrent_resolution.py) but exposes the system to resource exhaustion attacks. An attacker could register services with artificially deep or wide dependency graphs to overwhelm the system.
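To make the fan-out concrete, here is a minimal standalone sketch, not the hdmi code itself. `resolve` is a hypothetical stand-in for `Container.get`, and the two counters are illustration-only bookkeeping that records how many resolutions are in flight at once.

```python
import asyncio

in_flight = 0  # resolutions currently running
peak = 0       # highest value in_flight has reached


async def resolve(name: str) -> str:
    """Hypothetical stand-in for Container.get(): pretend to build one dependency."""
    global in_flight, peak
    in_flight += 1
    peak = max(peak, in_flight)
    await asyncio.sleep(0.01)  # simulate async construction work
    in_flight -= 1
    return name


async def resolve_all(names: list[str]) -> dict[str, str]:
    # Same shape as the snippet above: one task per dependency, then gather.
    tasks = {n: asyncio.create_task(resolve(n)) for n in names}
    await asyncio.gather(*tasks.values())
    return {n: t.result() for n, t in tasks.items()}


def measure_peak(width: int) -> int:
    """Resolve `width` fake dependencies and report the peak concurrency."""
    global in_flight, peak
    in_flight = peak = 0
    asyncio.run(resolve_all([f"dep{i}" for i in range(width)]))
    return peak
```

In this sketch the peak equals the number of dependencies requested, because every task starts before any of them finishes. Nothing in the pattern caps that number.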

Input

Attack Vectors:

  • Service with many direct dependencies (wide graph)
  • Deeply nested dependency chains (deep graph)
  • Circular references that bypass validation
  • Repeated requests for transient services with complex graphs

Example vulnerable configuration:

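class ServiceWith100Dependencies:
    def __init__(self, dep1: Service1, dep2: Service2, ..., dep100: Service100):
        pass

# Or deep nesting: each level takes the level below it as a constructor dependency
# ... Level1 through Level8 are defined the same way, lowest first
class Level9:
    def __init__(self, dep: Level8):
        pass

class Level10:
    def __init__(self, dep: Level9):
        pass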

Output and Testing Scenarios

Expected Output:

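  • Resolution keeps working normally while the number of concurrent tasks stays under the limit
  • When the limit is reached, resolution waits for a free slot, up to a configurable timeout
  • Raises ResourceExhaustedError if the timeout expires before a slot becomes available
  • The semaphore releases slots as tasks complete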

Testing Scenarios:

  1. Happy Path: Normal service with 5-10 dependencies resolves without hitting limit
  2. Edge Case - At Limit: Service graph with exactly 100 concurrent tasks works normally
  3. Edge Case - Over Limit: Service graph requiring 150 concurrent tasks waits for available slots
  4. Error Case - Timeout: When all slots stay occupied past the timeout (30s default), raises ResourceExhaustedError
  5. Scope Inheritance: ScopedContainer shares parent's semaphore, preventing bypass via scoping
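Scenarios 2 and 3 can be checked without the container at all. The following is a rough sketch under that assumption: `run_bounded` is a hypothetical helper that runs fake resolutions under a plain `asyncio.Semaphore` and reports how many completed and how many ran at once.

```python
import asyncio


async def run_bounded(total: int, limit: int) -> tuple[int, int]:
    """Run `total` fake resolutions under a semaphore of size `limit`.

    Returns (completed, peak_concurrency).
    """
    semaphore = asyncio.Semaphore(limit)
    in_flight = peak = completed = 0

    async def resolve() -> None:
        nonlocal in_flight, peak, completed
        async with semaphore:  # waits here when all slots are taken
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.001)  # simulated construction work
            in_flight -= 1
            completed += 1

    await asyncio.gather(*(resolve() for _ in range(total)))
    return completed, peak


# Scenario 2: exactly at the limit, nothing has to wait.
assert asyncio.run(run_bounded(total=100, limit=100)) == (100, 100)
# Scenario 3: over the limit, everything finishes but never more than 100 at once.
assert asyncio.run(run_bounded(total=150, limit=100)) == (150, 100)
```

A real test suite would run the same checks against Container and ScopedContainer rather than a bare semaphore.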

Possible Implementation

Chosen approach: Default limit of 100 concurrent tasks, configurable via ContainerBuilder, with timeout on blocking.

Key implementation points:

  1. Add to ContainerBuilder:

    • New parameter: max_concurrent_resolutions: int | None = 100
    • New parameter: resolution_timeout: float | None = 30.0
    • Pass to Container during build()
  2. Add to Container.__init__:

    self._resolution_semaphore = asyncio.Semaphore(max_concurrent_resolutions or 100)
    self._resolution_timeout = resolution_timeout or 30.0
  3. Wrap task creation in _create_instance:

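    # asyncio.timeout() requires Python 3.11+; as written it also bounds the
    # resolution itself, not only the wait for a semaphore slot
    async with asyncio.timeout(self._resolution_timeout):
        async with self._resolution_semaphore:
            # Existing resolution logic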
  4. ScopedContainer behavior:

    • Inherits parent's semaphore (shares the global limit)
    • Prevents circumvention by creating scopes
  5. Error handling:

    • Create new ResourceExhaustedError exception
    • Raise when timeout expires while waiting for semaphore
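A possible shape for points 3 to 5, assuming the timeout is meant to cover only the wait for a slot, as point 5 describes. `ResolutionLimiter` and its `run` method are hypothetical names for this sketch, not part of hdmi. It uses `asyncio.wait_for` so that it also runs on Python versions older than 3.11.

```python
import asyncio


class ResourceExhaustedError(RuntimeError):
    """Raised when no resolution slot frees up within the timeout."""


class ResolutionLimiter:
    """Hypothetical helper: a semaphore plus a timeout on acquiring it."""

    def __init__(self, max_concurrent: int = 100, timeout: float = 30.0) -> None:
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self._timeout = timeout

    async def run(self, factory):
        """Await `factory()` while holding one slot."""
        try:
            # Only the wait for a slot is timed, not the construction itself.
            await asyncio.wait_for(self._semaphore.acquire(), self._timeout)
        except asyncio.TimeoutError:
            raise ResourceExhaustedError(
                f"no resolution slot available after {self._timeout}s"
            ) from None
        try:
            return await factory()
        finally:
            self._semaphore.release()


async def demo() -> str:
    """Hold the only slot, then show that a second caller times out."""
    limiter = ResolutionLimiter(max_concurrent=1, timeout=0.05)
    release = asyncio.Event()

    async def slow() -> str:
        await release.wait()
        return "built"

    async def fast() -> str:
        return "built"

    holder = asyncio.create_task(limiter.run(slow))
    await asyncio.sleep(0.01)  # give the holder time to take the slot
    try:
        await limiter.run(fast)
    except ResourceExhaustedError:
        outcome = "exhausted"
    else:
        outcome = "acquired"
    release.set()
    await holder
    return outcome
```

Under this sketch, a ScopedContainer would receive the parent's limiter object instead of creating its own, which gives the shared global limit from point 4. The sketch does not settle one open question: if a slot is held while nested dependencies wait for slots of their own, a graph deeper than the limit could stall until the timeout fires.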

Configuration example:

builder = ContainerBuilder(
    max_concurrent_resolutions=50,  # Override default
    resolution_timeout=10.0  # 10 second timeout
)

Current Challenges

  • Backward compatibility: Default limit might affect existing applications with legitimate large dependency graphs
  • Performance testing: Need benchmarks to validate the default limit of 100 is appropriate
  • Monitoring: Consider adding metrics/logging for semaphore wait times and timeouts
  • Documentation: Must clearly document the limit and how to adjust for specific use cases
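For the monitoring point, one lightweight option is to time the semaphore acquisition and log slow waits. This is only a sketch: the function name, the logger name, and the `warn_after` threshold are all assumptions made for illustration.

```python
import asyncio
import logging
import time

logger = logging.getLogger("hdmi.resolution")  # hypothetical logger name


async def acquire_with_metrics(
    semaphore: asyncio.Semaphore, warn_after: float = 0.1
) -> float:
    """Acquire a slot and return how long the caller waited, in seconds."""
    started = time.perf_counter()
    await semaphore.acquire()
    waited = time.perf_counter() - started
    if waited >= warn_after:
        # A slow wait suggests the limit is too low or the graph too wide.
        logger.warning("waited %.3fs for a resolution slot", waited)
    return waited
```

The returned wait time could also feed a histogram or counter if the application already exports metrics. The caller stays responsible for releasing the slot.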
