Skip to content
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
146 changes: 146 additions & 0 deletions contracts/initialization-and-configuration-validation.ms
Original file line number Diff line number Diff line change
@@ -0,0 +1,146 @@
# Initialization & Configuration Validation Specification

This document specifies how Unleash SDKs MUST behave during initialization when configuration is missing or invalid.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it be called "required configuration parameters" and internally link to the definitions section?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure that seems like a good suggestion


## Overview

Unleash SDKs are frequently configured via environment variables. At runtime, required configuration such as url and token may be null, undefined, or otherwise invalid.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you use "such as" because you expect more required configs in future?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't undefined too JS specific to be part of the spec? maybe "absent"?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does invalid mean non conformant to the token and url spec?

This comment was marked as outdated.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you use "such as" because you expect more required configs in future?

Because it's late in the day and I don't have the entire list in my mind 😅 Need to do a teensy bit of digging tomorrow and figure out what exactly is required. I think it's just this but I'm not 100% sure

isn't undefined too JS specific to be part of the spec? maybe "absent"?

Cheers! Good feedback. Absent seems like the right thing here

does invalid mean non conformant to the token and url spec?

I think I'm being a bit too vague here. Invalid in my mind means "can't talk to unleash"/"unleash doesn't like your token". I specifically don't want to go down the path of validating URLs because it's weird and hard to do. I'd like to reuse existing failure paths that we have in the SDK - we're already pretty robust around not crashing when upstream is unreachable

If possible it would be great to specify clearly what's required.

Agree, I'll figure out what this is and update

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the full list is a nice-to-have, it can be evolved or changed over time. Happy having it if it's cheap.


SDKs MUST be resilient in production environments. Misconfiguration MUST NOT cause the host application process to crash by default. Instead, SDKs MUST start in a degraded state, emit appropriate warnings or errors through existing mechanisms, and provide predictable default behavior.

This specification defines:

- Default (soft) initialization behavior
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't find "soft" and "hard" clarifying here. Disclaimer: I'm reading from top to bottom so maybe it becomes clearer later. If I forget to clarify please ping me.

Edit:

  1. After reading the document I got even more convinced about using "strict" instead of "hard".
  2. I'm not sure about about default initialization. Maybe just Default initialization? It could also be Safe initialization to better explain that it won't throw? 🤔 Or a mix of the two like: Default (safe) initialization 🤷
Suggested change
- Default (soft) initialization behavior
- Default initialization
- Strict initialization

- Hard initialization behavior
- Debug/development mode behavior
- Public API behavior in degraded state
- Backup state loading semantics
- Event and logging requirements

## Definitions

Required configuration parameters:

- **URL** - A URL specifying the base path of the Unleash client API
- **Token** - An Unleash token that allows access to the relevant client API
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we have very unfortunate name for our backend APIs that we call client API but maybe we can say "backend client API" to be less confusing to newcomers?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gosh we are being burned - this needs to cover frontend SDKs here too so I think I need to be more clear here. This means "the URL that you put in the config at startup". I think some frontend SDKs do weird things like expect the full path too. Needs some clarity though, agree

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we landed on backend vs frontend. The fact that the backend lives under /client and that we used to call those APIs client is confusing. Maybe we could introduce a /backend url, and deprecate /client, but I don't want to go this route now. I think what Mateusz suggests makes sense for now, just be clearer with words :)


These parameters are required for communication with Unleash.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's already said on line 22


**Soft initialization**:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opinion. I find "soft" too vague. I'd prefer "lenient" because it means it tolerates misconfiguration. The opposite would be "strict".

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, that's a nice suggestion

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure abut "lenient". After looking for some alternatives I quite like best-effort e.g.: Best-effort initialization (default).


The default SDK initialization method.

**Hard initialization**:

An explicit, non-default initialization method that enforces successful initial synchronization before returning control.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-default -> opt-in

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what returning control means here. Some APIs have both sync and async init.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, nice suggestion


**Debug/Development mode**:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd call it development mode (although Rust would disagree). IMO development mode is the local-dev mode, while debug mode is when you set LOG_LEVEL=debug (that can be done in development or in production for reasons that I'm not going to name). Feel free to disagree, cause in Rust debug_assertions suggest that the more accurate name is Debug mode...


A runtime mode provided by the host ecosystem (e.g., Rust debug_assertions, Node NODE_ENV !== "production"), indicating the application is not running in production. Only relevant where the ecosystem supports this idiomatically.

**Degraded State**:

The SDK is initialized and callable, but has not successfully synchronized with Unleash due to missing or invalid configuration.


## Default (Soft) Initialization

### Required Behavior

When initialized using the default initialization method:

1) The SDK MUST NOT throw, panic, or terminate the host process due to missing or invalid required configuration.
2) The SDK MUST start and enter a running state, even if url and/or token are missing or invalid.
3) The SDK MUST emit its normal ready (or equivalent) event once initialization completes.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

once initialization completes
Can we better define "when" this will happen in some way?
Especially for invalid values I think it might be helpful to understand what happens and how long it can take.
But, I'm not really sure if this wouldn't overcomplicate this doc.


`ready` in this context means:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to add information about the degraded state to the event in some way? Even if only for users to create their own custom logs I think this could be helpful.

The SDK has initialized and public API methods are callable.
Comment on lines +56 to +57
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can steal some terminology from K8s here... liveness vs readiness. I think when you finish constructing the SDK, it's alive (liveness => true), while after you load from upstream, or backup, or bootstrap, it is "ready" (we could argue there are different levels of readiness, particularly if you have bootstrap>backup>upstream, maybe it's a level of freshness, IDK).

But agree, after we attempted to fetch from Unleash, we should move into the ready state, because we did our best effort to retrieve toggles.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that spec should distinguish between "fetched" successfully vs "gave up and loaded backup" so that users know if they're running on real data or defaults

It does NOT guarantee that flags have been fetched from Unleash, only that the SDK has made a best effort attempt to retrieve flags.

### Validation Semantics

- Soft validation MUST check only for presence of required configuration parameters.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "soft" here may be confused/conflated with "Soft initialization"

What about words like static or maybe basic? Unless it's intentional and I'm missing the connection.

Suggested change
- Soft validation MUST check only for presence of required configuration parameters.
- Static validation MUST check only for presence of required configuration parameters.

- SDKs MUST NOT reject URLs purely based on format validation. Communication success is determined by whether Unleash responds successfully.
- Missing or invalid required configuration MUST be surfaced through existing logging or error-event mechanisms.
- Configuration that is not required but missing should default to safe, known values.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to touch non required params in this spec? If so, what should happen with optional but incorrect parameters?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Configuration that is not required but missing should default to safe, known values.

Is this too vague? I did want to cover these, apparently I'm doing a bad job of it 😅

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Mateusz point is that this is increasing the surface area of this definition, we could address optional params in a different contract. Maybe it's fine if we don't even mention them cause it doesn't alter this document's scope

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what safe, known values are in practice. If they are default values then maybe it can be rephrased in some way?

Suggested change
- Configuration that is not required but missing should default to safe, known values.
- Missing optional configuration should fall back to their default values.


### Network Behavior

- The SDK MAY proceed with its normal fetch or streaming scheduling.
- Resulting network failures MUST be surfaced via existing error/logging mechanisms.
- Repeated failures MUST NOT cause process termination.

Rate limiting and retry behavior are governed by existing SDK mechanisms and MUST NOT be bypassed.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the url points to a service that happens to be broken/offline then keeping all this logic "working" makes sense.

I've explicitly opted not to go for an offline mode. Simply because rolling that out to 25+ SDKs would make me cry and because it's a subtly complex feature to get right. Graceful degradation will give us 95% of the value there, with the tradeoff that errors will be a bit ugly

I think it's a de facto offline mode and it would be a lot easier to explain why communication/retries should still be working if it was called as such. 🤔 But, maybe I'm wrong, explicit "offline" mode could make all of this even harder to understand. 🤔

So maybe "disconnected" mode? 🤷


### Backup State Loading

1) The SDK MUST attempt to load persisted toggle state during initialization, regardless of configuration validity.
2) Backup loading MUST occur before emitting the ready event.

Backup loading failure MUST NOT prevent SDK startup.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Backup loading failure MUST NOT prevent SDK startup.
Backup loading failure MUST NOT prevent SDK startup. Failures MUST be surfaced via existing error/logging mechanisms. SDK needs to emit the `ready` event.


## Degraded Public API Behavior

If the SDK has not successfully fetched toggle state from Unleash and bootstrapping and backup loading have failed:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a section on bootstrapping. What does it mean?

Also I would find it clearer if you used Network in this sentence as the section describing it is called "Network Behavior". Or maybe add a small (network) clarifier?

Suggested change
If the SDK has not successfully fetched toggle state from Unleash and bootstrapping and backup loading have failed:
If the SDK has not successfully fetched toggle state from Unleash (network) and bootstrapping and backup loading have failed:


`isEnabled` (or equivalent SDK method)
- MUST return false
- If a caller-provided fallback value is supplied, that fallback MUST be returned.

`getVariant` (or equivalent SDK method)
- MUST return the SDK’s canonical default variant (disabled).
- If a caller-provided fallback variant is supplied, that fallback MUST be returned.

`listToggles` (or equivalent SDK method)
- MUST return an empty list.

## Periodic Warnings

If required configuration is missing or invalid:
- The SDK MUST emit at least one clear warning during startup identifying which required parameters are missing.
- Continued connectivity failures MUST be surfaced using existing logging/event mechanisms.
- Emission MUST use existing rate-limiting or retry controls provided by the SDK.

## Hard Initialization

SDKs MAY provide an explicit, non-default initialization method that enforces strict validation.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that enforces strict validation

I would follow this explanation and rename "Hard Initialization" to "Strict Initialization".


### Required Behavior

When using a hard initialization method:

1) url and token MUST be treated as required.
Copy link
Copy Markdown
Contributor

@kwasniew kwasniew Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will be a breaking change for some SDKs since it's a default behavior nowadays. Are we gonna bump major versions for them?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we need to then yeah we will. But I don't intend to add hard/strict initialization to SDKs that don't have it until someone cries for it. This is more to clarify the behavior of SDKs that do have this right now. If we gotta break them, then so be it

2) The SDK MUST attempt a full fetch from Unleash. If retries are supported in the SDK then the retry limit must be exceeded.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "fetch" cross platform? How about "network request" or something similar?

If retries are supported in the SDK then the retry limit must be exceeded.

If there is a retry-limit then I think that Default Initialization should explicitly deal with it. Maybe I missed it, but I think it would fall under "Rate limiting and retry behavior are governed by existing SDK mechanisms and MUST NOT be bypassed." and I'm not sure how would that interact with "Periodic Warnings". But, that may be about different things that I just don't know much about.

Maybe retries and retry-limit could be a new point 4)?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a missing 'not' here?

Suggested change
2) The SDK MUST attempt a full fetch from Unleash. If retries are supported in the SDK then the retry limit must be exceeded.
2) The SDK MUST attempt a full fetch from Unleash. If retries are supported in the SDK then the retry limit must not be exceeded.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If URL is missing should it still attempt a fetch? Isn't is wasteful with empty URL?

Copy link
Copy Markdown

@krzychukula krzychukula Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the first point was about preventing that?

  1. url and token MUST be treated as required.

3) The method MUST NOT resolve/return until at least one successful toggle fetch has occurred.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if I'm missing something, but doesn't this contradict the "The method MUST reject, throw, or return an error."?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it the same behavior in the presence of a backup?


If the initial fetch fails:
- The method MUST reject, throw, or return an error.
- The SDK MUST NOT silently downgrade into degraded state.

Timeout behavior and retry semantics are governed by existing SDK implementation.

## Debug / Development Mode Behavior
Copy link
Copy Markdown
Contributor

@kwasniew kwasniew Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two modes (soft + hard):

  • Simpler mental model, each mode behaves consistently regardless of environment
  • Developer chooses behavior explicitly, no surprises
  • Easier to implement, test, and document across SDKs
  • Tradeoff: devs who always use soft init won't get automatic fail-fast during development

Three modes (soft + hard + debug):

  • Catches misconfig automatically during local dev without requiring hard init
  • Tradeoff: soft init becomes context-dependent, harder to reason about
  • Every SDK must detect and document environment semantics
  • More surface area to implement, test, and maintain across SDK ecosystem

Basically: three modes adds a small (maybe big?) developer convenience at the cost of a meaningful increase in complexity and a less predictable soft init contract.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opinion: I'm in explicit over implicit or configuration over convention camp. Two modes is configuration, three modes relies on environment convention. Works like magic until it doesn't match your actual intent

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opinion: developer mode is great and adds a lot of developer convenience. But, I wonder if we can have that by just by changing our examples in README of the SDKs that support that.

For example node sdk could have:

import { initialize } from 'unleash-client';

const unleash = initialize({
  url: 'https://YOUR-API-URL',
  appName: 'my-node-name',
  customHeaders: { Authorization: '<YOUR_API_TOKEN>' },
  strictInitialization: process.env.NODE_ENV === 'development' # throws on errors
});

It also has a clear advantage that each customer can tweak how NODE_ENV test or others should behave.


SDK ecosystems that provide a clear debug or development mode (e.g., Rust debug_assertions, Node non-production environments) MAY enforce stricter validation semantics during soft initialization.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it mean we have 3 modes?

  • soft/lenient
  • hard/strict
  • soft in dev mode
    If so then maybe the reader should know that in the beginning of the document. I have to read this section again tomorrow.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MAY enforce stricter validation semantics during soft initialization.

I don't like "soft" as it seems like it isn't proper initialization for some reason.

But, if we use default then I wonder if it wouldn't be confused with default values 🤔

MAY enforce stricter validation semantics during default initialization.
🤷 Not sure.


When operating in debug/development mode:
- The SDK MAY treat missing required configuration (url, token) as fatal during soft initialization.
- This MAY result in throwing, panicking, or otherwise failing initialization.
- This behavior MUST NOT apply in production mode.

This provision exists to enable fail-fast behavior during local development while preserving resilience in production environments.

SDKs implementing this behavior MUST document:
- How debug/development mode is detected.
- That strict validation applies only in that mode.

## Event and Logging Semantics

All validation and connectivity failures MUST be surfaced via existing SDK mechanisms:
- SDKs that emit error events MUST use those events.
- SDKs that rely on logging MUST log appropriately.
- SDKs MUST NOT introduce new event types unless separately specified.

At minimum:
- A clear warning identifying missing required configuration MUST be emitted during startup.
- Network failures resulting from invalid configuration MUST be surfaced normally.