Add offline/air-gapped model resolution for inference-models cache#2187
Draft
Conversation
- Write model_id into model_config.json so it can be recovered without
the auto-resolution-cache (which expires and gets deleted)
- scan_cached_models now also walks models-cache/{slug}/{package_id}/
model_config.json under both MODEL_CACHE_DIR and INFERENCE_HOME,
covering models pre-populated via inference-models without a
corresponding model_type.json in MODEL_CACHE_DIR
- Prune models-cache/ from the model_type.json walk to avoid noise
- De-duplicate results by model_id (layout-1 takes precedence)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
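The scan described in the commit above could look roughly like the following sketch. This is an illustration only, not the PR's actual implementation: the `models-cache/{slug}/{package_id}/model_config.json` layout is taken from the commit message, while the function name and the exact JSON keys are assumptions.

```python
import json
from pathlib import Path


def scan_inference_models_cache(root: Path) -> dict:
    """Walk models-cache/{slug}/{package_id}/model_config.json under one
    cache root and map each stored model_id back to its package directory.
    In the PR this would run for both MODEL_CACHE_DIR and INFERENCE_HOME."""
    results = {}
    models_cache = root / "models-cache"
    if not models_cache.is_dir():
        return results
    for config_path in models_cache.glob("*/*/model_config.json"):
        try:
            config = json.loads(config_path.read_text())
        except (OSError, json.JSONDecodeError):
            continue  # unreadable or partially written config -- skip it
        model_id = config.get("model_id")
        if model_id and model_id not in results:
            # first hit wins, mirroring "layout-1 takes precedence"
            results[model_id] = config_path.parent
    return results
```

Reading the stored `model_id` (rather than deriving it from the slug in the path) is what lets the scan survive expiry of the auto-resolution cache.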
When model_type.json is missing (e.g. models pre-populated via
inference-models without going through the registry), check
models-cache/{slug}/{package_id}/model_config.json under both
MODEL_CACHE_DIR and INFERENCE_HOME. This allows get_model_type()
to resolve cached models without hitting the Roboflow API,
enabling fully air-gapped model loading.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
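A minimal sketch of the fallback described in this commit, assuming `model_config.json` stores the task type and architecture alongside `model_id` (the key names here are guesses, and the function name is hypothetical):

```python
import json
from pathlib import Path


def resolve_model_type_offline(model_id, slug, roots):
    """When model_type.json is missing, look for a model_config.json written
    at download time under each cache root (e.g. MODEL_CACHE_DIR, then
    INFERENCE_HOME). Returns (task_type, architecture) or None."""
    for root in roots:
        slug_dir = Path(root) / "models-cache" / slug
        if not slug_dir.is_dir():
            continue
        for config_path in slug_dir.glob("*/model_config.json"):
            config = json.loads(config_path.read_text())
            if config.get("model_id") == model_id:
                # assumed key names; the real file may differ
                return config.get("task_type"), config.get("model_architecture")
    return None  # caller falls through to the API-backed resolution path
```

Returning `None` rather than raising keeps the ordinary online path untouched: only fully air-gapped setups ever depend on this branch succeeding.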
When model weights are already cached in the inference-models layout
(models-cache/{slug}/{package_id}/), pass the local directory path
to AutoModel.from_pretrained() instead of the model ID. This triggers
load_model_from_local_storage() which skips the API call entirely,
enabling air-gapped model loading without modifying the inference-models
package.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
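The adapter-side resolution this commit describes can be sketched as follows. The slug argument and directory layout are taken from the paths mentioned in this PR; the function name and the use of `model_config.json` as the cache-hit marker are assumptions, and the real `_resolve_cached_model_path` in the diff may differ:

```python
from pathlib import Path


def resolve_model_id_or_path(model_id, slug, cache_root):
    """If weights are already cached under models-cache/{slug}/{package_id}/,
    return that directory so a from_pretrained-style loader takes its
    local-storage branch; otherwise return the model ID unchanged so the
    normal networked resolution still happens."""
    slug_dir = Path(cache_root) / "models-cache" / slug
    if slug_dir.is_dir():
        for package_dir in sorted(slug_dir.iterdir()):
            if (package_dir / "model_config.json").is_file():
                return str(package_dir)
    return model_id
```

Because a miss returns the original ID, this is safe to apply unconditionally: online behavior is unchanged, and only already-cached models short-circuit the API call.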
# Conflicts:
# inference/core/cache/air_gapped.py
# inference/core/interfaces/http/builder/routes.py
# inference/core/interfaces/http/handlers/workflows.py
# inference/core/workflows/core_steps/models/foundation/anthropic_claude/v1.py
# inference/core/workflows/core_steps/models/foundation/anthropic_claude/v2.py
# inference/core/workflows/core_steps/models/foundation/anthropic_claude/v3.py
# inference/core/workflows/core_steps/models/foundation/clip/v1.py
# inference/core/workflows/core_steps/models/foundation/clip_comparison/v1.py
# inference/core/workflows/core_steps/models/foundation/clip_comparison/v2.py
# inference/core/workflows/core_steps/models/foundation/cog_vlm/v1.py
# inference/core/workflows/core_steps/models/foundation/depth_estimation/v1.py
# inference/core/workflows/core_steps/models/foundation/easy_ocr/v1.py
# inference/core/workflows/core_steps/models/foundation/florence2/v1.py
# inference/core/workflows/core_steps/models/foundation/florence2/v2.py
# inference/core/workflows/core_steps/models/foundation/gaze/v1.py
# inference/core/workflows/core_steps/models/foundation/google_gemini/v1.py
# inference/core/workflows/core_steps/models/foundation/google_gemini/v2.py
# inference/core/workflows/core_steps/models/foundation/google_gemini/v3.py
# inference/core/workflows/core_steps/models/foundation/google_vision_ocr/v1.py
# inference/core/workflows/core_steps/models/foundation/llama_vision/v1.py
# inference/core/workflows/core_steps/models/foundation/lmm/v1.py
# inference/core/workflows/core_steps/models/foundation/lmm_classifier/v1.py
# inference/core/workflows/core_steps/models/foundation/moondream2/v1.py
# inference/core/workflows/core_steps/models/foundation/ocr/v1.py
# inference/core/workflows/core_steps/models/foundation/openai/v1.py
# inference/core/workflows/core_steps/models/foundation/openai/v2.py
# inference/core/workflows/core_steps/models/foundation/openai/v3.py
# inference/core/workflows/core_steps/models/foundation/openai/v4.py
# inference/core/workflows/core_steps/models/foundation/perception_encoder/v1.py
# inference/core/workflows/core_steps/models/foundation/qwen/v1.py
# inference/core/workflows/core_steps/models/foundation/qwen3_5vl/v1.py
# inference/core/workflows/core_steps/models/foundation/qwen3vl/v1.py
# inference/core/workflows/core_steps/models/foundation/seg_preview/v1.py
# inference/core/workflows/core_steps/models/foundation/segment_anything2/v1.py
# inference/core/workflows/core_steps/models/foundation/segment_anything3/v1.py
# inference/core/workflows/core_steps/models/foundation/segment_anything3/v2.py
# inference/core/workflows/core_steps/models/foundation/segment_anything3/v3.py
# inference/core/workflows/core_steps/models/foundation/segment_anything3_3d/v1.py
# inference/core/workflows/core_steps/models/foundation/smolvlm/v1.py
# inference/core/workflows/core_steps/models/foundation/stability_ai/image_gen/v1.py
# inference/core/workflows/core_steps/models/foundation/stability_ai/inpainting/v1.py
# inference/core/workflows/core_steps/models/foundation/stability_ai/outpainting/v1.py
# inference/core/workflows/core_steps/models/foundation/yolo_world/v1.py
# inference/core/workflows/core_steps/models/roboflow/instance_segmentation/v1.py
# inference/core/workflows/core_steps/models/roboflow/instance_segmentation/v2.py
# inference/core/workflows/core_steps/models/roboflow/keypoint_detection/v1.py
# inference/core/workflows/core_steps/models/roboflow/keypoint_detection/v2.py
# inference/core/workflows/core_steps/models/roboflow/multi_class_classification/v1.py
# inference/core/workflows/core_steps/models/roboflow/multi_class_classification/v2.py
# inference/core/workflows/core_steps/models/roboflow/multi_label_classification/v1.py
# inference/core/workflows/core_steps/models/roboflow/multi_label_classification/v2.py
# inference/core/workflows/core_steps/models/roboflow/object_detection/v1.py
# inference/core/workflows/core_steps/models/roboflow/object_detection/v2.py
# inference/core/workflows/core_steps/models/roboflow/semantic_segmentation/v1.py
# inference/core/workflows/core_steps/sinks/roboflow/custom_metadata/v1.py
# inference/core/workflows/core_steps/sinks/roboflow/dataset_upload/v1.py
# inference/core/workflows/core_steps/sinks/roboflow/dataset_upload/v2.py
# inference/core/workflows/core_steps/sinks/roboflow/model_monitoring_inference_aggregator/v1.py
# inference/core/workflows/core_steps/sinks/slack/notification/v1.py
# inference/core/workflows/core_steps/sinks/twilio/sms/v1.py
# inference/core/workflows/core_steps/sinks/twilio/sms/v2.py
# inference/core/workflows/core_steps/sinks/webhook/v1.py
# tests/unit/core/cache/test_air_gapped.py
# tests/unit/core/interfaces/http/test_blocks_describe_airgapped.py
# tests/unit/core/workflows/test_air_gapped_blocks.py
sberan
commented
Mar 31, 2026
)
from inference.core.workflows.prototypes.block import BlockAirGappedInfo

logger = logging.getLogger(__name__)
Collaborator
I am not sure, probably not
)
self._model: InstanceSegmentationModel = AutoModel.from_pretrained(
-    model_id_or_path=model_id,
+    model_id_or_path=_resolve_cached_model_path(model_id),
Collaborator
Why is this needed?
Wouldn't it be enough to just extend the auto-loading cache expiry? It loads for me even in detached mode, and you will most likely hit the ALLOW_INFERENCE_MODELS_DIRECTLY_ACCESS_LOCAL_PACKAGES guard in a standard setup.
Contributor
Author
Rather than manually loading from cache, we can extend the auto loader's TTL, and we will need to add a flag that raises an error instead of falling back to the API when files are missing or integrity cannot be verified. This will be a trivial change to AutoLoader.
Collaborator
@sberan could we quickly sync so that I understand what you're trying to achieve?
The inference-models cache layout uses opaque slug directory names, so deriving model_id from the path produces IDs like microsoft-coco-obj-det/22 instead of coco/22. This breaks the reverse alias lookup in /build/api/models, causing empty aliases and making models invisible in the air-gapped workflow builder dropdown. Now scan_cached_models also reads model_config.json (written by dump_model_config_for_offline_use) and uses its stored model_id, which matches REGISTERED_ALIASES and enables correct alias resolution.
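The alias breakage described above comes down to a reverse lookup over the registered aliases. A toy illustration, assuming the alias table maps friendly names to canonical model IDs (the exact shape of REGISTERED_ALIASES is not shown in this PR):

```python
def reverse_alias(model_id, registered_aliases):
    """Return the friendly alias whose registered target is model_id, if any.
    A slug-derived ID like 'microsoft-coco-obj-det/22' never matches a
    canonical target like 'coco/22', so the lookup comes back empty."""
    for alias, target in registered_aliases.items():
        if target == model_id:
            return alias
    return None
```

This is why scanning must report the `model_id` stored in `model_config.json` rather than one reconstructed from the opaque slug directory name.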
AutoModel.from_pretrained already handles resolving model IDs to local cache paths — duplicating that logic in the adapters layer is unnecessary.
- Add find_cached_model_package_dir() in inference_models as the single source of truth for locating cached model packages
- Catch RetryError in from_pretrained to fall back to the local cache when the network is unavailable
- Simplify roboflow.py and air_gapped.py to use the shared helper instead of duplicating slug/scan logic
- Remove _slugify_model_id from air_gapped.py (it was a copy of the canonical function in inference_models)
- Add tests for find_cached_model_package_dir, is_model_cached delegation, and the RetryError offline fallback
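The RetryError fallback in the second bullet can be sketched as below. A stand-in exception class is used here in place of the retry library's real RetryError, and the function and callback names are illustrative, not the PR's actual signatures:

```python
from pathlib import Path


class RetryError(Exception):
    """Stand-in for the retry library's exhausted-retries error."""


def from_pretrained_offline_fallback(model_id, download, find_cached_package_dir):
    """Try the networked download path first; if retries are exhausted,
    fall back to a previously cached package directory when one exists,
    otherwise re-raise so the caller sees the real network failure."""
    try:
        return download(model_id)
    except RetryError:
        cached = find_cached_package_dir(model_id)
        if cached is not None:
            return cached
        raise  # nothing cached locally -- surface the failure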
When the builder UI is rendered externally (outside of the inference server's built-in HTML page), there's no way to obtain the CSRF token needed for builder API calls. This endpoint lets external UIs fetch the token directly.
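One common shape for such a token endpoint is a timestamped HMAC that the external UI fetches once and echoes back on subsequent builder API calls. This is a generic sketch, not the PR's implementation; the secret handling and token format here are assumptions:

```python
import hashlib
import hmac
import secrets
import time

SECRET = secrets.token_bytes(32)  # per-process secret; real code would persist/configure this


def mint_csrf_token():
    """Token the endpoint would hand to an externally rendered builder UI."""
    ts = str(int(time.time()))
    sig = hmac.new(SECRET, ts.encode(), hashlib.sha256).hexdigest()
    return f"{ts}.{sig}"


def verify_csrf_token(token, max_age=3600):
    """Check signature and freshness of a token echoed back by the UI."""
    try:
        ts, sig = token.split(".", 1)
    except ValueError:
        return False
    if not ts.isdigit():
        return False
    expected = hmac.new(SECRET, ts.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected) and int(time.time()) - int(ts) <= max_age
```

Using `hmac.compare_digest` avoids timing side channels when comparing the signature.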
Some block manifest classes are plain Pydantic models that don't implement get_compatible_task_types, get_air_gapped_availability, or get_supported_model_variants. Add hasattr guards so /workflows/blocks/describe?air_gapped=true doesn't crash.
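The guard pattern is straightforward: probe for each optional class method before calling it. A sketch (the describe-helper name and result keys are illustrative; only the three method names come from this PR):

```python
def describe_air_gapped(manifest_cls):
    """Collect optional air-gapped metadata from a block manifest class,
    tolerating plain Pydantic manifests that implement none of the hooks."""
    info = {}
    for attr in (
        "get_compatible_task_types",
        "get_air_gapped_availability",
        "get_supported_model_variants",
    ):
        if hasattr(manifest_cls, attr):
            # strip the 'get_' prefix to form the result key
            info[attr.removeprefix("get_")] = getattr(manifest_cls, attr)()
    return info
```

A manifest without any of the hooks simply yields an empty dict instead of an AttributeError, which is what keeps `/workflows/blocks/describe?air_gapped=true` from crashing.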
Summary
- Add _resolve_cached_model_path to inference-models adapters so AutoModel.from_pretrained can load directly from the local inference-models cache without calling the Roboflow API
- Read model_config.json in the inference-models cache layout when resolving model metadata, so air-gapped environments can discover models cached by inference-models
- Write model_id into model_config.json during model download so offline scanning can map cache entries back to their canonical model IDs
- Fix logger initialization in workflow handlers
Test plan
- model_config.json fallback returns the correct task type and architecture
- model_type.json cache layout still works as before