Add Azure Blob Storage Gen2 (ADLS HNS) support to BlobFileStore#19014
Conversation
Make BlobFileStore adaptive: it now auto-detects whether the storage account has Hierarchical Namespace (HNS) enabled and uses native DataLake APIs for atomic moves, real directories, and efficient listing when available. It falls back to standard flat-namespace blob operations otherwise. HNS detection can be overridden via configuration.

- Add IFileStoreCapabilities interface and FileStoreCapabilities impl
- Add GetFilesAsync, GetDirectoriesAsync, and Capabilities to IFileStore
- Add UseHierarchicalNamespace option to BlobStorageOptions
- Add Azure.Storage.Files.DataLake package dependency
- Add MediaCreatedFileAsync and MediaCopiedFileAsync event hooks
- Wire up EnsureCapabilitiesAsync at startup in the Media.Azure module
Add integration tests that run against Azurite to verify BlobFileStore behavior in both flat-namespace (Gen1) and hierarchical-namespace (Gen2) modes. Tests cover capabilities detection, file CRUD, directory operations, move/copy, and directory content listing. Gen2 tests that use DataLake APIs will fail until Azurite supports the DFS endpoint (--dfsPort flag); this is intentional, to track when a newer Azurite version enables full Gen2 testing.

- Add abstract BlobFileStoreTestsBase with 26 test methods
- Add BlobFileStoreGen1Tests and BlobFileStoreGen2Tests subclasses
- Add AzuriteFactAttribute to skip tests when Azurite is unavailable
- Add OrchardCore.FileStorage.AzureBlob reference to the test project
- Start Azurite in the main_ci and pr_ci workflows on ubuntu runners
Add DfsEndpoint option to BlobStorageOptions for local emulators where the DFS endpoint runs on a separate port. Parse storage credentials from the connection string to construct the DataLakeServiceClient when an explicit DFS endpoint is configured. Fix Gen2 code paths to handle 404 RequestFailedException from DataLakePathClient.ExistsAsync, which can throw instead of returning false when the path does not exist.
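The 404 handling described in the commit can be sketched roughly like this (a minimal illustration, not the actual PR code; `pathClient` is assumed to be a `DataLakePathClient` from Azure.Storage.Files.DataLake):

```csharp
// Sketch: DataLakePathClient.ExistsAsync can surface a missing path as a
// 404 RequestFailedException instead of returning false, so both outcomes
// are normalized to a boolean here.
try
{
    Azure.Response<bool> response = await pathClient.ExistsAsync(cancellationToken);
    return response.Value;
}
catch (Azure.RequestFailedException ex) when (ex.Status == 404)
{
    // The path does not exist; treat the 404 the same as ExistsAsync returning false.
    return false;
}
```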
Reverted both workflows back to docker run steps (instead of services) since that lets us pass --skipApiVersionCheck directly. The command now uses your image with the correct flags:
I'll try to review this over the weekend.
```xml
<ProjectReference Include="..\..\src\OrchardCore.Modules\OrchardCore.DynamicCache\OrchardCore.DynamicCache.csproj" />
<ProjectReference Include="..\..\src\OrchardCore\OrchardCore.Apis.GraphQL.Abstractions\OrchardCore.Apis.GraphQL.Abstractions.csproj" />
<ProjectReference Include="..\..\src\OrchardCore\OrchardCore.Apis.GraphQL.Client\OrchardCore.Apis.GraphQL.Client.csproj" />
<ProjectReference Include="..\..\src\OrchardCore\OrchardCore.FileStorage.AzureBlob\OrchardCore.FileStorage.AzureBlob.csproj" />
```
Same concerns for this being in OrchardCore.Tests as under #19015 (comment).
```csharp
}

[AzuriteFact]
public async Task CreateDirectory_AlreadyExists_ReturnsFalse()
```
This and DeleteDirectory_WithNestedSubdirectories_DeletesAll fail for me in the Gen1 tests with an actual storage account in Azure.
That would mean that Azurite is not emulating Gen 1 properly. I'll take a look.
Then we simply should not test this with Gen 1 storage.
```csharp
{
    options.RemoveContainer = rawOptions.RemoveContainer;
    options.RemoveFilesFromBasePath = rawOptions.RemoveFilesFromBasePath;
    options.UseHierarchicalNamespace = rawOptions.UseHierarchicalNamespace;
```
Add a sample for this to the Web project's appsettings and here, with an explanation of when you would set which value (my understanding is that usually you wouldn't need to set this at all): https://docs.orchardcore.net/en/latest/reference/modules/Media.Azure/#configuration
It's an escape hatch for two edge cases where auto-detection doesn't work:

1. SAS tokens: GetAccountInfoAsync() requires storage account-level permissions. A SAS token scoped to a container won't have them, so the detection call fails and falls back to flat-namespace. If the account is actually Gen2/HNS, the user needs to set UseHierarchicalNamespace: true to force the correct behavior.
2. Local emulators (Azurite): Azurite doesn't implement the GetAccountInfo endpoint, so detection always fails. Combined with the separate DfsEndpoint option, this lets dev environments opt in to HNS simulation without a real Azure account.

Without the override, there'd be no way to use HNS-dependent features (atomic moves, real directory ops) in either of those scenarios.
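A possible appsettings sample for the docs, sketching the two overrides discussed above. The OrchardCore_Media_Azure section and the ConnectionString/ContainerName/CreateContainer keys follow the existing Media.Azure configuration; UseHierarchicalNamespace and DfsEndpoint are the options added in this PR, and the endpoint URL and port are placeholders for a local Azurite setup:

```json
{
  "OrchardCore": {
    "OrchardCore_Media_Azure": {
      "ConnectionString": "UseDevelopmentStorage=true",
      "ContainerName": "media",
      "CreateContainer": true,
      "UseHierarchicalNamespace": true,
      "DfsEndpoint": "http://127.0.0.1:11000/devstoreaccount1"
    }
  }
}
```

Usually you would omit both new keys and let auto-detection run; set UseHierarchicalNamespace only for container-scoped SAS tokens or emulators, and DfsEndpoint only when the DFS endpoint runs on a separate port from the blob endpoint.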
```diff
-var fileStore = new BlobFileStore(blobStorageOptions, clock, contentTypeProvider);
+var blobLogger = serviceProvider.GetRequiredService<ILogger<BlobFileStore>>();
+var fileStore = new BlobFileStore(blobStorageOptions, clock, contentTypeProvider, blobLogger);
+fileStore.EnsureCapabilitiesAsync().GetAwaiter().GetResult();
```
Sync-over-async is not the best; can this be done better?
```csharp
private readonly string _basePrefix;

public IFileStoreCapabilities Capabilities => _capabilities ?? FileStoreCapabilities.Default;
```
Do this in the other file stores too, at least to set StorageProvider. And FileSystemStore has HasHierarchicalNamespace and SupportsAtomicMove too.
No. AwsFileStore and FileSystemStore don't override Capabilities at all; they inherit the interface default (FileStoreCapabilities.Default), which returns all-false. That means:

- FileSystemStore: the local filesystem doesn't need HNS detection; it just uses regular directory semantics, so Default (flat, no atomic move) is correct.
- AwsFileStore: S3 is flat-namespace only (no HNS concept), so Default is also correct.

The EnsureCapabilitiesAsync pattern is specific to BlobFileStore because Azure Blob Storage has two distinct modes (Gen1 flat vs Gen2 HNS/ADLS) that need to be detected at runtime by calling the Azure API. The other stores don't have that ambiguity: their capabilities are statically known.
Now that you removed StorageProvider, that part is no longer needed, but I don't see why you wouldn't want to set the other two for FileSystemStore. You added this to describe the capabilities of the provider to consumers, so they can e.g. optimize the UX, right? (In addition to this being not just nice to have, but necessary to distinguish between Gen1 and Gen2 Blob Storage.) The local file system definitely has both a hierarchical namespace and atomic moves (two things that Gen2 mimics from a standard file system), so it shouldn't use FileStoreCapabilities.Default.
Now, if you actually meant IFileStoreCapabilities to be specific to BlobFileStore, then it shouldn't be a type in the general OrchardCore.FileStorage, nor should it be exposed via the general IFileStore interface.
Also, I'd ask you to please check AI answers instead of copying them to replies. I'm conversing with you, if I want to ask Copilot, I'll do that directly.
I will take a look at whether we can do this better. You are making an assumption that the UX on the Media Gallery will be different, which you should not; I can explain in detail by showing you on a call, because this does not change the way the new Media Gallery behaves, for different reasons. I see that you are trying to look at the whole thing together here.

I will rethink how this part can be done and whether we can make it more self-explanatory through code. The code should only affect server-side behaviors, though.
As for the Copilot/AI comment, I'm simply trying to save time. The answer was accurate, as this is how the current server-side code works. If we want to support atomic moves on local file storage, then we need to change the way the server-side code works, and thus change another provider; that would be a different PR.
Thanks!
Based on the documentation on IFileStoreCapabilities about how these properties are meant to be used, both look like they apply to FileSystemStore.
```csharp
/// Gets a value indicating whether the store has a true hierarchical namespace
/// (i.e. directories are first-class objects, not simulated via path prefixes).
```

Directories are not simulated via path prefixes but are first-class objects, e.g. in NTFS and ext4.

```csharp
/// Gets a value indicating whether the store supports atomic (server-side) move / rename
/// that does not require a copy-then-delete round-trip.
```

Same here: you can tell a file system to move or rename a file and it won't be a copy.
```csharp
/// <summary>
/// Gets the capabilities supported by this file store.
/// </summary>
IFileStoreCapabilities Capabilities => FileStoreCapabilities.Default;
```
I don't think this should have a default. Every provider should set at least StorageProvider.
I simplified this. There's no need to set a StorageProvider; it's a StorageName now.
```csharp
/// <summary>
/// Gets a human-readable name identifying the storage provider (e.g. "Local", "Azure Blob").
/// </summary>
string StorageProvider => "Unknown";
```
Where do we need this? It's not currently used.
…dia startup

Move BlobFileStore.EnsureCapabilitiesAsync() from the DI singleton factory (where it blocked via GetAwaiter().GetResult()) into MediaBlobContainerTenantEvents.ActivatingAsync(), which already runs asynchronously at tenant startup. BlobFileStore is now registered as its own singleton so it can be injected.

Also remove the unused StorageProvider member from IFileStoreCapabilities and FileStoreCapabilities, and remove the default implementation of IFileStore.Capabilities so every provider must declare capabilities explicitly. FileSystemStore and AwsFileStore are updated accordingly.
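The pattern this commit describes can be sketched roughly like this (a hedged illustration based on the commit message, not the actual PR code; the exact class shape and any other constructor dependencies are assumptions):

```csharp
// Sketch: run the async capability detection during tenant activation,
// which is already an async context, instead of blocking in the DI factory.
public sealed class MediaBlobContainerTenantEvents : ModularTenantEvents
{
    private readonly BlobFileStore _fileStore;

    public MediaBlobContainerTenantEvents(BlobFileStore fileStore)
    {
        _fileStore = fileStore;
    }

    // Invoked by the host at tenant startup; no GetAwaiter().GetResult() needed.
    public override Task ActivatingAsync()
        => _fileStore.EnsureCapabilitiesAsync();
}
```

Because ActivatingAsync is awaited by the tenant pipeline, the capability probe completes before the store handles requests, without sync-over-async in the singleton factory.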
```csharp
private readonly ILogger<FileSystemStore> _logger;
private readonly string _fileSystemPath;

public IFileStoreCapabilities Capabilities { get; } = new FileStoreCapabilities(hasHierarchicalNamespace: false, supportsAtomicMove: false);
```
I'm changing this configuration to report that the local file storage has a hierarchical namespace and supports atomic moves. It technically should not break any custom implementations.
…lesystem behavior

FileSystemStore was configured with hasHierarchicalNamespace: false and supportsAtomicMove: false, which does not match its actual implementation. The local filesystem uses real Directory.* APIs for first-class directory operations, and File.Move() maps to an atomic OS rename syscall. Updated both flags to true to align with the implementation and to be consistent with how BlobFileStore Gen2 reports the same native behaviors.
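The corrected declaration would then look like this (a sketch mirroring the property shown in the diff above, with only the flag values changed):

```csharp
// Local filesystems have first-class directories, and File.Move() is an
// atomic rename on the same volume, so both capabilities are reported as true.
public IFileStoreCapabilities Capabilities { get; } =
    new FileStoreCapabilities(hasHierarchicalNamespace: true, supportsAtomicMove: true);
```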
…tification
Add a StorageName default interface member to IFileStore that returns a
human-readable name for the underlying storage backend. Implementations:
FileSystemStore ("Local"), BlobFileStore ("Azure Blob (Gen1)"/"Azure Blob (Gen2)"
based on detected HNS), AwsFileStore ("Amazon S3"). DefaultMediaFileStore
delegates to the inner store. This enables the media UI to display which storage
provider is active at runtime, including distinguishing Azure Gen1 from Gen2.
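A sketch of what these members might look like (hedged; the shapes are inferred from the commit message, not copied from the PR, and the "Unknown" fallback value is an assumption):

```csharp
// Default interface member on IFileStore: a human-readable backend name.
string StorageName => "Unknown";

// BlobFileStore can then distinguish Gen1 from Gen2 via the detected capabilities:
public string StorageName =>
    Capabilities.HasHierarchicalNamespace ? "Azure Blob (Gen2)" : "Azure Blob (Gen1)";
```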
…tegration project

Extract the BlobFileStore Gen1/Gen2 tests from OrchardCore.Tests into a new OrchardCore.Tests.Integration project. This separates integration tests that require external services (Azurite) from unit tests, avoiding unnecessary Docker container startup and extra execution time on every PR commit.

The new project is built and run by a dedicated integration_tests.yml workflow that only triggers on review submission, push to main/release, or manual dispatch, matching the same pattern as functional_all_db.yml. Removed Azurite setup from pr_ci.yml and main_ci.yml.
This pull request has merge conflicts. Please resolve those before requesting a review.
@Piedone Same here: moved everything to a new project and also a new GH workflow.
This pull request has merge conflicts. Please resolve those before requesting a review.
```
# Conflicts:
# .github/workflows/integration_tests.yml
# test/OrchardCore.Tests.Integration/OrchardCore.Tests.Integration.csproj
```
As we discussed in the latest steering meeting about utilizing Testcontainers, this is closed in favor of #19162.
Storage Capabilities in the Vue 3 Media App
The Vue app fetches capabilities, but only for display

The frontend calls a dedicated GetCapabilities API endpoint exposed by MediaApiController. The Vue app retrieves this via FileDataService.getCapabilities(), and a useStoragePopover composable shows it in a UI popover (storage provider name, quota, capabilities). However, it does not use the flags to conditionally change UX behavior; there is no "if supportsAtomicMove then show rename, else show copy" logic.

All operations go through separate, unified API endpoints that adapt server-side

The Vue app calls the same endpoints regardless of storage backend:

- POST /api/media/MoveMedia: RenameAsync or copy-then-delete
- POST /api/media/CopyMedia: File.Copy
- POST /api/media/MoveMediaList
- POST /api/media/CreateFolder
- POST /api/media/DeleteFolder
- POST /api/media/DeleteMedia
- POST /api/media/DeleteMediaList

Conclusion

The Vue app uses a single set of API endpoints; it does not branch its UX based on capabilities. All the adaptation happens server-side in DefaultMediaFileStore, which delegates to the underlying IFileStore implementation. The capabilities are surfaced to the user purely as informational (shown in a storage popover), not to drive conditional frontend behavior.

This is the right pattern: the frontend stays simple and the storage abstraction handles the differences transparently.
TODO