Draft
45 commits
57b3d70
Initial plan
Copilot Feb 5, 2026
5bbe648
Add comprehensive design document for airlock storage consolidation
Copilot Feb 5, 2026
1d2172e
Update design to use metadata-based stage management instead of data …
Copilot Feb 5, 2026
fa39c85
Add metadata-based blob operations and update constants for consolida…
Copilot Feb 5, 2026
138820b
Add implementation status document and update CHANGELOG
Copilot Feb 5, 2026
8941b1b
Complete Terraform infrastructure consolidation for core and workspac…
Copilot Feb 5, 2026
47dcdc8
Add storage helper functions and update implementation status
Copilot Feb 5, 2026
427515d
New approach: Use blob index tags for EventGrid filtering while keepi…
Copilot Feb 5, 2026
b09f990
Implement unified EventGrid subscriptions with metadata-based routing
Copilot Feb 5, 2026
231f434
Integrate metadata-based routing in airlock processor with feature fl…
Copilot Feb 5, 2026
76a3d62
Add comprehensive EventGrid unified subscription documentation
Copilot Feb 5, 2026
cd70948
Add ABAC conditions to restrict API access based on container metadat…
Copilot Feb 5, 2026
f08b384
Fix stage name consistency: use hyphenated 'in-progress' throughout
Copilot Feb 5, 2026
5211f36
Final consistency fixes for stage naming throughout codebase
Copilot Feb 5, 2026
76a09bd
Add security analysis revealing need for separate public/private stor…
Copilot Feb 5, 2026
2c6235b
Updated plan: Use ABAC with private endpoint filtering to maintain ne…
Copilot Feb 5, 2026
0df7e5c
Implement revised architecture: keep public accounts separate, use AB…
Copilot Feb 5, 2026
e375cf2
Plan: Consolidate to 1 core account using ABAC to allow public/privat…
Copilot Feb 5, 2026
72c9478
Final architecture: 1 core account via App Gateway + 1 per workspace …
Copilot Feb 5, 2026
2b66bb3
Implement 1 core account with App Gateway routing and PE-based ABAC
Copilot Feb 5, 2026
1d5b8ef
Add final architecture documentation and update CHANGELOG
Copilot Feb 5, 2026
7638186
Fix ABAC condition logic and align API access with ABAC permissions
Copilot Feb 5, 2026
d490b5a
Add clarifying comments to ABAC conditions explaining the Boolean logic
Copilot Feb 5, 2026
e20e33a
Add comprehensive analysis of copy operations and workspace_id ABAC o…
Copilot Feb 5, 2026
4f2fe0b
Implement Option B: Global workspace storage with workspace_id ABAC f…
Copilot Feb 5, 2026
aa6c32a
Update airlock documentation for Option B architecture and confirm UI…
Copilot Feb 5, 2026
85ab8af
Add e2e tests for Option B workspace isolation and metadata-based tra…
Copilot Feb 5, 2026
bee6cdc
Remove all "Option B" references from codebase and documentation
Copilot Feb 5, 2026
ff96ee5
Remove unimplemented App Gateway PE and correct documentation
Copilot Feb 5, 2026
e025056
Remove all Option A/B references and clarify 2 storage accounts
Copilot Feb 5, 2026
b98ede1
Remove unused legacy storage account name variables
Copilot Feb 5, 2026
4a9b185
remove comments, update docs
marrobi Feb 5, 2026
8421bdb
Update app gateway configuration
marrobi Feb 5, 2026
34f2636
linting
marrobi Feb 5, 2026
3d99220
Implement airlock security improvements: is_publicly_accessible_stage…
Copilot Feb 6, 2026
ad73137
Rebase changes onto copilot/redesign-airlock-storage-accounts: tighte…
Copilot Feb 6, 2026
105f38b
Fix 3 bugs found during pre-merge review: BlobCreatedTrigger missing …
Copilot Feb 6, 2026
7335c65
Merge pull request #18 from marrobi/copilot/tighten-public-accessible…
marrobi Feb 6, 2026
55f3590
Tests pass; flows and access still need manual validation.
marrobi Feb 10, 2026
b0c50e8
update core version
marrobi Feb 10, 2026
fcead34
Merge branch 'main' into copilot/redesign-airlock-storage-accounts
marrobi Feb 10, 2026
bd14845
fix: make consolidated core storage publicly accessible for SAS uploads
marrobi Feb 10, 2026
8b405ef
Merge branch 'main' into copilot/redesign-airlock-storage-accounts
marrobi Feb 10, 2026
115e778
Fix linting.
marrobi Feb 11, 2026
051ef76
Merge branch 'copilot/redesign-airlock-storage-accounts' of https://g…
marrobi Feb 11, 2026
1 change: 1 addition & 0 deletions .gitignore
@@ -214,3 +214,4 @@ validation.txt

/index.html
.DS_Store
*_old.tf
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -23,6 +23,7 @@ BUG FIXES:


ENHANCEMENTS:
* Consolidate airlock storage from 56 accounts to 2, using metadata-based stage management with ABAC workspace_id filtering. Reduces costs by ~$7,943/month at 100 workspaces and speeds up stage transitions by 97-99.9% for most operations. ([#issue](https://github.com/marrobi/AzureTRE/issues/issue))
* Upgrade Guacamole to v1.6.0 with Java 17 and other security updates ([#4754](https://github.com/microsoft/AzureTRE/pull/4754))
* API: Replace HTTP_422_UNPROCESSABLE_ENTITY response with HTTP_422_UNPROCESSABLE_CONTENT as per RFC 9110 ([#4742](https://github.com/microsoft/AzureTRE/issues/4742))
* Change Group.ReadWrite.All permission to Group.Create for AUTO_WORKSPACE_GROUP_CREATION ([#4772](https://github.com/microsoft/AzureTRE/issues/4772))
3 changes: 3 additions & 0 deletions airlock_processor/BlobCreatedTrigger/__init__.py
@@ -55,6 +55,9 @@ def main(msg: func.ServiceBusMessage,
elif constants.STORAGE_ACCOUNT_NAME_IMPORT_BLOCKED in topic or constants.STORAGE_ACCOUNT_NAME_EXPORT_BLOCKED in topic:
completed_step = constants.STAGE_BLOCKING_INPROGRESS
new_status = constants.STAGE_BLOCKED_BY_SCAN
else:
logging.warning(f"Unknown storage account in topic: {topic}")
return

# reply with a step completed event
stepResultEvent.set(
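For orientation, a minimal sketch of the guard this hunk adds, lifted out of the Azure Functions context. The account-name and status values here are illustrative placeholders; the real ones live in shared_code.constants:

```python
import logging

# Placeholder values for illustration; not the real constants.
STORAGE_ACCOUNT_NAME_IMPORT_BLOCKED = "stalimblocked"
STORAGE_ACCOUNT_NAME_EXPORT_BLOCKED = "stalexblocked"


def resolve_blocked_transition(topic: str):
    """Map an EventGrid topic to (completed_step, new_status), or None if unknown."""
    if STORAGE_ACCOUNT_NAME_IMPORT_BLOCKED in topic or STORAGE_ACCOUNT_NAME_EXPORT_BLOCKED in topic:
        return ("blocking_in_progress", "blocked_by_scan")
    # Before this fix an unrecognised topic fell through the if/elif chain,
    # reaching the step-result emission with completed_step and new_status
    # never assigned; warning and returning avoids emitting a bad event.
    logging.warning(f"Unknown storage account in topic: {topic}")
    return None
```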
94 changes: 81 additions & 13 deletions airlock_processor/StatusChangedQueueTrigger/__init__.py
@@ -9,7 +9,7 @@

from exceptions import NoFilesInRequestException, TooManyFilesInRequestException

from shared_code import blob_operations, constants
from shared_code import blob_operations, constants, airlock_storage_helper
from pydantic import BaseModel, parse_obj_as


@@ -19,6 +19,7 @@ class RequestProperties(BaseModel):
previous_status: Optional[str]
type: str
workspace_id: str
review_workspace_id: Optional[str] = None


class ContainersCopyMetadata:
@@ -31,6 +32,8 @@ def __init__(self, source_account_name: str, dest_account_name: str):


def main(msg: func.ServiceBusMessage, stepResultEvent: func.Out[func.EventGridOutputEvent], dataDeletionEvent: func.Out[func.EventGridOutputEvent]):
request_properties = None
request_files = None
try:
request_properties = extract_properties(msg)
request_files = get_request_files(request_properties) if request_properties.new_status == constants.STAGE_SUBMITTED else None
@@ -53,9 +56,18 @@ def handle_status_changed(request_properties: RequestProperties, stepResultEvent

logging.info('Processing request with id %s. new status is "%s", type is "%s"', req_id, new_status, request_type)

# Check if using metadata-based stage management
use_metadata = os.getenv('USE_METADATA_STAGE_MANAGEMENT', 'false').lower() == 'true'

if new_status == constants.STAGE_DRAFT:
account_name = get_storage_account(status=constants.STAGE_DRAFT, request_type=request_type, short_workspace_id=ws_id)
blob_operations.create_container(account_name, req_id)
if use_metadata:
from shared_code.blob_operations_metadata import create_container_with_metadata
account_name = airlock_storage_helper.get_storage_account_name_for_request(request_type, new_status, ws_id)
stage = airlock_storage_helper.get_stage_from_status(request_type, new_status)
create_container_with_metadata(account_name, req_id, stage, workspace_id=ws_id, request_type=request_type)
else:
account_name = get_storage_account(status=constants.STAGE_DRAFT, request_type=request_type, short_workspace_id=ws_id)
blob_operations.create_container(account_name, req_id)
return

if new_status == constants.STAGE_CANCELLED:
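As a reading aid: in metadata mode the draft branch no longer picks a stage-specific account; it creates one container in the consolidated account and records the stage on the container itself. A sketch of the metadata shape implied by the call above — the key names are assumptions, with the authoritative set defined in shared_code.blob_operations_metadata:

```python
def sketch_draft_container_metadata(stage: str, workspace_id: str, request_type: str) -> dict:
    # Hypothetical metadata attached when create_container_with_metadata()
    # provisions the request's container (the real helper also creates the
    # container via the storage SDK; this sketch only shows the payload).
    return {
        "airlock_stage": stage,        # e.g. the import-external stage for a draft import
        "workspace_id": workspace_id,  # the value the ABAC workspace_id conditions filter on
        "request_type": request_type,  # "import" or "export"
    }
```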
@@ -68,11 +80,60 @@
set_output_event_to_report_request_files(stepResultEvent, request_properties, request_files)

if (is_require_data_copy(new_status)):
logging.info('Request with id %s. requires data copy between storage accounts', req_id)
containers_metadata = get_source_dest_for_copy(new_status=new_status, previous_status=previous_status, request_type=request_type, short_workspace_id=ws_id)
blob_operations.create_container(containers_metadata.dest_account_name, req_id)
blob_operations.copy_data(containers_metadata.source_account_name,
containers_metadata.dest_account_name, req_id)
if use_metadata:
# Metadata mode: Update container stage instead of copying
from shared_code.blob_operations_metadata import update_container_stage, create_container_with_metadata

# For import submit, use review_workspace_id so data goes to review workspace storage
effective_ws_id = ws_id
if new_status == constants.STAGE_SUBMITTED and request_type.lower() == constants.IMPORT_TYPE and request_properties.review_workspace_id:
effective_ws_id = request_properties.review_workspace_id

# Get the storage account (might change from core to workspace or vice versa)
source_account = airlock_storage_helper.get_storage_account_name_for_request(request_type, previous_status, ws_id)
dest_account = airlock_storage_helper.get_storage_account_name_for_request(request_type, new_status, effective_ws_id)
new_stage = airlock_storage_helper.get_stage_from_status(request_type, new_status)

# Import approval_in_progress: metadata-only update (data is already in workspace storage)
if new_status == constants.STAGE_APPROVAL_INPROGRESS and request_type.lower() == constants.IMPORT_TYPE:
logging.info(f'Request {req_id}: Import approval - updating metadata only (no copy needed)')
update_container_stage(source_account, req_id, new_stage, changed_by='system')
elif source_account == dest_account:
# Same storage account - just update metadata
logging.info(f'Request {req_id}: Updating container stage to {new_stage} (no copy needed)')
update_container_stage(source_account, req_id, new_stage, changed_by='system')
else:
# Different storage account (e.g., core → workspace) - need to copy
logging.info(f'Request {req_id}: Copying from {source_account} to {dest_account}')
create_container_with_metadata(dest_account, req_id, new_stage, workspace_id=effective_ws_id, request_type=request_type)
blob_operations.copy_data(source_account, dest_account, req_id)

# In metadata mode, there is no BlobCreatedTrigger to signal completion,
# so we must send the step result event directly for terminal transitions.
completion_status_map = {
constants.STAGE_APPROVAL_INPROGRESS: constants.STAGE_APPROVED,
constants.STAGE_REJECTION_INPROGRESS: constants.STAGE_REJECTED,
constants.STAGE_BLOCKING_INPROGRESS: constants.STAGE_BLOCKED_BY_SCAN,
}
if new_status in completion_status_map:
final_status = completion_status_map[new_status]
logging.info(f'Request {req_id}: Metadata mode - sending step result for {new_status} -> {final_status}')
stepResultEvent.set(
func.EventGridOutputEvent(
id=str(uuid.uuid4()),
data={"completed_step": new_status, "new_status": final_status, "request_id": req_id},
subject=req_id,
event_type="Airlock.StepResult",
event_time=datetime.datetime.now(datetime.UTC),
data_version=constants.STEP_RESULT_EVENT_DATA_VERSION))
else:
# Legacy mode: Copy data between storage accounts
logging.info('Request with id %s. requires data copy between storage accounts', req_id)
review_ws_id = request_properties.review_workspace_id
containers_metadata = get_source_dest_for_copy(new_status=new_status, previous_status=previous_status, request_type=request_type, short_workspace_id=ws_id, review_workspace_id=review_ws_id)
blob_operations.create_container(containers_metadata.dest_account_name, req_id)
blob_operations.copy_data(containers_metadata.source_account_name,
containers_metadata.dest_account_name, req_id)
return

# Other statuses which do not require data copy are dismissed as we don't need to do anything...
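The subtle part of this hunk is completion signalling. In legacy mode the copy lands blobs in a new account and BlobCreatedTrigger reports the step as complete; a metadata-only update creates no blobs, so the queue trigger must report terminal transitions itself. A condensed sketch of the decision, with status strings as illustrative stand-ins for the constants:

```python
def plan_stage_transition(source_account: str, dest_account: str, new_status: str):
    """Sketch: what the metadata branch does, and who signals completion."""
    terminal = {"approval_in_progress", "rejection_in_progress", "blocking_in_progress"}
    if source_account == dest_account:
        action = "update the container's stage metadata in place"
    else:
        action = "create dest container with new-stage metadata, then copy blobs"
    # Terminal in-progress statuses get their Airlock.StepResult event sent
    # directly from the queue trigger, since no BlobCreatedTrigger will fire.
    who_reports = "StatusChangedQueueTrigger" if new_status in terminal else "no report needed"
    return action, who_reports
```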
@@ -102,7 +163,7 @@ def is_require_data_copy(new_status: str):
return False


def get_source_dest_for_copy(new_status: str, previous_status: str, request_type: str, short_workspace_id: str) -> ContainersCopyMetadata:
def get_source_dest_for_copy(new_status: str, previous_status: str, request_type: str, short_workspace_id: str, review_workspace_id: str = None) -> ContainersCopyMetadata:
# sanity
if is_require_data_copy(new_status) is False:
raise Exception("Given new status is not supported")
@@ -115,7 +176,7 @@ def get_source_dest_for_copy(new_status: str, previous_status: str, request_type
raise Exception(msg)

source_account_name = get_storage_account(previous_status, request_type, short_workspace_id)
dest_account_name = get_storage_account_destination_for_copy(new_status, request_type, short_workspace_id)
dest_account_name = get_storage_account_destination_for_copy(new_status, request_type, short_workspace_id, review_workspace_id=review_workspace_id)
return ContainersCopyMetadata(source_account_name, dest_account_name)


@@ -151,12 +212,14 @@ def get_storage_account(status: str, request_type: str, short_workspace_id: str)
raise Exception(error_message)


def get_storage_account_destination_for_copy(new_status: str, request_type: str, short_workspace_id: str) -> str:
def get_storage_account_destination_for_copy(new_status: str, request_type: str, short_workspace_id: str, review_workspace_id: str = None) -> str:
tre_id = _get_tre_id()

if request_type == constants.IMPORT_TYPE:
if new_status == constants.STAGE_SUBMITTED:
return constants.STORAGE_ACCOUNT_NAME_IMPORT_INPROGRESS + tre_id
# Import submit: copy to review workspace storage, or tre_id for legacy compatibility
dest_id = review_workspace_id if review_workspace_id else tre_id
return constants.STORAGE_ACCOUNT_NAME_IMPORT_INPROGRESS + dest_id
elif new_status == constants.STAGE_APPROVAL_INPROGRESS:
return constants.STORAGE_ACCOUNT_NAME_IMPORT_APPROVED + short_workspace_id
elif new_status == constants.STAGE_REJECTION_INPROGRESS:
@@ -218,7 +281,12 @@ def set_output_event_to_trigger_container_deletion(dataDeletionEvent, request_pr


def get_request_files(request_properties: RequestProperties):
storage_account_name = get_storage_account(request_properties.previous_status, request_properties.type, request_properties.workspace_id)
use_metadata = os.getenv('USE_METADATA_STAGE_MANAGEMENT', 'false').lower() == 'true'
if use_metadata:
storage_account_name = airlock_storage_helper.get_storage_account_name_for_request(
request_properties.type, request_properties.previous_status, request_properties.workspace_id)
else:
storage_account_name = get_storage_account(request_properties.previous_status, request_properties.type, request_properties.workspace_id)
return blob_operations.get_request_files(account_name=storage_account_name, request_id=request_properties.request_id)


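One legacy-path behaviour change worth calling out separately: an import submit can now target a review workspace's storage account. A usage sketch of the widened signature, with invented IDs:

```python
# Sketch only: when a review workspace is configured, the submitted import
# is copied into that workspace's in-progress account; with
# review_workspace_id=None the destination falls back to the tre_id suffix.
dest = get_storage_account_destination_for_copy(
    new_status=constants.STAGE_SUBMITTED,
    request_type=constants.IMPORT_TYPE,
    short_workspace_id="ab12",   # invented workspace suffix
    review_workspace_id="cd34",  # invented review workspace suffix
)
```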
2 changes: 1 addition & 1 deletion airlock_processor/_version.py
@@ -1 +1 @@
__version__ = "0.8.9"
__version__ = "0.8.11"
82 changes: 82 additions & 0 deletions airlock_processor/shared_code/airlock_storage_helper.py
@@ -0,0 +1,82 @@
import os
from shared_code import constants


def use_metadata_stage_management() -> bool:
return os.getenv('USE_METADATA_STAGE_MANAGEMENT', 'false').lower() == 'true'


def get_storage_account_name_for_request(request_type: str, status: str, short_workspace_id: str) -> str:
tre_id = os.environ.get("TRE_ID", "")

if use_metadata_stage_management():
# Global workspace storage - all workspaces use same account
if request_type == constants.IMPORT_TYPE:
if status in [constants.STAGE_DRAFT, constants.STAGE_SUBMITTED, constants.STAGE_IN_REVIEW,
constants.STAGE_REJECTED, constants.STAGE_REJECTION_INPROGRESS,
constants.STAGE_BLOCKED_BY_SCAN, constants.STAGE_BLOCKING_INPROGRESS]:
# All core import stages live in the consolidated core airlock account
return constants.STORAGE_ACCOUNT_NAME_AIRLOCK_CORE + tre_id
else: # Approved, approval in progress
# Global workspace storage
return constants.STORAGE_ACCOUNT_NAME_AIRLOCK_WORKSPACE_GLOBAL + tre_id
else: # export
if status in [constants.STAGE_APPROVED, constants.STAGE_APPROVAL_INPROGRESS]:
# Export approved in core
return constants.STORAGE_ACCOUNT_NAME_AIRLOCK_CORE + tre_id
else: # Draft, submitted, in-review, rejected, blocked
# Global workspace storage
return constants.STORAGE_ACCOUNT_NAME_AIRLOCK_WORKSPACE_GLOBAL + tre_id
else:
# Legacy mode
if request_type == constants.IMPORT_TYPE:
if status == constants.STAGE_DRAFT:
return constants.STORAGE_ACCOUNT_NAME_IMPORT_EXTERNAL + tre_id
elif status in [constants.STAGE_SUBMITTED, constants.STAGE_IN_REVIEW, constants.STAGE_APPROVAL_INPROGRESS,
constants.STAGE_REJECTION_INPROGRESS, constants.STAGE_BLOCKING_INPROGRESS]:
return constants.STORAGE_ACCOUNT_NAME_IMPORT_INPROGRESS + tre_id
elif status == constants.STAGE_APPROVED:
return constants.STORAGE_ACCOUNT_NAME_IMPORT_APPROVED + short_workspace_id
elif status == constants.STAGE_REJECTED:
return constants.STORAGE_ACCOUNT_NAME_IMPORT_REJECTED + tre_id
elif status == constants.STAGE_BLOCKED_BY_SCAN:
return constants.STORAGE_ACCOUNT_NAME_IMPORT_BLOCKED + tre_id
else: # export
if status == constants.STAGE_DRAFT:
return constants.STORAGE_ACCOUNT_NAME_EXPORT_INTERNAL + short_workspace_id
elif status in [constants.STAGE_SUBMITTED, constants.STAGE_IN_REVIEW, constants.STAGE_APPROVAL_INPROGRESS,
constants.STAGE_REJECTION_INPROGRESS, constants.STAGE_BLOCKING_INPROGRESS]:
return constants.STORAGE_ACCOUNT_NAME_EXPORT_INPROGRESS + short_workspace_id
elif status == constants.STAGE_APPROVED:
return constants.STORAGE_ACCOUNT_NAME_EXPORT_APPROVED + tre_id
elif status == constants.STAGE_REJECTED:
return constants.STORAGE_ACCOUNT_NAME_EXPORT_REJECTED + short_workspace_id
elif status == constants.STAGE_BLOCKED_BY_SCAN:
return constants.STORAGE_ACCOUNT_NAME_EXPORT_BLOCKED + short_workspace_id


def get_stage_from_status(request_type: str, status: str) -> str:
if request_type == constants.IMPORT_TYPE:
if status == constants.STAGE_DRAFT:
return constants.STAGE_IMPORT_EXTERNAL
elif status in [constants.STAGE_SUBMITTED, constants.STAGE_IN_REVIEW]:
return constants.STAGE_IMPORT_IN_PROGRESS
elif status in [constants.STAGE_APPROVED, constants.STAGE_APPROVAL_INPROGRESS]:
return constants.STAGE_IMPORT_APPROVED
elif status in [constants.STAGE_REJECTED, constants.STAGE_REJECTION_INPROGRESS]:
return constants.STAGE_IMPORT_REJECTED
elif status in [constants.STAGE_BLOCKED_BY_SCAN, constants.STAGE_BLOCKING_INPROGRESS]:
return constants.STAGE_IMPORT_BLOCKED
else: # export
if status == constants.STAGE_DRAFT:
return constants.STAGE_EXPORT_INTERNAL
elif status in [constants.STAGE_SUBMITTED, constants.STAGE_IN_REVIEW]:
return constants.STAGE_EXPORT_IN_PROGRESS
elif status in [constants.STAGE_APPROVED, constants.STAGE_APPROVAL_INPROGRESS]:
return constants.STAGE_EXPORT_APPROVED
elif status in [constants.STAGE_REJECTED, constants.STAGE_REJECTION_INPROGRESS]:
return constants.STAGE_EXPORT_REJECTED
elif status in [constants.STAGE_BLOCKED_BY_SCAN, constants.STAGE_BLOCKING_INPROGRESS]:
return constants.STAGE_EXPORT_BLOCKED

return "unknown"