Symptom
As reported in IcM 294491919, even though az ad sp create-for-rbac has retry logic that waits for the newly created application to replicate/propagate before creating the service principal:

    for retry_time in range(0, _RETRY_TIMES):
        try:
            aad_sp = _create_service_principal(cmd.cli_ctx, app_id, resolve_app=False)
            break
        except Exception as ex:  # pylint: disable=broad-except
            err_msg = str(ex)
            if retry_time < _RETRY_TIMES and (
                    ' does not reference ' in err_msg or
                    ' does not exist ' in err_msg or
                    'service principal being created must in the local tenant' in err_msg):
                logger.warning("Creating service principal failed with error '%s'. Retrying: %s/%s",
                               err_msg, retry_time + 1, _RETRY_TIMES)
                time.sleep(5)
            else:
                logger.warning(
                    "Creating service principal failed for '%s'. Trace followed:\n%s",
                    app_id, ex.response.headers
                    if hasattr(ex, 'response') else ex)  # pylint: disable=no-member
                raise

It is still possible that replication has not completed after _RETRY_TIMES (currently 36) attempts, causing service principal creation to fail:
The appId 'c807447b-5118-4756-a6c5-90dbdf919d22' of the service principal does not reference a valid application object.
Possible solution
AAD has a mechanism called Read Write Consistency (RWC) Token to solve this:
Problems
- AAD has a very restrictive approach towards onboarding any team on RwcToken.
- Even if we use RwcToken in az ad sp create-for-rbac, it is difficult to apply RwcToken to separate command executions like az ad app create and az ad sp create. Azure CLI would have to save the RwcToken to a local file (because each execution is a separate Python process). How to save this token securely still needs to be decided.
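To make the second problem concrete, here is a minimal sketch of how one process could hand an opaque token to a later process through a per-user file. Everything in it is an assumption for illustration: the helper names save_token and load_token are hypothetical, the token is treated as an arbitrary string (the real RwcToken format and API are not described here), and restricting file permissions is only one possible answer to the open security question, not a decision.

```python
import os


def save_token(path, token):
    """Write an opaque token to `path`, readable and writable by the owner only.

    Hypothetical helper. On POSIX the file ends up with mode 0o600; on Windows
    os.chmod cannot express this, so other protection would be needed there.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(token)
    # A pre-existing file keeps its old mode, so tighten it explicitly.
    os.chmod(path, 0o600)


def load_token(path):
    """Return the saved token, or None if no earlier process saved one."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return None
```

In this sketch the process that creates the application would call save_token, and a later process creating the service principal would call load_token. The token is stored unencrypted, so this only protects against other local users, which may or may not be sufficient.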
Alternative solutions
Increase _RETRY_TIMES to a higher value, or use exponential backoff, to increase the maximum total retry time (currently 36 * 5 s = 180 s).
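The exponential backoff option could look roughly like the sketch below. It is not the azure-cli implementation: the names backoff_delays, is_replication_error and retry_with_backoff are hypothetical, and the 5 s base delay, factor of 2, 60 s cap and 600 s total budget are assumed values chosen only for illustration. The retryable error substrings are the ones from the retry loop quoted under Symptom.

```python
import time


def backoff_delays(base=5.0, factor=2.0, cap=60.0, max_total=600.0):
    """Yield sleep intervals that grow geometrically, capped per step,
    until their sum reaches max_total seconds."""
    total = 0.0
    delay = base
    while total < max_total:
        step = min(delay, cap, max_total - total)
        yield step
        total += step
        delay *= factor


def is_replication_error(ex):
    """True if the error text matches one of the messages the quoted loop retries on."""
    err_msg = str(ex)
    return (' does not reference ' in err_msg or
            ' does not exist ' in err_msg or
            'service principal being created must in the local tenant' in err_msg)


def retry_with_backoff(operation, is_retryable, delays, sleep=time.sleep):
    """Call operation() until it succeeds, fails with a non-retryable error,
    or the delays are used up."""
    for delay in delays:
        try:
            return operation()
        except Exception as ex:  # broad on purpose, like the quoted loop
            if not is_retryable(ex):
                raise
            sleep(delay)
    # Delay budget spent: make one final attempt and let any error propagate.
    return operation()
```

With these assumed defaults the waits are 5, 10, 20 and 40 s, then 60 s steps until the 600 s budget is used, so the total wait is more than three times the current 180 s while making 14 attempts instead of 36.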
The retry loop quoted under Symptom is taken from azure-cli/src/azure-cli/azure/cli/command_modules/role/custom.py, lines 1451 to 1469 in f8ea47c.