feature/SOF 7777 #294

Open

VsevolodX wants to merge 7 commits into main from feature/SOF-7777

Conversation

@VsevolodX VsevolodX (Member) commented Apr 10, 2026

  • update: option for polar
  • update: display file content

Summary by CodeRabbit

  • New Features

    • Added a valence band offset workflow for semiconductor interfaces: configurable DFT job setup, submission, automated result retrieval, and visualization of electrostatic potential profiles.
  • Documentation

    • Introduction updated to link to the new valence band offset workflow.
  • Bug Fixes

    • Improved multimaterial job creation compatibility.
    • Enhanced handling of file-backed job results by attaching signed URLs for easier access.

@review-notebook-app

Check out this pull request on ReviewNB to see visual diffs & provide feedback on Jupyter Notebooks.

@coderabbitai coderabbitai bot commented Apr 10, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5983cb68-b11e-4f8b-8c1f-fb8bcf312799

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

Adds a new Jupyter notebook that implements a valence band offset (VBO) DFT workflow, updates the Introduction notebook to link to it, and extends utils/api.py with a signed‑URL attachment helper plus multimaterial key/field compatibility fixes.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Introduction notebook: other/materials_designer/workflows/Introduction.ipynb | Replaced the placeholder with a markdown link to the new VBO workflow notebook; appended an empty code cell at the end. |
| New VBO workflow: other/materials_designer/workflows/valence_band_offset.ipynb | New comprehensive Jupyter notebook implementing the VBO pipeline: auth/API client setup, interface material split and upload, workflow selection/configuration (DFT model, k-points, cutoffs, polar post-process), job creation/submission, async polling, and result retrieval/visualization (including file handling for polar outputs). |
| API utilities: utils/api.py | Added attach_signed_url_to_file_property(client, job_id, file_property) to attach signed URLs to file properties; corrected the multimaterial key lookup to isMultiMaterial in create_job() and added a fallback _material field alongside _materials for compatibility. |

Sequence Diagram

sequenceDiagram
    participant User as User/Browser
    participant Auth as Auth System (OIDC)
    participant APIClient as Mat3ra API Client
    participant Platform as Platform/Storage
    participant DFT as DFT Job Engine

    User->>Auth: Obtain OIDC token
    Auth-->>APIClient: Return token
    APIClient->>Platform: Initialize client & select project/account
    APIClient->>Platform: Load interface from ../uploads
    APIClient->>Platform: Split interface into substrate/film and save materials
    APIClient->>Platform: Search & instantiate VBO workflow (Standata)
    APIClient->>Platform: Configure DFT model & subworkflows
    APIClient->>Platform: Create job with materials and workflow
    APIClient->>DFT: Submit job to compute cluster
    loop Polling
        APIClient->>DFT: Poll job status
    end
    DFT-->>Platform: Store results & file properties
    APIClient->>Platform: Retrieve VBO scalar and ESP profiles
    APIClient->>Platform: attach_signed_url_to_file_property(...)
    Platform-->>User: Provide visualizations and signed file access
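The "loop Polling" step in the diagram can be sketched as a simple status poll with a timeout. This is a minimal illustration, assuming a `client.jobs.get(job_id)` call returning a dict with a `status` key and these terminal status names; the real Mat3ra client API may differ.

```python
import time

# Minimal polling sketch for the diagram's "loop Polling" step.
# client.jobs.get(...) and the terminal status names are assumptions
# about the API client, used here for illustration only.
def wait_for_job(client, job_id, poll_every=10.0, timeout=3600.0):
    """Poll job status until it reaches a terminal state or times out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = client.jobs.get(job_id)["status"]
        if status in ("finished", "error", "terminated"):
            return status
        time.sleep(poll_every)
    raise TimeoutError(f"Job {job_id} still running after {timeout}s")
```

Returning the terminal status (rather than raising on "error") lets the notebook decide how to report failed runs.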

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • pranabdas
  • timurbazhirov

"🐰
A rabbit hops through code and light,
Linking notebooks, making VBOs bright,
Signed URLs tucked in a file,
Materials sorted with a smile,
DFT dreams take flight tonight."

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ⚠️ Warning | The title 'feature/SOF 7777' is a branch name/reference rather than a clear summary of the primary changes in the pull request. | Replace it with a descriptive summary of the main change, e.g., 'Add valence band offset workflow notebook and API utilities' or 'Introduce VBO calculation workflow with polar post-processing support'. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%. |
| Description Check | ✅ Passed | Check skipped: CodeRabbit’s high-level summary is enabled. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (2)
other/materials_designer/workflows/Introduction.ipynb (1)

102-109: Remove the trailing empty code cell to keep the notebook clean.

Lines 102–109 add a no-op cell with empty source/output; it is safe to drop.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@other/materials_designer/workflows/Introduction.ipynb` around lines 102 -
109, Remove the trailing empty code cell from the notebook JSON: locate the cell
object with "id": "1" (execution_count: null, empty "source" and "outputs") and
delete that entire cell entry so the notebook no longer contains the no-op empty
code cell.
other/materials_designer/workflows/valence_band_offset.ipynb (1)

320-327: Prefer material reuse to avoid duplicate writes on reruns.

Line 325 always creates new materials. In notebook workflows, reruns will keep duplicating records.

♻️ Proposed refactor using existing helper
+from utils.api import get_or_create_material
 from mat3ra.made.material import Material

 saved_materials = {}
 for role, material in materials_by_role.items():
     material_config = material.to_dict()
     material_config["name"] = material.name
     existing_tags = material_config.get("tags") or []
     material_config["tags"] = sorted(set([*existing_tags, interface_system_name]))
-    saved = Material.create(client.materials.create(material_config, owner_id=ACCOUNT_ID))
+    saved_dict = get_or_create_material(client, Material.create(material_config), owner_id=ACCOUNT_ID)
+    saved = Material.create(saved_dict)
     saved_materials[role] = saved
     print(f"  {role}: {saved.name} ({saved.id}) | tags={material_config['tags']}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@other/materials_designer/workflows/valence_band_offset.ipynb` around lines
320 - 327, The code always calls Material.create(client.materials.create(...))
inside the materials_by_role loop which creates duplicate materials on reruns;
instead, before calling client.materials.create, query for an existing material
(e.g., via client.materials.list or your project's helper for finding materials
by name or tag) using material.name or interface_system_name, and if found use
Material.create(existing) (or update it) and assign that to
saved_materials[role]; only call client.materials.create when no existing
material is found, and ensure you still set/merge tags (material_config["tags"])
and print the reused vs created material accordingly.
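The reuse pattern this comment asks for can be sketched as a lookup-then-create helper, similar to the `get_or_create_material` the refactor imports. This is a hedged sketch only: the query shape (`{"name": ..., "owner._id": ...}`) and the `materials.list`/`materials.create` endpoint signatures are assumptions about the client, not the confirmed utils.api implementation.

```python
# Hypothetical get_or_create_material sketch, taking a material config dict.
# Endpoint names and the query shape are assumptions for illustration.
def get_or_create_material(client, material_config, owner_id):
    """Return an existing material with the same name, or create one."""
    existing = client.materials.list(
        {"name": material_config["name"], "owner._id": owner_id}
    )
    if existing:
        return existing[0]  # reuse on rerun instead of duplicating records
    return client.materials.create(material_config, owner_id=owner_id)
```

Matching on name alone is the simplest idempotency key; matching on the `interface_system_name` tag instead would tolerate renamed materials.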

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8dbf1413-483f-4830-a2ca-8178dacfa839

📥 Commits

Reviewing files that changed from the base of the PR and between f13e84f and 741c4ea.

📒 Files selected for processing (3)
  • other/materials_designer/workflows/Introduction.ipynb
  • other/materials_designer/workflows/valence_band_offset.ipynb
  • utils/api.py

Comment on lines +232 to +234
"projects = client.projects.list({\"isDefault\": True, \"owner._id\": ACCOUNT_ID})\n",
"project_id = projects[0][\"_id\"]\n",
"print(f\"✅ Using project: {projects[0]['name']} ({project_id})\")\n"


⚠️ Potential issue | 🟠 Major

Guard against missing default project before indexing.

Line 233 assumes at least one project is returned and can raise IndexError for accounts where the default-project filter matches nothing.

🔧 Proposed fix
 projects = client.projects.list({"isDefault": True, "owner._id": ACCOUNT_ID})
-project_id = projects[0]["_id"]
-print(f"✅ Using project: {projects[0]['name']} ({project_id})")
+if not projects:
+    raise ValueError(f"No default project found for account {ACCOUNT_ID}")
+project_id = projects[0]["_id"]
+print(f"✅ Using project: {projects[0]['name']} ({project_id})")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"projects = client.projects.list({\"isDefault\": True, \"owner._id\": ACCOUNT_ID})\n",
"project_id = projects[0][\"_id\"]\n",
"print(f\"✅ Using project: {projects[0]['name']} ({project_id})\")\n"
projects = client.projects.list({"isDefault": True, "owner._id": ACCOUNT_ID})
if not projects:
    raise ValueError(f"No default project found for account {ACCOUNT_ID}")
project_id = projects[0]["_id"]
print(f"✅ Using project: {projects[0]['name']} ({project_id})")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@other/materials_designer/workflows/valence_band_offset.ipynb` around lines
232 - 234, The code assumes client.projects.list(...) returns at least one
project and directly indexes projects[0], which can raise IndexError for
accounts without a default project; update the block that calls
client.projects.list and uses project_id to first check that the returned
projects list is non-empty (e.g., inspect the variable projects), and if empty
either raise a clear descriptive error or handle the case (create/select a
project) before accessing projects[0]; ensure the log/exception includes
ACCOUNT_ID and that you only set project_id from projects[0]["_id"] after
confirming projects has at least one element.

Comment on lines +526 to +530
"if CLUSTER_NAME:\n",
" cluster = next((c for c in clusters if CLUSTER_NAME in c[\"hostname\"]), None)\n",
"else:\n",
" cluster = clusters[0]\n",
"\n",


⚠️ Potential issue | 🟠 Major

Handle missing or unmatched cluster selection explicitly.

Lines 529 and 531 can fail, or pass None into the compute config, when no clusters exist or CLUSTER_NAME doesn’t match any hostname.

🔧 Proposed fix
 if CLUSTER_NAME:
     cluster = next((c for c in clusters if CLUSTER_NAME in c["hostname"]), None)
 else:
-    cluster = clusters[0]
+    cluster = clusters[0] if clusters else None
+
+if cluster is None:
+    raise ValueError(
+        f"No cluster selected. Available={ [c['hostname'] for c in clusters] }, requested={CLUSTER_NAME}"
+    )
📝 Committable suggestion


Suggested change
"if CLUSTER_NAME:\n",
" cluster = next((c for c in clusters if CLUSTER_NAME in c[\"hostname\"]), None)\n",
"else:\n",
" cluster = clusters[0]\n",
"\n",
if CLUSTER_NAME:
    cluster = next((c for c in clusters if CLUSTER_NAME in c["hostname"]), None)
else:
    cluster = clusters[0] if clusters else None

if cluster is None:
    raise ValueError(
        f"No cluster selected. Available={[c['hostname'] for c in clusters]}, requested={CLUSTER_NAME}"
    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@other/materials_designer/workflows/valence_band_offset.ipynb` around lines
526 - 530, The current selection for cluster using CLUSTER_NAME and clusters can
yield cluster=None or fail when clusters is empty; update the logic around the
cluster assignment (the CLUSTER_NAME check and the "cluster = ..." lines that
set cluster from clusters) to explicitly handle missing or unmatched clusters by
first verifying clusters is non-empty, then attempt to find a match, and if not
found either select a safe default (e.g., the first cluster) or raise a clear
error before passing cluster into the compute config so compute configuration
never receives None.

Comment on lines +639 to +669
"vbo_value = client.properties.get_for_job(\n",
" job_id,\n",
" property_name=PropertyName.scalar.valence_band_offset.value,\n",
")[0]\n",
"print(f\"Valence Band Offset (VBO) value: {vbo_value['value']:.3f} eV\")\n",
"\n",
"avg_esp_unit_ids = {}\n",
"polar_file_unit_id = None\n",
"for subworkflow in workflow[\"subworkflows\"]:\n",
" subworkflow_name = subworkflow[\"name\"]\n",
" for unit in subworkflow[\"units\"]:\n",
" result_names = [result[\"name\"] for result in unit.get(\"results\", [])]\n",
" if \"average_potential_profile\" in result_names:\n",
" avg_esp_unit_ids[subworkflow_name] = unit[\"flowchartId\"]\n",
" if IS_POLAR and \"file_content\" in result_names:\n",
" polar_file_unit_id = unit[\"flowchartId\"]\n",
"\n",
"ordered_names = [\n",
" \"BS + Avg ESP (Interface)\",\n",
" \"BS + Avg ESP (interface left)\",\n",
" \"BS + Avg ESP (interface right)\",\n",
"]\n",
"\n",
"for subworkflow_name in ordered_names:\n",
" unit_id = avg_esp_unit_ids[subworkflow_name]\n",
" avg_esp_data = client.properties.get_for_job(\n",
" job_id,\n",
" property_name=\"average_potential_profile\",\n",
" unit_id=unit_id,\n",
" )[0]\n",
" visualize_properties(avg_esp_data, title=subworkflow_name)\n",


⚠️ Potential issue | 🟠 Major

Result extraction is brittle on missing properties or workflow name drift.

The direct [0] indexing on lines 639 and 664, and the strict key lookup on line 663, can raise exceptions when output data is partial or workflow names change slightly.

🔧 Proposed hardening
-vbo_value = client.properties.get_for_job(
+vbo_results = client.properties.get_for_job(
     job_id,
     property_name=PropertyName.scalar.valence_band_offset.value,
-)[0]
+)
+if not vbo_results:
+    raise ValueError("VBO property was not found for this job.")
+vbo_value = vbo_results[0]
 print(f"Valence Band Offset (VBO) value: {vbo_value['value']:.3f} eV")
@@
 for subworkflow_name in ordered_names:
-    unit_id = avg_esp_unit_ids[subworkflow_name]
-    avg_esp_data = client.properties.get_for_job(
+    unit_id = avg_esp_unit_ids.get(subworkflow_name)
+    if not unit_id:
+        print(f"⚠️ Skipping missing ESP unit for: {subworkflow_name}")
+        continue
+    avg_esp_results = client.properties.get_for_job(
         job_id,
         property_name="average_potential_profile",
         unit_id=unit_id,
-    )[0]
-    visualize_properties(avg_esp_data, title=subworkflow_name)
+    )
+    if not avg_esp_results:
+        print(f"⚠️ No ESP property for: {subworkflow_name}")
+        continue
+    visualize_properties(avg_esp_results[0], title=subworkflow_name)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@other/materials_designer/workflows/valence_band_offset.ipynb` around lines
639 - 669, The code assumes responses always exist and workflow names never
change: remove direct indexing and strict key lookups by checking returned lists
and dicts before using them (e.g., replace usages of
client.properties.get_for_job(...)[0] for vbo_value and avg_esp_data with a safe
check that the list is non-empty and handle empty case by logging/warning and
skipping); when mapping avg_esp_unit_ids and accessing by subworkflow_name from
ordered_names, use .get(subworkflow_name) and skip or warn if None to avoid
KeyError; add defensive handling around polar_file_unit_id and
visualize_properties so missing data is detected and the loop continues without
raising exceptions.

Comment on lines +275 to 276
is_multimaterial = job_workflow_dict.get("isMultiMaterial", False)



⚠️ Potential issue | 🟠 Major

Preserve backward compatibility for legacy workflow flag key.

Line 275 only checks isMultiMaterial. Workflows still using isMultimaterial will silently fall back to single-material job config.

🔧 Proposed backward-compatible fix
-    is_multimaterial = job_workflow_dict.get("isMultiMaterial", False)
+    is_multimaterial = job_workflow_dict.get(
+        "isMultiMaterial",
+        job_workflow_dict.get("isMultimaterial", False),
+    )
📝 Committable suggestion


Suggested change
    is_multimaterial = job_workflow_dict.get("isMultiMaterial", False)
    is_multimaterial = job_workflow_dict.get(
        "isMultiMaterial",
        job_workflow_dict.get("isMultimaterial", False),
    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@utils/api.py` around lines 275 - 276, The code currently sets
is_multimaterial = job_workflow_dict.get("isMultiMaterial", False) which misses
legacy workflows using the key "isMultimaterial"; update the assignment to check
both keys (prefer "isMultiMaterial" then fall back to "isMultimaterial",
defaulting to False) so the variable is_multimaterial reflects either flag in
job_workflow_dict and preserves backward compatibility.

@VsevolodX VsevolodX changed the base branch from main to feature/SOF-7879 April 10, 2026 21:26
}
)
workflow_dict["subworkflows"].append(_polar_postprocess_subworkflow(source_python_subworkflow))
return Workflow.create(workflow_dict)
Member


All of this should not be needed:

  • implement the python-native in notebook
  • update VBO property with API if needed

Base automatically changed from feature/SOF-7879 to main April 11, 2026 04:56