> **Important:** Review skipped. Auto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ run configuration.
📝 Walkthrough

Adds a new Jupyter notebook that implements a valence band offset (VBO) DFT workflow, updates the Introduction notebook to link to it, and extends `utils/api.py`.
Sequence Diagram

```mermaid
sequenceDiagram
    participant User as User/Browser
    participant Auth as Auth System (OIDC)
    participant APIClient as Mat3ra API Client
    participant Platform as Platform/Storage
    participant DFT as DFT Job Engine
    User->>Auth: Obtain OIDC token
    Auth-->>APIClient: Return token
    APIClient->>Platform: Initialize client & select project/account
    APIClient->>Platform: Load interface from ../uploads
    APIClient->>Platform: Split interface into substrate/film and save materials
    APIClient->>Platform: Search & instantiate VBO workflow (Standata)
    APIClient->>Platform: Configure DFT model & subworkflows
    APIClient->>Platform: Create job with materials and workflow
    APIClient->>DFT: Submit job to compute cluster
    loop Polling
        APIClient->>DFT: Poll job status
    end
    DFT-->>Platform: Store results & file properties
    APIClient->>Platform: Retrieve VBO scalar and ESP profiles
    APIClient->>Platform: attach_signed_url_to_file_property(...)
    Platform-->>User: Provide visualizations and signed file access
```
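The polling step in the diagram is the part most prone to hanging on failures; a minimal standalone sketch of such a loop is shown below. The `client.jobs.get` shape (a dict with a `"status"` key) and the terminal state names are assumptions for illustration, not the actual Mat3ra client API:

```python
import time

def wait_for_job(client, job_id, poll_interval=10.0, timeout=3600.0):
    """Poll a job until it reaches a terminal state.

    Sketch only: the client API shape (client.jobs.get returning a dict
    with a "status" key) and the terminal state names are assumptions.
    """
    terminal = {"finished", "error", "terminated"}
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = client.jobs.get(job_id)["status"]
        if status in terminal:
            return status
        time.sleep(poll_interval)
    raise TimeoutError(f"Job {job_id} did not reach a terminal state within {timeout}s")
```

A timeout keeps reruns of the notebook from blocking indefinitely when a job stalls on the cluster.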
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks: ✅ 2 passed, ❌ 1 failed (1 warning)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 4
🧹 Nitpick comments (2)
other/materials_designer/workflows/Introduction.ipynb (1)
102-109: Remove the trailing empty code cell to keep the notebook clean.

Lines 102–109 add a no-op cell with empty source/output; it is safe to drop.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@other/materials_designer/workflows/Introduction.ipynb` around lines 102 - 109: remove the trailing empty code cell from the notebook JSON — locate the cell object with "id": "1" (execution_count: null, empty "source" and "outputs") and delete that entire cell entry so the notebook no longer contains the no-op empty code cell.

other/materials_designer/workflows/valence_band_offset.ipynb (1)
320-327: Prefer material reuse to avoid duplicate writes on reruns.

Line 325 always creates new materials; in notebook workflows, reruns will keep duplicating records.
♻️ Proposed refactor using existing helper
```diff
+from utils.api import get_or_create_material
 from mat3ra.made.material import Material

 saved_materials = {}
 for role, material in materials_by_role.items():
     material_config = material.to_dict()
     material_config["name"] = material.name
     existing_tags = material_config.get("tags") or []
     material_config["tags"] = sorted(set([*existing_tags, interface_system_name]))
-    saved = Material.create(client.materials.create(material_config, owner_id=ACCOUNT_ID))
+    saved_dict = get_or_create_material(client, Material.create(material_config), owner_id=ACCOUNT_ID)
+    saved = Material.create(saved_dict)
     saved_materials[role] = saved
     print(f"  {role}: {saved.name} ({saved.id}) | tags={material_config['tags']}")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@other/materials_designer/workflows/valence_band_offset.ipynb` around lines 320 - 327, The code always calls Material.create(client.materials.create(...)) inside the materials_by_role loop which creates duplicate materials on reruns; instead, before calling client.materials.create, query for an existing material (e.g., via client.materials.list or your project's helper for finding materials by name or tag) using material.name or interface_system_name, and if found use Material.create(existing) (or update it) and assign that to saved_materials[role]; only call client.materials.create when no existing material is found, and ensure you still set/merge tags (material_config["tags"]) and print the reused vs created material accordingly.
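Since the refactor above leans on a `get_or_create_material` helper, here is a minimal standalone sketch of what such a helper could look like. The list-query filter shape and the `create` signature are assumptions about the client API (exercised below against a fake client, not the real platform), and it takes a plain config dict for simplicity:

```python
def get_or_create_material(client, material_config, owner_id):
    """Return an existing material with the same name, or create a new one.

    Sketch only: the list-query filter shape ({"name": ..., "owner._id": ...})
    is an assumption about the client API, not a confirmed signature.
    """
    existing = client.materials.list({"name": material_config["name"], "owner._id": owner_id})
    if existing:
        # Reuse the first match instead of writing a duplicate record.
        return existing[0]
    return client.materials.create(material_config, owner_id=owner_id)
```

With this shape, rerunning the notebook cell becomes idempotent: the second call returns the stored record instead of creating another copy.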
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@other/materials_designer/workflows/valence_band_offset.ipynb`:
- Around line 232-234: The code assumes client.projects.list(...) returns at
least one project and directly indexes projects[0], which can raise IndexError
for accounts without a default project; update the block that calls
client.projects.list and uses project_id to first check that the returned
projects list is non-empty (e.g., inspect the variable projects), and if empty
either raise a clear descriptive error or handle the case (create/select a
project) before accessing projects[0]; ensure the log/exception includes
ACCOUNT_ID and that you only set project_id from projects[0]["_id"] after
confirming projects has at least one element.
- Around line 526-530: The current selection for cluster using CLUSTER_NAME and
clusters can yield cluster=None or fail when clusters is empty; update the logic
around the cluster assignment (the CLUSTER_NAME check and the "cluster = ..."
lines that set cluster from clusters) to explicitly handle missing or unmatched
clusters by first verifying clusters is non-empty, then attempt to find a match,
and if not found either select a safe default (e.g., the first cluster) or raise
a clear error before passing cluster into the compute config so compute
configuration never receives None.
- Around line 639-669: The code assumes responses always exist and workflow
names never change: remove direct indexing and strict key lookups by checking
returned lists and dicts before using them (e.g., replace usages of
client.properties.get_for_job(...)[0] for vbo_value and avg_esp_data with a safe
check that the list is non-empty and handle empty case by logging/warning and
skipping); when mapping avg_esp_unit_ids and accessing by subworkflow_name from
ordered_names, use .get(subworkflow_name) and skip or warn if None to avoid
KeyError; add defensive handling around polar_file_unit_id and
visualize_properties so missing data is detected and the loop continues without
raising exceptions.
In `@utils/api.py`:
- Around line 275-276: The code currently sets is_multimaterial =
job_workflow_dict.get("isMultiMaterial", False) which misses legacy workflows
using the key "isMultimaterial"; update the assignment to check both keys
(prefer "isMultiMaterial" then fall back to "isMultimaterial", defaulting to
False) so the variable is_multimaterial reflects either flag in
job_workflow_dict and preserves backward compatibility.
---
Nitpick comments:
In `@other/materials_designer/workflows/Introduction.ipynb`:
- Around line 102-109: Remove the trailing empty code cell from the notebook
JSON: locate the cell object with "id": "1" (execution_count: null, empty
"source" and "outputs") and delete that entire cell entry so the notebook no
longer contains the no-op empty code cell.
In `@other/materials_designer/workflows/valence_band_offset.ipynb`:
- Around line 320-327: The code always calls
Material.create(client.materials.create(...)) inside the materials_by_role loop
which creates duplicate materials on reruns; instead, before calling
client.materials.create, query for an existing material (e.g., via
client.materials.list or your project's helper for finding materials by name or
tag) using material.name or interface_system_name, and if found use
Material.create(existing) (or update it) and assign that to
saved_materials[role]; only call client.materials.create when no existing
material is found, and ensure you still set/merge tags (material_config["tags"])
and print the reused vs created material accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8dbf1413-483f-4830-a2ca-8178dacfa839
📒 Files selected for processing (3)
- other/materials_designer/workflows/Introduction.ipynb
- other/materials_designer/workflows/valence_band_offset.ipynb
- utils/api.py
```
"projects = client.projects.list({\"isDefault\": True, \"owner._id\": ACCOUNT_ID})\n",
"project_id = projects[0][\"_id\"]\n",
"print(f\"✅ Using project: {projects[0]['name']} ({project_id})\")\n"
```
Guard against missing default project before indexing.
Line 233 assumes at least one project and can raise IndexError for accounts without a default project filter hit.
🔧 Proposed fix

```diff
 projects = client.projects.list({"isDefault": True, "owner._id": ACCOUNT_ID})
-project_id = projects[0]["_id"]
-print(f"✅ Using project: {projects[0]['name']} ({project_id})")
+if not projects:
+    raise ValueError(f"No default project found for account {ACCOUNT_ID}")
+project_id = projects[0]["_id"]
+print(f"✅ Using project: {projects[0]['name']} ({project_id})")
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
projects = client.projects.list({"isDefault": True, "owner._id": ACCOUNT_ID})
if not projects:
    raise ValueError(f"No default project found for account {ACCOUNT_ID}")
project_id = projects[0]["_id"]
print(f"✅ Using project: {projects[0]['name']} ({project_id})")
```
```
"if CLUSTER_NAME:\n",
"    cluster = next((c for c in clusters if CLUSTER_NAME in c[\"hostname\"]), None)\n",
"else:\n",
"    cluster = clusters[0]\n",
"\n",
```
Handle missing or unmatched cluster selection explicitly.
Line 529 and Line 531 can fail or pass None into compute config when no clusters exist or CLUSTER_NAME doesn’t match.
🔧 Proposed fix

```diff
 if CLUSTER_NAME:
     cluster = next((c for c in clusters if CLUSTER_NAME in c["hostname"]), None)
 else:
-    cluster = clusters[0]
+    cluster = clusters[0] if clusters else None
+
+if cluster is None:
+    raise ValueError(
+        f"No cluster selected. Available={[c['hostname'] for c in clusters]}, requested={CLUSTER_NAME}"
+    )
```

📝 Committable suggestion
```python
if CLUSTER_NAME:
    cluster = next((c for c in clusters if CLUSTER_NAME in c["hostname"]), None)
else:
    cluster = clusters[0] if clusters else None

if cluster is None:
    raise ValueError(
        f"No cluster selected. Available={[c['hostname'] for c in clusters]}, requested={CLUSTER_NAME}"
    )
```
```
"vbo_value = client.properties.get_for_job(\n",
"    job_id,\n",
"    property_name=PropertyName.scalar.valence_band_offset.value,\n",
")[0]\n",
"print(f\"Valence Band Offset (VBO) value: {vbo_value['value']:.3f} eV\")\n",
"\n",
"avg_esp_unit_ids = {}\n",
"polar_file_unit_id = None\n",
"for subworkflow in workflow[\"subworkflows\"]:\n",
"    subworkflow_name = subworkflow[\"name\"]\n",
"    for unit in subworkflow[\"units\"]:\n",
"        result_names = [result[\"name\"] for result in unit.get(\"results\", [])]\n",
"        if \"average_potential_profile\" in result_names:\n",
"            avg_esp_unit_ids[subworkflow_name] = unit[\"flowchartId\"]\n",
"        if IS_POLAR and \"file_content\" in result_names:\n",
"            polar_file_unit_id = unit[\"flowchartId\"]\n",
"\n",
"ordered_names = [\n",
"    \"BS + Avg ESP (Interface)\",\n",
"    \"BS + Avg ESP (interface left)\",\n",
"    \"BS + Avg ESP (interface right)\",\n",
"]\n",
"\n",
"for subworkflow_name in ordered_names:\n",
"    unit_id = avg_esp_unit_ids[subworkflow_name]\n",
"    avg_esp_data = client.properties.get_for_job(\n",
"        job_id,\n",
"        property_name=\"average_potential_profile\",\n",
"        unit_id=unit_id,\n",
"    )[0]\n",
"    visualize_properties(avg_esp_data, title=subworkflow_name)\n",
```
Result extraction is brittle on missing properties or workflow name drift.
Line 639/664 direct [0] indexing and Line 663 strict key lookup can raise exceptions when output data is partial or workflow names change slightly.
🔧 Proposed hardening

```diff
-vbo_value = client.properties.get_for_job(
+vbo_results = client.properties.get_for_job(
     job_id,
     property_name=PropertyName.scalar.valence_band_offset.value,
-)[0]
+)
+if not vbo_results:
+    raise ValueError("VBO property was not found for this job.")
+vbo_value = vbo_results[0]
 print(f"Valence Band Offset (VBO) value: {vbo_value['value']:.3f} eV")
@@
 for subworkflow_name in ordered_names:
-    unit_id = avg_esp_unit_ids[subworkflow_name]
-    avg_esp_data = client.properties.get_for_job(
+    unit_id = avg_esp_unit_ids.get(subworkflow_name)
+    if not unit_id:
+        print(f"⚠️ Skipping missing ESP unit for: {subworkflow_name}")
+        continue
+    avg_esp_results = client.properties.get_for_job(
         job_id,
         property_name="average_potential_profile",
         unit_id=unit_id,
-    )[0]
-    visualize_properties(avg_esp_data, title=subworkflow_name)
+    )
+    if not avg_esp_results:
+        print(f"⚠️ No ESP property for: {subworkflow_name}")
+        continue
+    visualize_properties(avg_esp_results[0], title=subworkflow_name)
```

🤖 Prompt for AI Agents
```python
is_multimaterial = job_workflow_dict.get("isMultiMaterial", False)
```
Preserve backward compatibility for legacy workflow flag key.
Line 275 only checks isMultiMaterial. Workflows still using isMultimaterial will silently fall back to single-material job config.
🔧 Proposed backward-compatible fix

```diff
-    is_multimaterial = job_workflow_dict.get("isMultiMaterial", False)
+    is_multimaterial = job_workflow_dict.get(
+        "isMultiMaterial",
+        job_workflow_dict.get("isMultimaterial", False),
+    )
```

📝 Committable suggestion
```python
is_multimaterial = job_workflow_dict.get(
    "isMultiMaterial",
    job_workflow_dict.get("isMultimaterial", False),
)
```
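For reference, the two-level `.get` fallback is pure dict logic and can be sanity-checked in isolation. The sketch below is standalone and independent of `utils/api.py`; the function name is made up for illustration:

```python
def read_multimaterial_flag(job_workflow_dict):
    # Prefer the current camel-case key; fall back to the legacy spelling.
    # The default for the inner .get is only used when neither key exists.
    return job_workflow_dict.get(
        "isMultiMaterial",
        job_workflow_dict.get("isMultimaterial", False),
    )
```

Note that when both keys are present, the current `isMultiMaterial` key wins, which preserves the intended precedence.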
```python
}
)
workflow_dict["subworkflows"].append(_polar_postprocess_subworkflow(source_python_subworkflow))
return Workflow.create(workflow_dict)
```
All of this should not be needed:
- implement the python-native in notebook
- update VBO property with API if needed