Implement DWG converter endpoint #1
Changes from all commits
6369d76
d5e559f
3d3d912
eded0e3
94fdcf2
8612c92
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| __pycache__/ | ||
| *.py[cod] | ||
| *.egg-info/ | ||
| .pytest_cache/ | ||
| .mypy_cache/ |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,21 +1,103 @@ | ||
| # ckanext-malmo | ||
|
|
||
| Customizations for the City of Malmö CKAN instance. | ||
| Customizations for the City of Malmo CKAN instance. | ||
|
|
||
| ## Requirements | ||
|
|
||
| - CKAN 2.10+ (tested on 2.11) | ||
|
|
||
| ## DWG Preview | ||
|
|
||
| This extension exposes a binary preview action at: | ||
|
|
||
| ```text | ||
| /api/3/action/convert_dwg?id=<resource-id> | ||
| ``` | ||
|
|
||
| The preview flow is: | ||
|
|
||
| 1. stage the DWG resource into a temporary file | ||
| 2. convert DWG -> DXF with ODA File Converter | ||
| 3. render DXF -> PNG with `ezdxf` and `matplotlib` | ||
| 4. cache the generated PNG by resource id + file hash | ||
| 5. return the PNG directly from the CKAN endpoint | ||
|
|
||
| Important runtime requirements: | ||
|
|
||
| - ODA File Converter must be installed and available on `PATH` | ||
| - `xvfb` is used automatically when `xvfb-run` is available | ||
| - Python rendering dependencies must be installed in the CKAN runtime | ||
|
|
||
| Python runtime dependencies: | ||
|
|
||
| - `ezdxf` | ||
| - `matplotlib` | ||
|
|
||
| System/runtime dependencies: | ||
|
|
||
| - ODA File Converter Linux asset (`.AppImage` or `.deb`) | ||
| - `xvfb` | ||
|
|
||
| ## DWG Preview Configuration | ||
|
|
||
| The DWG preview pipeline supports these CKAN config settings: | ||
|
|
||
| - `ckanext.malmo.dwg_preview_timeout` | ||
| Conversion timeout in seconds. Default: `45`. | ||
|
|
||
| - `ckanext.malmo.dwg_preview_download_timeout` | ||
| Download timeout in seconds for remote DWG resources. Default: `30`. | ||
|
|
||
| - `ckanext.malmo.dwg_preview_max_download_bytes` | ||
| Maximum DWG download size in bytes. Default: `104857600`. | ||
|
|
||
| - `ckanext.malmo.dwg_preview_oda_executable` | ||
| Absolute path or executable name for ODA File Converter. Default: `ODAFileConverter`. | ||
|
|
||
| - `ckanext.malmo.dwg_preview_oda_output_version` | ||
| DXF target version passed to ODA File Converter. Default: `ACAD2018`. | ||
|
|
||
| - `ckanext.malmo.dwg_preview_xvfb_screen` | ||
| Screen configuration passed to `xvfb-run` when launching ODA File Converter in headless Docker environments. Default: `-screen 0 1600x1200x24`. | ||
|
|
||
| - `ckanext.malmo.dwg_preview_render_margin` | ||
| Extra margin applied around rendered geometry. Default: `0.05`. | ||
|
|
||
| - `ckanext.malmo.dwg_preview_image_width` | ||
| Output preview width in pixels. Default: `1600`. | ||
|
|
||
| - `ckanext.malmo.dwg_preview_image_height` | ||
| Output preview height in pixels. Default: `1200`. | ||
|
|
||
| - `ckanext.malmo.dwg_preview_min_preview_bytes` | ||
| Minimum byte size for accepting a generated preview. Default: `1024`. | ||
|
|
||
| - `ckanext.malmo.dwg_preview_cache_dir` | ||
| Directory used for cached PNG previews. Default: system temporary directory + `ckan-dwg-preview-cache`. | ||
|
|
||
| ## Docker Setup | ||
|
|
||
| In the local development Docker setup: | ||
|
|
||
| - ODA File Converter is installed during image build | ||
| - the local `src/ckanext-malmo` extension is installed at container startup from the mounted workspace | ||
| - `xvfb` is installed for headless ODA execution | ||
|
|
||
| Optional local ODA asset override directory: | ||
|
|
||
| ```text | ||
| ckan/vendor/oda/ | ||
| ``` | ||
|
|
||
| Supported local asset formats: | ||
|
|
||
| - `.AppImage` | ||
| - `.deb` | ||
|
|
||
| ## Installation | ||
|
|
||
| To install `ckanext-malmo`: | ||
|
|
||
| 1. Clone this repository (or copy the extension files). | ||
| 2. Install the extension in your environment: | ||
| ```bash | ||
| pip install -e ckan/extensions/ckanext-malmo | ||
| ``` | ||
| 3. Add `malmo` to the `ckan.plugins` setting in your CKAN configuration file (`ckan.ini`): | ||
| ```ini | ||
| ckan.plugins = ... malmo | ||
| ``` | ||
| 1. Install the extension in your environment. | ||
| 2. Install ODA File Converter and make sure `ODAFileConverter` is available in the runtime environment. | ||
| 3. Add `malmo` to the `ckan.plugins` setting in your CKAN configuration file. |
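The README above says the action returns the PNG directly rather than a JSON envelope. A minimal client-side sketch of that contract follows. The site URL is a placeholder, the helper names are hypothetical, and it assumes the endpoint answers a plain GET for the resource; none of this is taken from the PR itself.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def build_convert_dwg_url(site_url: str, resource_id: str) -> str:
    """Build the preview URL documented in the README for one resource."""
    base = site_url.rstrip("/") + "/"
    action_url = urljoin(base, "api/3/action/convert_dwg")
    return action_url + "?" + urlencode({"id": resource_id})


def looks_like_png(payload: bytes) -> bool:
    """Cheap check that a response body is a PNG and not an error page."""
    return payload.startswith(PNG_SIGNATURE)


def fetch_preview(site_url: str, resource_id: str, timeout: float = 60.0) -> bytes:
    """Download the preview (hypothetical helper; performs a network call)."""
    with urlopen(build_convert_dwg_url(site_url, resource_id), timeout=timeout) as response:
        payload = response.read()
    if not looks_like_png(payload):
        raise ValueError("endpoint did not return a PNG")
    return payload
```

A caller would use something like `fetch_preview("https://ckan.example.org", "<resource-id>")`. A client timeout above the documented 45 second conversion default leaves room for an uncached first render.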
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| from .service import build_preview_payload | ||
|
|
||
| __all__ = ["build_preview_payload"] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,29 @@ | ||
| from __future__ import annotations | ||
|
|
||
| import hashlib | ||
| import os | ||
| import shutil | ||
|
|
||
|
|
||
| def file_sha256(path: str) -> str: | ||
| digest = hashlib.sha256() | ||
| with open(path, "rb") as source_file: | ||
| for chunk in iter(lambda: source_file.read(1024 * 1024), b""): | ||
| digest.update(chunk) | ||
| return digest.hexdigest() | ||
|
|
||
|
|
||
| def build_cache_path(cache_dir: str, resource_id: str, source_hash: str) -> str: | ||
| cache_key = hashlib.sha256(f"{resource_id}:{source_hash}".encode("utf-8")).hexdigest() | ||
| return os.path.join(cache_dir, f"{cache_key}.png") | ||
|
|
||
|
|
||
| def is_cached_preview_valid(path: str, min_preview_bytes: int) -> bool: | ||
| return os.path.exists(path) and os.path.getsize(path) >= min_preview_bytes | ||
|
|
||
|
|
||
| def store_cached_preview(source_path: str, cache_path: str) -> None: | ||
| os.makedirs(os.path.dirname(cache_path), exist_ok=True) | ||
| temp_path = f"{cache_path}.tmp" | ||
| shutil.copyfile(source_path, temp_path) | ||
| os.replace(temp_path, cache_path) |
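The cache helpers in this file key each preview on the resource id plus the SHA-256 of the source file, so a changed DWG gets a new cache path and the old preview is simply never looked up again. The sketch below repeats the four helpers from the diff so that it runs on its own; `run_cache_demo` and the file names it uses are hypothetical and exist only to exercise them.

```python
import hashlib
import os
import shutil
import tempfile


def file_sha256(path):
    digest = hashlib.sha256()
    with open(path, "rb") as source_file:
        # Read in 1 MiB chunks so large drawings are never fully in memory.
        for chunk in iter(lambda: source_file.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_cache_path(cache_dir, resource_id, source_hash):
    cache_key = hashlib.sha256(f"{resource_id}:{source_hash}".encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, f"{cache_key}.png")


def is_cached_preview_valid(path, min_preview_bytes):
    return os.path.exists(path) and os.path.getsize(path) >= min_preview_bytes


def store_cached_preview(source_path, cache_path):
    os.makedirs(os.path.dirname(cache_path), exist_ok=True)
    temp_path = f"{cache_path}.tmp"
    shutil.copyfile(source_path, temp_path)
    # Rename into place so readers never see a half-written PNG.
    os.replace(temp_path, cache_path)


def run_cache_demo():
    """Hypothetical walk-through: two revisions of one resource."""
    with tempfile.TemporaryDirectory() as work_dir:
        dwg_path = os.path.join(work_dir, "drawing.dwg")
        png_path = os.path.join(work_dir, "render.png")
        cache_dir = os.path.join(work_dir, "cache")

        with open(dwg_path, "wb") as handle:
            handle.write(b"first revision")
        first = build_cache_path(cache_dir, "resource-1", file_sha256(dwg_path))

        with open(dwg_path, "wb") as handle:
            handle.write(b"second revision")
        second = build_cache_path(cache_dir, "resource-1", file_sha256(dwg_path))

        # Stand-in for a rendered preview: 2048 bytes of filler.
        with open(png_path, "wb") as handle:
            handle.write(b"\0" * 2048)
        store_cached_preview(png_path, second)

        return {
            "key_changed": first != second,
            "hit": is_cached_preview_valid(second, 1024),
            "stale_miss": not is_cached_preview_valid(first, 1024),
            "too_small": not is_cached_preview_valid(second, 4096),
            "temp_left_behind": os.path.exists(second + ".tmp"),
        }
```

One thing the demo makes visible: the temporary name is derived from the cache path alone, so two workers rendering the same revision at the same time would write to the same `.tmp` file. Whether that matters depends on how the endpoint is deployed.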
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,162 @@ | ||
| from __future__ import annotations | ||
|
|
||
| import logging | ||
| import os | ||
| import tempfile | ||
| from dataclasses import dataclass | ||
|
|
||
| from ckan.plugins import toolkit | ||
|
|
||
| log = logging.getLogger(__name__) | ||
|
|
||
| DEFAULT_TIMEOUT_SECONDS = 45 | ||
| DEFAULT_DOWNLOAD_TIMEOUT_SECONDS = 30 | ||
| DEFAULT_MAX_DOWNLOAD_BYTES = 100 * 1024 * 1024 | ||
| DEFAULT_ODA_OUTPUT_VERSION = "ACAD2018" | ||
| DEFAULT_XVFB_SCREEN = "-screen 0 1600x1200x24" | ||
| DEFAULT_RENDER_MARGIN = 0.05 | ||
| DEFAULT_IMAGE_WIDTH = 1600 | ||
| DEFAULT_IMAGE_HEIGHT = 1200 | ||
| DEFAULT_MIN_PREVIEW_BYTES = 1024 | ||
| DEFAULT_CACHE_DIR = os.path.join(tempfile.gettempdir(), "ckan-dwg-preview-cache") | ||
| DEFAULT_MIN_CONTENT_COVERAGE = 0.002 | ||
| DEFAULT_MAX_INITIAL_COVERAGE = 0.6 | ||
| DEFAULT_RETRY_RENDER_MARGIN = 0.01 | ||
| DEFAULT_LINEWEIGHT_SCALING = 1.5 | ||
| DEFAULT_MIN_OCCUPIED_WIDTH_RATIO = 0.2 | ||
| DEFAULT_MIN_OCCUPIED_HEIGHT_RATIO = 0.2 | ||
|
|
||
|
|
||
| @dataclass(frozen=True) | ||
| class PreviewConfig: | ||
| timeout: int | ||
| download_timeout: int | ||
| max_download_bytes: int | ||
| oda_executable: str | ||
| oda_output_version: str | ||
| xvfb_screen: str | ||
| render_margin: float | ||
| image_width: int | ||
| image_height: int | ||
| min_preview_bytes: int | ||
| cache_dir: str | ||
| min_content_coverage: float | ||
| max_initial_coverage: float | ||
| retry_render_margin: float | ||
| lineweight_scaling: float | ||
| min_occupied_width_ratio: float | ||
| min_occupied_height_ratio: float | ||
|
|
||
| @classmethod | ||
| def from_ckan_config(cls) -> "PreviewConfig": | ||
| return cls( | ||
| timeout=_get_int("ckanext.malmo.dwg_preview_timeout", DEFAULT_TIMEOUT_SECONDS, minimum=1), | ||
| download_timeout=_get_int( | ||
| "ckanext.malmo.dwg_preview_download_timeout", | ||
| DEFAULT_DOWNLOAD_TIMEOUT_SECONDS, | ||
| minimum=1, | ||
| ), | ||
| max_download_bytes=_get_int( | ||
| "ckanext.malmo.dwg_preview_max_download_bytes", | ||
| DEFAULT_MAX_DOWNLOAD_BYTES, | ||
| minimum=1024, | ||
| ), | ||
| oda_executable=_get_string("ckanext.malmo.dwg_preview_oda_executable", "ODAFileConverter"), | ||
| oda_output_version=_get_string( | ||
| "ckanext.malmo.dwg_preview_oda_output_version", | ||
| DEFAULT_ODA_OUTPUT_VERSION, | ||
| ), | ||
| xvfb_screen=_get_string( | ||
| "ckanext.malmo.dwg_preview_xvfb_screen", | ||
| DEFAULT_XVFB_SCREEN, | ||
| ), | ||
| render_margin=_get_float( | ||
| "ckanext.malmo.dwg_preview_render_margin", | ||
| DEFAULT_RENDER_MARGIN, | ||
| minimum=0.0, | ||
| ), | ||
| image_width=_get_int( | ||
| "ckanext.malmo.dwg_preview_image_width", | ||
| DEFAULT_IMAGE_WIDTH, | ||
| minimum=256, | ||
| ), | ||
| image_height=_get_int( | ||
| "ckanext.malmo.dwg_preview_image_height", | ||
| DEFAULT_IMAGE_HEIGHT, | ||
| minimum=256, | ||
| ), | ||
| min_preview_bytes=_get_int( | ||
| "ckanext.malmo.dwg_preview_min_preview_bytes", | ||
| DEFAULT_MIN_PREVIEW_BYTES, | ||
| minimum=1, | ||
| ), | ||
| cache_dir=_get_string("ckanext.malmo.dwg_preview_cache_dir", DEFAULT_CACHE_DIR), | ||
| min_content_coverage=_get_float( | ||
| "ckanext.malmo.dwg_preview_min_content_coverage", | ||
| DEFAULT_MIN_CONTENT_COVERAGE, | ||
| minimum=0.00001, | ||
| ), | ||
| max_initial_coverage=_get_float( | ||
| "ckanext.malmo.dwg_preview_max_initial_coverage", | ||
| DEFAULT_MAX_INITIAL_COVERAGE, | ||
| minimum=0.001, | ||
| ), | ||
| retry_render_margin=_get_float( | ||
| "ckanext.malmo.dwg_preview_retry_render_margin", | ||
| DEFAULT_RETRY_RENDER_MARGIN, | ||
| minimum=0.0, | ||
| ), | ||
| lineweight_scaling=_get_float( | ||
| "ckanext.malmo.dwg_preview_lineweight_scaling", | ||
| DEFAULT_LINEWEIGHT_SCALING, | ||
| minimum=0.1, | ||
| ), | ||
| min_occupied_width_ratio=_get_float( | ||
| "ckanext.malmo.dwg_preview_min_occupied_width_ratio", | ||
| DEFAULT_MIN_OCCUPIED_WIDTH_RATIO, | ||
| minimum=0.0, | ||
| ), | ||
| min_occupied_height_ratio=_get_float( | ||
| "ckanext.malmo.dwg_preview_min_occupied_height_ratio", | ||
| DEFAULT_MIN_OCCUPIED_HEIGHT_RATIO, | ||
| minimum=0.0, | ||
| ), | ||
|
Comment on lines +94 to +123

Validate that coverage and ratio config values are ≤ 1.0.

Coverage and ratio fields represent proportions and should not exceed 1.0. Values > 1.0 would cause confusing validation failures downstream. Add maximum validation to fail fast with clear config errors.

Proposed fix to validate upper bounds:

min_content_coverage=_get_float(
"ckanext.malmo.dwg_preview_min_content_coverage",
DEFAULT_MIN_CONTENT_COVERAGE,
minimum=0.00001,
+ maximum=1.0,
),
max_initial_coverage=_get_float(
"ckanext.malmo.dwg_preview_max_initial_coverage",
DEFAULT_MAX_INITIAL_COVERAGE,
minimum=0.001,
+ maximum=1.0,
),
# ... similar for retry_render_margin if it's also a ratio ...
min_occupied_width_ratio=_get_float(
"ckanext.malmo.dwg_preview_min_occupied_width_ratio",
DEFAULT_MIN_OCCUPIED_WIDTH_RATIO,
minimum=0.0,
+ maximum=1.0,
),
min_occupied_height_ratio=_get_float(
"ckanext.malmo.dwg_preview_min_occupied_height_ratio",
DEFAULT_MIN_OCCUPIED_HEIGHT_RATIO,
minimum=0.0,
+ maximum=1.0,
),
||
| ) | ||
|
|
||
|
|
||
| def _get_string(config_key: str, default_value: str) -> str: | ||
| raw_value = toolkit.config.get(config_key) | ||
| if raw_value in (None, ""): | ||
| return default_value | ||
| value = str(raw_value).strip() | ||
| return value or default_value | ||
|
|
||
|
|
||
| def _get_int(config_key: str, default_value: int, minimum: int | None = None) -> int: | ||
| raw_value = toolkit.config.get(config_key) | ||
| if raw_value in (None, ""): | ||
| return default_value | ||
| try: | ||
| value = int(raw_value) | ||
| except (TypeError, ValueError): | ||
| log.warning("Invalid integer config %s=%r, using default %s", config_key, raw_value, default_value) | ||
| return default_value | ||
| if minimum is not None and value < minimum: | ||
| log.warning("Config %s=%r is below minimum %s, using default %s", config_key, value, minimum, default_value) | ||
| return default_value | ||
| return value | ||
|
|
||
|
|
||
| def _get_float(config_key: str, default_value: float, minimum: float | None = None) -> float: | ||
| raw_value = toolkit.config.get(config_key) | ||
| if raw_value in (None, ""): | ||
| return default_value | ||
| try: | ||
| value = float(raw_value) | ||
| except (TypeError, ValueError): | ||
| log.warning("Invalid float config %s=%r, using default %s", config_key, raw_value, default_value) | ||
| return default_value | ||
| if minimum is not None and value < minimum: | ||
| log.warning("Config %s=%r is below minimum %s, using default %s", config_key, value, minimum, default_value) | ||
| return default_value | ||
| return value | ||
Add maximum bounds to image dimension config to prevent memory exhaustion.

The `image_width` and `image_height` config values are passed directly to matplotlib's `figure.figsize` (render.py line 152) with only minimum validation. Maliciously or mistakenly configured large values (e.g., `100000x100000`) could cause out-of-memory crashes.

Add maximum validation (e.g., `maximum=8192`) to both `_get_int` calls to cap resource usage at reasonable limits.

Proposed fix to add maximum bounds:

image_width=_get_int(
"ckanext.malmo.dwg_preview_image_width",
DEFAULT_IMAGE_WIDTH,
minimum=256,
+ maximum=8192,
),
image_height=_get_int(
"ckanext.malmo.dwg_preview_image_height",
DEFAULT_IMAGE_HEIGHT,
minimum=256,
+ maximum=8192,
),

Update the `_get_int` signature and implementation to accept and enforce the `maximum` parameter.
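Both review comments ask for an upper bound next to the existing `minimum` check. One possible shape for that change is sketched below. It is not the PR author's implementation: the shared `_get_number` helper is hypothetical, and a plain mapping is passed in as a stand-in for `toolkit.config` so the example runs without CKAN. Like the code in the diff, it falls back to the default when a value is out of range instead of clamping it.

```python
from __future__ import annotations

import logging
from typing import Callable, Mapping

log = logging.getLogger(__name__)


def _get_number(config: Mapping, config_key: str, default_value, cast: Callable,
                minimum=None, maximum=None):
    """Parse one config value, returning the default if it is unusable."""
    raw_value = config.get(config_key)
    if raw_value in (None, ""):
        return default_value
    try:
        value = cast(raw_value)
    except (TypeError, ValueError):
        log.warning("Invalid config %s=%r, using default %s",
                    config_key, raw_value, default_value)
        return default_value
    if minimum is not None and value < minimum:
        log.warning("Config %s=%r is below minimum %s, using default %s",
                    config_key, value, minimum, default_value)
        return default_value
    # New check mirroring the minimum branch above.
    if maximum is not None and value > maximum:
        log.warning("Config %s=%r is above maximum %s, using default %s",
                    config_key, value, maximum, default_value)
        return default_value
    return value


def get_int(config: Mapping, config_key: str, default_value: int,
            minimum: int | None = None, maximum: int | None = None) -> int:
    return _get_number(config, config_key, default_value, int, minimum, maximum)


def get_float(config: Mapping, config_key: str, default_value: float,
              minimum: float | None = None, maximum: float | None = None) -> float:
    return _get_number(config, config_key, default_value, float, minimum, maximum)
```

With this shape, `get_int(config, "ckanext.malmo.dwg_preview_image_width", 1600, minimum=256, maximum=8192)` would reject a configured `100000` and log why, and the ratio settings could pass `maximum=1.0` in the same way.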