5 changes: 5 additions & 0 deletions .gitignore
@@ -0,0 +1,5 @@
__pycache__/
*.py[cod]
*.egg-info/
.pytest_cache/
.mypy_cache/
102 changes: 92 additions & 10 deletions README.md
@@ -1,21 +1,103 @@
# ckanext-malmo

-Customizations for the City of Malmö CKAN instance.
+Customizations for the City of Malmo CKAN instance.

## Requirements

- CKAN 2.10+ (tested on 2.11)

## DWG Preview

This extension exposes a binary preview action at:

```text
/api/3/action/convert_dwg?id=<resource-id>
```
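
As an illustration of how a client might address this endpoint, here is a minimal sketch that builds the action URL for a resource. The helper name `build_preview_url` and the example site URL are hypothetical; only the `/api/3/action/convert_dwg?id=<resource-id>` path comes from the README.

```python
from urllib.parse import urlencode, urljoin


def build_preview_url(site_url: str, resource_id: str) -> str:
    """Return the convert_dwg action URL for one resource (hypothetical helper)."""
    # Normalise the trailing slash so urljoin appends instead of replacing the last segment.
    base = urljoin(site_url.rstrip("/") + "/", "api/3/action/convert_dwg")
    # urlencode escapes the resource id so it is safe inside a query string.
    return f"{base}?{urlencode({'id': resource_id})}"


if __name__ == "__main__":
    # "ckan.example.org" and the id are placeholders, not real values.
    print(build_preview_url("https://ckan.example.org", "my-resource-id"))
```

Because the action returns a PNG rather than the usual JSON envelope, a caller would presumably read the response body as bytes.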

The preview flow is:

1. stage the DWG resource into a temporary file
2. convert DWG -> DXF with ODA File Converter
3. render DXF -> PNG with `ezdxf` and `matplotlib`
4. cache the generated PNG by resource id + file hash
5. return the PNG directly from the CKAN endpoint

Important runtime requirements:

- ODA File Converter must be installed and available on `PATH`
- `xvfb` is used automatically when `xvfb-run` is available
- Python rendering dependencies must be installed in the CKAN runtime
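
The optional `xvfb-run` wrapping can be sketched as follows. This is not the extension's actual code: the function name `build_oda_command` is hypothetical, the ODA argument order (input folder, output folder, output version, output type, recurse flag, audit flag, file filter) reflects my understanding of the ODA File Converter command line and should be checked against the installed version, and passing the screen setting through `xvfb-run -s` is an assumption about how the setting is consumed.

```python
from __future__ import annotations

import shutil


def build_oda_command(
    oda_executable: str,
    input_dir: str,
    output_dir: str,
    output_version: str = "ACAD2018",
    xvfb_run: str | None = None,
    xvfb_screen: str = "-screen 0 1600x1200x24",
) -> list[str]:
    """Build an argv list for a DWG -> DXF conversion (illustrative sketch)."""
    # Assumed ODA argument order: in, out, version, type, recurse (0), audit (1), filter.
    command = [oda_executable, input_dir, output_dir, output_version, "DXF", "0", "1", "*.DWG"]
    if xvfb_run:
        # -a picks a free display number; -s forwards arguments to the Xvfb server.
        command = [xvfb_run, "-a", "-s", xvfb_screen, *command]
    return command


if __name__ == "__main__":
    # shutil.which returns None when xvfb-run is not on PATH, which disables the wrapper.
    print(build_oda_command("ODAFileConverter", "/tmp/in", "/tmp/out", xvfb_run=shutil.which("xvfb-run")))
```

Returning a list rather than a shell string keeps the command usable with `subprocess.run(..., timeout=...)` without shell quoting.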

Python runtime dependencies:

- `ezdxf`
- `matplotlib`

System/runtime dependencies:

- ODA File Converter Linux asset (`.AppImage` or `.deb`)
- `xvfb`

## DWG Preview Configuration

The DWG preview pipeline supports these CKAN config settings:

- `ckanext.malmo.dwg_preview_timeout`
  Conversion timeout in seconds. Default: `45`.

- `ckanext.malmo.dwg_preview_download_timeout`
  Download timeout in seconds for remote DWG resources. Default: `30`.

- `ckanext.malmo.dwg_preview_max_download_bytes`
  Maximum DWG download size in bytes. Default: `104857600`.

- `ckanext.malmo.dwg_preview_oda_executable`
  Absolute path or executable name for ODA File Converter. Default: `ODAFileConverter`.

- `ckanext.malmo.dwg_preview_oda_output_version`
  DXF target version passed to ODA File Converter. Default: `ACAD2018`.

- `ckanext.malmo.dwg_preview_xvfb_screen`
  Screen configuration passed to `xvfb-run` when launching ODA File Converter in headless Docker environments. Default: `-screen 0 1600x1200x24`.

- `ckanext.malmo.dwg_preview_render_margin`
  Extra margin applied around rendered geometry. Default: `0.05`.

- `ckanext.malmo.dwg_preview_image_width`
  Output preview width in pixels. Default: `1600`.

- `ckanext.malmo.dwg_preview_image_height`
  Output preview height in pixels. Default: `1200`.

- `ckanext.malmo.dwg_preview_min_preview_bytes`
  Minimum byte size for accepting a generated preview. Default: `1024`.

- `ckanext.malmo.dwg_preview_cache_dir`
  Directory used for cached PNG previews. Default: system temporary directory + `ckan-dwg-preview-cache`.

## Docker Setup

In the local development Docker setup:

- ODA File Converter is installed during image build
- the local `src/ckanext-malmo` extension is installed at container startup from the mounted workspace
- `xvfb` is installed for headless ODA execution

Optional local ODA asset override directory:

```text
ckan/vendor/oda/
```

Supported local asset formats:

- `.AppImage`
- `.deb`

## Installation

To install `ckanext-malmo`:

-1. Clone this repository (or copy the extension files).
-2. Install the extension in your environment:
-```bash
-pip install -e ckan/extensions/ckanext-malmo
-```
-3. Add `malmo` to the `ckan.plugins` setting in your CKAN configuration file (`ckan.ini`):
-```ini
-ckan.plugins = ... malmo
-```
+1. Install the extension in your environment.
+2. Install ODA File Converter and make sure `ODAFileConverter` is available in the runtime environment.
+3. Add `malmo` to the `ckan.plugins` setting in your CKAN configuration file.
3 changes: 3 additions & 0 deletions ckanext/malmo/dwg_preview/__init__.py
@@ -0,0 +1,3 @@
from .service import build_preview_payload

__all__ = ["build_preview_payload"]
29 changes: 29 additions & 0 deletions ckanext/malmo/dwg_preview/cache.py
@@ -0,0 +1,29 @@
from __future__ import annotations

import hashlib
import os
import shutil


def file_sha256(path: str) -> str:
    digest = hashlib.sha256()
    with open(path, "rb") as source_file:
        for chunk in iter(lambda: source_file.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_cache_path(cache_dir: str, resource_id: str, source_hash: str) -> str:
    cache_key = hashlib.sha256(f"{resource_id}:{source_hash}".encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, f"{cache_key}.png")


def is_cached_preview_valid(path: str, min_preview_bytes: int) -> bool:
    return os.path.exists(path) and os.path.getsize(path) >= min_preview_bytes


def store_cached_preview(source_path: str, cache_path: str) -> None:
    os.makedirs(os.path.dirname(cache_path), exist_ok=True)
    temp_path = f"{cache_path}.tmp"
    shutil.copyfile(source_path, temp_path)
    os.replace(temp_path, cache_path)
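
To illustrate why keying the cache on resource id plus file hash invalidates stale previews, the sketch below repeats the two hashing helpers from `cache.py` so that it runs on its own, then hashes the same resource before and after its content changes. The `demo` function, the file name, and the resource id `res-1` are made up for the example.

```python
import hashlib
import os
import tempfile


def file_sha256(path: str) -> str:
    """Hash a file in 1 MiB chunks so large drawings are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as source_file:
        for chunk in iter(lambda: source_file.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_cache_path(cache_dir: str, resource_id: str, source_hash: str) -> str:
    """Derive the PNG cache path from the resource id and the content hash."""
    cache_key = hashlib.sha256(f"{resource_id}:{source_hash}".encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, f"{cache_key}.png")


def demo() -> tuple[str, str]:
    """Return the cache paths for two revisions of the same (fake) resource."""
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "drawing.dwg")
        with open(source, "wb") as handle:
            handle.write(b"revision 1")
        first = build_cache_path(workdir, "res-1", file_sha256(source))
        with open(source, "wb") as handle:
            handle.write(b"revision 2")
        second = build_cache_path(workdir, "res-1", file_sha256(source))
    return first, second


if __name__ == "__main__":
    first, second = demo()
    print(first != second)  # the path changes whenever the file content changes
```

Since the content hash is part of the key, a re-uploaded file maps to a new cache path and the old PNG is never served for it.
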
162 changes: 162 additions & 0 deletions ckanext/malmo/dwg_preview/config.py
@@ -0,0 +1,162 @@
from __future__ import annotations

import logging
import os
import tempfile
from dataclasses import dataclass

from ckan.plugins import toolkit

log = logging.getLogger(__name__)

DEFAULT_TIMEOUT_SECONDS = 45
DEFAULT_DOWNLOAD_TIMEOUT_SECONDS = 30
DEFAULT_MAX_DOWNLOAD_BYTES = 100 * 1024 * 1024
DEFAULT_ODA_OUTPUT_VERSION = "ACAD2018"
DEFAULT_XVFB_SCREEN = "-screen 0 1600x1200x24"
DEFAULT_RENDER_MARGIN = 0.05
DEFAULT_IMAGE_WIDTH = 1600
DEFAULT_IMAGE_HEIGHT = 1200
DEFAULT_MIN_PREVIEW_BYTES = 1024
DEFAULT_CACHE_DIR = os.path.join(tempfile.gettempdir(), "ckan-dwg-preview-cache")
DEFAULT_MIN_CONTENT_COVERAGE = 0.002
DEFAULT_MAX_INITIAL_COVERAGE = 0.6
DEFAULT_RETRY_RENDER_MARGIN = 0.01
DEFAULT_LINEWEIGHT_SCALING = 1.5
DEFAULT_MIN_OCCUPIED_WIDTH_RATIO = 0.2
DEFAULT_MIN_OCCUPIED_HEIGHT_RATIO = 0.2


@dataclass(frozen=True)
class PreviewConfig:
    timeout: int
    download_timeout: int
    max_download_bytes: int
    oda_executable: str
    oda_output_version: str
    xvfb_screen: str
    render_margin: float
    image_width: int
    image_height: int
    min_preview_bytes: int
    cache_dir: str
    min_content_coverage: float
    max_initial_coverage: float
    retry_render_margin: float
    lineweight_scaling: float
    min_occupied_width_ratio: float
    min_occupied_height_ratio: float

    @classmethod
    def from_ckan_config(cls) -> "PreviewConfig":
        return cls(
            timeout=_get_int("ckanext.malmo.dwg_preview_timeout", DEFAULT_TIMEOUT_SECONDS, minimum=1),
            download_timeout=_get_int(
                "ckanext.malmo.dwg_preview_download_timeout",
                DEFAULT_DOWNLOAD_TIMEOUT_SECONDS,
                minimum=1,
            ),
            max_download_bytes=_get_int(
                "ckanext.malmo.dwg_preview_max_download_bytes",
                DEFAULT_MAX_DOWNLOAD_BYTES,
                minimum=1024,
            ),
            oda_executable=_get_string("ckanext.malmo.dwg_preview_oda_executable", "ODAFileConverter"),
            oda_output_version=_get_string(
                "ckanext.malmo.dwg_preview_oda_output_version",
                DEFAULT_ODA_OUTPUT_VERSION,
            ),
            xvfb_screen=_get_string(
                "ckanext.malmo.dwg_preview_xvfb_screen",
                DEFAULT_XVFB_SCREEN,
            ),
            render_margin=_get_float(
                "ckanext.malmo.dwg_preview_render_margin",
                DEFAULT_RENDER_MARGIN,
                minimum=0.0,
            ),
            image_width=_get_int(
                "ckanext.malmo.dwg_preview_image_width",
                DEFAULT_IMAGE_WIDTH,
                minimum=256,
            ),
            image_height=_get_int(
                "ckanext.malmo.dwg_preview_image_height",
                DEFAULT_IMAGE_HEIGHT,
                minimum=256,
            ),
Comment on lines +78 to +87

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add maximum bounds to image dimension config to prevent memory exhaustion.

The image_width and image_height config values are passed directly to matplotlib's figure.figsize (render.py line 152) with only minimum validation. Maliciously or mistakenly configured large values (e.g., 100000x100000) could cause out-of-memory crashes.

Add maximum validation (e.g., maximum=8192) to both _get_int calls to cap resource usage at reasonable limits.
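
A rough back-of-the-envelope check of the memory concern, assuming the raster buffer is 8-bit RGBA (4 bytes per pixel); real peak usage depends on the matplotlib backend and on intermediate copies, so treat these numbers as a lower bound. The function name is made up for the example.

```python
def rgba_buffer_bytes(width_px: int, height_px: int, bytes_per_pixel: int = 4) -> int:
    """Size of one uncompressed raster buffer, assuming 4 bytes (RGBA) per pixel."""
    return width_px * height_px * bytes_per_pixel


if __name__ == "__main__":
    for width, height in [(1600, 1200), (8192, 8192), (100_000, 100_000)]:
        size = rgba_buffer_bytes(width, height)
        print(f"{width}x{height}: {size:,} bytes ({size / 2**20:,.1f} MiB)")
```

Under that assumption the default 1600x1200 preview needs about 7.3 MiB per buffer, the proposed 8192x8192 cap needs 256 MiB, and 100000x100000 would need roughly 37 GiB.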

🛡️ Proposed fix to add maximum bounds
         image_width=_get_int(
             "ckanext.malmo.dwg_preview_image_width",
             DEFAULT_IMAGE_WIDTH,
             minimum=256,
+            maximum=8192,
         ),
         image_height=_get_int(
             "ckanext.malmo.dwg_preview_image_height",
             DEFAULT_IMAGE_HEIGHT,
             minimum=256,
+            maximum=8192,
         ),

Update _get_int signature and implementation to accept and enforce the maximum parameter.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ckanext/malmo/dwg_preview/config.py` around lines 78 - 87, The image
dimension config lacks an upper bound and can be set to extremely large values;
update the two calls to _get_int that set image_width (using
DEFAULT_IMAGE_WIDTH) and image_height (using DEFAULT_IMAGE_HEIGHT) to pass
maximum=8192, and modify the _get_int function implementation/signature to
accept and validate a maximum parameter (raising or clipping on violation) so
matplotlib usage in render.py (figure.figsize) cannot receive unbounded sizes.
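
One possible shape for the bounded helper, as a sketch rather than the final patch. It keeps the existing fall-back-to-default behaviour for out-of-range values. To stay runnable outside CKAN it reads from a plain `dict` passed in as `config` instead of `toolkit.config`, and the name `get_bounded_int` is hypothetical.

```python
from __future__ import annotations

import logging

log = logging.getLogger(__name__)


def get_bounded_int(
    config: dict,
    config_key: str,
    default_value: int,
    minimum: int | None = None,
    maximum: int | None = None,
) -> int:
    """Parse an integer setting, falling back to the default when invalid or out of range."""
    raw_value = config.get(config_key)
    if raw_value in (None, ""):
        return default_value
    try:
        value = int(raw_value)
    except (TypeError, ValueError):
        log.warning("Invalid integer config %s=%r, using default %s", config_key, raw_value, default_value)
        return default_value
    if minimum is not None and value < minimum:
        log.warning("Config %s=%r is below minimum %s, using default %s", config_key, value, minimum, default_value)
        return default_value
    if maximum is not None and value > maximum:
        log.warning("Config %s=%r is above maximum %s, using default %s", config_key, value, maximum, default_value)
        return default_value
    return value


if __name__ == "__main__":
    settings = {"ckanext.malmo.dwg_preview_image_width": "100000"}
    # The oversized value is rejected and the default is used instead.
    print(get_bounded_int(settings, "ckanext.malmo.dwg_preview_image_width", 1600, minimum=256, maximum=8192))
```

Falling back to the default matches how the existing minimum check behaves; clamping to the maximum would be an alternative if silently substituting the default is considered surprising.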

            min_preview_bytes=_get_int(
                "ckanext.malmo.dwg_preview_min_preview_bytes",
                DEFAULT_MIN_PREVIEW_BYTES,
                minimum=1,
            ),
            cache_dir=_get_string("ckanext.malmo.dwg_preview_cache_dir", DEFAULT_CACHE_DIR),
            min_content_coverage=_get_float(
                "ckanext.malmo.dwg_preview_min_content_coverage",
                DEFAULT_MIN_CONTENT_COVERAGE,
                minimum=0.00001,
            ),
            max_initial_coverage=_get_float(
                "ckanext.malmo.dwg_preview_max_initial_coverage",
                DEFAULT_MAX_INITIAL_COVERAGE,
                minimum=0.001,
            ),
            retry_render_margin=_get_float(
                "ckanext.malmo.dwg_preview_retry_render_margin",
                DEFAULT_RETRY_RENDER_MARGIN,
                minimum=0.0,
            ),
            lineweight_scaling=_get_float(
                "ckanext.malmo.dwg_preview_lineweight_scaling",
                DEFAULT_LINEWEIGHT_SCALING,
                minimum=0.1,
            ),
            min_occupied_width_ratio=_get_float(
                "ckanext.malmo.dwg_preview_min_occupied_width_ratio",
                DEFAULT_MIN_OCCUPIED_WIDTH_RATIO,
                minimum=0.0,
            ),
            min_occupied_height_ratio=_get_float(
                "ckanext.malmo.dwg_preview_min_occupied_height_ratio",
                DEFAULT_MIN_OCCUPIED_HEIGHT_RATIO,
                minimum=0.0,
            ),
Comment on lines +94 to +123

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate that coverage and ratio config values are ≤ 1.0.

Coverage and ratio fields represent proportions and should not exceed 1.0:

  • min_content_coverage (line 94-98)
  • max_initial_coverage (line 99-103)
  • min_occupied_width_ratio (line 114-118)
  • min_occupied_height_ratio (line 119-123)

Values > 1.0 would cause confusing validation failures downstream. Add maximum validation to fail fast with clear config errors.

✅ Proposed fix to validate upper bounds
         min_content_coverage=_get_float(
             "ckanext.malmo.dwg_preview_min_content_coverage",
             DEFAULT_MIN_CONTENT_COVERAGE,
             minimum=0.00001,
+            maximum=1.0,
         ),
         max_initial_coverage=_get_float(
             "ckanext.malmo.dwg_preview_max_initial_coverage",
             DEFAULT_MAX_INITIAL_COVERAGE,
             minimum=0.001,
+            maximum=1.0,
         ),
         # ... similar for retry_render_margin if it's also a ratio ...
         min_occupied_width_ratio=_get_float(
             "ckanext.malmo.dwg_preview_min_occupied_width_ratio",
             DEFAULT_MIN_OCCUPIED_WIDTH_RATIO,
             minimum=0.0,
+            maximum=1.0,
         ),
         min_occupied_height_ratio=_get_float(
             "ckanext.malmo.dwg_preview_min_occupied_height_ratio",
             DEFAULT_MIN_OCCUPIED_HEIGHT_RATIO,
             minimum=0.0,
+            maximum=1.0,
         ),

Update _get_float signature and implementation to accept and enforce the maximum parameter.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ckanext/malmo/dwg_preview/config.py` around lines 94 - 123, The listed
coverage/ratio configs (min_content_coverage, max_initial_coverage,
min_occupied_width_ratio, min_occupied_height_ratio) must be validated to be ≤
1.0; update the calls in config initialization to pass maximum=1.0 for those
_get_float invocations and modify the _get_float function
signature/implementation to accept a maximum parameter and raise a clear
configuration error when the parsed float > maximum (in addition to existing
minimum checks) so invalid proportions fail fast with a descriptive message.
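
The fail-fast behaviour requested here differs from the existing helpers, which log a warning and fall back to the default. A sketch of the stricter variant follows, with several assumptions: the name `get_ratio` is hypothetical, a plain `dict` stands in for `toolkit.config`, and `ValueError` is used as the error type since the comment does not name one. It also rejects `nan`, which would otherwise pass both range comparisons.

```python
from __future__ import annotations

import math


def get_ratio(
    config: dict,
    config_key: str,
    default_value: float,
    minimum: float = 0.0,
    maximum: float = 1.0,
) -> float:
    """Parse a proportion setting and raise when it is not a finite value in range."""
    raw_value = config.get(config_key)
    if raw_value in (None, ""):
        return default_value
    try:
        value = float(raw_value)
    except (TypeError, ValueError) as exc:
        raise ValueError(f"Config {config_key}={raw_value!r} is not a number") from exc
    # nan compares False against every bound, so check finiteness explicitly.
    if not math.isfinite(value) or not (minimum <= value <= maximum):
        raise ValueError(
            f"Config {config_key}={raw_value!r} must be between {minimum} and {maximum}"
        )
    return value


if __name__ == "__main__":
    key = "ckanext.malmo.dwg_preview_max_initial_coverage"
    print(get_ratio({key: "0.6"}, key, 0.6, minimum=0.001))
```

Raising at configuration load surfaces a bad proportion at startup instead of at the first preview request; whether that is preferable to the current warn-and-default behaviour is a design decision for the PR author.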

        )


def _get_string(config_key: str, default_value: str) -> str:
    raw_value = toolkit.config.get(config_key)
    if raw_value in (None, ""):
        return default_value
    value = str(raw_value).strip()
    return value or default_value


def _get_int(config_key: str, default_value: int, minimum: int | None = None) -> int:
    raw_value = toolkit.config.get(config_key)
    if raw_value in (None, ""):
        return default_value
    try:
        value = int(raw_value)
    except (TypeError, ValueError):
        log.warning("Invalid integer config %s=%r, using default %s", config_key, raw_value, default_value)
        return default_value
    if minimum is not None and value < minimum:
        log.warning("Config %s=%r is below minimum %s, using default %s", config_key, value, minimum, default_value)
        return default_value
    return value


def _get_float(config_key: str, default_value: float, minimum: float | None = None) -> float:
    raw_value = toolkit.config.get(config_key)
    if raw_value in (None, ""):
        return default_value
    try:
        value = float(raw_value)
    except (TypeError, ValueError):
        log.warning("Invalid float config %s=%r, using default %s", config_key, raw_value, default_value)
        return default_value
    if minimum is not None and value < minimum:
        log.warning("Config %s=%r is below minimum %s, using default %s", config_key, value, minimum, default_value)
        return default_value
    return value