RSS Utils for Flow-Like

Flow-Like WASM nodes for RSS and Atom feed utility workflows. The package is intended to cover the practical pieces around feed discovery, fetch request preparation, central parsing, release-date filtering, deduplication, and formatting.

Repository: https://github.com/Rheosoph/node-package-rss-utils

Author: Felix Schultz, Rheosoph GmbH

Store page copy: STORE.md

Implemented Nodes

| Node | Category | Purpose |
| --- | --- | --- |
| Normalize Feed URL | RSS/Utilities | Trim and normalize a feed URL, optionally adding https:// and removing fragments. |
| Build Feed Request | RSS/Fetch | Create a typed HTTP request descriptor with RSS/Atom/JSON Feed Accept, User-Agent, ETag, and Last-Modified headers. |
| Parse Feed | RSS/Parsing | Centrally parse RSS, Atom, RDF, and JSON Feed documents into normalized feed metadata and items. |
| Extract Feed Links | RSS/Discovery | Find RSS, Atom, and JSON Feed alternate links in an HTML document. |
| Search Feed Items | RSS/Items | Keep items whose text fields contain a query. |
| Filter Feed Items by Category | RSS/Items | Keep items with a matching category term. |
| Filter Feed Items by Date | RSS/Items | Keep items inside an inclusive release-date window. |
| Dedupe Feed Items | RSS/Items | Remove repeated items by id, link, or title plus date. |
| Limit Feed Items | RSS/Items | Keep the first N feed items. |
| Feed Item to Markdown | RSS/Formatting | Format a normalized feed item as Markdown for summaries, digests, and notifications. |
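
To illustrate the "id, link, or title plus date" rule used by Dedupe Feed Items, here is a minimal sketch. It assumes the three keys are tried in that order as a fallback chain, which the description above does not confirm. The FeedItem struct and the dedupe_key and dedupe_items functions are hypothetical stand-ins, not the package's actual types from types.rs.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified item shape (assumption, not the package's type).
#[derive(Debug, Clone, PartialEq)]
pub struct FeedItem {
    pub id: Option<String>,
    pub link: Option<String>,
    pub title: Option<String>,
    pub published: Option<String>,
}

/// Treat missing and whitespace-only values the same way.
fn non_empty(value: &Option<String>) -> Option<&str> {
    value.as_deref().map(str::trim).filter(|s| !s.is_empty())
}

/// Pick the first usable key: id, then link, then title plus date.
/// The prefixes keep an id from colliding with an identical link string.
fn dedupe_key(item: &FeedItem) -> Option<String> {
    if let Some(id) = non_empty(&item.id) {
        return Some(format!("id:{id}"));
    }
    if let Some(link) = non_empty(&item.link) {
        return Some(format!("link:{link}"));
    }
    let title = non_empty(&item.title)?;
    let date = non_empty(&item.published).unwrap_or("");
    Some(format!("title:{title}|{date}"))
}

/// Keep the first occurrence of each key and preserve input order.
/// Items with no usable key are kept, since nothing proves they repeat.
pub fn dedupe_items(items: Vec<FeedItem>) -> Vec<FeedItem> {
    let mut seen = HashSet::new();
    items
        .into_iter()
        .filter(|item| match dedupe_key(item) {
            Some(key) => seen.insert(key),
            None => true,
        })
        .collect()
}
```

Keeping keyless items is a design choice of this sketch: dropping them would silently discard entries that cannot be shown to be duplicates.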

Layout

Each Flow-Like node lives in its own file under src/nodes/:

src/nodes/normalize_feed_url.rs
src/nodes/build_feed_request.rs
src/nodes/parse_feed.rs
src/nodes/extract_feed_links.rs
src/nodes/search_feed_items.rs
src/nodes/filter_feed_items_by_category.rs
src/nodes/filter_feed_items_by_date.rs
src/nodes/dedupe_feed_items.rs
src/nodes/limit_feed_items.rs
src/nodes/feed_item_to_markdown.rs

Shared code is split by responsibility: parser.rs, dates.rs, filtering.rs, discovery.rs, formatting.rs, and types.rs.

When parsing a body fetched by another node, pass the original feed URL into Parse Feed as source_url. The parser uses it as the base URL for feeds that contain relative media, enclosure, or content links.

Extract Feed Links only returns absolute HTTP(S) feed URLs. Relative feed links are resolved when base_url is provided; unresolved relative links are ignored so downstream HTTP nodes do not receive invalid request URLs.
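
The resolution rule described above can be sketched as follows. This is a deliberately simplified, standard-library-only illustration, not the package's code: resolve_feed_link and its helpers are hypothetical names, scheme matching is case-sensitive, and dot segments such as ../ are not normalized. A real implementation would more likely rely on a full URL parser.

```rust
/// Return the "scheme://host[:port]" prefix of an absolute HTTP(S) URL.
fn origin(url: &str) -> Option<&str> {
    let scheme_len = if url.starts_with("https://") {
        8
    } else if url.starts_with("http://") {
        7
    } else {
        return None;
    };
    let rest = &url[scheme_len..];
    let host_len = rest
        .find(|c: char| matches!(c, '/' | '?' | '#'))
        .unwrap_or(rest.len());
    if host_len == 0 { None } else { Some(&url[..scheme_len + host_len]) }
}

/// True for hrefs such as "mailto:x" or "ftp://y" that carry their own scheme.
fn has_scheme(href: &str) -> bool {
    match href.find(|c: char| matches!(c, ':' | '/' | '?' | '#')) {
        Some(i) => {
            i > 0 && href.as_bytes()[i] == b':' && href.as_bytes()[0].is_ascii_alphabetic()
        }
        None => false,
    }
}

/// Resolve one href to an absolute HTTP(S) URL, or return None so the caller
/// can drop it instead of passing an invalid URL downstream.
pub fn resolve_feed_link(href: &str, base_url: Option<&str>) -> Option<String> {
    let href = href.trim();
    if href.is_empty() {
        return None;
    }
    if origin(href).is_some() {
        return Some(href.to_string()); // already absolute HTTP(S)
    }
    if has_scheme(href) {
        return None; // non-HTTP scheme
    }
    let base = base_url?.trim(); // relative link without a base: ignored
    let base_origin = origin(base)?;
    if let Some(rest) = href.strip_prefix("//") {
        // Protocol-relative link: reuse the scheme of the base URL.
        let scheme = &base[..base.find("//")?];
        let candidate = format!("{scheme}//{rest}");
        return if origin(&candidate).is_some() { Some(candidate) } else { None };
    }
    if href.starts_with('/') {
        return Some(format!("{base_origin}{href}")); // root-relative
    }
    // Path-relative: append to the directory part of the base path.
    let path_end = base
        .find(|c: char| matches!(c, '?' | '#'))
        .unwrap_or(base.len());
    let base_path = &base[..path_end];
    match base_path[base_origin.len()..].rfind('/') {
        Some(i) => Some(format!("{}{href}", &base_path[..base_origin.len() + i + 1])),
        None => Some(format!("{base_origin}/{href}")),
    }
}

/// Resolve a batch of hrefs, silently skipping anything unresolved.
pub fn resolve_feed_links(hrefs: &[&str], base_url: Option<&str>) -> Vec<String> {
    hrefs.iter().filter_map(|h| resolve_feed_link(h, base_url)).collect()
}
```

For example, under these assumptions resolve_feed_link("/feed.xml", Some("https://example.com/blog/post?x=1")) yields https://example.com/feed.xml, while the same href with no base URL yields None.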

Release-Date Filtering

Filter Feed Items by Date uses Flow-Like Date pins for inclusive released_from and released_to bounds. Programmatic callers may still pass strings or wrapped Flow-Like date objects. The parser supports RFC3339, RFC2822, HTTP-date, Unix seconds/milliseconds/microseconds/nanoseconds, YYYY, YYYY-MM, YYYY-MM-DD, named dates such as June 4, 2026, and common timezone abbreviations. Date-only values expand to the whole day, month, or year.
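
As an illustration of how date-only values can expand to a whole day, month, or year, the sketch below maps YYYY, YYYY-MM, and YYYY-MM-DD strings to an inclusive range of Unix seconds and checks a timestamp against inclusive bounds. It covers only these three forms, assumes UTC, and limits years to 1 through 9999. The function names are hypothetical and are not taken from dates.rs.

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date
/// (Howard Hinnant's days_from_civil algorithm).
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400);
    let mp = (m + 9) % 12; // March = 0, ..., February = 11
    let doy = (153 * mp + 2) / 5 + d - 1;
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    era * 146_097 + doe - 719_468
}

/// First day of the month and first day of the following month, as day numbers.
fn month_span(y: i64, m: i64) -> (i64, i64) {
    let (ny, nm) = if m == 12 { (y + 1, 1) } else { (y, m + 1) };
    (days_from_civil(y, m, 1), days_from_civil(ny, nm, 1))
}

/// Expand a date-only string to an inclusive (start, end) range in Unix seconds.
pub fn expand_date_only(s: &str) -> Option<(i64, i64)> {
    let parts: Vec<i64> = s
        .trim()
        .split('-')
        .map(|p| p.parse().ok())
        .collect::<Option<_>>()?;
    if !(1..=9999).contains(parts.first()?) {
        return None;
    }
    let (first_day, next_day) = match parts.as_slice() {
        &[y] => (days_from_civil(y, 1, 1), days_from_civil(y + 1, 1, 1)),
        &[y, m] if (1..=12).contains(&m) => month_span(y, m),
        &[y, m, d] if (1..=12).contains(&m) => {
            let (start, next) = month_span(y, m);
            if d < 1 || d > next - start {
                return None; // day does not exist in this month
            }
            (start + d - 1, start + d)
        }
        _ => return None,
    };
    // The range ends one second before the next period begins.
    Some((first_day * 86_400, next_day * 86_400 - 1))
}

/// Inclusive window check. A missing bound is open-ended.
/// Returns None when a bound is present but cannot be parsed.
pub fn in_release_window(
    ts: i64,
    released_from: Option<&str>,
    released_to: Option<&str>,
) -> Option<bool> {
    let lower = match released_from {
        Some(s) => Some(expand_date_only(s)?.0),
        None => None,
    };
    let upper = match released_to {
        Some(s) => Some(expand_date_only(s)?.1),
        None => None,
    };
    Some(lower.map_or(true, |lo| ts >= lo) && upper.map_or(true, |hi| ts <= hi))
}
```

Using the start of the range for released_from and the end of the range for released_to is what makes a bound such as 2026-06 include every item released during that month.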

Validation Corpus

tests/feed_corpus.csv tracks 200 public feed targets for parser validation: 50 Reddit feeds, 25 AWS feeds, 50 news feeds, 50 technology feeds, and 25 GitHub Atom feeds. Normal tests validate the corpus structure and coverage without making live network calls.

Node Ideas

The most useful next nodes fall into a few groups:

| Node Idea | Why it is useful |
| --- | --- |
| Fetch Feed | Perform the actual HTTP GET, expose status, response headers, body, ETag, and Last-Modified; requires NetworkHttp. |
| Validate Feed | Report XML, date, URL, enclosure, and required-field issues before downstream processing. |
| Sort Feed Items | Sort items by published or updated date with stable fallback behavior. |
| Merge Feeds | Combine multiple normalized item arrays while preserving source metadata. |
| Configurable Dedupe Feed Items | Selectable key strategy: GUID, canonical link, title/date, or content hash. |
| Extract Enclosures | Pull podcast/audio/video/image enclosures into typed media structs. |
| Feed Items to Digest | Produce Markdown, HTML, or plain-text digests grouped by feed, date, author, or category. |
| Build RSS Feed | Emit valid RSS XML from normalized feed metadata and items. |
| Build Atom Feed | Emit valid Atom XML from normalized feed metadata and items. |
| Feed Health Check | Track stale feeds, changed status codes, parse failures, item velocity, and missing conditional headers. |
| Feed Cache Key | Create deterministic cache keys from URL plus conditional headers for scheduled workflows. |
| Discover Site Feeds | Fetch a webpage and combine HTML link extraction with common paths like /feed, /rss, and /atom.xml. |
| OPML Import | Parse OPML subscription lists into feed source records. |
| OPML Export | Write feed source records back to OPML. |
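
To make the Feed Cache Key idea more concrete, here is one possible sketch. Nothing in it comes from the repository: the feed_cache_key function, the feed- prefix, and the choice of a hand-written 64-bit FNV-1a hash are all assumptions for illustration. FNV-1a is used because the algorithm behind the standard library's DefaultHasher is unspecified and may change between Rust releases, which would make stored keys unstable.

```rust
// 64-bit FNV-1a parameters.
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Fold bytes into a running FNV-1a hash.
fn fnv1a(hash: u64, bytes: &[u8]) -> u64 {
    bytes
        .iter()
        .fold(hash, |h, b| (h ^ u64::from(*b)).wrapping_mul(FNV_PRIME))
}

/// Build a cache key from the feed URL plus the conditional header values.
/// The same inputs always produce the same key, across runs and platforms.
pub fn feed_cache_key(url: &str, etag: Option<&str>, last_modified: Option<&str>) -> String {
    let mut h = FNV_OFFSET;
    for field in [Some(url), etag, last_modified] {
        match field {
            // Tag byte 1: the field is present (possibly empty).
            Some(value) => {
                h = fnv1a(h, &[1]);
                h = fnv1a(h, value.as_bytes());
            }
            // Tag byte 0: the field is absent.
            None => h = fnv1a(h, &[0]),
        }
        // 0xFF never occurs in valid UTF-8, so it works as a field separator.
        h = fnv1a(h, &[0xFF]);
    }
    format!("feed-{h:016x}")
}
```

A scheduled workflow could, for instance, compare feed_cache_key(url, etag, last_modified) against the key stored from the previous run and skip parsing when the two match. FNV-1a is not collision-resistant against adversarial input, so a cryptographic hash would be a better fit if keys must be tamper-proof.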

Build

cargo build --release --target wasm32-wasip2

The release WASM component is written to:

target/wasm32-wasip2/release/node_package_rss_utils.wasm

Using mise:

mise run build

This also copies the component to node.wasm.

Test

cargo test --target $(rustc -vV | grep host | awk '{print $2}')

The default test suite runs offline. It validates the structure of the 200-feed corpus and smoke-tests every node against corpus-derived fixtures. The test that fetches live third-party feeds is marked as ignored by default and has to be requested explicitly:

cargo test --target $(rustc -vV | grep host | awk '{print $2}') -- --ignored live_feed_corpus_fetches_and_parses_all_feeds

Using mise:

mise run test

Publishing

  1. Build with mise run build.
  2. Open Flow-Like Desktop.
  3. Go to Library -> Packages -> Publish.
  4. Select node.wasm and flow-like.toml.
  5. Submit the package.

License

Licensed under either of:

  - Apache License, Version 2.0 (LICENSE-APACHE)
  - MIT license (LICENSE-MIT)

at your option.

Copyright (c) 2026 Rheosoph GmbH, Felix Schultz.
