PFone-2-PFtwo

This repository contains a PDF parser that extracts embedded images and builds a Foundry VTT JournalEntry compendium.

Requirements

Python 3.10 or 3.11

Installation

Install the project and its dependencies with pip:

pip install .

Usage

Run the parser from anywhere using the pfpdf command:

pfpdf path/to/file.pdf output_dir

The extracted images are written to output_dir, together with a module.json manifest and a packs/images.json JournalEntry compendium file ready for import into Foundry VTT v13.

The parser uses PyMuPDF to extract images, deduplicates them using PDF metadata, and can be extended with additional processing as needed. Optional flags provide extra metadata for the generated entries and module manifest:

--module-id my-module – set the module identifier (the manifest name).
--title "My Title" – set the module manifest title.
--tags-from-text – include page text and bookmarks as tags on each entry.
--note "Some note" – attach a note to every generated entry.

Environment variables PFPDF_MODULE_ID and PFPDF_TITLE override the corresponding command-line options when set.
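
The precedence rule above (environment variable over command-line flag) can be sketched as follows. This is only an illustration: resolve_options is a hypothetical helper rather than a function from pdf_parser.py, the default values are borrowed from the example manifest further down, and treating an empty variable as "not set" is an assumption.

```python
import argparse
import os


def resolve_options(argv, environ=None):
    """Parse CLI flags, then let environment variables take precedence."""
    environ = os.environ if environ is None else environ

    parser = argparse.ArgumentParser(prog="pfpdf")
    parser.add_argument("pdf")
    parser.add_argument("output_dir")
    # Defaults are assumptions taken from the example module.json.
    parser.add_argument("--module-id", default="pf-images")
    parser.add_argument("--title", default="PF Images")
    args = parser.parse_args(argv)

    # A set (non-empty) environment variable wins over the flag;
    # an empty or missing variable falls back to the flag value.
    module_id = environ.get("PFPDF_MODULE_ID") or args.module_id
    title = environ.get("PFPDF_TITLE") or args.title
    return {"module_id": module_id, "title": title}
```

With PFPDF_MODULE_ID set in the environment, the value passed to --module-id is ignored, while --title still applies as long as PFPDF_TITLE is unset.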

Example:

pfpdf file.pdf out --module-id my-module --title "My Module" --tags-from-text --note "GM only"
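
For readers curious about the extraction step, here is a minimal sketch of pulling images out of a PDF with PyMuPDF and skipping repeats. It assumes the PDF object number (xref) as the deduplication key, which is one possible reading of "PDF metadata"; the project may use a different key. The function names unique_xrefs and extract_unique_images are hypothetical, and the 0-based image index is also an assumption.

```python
def unique_xrefs(images_per_page):
    """Yield (page_number, index, xref) for the first occurrence of each xref.

    images_per_page is a list with one entry per page; each entry is the
    list of image tuples for that page, with the xref as the first field
    (the shape PyMuPDF's Page.get_images() returns).
    """
    seen = set()
    for page_number, images in enumerate(images_per_page, start=1):
        for index, image in enumerate(images):
            xref = image[0]
            if xref in seen:
                continue  # same image object already handled
            seen.add(xref)
            yield page_number, index, xref


def extract_unique_images(pdf_path):
    """Return a list of (page_number, index, extension, raw_bytes)."""
    import fitz  # PyMuPDF; imported lazily so unique_xrefs has no dependency

    results = []
    with fitz.open(pdf_path) as doc:
        listings = [page.get_images(full=True) for page in doc]
        for page_number, index, xref in unique_xrefs(listings):
            info = doc.extract_image(xref)  # dict with "ext" and "image" keys
            results.append((page_number, index, info["ext"], info["image"]))
    return results
```

Keeping the deduplication in its own function means it can be exercised with plain tuples, without opening a real PDF.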

Testing

Run the linter and test suite before submitting changes:

pylint pdf_parser.py
pytest

Labeling and Folder Hierarchy

Metadata-based labeling: Alt text and bookmark titles label each image. Images that share the same metadata map to a single JournalEntry, so repeated images are not duplicated.
Page+index fallback: When no metadata is available, entries are named page_<page>_<index> to guarantee a stable label.
Nested folders: Bookmark hierarchies create nested folders inside the compendium, preserving the structure of the original PDF.
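
The labeling rules above can be sketched roughly as follows. choose_label and group_by_label are hypothetical names, and preferring alt text over a bookmark title when both exist is an assumption; the README only states that both are used.

```python
def choose_label(page, index, alt_text=None, bookmark_title=None):
    """Pick a label from metadata, falling back to page_<page>_<index>."""
    # Assumed priority: alt text first, then bookmark title.
    for candidate in (alt_text, bookmark_title):
        if candidate and candidate.strip():
            return candidate.strip()
    return f"page_{page}_{index}"


def group_by_label(images):
    """Group image records by label so each label yields one entry."""
    entries = {}
    for image in images:
        label = choose_label(
            image["page"],
            image["index"],
            image.get("alt"),
            image.get("bookmark"),
        )
        entries.setdefault(label, []).append(image)
    return entries
```

Two images carrying the same alt text end up under one label, while an image with no metadata gets its own page-and-index label.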

Example `module.json`

{
  "name": "pf-images",
  "title": "PF Images",
  "version": "1.0.0",
  "compatibleCoreVersion": "13",
  "packs": [
    {
      "name": "images",
      "label": "Images",
      "path": "packs/images.json",
      "type": "JournalEntry"
    }
  ]
}
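
As a rough sketch of how --module-id and --title might feed into a manifest like the one above, the snippet below builds the same structure in Python. build_manifest is a hypothetical helper, not part of the project's documented API, and the fixed version string simply mirrors the example.

```python
import json


def build_manifest(module_id="pf-images", title="PF Images"):
    """Return a manifest dict shaped like the example module.json."""
    return {
        "name": module_id,  # --module-id sets the manifest name
        "title": title,     # --title sets the manifest title
        "version": "1.0.0",
        "compatibleCoreVersion": "13",
        "packs": [
            {
                "name": "images",
                "label": "Images",
                "path": "packs/images.json",
                "type": "JournalEntry",
            }
        ],
    }


# Serialize with the same two-space indentation as the example.
print(json.dumps(build_manifest("my-module", "My Module"), indent=2))
```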

Example `packs/images.json`

[
  {
    "_id": "abc123",
    "name": "Goblin Ambush",
    "folder": "Encounters/Goblins",
    "pages": [
      {
        "name": "Goblin Ambush",
        "type": "image",
        "image": {"src": "list/0.png"}
      }
    ]
  }
]
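
To connect this entry format with the nested-folder rule from the labeling section, the sketch below derives slash-separated folder paths from a bookmark outline and assembles one entry. It assumes the outline comes as [level, title, page] rows, which is the shape PyMuPDF's Document.get_toc() returns. bookmark_paths and build_entry are hypothetical helpers, and the project's real mapping from bookmarks to folders may differ.

```python
def bookmark_paths(toc):
    """Return (page, path) pairs, one per bookmark, with nested titles joined by '/'."""
    stack = []
    paths = []
    for level, title, page in toc:
        # Drop titles at this depth or deeper, then push the current one.
        del stack[level - 1:]
        stack.append(title)
        paths.append((page, "/".join(stack)))
    return paths


def build_entry(entry_id, name, folder, src):
    """Build a JournalEntry dict shaped like the example above."""
    return {
        "_id": entry_id,
        "name": name,
        "folder": folder,
        "pages": [
            {"name": name, "type": "image", "image": {"src": src}}
        ],
    }


toc = [[1, "Encounters", 1], [2, "Goblins", 2], [1, "Treasure", 5]]
folder = dict(bookmark_paths(toc))[2]
entry = build_entry("abc123", "Goblin Ambush", folder, "list/0.png")
```

Here the level-2 bookmark "Goblins" under "Encounters" produces the folder path used in the example entry.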

Importing into Foundry VTT v13

Copy module.json, the packs/ directory, and the extracted image files into Foundry's Data/modules/<your-module> folder.
Launch Foundry VTT v13 and enable the module from Settings → Manage Modules.
Open the Compendium Packs sidebar, locate the Images JournalEntry compendium, and choose Import All or drag entries into your world.
Imported entries appear in nested folders mirroring the PDF's bookmark structure.

Contributing

Before submitting changes, review AGENTS.md for the project's master plan, development guidelines, and hierarchy expectations.
