diff --git a/contributing.md b/contributing.md
index f8dc3815d..1f549469e 100644
--- a/contributing.md
+++ b/contributing.md
@@ -1,50 +1,492 @@
-### Contributing
+# Contributing to NV-Ingest
-We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original
-work, or you have rights to submit it under the same license, or a compatible license.
+External contributions will be welcome soon, and they are greatly appreciated! Every little bit helps, and credit will always be given.
-Any contribution which contains commits that are not signed off are not accepted.
+## Table of Contents
-To sign off on a commit, use the --signoff (or -s) option when you commit your changes as shown following.
+1. [Filing Issues](#filing-issues)
+2. [Cloning the Repository](#cloning-the-repository)
+3. [Code Contributions](#code-contributions)
+   - [Your First Issue](#your-first-issue)
+   - [Seasoned Developers](#seasoned-developers)
+   - [Workflow](#workflow)
+   - [Updating Dependencies](#updating-dependencies)
+   - [Common Processing Patterns](#common-processing-patterns)
+     - [traceable](#traceable---srcnv_ingestutiltracingtaggingpy)
+     - [nv_ingest_node_failure_context_manager](#nv_ingest_node_failure_context_manager---srcnv_ingestutilexception_handlersdecoratorspy)
+     - [filter_by_task](#filter_by_task---srcnv_ingestutilflow_controlfilter_by_taskpy)
+     - [cm_skip_processing_if_failed](#cm_skip_processing_if_failed---morpheusutilscontrol_message_utilspy)
+   - [Adding a New Stage or Module](#adding-a-new-stage-or-module)
+   - [Common Practices for Writing Unit Tests](#common-practices-for-writing-unit-tests)
+     - [General Guidelines](#general-guidelines)
+     - [Mocking External Services](#mocking-external-services)
+     - [Example Test Structure](#example-test-structure)
+   - [Submodules, Third Party Libraries, and Models](#submodules-third-party-libraries-and-models)
+     - [Submodules](#submodules)
+     - [Models](#models)
+4. 
[Architectural Guidelines](#architectural-guidelines) + - [Single Responsibility Principle (SRP)](#1-single-responsibility-principle-srp) + - [Interface Segregation Principle (ISP)](#2-interface-segregation-principle-isp) + - [Dependency Inversion Principle (DIP)](#3-dependency-inversion-principle-dip) + - [Physical Design Structure Mirroring Logical Design Structure](#4-physical-design-structure-mirroring-logical-design-structure) + - [Levelization](#5-levelization) + - [Acyclic Dependencies Principle (ADP)](#6-acyclic-dependencies-principle-adp) + - [Package Cohesion Principles](#7-package-cohesion-principles) + - [Common Closure Principle (CCP)](#common-closure-principle-ccp) + - [Common Reuse Principle (CRP)](#common-reuse-principle-crp) + - [Encapsulate What Varies](#8-encapsulate-what-varies) + - [Favor Composition Over Inheritance](#9-favor-composition-over-inheritance) + - [Clean Separation of Concerns (SoC)](#10-clean-separation-of-concerns-soc) + - [Principle of Least Knowledge (Law of Demeter)](#11-principle-of-least-knowledge-law-of-demeter) + - [Document Assumptions and Decisions](#12-document-assumptions-and-decisions) + - [Continuous Integration and Testing](#13-continuous-integration-and-testing) +5. [Writing Good and Thorough Documentation](#writing-good-and-thorough-documentation) +6. [Licensing](#licensing) +7. [Attribution](#attribution) -``` -$ git commit --signoff --message "Add cool feature." -``` +## Filing Issues -This appends the following text to your commit message. +1. **Bug Reports, Feature Requests, and Documentation Issues:** Please file + an [issue](https://github.com/NVIDIA/nv-ingest/issues) with a detailed + description of + the problem, feature request, or documentation issue. The NV-Ingest team will review and triage these issues, + and if appropriate, schedule them for a future release. 
+## Cloning the Repository
+
+```bash
+DATASET_ROOT=[path to your dataset root]
+MODULE_NAME=[]
+NV_INGEST_ROOT=[path to your NV-Ingest root]
+git clone https://github.com/NVIDIA/nv-ingest.git $NV_INGEST_ROOT
+cd $NV_INGEST_ROOT
 ```
-Signed-off-by: Your Name
+
+Ensure all submodules are checked out:
+
+```bash
+git submodule update --init --recursive
 ```
-#### Developer Certificate of Origin (DCO)
+## Code Contributions
-The following is the full text of the Developer Certificate of Origin (DCO)
+### Your First Issue
-```
-    Developer Certificate of Origin
-    Version 1.1
+1. **Finding an Issue:** Start with issues
+   labeled [good first issue](https://github.com/NVIDIA/nv-ingest/labels/good%20first%20issue).
+2. **Claim an Issue:** Comment on the issue you wish to work on.
+3. **Implement Your Solution:** Dive into the code! Update or add unit tests as necessary.
+4. **Submit Your Pull Request:** [Create a pull request](https://github.com/NVIDIA/nv-ingest/pulls) once your
+   code is ready.
+5. **Code Review:** Wait for the review by other developers and make necessary updates.
+6. **Merge:** After approval, an NVIDIA developer will merge your pull request.
-    Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
-    1 Letterman Drive
-    Suite D4700
-    San Francisco, CA, 94129
+### Seasoned Developers
-    Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
-```
+For those familiar with the codebase, please check
+the [project boards](https://github.com/orgs/NVIDIA/projects/48/views/1) for
+issues. Look for unassigned issues and follow the steps starting from **Claim an Issue**.
+
+### Workflow
+
+1. **NV-Ingest Foundation**: Built on top
+   of [Ray](https://docs.ray.io/en/latest/serve/architecture.html).
+
+2. **Pipeline Structure**: Designed around a pipeline that processes individual jobs within an asynchronous execution
+   graph. Each job is processed by a series of stages or task handlers.
+
+3. 
**Job Composition**: Jobs consist of a data payload, metadata, and task specifications that determine the processing + steps applied to the data. + +4. **Job Submission**: + + - A job is submitted as a JSON specification and converted into + a [ControlMessage](https://github.com/nv-morpheus/Morpheus/blob/branch-24.06/docs/source/developer_guide/guides/9_control_messages.md), + with the payload consisting of a cuDF dataframe. + - For example: + ```text + document_type source_id uuid metadata + 0 pdf somefile 1234 { ... } + ``` + - The `metadata` column contents correspond to + the [schema-enforced metadata format of returned data](docs/docs/extraction/content-metadata.md). + +5. **Pipeline Processing**: + + - The `ControlMessage` is passed through the pipeline, where each stage processes the data and metadata as needed. + - Subsequent stages may add, transform, or filter data as needed, with all resulting artifacts stored in + the `ControlMessage`'s payload. + - For example, after processing, the payload may look like: + ```text + document_type source_id uuid metadata + 0 text somefile abcd-1234 {'content': "The quick brown fox jumped...", ...} + 1 image somefile efgh-5678 {'content': "base64 encoded image", ...} + 2 image somefile xyza-5618 {'content': "base64 encoded image", ...} + 3 image somefile zxya-5628 {'content': "base64 encoded image", ...} + 4 status somefile kvq9-5600 {'content': "", 'status': "filtered", ...} + ``` + - A single job can result in multiple artifacts, each with its own metadata element definition. + +6. **Job Completion**: + - Upon reaching the end of the pipeline, the `ControlMessage` is converted into a `JobResult` object and pushed to + the ephemeral output queue for client retrieval. + - `JobResult` objects consist of a dictionary containing: + 1. **data**: A list of metadata artifacts produced by the job. + 2. **status**: The job status as success or failure. + 3. **description**: A human-readable description of the job status. + 4. 
**trace**: A list of timing traces generated during the job's processing. + 5. **annotations**: A list of task annotations generated during the job's processing. + +### Updating Dependencies + +- Dependencies are managed with `uv` and project-local `pyproject.toml` files. +- Dependencies are stored in package definitions: + 1. **Service Dependencies** `src/pyproject.toml`. + 2. **Client Dependencies** `client/pyproject.toml`. + +- To update dependencies: + - Create a clean environment using `uv venv`. + - Update dependencies in the relevant `pyproject.toml` and validate the changes. + - Recreate the environment and install via `uv pip`. + - For example: + ```bash + uv venv .venv + source .venv/bin/activate + uv pip install -e ./src -e ./client -e ./api + ``` + +### Common Processing Patterns + +In NV-Ingest, decorators are used to enhance the functionality of functions by adding additional processing logic. These +decorators help ensure consistency, traceability, and robust error handling across the pipeline. Below, we introduce +some common decorators used in NV-Ingest, explain their usage, and provide examples. + +#### `traceable` -> `src/nv_ingest/util/tracing/tagging.py` + +The `traceable` decorator adds entry and exit trace timestamps to a `ControlMessage`'s metadata. This helps in +monitoring and debugging by recording the time taken for function execution. + +**Usage:** + +- To track function execution time with default trace names: + ```python + @traceable() + def process_message(message): + pass + ``` +- To use a custom trace name: + ```python + @traceable(trace_name="CustomTraceName") + def process_message(message): + pass + ``` + +#### `nv_ingest_node_failure_context_manager` -> `src/nv_ingest/util/exception_handlers/decorators.py` + +This decorator wraps a function with failure handling logic to manage potential failures involving `ControlMessages`. 
It +ensures that failures are managed consistently, optionally raising exceptions or annotating the `ControlMessage`. + +**Usage:** + +- To handle failures with default settings: + ```python + @nv_ingest_node_failure_context_manager(annotation_id="example_task") + def process_message(message): + pass + ``` +- To handle failures and allow empty payloads: + ```python + @nv_ingest_node_failure_context_manager(annotation_id="example_task", payload_can_be_empty=True) + def process_message(message): + pass + ``` + +#### `filter_by_task` -> `src/nv_ingest/util/flow_control/filter_by_task.py` + +The `filter_by_task` decorator checks if the `ControlMessage` contains any of the specified tasks. Each task can be a +string of the task name or a tuple of the task name and task properties. If the message does not contain any listed task +and/or task properties, the message is returned directly without calling the wrapped function, unless a forwarding +function is provided. + +**Usage:** + +- To filter messages based on tasks: + ```python + @filter_by_task(["task1", "task2"]) + def process_message(message): + pass + ``` +- To filter messages based on tasks with specific properties: + ```python + @filter_by_task([("task", {"prop": "value"})]) + def process_message(message): + pass + ``` +- To forward messages to another function. This is necessary when the decorated function does not return the message + directly, but instead forwards it to another function. In this case, the forwarding function should be provided as an + argument to the decorator. + ```python + @filter_by_task(["task1", "task2"], forward_func=other_function) + def process_message(message): + pass + ``` + +#### `cm_skip_processing_if_failed` -> `morpheus/utils/control_message_utils.py` + +The `cm_skip_processing_if_failed` decorator skips the processing of a `ControlMessage` if it has already failed. 
This
+ensures that no further processing is attempted on a failed message, maintaining the integrity of the pipeline.
+
+**Usage:**
+
+- To skip processing if the message has failed:
+  ```python
+  @cm_skip_processing_if_failed
+  def process_message(message):
+      pass
+  ```
+
+### Adding a New Stage or Module
+
+#### TODO(Devin): Add details about adding a new stage or module once we have router node functionality in place.
+
+### Common Practices for Writing Unit Tests
+
+Writing unit tests is essential for maintaining code quality and ensuring that changes do not introduce new bugs. In
+this project, we use `pytest` for running tests and adopt black-box testing principles. Below are some common practices
+for writing unit tests, which are located in the `[repo_root]/tests` directory.
+
+#### General Guidelines
+
+1. **Test Structure**: Each test module should test a specific module or functionality within the codebase. The test
+   module should be named `test_<module_name>.py` and reside at a path that mirrors that of the module it tests,
+   so that it is easily discoverable by `pytest`.
+
+   1. Example: `nv_ingest/some_path/another_path/my_module.py` should have a corresponding test file:
+      `tests/some_path/another_path/test_my_module.py`.
+
+2. **Test Functions**: Each test function should focus on a single aspect of the functionality. Use descriptive names
+   that clearly indicate what is being tested. For example, `test_function_returns_correct_value`
+   or `test_function_handles_invalid_input`.
+
+3. **Setup and Teardown**: Use `pytest` fixtures to manage setup and teardown operations for your tests. Fixtures help
+   in creating a consistent and reusable setup environment.
+
+4. **Assertions**: Use assertions to validate the behavior of the code. Ensure that the tests cover both expected
+   outcomes and edge cases. 
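To illustrate the guidelines above — descriptive test names plus coverage of expected outcomes and edge cases — here is a minimal, hypothetical sketch using `pytest.mark.parametrize`. The `clamp` helper exists only for this example and is not part of NV-Ingest:

```python
import pytest


def clamp(value, low, high):
    """Hypothetical helper: clamp value into the inclusive range [low, high]."""
    return max(low, min(value, high))


# One parametrized test covers the expected outcome and the edge cases,
# and each case is reported individually by pytest.
@pytest.mark.parametrize(
    "value, expected",
    [
        (5, 5),      # in range: returned unchanged
        (-1, 0),     # below range: clamped up to low
        (100, 10),   # above range: clamped down to high
        (0, 0),      # boundary: exactly low
        (10, 10),    # boundary: exactly high
    ],
)
def test_clamp_stays_within_bounds(value, expected):
    assert clamp(value, 0, 10) == expected
```

Running `pytest -v` on such a module reports each parametrized case separately, which makes it easy to see exactly which edge case regressed.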
+
+#### Mocking External Services
+
+When writing tests that depend on external services (e.g., databases, APIs), it is important to mock these dependencies
+to ensure that tests are reliable, fast, and do not depend on external factors.
+
+1. **Mocking Libraries**: Use libraries like `unittest.mock` to create mocks for external services. The `pytest-mock`
+   plugin can also be used to integrate mocking capabilities directly with `pytest`.
+
+2. **Mock Objects**: Create mock objects to simulate the behavior of external services. Use these mocks to test how your
+   code interacts with these services without making actual network calls or database transactions.
+
+3. **Patching**: Use `patch` to replace real objects in your code with mocks. This can be done at the function, method,
+   or object level. Ensure that patches are applied in the correct scope to avoid side effects.
+
+#### Example Test Structure
+
+Here is an example of how to structure a test module in the `[repo_root]/tests` directory:
+
+```python
+import pytest
+from unittest.mock import patch
+
+# Assuming the module to test is located at [repo_root]/module.py
+from module import function_to_test
+
+
+@pytest.fixture
+def mock_external_service():
+    with patch('module.ExternalService') as mock_service:
+        yield mock_service
+
+
+def test_function_returns_correct_value(mock_external_service):
+    # Arrange
+    mock_external_service.return_value.some_method.return_value = 'expected_value'
+
+    # Act
+    result = function_to_test()
+
+    # Assert
+    assert result == 'expected_value'
+
+
+def test_function_handles_invalid_input(mock_external_service):
+    # Arrange
+    mock_external_service.return_value.some_method.side_effect = ValueError("Invalid input")
+
+    # Act and Assert
+    with pytest.raises(ValueError, match="Invalid input"):
+        function_to_test("not-a-valid-input")
+```
-    Developer's Certificate of Origin 1.1
-    By making a contribution to this project, I certify that:
+## Submodules, Third Party Libraries, and Models
+
+### Submodules
+
+1. Submodules are used to manage third-party libraries and dependencies.
+2. Submodules should be created in the `third_party` directory.
+3. Ensure that the submodule is updated to the latest commit before making changes.
+
+### Models
+
+1. **Model Integration**: NV-Ingest is designed to be scalable and flexible, so running models directly in the pipeline
+   is discouraged.
+2. **Model Export**: Models should be exported to a format compatible with Triton Inference Server or TensorRT.
+   - Model acquisition and conversion should be documented in `triton_models/README.md`, including the model name,
+     version, pbtxt file, Triton model files, etc., along with an example of how to query the model in Triton.
+   - Models should be externally hosted and downloaded during the pipeline execution, or added via LFS.
+   - Any additional code, configuration files, or scripts required to run the model should be included in
+     the `triton_models/[MODEL_NAME]` directory.
+3. **Self-Contained Dependencies**: No assumptions should be made regarding other models or libraries being available in
+   the pipeline. All dependencies should be self-contained.
+4. **Base Triton Container**: Directions for the creation of the base Triton container are listed in
+   the `triton_models/README.md` file. If a new model requires additional base dependencies, please update
+   the `Dockerfile` in the `triton_models` directory.
+
+## Architectural Guidelines
+
+To ensure the quality and maintainability of the NV-Ingest codebase, the following architectural guidelines should be
+followed:
+
+### 1. Single Responsibility Principle (SRP)
+
+- Ensure that each module, class, or function has only one reason to change.
+
+### 2. Interface Segregation Principle (ISP)
+
+- Avoid forcing clients to depend on interfaces they do not use.
+
+### 3. Dependency Inversion Principle (DIP)
+
+- High-level modules should not depend on low-level modules; both should depend on abstractions.
+
+### 4. 
Physical Design Structure Mirroring Logical Design Structure + +- The physical layout of the codebase should reflect its logical structure. + +### 5. Levelization + +- Organize code into levels where higher-level components depend on lower-level components but not vice versa. + +### 6. Acyclic Dependencies Principle (ADP) + +- Ensure the dependency graph of packages/modules has no cycles. + +### 7. Package Cohesion Principles + +#### Common Closure Principle (CCP) + +- Package classes that change together. + +#### Common Reuse Principle (CRP) + +- Package classes that are used together. + +### 8. Encapsulate What Varies + +- Identify aspects of the application that vary and separate them from what stays the same. + +### 9. Favor Composition Over Inheritance - (a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file; or +- Utilize object composition over class inheritance for behavior reuse where possible. - (b) The contribution is based upon previous work that, to the best of my knowledge, is covered under an appropriate open source license and I have the right under that license to submit that work with modifications, whether created in whole or in part by me, under the same open source license (unless I am permitted to submit under a different license), as indicated in the file; or +### 10. Clean Separation of Concerns (SoC) - (c) The contribution was provided directly to me by some other person who certified (a), (b) or (c) and I have not modified it. +- Divide the application into distinct features with minimal overlap in functionality. - (d) I understand and agree that this project and the contribution are public and that a record of the contribution (including all personal information I submit with it, including my sign-off) is maintained indefinitely and may be redistributed consistent with this project or the open source license(s) involved. +### 11. 
Principle of Least Knowledge (Law of Demeter) + +- Objects should assume as little as possible about the structure or properties of anything else, including their + subcomponents. + +### 12. Document Assumptions and Decisions + +- Assumptions made and reasons behind architectural and design decisions should be clearly documented. + +### 13. Continuous Integration and Testing + +- Integrate code frequently into a shared repository and ensure comprehensive testing is an integral part of the + development cycle. + +Contributors are encouraged to follow these guidelines to ensure contributions are in line with the project's +architectural consistency and maintainability. + + +## Writing Good and Thorough Documentation + +As a contributor to our codebase, writing high-quality documentation is an essential part of ensuring that others can +understand and work with your code effectively. Good documentation helps to reduce confusion, facilitate collaboration, +and streamline the development process. In this guide, we will outline the principles and best practices for writing +thorough and readable documentation that adheres to the Chicago Manual of Style. + +### Chicago Manual of Style + +Our documentation follows the Chicago Manual of Style, a widely accepted standard for writing and formatting. This style +guide provides a consistent approach to writing, grammar, and punctuation, making it easier for readers to understand +and navigate our documentation. + +### Key Principles + +When writing documentation, keep the following principles in mind: + +1. **Clarity**: Use clear and concise language to convey your message. Avoid ambiguity and jargon that may confuse readers. +2. **Accuracy**: Ensure that your documentation is accurate and up-to-date. Verify facts, details, and code snippets + before publishing. +3. **Completeness**: Provide all necessary information to understand the code, including context, syntax, and examples. +4. 
**Consistency**: Use a consistent tone, voice, and style throughout the documentation. +5. **Accessibility**: Make your documentation easy to read and understand by using headings, bullet points, and short paragraphs. + +### Documentation Structure + +A well-structured documentation page should include the following elements: + +1. **Header**: A brief title that summarizes the content of the page. +2. **Introduction**: A short overview of the topic, including its purpose and relevance. +3. **Syntax and Parameters**: A detailed explanation of the code syntax, including parameters, data types, and return values. +4. **Examples**: Concrete examples that illustrate how to use the code, including input and output. +5. **Tips and Variations**: Additional information, such as best practices, common pitfalls, and alternative approaches. +6. **Related Resources**: Links to relevant documentation, tutorials, and external resources. + +### Best Practices + +To ensure high-quality documentation, follow these best practices: + +1. **Use headings and subheadings**: Organize your content with clear headings and subheadings to facilitate scanning and navigation. +2. **Use bullet points and lists**: Break up complex information into easy-to-read lists and bullet points. +3. **Provide context**: Give readers a clear understanding of the code's purpose, history, and relationships to other components. +4. **Review and edit**: Carefully review and edit your documentation to ensure accuracy, completeness, and consistency. + +### Resources + +For more information on the Chicago Manual of Style, refer to their +[online published version](https://www.chicagomanualofstyle.org/home.html?_ga=2.188145128.1312333204.1728079521-706076405.1727890116). + +By following these guidelines and principles, you will be able to create high-quality documentation that helps others +understand and work with your code effectively. 
Remember to always prioritize clarity, accuracy, and completeness, and
+to use the Chicago Manual of Style as your reference for writing and formatting.
+
+
+## Licensing
+
+NV-Ingest is licensed under the Apache-2.0 license -- ensure that any contributions are compatible.
+
+The following should be included in the header of any new files:
+
+```text
+SPDX-FileCopyrightText: Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES.
+All rights reserved.
+SPDX-License-Identifier: Apache-2.0
+```
+
+## Attribution
+
+Portions adapted from
+
+- [https://github.com/nv-morpheus/Morpheus/blob/branch-24.06/CONTRIBUTING.md](https://github.com/nv-morpheus/Morpheus/blob/branch-24.06/CONTRIBUTING.md)
+- [https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md](https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md)
+- [https://github.com/dask/dask/blob/master/docs/source/develop.rst](https://github.com/dask/dask/blob/master/docs/source/develop.rst)
diff --git a/docs/docs/extraction/contributing.md b/docs/docs/extraction/contributing.md
deleted file mode 100644
index 6a136c218..000000000
--- a/docs/docs/extraction/contributing.md
+++ /dev/null
@@ -1,4 +0,0 @@
-# Contributing to NV-Ingest
-
-External contributions to NV-Ingest will be welcome soon, and they are greatly appreciated!
-For more information, refer to [Contributing to NV-Ingest](https://github.com/NVIDIA/nv-ingest/blob/main/CONTRIBUTING.md).
diff --git a/docs/docs/extraction/helm.md b/docs/docs/extraction/helm.md
deleted file mode 100644
index 1a6e885a3..000000000
--- a/docs/docs/extraction/helm.md
+++ /dev/null
@@ -1,6 +0,0 @@
-# Deploy With Helm for NeMo Retriever Library
-
-
-
-To deploy [NeMo Retriever Library](overview.md) by using Helm,
-refer to [NV-Ingest Helm Charts](https://github.com/NVIDIA/nv-ingest/blob/release/26.1.2/helm/README.md).
diff --git a/docs/docs/extraction/quickstart-guide.md b/docs/docs/extraction/quickstart-guide.md index 6f7ab7194..761350378 100644 --- a/docs/docs/extraction/quickstart-guide.md +++ b/docs/docs/extraction/quickstart-guide.md @@ -111,7 +111,7 @@ To interact from the host, you'll need a Python environment that has the client ``` uv venv --python 3.12 nv-ingest-dev source nv-ingest-dev/bin/activate -uv pip install nv-ingest==26.1.2 nv-ingest-api==26.1.2 nv-ingest-client==26.1.2 +uv pip install nv-ingest==26.3.0 nv-ingest-api==26.3.0 nv-ingest-client==26.3.0 ``` !!! tip @@ -358,7 +358,7 @@ INFO:nv_ingest_client.cli.util.processing:Throughput (Pages/sec): 1.28 INFO:nv_ingest_client.cli.util.processing:Throughput (Files/sec): 0.43 ``` -## Step 4: Inspecting and Consuming Results +## Step 3: Inspecting and Consuming Results After the ingestion steps above have been completed, you should be able to find the `text` and `image` subfolders inside your processed docs folder. Each will contain JSON-formatted extracted content and metadata. @@ -429,6 +429,16 @@ You can specify multiple `--profile` options. | `nemotron-parse` | Advanced | Use [nemotron-parse](https://build.nvidia.com/nvidia/nemotron-parse), which adds state-of-the-art text and table extraction. For more information, refer to [Advanced Visual Parsing](nemoretriever-parse.md). | | `vlm` | Advanced | Use [llama 3.1 Nemotron 8B Vision](https://build.nvidia.com/nvidia/llama-3.1-nemotron-nano-vl-8b-v1/modelcard) for image captioning of unstructured images and infographics. This profile enables the `caption` method in the Python API to generate text descriptions of visual content. For more information, refer to [Use Multimodal Embedding](vlm-embed.md) and [Extract Captions from Images](nv-ingest-python-api.md#extract-captions-from-images). 
| +## Air-Gapped Deployment (Docker Compose) + +When deploying in an air-gapped environment (no internet or NGC registry access), you must pre-stage container images on a machine with network access, then transfer and load them in the isolated environment. + +1. On a machine with network access: Clone the repo, authenticate with NGC (`docker login nvcr.io`), and pull all images used by your chosen profile (for example, `docker compose --profile retrieval pull`). +2. Save images: Export the images to archives (for example, using `docker save` for each image or a script that saves all images referenced by your [docker-compose.yaml](https://github.com/NVIDIA/NeMo-Retriever/blob/main/docker-compose.yaml)). +3. Transfer the image archives and your `docker-compose.yaml` (and `.env` if used) to the air-gapped system. +4. On the air-gapped machine: Load the images (`docker load -i `) and start the stack with the same profile (for example, `docker compose --profile retrieval up`). + +Ensure the same image tags and `docker-compose.yaml` version are used in both environments so that service configuration stays consistent. ## Docker Compose override files @@ -515,6 +525,7 @@ This syntax and structure can be repeated for each NIM model used by CAS, ensuri Advanced features require additional GPU support and disk space. For more information, refer to [Support Matrix](support-matrix.md). + ## Related Topics - [Troubleshoot](troubleshoot.md) diff --git a/docs/docs/extraction/quickstart-library-mode.md b/docs/docs/extraction/quickstart-library-mode.md deleted file mode 100644 index c027e6a0d..000000000 --- a/docs/docs/extraction/quickstart-library-mode.md +++ /dev/null @@ -1,487 +0,0 @@ -# Deploy Without Containers (Library Mode) for NeMo Retriever Library - -[NeMo Retriever Library](overview.md) is typically deployed as a cluster of containers for robust, scalable production use. - -!!! note - - NeMo Retriever Library is also known as NVIDIA Ingest. 
- -In addition, you can use library mode, which is intended for the following cases: - -- Local development -- Experimentation and testing -- Small-scale workloads, such as workloads of fewer than 100 documents - - -By default, library mode depends on NIMs that are hosted on build.nvidia.com. -In library mode you launch the main pipeline service directly within a Python process, -while all other services (such as embedding and storage) are hosted remotely in the cloud. - -To get started using library mode, you need the following: - -- Linux operating systems (Ubuntu 22.04 or later recommended) or MacOS -- Python 3.12 -- We strongly advise using an isolated Python virtual env with [uv](https://docs.astral.sh/uv/getting-started/installation/). - - - -## Step 1: Prepare Your Environment - -Use the following procedure to prepare your environment. - -1. Run the following code to create your NV Ingest Python environment. - - ``` - uv venv --python 3.12 nvingest && \ - source nvingest/bin/activate && \ - uv pip install nv-ingest==26.1.2 nv-ingest-api==26.1.2 nv-ingest-client==26.1.2 - ``` - - By default, the pipeline uses **LanceDB** as the vector database (no extra package required). To use **Milvus** (e.g. milvus-lite) instead, also install `milvus-lite==2.4.12` and pass `milvus_uri="milvus.db"` in `vdb_upload`. For details, see [Data Upload](data-store.md). - - !!! tip - - To confirm that you have activated your virtual environment, run `which python` and confirm that you see `nvingest` in the result. You can do this before any python command that you run. - -2. Set or create a .env file that contains your NVIDIA Build API key and other environment variables. - - !!! note - - If you have an NGC API key, you can use it here. For more information, refer to [Generate Your NGC Keys](ngc-api-key.md) and [Environment Configuration Variables](environment-config.md). - - - To set your variables, use the following code. 
- - ``` - export NVIDIA_API_KEY=nvapi- - ``` - - To add your variables to a .env file, include the following. - - ``` - NVIDIA_API_KEY=nvapi- - ``` - - -## Step 2: Ingest Documents - -You can submit jobs programmatically by using Python. - -!!! tip - - For more Python examples, refer to [NV-Ingest: Python Client Quick Start Guide](https://github.com/NVIDIA/nv-ingest/blob/main/client/client_examples/examples/python_client_usage.ipynb). - - -If you have a very high number of CPUs, and see the process hang without progress, -we recommend that you use `taskset` to limit the number of CPUs visible to the process. -Use the following code. - -``` -taskset -c 0-3 python your_ingestion_script.py -``` - -On a 4 CPU core low end laptop, the following code should take about 10 seconds. - -```python -import time - -from nv_ingest.framework.orchestration.ray.util.pipeline.pipeline_runners import run_pipeline -from nv_ingest_client.client import Ingestor, NvIngestClient -from nv_ingest_api.util.message_brokers.simple_message_broker import SimpleClient -from nv_ingest_client.util.process_json_files import ingest_json_results_to_blob - -def main(): - # Start the pipeline subprocess for library mode - run_pipeline(block=False, disable_dynamic_scaling=True, run_in_subprocess=True) - - client = NvIngestClient( - message_client_allocator=SimpleClient, - message_client_port=7671, - message_client_hostname="localhost", - ) - - # Optional: use Milvus (e.g. milvus-lite) by providing milvus_uri and installing milvus-lite. - # By default, LanceDB is used and no milvus_uri is needed. 
-    # milvus_uri = "milvus.db"
-    collection_name = "test"
-    sparse = False
-
-    # Extract content from files
-    ingestor = (
-        Ingestor(client=client)
-        .files("data/multimodal_test.pdf")
-        .extract(
-            extract_text=True,
-            extract_tables=True,
-            extract_charts=True,
-            extract_images=True,
-            table_output_format="markdown",
-            extract_infographics=True,
-            # extract_method="nemotron_parse",  # Slower, but maximally accurate, especially for PDFs with pages that are scanned images
-            text_depth="page",
-        )
-        .embed()
-        .vdb_upload(
-            collection_name=collection_name,
-            # milvus_uri=milvus_uri,  # Uncomment to use Milvus instead of LanceDB
-            sparse=sparse,
-            # Use 2048 for the llama-3.2 embedder, or 1024 for e5-v5
-            dense_dim=2048,
-        )
-    )
-
-    print("Starting ingestion..")
-    t0 = time.time()
-
-    # Return both successes and failures.
-    # Use for large batches where you want successful chunks/pages to be committed, while collecting detailed diagnostics for failures.
-    results, failures = ingestor.ingest(show_progress=True, return_failures=True)
-
-    # Return only successes
-    # results = ingestor.ingest(show_progress=True)
-
-    t1 = time.time()
-    print(f"Total time: {t1 - t0} seconds")
-
-    # The results blob is directly inspectable
-    if results:
-        print(ingest_json_results_to_blob(results[0]))
-
-    # (Optional) Review any failures that were returned
-    if failures:
-        print(f"There were {len(failures)} failures. Sample: {failures[0]}")
-
-if __name__ == "__main__":
-    main()
-```
-
-!!! note
-
-    For advanced visual parsing with library mode, uncomment `extract_method="nemotron_parse"` in the previous code. For more information, refer to [Advanced Visual Parsing](nemoretriever-parse.md).
-
-
-You can see the extracted text that represents the content of the ingested test document.
-
-```shell
-Starting ingestion..
-Total time: 9.243880033493042 seconds - -TestingDocument -A sample document with headings and placeholder text -Introduction -This is a placeholder document that can be used for any purpose. It contains some -headings and some placeholder text to fill the space. The text is not important and contains -no real value, but it is useful for testing. Below, we will have some simple tables and charts -that we can use to confirm Ingest is working as expected. -Table 1 -This table describes some animals, and some activities they might be doing in specific -locations. -Animal Activity Place -Gira@e Driving a car At the beach -Lion Putting on sunscreen At the park -Cat Jumping onto a laptop In a home o@ice -Dog Chasing a squirrel In the front yard -Chart 1 -This chart shows some gadgets, and some very fictitious costs. - -... document extract continues ... -``` - -## Step 3: Query Ingested Content - -To query for relevant snippets of the ingested content, and use them with an LLM to generate answers, use the following code. With the default LanceDB backend, use the LanceDB retrieval API (see [Data Upload](data-store.md)). The example below shows retrieval when using Milvus (e.g. milvus-lite). - -```python -import os -from openai import OpenAI -from nv_ingest_client.util.milvus import nvingest_retrieval - -# Only needed when using Milvus (e.g. 
milvus-lite) instead of LanceDB -milvus_uri = "milvus.db" -collection_name = "test" -sparse = False - -queries = ["Which animal is responsible for the typos?"] - -retrieved_docs = nvingest_retrieval( - queries, - collection_name, - milvus_uri=milvus_uri, - hybrid=sparse, - top_k=1, -) - -# simple generation example -extract = retrieved_docs[0][0]["entity"]["text"] -client = OpenAI( - base_url = "https://integrate.api.nvidia.com/v1", - api_key = os.environ["NVIDIA_API_KEY"] -) - -prompt = f"Using the following content: {extract}\n\n Answer the user query: {queries[0]}" -print(f"Prompt: {prompt}") -completion = client.chat.completions.create( - model="nvidia/llama-3.1-nemotron-nano-vl-8b-v1", - messages=[{"role":"user","content": prompt}], -) -response = completion.choices[0].message.content - -print(f"Answer: {response}") -``` - -```shell -Prompt: Using the following content: Table 1 -| This table describes some animals, and some activities they might be doing in specific locations. | This table describes some animals, and some activities they might be doing in specific locations. | This table describes some animals, and some activities they might be doing in specific locations. | -| Animal | Activity | Place | -| Giraffe | Driving a car | At the beach | -| Lion | Putting on sunscreen | At the park | -| Cat | Jumping onto a laptop | In a home office | -| Dog | Chasing a squirrel | In the front yard | - - Answer the user query: Which animal is responsible for the typos? -Answer: A clever query! - -Based on the provided Table 1, I'd make an educated inference to answer your question. Since the activities listed are quite unconventional for the respective animals (e.g., a giraffe driving a car, a lion putting on sunscreen), it's likely that the table is using humor or hypothetical scenarios. - -Given this context, the question "Which animal is responsible for the typos?" 
is probably a tongue-in-cheek inquiry, as there's no direct information in the table about typos or typing activities.
-
-However, if we were to make a playful connection, we could look for an animal that's:
-
-1. Typically found in a setting where typing might occur (e.g., an office).
-2. Engaging in an activity that could potentially lead to typos (e.g., interacting with a typing device).
-
-Based on these loose criteria, I'd jokingly point to:
-
-**Cat** as the potential culprit, since it's:
-    * Located "In a home office"
-    * Engaged in "Jumping onto a laptop", which could theoretically lead to accidental keystrokes or typos if the cat were to start "walking" on the keyboard!
-
-Please keep in mind that this response is purely humorous and interpretative, as the table doesn't explicitly mention typos or provide a straightforward answer to the question.
-```
-
-
-
-## Logging Configuration
-
-NeMo Retriever Library uses [Ray](https://docs.ray.io/en/latest/index.html) for logging.
-For details, refer to [Configure Ray Logging](ray-logging.md).
-
-By default, library mode runs in quiet mode to minimize startup noise.
-Quiet mode automatically configures the following environment variables.
-
-| Variable | Quiet Mode Value | Description |
-|--------------------------------------|------------------|-------------|
-| `INGEST_RAY_LOG_LEVEL` | `PRODUCTION` | Sets Ray logging to ERROR level to reduce noise. |
-| `RAY_ACCEL_ENV_VAR_OVERRIDE_ON_ZERO` | `0` | Silences Ray accelerator warnings. |
-| `OTEL_SDK_DISABLED` | `true` | Disables OpenTelemetry trace export errors. |
-
-
-If you want to see detailed startup logs for debugging, use one of the following options:
-
-- Set `quiet=False` when you run the pipeline as shown following.
-
-    ```python
-    run_pipeline(block=False, disable_dynamic_scaling=True, run_in_subprocess=True, quiet=False)
-    ```
-
-- Set the environment variables manually before you run the pipeline as shown following.
- - ```bash - export INGEST_RAY_LOG_LEVEL=DEVELOPMENT # or DEBUG for maximum verbosity - ``` - - - -## Library Mode Communication and Advanced Examples - -Communication in library mode is handled through a simplified, 3-way handshake message broker called `SimpleBroker`. - -Attempting to run a library-mode process co-located with a Docker Compose deployment does not work by default. -The Docker Compose deployment typically creates a firewall rule or port mapping that captures traffic to port `7671`, -which prevents the `SimpleBroker` from receiving messages. -Always ensure that you use library mode in isolation, without an active containerized deployment listening on the same port. - - -### Example `launch_libmode_service.py` - -This example launches the pipeline service in a subprocess, -and keeps it running until it is interrupted (for example, by pressing `Ctrl+C`). -It listens for ingestion requests on port `7671` from an external client. - -```python -import logging -import os - -from nv_ingest.framework.orchestration.ray.util.pipeline.pipeline_runners import run_pipeline -from nv_ingest_api.util.logging.configuration import configure_logging as configure_local_logging - -# Configure the logger -logger = logging.getLogger(__name__) - -local_log_level = os.getenv("INGEST_LOG_LEVEL", "DEFAULT") -if local_log_level in ("DEFAULT",): - local_log_level = "INFO" - -configure_local_logging(local_log_level) - - -def main(): - """ - Launch the libmode pipeline service using the embedded default configuration. - """ - try: - # Start pipeline and block until interrupted - # Note: stdout/stderr cannot be passed when run_in_subprocess=True (not picklable) - # Use quiet=False to see verbose startup logs - _ = run_pipeline( - block=True, - disable_dynamic_scaling=True, - run_in_subprocess=True, - ) - except KeyboardInterrupt: - logger.info("Keyboard interrupt received. 
Shutting down...") - except Exception as e: - logger.error(f"An unexpected error occurred: {e}", exc_info=True) - - -if __name__ == "__main__": - main() -``` - -### Example `launch_libmode_and_run_ingestor.py` - -This example starts the pipeline service in-process, -and immediately runs an ingestion client against it in the same parent process. - -```python -import logging -import os -import time - -from nv_ingest.framework.orchestration.ray.util.pipeline.pipeline_runners import run_pipeline -from nv_ingest_api.util.logging.configuration import configure_logging as configure_local_logging -from nv_ingest_api.util.message_brokers.simple_message_broker import SimpleClient -from nv_ingest_client.client import Ingestor -from nv_ingest_client.client import NvIngestClient - -# Configure the logger -logger = logging.getLogger(__name__) - -local_log_level = os.getenv("INGEST_LOG_LEVEL", "INFO") -if local_log_level in ("DEFAULT",): - local_log_level = "INFO" - -configure_local_logging(local_log_level) - - -def run_ingestor(): - """ - Set up and run the ingestion process to send traffic against the pipeline. 
- """ - logger.info("Setting up Ingestor client...") - client = NvIngestClient( - message_client_allocator=SimpleClient, message_client_port=7671, message_client_hostname="localhost" - ) - - ingestor = ( - Ingestor(client=client) - .files("./data/multimodal_test.pdf") - .extract( - extract_text=True, - extract_tables=True, - extract_charts=True, - extract_images=True, - table_output_format="markdown", - extract_infographics=False, - text_depth="page", - ) - .split(chunk_size=1024, chunk_overlap=150) - .embed() - ) - - try: - results, _ = ingestor.ingest(show_progress=False, return_failures=True) - logger.info("Ingestion completed successfully.") - except Exception as e: - logger.error(f"Ingestion failed: {e}") - raise - - print("\nIngest done.") - print(f"Got {len(results)} results.") - - -def main(): - """ - Launch the libmode pipeline service and run the ingestor against it. - Uses the embedded default libmode pipeline configuration. - """ - pipeline = None - try: - # Start pipeline in subprocess - # Note: stdout/stderr cannot be passed when run_in_subprocess=True (not picklable) - # Use quiet=False to see verbose startup logs - pipeline = run_pipeline( - block=False, - disable_dynamic_scaling=True, - run_in_subprocess=True, - ) - time.sleep(10) - run_ingestor() - # Run other code... - except KeyboardInterrupt: - logger.info("Keyboard interrupt received. Shutting down...") - except Exception as e: - logger.error(f"Error running pipeline: {e}") - finally: - if pipeline: - pipeline.stop() - logger.info("Shutting down pipeline...") - - -if __name__ == "__main__": - main() -``` - - - -## The `run_pipeline` Function Reference - -The `run_pipeline` function is the main entry point to start the Nemo Retriever Extraction pipeline. -It can run in-process or as a subprocess. - -The `run_pipeline` function accepts the following parameters. - -| Parameter | Type | Default | Required? 
| Description |
-|--------------------------|------------------------|---------|-----------|-------------------------------------------------|
-| pipeline_config | PipelineConfigSchema | None | No | A configuration object that specifies how the pipeline should be constructed. If `None`, a default configuration is used. |
-| run_in_subprocess | bool | False | No | `True` to launch the pipeline in a separate Python subprocess. `False` to run in the current process. |
-| block | bool | True | No | `True` to run the pipeline synchronously. The function returns after it finishes. `False` to return an interface for external pipeline control. |
-| disable_dynamic_scaling | bool | None | No | `True` to disable autoscaling regardless of global settings. `None` to use the global default behavior. |
-| dynamic_memory_threshold | float | None | No | A value between `0.0` and `1.0`. If dynamic scaling is enabled, triggers autoscaling when memory usage crosses this threshold. |
-| stdout | TextIO | None | No | Redirect the subprocess `stdout` to a file or stream. If `None`, defaults to `/dev/null`. |
-| stderr | TextIO | None | No | Redirect the subprocess `stderr` to a file or stream. If `None`, defaults to `/dev/null`. |
-| libmode | bool | True | No | `True` to load the default library mode pipeline configuration when `ingest_config` is `None`. |
-| quiet | bool | None | No | `True` to suppress verbose startup logs (PRODUCTION preset). `None` defaults to `True` when `libmode=True`. Set to `False` for verbose output. |
-
-
-The `run_pipeline` function returns the following values, depending on the parameters that you set:
-
-- **run_in_subprocess=False and block=True** — The function returns a `float` that represents the elapsed time in seconds.
-- **run_in_subprocess=False and block=False** — The function returns a `RayPipelineInterface` object.
-- **run_in_subprocess=True and block=True** — The function returns `0.0`.
-- **run_in_subprocess=True and block=False** — The function returns a `RayPipelineInterface` object.
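The four return cases depend only on the two flags, so they can be restated as a small lookup for quick reference. The following is a hypothetical sketch for illustration only — `expected_return_type` is not part of the nv-ingest API; it simply encodes the documented return types:

```python
def expected_return_type(run_in_subprocess: bool, block: bool) -> str:
    """Map run_pipeline flag combinations to their documented return types.

    Illustrative helper only; not part of the nv-ingest API.
    """
    if not block:
        # Non-blocking calls always return a control interface,
        # whether the pipeline runs in-process or in a subprocess.
        return "RayPipelineInterface"
    # Blocking calls return the elapsed time in-process, or 0.0 from a subprocess.
    return "0.0" if run_in_subprocess else "float (elapsed seconds)"


for subprocess_flag in (False, True):
    for block_flag in (True, False):
        print(
            f"run_in_subprocess={subprocess_flag}, block={block_flag} -> "
            f"{expected_return_type(subprocess_flag, block_flag)}"
        )
```

In practice this means that any non-blocking call hands you a `RayPipelineInterface` to control and stop the pipeline, regardless of where it runs.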
-
-
-The `run_pipeline` function raises the following errors:
-
-- **RuntimeError** — The subprocess failed to start or exited with an error.
-- **Exception** — Any other failure during pipeline setup or execution.
-
-
-
-## Related Topics
-
-- [Prerequisites](prerequisites.md)
-- [Support Matrix](support-matrix.md)
-- [Deploy With Docker Compose (Self-Hosted)](quickstart-guide.md)
-- [Deploy With Helm](helm.md)
-- [Notebooks](notebooks.md)
-- [Enterprise RAG Blueprint](https://build.nvidia.com/nvidia/multimodal-pdf-data-extraction-for-enterprise-rag)
diff --git a/docs/docs/extraction/releasenotes-nv-ingest.md b/docs/docs/extraction/releasenotes-nv-ingest.md
index fb4b847f8..d3b71b4a5 100644
--- a/docs/docs/extraction/releasenotes-nv-ingest.md
+++ b/docs/docs/extraction/releasenotes-nv-ingest.md
@@ -4,68 +4,38 @@ This documentation contains the release notes for [NeMo Retriever Library](overv
 
 !!! note
 
-    NeMo Retriever Library is also known as NVIDIA Ingest.
-
-
-
-## Release 26.01 (26.1.2)
-
-The NeMo Retriever Library 26.01 release adds new hardware and software support, and other improvements.
-
-To upgrade the Helm Charts for this version, refer to [NV-Ingest Helm Charts](https://github.com/NVIDIA/nv-ingest/blob/release/26.1.2/helm/README.md).
-
-
-### Highlights
-
-This release contains the following key changes:
-
-- Added functional support for [H200 NVL](https://www.nvidia.com/en-us/data-center/h200/). For details, refer to [Support Matrix](support-matrix.md).
-- All Helm deployments for Kubernetes now use [NVIDIA NIM Operator](https://docs.nvidia.com/nim-operator/latest/index.html). For details, refer to [NV-Ingest Helm Charts](https://github.com/NVIDIA/nv-ingest/blob/release/26.1.2/helm/README.md).
-- Updated RIVA NIM to version 1.4.0. For details, refer to [Extract Speech](audio.md).
-- Updated VLM NIM to [nemotron-nano-12b-v2-vl](https://build.nvidia.com/nvidia/nemotron-nano-12b-v2-vl/modelcard).
For details, refer to [Extract Captions from Images](nv-ingest-python-api.md#extract-captions-from-images).
-- Added VLM caption prompt customization parameters, including reasoning control. For details, refer to [Caption Images and Control Reasoning](nv-ingest-python-api.md#caption-images-and-control-reasoning).
-- Added support for the [nemotron-parse](https://build.nvidia.com/nvidia/nemotron-parse/modelcard) model, which replaces the [nemoretriever-parse](https://build.nvidia.com/nvidia/nemoretriever-parse/modelcard) model. For details, refer to [Advanced Visual Parsing](nemoretriever-parse.md).
-- Support for [paddleocr](https://build.nvidia.com/baidu/paddleocr/modelcard) is now deprecated.
-- The `meta-llama/Llama-3.2-1B` tokenizer is now pre-downloaded so that you can run token-based splitting without making a network request. For details, refer to [Split Documents](chunking.md).
-- For scanned PDFs, added specialized extraction strategies. For details, refer to [PDF Extraction Strategies](nv-ingest-python-api.md#pdf-extraction-strategies).
-- [LanceDB](https://lancedb.com/) is now the default vector database backend; Milvus remains fully supported. For details, refer to [Data Upload](data-store.md).
-- The V2 API is now available and is the default processing pipeline. The response format remains backwards-compatible. You can enable the V2 API by using `message_client_kwargs={"api_version": "v2"}`. For details, refer to [API Reference](api-docs).
-- Large PDFs are now automatically split into chunks and processed in parallel, delivering faster ingestion for long documents. For details, refer to [PDF Pre-Splitting](v2-api-guide.md).
-- Issues maintaining extraction quality while processing very large files are now resolved with the V2 API. For details, refer to [V2 API Guide](v2-api-guide.md).
-- Updated the embedding task to support embedding on custom content fields, such as the results of summarization functions.
For details, refer to [Use the Python API](nv-ingest-python-api.md).
-- User-defined function summarization now uses `nemotron-mini-4b-instruct`, which provides significant speed improvements. For details, refer to [User-defined Functions](user-defined-functions.md) and [NV-Ingest UDF Examples](https://github.com/NVIDIA/nv-ingest/blob/release/26.1.2/examples/udfs/README.md).
-- In the `Ingestor.extract` method, the defaults for `extract_text` and `extract_images` are now set to `true` for consistency with `extract_tables` and `extract_charts`. For details, refer to [Use the Python API](nv-ingest-python-api.md).
-- The `table-structure` profile is no longer available; its functionality is now part of the default profile. For details, refer to [Profile Information](quickstart-guide.md#profile-information).
-- New documentation [Why Throughput Is Dataset-Dependent](throughput-is-dataset-dependent.md).
-- New documentation [Add User-defined Stages](user-defined-stages.md).
-- New documentation [Add User-defined Functions](user-defined-functions.md).
-- New documentation [Resource Scaling Modes](scaling-modes.md).
-- New documentation [NimClient Usage](nimclient.md).
-- New documentation [Use the API (V2)](v2-api-guide.md).
-
-
-
-### Fixed Known Issues
-
-The following are the known issues that are fixed in this version:
-
-- A10G support is restored. To use A10G hardware, use release 26.1.2 or later. For details, refer to [Support Matrix](support-matrix.md).
-- L40S support is restored. To use L40S hardware, use release 26.1.2 or later. For details, refer to [Support Matrix](support-matrix.md).
-- The page number field in the content metadata now starts at 1 instead of 0, so page numbers are no longer off by one. For details, refer to [Content Metadata](content-metadata.md).
-- Support for batches that include individual files greater than approximately 400 MB is restored. This includes audio files and PDFs.
- - - -## All Known Issues - -The following are the known issues for NeMo Retriever Library: - -- Advanced visual parsing is not supported on RTX Pro 6000, B200, or H200 NVL. For details, refer to [Advanced Visual Parsing](advanced-visual-parsing.md) and [Support Matrix](support-matrix.md). -- The Page Elements NIM (`nemoretriever-page-elements-v3:1.7.0`) may intermittently fail during inference under high-concurrency workloads. This happens when Triton’s dynamic batching combines requests that exceed the model’s maximum batch size, a situation more commonly seen in multi-GPU setups or large ingestion runs. In these cases, extraction fails for the impacted documents. A correction is planned for `nemoretriever-page-elements-v3:1.7.1`. - + NVIDIA Ingest (nv-ingest) has been renamed to the NeMo Retriever Library. + +## 26.03 Release Notes (26.3.0) + +NVIDIA® NeMo Retriever Library version 26.03 adds broader hardware and software support along with many pipeline, evaluation, and deployment enhancements. + +To upgrade the Helm charts for this release, refer to the [NeMo Retriever Library Helm Charts](https://github.com/NVIDIA/NeMo-Retriever/blob/release/26.3.0/helm/README.md). 
+
+Highlights for the 26.03 release include:
+
+- The NV-Ingest GitHub repo is renamed to NeMo-Retriever
+- The NeMo Retriever Extraction pipeline is renamed to NeMo Retriever Library
+- NeMo Retriever Library now supports two deployment options:
+  - A new no-container, pip-installable in-process library for development (available on PyPI)
+  - The existing production-ready Helm chart with NIMs
+- Added documentation notes on air-gapped deployment support
+- Added documentation notes on OpenShift support
+- Added support for the RTX 4500 Pro Blackwell SKU
+- Added support for llama-nemotron-embed-vl-v2 in text and text+image modes
+- New extract methods `pdfium_hybrid` and `ocr` target scanned PDFs to improve text and layout extraction from image-based pages
+- VLM-based image caption enhancements:
+  - Infographics can be captioned
+  - Reasoning mode is configurable
+- Enabled hybrid search with LanceDB
+- Added a `retrieval_bench` subfolder with a generalizable agentic retrieval pipeline
+- The project now uses `uv` as the primary environment and package manager instead of Conda, resulting in faster installs and simpler dependency handling
+- Default Redis TTL increased from 1–2 hours to 48 hours so long-running jobs (e.g., VLM captioning) don’t expire before completion
+- NeMo Retriever Library currently does not support image captioning via VLM; this feature will be added in the next release
 
 ## Release Notes for Previous Versions
[24.12.1](https://docs.nvidia.com/nemo/retriever/25.3.0/extraction/releasenotes-nv-ingest/) | [24.12.0](https://docs.nvidia.com/nemo/retriever/25.3.0/extraction/releasenotes-nv-ingest/) -| - - ## Related Topics - [Prerequisites](prerequisites.md) - [Deploy Without Containers (Library Mode)](quickstart-library-mode.md) - [Deploy With Docker Compose (Self-Hosted)](quickstart-guide.md) -- [Deploy With Helm](helm.md) +- [Deploy With Helm](helm.md) \ No newline at end of file diff --git a/docs/docs/extraction/support-matrix.md b/docs/docs/extraction/support-matrix.md index 5dbc49508..e1005f528 100644 --- a/docs/docs/extraction/support-matrix.md +++ b/docs/docs/extraction/support-matrix.md @@ -6,18 +6,18 @@ Before you begin using [NeMo Retriever Library](overview.md), ensure that you ha NeMo Retriever Library is also known as NVIDIA Ingest. - ## Core and Advanced Pipeline Features -The Nemo Retriever extraction core pipeline features run on a single A10G or better GPU. +The NeMo Retriever Library core pipeline features run on a single A10G or better GPU. + The core pipeline features include the following: -- llama3.2-nv-embedqa-1b-v2 — Embedding model for converting text chunks into vectors. -- nemoretriever-page-elements-v3 — Detects and classifies images on a page as a table, chart or infographic. -- nemoretriever-table-structure-v1 — Detects rows, columns, and cells within a table to preserve table structure and convert to Markdown format. -- nemoretriever-graphic-elements-v1 — Detects graphic elements within chart images such as titles, legends, axes, and numerical values. -- nemoretriever-ocr-v1 — Image OCR model to detect and extract text from images. -- retrieval — Enables embedding and indexing into LanceDB (default) or Milvus. +- llama-nemotron-embed-1b-v2 — Embedding model for converting text chunks into vectors. +- nemotron-page-elements-v3 — Detects and classifies images on a page as a table, chart or infographic. 
+- nemotron-table-structure-v1 — Detects rows, columns, and cells within a table to preserve table structure and convert to Markdown format. +- nemotron-graphic-elements-v1 — Detects graphic elements within chart images such as titles, legends, axes, and numerical values. +- nemotron-ocr-v1 — Image OCR model to detect and extract text from images. +- retrieval — Enables embedding and indexing into Milvus. Advanced features require additional GPU support and disk space. This includes the following: diff --git a/docs/docs/extraction/user-defined-functions.md b/docs/docs/extraction/user-defined-functions.md index 74a710698..95a8873ee 100644 --- a/docs/docs/extraction/user-defined-functions.md +++ b/docs/docs/extraction/user-defined-functions.md @@ -5,7 +5,7 @@ This guide covers how to write, validate, and submit UDFs using both the CLI and !!! note - NeMo Retriever Library is also known as NVIDIA Ingest. + NVIDIA Ingest (nv-ingest) has been renamed to the NeMo Retriever Library.