UN-3149 [FEAT] Support CSV, TXT, and Excel files in Prompt Studio by jagadeeswaran-zipstack · Pull Request #1783 · Zipstack/unstract

jagadeeswaran-zipstack · 2026-02-08T20:27:48Z

What

Add support for CSV, TXT, and Excel files in Prompt Studio for enterprise users. These file types are stored as-is (no PDF conversion) and displayed in Raw View only.

Why

Users need to process structured data files (CSV, Excel) and plain text files in Prompt Studio
These file types don't benefit from PDF conversion and should be handled natively
Enables extraction from a wider variety of document formats

How

Upload flow: Original files always stored in main directory. For convertible types (DOCX, images, PPT), a converted PDF is also stored in converted/ subdirectory for preview
CSV/TXT/Excel: Skip PDF conversion entirely, stored as-is in main directory
Preview resolution: fetch_contents_ide checks converted/{name}.pdf first for enterprise; falls back to original file
Frontend: Auto-switches to Raw View for non-PDF file types; allows direct upload of CSV/TXT/Excel without confirmation modal
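
The preview-resolution rule above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: `resolve_preview_path` is a hypothetical name, a local directory stands in for the remote file storage, `has_converter` stands in for the enterprise `file_converter` plugin check, and `{name}` is assumed to mean the file name without its extension (as the testing notes suggest with `file.docx` and `converted/file.pdf`).

```python
from pathlib import Path


def resolve_preview_path(doc_dir: Path, file_name: str, has_converter: bool) -> Path:
    """Pick the file to serve for the original (PDF) view.

    Hypothetical helper: prefer converted/{stem}.pdf when a converter
    plugin is available, otherwise fall back to the original upload.
    """
    if has_converter:
        # Assumption: the converted preview reuses the original stem.
        converted = doc_dir / "converted" / f"{Path(file_name).stem}.pdf"
        if converted.exists():
            return converted
    # OSS builds, CSV/TXT/Excel, and PDFs all resolve to the original file.
    return doc_dir / file_name
```

With this shape, a CSV upload never has a `converted/` entry, so it resolves to the original file even when the plugin is present.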

Files Changed:

backend/prompt_studio/prompt_studio_core_v2/views.py - Upload stores original + converted; fetch resolves converted PDF
backend/utils/file_storage/helpers/prompt_studio_file_helper.py - Add converted/ dir, upload_converted_for_ide(), handle CSV/Excel MIME
frontend/.../DocumentManager.jsx - Use mimeType for rendering, auto-switch to Raw View
frontend/.../ManageDocsModal.jsx - Allow CSV/TXT/Excel in upload validation

Can this PR break any existing features. If yes, please list possible items. If no, please explain why.

No - Changes are enterprise-gated via get_plugin("file_converter"). OSS behavior is unchanged (PDF-only). Existing DOCX/image/PPT uploads now store both original AND converted PDF, maintaining backward compatibility for preview while enabling original file indexing.

Database Migrations

None

Env Config

None

Related Issues or PRs

UN-3149
Companion PR in unstract-cloud: https://github.com/Zipstack/unstract-cloud/pull/1267

Notes on Testing

Upload DOCX → Verify file.docx in main dir + converted/file.pdf exists → PDF View shows PDF → Index uses original DOCX
Upload CSV → Verify data.csv in main dir, no converted file → Auto-switches to Raw View → Index uses CSV
Upload TXT → Same as CSV flow
Upload Excel (.xlsx/.xlsm) → Same as CSV flow, shows placeholder message in PDF View
Upload PDF → No change from existing behavior
Delete document → Verify cleanup includes converted/ directory

Checklist

I have read and understood the Contribution Guidelines.

- Store original files in main directory, converted PDFs in converted/ subdir
- CSV/TXT/Excel skip PDF conversion entirely and show only Raw View
- Backend resolves converted PDF transparently for preview (fetch_contents_ide)
- Frontend auto-switches to Raw View for non-PDF file types
- Add upload_converted_for_ide() helper method
- Include converted/ in directory creation and delete cleanup

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

coderabbitai · 2026-02-08T20:28:07Z

Summary by CodeRabbit

Release Notes

New Features
- Added support for uploading CSV, TXT, and Excel files in Prompt Studio alongside PDFs; these files are stored as-is, without PDF conversion
- Files that have no PDF preview automatically switch to Raw View
- CSV, TXT, and Excel files upload directly, without the confirmation modal

Walkthrough

Adds a conversion-to-PDF preview flow: backend stores converted PDFs under a new converted/ subdirectory while preserving original files; frontend uses MIME type to choose PDF preview or a non-PDF placeholder and allows direct uploads for selected MIME types.

Changes

Cohort / File(s)	Summary
Backend PDF Conversion Logic `backend/prompt_studio/prompt_studio_core_v2/views.py`	`fetch_contents_ide` attempts to return a converted PDF from `converted/{name}.pdf` for ORIGINAL view when a file_converter_plugin exists (falls back on FileNotFound). `upload_for_ide` may convert non-PDFs (when plugin indicates) to PDF, store converted preview under `converted/`, then persist the original file and create the document record with the original filename.
File Storage & Management `backend/utils/file_storage/helpers/prompt_studio_file_helper.py`	Adds `converted/` subdirectory creation and removal during setup/delete. New public method `upload_converted_for_ide` stores converted files under `converted/`. `fetch_file_contents` extended to treat `text/plain`/`text/csv` as text and return placeholders for Excel MIME types. Deletion now cascades to remove converted files.
Frontend Document Preview `frontend/src/components/custom-tools/document-manager/DocumentManager.jsx`	Switched rendering to use `mimeType`: PDFs render via PdfViewer; non-PDFs are handled as non-base64 text (no blob URL) and show a preview-unavailable placeholder while auto-switching to Raw View. Content-available logic updated to consider non-empty `mimeType` for non-PDFs.
Frontend Upload Control `frontend/src/components/custom-tools/manage-docs-modal/ManageDocsModal.jsx`	Introduces `DIRECT_UPLOAD_TYPES` whitelist (PDF, plain text, CSV, common Excel MIME types). Whitelisted MIME types upload immediately; others trigger confirmation/alternate flow.
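
The MIME-based branching summarized for `fetch_file_contents` might look roughly like the sketch below. It is an illustration only: `preview_payload`, the constant names, and the placeholder wording are hypothetical, base64 encoding for PDFs is an assumption inferred from the "non-base64 text" remark about non-PDFs, and the Excel MIME types are the ones listed for `DIRECT_UPLOAD_TYPES` later in this review.

```python
import base64

TEXT_MIME_TYPES = {"text/plain", "text/csv"}
EXCEL_MIME_TYPES = {
    "application/vnd.ms-excel",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "application/vnd.ms-excel.sheet.macroenabled.12",
}
# Illustrative wording; the real placeholder text is not shown in the PR page.
EXCEL_PLACEHOLDER = "Preview is not available for Excel files."


def preview_payload(mime_type: str, raw: bytes) -> dict[str, str]:
    """Build the content returned to the frontend for one file (hypothetical)."""
    if mime_type == "application/pdf":
        # Assumption: PDFs travel as base64 so the client can build a blob.
        data = base64.b64encode(raw).decode("ascii")
    elif mime_type in TEXT_MIME_TYPES:
        # Plain text and CSV are returned as text for Raw View.
        data = raw.decode("utf-8", errors="replace")
    elif mime_type in EXCEL_MIME_TYPES:
        # Excel has no inline preview; return a placeholder string.
        data = EXCEL_PLACEHOLDER
    else:
        # Fail loudly so no branch leaves the result undefined.
        raise ValueError(f"Unsupported MIME type: {mime_type}")
    return {"mime_type": mime_type, "data": data}
```

Raising in the final branch is a design choice for the sketch; it avoids the fall-through problem flagged for `text_content_string` later in this review.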

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant BackendAPI
    participant Converter
    participant FileStorage
    participant Database

    Client->>BackendAPI: upload_for_ide(file)
    BackendAPI->>BackendAPI: check file_converter_plugin & should_convert_to_pdf
    alt convert to PDF
        BackendAPI->>Converter: convert file -> PDF
        Converter-->>BackendAPI: converted PDF
        BackendAPI->>FileStorage: upload converted PDF to converted/{name}.pdf
        BackendAPI->>FileStorage: store original file in main dir
    else store original
        BackendAPI->>FileStorage: store original file in main dir
    end
    BackendAPI->>Database: create document record (original filename)
    BackendAPI-->>Client: upload response

    Client->>BackendAPI: fetch_contents_ide(file, view=ORIGINAL)
    alt converted PDF exists
        BackendAPI->>FileStorage: fetch converted/{name}.pdf
        FileStorage-->>BackendAPI: converted PDF data
    else fetch original
        BackendAPI->>FileStorage: fetch original file
        FileStorage-->>BackendAPI: original data
    end
    BackendAPI-->>Client: file content response

    Client->>Client: render by mimeType (PDF → PdfViewer, non-PDF → placeholder / Raw View)
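
The upload half of the diagram can be approximated with the sketch below. Everything here is a stand-in: `upload_document` is a hypothetical name, a plain dict plays the role of file storage, and `converter` and `should_convert` are injected callables that mimic, but are not, the enterprise plugin's `process_file` and `should_convert_to_pdf`.

```python
from typing import Callable, Optional

# (file bytes, file name) -> (converted PDF bytes, converted file name)
Converter = Callable[[bytes, str], tuple[bytes, str]]


def upload_document(
    storage: dict[str, bytes],
    file_name: str,
    data: bytes,
    mime_type: str,
    converter: Optional[Converter],
    should_convert: Callable[[str], bool],
) -> list[str]:
    """Store the original file, plus a converted PDF preview when applicable.

    Returns the storage keys written, in order.
    """
    written: list[str] = []
    if (
        converter is not None
        and mime_type != "application/pdf"
        and should_convert(mime_type)
    ):
        pdf_bytes, pdf_name = converter(data, file_name)
        key = f"converted/{pdf_name}"
        storage[key] = pdf_bytes
        written.append(key)
    # The original is always kept in the main directory, so indexing
    # can use it regardless of whether a preview was generated.
    storage[file_name] = data
    written.append(file_name)
    return written
```

A CSV upload takes the second path of the diagram: `should_convert` returns false, so only the original key is written.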

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 55.56% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically summarizes the main feature: support for CSV, TXT, and Excel files in Prompt Studio, which aligns directly with the primary change across all modified files.
Description check	✅ Passed	The description comprehensively covers all required template sections: What (feature overview), Why (justification), How (detailed implementation), breaking changes assessment, database/env config, related issues, and testing notes with specific scenarios.

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)

frontend/src/components/custom-tools/manage-docs-modal/ManageDocsModal.jsx (1)
592-614: ⚠️ Potential issue | 🟡 Minor

Stale error message: "Only PDF files are allowed" is no longer accurate.

When ConfirmMultiDoc is not available and the file type isn't in DIRECT_UPLOAD_TYPES, the error message still says "Only PDF files are allowed" (Line 607). Now that CSV, TXT, and Excel files are also accepted, this message is misleading.
Proposed fix
           if (!ConfirmMultiDoc) {
             setAlertDetails({
               type: "error",
-              content: "Only PDF files are allowed",
+              content: "Only PDF, CSV, TXT, and Excel files are allowed",
             });
           }
backend/utils/file_storage/helpers/prompt_studio_file_helper.py (1)
178-201: ⚠️ Potential issue | 🟠 Major

Verify that text_content_string is always defined before Line 201.

If a file's MIME type is in allowed_content_types but doesn't match any of the explicit branches (PDF, text/plain, text/csv, Excel), execution reaches the else at Line 198, which only logs a warning. Control then falls to Line 201 where text_content_string is referenced — but it was never assigned, causing an UnboundLocalError.

This is a pre-existing issue, but the new branches make it more visible. Consider initializing text_content_string = "" before the if/elif chain or adding a return/raise in the else branch.
Proposed fix
     def fetch_file_contents(
         org_id: str,
         user_id: str,
         tool_id: str,
         file_name: str,
         allowed_content_types: list[str],
     ) -> dict[str, Any]:
         """Method to fetch file contents from the remote location.
         The path is constructed in runtime based on the args
         """
+        text_content_string: str = ""
         fs_instance = EnvHelper.get_storage(
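
The failure mode described above can be reproduced in isolation. The functions below are a reduced, hypothetical illustration and not the helper's actual code: the buggy variant only logs in its `else` branch, so the later reference to the local variable raises `UnboundLocalError`, while the fixed variant initializes the variable first.

```python
import logging

logger = logging.getLogger(__name__)


def read_text_buggy(mime_type: str, raw: bytes) -> str:
    if mime_type == "text/plain":
        text = raw.decode("utf-8")
    else:
        # Only logs; `text` is never assigned on this path.
        logger.warning("Unhandled MIME type: %s", mime_type)
    return text  # raises UnboundLocalError if the else branch ran


def read_text_fixed(mime_type: str, raw: bytes) -> str:
    text = ""  # defined on every path before the branches run
    if mime_type == "text/plain":
        text = raw.decode("utf-8")
    else:
        logger.warning("Unhandled MIME type: %s", mime_type)
    return text
```

Initializing to an empty string keeps the return shape stable; raising in the `else` branch is the stricter alternative the comment also mentions.
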
backend/prompt_studio/prompt_studio_core_v2/views.py (1)
576-605: ⚠️ Potential issue | 🟡 Minor

Conversion failure will surface as an unhandled 500 error.

If process_file (Line 581) raises an exception (e.g., corrupted file, unsupported format variant), there's no try/except around the conversion block, so the entire upload request fails with a generic 500. Consider wrapping the conversion in a try/except and either falling back to storing the original without a converted preview, or returning a descriptive error message.
Proposed approach
             if file_converter_service.should_convert_to_pdf(file_type):
+                try:
                     converted_data, converted_name = file_converter_service.process_file(
                         uploaded_file, file_name
                     )
                     PromptStudioFileHelper.upload_converted_for_ide(
                         org_id=UserSessionUtils.get_organization_id(request),
                         user_id=custom_tool.created_by.user_id,
                         tool_id=str(custom_tool.tool_id),
                         file_name=converted_name,
                         file_data=converted_data,
                     )
+                except Exception:
+                    logger.warning(
+                        f"Failed to convert {file_name} to PDF for preview, "
+                        "storing original only"
+                    )
                 # Reset uploaded_file for storing original in main dir
                 uploaded_file.seek(0)
                 file_data = uploaded_file

🤖 Fix all issues with AI agents

In `@backend/prompt_studio/prompt_studio_core_v2/views.py`:
- Line 577: The conversion branch uses the browser-supplied
uploaded_file.content_type (referenced as file_type at the if with
file_converter_plugin) instead of the server-detected MIME from FileValidator,
causing inconsistent behavior; fix by persisting the server-detected MIME (e.g.,
add a property or return value like detected_mime on FileValidator or its
validate() result) or re-run magic.from_buffer() in the view before the
conversion decision, then use that server-detected MIME (detected_mime) in the
conditional (the same place that checks file_converter_plugin and file_type) so
conversion vs direct upload is based on the validated content type.

🧹 Nitpick comments (3)

frontend/src/components/custom-tools/manage-docs-modal/ManageDocsModal.jsx (1)
593-600: DIRECT_UPLOAD_TYPES is re-created on every beforeUpload call inside the onload callback.

This is a minor allocation concern. Consider hoisting it to module scope as a constant since it never changes.
Proposed refactor

Move it outside the component, near the top of the file:
const DIRECT_UPLOAD_TYPES = new Set([
  "application/pdf",
  "text/plain",
  "text/csv",
  "application/vnd.ms-excel",
  "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
  "application/vnd.ms-excel.sheet.macroenabled.12",
]);
backend/utils/file_storage/helpers/prompt_studio_file_helper.py (1)

86-116: upload_converted_for_ide largely duplicates upload_for_ide.

The two methods share the same storage-init and path-resolution logic. Consider extracting a private helper that accepts a subdirectory parameter (e.g., None for main dir, "converted" for the converted subdir) to reduce duplication.

frontend/src/components/custom-tools/document-manager/DocumentManager.jsx (1)

357-370: Consider disabling the "PDF View" tab for non-PDF files instead of showing a placeholder.

Users can still manually switch to the PDF View tab for non-PDF files and see a "preview not available" placeholder. Disabling or hiding the tab would be a cleaner UX, though the current approach is functional.

coderabbitai · 2026-02-08T20:33:23Z

backend/prompt_studio/prompt_studio_core_v2/views.py

            file_type = uploaded_file.content_type
-            # Convert non-PDF files
+
            if file_converter_plugin and file_type != "application/pdf":


⚠️ Potential issue | 🟠 Major

file_type at line 577 relies on browser-supplied MIME, not actual file content detection.

The FileValidator class performs server-side MIME sniffing using magic.from_buffer() to validate uploads, which correctly identifies actual file types. However, after validation passes, this detected MIME type is discarded. The conversion logic at line 577 then uses uploaded_file.content_type (browser-supplied), not the server-detected MIME.

This creates an inconsistency: a CSV file with content_type="application/octet-stream" (common browser behavior) will pass validation (actual MIME detected as text/csv), but the conversion decision will use the unreliable browser value. If the file converter plugin supports multiple formats beyond PDF, the conversion-vs-direct-upload logic could behave unexpectedly. Consider either storing the detected MIME type during validation or re-detecting it before the conversion decision.
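
A sketch of basing the conversion decision on server-detected content follows. It is illustrative only: `needs_conversion` and `sniff_mime` are hypothetical names, the detector is injected so that a libmagic-backed function (such as the `magic.from_buffer()` call the validator is reported to use) could be passed in, and the toy sniffer here recognizes only PDF and UTF-8 text.

```python
from typing import BinaryIO, Callable


def sniff_mime(head: bytes) -> str:
    """Toy stand-in for a real content sniffer; not production logic."""
    if head.startswith(b"%PDF-"):
        return "application/pdf"
    try:
        head.decode("utf-8")
        return "text/plain"
    except UnicodeDecodeError:
        return "application/octet-stream"


def needs_conversion(
    file_obj: BinaryIO,
    detect: Callable[[bytes], str],
    should_convert: Callable[[str], bool],
) -> bool:
    """Decide on conversion from file content, ignoring the client's header."""
    head = file_obj.read(2048)
    file_obj.seek(0)  # assumes the stream started at 0; rewind for later readers
    detected = detect(head)
    return detected != "application/pdf" and should_convert(detected)
```

The browser-supplied `content_type` never enters the decision, so a file sent as `application/octet-stream` is treated according to what its bytes look like.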

github-actions · 2026-02-08T20:41:37Z

Test Results

Summary

✅ Runner Tests: 11 passed, 0 failed (11 total)
✅ SDK1 Tests: 66 passed, 0 failed (66 total)

Runner Tests - Full Report

filepath	function	passed	SUBTOTAL
runner/src/unstract/runner/clients/test_docker.py	test_logs	1	1
runner/src/unstract/runner/clients/test_docker.py	test_cleanup	1	1
runner/src/unstract/runner/clients/test_docker.py	test_cleanup_skip	1	1
runner/src/unstract/runner/clients/test_docker.py	test_client_init	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_image_exists	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_image	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_container_run_config	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_container_run_config_without_mount	1	1
runner/src/unstract/runner/clients/test_docker.py	test_run_container	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_image_for_sidecar	1	1
runner/src/unstract/runner/clients/test_docker.py	test_sidecar_container	1	1
TOTAL		11	11

SDK1 Tests - Full Report

sonarqubecloud · 2026-02-08T20:42:21Z

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

frontend/src/components/custom-tools/document-manager/DocumentManager.jsx (1)
155-161: ⚠️ Potential issue | 🟠 Major

Bug: fileData is not reset when switching documents, causing stale mimeType.

When selectedDoc changes, fileUrl and extractTxt are reset but fileData (which holds mimeType) is not. This means between the reset and the new response arriving (or if the new request fails), the previous document's mimeType persists. Since isContentAvailable (line 482-485) and renderDoc (line 358) both depend on fileData.mimeType, stale state can cause incorrect rendering — e.g., showing the non-PDF placeholder for what should be a PDF, or marking content as available when the fetch actually failed.
Proposed fix
  useEffect(() => {
    setFileUrl("");
    setExtractTxt("");
+   setFileData({});
    Object.keys(viewTypes).forEach((item) => {
      handleFetchContent(viewTypes[item]);
    });
  }, [selectedDoc]);
Also reset in handleFetchContent:
  const handleFetchContent = (viewType) => {
    if (viewType === viewTypes.original) {
      setFileUrl("");
      setFileErrMsg("");
+     setFileData({});
    }

🤖 Fix all issues with AI agents

In `@frontend/src/components/custom-tools/document-manager/DocumentManager.jsx`:
- Around line 482-485: The isContentAvailable prop can remain as written but you
must prevent stale fileData from making it true after a failed fetch: update
handleGetDocsError to clear the previous fileData (e.g., reset fileData to
null/empty object) and/or set an explicit fetch-failed flag (e.g.,
fileFetchError) and include that in the isContentAvailable check; specifically,
modify handleGetDocsError to reset the state that holds the prior mimeType
(refer to fileData and handleGetDocsError) so that isContentAvailable (which
depends on fileUrl and fileData.mimeType) cannot be true after a fetch failure.
- Around line 259-261: The FileReader.onerror handler currently throws an
exception inside the async callback (reader.onerror) which becomes unhandled;
instead, update the component's error state inside that callback (e.g., call
your state setter such as setError or setFileError with a descriptive message
and the event/error details) and avoid throwing — locate reader.onerror in
DocumentManager.jsx and replace the throw new Error("Fail to load the file")
with a call that sets the component error state and optionally logs the event so
the UI can render the error.

🧹 Nitpick comments (1)

frontend/src/components/custom-tools/document-manager/DocumentManager.jsx (1)
356-370: Unused docName parameter in renderDoc.

After the refactor to mimeType-based rendering, docName is no longer used inside renderDoc. Consider removing it to avoid confusion.
Proposed cleanup
-  const renderDoc = (docName, fileUrl, highlightData) => {
+  const renderDoc = (fileUrl, highlightData) => {
And update the call site at line 489:
-          {renderDoc(selectedDoc?.document_name, blobFileUrl, highlightData)}
+          {renderDoc(blobFileUrl, highlightData)}

coderabbitai · 2026-02-08T20:43:49Z

frontend/src/components/custom-tools/document-manager/DocumentManager.jsx

+        reader.onerror = () => {
+          throw new Error("Fail to load the file");
+        };


⚠️ Potential issue | 🟡 Minor

Unhandled throw inside reader.onerror callback.

Throwing inside an async callback (FileReader.onerror) won't be caught by any surrounding try/catch or .catch() — it results in an unhandled error. This should set the error state instead.

Proposed fix

 reader.onerror = () => {
-  throw new Error("Fail to load the file");
+  setFileErrMsg("Failed to load the file");
 };
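
An alternative to setting state directly in the handler is to wrap the read in a Promise, so that a failure becomes a rejection the caller can handle in one place. The sketch below is an assumption, not code from the PR: `readAsText` and the `createReader` parameter are hypothetical names, and the injectable factory exists only so the helper can be exercised outside a browser.

```javascript
// Hypothetical helper, not from the PR: turns a FileReader-style read into a
// Promise. createReader is injectable so the function can run without a browser.
function readAsText(blob, createReader = () => new FileReader()) {
  return new Promise((resolve, reject) => {
    const reader = createReader();
    reader.onload = () => resolve(reader.result);
    // Reject rather than throw: a throw inside this event callback would not
    // reach any try/catch or .catch() around the code that started the read.
    reader.onerror = () =>
      reject(reader.error ?? new Error("Failed to load the file"));
    reader.readAsText(blob);
  });
}

// Possible use in the component (setFileErrMsg is the setter named above):
// readAsText(blob).catch(() => setFileErrMsg("Failed to load the file"));
```

Either approach avoids the unhandled error; the Promise form is mainly useful if the read is already part of an async chain.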


coderabbitai · 2026-02-08T20:43:49Z

frontend/src/components/custom-tools/document-manager/DocumentManager.jsx

+          isContentAvailable={
+            fileUrl?.length > 0 ||
+            (fileData.mimeType && fileData.mimeType !== "application/pdf")
+          }


⚠️ Potential issue | 🟠 Major

isContentAvailable can be true even when document fetch fails.

This condition considers content available when fileData.mimeType is set to a non-PDF value. However, since fileData isn't cleared on fetch errors (see handleGetDocsError), a previously loaded non-PDF document's mimeType can leak into the next document's state. This ties back to the stale fileData issue above — once that's fixed, this logic should work correctly.
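
The stale-state problem can be sketched with plain functions standing in for the component state. Only `fileUrl`, `fileData.mimeType` and the `isContentAvailable` condition come from the review; `initialDocState`, `onFetchSuccess` and `onFetchError` are hypothetical names used for illustration.

```javascript
// Hypothetical state model, not code from the PR.
const initialDocState = { fileUrl: "", fileData: {} };

function onFetchSuccess(_prev, { fileUrl = "", mimeType }) {
  return { fileUrl, fileData: { mimeType } };
}

// On failure, drop everything left over from the previous document so its
// mimeType cannot leak into the availability check for the next one.
function onFetchError(_prev) {
  return { fileUrl: "", fileData: {} };
}

// Same condition as the isContentAvailable prop quoted above.
function isContentAvailable({ fileUrl, fileData }) {
  return Boolean(
    fileUrl?.length > 0 ||
      (fileData.mimeType && fileData.mimeType !== "application/pdf")
  );
}

// A CSV loads, then the next fetch fails: content must not be reported.
let state = onFetchSuccess(initialDocState, { mimeType: "text/csv" });
state = onFetchError(state);
console.log(isContentAvailable(state)); // false
```

In the component this would correspond to resetting `fileData` inside `handleGetDocsError`, as the review suggests.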


backend/prompt_studio/prompt_studio_core_v2/views.py

coderabbitai bot reviewed Feb 8, 2026

jagadeeswaran-zipstack changed the title ~~[FEAT] Support CSV, TXT, and Excel files in Prompt Studio (Enterprise)~~ UN-3149 [FEAT] Support CSV, TXT, and Excel files in Prompt Studio (Enterprise) Feb 8, 2026

jagadeeswaran-zipstack changed the title ~~UN-3149 [FEAT] Support CSV, TXT, and Excel files in Prompt Studio (Enterprise)~~ UN-3149 [FEAT] Support CSV, TXT, and Excel files in Prompt Studio Feb 8, 2026

lint error fix

c4fe4e2

jagadeeswaran-zipstack requested review from Deepak-Kesavan, athul-rs, harini-venkataraman and vishnuszipstack February 8, 2026 20:42

coderabbitai bot reviewed Feb 8, 2026

harini-venkataraman reviewed Feb 9, 2026

backend/prompt_studio/prompt_studio_core_v2/views.py (resolved comment thread)

harini-venkataraman approved these changes Feb 9, 2026

jagadeeswaran-zipstack wants to merge 2 commits into main from feature/csv-txt-excel-prompt-studio

2 participants

github-actions bot commented Feb 8, 2026

Test Results

sonarqubecloud bot commented Feb 8, 2026

Quality Gate passed
