UN-3149 [FEAT] Support CSV, TXT, and Excel files in Prompt Studio #1783
base: main
@@ -25,7 +25,6 @@ import { ManageDocsModal } from "../manage-docs-modal/ManageDocsModal";
 import { PdfViewer } from "../pdf-viewer/PdfViewer";
 import { TextViewerPre } from "../text-viewer-pre/TextViewerPre";
 import usePostHogEvents from "../../../hooks/usePostHogEvents";
-import { TextViewer } from "../text-viewer/TextViewer";
 
 let items = [
   {
@@ -247,17 +246,27 @@ function DocumentManager({ generateIndex, handleUpdateTool, handleDocChange }) {
 
   const processGetDocsResponse = (data, viewType, mimeType) => {
     if (viewType === viewTypes.original) {
-      const base64String = data || "";
-      const blob = base64toBlobWithMime(base64String, mimeType);
-      setFileData({ blob, mimeType });
-      const reader = new FileReader();
-      reader.readAsDataURL(blob);
-      reader.onload = () => {
-        setFileUrl(reader.result);
-      };
-      reader.onerror = () => {
-        throw new Error("Fail to load the file");
-      };
+      if (mimeType === "application/pdf") {
+        // Existing flow: base64 → blob → PdfViewer
+        const base64String = data || "";
+        const blob = base64toBlobWithMime(base64String, mimeType);
+        setFileData({ blob, mimeType });
+        const reader = new FileReader();
+        reader.readAsDataURL(blob);
+        reader.onload = () => {
+          setFileUrl(reader.result);
+        };
+        reader.onerror = () => {
+          throw new Error("Fail to load the file");
+        };
Comment on lines +259 to +261 (Contributor):
Unhandled throw inside an async callback. Proposed fix:
reader.onerror = () => {
- throw new Error("Fail to load the file");
+ setFileErrMsg("Failed to load the file");
};
+      } else {
+        // Non-PDF file (CSV, TXT, Excel, or non-convertible)
+        // data is text, not base64
+        setFileUrl("");
+        setFileData({ blob: null, mimeType });
+        // Auto-switch to Raw View for non-PDF files
+        setActiveKey("2");
+      }
     } else if (viewType === viewTypes.extract) {
       setExtractTxt(data?.data);
     }
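The review comment on lines +259 to +261 asks for the read failure to be reported through state rather than thrown. An error thrown from `reader.onerror` is raised later from the event loop, so no `try`/`catch` around `processGetDocsResponse` can catch it. A minimal sketch of that idea follows. The helper name `loadBlobAsDataUrl` and its injectable `ReaderCtor` parameter are hypothetical and not part of this PR. `setFileUrl` and `setFileErrMsg` are the setters named in the diff and in the comment.

```javascript
// Hypothetical helper (not part of the PR): read a blob as a data URL and
// report the outcome through callbacks instead of throwing from the handler.
// ReaderCtor defaults to the browser's FileReader; it is injectable only so
// the sketch can be exercised outside a browser.
function loadBlobAsDataUrl(
  blob,
  { onSuccess, onFailure },
  ReaderCtor = globalThis.FileReader
) {
  const reader = new ReaderCtor();
  // Attach both handlers before starting the read.
  reader.onload = () => onSuccess(reader.result);
  reader.onerror = () => onFailure("Failed to load the file");
  reader.readAsDataURL(blob);
}
```

Inside the component this could be called as `loadBlobAsDataUrl(blob, { onSuccess: setFileUrl, onFailure: setFileErrMsg })`, which keeps the failure visible through the existing `errMsg={fileErrMsg}` prop.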
@@ -345,16 +354,19 @@ function DocumentManager({ generateIndex, handleUpdateTool, handleDocChange }) {
   };
 
   const renderDoc = (docName, fileUrl, highlightData) => {
-    const fileType = docName?.split(".").pop().toLowerCase(); // Get the file extension
-    switch (fileType) {
-      case "pdf":
-        return <PdfViewer fileUrl={fileUrl} highlightData={highlightData} />;
-      case "txt":
-      case "md":
-        return <TextViewer fileUrl={fileUrl} />;
-      default:
-        return <div>Unsupported file type: {fileType}</div>;
+    // Use mimeType from response for rendering decisions
+    if (fileData.mimeType === "application/pdf") {
+      return <PdfViewer fileUrl={fileUrl} highlightData={highlightData} />;
     }
+    // Non-PDF: show placeholder message
+    return (
+      <div className="text-viewer-layout">
+        <Typography.Text type="secondary">
+          Document preview is not available for this file type. Please index the
+          document and switch to Raw View.
+        </Typography.Text>
+      </div>
+    );
   };
 
   return (
@@ -467,7 +479,10 @@ function DocumentManager({ generateIndex, handleUpdateTool, handleDocChange }) {
           <DocumentViewer
             doc={selectedDoc?.document_name}
             isLoading={isDocLoading}
-            isContentAvailable={fileUrl?.length > 0}
+            isContentAvailable={
+              fileUrl?.length > 0 ||
+              (fileData.mimeType && fileData.mimeType !== "application/pdf")
+            }
Comment on lines +482 to +485 (Contributor):
This condition considers content available when
             setOpenManageDocsModal={setOpenManageDocsModal}
             errMsg={fileErrMsg}
           >
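The new `isContentAvailable` expression in this hunk combines two cases: a PDF with a loaded `fileUrl`, or any response whose MIME type is set and is not PDF. Because `&&` returns its operand, the inline form can evaluate to `undefined` or `""` instead of `false` when `mimeType` is missing. A small sketch of the same rule as a pure predicate that always returns a boolean follows. The helper and the `PDF_MIME` constant are hypothetical names, not code from the PR.

```javascript
const PDF_MIME = "application/pdf";

// Hypothetical helper mirroring the isContentAvailable expression in the diff.
// Returns a strict boolean in every case.
function isContentAvailable(fileUrl, mimeType) {
  // PDF path: content exists once the data URL has been loaded.
  const hasLoadedUrl = (fileUrl?.length ?? 0) > 0;
  // Non-PDF path: the placeholder is shown whenever a non-PDF MIME type is known.
  const isNonPdf = Boolean(mimeType) && mimeType !== PDF_MIME;
  return hasLoadedUrl || isNonPdf;
}
```

Keeping the rule in one named function also makes it easier to unit-test the PDF and non-PDF paths separately from the component.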
🏁 Script executed:
Repository: Zipstack/unstract
Length of output: 115
🏁 Script executed:
# Get the full context around line 577
wc -l backend/prompt_studio/prompt_studio_core_v2/views.py
Repository: Zipstack/unstract
Length of output: 117
`file_type` at line 577 relies on the browser-supplied MIME type, not on detection of the actual file content.

The `FileValidator` class performs server-side MIME sniffing using `magic.from_buffer()` to validate uploads, which correctly identifies actual file types. However, after validation passes, this detected MIME type is discarded. The conversion logic at line 577 then uses `uploaded_file.content_type` (browser-supplied), not the server-detected MIME type.

This creates an inconsistency: a CSV file with `content_type="application/octet-stream"` (common browser behavior) will pass validation (actual MIME detected as `text/csv`), but the conversion decision will use the unreliable browser value. If the file converter plugin supports multiple formats beyond PDF, the conversion-vs-direct-upload logic could behave unexpectedly. Consider either storing the detected MIME type during validation or re-detecting it before the conversion decision.
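The suggestion above amounts to preferring a content-detected MIME type over the client-supplied one when both are available. The backend in question is Python and, per the comment, uses `magic.from_buffer()`. The sketch below only illustrates the idea in JavaScript, to match the rest of this diff. Both function names are hypothetical, and the detection covers just the PDF signature rather than the range of formats a real detector handles.

```javascript
// Hypothetical, deliberately tiny content sniffer: recognises only the
// "%PDF-" signature that PDF files start with. Returns null when unsure.
function sniffMimeType(bytes) {
  const pdfSignature = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"
  const isPdf = pdfSignature.every((b, i) => bytes[i] === b);
  return isPdf ? "application/pdf" : null;
}

// Prefer the detected type, fall back to the client-supplied one, and only
// then to a generic default.
function resolveMimeType(bytes, clientMime) {
  return sniffMimeType(bytes) || clientMime || "application/octet-stream";
}
```

With this ordering, a PDF uploaded with a generic `application/octet-stream` header is still treated as a PDF, while types the sniffer cannot identify keep whatever the client reported.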