From 094aba99cfac399d8e467e9a573e0be92122494b Mon Sep 17 00:00:00 2001
From: "donghyuck, son"
Date: Wed, 29 Apr 2026 15:13:38 +0900
Subject: [PATCH] [ai-assisted] feat(thumbnail): extend document thumbnail renderers
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Issue: #371
AI-Assisted: Yes

PPTX thumbnails now use the Apache POI slide renderer instead of an external LibreOffice process.
DOCX/HWP/HWPX do not get new dedicated renderers in core; instead the thumbnail starter registers preview renderers based on textract's `FileContentExtractionService`.
Removed the `studio.thumbnail.libre-office.*` and `pptx.timeout` settings, metadata, and docs, and updated the application examples for POI/textract preview.

Async thumbnail follow-up:
- When no stored thumbnail exists, `/thumbnail` immediately returns a placeholder image with `X-Thumbnail-Status: pending` and generates the thumbnail on the starter's background executor.
- When the queue rejects the job, no work is actually running, so the request falls through to the unavailable path instead of returning pending.
- Deterministic failure/unsupported results are memoized in a bounded TTL cache per source, and concurrent requests for the same source are coalesced into a single background job.
- Applied a deletion marker and striped locks so that a stale thumbnail is not stored again in the race between attachment deletion and completion of a queued generation.
- 204 responses carry `X-Thumbnail-Status: unavailable` and `Cache-Control: no-store`.

Review follow-up:
- Added format marker interfaces so that a user-registered DOCX/HWP/HWPX renderer can replace the default preview renderer by type.
- The PPTX package is pre-scanned against ZIP entry/aggregate byte budgets before POI parsing.
- Applied entry and aggregate extraction budgets to the HWP/HWPX parser.
- The DOCX parser also pre-scans against ZIP entry/aggregate byte budgets before POI parsing.
- Split the optional dependency guidance in the attachment starter README by PDF / PPTX / DOCX-HWP-HWPX.

Subagent usage:
- code-reviewer: renderer replacement, dependency docs, async queue reject, stale thumbnail race findings.
- security-auditor: PPTX package bound, HWP aggregate budget, deterministic failure memoization findings.

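Reviewer note: the async follow-up described above (pending placeholder, queue reject falling through to unavailable, per-source failure memoization, one background job per source) can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the `ThumbnailServiceImpl` from this patch: the class name `PendingThumbnailSketch`, the `Status` enum, the string source key, and the `generator` function are all hypothetical, and the deletion marker, striped locks, and cache size bound are left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;
import java.util.function.Function;

// Hypothetical illustration only; none of these names come from the patch.
class PendingThumbnailSketch {

    enum Status { READY, PENDING, UNAVAILABLE }

    private final Map<String, byte[]> stored = new ConcurrentHashMap<>();
    private final Map<String, Boolean> inFlight = new ConcurrentHashMap<>();
    private final Map<String, Long> failedUntil = new ConcurrentHashMap<>();
    private final Executor executor;
    // Assumption: the generator returns null when the source cannot be rendered.
    private final Function<String, byte[]> generator;
    private final long failureTtlMillis;

    PendingThumbnailSketch(Executor executor, Function<String, byte[]> generator, long failureTtlMillis) {
        this.executor = executor;
        this.generator = generator;
        this.failureTtlMillis = failureTtlMillis;
    }

    Status request(String sourceKey) {
        if (stored.containsKey(sourceKey)) {
            return Status.READY;
        }
        Long until = failedUntil.get(sourceKey);
        if (until != null) {
            if (System.currentTimeMillis() < until) {
                return Status.UNAVAILABLE; // memoized failure is still fresh
            }
            failedUntil.remove(sourceKey); // TTL expired, allow one retry
        }
        // Only the first caller for a source submits a job; later callers just see PENDING.
        if (inFlight.putIfAbsent(sourceKey, Boolean.TRUE) != null) {
            return Status.PENDING;
        }
        try {
            executor.execute(() -> generate(sourceKey));
            return Status.PENDING;
        } catch (RejectedExecutionException ex) {
            // Nothing was queued, so do not claim that work is pending.
            inFlight.remove(sourceKey);
            return Status.UNAVAILABLE;
        }
    }

    private void generate(String sourceKey) {
        try {
            byte[] bytes = generator.apply(sourceKey);
            if (bytes != null) {
                stored.put(sourceKey, bytes);
            } else {
                failedUntil.put(sourceKey, System.currentTimeMillis() + failureTtlMillis);
            }
        } catch (RuntimeException ex) {
            failedUntil.put(sourceKey, System.currentTimeMillis() + failureTtlMillis);
        } finally {
            inFlight.remove(sourceKey);
        }
    }
}
```

In this sketch a controller would map `PENDING` to the placeholder image with `X-Thumbnail-Status: pending` and `UNAVAILABLE` to a 204 with `X-Thumbnail-Status: unavailable`; that mapping follows the commit message, while the code shape itself is an assumption.
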
Validation:
- ./gradlew :studio-platform-textract:test --tests 'studio.one.platform.textract.extractor.impl.DocxFileParserTest' :studio-application-modules:attachment-service:test --tests 'studio.one.application.attachment.thumbnail.ThumbnailServiceImplTest' --tests 'studio.one.application.web.controller.AttachmentControllerTest' :starter:studio-application-starter-attachment:test => BUILD SUCCESSFUL
- ./gradlew :studio-platform-textract:test :studio-platform-thumbnail:test :starter:studio-platform-thumbnail-starter:test :studio-application-modules:attachment-service:test :starter:studio-application-starter-attachment:test => BUILD SUCCESSFUL
- ./gradlew test => BUILD SUCCESSFUL
- git diff --check => passed
- /Users/donghyuck.son/git/studio-one/studio-one-api-server ./gradlew compileJava => BUILD SUCCESSFUL
- /Users/donghyuck.son/git/studio-one/studio-one-api-server git diff --check => passed
---
 CHANGELOG.md | 5 +
 README.md | 11 +
 .../README.md | 15 +-
 .../AttachmentAutoConfiguration.java | 26 +-
 .../README.md | 21 +-
 .../build.gradle.kts | 3 +
 ...tractDocumentPreviewThumbnailRenderer.java | 215 ++++++++++++++++
 .../ThumbnailAutoConfiguration.java | 11 +
 .../autoconfigure/ThumbnailProperties.java | 33 ++-
 ...bnailTextractPreviewAutoConfiguration.java | 44 ++++
 ...itional-spring-configuration-metadata.json | 48 ++++
 ...ot.autoconfigure.AutoConfiguration.imports | 1 +
 .../ThumbnailAutoConfigurationTest.java | 152 +++++++++++-
 .../attachment-service/README.md | 11 +-
 .../attachment/thumbnail/ThumbnailData.java | 14 ++
 .../thumbnail/ThumbnailPlaceholder.java | 56 +++++
 .../thumbnail/ThumbnailServiceImpl.java | 213 +++++++++++++++-
 .../web/controller/AttachmentController.java | 15 +-
 .../thumbnail/ThumbnailServiceImplTest.java | 234 ++++++++++++++++++
 .../controller/AttachmentControllerTest.java | 22 ++
 .../extractor/impl/DocxFileParser.java | 33 +++
 .../extractor/impl/HwpHwpxFileParser.java | 132 ++++++++--
 .../extractor/impl/DocxFileParserTest.java | 54 ++++
 .../extractor/impl/HwpHwpxFileParserTest.java | 47 +++-
 studio-platform-thumbnail/README.md | 3 +-
 studio-platform-thumbnail/build.gradle.kts | 2 +
 .../thumbnail/ThumbnailGenerationService.java | 6 +-
 .../platform/thumbnail/ThumbnailOptions.java | 11 +-
 .../renderer/DocxThumbnailRenderer.java | 6 +
 .../renderer/HwpThumbnailRenderer.java | 6 +
 .../renderer/HwpxThumbnailRenderer.java | 6 +
 .../renderer/PptxThumbnailRenderer.java | 119 +++++++++
 .../DocumentThumbnailRendererTest.java | 117 +++++++++
 33 files changed, 1649 insertions(+), 43 deletions(-)
 create mode 100644 starter/studio-platform-thumbnail-starter/src/main/java/studio/one/platform/thumbnail/autoconfigure/TextractDocumentPreviewThumbnailRenderer.java
 create mode 100644 starter/studio-platform-thumbnail-starter/src/main/java/studio/one/platform/thumbnail/autoconfigure/ThumbnailTextractPreviewAutoConfiguration.java
 create mode 100644 studio-application-modules/attachment-service/src/main/java/studio/one/application/attachment/thumbnail/ThumbnailPlaceholder.java
 create mode 100644 studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/renderer/DocxThumbnailRenderer.java
 create mode 100644 studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/renderer/HwpThumbnailRenderer.java
 create mode 100644 studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/renderer/HwpxThumbnailRenderer.java
 create mode 100644 studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/renderer/PptxThumbnailRenderer.java
 create mode 100644 studio-platform-thumbnail/src/test/java/studio/one/platform/thumbnail/DocumentThumbnailRendererTest.java

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 796d5223..3687ee78 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -3,6 +3,11 @@
 ## 2026-04-28
 
 ### Changed
+- For issue #371, added PPTX/DOCX/HWP/HWPX document thumbnail renderers to `studio-platform-thumbnail`.
+- PPTX generates an actual slide thumbnail with the Apache POI slide renderer, and DOCX/HWP/HWPX generate a preview thumbnail from the structured extraction result of `FileContentExtractionService`.
+- Added `studio.thumbnail.renderers..*` configuration metadata and README examples.
+- The document thumbnail renderers reduce the risk of repeated parsing and decompression expansion through PPTX package pre-scan, HWP/HWPX aggregate extraction budgets, and deterministic failure memoization.
+- Attachment thumbnails now return a pending placeholder image immediately when no stored thumbnail exists, and the actual generation runs on a background executor. Conversion failures are memoized in a bounded TTL cache, and concurrent requests for the same source are coalesced into a single background job.
 - For issue #368, added a standalone `studio-platform-thumbnail` SPI and `studio-platform-thumbnail-starter`, moving image/PDF thumbnail generation out of the attachment domain.
 - The attachment thumbnail endpoint and storage layout are kept, but the existing `ThumbnailServiceImpl` now delegates to `ThumbnailGenerationService`.
 - Thumbnail generation defaults moved to `studio.thumbnail.*`; `studio.attachment.thumbnail.default-size/default-format` and the legacy `studio.features.attachment.thumbnail.default-size/default-format` remain as fallbacks with a deprecation warning.
diff --git a/README.md b/README.md
index 03be7e58..d4023b81 100644
--- a/README.md
+++ b/README.md
@@ -294,6 +294,17 @@ studio:
     renderers:
       pdf:
         enabled: false
+      pptx:
+        enabled: false
+        slide: 0
+      docx:
+        enabled: false
+      hwp:
+        enabled: false
+      hwpx:
+        enabled: false
+    # PPTX uses the Apache POI slide renderer,
+    # DOCX/HWP/HWPX build a preview thumbnail from textract results.
   user:
     password-policy:
       min-length: 12
diff --git a/starter/studio-application-starter-attachment/README.md b/starter/studio-application-starter-attachment/README.md
index 28849e04..9e6d5853 100644
--- a/starter/studio-application-starter-attachment/README.md
+++ b/starter/studio-application-starter-attachment/README.md
@@ -16,6 +16,10 @@ dependencies {
     implementation("org.springframework.boot:spring-boot-starter-data-jpa")
     // only when using PDF thumbnails
     implementation("org.apache.pdfbox:pdfbox")
+    // only when using PPTX thumbnails
+    implementation("org.apache.poi:poi-ooxml")
+    // only when using DOCX/HWP/HWPX preview thumbnails
+    implementation(project(":starter:studio-platform-textract-starter"))
 }
 ```
 
@@ -66,6 +70,15 @@ studio:
       pdf:
         enabled: false # registered only when PDFBox is on the classpath and this is explicitly true
         page: 0
+      pptx:
+        enabled: false # POI-based opt-in renderer
+        slide: 0
+      docx:
+        enabled: false # textract-preview-based opt-in renderer
+      hwp:
+        enabled: false # textract-preview-based opt-in renderer
+      hwpx:
+        enabled: false # textract-preview-based opt-in renderer
 ```
 
 ### Storage types
@@ -131,7 +144,7 @@ To use attachment-service directly without the starter, the module
 - The whole feature can be disabled with `studio.features.attachment.enabled=false`.
 - In production, set `studio.attachment.storage.base-dir` and `studio.attachment.thumbnail.base-dir` explicitly to application-private paths and verify write permission. If the paths are left empty, default paths under tmp are used.
-- PDF thumbnails are generated only when PDFBox is on the classpath and `studio.thumbnail.renderers.pdf.enabled=true` is set explicitly. Otherwise only the image renderer is registered.
+- PDF/PPTX/DOCX/HWP/HWPX thumbnails are generated only when `studio.thumbnail.renderers..enabled=true` is set explicitly. PDF requires PDFBox, PPTX requires POI, and DOCX/HWP/HWPX require textract's `FileContentExtractionService`; when these conditions are not met, only the image renderer is registered or the source is treated as unsupported. When no stored thumbnail exists, `/thumbnail` immediately returns a placeholder image with the `X-Thumbnail-Status: pending` header, and the actual generation runs on the `attachmentThumbnailExecutor` registered by the starter. When constructing `ThumbnailServiceImpl` directly, use the constructor that takes an executor if asynchronous behavior is needed. For documents that cannot be converted, the failure state is memoized with a bounded TTL and subsequent requests get a 204 with `X-Thumbnail-Status: unavailable`. DOCX/HWP/HWPX preview exposes the textract parser surface, so enable it only when needed, and set `studio.textract.max-extract-size` conservatively as the limit on compressed input size. The DOCX parser applies separate entry/total budgets to decompression work.
 - When DB storage is used, the `TB_APPLICATION_ATTACHMENT_DATA` table (including a BLOB column) must exist.
 - The maximum upload size is limited to 50 MB at the controller level.
 - The permission scope (`features:attachment/*`) must be registered in the authorization server or ACL.
diff --git a/starter/studio-application-starter-attachment/src/main/java/studio/one/application/attachment/autoconfigure/AttachmentAutoConfiguration.java b/starter/studio-application-starter-attachment/src/main/java/studio/one/application/attachment/autoconfigure/AttachmentAutoConfiguration.java
index e6a3ac96..31f76ac7 100644
--- a/starter/studio-application-starter-attachment/src/main/java/studio/one/application/attachment/autoconfigure/AttachmentAutoConfiguration.java
+++ b/starter/studio-application-starter-attachment/src/main/java/studio/one/application/attachment/autoconfigure/AttachmentAutoConfiguration.java
@@ -25,6 +25,8 @@
 import java.nio.file.Files;
 import java.nio.file.Path;
 import java.nio.file.Paths;
+import java.util.concurrent.Executor;
+import java.util.concurrent.RejectedExecutionException;
 
 import jakarta.persistence.EntityManagerFactory;
 
@@ -43,6 +45,7 @@
 import org.springframework.context.annotation.Configuration;
 import org.springframework.context.annotation.Primary;
 import org.springframework.core.env.Environment;
+import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;
 import org.springframework.data.jpa.repository.config.EnableJpaRepositories;
 import org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate;
 import org.springframework.util.StringUtils;
@@ -212,6 +215,22 @@ ThumbnailStorage thumbnailStorage(
         return new LocalThumbnailStore(baseDir);
     }
 
+    @Bean
+    @ConditionalOnAttachmentThumbnailEnabled
+    @ConditionalOnMissingBean(name = "attachmentThumbnailExecutor")
+    Executor attachmentThumbnailExecutor() {
+        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
+        executor.setThreadNamePrefix("studio-thumbnail-");
+        executor.setCorePoolSize(1);
+        executor.setMaxPoolSize(2);
+        executor.setQueueCapacity(100);
+        executor.setRejectedExecutionHandler((runnable, pool) -> {
+            throw new RejectedExecutionException("attachment thumbnail queue is full");
+        });
+        executor.initialize();
+        return executor;
+    }
+
     @Bean
     @ConditionalOnAttachmentThumbnailEnabled
     @ConditionalOnBean(ThumbnailGenerationService.class)
@@ -220,11 +239,16 @@ ThumbnailService thumbnailService(
             AttachmentService attachmentService,
             ThumbnailStorage thumbnailStorage,
             ThumbnailGenerationService thumbnailGenerationService,
+            @Qualifier("attachmentThumbnailExecutor") Executor attachmentThumbnailExecutor,
             ObjectProvider<I18n> i18nProvider) {
         I18n i18n = I18nUtils.resolve(i18nProvider);
         log.info(LogUtils.format(i18n, I18nKeys.AutoConfig.Feature.Service.DETAILS, FEATURE_NAME,
                 LogUtils.blue(ThumbnailServiceImpl.class, true), LogUtils.red(State.CREATED.toString())));
-        return new ThumbnailServiceImpl(attachmentService, thumbnailStorage, thumbnailGenerationService);
+        return new ThumbnailServiceImpl(
+                attachmentService,
+                thumbnailStorage,
+                thumbnailGenerationService,
+                attachmentThumbnailExecutor);
     }
 
     private String resolveBaseDir(AttachmentProperties.Storage storage, Repository repository) {
diff --git a/starter/studio-platform-thumbnail-starter/README.md b/starter/studio-platform-thumbnail-starter/README.md
index ed8d4679..20974d0c 100644
--- a/starter/studio-platform-thumbnail-starter/README.md
+++ b/starter/studio-platform-thumbnail-starter/README.md
@@ -2,15 +2,22 @@
 
 A starter that auto-configures the renderers of `studio-platform-thumbnail` and `ThumbnailGenerationService`.
 
-The PDF renderer is disabled by default for security reasons; to use it, add the PDFBox dependency to the application runtime yourself and enable the setting.
+The PDF/PPTX/DOCX/HWP/HWPX renderers are disabled by default for security reasons; to use them, provide the required dependencies and settings in the application runtime.
 
 ```kotlin
 dependencies {
     implementation(project(":starter:studio-platform-thumbnail-starter"))
+    // required for the PDF renderer
     implementation("org.apache.pdfbox:pdfbox")
+    // required for the PPTX renderer
+    implementation("org.apache.poi:poi-ooxml")
+    // required for the DOCX/HWP/HWPX preview renderers
+    implementation(project(":starter:studio-platform-textract-starter"))
 }
 ```
 
+The PPTX renderer draws the slide directly with Apache POI. The DOCX/HWP/HWPX renderers draw textract's structured extraction result as a preview image, so the output is a representative preview for search/admin screens rather than an exact rasterization of the document layout.
+
 ```yaml
 studio:
   features:
@@ -29,9 +36,21 @@ studio:
       pdf:
         enabled: false
         page: 0
+      pptx:
+        enabled: false
+        slide: 0
+      docx:
+        enabled: false
+      hwp:
+        enabled: false
+      hwpx:
+        enabled: false
 ```
 
 - `ImageThumbnailRenderer` is registered by default.
 - `PdfThumbnailRenderer` is registered only when PDFBox is on the classpath and `studio.thumbnail.renderers.pdf.enabled=true` is set explicitly. Because PDF involves parsing/rendering complex external input, the default is false.
+- `PptxThumbnailRenderer` is registered when POI OOXML is on the classpath and `studio.thumbnail.renderers.pptx.enabled=true`.
+- The DOCX/HWP/HWPX preview renderers are registered when a `FileContentExtractionService` bean exists and each renderer is explicitly enabled. For documents with no supporting renderer or that cannot be extracted, the attachment `/thumbnail` endpoint returns 204.
+- The DOCX/HWP/HWPX preview renderers also expose the textract parser surface, so enable them only when needed and keep `studio.textract.max-extract-size` conservative for the production environment.
 - `studio.attachment.thumbnail.default-size/default-format` and `studio.features.attachment.thumbnail.default-size/default-format` are read only as fallbacks for `studio.thumbnail.default-size/default-format` during the migration window, and a WARN is logged.
 - The attachment module keeps its endpoint and storage contracts and uses the `ThumbnailGenerationService` provided by this starter for actual generation.
diff --git a/starter/studio-platform-thumbnail-starter/build.gradle.kts b/starter/studio-platform-thumbnail-starter/build.gradle.kts index ea8b7b9c..dd2257ea 100644 --- a/starter/studio-platform-thumbnail-starter/build.gradle.kts +++ b/starter/studio-platform-thumbnail-starter/build.gradle.kts @@ -26,7 +26,10 @@ dependencies { api(project(":studio-platform-thumbnail")) api(project(":studio-platform-autoconfigure")) implementation("org.springframework.boot:spring-boot-starter-validation") + compileOnly(project(":studio-platform-textract")) compileOnly("org.apache.pdfbox:pdfbox:${property("apachePdfBoxVersion")}") + testImplementation(project(":studio-platform-textract")) testImplementation("org.apache.pdfbox:pdfbox:${property("apachePdfBoxVersion")}") + testImplementation("org.apache.poi:poi-ooxml:${property("apachePoiVersion")}") } diff --git a/starter/studio-platform-thumbnail-starter/src/main/java/studio/one/platform/thumbnail/autoconfigure/TextractDocumentPreviewThumbnailRenderer.java b/starter/studio-platform-thumbnail-starter/src/main/java/studio/one/platform/thumbnail/autoconfigure/TextractDocumentPreviewThumbnailRenderer.java new file mode 100644 index 00000000..e187dfd9 --- /dev/null +++ b/starter/studio-platform-thumbnail-starter/src/main/java/studio/one/platform/thumbnail/autoconfigure/TextractDocumentPreviewThumbnailRenderer.java @@ -0,0 +1,215 @@ +package studio.one.platform.thumbnail.autoconfigure; + +import java.awt.Color; +import java.awt.Font; +import java.awt.FontMetrics; +import java.awt.Graphics2D; +import java.awt.RenderingHints; +import java.awt.image.BufferedImage; +import java.io.ByteArrayInputStream; +import java.util.ArrayList; +import java.util.List; +import java.util.Locale; +import java.util.Set; + +import studio.one.platform.textract.model.ParsedFile; +import studio.one.platform.textract.service.FileContentExtractionService; +import studio.one.platform.thumbnail.ThumbnailFormats; +import 
studio.one.platform.thumbnail.ThumbnailGenerationException; +import studio.one.platform.thumbnail.ThumbnailImages; +import studio.one.platform.thumbnail.ThumbnailOptions; +import studio.one.platform.thumbnail.ThumbnailRenderer; +import studio.one.platform.thumbnail.ThumbnailResult; +import studio.one.platform.thumbnail.ThumbnailSource; +import studio.one.platform.thumbnail.renderer.DocxThumbnailRenderer; +import studio.one.platform.thumbnail.renderer.HwpThumbnailRenderer; +import studio.one.platform.thumbnail.renderer.HwpxThumbnailRenderer; + +class TextractDocumentPreviewThumbnailRenderer implements ThumbnailRenderer { + + private static final int CANVAS_WIDTH = 480; + private static final int CANVAS_HEIGHT = 640; + private static final int PADDING = 36; + private static final int MAX_LINES = 14; + private static final int MAX_CHARS = 1200; + + private final FileContentExtractionService extractionService; + private final String label; + private final Set<String> contentTypes; + private final Set<String> extensions; + + protected TextractDocumentPreviewThumbnailRenderer( + FileContentExtractionService extractionService, + String label, + Set<String> contentTypes, + Set<String> extensions) { + this.extractionService = extractionService; + this.label = label; + this.contentTypes = Set.copyOf(contentTypes); + this.extensions = Set.copyOf(extensions); + } + + static DocxThumbnailRenderer docx(FileContentExtractionService extractionService) { + return new TextractDocxPreviewThumbnailRenderer(extractionService); + } + + static HwpThumbnailRenderer hwp(FileContentExtractionService extractionService) { + return new TextractHwpPreviewThumbnailRenderer(extractionService); + } + + static HwpxThumbnailRenderer hwpx(FileContentExtractionService extractionService) { + return new TextractHwpxPreviewThumbnailRenderer(extractionService); + } + + @Override + public boolean supports(ThumbnailSource source) { + String contentType = source.contentType().toLowerCase(Locale.ROOT); + if
(contentTypes.contains(contentType)) { + return true; + } + String filename = source.filename().toLowerCase(Locale.ROOT); + return extensions.stream().anyMatch(filename::endsWith); + } + + @Override + public ThumbnailResult render(ThumbnailSource source, ThumbnailOptions options) { + try { + ParsedFile parsed = extractionService.parseStructured( + source.contentType(), + source.filename(), + new ByteArrayInputStream(source.bytes())); + BufferedImage preview = drawPreview(source.filename(), parsed); + BufferedImage scaled = ThumbnailImages.scale(preview, options.size()); + byte[] bytes = ThumbnailImages.write(scaled, options.format()); + return new ThumbnailResult(bytes, ThumbnailFormats.contentType(options.format()), options.format()); + } catch (Exception ex) { + throw new ThumbnailGenerationException("Failed to render document preview thumbnail source", ex); + } + } + + private BufferedImage drawPreview(String filename, ParsedFile parsed) { + BufferedImage image = new BufferedImage(CANVAS_WIDTH, CANVAS_HEIGHT, BufferedImage.TYPE_INT_RGB); + Graphics2D graphics = image.createGraphics(); + try { + graphics.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON); + graphics.setRenderingHint(RenderingHints.KEY_TEXT_ANTIALIASING, RenderingHints.VALUE_TEXT_ANTIALIAS_ON); + graphics.setColor(new Color(248, 250, 252)); + graphics.fillRect(0, 0, CANVAS_WIDTH, CANVAS_HEIGHT); + graphics.setColor(new Color(45, 65, 89)); + graphics.fillRect(0, 0, CANVAS_WIDTH, 88); + + graphics.setColor(Color.WHITE); + graphics.setFont(new Font(Font.SANS_SERIF, Font.BOLD, 30)); + graphics.drawString(label, PADDING, 55); + + graphics.setFont(new Font(Font.SANS_SERIF, Font.PLAIN, 15)); + graphics.drawString(trim(filename, 36), PADDING + 110, 55); + + graphics.setColor(new Color(226, 232, 240)); + graphics.fillRect(PADDING, 118, CANVAS_WIDTH - (PADDING * 2), 2); + + graphics.setColor(new Color(15, 23, 42)); + graphics.setFont(new Font(Font.SANS_SERIF, Font.BOLD, 
20)); + graphics.drawString(trim(title(filename, parsed), 34), PADDING, 158); + + graphics.setFont(new Font(Font.SANS_SERIF, Font.PLAIN, 17)); + graphics.setColor(new Color(51, 65, 85)); + FontMetrics metrics = graphics.getFontMetrics(); + int y = 200; + for (String line : previewLines(parsed.plainText(), metrics, CANVAS_WIDTH - (PADDING * 2))) { + graphics.drawString(line, PADDING, y); + y += 28; + } + + graphics.setColor(new Color(100, 116, 139)); + graphics.setFont(new Font(Font.SANS_SERIF, Font.PLAIN, 14)); + String footer = parsed.blocks().size() + " blocks / " + parsed.tables().size() + + " tables / " + parsed.images().size() + " images"; + graphics.drawString(footer, PADDING, CANVAS_HEIGHT - 34); + return image; + } finally { + graphics.dispose(); + } + } + + private String title(String filename, ParsedFile parsed) { + return parsed.plainText().lines() + .map(String::trim) + .filter(line -> !line.isBlank()) + .findFirst() + .orElse(filename == null || filename.isBlank() ? label + " document" : filename); + } + + private List<String> previewLines(String text, FontMetrics metrics, int maxWidth) { + String normalized = text == null ? "" : text.replaceAll("\\s+", " ").trim(); + if (normalized.length() > MAX_CHARS) { + normalized = normalized.substring(0, MAX_CHARS); + } + if (normalized.isBlank()) { + return List.of("No preview text extracted."); + } + List<String> lines = new ArrayList<>(); + StringBuilder current = new StringBuilder(); + for (String word : normalized.split(" ")) { + String candidate = current.isEmpty() ?
word : current + " " + word; + if (metrics.stringWidth(candidate) > maxWidth && !current.isEmpty()) { + lines.add(current.toString()); + current.setLength(0); + current.append(word); + if (lines.size() == MAX_LINES) { + return lines; + } + } else { + current.setLength(0); + current.append(candidate); + } + } + if (!current.isEmpty() && lines.size() < MAX_LINES) { + lines.add(current.toString()); + } + return lines; + } + + private String trim(String value, int maxLength) { + if (value == null || value.length() <= maxLength) { + return value == null ? "" : value; + } + return value.substring(0, Math.max(0, maxLength - 3)) + "..."; + } + + private static final class TextractDocxPreviewThumbnailRenderer + extends TextractDocumentPreviewThumbnailRenderer implements DocxThumbnailRenderer { + + private TextractDocxPreviewThumbnailRenderer(FileContentExtractionService extractionService) { + super( + extractionService, + "DOCX", + Set.of("application/vnd.openxmlformats-officedocument.wordprocessingml.document"), + Set.of(".docx")); + } + } + + private static final class TextractHwpPreviewThumbnailRenderer + extends TextractDocumentPreviewThumbnailRenderer implements HwpThumbnailRenderer { + + private TextractHwpPreviewThumbnailRenderer(FileContentExtractionService extractionService) { + super( + extractionService, + "HWP", + Set.of("application/x-hwp", "application/haansofthwp", "application/vnd.hancom.hwp"), + Set.of(".hwp")); + } + } + + private static final class TextractHwpxPreviewThumbnailRenderer + extends TextractDocumentPreviewThumbnailRenderer implements HwpxThumbnailRenderer { + + private TextractHwpxPreviewThumbnailRenderer(FileContentExtractionService extractionService) { + super( + extractionService, + "HWPX", + Set.of("application/x-hwpx", "application/vnd.hancom.hwpx", "application/hwpx"), + Set.of(".hwpx")); + } + } +} diff --git 
a/starter/studio-platform-thumbnail-starter/src/main/java/studio/one/platform/thumbnail/autoconfigure/ThumbnailAutoConfiguration.java b/starter/studio-platform-thumbnail-starter/src/main/java/studio/one/platform/thumbnail/autoconfigure/ThumbnailAutoConfiguration.java index 84463bcf..f5660fe7 100644 --- a/starter/studio-platform-thumbnail-starter/src/main/java/studio/one/platform/thumbnail/autoconfigure/ThumbnailAutoConfiguration.java +++ b/starter/studio-platform-thumbnail-starter/src/main/java/studio/one/platform/thumbnail/autoconfigure/ThumbnailAutoConfiguration.java @@ -24,6 +24,7 @@ import studio.one.platform.thumbnail.ThumbnailRendererFactory; import studio.one.platform.thumbnail.renderer.ImageThumbnailRenderer; import studio.one.platform.thumbnail.renderer.PdfThumbnailRenderer; +import studio.one.platform.thumbnail.renderer.PptxThumbnailRenderer; import studio.one.platform.util.I18nUtils; import studio.one.platform.util.LogUtils; @@ -59,6 +60,16 @@ PdfThumbnailRenderer pdfThumbnailRenderer() { return new PdfThumbnailRenderer(props.getRenderers().getPdf().getPage()); } + @Bean + @Order(300) + @ConditionalOnClass(name = "org.apache.poi.xslf.usermodel.XMLSlideShow") + @ConditionalOnProperty(prefix = "studio.thumbnail.renderers.pptx", name = "enabled", havingValue = "true") + @ConditionalOnMissingBean(PptxThumbnailRenderer.class) + PptxThumbnailRenderer pptxThumbnailRenderer() { + logCreated(PptxThumbnailRenderer.class); + return new PptxThumbnailRenderer(props.getRenderers().getPptx().getSlide()); + } + @Bean @ConditionalOnMissingBean ThumbnailRendererFactory thumbnailRendererFactory(ObjectProvider rendererProvider) { diff --git a/starter/studio-platform-thumbnail-starter/src/main/java/studio/one/platform/thumbnail/autoconfigure/ThumbnailProperties.java b/starter/studio-platform-thumbnail-starter/src/main/java/studio/one/platform/thumbnail/autoconfigure/ThumbnailProperties.java index 8f296a9a..d291736b 100644 --- 
a/starter/studio-platform-thumbnail-starter/src/main/java/studio/one/platform/thumbnail/autoconfigure/ThumbnailProperties.java +++ b/starter/studio-platform-thumbnail-starter/src/main/java/studio/one/platform/thumbnail/autoconfigure/ThumbnailProperties.java @@ -128,8 +128,11 @@ public static long parseToBytes(String value, String propertyName) { private static String normalizeDataSize(String value) { String trimmed = value.trim(); String upper = trimmed.toUpperCase(Locale.ROOT); + if (upper.endsWith("KB") || upper.endsWith("MB") || upper.endsWith("GB") || upper.endsWith("TB")) { + return upper; + } if (upper.endsWith("K") || upper.endsWith("M") || upper.endsWith("G") || upper.endsWith("T")) { - return trimmed + "B"; + return upper + "B"; } return trimmed; } @@ -143,6 +146,18 @@ public static class Renderers { @Valid private PdfRenderer pdf = new PdfRenderer(); + + @Valid + private PptxRenderer pptx = new PptxRenderer(); + + @Valid + private Renderer docx = disabledRenderer(); + + @Valid + private Renderer hwp = disabledRenderer(); + + @Valid + private Renderer hwpx = disabledRenderer(); } @Getter @@ -161,4 +176,20 @@ public PdfRenderer() { setEnabled(false); } } + + @Getter + @Setter + public static class PptxRenderer extends Renderer { + private int slide = 0; + + public PptxRenderer() { + setEnabled(false); + } + } + + private static Renderer disabledRenderer() { + Renderer renderer = new Renderer(); + renderer.setEnabled(false); + return renderer; + } } diff --git a/starter/studio-platform-thumbnail-starter/src/main/java/studio/one/platform/thumbnail/autoconfigure/ThumbnailTextractPreviewAutoConfiguration.java b/starter/studio-platform-thumbnail-starter/src/main/java/studio/one/platform/thumbnail/autoconfigure/ThumbnailTextractPreviewAutoConfiguration.java new file mode 100644 index 00000000..60007772 --- /dev/null +++ 
b/starter/studio-platform-thumbnail-starter/src/main/java/studio/one/platform/thumbnail/autoconfigure/ThumbnailTextractPreviewAutoConfiguration.java @@ -0,0 +1,44 @@ +package studio.one.platform.thumbnail.autoconfigure; + +import org.springframework.boot.autoconfigure.AutoConfiguration; +import org.springframework.boot.autoconfigure.condition.ConditionalOnBean; +import org.springframework.boot.autoconfigure.condition.ConditionalOnClass; +import org.springframework.boot.autoconfigure.condition.ConditionalOnMissingBean; +import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty; +import org.springframework.context.annotation.Bean; +import org.springframework.core.annotation.Order; + +import studio.one.platform.textract.service.FileContentExtractionService; +import studio.one.platform.thumbnail.renderer.DocxThumbnailRenderer; +import studio.one.platform.thumbnail.renderer.HwpThumbnailRenderer; +import studio.one.platform.thumbnail.renderer.HwpxThumbnailRenderer; + +@AutoConfiguration(after = ThumbnailAutoConfiguration.class) +@ConditionalOnClass(name = "studio.one.platform.textract.service.FileContentExtractionService") +@ConditionalOnBean(type = "studio.one.platform.textract.service.FileContentExtractionService") +public class ThumbnailTextractPreviewAutoConfiguration { + + @Bean + @Order(400) + @ConditionalOnProperty(prefix = "studio.thumbnail.renderers.docx", name = "enabled", havingValue = "true") + @ConditionalOnMissingBean(DocxThumbnailRenderer.class) + DocxThumbnailRenderer docxThumbnailRenderer(FileContentExtractionService extractionService) { + return TextractDocumentPreviewThumbnailRenderer.docx(extractionService); + } + + @Bean + @Order(410) + @ConditionalOnProperty(prefix = "studio.thumbnail.renderers.hwp", name = "enabled", havingValue = "true") + @ConditionalOnMissingBean(HwpThumbnailRenderer.class) + HwpThumbnailRenderer hwpThumbnailRenderer(FileContentExtractionService extractionService) { + return 
TextractDocumentPreviewThumbnailRenderer.hwp(extractionService); + } + + @Bean + @Order(420) + @ConditionalOnProperty(prefix = "studio.thumbnail.renderers.hwpx", name = "enabled", havingValue = "true") + @ConditionalOnMissingBean(HwpxThumbnailRenderer.class) + HwpxThumbnailRenderer hwpxThumbnailRenderer(FileContentExtractionService extractionService) { + return TextractDocumentPreviewThumbnailRenderer.hwpx(extractionService); + } +} diff --git a/starter/studio-platform-thumbnail-starter/src/main/resources/META-INF/additional-spring-configuration-metadata.json b/starter/studio-platform-thumbnail-starter/src/main/resources/META-INF/additional-spring-configuration-metadata.json index 22dff0ea..3729cd32 100644 --- a/starter/studio-platform-thumbnail-starter/src/main/resources/META-INF/additional-spring-configuration-metadata.json +++ b/starter/studio-platform-thumbnail-starter/src/main/resources/META-INF/additional-spring-configuration-metadata.json @@ -74,6 +74,36 @@ "defaultValue": 0, "description": "Zero-based PDF page index to render as the thumbnail source." }, + { + "name": "studio.thumbnail.renderers.pptx.enabled", + "type": "java.lang.Boolean", + "defaultValue": false, + "description": "Enable Apache POI-based PPTX thumbnail rendering when POI OOXML is on the classpath." + }, + { + "name": "studio.thumbnail.renderers.pptx.slide", + "type": "java.lang.Integer", + "defaultValue": 0, + "description": "Zero-based PPTX slide index rendered as the thumbnail source." + }, + { + "name": "studio.thumbnail.renderers.docx.enabled", + "type": "java.lang.Boolean", + "defaultValue": false, + "description": "Enable DOCX preview thumbnail rendering from FileContentExtractionService parsed text. Requires the textract service bean." + }, + { + "name": "studio.thumbnail.renderers.hwp.enabled", + "type": "java.lang.Boolean", + "defaultValue": false, + "description": "Enable HWP preview thumbnail rendering from FileContentExtractionService parsed text. 
Requires the textract service bean." + }, + { + "name": "studio.thumbnail.renderers.hwpx.enabled", + "type": "java.lang.Boolean", + "defaultValue": false, + "description": "Enable HWPX preview thumbnail rendering from FileContentExtractionService parsed text. Requires the textract service bean." + }, { "name": "studio.attachment.thumbnail.default-size", "type": "java.lang.Integer", @@ -91,6 +121,24 @@ "reason": "Thumbnail generation defaults moved to the platform thumbnail service.", "replacement": "studio.thumbnail.default-format" } + }, + { + "name": "studio.features.attachment.thumbnail.default-size", + "type": "java.lang.Integer", + "deprecation": { + "level": "warning", + "reason": "Thumbnail generation defaults moved to the platform thumbnail service.", + "replacement": "studio.thumbnail.default-size" + } + }, + { + "name": "studio.features.attachment.thumbnail.default-format", + "type": "java.lang.String", + "deprecation": { + "level": "warning", + "reason": "Thumbnail generation defaults moved to the platform thumbnail service.", + "replacement": "studio.thumbnail.default-format" + } } ] } diff --git a/starter/studio-platform-thumbnail-starter/src/main/resources/META-INF/spring/org.springframework.boot.autoconfigure.AutoConfiguration.imports b/starter/studio-platform-thumbnail-starter/src/main/resources/META-INF/spring/org.springframework.boot.autoconfigure.AutoConfiguration.imports index b86ba89a..e23574d4 100644 --- a/starter/studio-platform-thumbnail-starter/src/main/resources/META-INF/spring/org.springframework.boot.autoconfigure.AutoConfiguration.imports +++ b/starter/studio-platform-thumbnail-starter/src/main/resources/META-INF/spring/org.springframework.boot.autoconfigure.AutoConfiguration.imports @@ -1 +1,2 @@ studio.one.platform.thumbnail.autoconfigure.ThumbnailAutoConfiguration +studio.one.platform.thumbnail.autoconfigure.ThumbnailTextractPreviewAutoConfiguration diff --git 
a/starter/studio-platform-thumbnail-starter/src/test/java/studio/one/platform/thumbnail/autoconfigure/ThumbnailAutoConfigurationTest.java b/starter/studio-platform-thumbnail-starter/src/test/java/studio/one/platform/thumbnail/autoconfigure/ThumbnailAutoConfigurationTest.java index f942cc85..a2966635 100644 --- a/starter/studio-platform-thumbnail-starter/src/test/java/studio/one/platform/thumbnail/autoconfigure/ThumbnailAutoConfigurationTest.java +++ b/starter/studio-platform-thumbnail-starter/src/test/java/studio/one/platform/thumbnail/autoconfigure/ThumbnailAutoConfigurationTest.java @@ -2,11 +2,13 @@ import static org.assertj.core.api.Assertions.assertThat; +import java.io.ByteArrayInputStream; import java.lang.reflect.Field; import java.util.List; import java.util.Set; import org.apache.pdfbox.pdmodel.PDDocument; +import org.apache.poi.xslf.usermodel.XMLSlideShow; import org.junit.jupiter.api.BeforeEach; import org.junit.jupiter.api.Test; import org.junit.jupiter.api.extension.ExtendWith; @@ -18,12 +20,22 @@ import org.springframework.boot.test.system.OutputCaptureExtension; import studio.one.platform.autoconfigure.ConfigurationPropertyMigration; +import studio.one.platform.textract.extractor.DocumentFormat; +import studio.one.platform.textract.extractor.FileParser; +import studio.one.platform.textract.extractor.FileParserFactory; +import studio.one.platform.textract.model.ParsedFile; +import studio.one.platform.textract.service.FileContentExtractionService; import studio.one.platform.thumbnail.ThumbnailGenerationOptions; import studio.one.platform.thumbnail.ThumbnailGenerationService; +import studio.one.platform.thumbnail.ThumbnailOptions; import studio.one.platform.thumbnail.ThumbnailRenderer; import studio.one.platform.thumbnail.ThumbnailRendererFactory; +import studio.one.platform.thumbnail.ThumbnailResult; +import studio.one.platform.thumbnail.ThumbnailSource; +import studio.one.platform.thumbnail.renderer.DocxThumbnailRenderer; import 
studio.one.platform.thumbnail.renderer.ImageThumbnailRenderer; import studio.one.platform.thumbnail.renderer.PdfThumbnailRenderer; +import studio.one.platform.thumbnail.renderer.PptxThumbnailRenderer; @ExtendWith(OutputCaptureExtension.class) class ThumbnailAutoConfigurationTest { @@ -31,7 +43,8 @@ class ThumbnailAutoConfigurationTest { private final ApplicationContextRunner contextRunner = new ApplicationContextRunner() .withConfiguration(AutoConfigurations.of( ValidationAutoConfiguration.class, - ThumbnailAutoConfiguration.class)); + ThumbnailAutoConfiguration.class, + ThumbnailTextractPreviewAutoConfiguration.class)); @BeforeEach void resetWarnings() throws Exception { @@ -72,6 +85,18 @@ void createsServiceWithHumanReadableMaxSourceSize(CapturedOutput output) { }); } + @Test + void acceptsLowercaseHumanReadableMaxSourceSize() { + contextRunner + .withPropertyValues("studio.thumbnail.max-source-size=10mb") + .run(context -> { + assertThat(context).hasNotFailed(); + ThumbnailGenerationOptions options = context.getBean(ThumbnailGenerationService.class) + .generationOptions(); + assertThat(options.maxSourceBytes()).isEqualTo(10 * 1024 * 1024); + }); + } + @Test void pdfRendererIsOptIn() { contextRunner @@ -104,6 +129,108 @@ void pdfRendererIsConditionalOnPdfboxClasspath() { }); } + @Test + void documentRenderersAreOptIn() { + contextRunner + .run(context -> { + assertThat(context).hasNotFailed(); + assertThat(context).doesNotHaveBean(PptxThumbnailRenderer.class); + assertThat(context).doesNotHaveBean(TextractDocumentPreviewThumbnailRenderer.class); + }); + } + + @Test + void pptxRendererIsRegisteredWhenExplicitlyEnabled() { + contextRunner + .withPropertyValues("studio.thumbnail.renderers.pptx.enabled=true") + .run(context -> { + assertThat(context).hasNotFailed(); + assertThat(context).hasSingleBean(PptxThumbnailRenderer.class); + }); + } + + @Test + void pptxRendererIsConditionalOnPoiClasspath() { + contextRunner + 
.withPropertyValues("studio.thumbnail.renderers.pptx.enabled=true") + .withClassLoader(new FilteredClassLoader(XMLSlideShow.class)) + .run(context -> { + assertThat(context).hasNotFailed(); + assertThat(context).doesNotHaveBean(PptxThumbnailRenderer.class); + }); + } + + @Test + void textractPreviewRenderersRequireTextractServiceBean() { + contextRunner + .withPropertyValues( + "studio.thumbnail.renderers.docx.enabled=true", + "studio.thumbnail.renderers.hwp.enabled=true", + "studio.thumbnail.renderers.hwpx.enabled=true") + .run(context -> { + assertThat(context).hasNotFailed(); + assertThat(context).doesNotHaveBean(TextractDocumentPreviewThumbnailRenderer.class); + }); + } + + @Test + void textractPreviewAutoConfigurationIsSafeWhenTextractClasspathIsMissing() { + contextRunner + .withPropertyValues("studio.thumbnail.renderers.docx.enabled=true") + .withClassLoader(new FilteredClassLoader(FileContentExtractionService.class)) + .run(context -> { + assertThat(context).hasNotFailed(); + assertThat(context).doesNotHaveBean(TextractDocumentPreviewThumbnailRenderer.class); + }); + } + + @Test + void textractPreviewRenderersAreRegisteredWhenExplicitlyEnabledAndTextractServiceExists() { + contextRunner + .withBean(FileContentExtractionService.class, ThumbnailAutoConfigurationTest::textExtractionService) + .withPropertyValues( + "studio.thumbnail.renderers.docx.enabled=true", + "studio.thumbnail.renderers.hwp.enabled=true", + "studio.thumbnail.renderers.hwpx.enabled=true") + .run(context -> { + assertThat(context).hasNotFailed(); + assertThat(context.getBeansOfType(TextractDocumentPreviewThumbnailRenderer.class)) + .hasSize(3); + }); + } + + @Test + void customDocxRendererByTypePreventsDefaultPreviewRenderer() { + DocxThumbnailRenderer customRenderer = new TestDocxRenderer(); + + contextRunner + .withBean("customDocxRenderer", DocxThumbnailRenderer.class, () -> customRenderer) + .withBean(FileContentExtractionService.class, 
ThumbnailAutoConfigurationTest::textExtractionService) + .withPropertyValues("studio.thumbnail.renderers.docx.enabled=true") + .run(context -> { + assertThat(context).hasNotFailed(); + assertThat(context).hasSingleBean(DocxThumbnailRenderer.class); + assertThat(context).getBean(DocxThumbnailRenderer.class).isSameAs(customRenderer); + }); + } + + @Test + void textractPreviewRendererGeneratesPngFromParsedText() throws Exception { + DocxThumbnailRenderer renderer = + TextractDocumentPreviewThumbnailRenderer.docx(textExtractionService()); + + ThumbnailResult result = renderer.render( + new ThumbnailSource( + "application/vnd.openxmlformats-officedocument.wordprocessingml.document", + "sample.docx", + "docx".getBytes()), + new ThumbnailOptions(96, "png", 25_000_000, 1024 * 1024)); + + assertThat(result.contentType()).isEqualTo("image/png"); + assertThat(javax.imageio.ImageIO.read(new ByteArrayInputStream(result.bytes())).getWidth()) + .isLessThanOrEqualTo(96); + } + @Test void preservesUserDefinedRendererAndGenerationService() { ThumbnailRenderer customRenderer = new TestRenderer(); @@ -198,4 +325,27 @@ public studio.one.platform.thumbnail.ThumbnailResult render( throw new UnsupportedOperationException(); } } + + private static final class TestDocxRenderer extends TestRenderer implements DocxThumbnailRenderer { + } + + private static FileContentExtractionService textExtractionService() { + FileParser parser = new FileParser() { + @Override + public boolean supports(String contentType, String filename) { + return true; + } + + @Override + public String parse(byte[] bytes, String contentType, String filename) { + return "Hello document\nSecond line"; + } + + @Override + public ParsedFile parseStructured(byte[] bytes, String contentType, String filename) { + return ParsedFile.textOnly(DocumentFormat.DOCX, parse(bytes, contentType, filename), filename); + } + }; + return new FileContentExtractionService(new FileParserFactory(List.of(parser))); + } } diff --git 
a/studio-application-modules/attachment-service/README.md b/studio-application-modules/attachment-service/README.md index 5b0772ec..d8b0ba3f 100644 --- a/studio-application-modules/attachment-service/README.md +++ b/studio-application-modules/attachment-service/README.md @@ -47,6 +47,15 @@ studio: pdf: enabled: false # registered only when PDFBox is on the classpath and this is explicitly true page: 0 + pptx: + enabled: false # POI-based opt-in renderer + slide: 0 + docx: + enabled: false # opt-in renderer based on textract preview + hwp: + enabled: false # opt-in renderer based on textract preview + hwpx: + enabled: false # opt-in renderer based on textract preview ``` ### How it works @@ -54,7 +63,7 @@ studio: - When `persistence` is `jpa`, `AttachmentJpaRepository` + `JpaFileStore` (when database storage is selected) are used; when it is `jdbc`, `JdbcAttachmentRepository` + `JdbcFileStore`. - `attachment.storage.type=filesystem` → binaries are stored in `LocalFileStore`. - `attachment.storage.type=database` → binaries are stored in the selected persistence store, and with `cache-enabled=true`, `LocalFileStore` serves as a read cache. -- If a `ThumbnailGenerationService` is present, image thumbnails are generated. PDF thumbnails are generated only after explicit opt-in with `studio.thumbnail.renderers.pdf.enabled=true`. Documents with no supporting renderer, or that cannot be converted, return 204 from `/thumbnail`. +- If a `ThumbnailGenerationService` is present, image thumbnails are generated. PDF/PPTX/DOCX/HWP/HWPX thumbnails are generated only after explicit opt-in with the respective `studio.thumbnail.renderers..enabled=true`. PDF requires PDFBox, PPTX requires POI, and DOCX/HWP/HWPX require the textract `FileContentExtractionService`. If no stored thumbnail exists, `/thumbnail` immediately returns a placeholder image with the `X-Thumbnail-Status: pending` header, and the actual generation runs on the `attachmentThumbnailExecutor` registered by the starter, after which the result is stored. Tests or custom configurations that construct `ThumbnailServiceImpl` directly should use the constructor that accepts an executor to avoid generating on the request thread. For documents with no supporting renderer, or that cannot be converted, the failure state is memoized with a bounded TTL and subsequent requests return 204 with `X-Thumbnail-Status: unavailable`. Because DOCX/HWP/HWPX preview uses the textract parser surface, enable it only when needed, and set `studio.textract.max-extract-size` conservatively as a limit on compressed input size. The DOCX parser applies a separate entry/total budget to decompression work.
- In production, set `studio.attachment.storage.base-dir` and `studio.attachment.thumbnail.base-dir` explicitly to private, application-only paths. The default tmp path is for local development convenience. - With `web.enabled=true`, `AttachmentMgmtController`/`AttachmentController`/`MeAttachmentController` are registered, and their paths are determined by `base-path`/`mgmt-base-path`/`self-base`. - If an `ObjectTypeRuntimeService` bean exists, policy validation is performed with `validateUpload` on upload (skipped otherwise). diff --git a/studio-application-modules/attachment-service/src/main/java/studio/one/application/attachment/thumbnail/ThumbnailData.java b/studio-application-modules/attachment-service/src/main/java/studio/one/application/attachment/thumbnail/ThumbnailData.java index 70c401a6..35fb4f52 100644 --- a/studio-application-modules/attachment-service/src/main/java/studio/one/application/attachment/thumbnail/ThumbnailData.java +++ b/studio-application-modules/attachment-service/src/main/java/studio/one/application/attachment/thumbnail/ThumbnailData.java @@ -3,10 +3,16 @@ public class ThumbnailData { private final byte[] bytes; private final String contentType; + private final String status; public ThumbnailData(byte[] bytes, String contentType) { + this(bytes, contentType, "ready"); + } + + public ThumbnailData(byte[] bytes, String contentType, String status) { this.bytes = bytes; this.contentType = contentType; + this.status = status; } public byte[] getBytes() { @@ -16,4 +22,12 @@ public byte[] getBytes() { public String getContentType() { return contentType; } + + public String getStatus() { + return status; + } + + public boolean isPending() { + return "pending".equalsIgnoreCase(status); + } } diff --git a/studio-application-modules/attachment-service/src/main/java/studio/one/application/attachment/thumbnail/ThumbnailPlaceholder.java b/studio-application-modules/attachment-service/src/main/java/studio/one/application/attachment/thumbnail/ThumbnailPlaceholder.java new file mode 100644 index 00000000..82d870a4 --- /dev/null +++
b/studio-application-modules/attachment-service/src/main/java/studio/one/application/attachment/thumbnail/ThumbnailPlaceholder.java @@ -0,0 +1,56 @@ +package studio.one.application.attachment.thumbnail; + +import java.awt.Color; +import java.awt.Font; +import java.awt.FontMetrics; +import java.awt.Graphics2D; +import java.awt.RenderingHints; +import java.awt.image.BufferedImage; +import java.io.ByteArrayOutputStream; +import java.io.IOException; + +import javax.imageio.ImageIO; + +final class ThumbnailPlaceholder { + + private ThumbnailPlaceholder() { + } + + static ThumbnailData pending(int size) { + int resolvedSize = Math.max(16, size); + BufferedImage image = new BufferedImage(resolvedSize, resolvedSize, BufferedImage.TYPE_INT_RGB); + Graphics2D graphics = image.createGraphics(); + try { + graphics.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON); + graphics.setRenderingHint(RenderingHints.KEY_TEXT_ANTIALIASING, RenderingHints.VALUE_TEXT_ANTIALIAS_ON); + graphics.setColor(new Color(241, 245, 249)); + graphics.fillRect(0, 0, resolvedSize, resolvedSize); + graphics.setColor(new Color(148, 163, 184)); + graphics.drawRect(0, 0, resolvedSize - 1, resolvedSize - 1); + + graphics.setColor(new Color(71, 85, 105)); + int fontSize = Math.max(9, Math.min(14, resolvedSize / 9)); + graphics.setFont(new Font(Font.SANS_SERIF, Font.BOLD, fontSize)); + drawCentered(graphics, "Pending", resolvedSize, resolvedSize); + } finally { + graphics.dispose(); + } + return new ThumbnailData(writePng(image), "image/png", "pending"); + } + + private static void drawCentered(Graphics2D graphics, String text, int width, int height) { + FontMetrics metrics = graphics.getFontMetrics(); + int x = Math.max(0, (width - metrics.stringWidth(text)) / 2); + int y = Math.max(metrics.getAscent(), (height - metrics.getHeight()) / 2 + metrics.getAscent()); + graphics.drawString(text, x, y); + } + + private static byte[] writePng(BufferedImage image) { + try 
(ByteArrayOutputStream out = new ByteArrayOutputStream()) { + ImageIO.write(image, "png", out); + return out.toByteArray(); + } catch (IOException ex) { + throw new IllegalStateException("Failed to write thumbnail placeholder", ex); + } + } +} diff --git a/studio-application-modules/attachment-service/src/main/java/studio/one/application/attachment/thumbnail/ThumbnailServiceImpl.java b/studio-application-modules/attachment-service/src/main/java/studio/one/application/attachment/thumbnail/ThumbnailServiceImpl.java index 6a718e81..a116eead 100644 --- a/studio-application-modules/attachment-service/src/main/java/studio/one/application/attachment/thumbnail/ThumbnailServiceImpl.java +++ b/studio-application-modules/attachment-service/src/main/java/studio/one/application/attachment/thumbnail/ThumbnailServiceImpl.java @@ -2,27 +2,47 @@ import java.io.ByteArrayInputStream; import java.io.InputStream; +import java.time.Duration; +import java.time.Instant; +import java.util.Iterator; +import java.util.LinkedHashMap; import java.util.List; +import java.util.Map; import java.util.Optional; +import java.util.Set; +import java.util.concurrent.ConcurrentHashMap; +import java.util.concurrent.Executor; +import java.util.concurrent.RejectedExecutionException; -import lombok.RequiredArgsConstructor; import lombok.extern.slf4j.Slf4j; import studio.one.application.attachment.domain.model.Attachment; import studio.one.application.attachment.service.AttachmentService; import studio.one.platform.thumbnail.ThumbnailGenerationOptions; import studio.one.platform.thumbnail.ThumbnailGenerationService; +import studio.one.platform.thumbnail.ThumbnailGenerationException; import studio.one.platform.thumbnail.ThumbnailOptions; import studio.one.platform.thumbnail.ThumbnailRendererFactory; import studio.one.platform.thumbnail.ThumbnailResult; import studio.one.platform.thumbnail.renderer.ImageThumbnailRenderer; -@RequiredArgsConstructor @Slf4j public class ThumbnailServiceImpl implements 
ThumbnailService { private final AttachmentService attachmentService; private final ThumbnailStorage thumbnailStorage; private final ThumbnailGenerationService thumbnailGenerationService; + private final FailureMemo failedThumbnails = new FailureMemo(10_000, Duration.ofMinutes(10)); + private final DeletionMemo deletedThumbnails = new DeletionMemo(10_000, Duration.ofMinutes(10)); + private final Set<ThumbnailSourceKey> runningThumbnails = ConcurrentHashMap.newKeySet(); + private final Object[] generationLocks = createLocks(64); + private final Executor generationExecutor; + + public ThumbnailServiceImpl( + AttachmentService attachmentService, + ThumbnailStorage thumbnailStorage, + ThumbnailGenerationService thumbnailGenerationService) { + this(attachmentService, thumbnailStorage, thumbnailGenerationService, Runnable::run); + } /** * @deprecated Use {@link #ThumbnailServiceImpl(AttachmentService, ThumbnailStorage, @@ -35,7 +55,18 @@ public ThumbnailServiceImpl( ThumbnailStorage thumbnailStorage, int defaultSize, String defaultFormat) { - this(attachmentService, thumbnailStorage, legacyGenerationService(defaultSize, defaultFormat)); + this(attachmentService, thumbnailStorage, legacyGenerationService(defaultSize, defaultFormat), Runnable::run); + } + + public ThumbnailServiceImpl( + AttachmentService attachmentService, + ThumbnailStorage thumbnailStorage, + ThumbnailGenerationService thumbnailGenerationService, + Executor generationExecutor) { + this.attachmentService = attachmentService; + this.thumbnailStorage = thumbnailStorage; + this.thumbnailGenerationService = thumbnailGenerationService; + this.generationExecutor = generationExecutor == null ?
Runnable::run : generationExecutor; } @Override @@ -49,13 +80,46 @@ public Optional<ThumbnailData> getOrCreate(Attachment attachment, int size, Stri attachment.getAttachmentId(), options.size(), options.format()); + ThumbnailSourceKey sourceKey = ThumbnailSourceKey.from(attachment, options.format()); try (InputStream cached = thumbnailStorage.load(key)) { byte[] bytes = cached.readAllBytes(); return Optional.of(new ThumbnailData(bytes, contentTypeFor(options.format()))); } catch (RuntimeException | java.io.IOException ignored) { // cache miss or load error -> generate } + if (failedThumbnails.contains(sourceKey)) { + return Optional.empty(); + } + if (!enqueueGeneration(attachment, key, sourceKey, options)) { + return Optional.empty(); + } + return Optional.of(ThumbnailPlaceholder.pending(options.size())); + } + private boolean enqueueGeneration( + Attachment attachment, + ThumbnailKey key, + ThumbnailSourceKey sourceKey, + ThumbnailOptions options) { + if (!runningThumbnails.add(sourceKey)) { + return true; + } + try { + generationExecutor.execute(() -> generateAndStore(attachment, key, sourceKey, options)); + return true; + } catch (RejectedExecutionException ex) { + runningThumbnails.remove(sourceKey); + log.warn("Thumbnail generation queue rejected id={}: {}", + attachment.getAttachmentId(), ex.getMessage()); + return false; + } + } + + private void generateAndStore( + Attachment attachment, + ThumbnailKey key, + ThumbnailSourceKey sourceKey, + ThumbnailOptions options) { try (InputStream source = attachmentService.getInputStream(attachment)) { Optional<ThumbnailResult> result = thumbnailGenerationService.generate( attachment.getContentType(), @@ -64,14 +128,26 @@ public Optional<ThumbnailData> getOrCreate(Attachment attachment, int size, Stri options.size(), options.format()); if (result.isEmpty()) { - return Optional.empty(); + failedThumbnails.add(sourceKey); + return; } ThumbnailResult thumbnail = result.get(); - thumbnailStorage.save(key, new ByteArrayInputStream(thumbnail.bytes())); - return
Optional.of(new ThumbnailData(thumbnail.bytes(), thumbnail.contentType())); + AttachmentIdentity identity = AttachmentIdentity.from(attachment); + synchronized (lockFor(identity)) { + if (deletedThumbnails.contains(identity)) { + return; + } + thumbnailStorage.save(key, new ByteArrayInputStream(thumbnail.bytes())); + } + failedThumbnails.remove(sourceKey); + } catch (ThumbnailGenerationException e) { + failedThumbnails.add(sourceKey); + log.warn("Thumbnail generate failed for id={}: {}", attachment.getAttachmentId(), e.getMessage()); } catch (Exception e) { + failedThumbnails.add(sourceKey); log.warn("Thumbnail generate failed for id={}: {}", attachment.getAttachmentId(), e.getMessage()); - return Optional.empty(); + } finally { + runningThumbnails.remove(sourceKey); } } @@ -81,7 +157,14 @@ public void deleteAll(Attachment attachment) { return; } try { - thumbnailStorage.deleteAll(attachment.getObjectType(), attachment.getAttachmentId()); + AttachmentIdentity identity = AttachmentIdentity.from(attachment); + synchronized (lockFor(identity)) { + thumbnailStorage.deleteAll(attachment.getObjectType(), attachment.getAttachmentId()); + deletedThumbnails.add(identity); + failedThumbnails.removeAttachment(attachment.getObjectType(), attachment.getAttachmentId()); + runningThumbnails.removeIf(key -> key.objectType() == attachment.getObjectType() + && key.attachmentId() == attachment.getAttachmentId()); + } } catch (RuntimeException ex) { log.debug("Thumbnail deleteAll failed for id={}: {}", attachment.getAttachmentId(), ex.getMessage()); } @@ -99,4 +182,118 @@ private static ThumbnailGenerationService legacyGenerationService(int defaultSiz new ThumbnailRendererFactory(List.of(new ImageThumbnailRenderer())), new ThumbnailGenerationOptions(defaultSize, defaultFormat, 16, 512, 50L * 1024L * 1024L, 25_000_000L)); } + + private Object lockFor(AttachmentIdentity identity) { + int index = Math.floorMod(identity.hashCode(), generationLocks.length); + return generationLocks[index]; 
+ } + + private static Object[] createLocks(int count) { + Object[] locks = new Object[count]; + for (int i = 0; i < locks.length; i++) { + locks[i] = new Object(); + } + return locks; + } + + private record AttachmentIdentity(int objectType, long attachmentId) { + + private static AttachmentIdentity from(Attachment attachment) { + return new AttachmentIdentity(attachment.getObjectType(), attachment.getAttachmentId()); + } + } + + private record ThumbnailSourceKey(int objectType, long attachmentId, String format) { + + private static ThumbnailSourceKey from(Attachment attachment, String format) { + return new ThumbnailSourceKey( + attachment.getObjectType(), + attachment.getAttachmentId(), + format == null ? "" : format.toLowerCase(java.util.Locale.ROOT)); + } + } + + private static final class DeletionMemo { + + private final int maxEntries; + private final long ttlMillis; + private final LinkedHashMap<AttachmentIdentity, Long> entries = new LinkedHashMap<>(16, 0.75f, true); + + private DeletionMemo(int maxEntries, Duration ttl) { + this.maxEntries = maxEntries; + this.ttlMillis = ttl.toMillis(); + } + + private synchronized boolean contains(AttachmentIdentity key) { + long now = Instant.now().toEpochMilli(); + Long deletedAt = entries.get(key); + if (deletedAt == null) { + return false; + } + if (now - deletedAt > ttlMillis) { + entries.remove(key); + return false; + } + return true; + } + + private synchronized void add(AttachmentIdentity key) { + entries.put(key, Instant.now().toEpochMilli()); + trim(); + } + + private void trim() { + Iterator<Map.Entry<AttachmentIdentity, Long>> iterator = entries.entrySet().iterator(); + while (entries.size() > maxEntries && iterator.hasNext()) { + iterator.next(); + iterator.remove(); + } + } + } + + private static final class FailureMemo { + + private final int maxEntries; + private final long ttlMillis; + private final LinkedHashMap<ThumbnailSourceKey, Long> entries = new LinkedHashMap<>(16, 0.75f, true); + + private FailureMemo(int maxEntries, Duration ttl) { + this.maxEntries = maxEntries; + this.ttlMillis
= ttl.toMillis(); + } + + private synchronized boolean contains(ThumbnailSourceKey key) { + long now = Instant.now().toEpochMilli(); + Long failedAt = entries.get(key); + if (failedAt == null) { + return false; + } + if (now - failedAt > ttlMillis) { + entries.remove(key); + return false; + } + return true; + } + + private synchronized void add(ThumbnailSourceKey key) { + entries.put(key, Instant.now().toEpochMilli()); + trim(); + } + + private synchronized void remove(ThumbnailSourceKey key) { + entries.remove(key); + } + + private synchronized void removeAttachment(int objectType, long attachmentId) { + entries.keySet().removeIf(key -> key.objectType() == objectType && key.attachmentId() == attachmentId); + } + + private void trim() { + Iterator<Map.Entry<ThumbnailSourceKey, Long>> iterator = entries.entrySet().iterator(); + while (entries.size() > maxEntries && iterator.hasNext()) { + iterator.next(); + iterator.remove(); + } + } + } } diff --git a/studio-application-modules/attachment-service/src/main/java/studio/one/application/web/controller/AttachmentController.java b/studio-application-modules/attachment-service/src/main/java/studio/one/application/web/controller/AttachmentController.java index 26ff24cb..4d7dfe46 100644 --- a/studio-application-modules/attachment-service/src/main/java/studio/one/application/web/controller/AttachmentController.java +++ b/studio-application-modules/attachment-service/src/main/java/studio/one/application/web/controller/AttachmentController.java @@ -100,12 +100,23 @@ public ResponseEntity thumbnail( int requestedSize = size == null ?
0 : size; var result = thumbnailService.getOrCreate(attachment, requestedSize, format); if (result.isEmpty()) { - return ResponseEntity.noContent().build(); + HttpHeaders headers = new HttpHeaders(); + headers.add("X-Thumbnail-Status", "unavailable"); + headers.setCacheControl(CacheControl.noStore().getHeaderValue()); + return ResponseEntity.noContent() + .headers(headers) + .build(); } ThumbnailData data = result.get(); StreamingResponseBody body = out -> out.write(data.getBytes()); HttpHeaders headers = new HttpHeaders(); - headers.setCacheControl(CacheControl.maxAge(3600, java.util.concurrent.TimeUnit.SECONDS).getHeaderValue()); + headers.add("X-Thumbnail-Status", data.getStatus()); + if (data.isPending()) { + headers.setCacheControl(CacheControl.noStore().getHeaderValue()); + headers.add(HttpHeaders.RETRY_AFTER, "3"); + } else { + headers.setCacheControl(CacheControl.maxAge(3600, java.util.concurrent.TimeUnit.SECONDS).getHeaderValue()); + } headers.setContentType(AttachmentWebSupport.resolveMediaType(data.getContentType())); headers.setContentLength(data.getBytes().length); return ResponseEntity.ok() diff --git a/studio-application-modules/attachment-service/src/test/java/studio/one/application/attachment/thumbnail/ThumbnailServiceImplTest.java b/studio-application-modules/attachment-service/src/test/java/studio/one/application/attachment/thumbnail/ThumbnailServiceImplTest.java index 5c579c8d..8a94a447 100644 --- a/studio-application-modules/attachment-service/src/test/java/studio/one/application/attachment/thumbnail/ThumbnailServiceImplTest.java +++ b/studio-application-modules/attachment-service/src/test/java/studio/one/application/attachment/thumbnail/ThumbnailServiceImplTest.java @@ -3,6 +3,8 @@ import static org.assertj.core.api.Assertions.assertThat; import static org.mockito.ArgumentMatchers.any; import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.never; +import static org.mockito.Mockito.times; import static 
org.mockito.Mockito.verify; import static org.mockito.Mockito.when; @@ -13,7 +15,11 @@ import java.io.ByteArrayOutputStream; import java.io.IOException; import java.io.InputStream; +import java.util.ArrayDeque; import java.util.List; +import java.util.Queue; +import java.util.concurrent.Executor; +import java.util.concurrent.RejectedExecutionException; import javax.imageio.ImageIO; @@ -25,7 +31,11 @@ import studio.one.application.attachment.service.AttachmentService; import studio.one.platform.thumbnail.ThumbnailGenerationOptions; import studio.one.platform.thumbnail.ThumbnailGenerationService; +import studio.one.platform.thumbnail.ThumbnailOptions; +import studio.one.platform.thumbnail.ThumbnailRenderer; import studio.one.platform.thumbnail.ThumbnailRendererFactory; +import studio.one.platform.thumbnail.ThumbnailSource; +import studio.one.platform.thumbnail.ThumbnailGenerationException; import studio.one.platform.thumbnail.renderer.ImageThumbnailRenderer; import studio.one.platform.thumbnail.renderer.PdfThumbnailRenderer; @@ -44,6 +54,7 @@ void imageAttachmentUsesPlatformGenerationService() throws Exception { var result = service.getOrCreate(attachment, 64, "png"); assertThat(result).isPresent(); + assertThat(result.get().getStatus()).isEqualTo("pending"); assertThat(result.get().getContentType()).isEqualTo("image/png"); verify(storage).save(any(ThumbnailKey.class), any(InputStream.class)); } @@ -61,6 +72,7 @@ void pdfAttachmentUsesPlatformGenerationServiceWhenPdfboxIsAvailable() throws Ex var result = service.getOrCreate(attachment, 64, "png"); assertThat(result).isPresent(); + assertThat(result.get().getStatus()).isEqualTo("pending"); assertThat(result.get().getContentType()).isEqualTo("image/png"); verify(storage).save(any(ThumbnailKey.class), any(InputStream.class)); } @@ -75,7 +87,197 @@ void unsupportedAttachmentReturnsEmptyWithoutWritingStorage() throws Exception { when(storage.load(any())).thenThrow(new IllegalStateException("miss")); 
when(attachmentService.getInputStream(attachment)).thenReturn(new ByteArrayInputStream("text".getBytes())); + var result = service.getOrCreate(attachment, 64, "png"); + + assertThat(result).isPresent(); + assertThat(result.get().getStatus()).isEqualTo("pending"); + verify(storage, never()).save(any(), any()); + } + + @Test + void rendererFailureReturnsEmptyWithoutWritingStorage() throws Exception { + AttachmentService attachmentService = mock(AttachmentService.class); + ThumbnailStorage storage = mock(ThumbnailStorage.class); + ThumbnailGenerationService generationService = new ThumbnailGenerationService( + new ThumbnailRendererFactory(List.of(new FailingRenderer())), + new ThumbnailGenerationOptions(128, "png", 16, 512, 1024 * 1024, 25_000_000)); + ThumbnailServiceImpl service = new ThumbnailServiceImpl(attachmentService, storage, generationService); + Attachment attachment = attachment(14L, "sample.docx", + "application/vnd.openxmlformats-officedocument.wordprocessingml.document"); + + when(storage.load(any())).thenThrow(new IllegalStateException("miss")); + when(attachmentService.getInputStream(attachment)).thenReturn(new ByteArrayInputStream("docx".getBytes())); + + var result = service.getOrCreate(attachment, 64, "png"); + + assertThat(result).isPresent(); + assertThat(result.get().getStatus()).isEqualTo("pending"); + verify(storage, never()).save(any(), any()); + } + + @Test + void deterministicRendererFailureIsMemoizedAfterCacheMiss() throws Exception { + AttachmentService attachmentService = mock(AttachmentService.class); + ThumbnailStorage storage = mock(ThumbnailStorage.class); + ThumbnailGenerationService generationService = new ThumbnailGenerationService( + new ThumbnailRendererFactory(List.of(new FailingRenderer())), + new ThumbnailGenerationOptions(128, "png", 16, 512, 1024 * 1024, 25_000_000)); + ThumbnailServiceImpl service = new ThumbnailServiceImpl(attachmentService, storage, generationService); + Attachment attachment = attachment(15L, 
"sample.docx", + "application/vnd.openxmlformats-officedocument.wordprocessingml.document"); + + when(storage.load(any())).thenThrow(new IllegalStateException("miss")); + when(attachmentService.getInputStream(attachment)).thenReturn(new ByteArrayInputStream("docx".getBytes())); + + assertThat(service.getOrCreate(attachment, 64, "png")).isPresent(); + assertThat(service.getOrCreate(attachment, 64, "png")).isEmpty(); + assertThat(service.getOrCreate(attachment, 96, "png")).isEmpty(); + + verify(attachmentService, times(1)).getInputStream(attachment); + verify(storage, never()).save(any(), any()); + } + + @Test + void sourceLoadFailureIsMemoizedAfterBackgroundAttempt() throws Exception { + AttachmentService attachmentService = mock(AttachmentService.class); + ThumbnailStorage storage = mock(ThumbnailStorage.class); + ThumbnailServiceImpl service = newService(attachmentService, storage); + Attachment attachment = attachment(17L, "sample.png", "image/png"); + + when(storage.load(any())).thenThrow(new IllegalStateException("miss")); + when(attachmentService.getInputStream(attachment)).thenThrow(new IOException("source missing")); + + assertThat(service.getOrCreate(attachment, 64, "png")).isPresent(); + assertThat(service.getOrCreate(attachment, 64, "png")).isEmpty(); + + verify(attachmentService, times(1)).getInputStream(attachment); + verify(storage, never()).save(any(), any()); + } + + @Test + void asyncExecutorReturnsPendingBeforeGenerationRuns() throws Exception { + AttachmentService attachmentService = mock(AttachmentService.class); + ThumbnailStorage storage = mock(ThumbnailStorage.class); + CapturingExecutor executor = new CapturingExecutor(); + ThumbnailGenerationService generationService = new ThumbnailGenerationService( + new ThumbnailRendererFactory(List.of(new ImageThumbnailRenderer())), + new ThumbnailGenerationOptions(128, "png", 16, 512, 1024 * 1024, 25_000_000)); + ThumbnailServiceImpl service = new ThumbnailServiceImpl( + attachmentService, + storage, + 
generationService, + executor); + Attachment attachment = attachment(18L, "sample.png", "image/png"); + + when(storage.load(any())).thenThrow(new IllegalStateException("miss")); + when(attachmentService.getInputStream(attachment)).thenReturn(new ByteArrayInputStream(imageBytes())); + + var result = service.getOrCreate(attachment, 64, "png"); + + assertThat(result).isPresent(); + assertThat(result.get().getStatus()).isEqualTo("pending"); + verify(attachmentService, never()).getInputStream(attachment); + + executor.runNext(); + + verify(attachmentService, times(1)).getInputStream(attachment); + verify(storage).save(any(ThumbnailKey.class), any(InputStream.class)); + } + + @Test + void concurrentRequestsForDifferentSizesShareOneBackgroundJob() throws Exception { + AttachmentService attachmentService = mock(AttachmentService.class); + ThumbnailStorage storage = mock(ThumbnailStorage.class); + CapturingExecutor executor = new CapturingExecutor(); + ThumbnailGenerationService generationService = new ThumbnailGenerationService( + new ThumbnailRendererFactory(List.of(new ImageThumbnailRenderer())), + new ThumbnailGenerationOptions(128, "png", 16, 512, 1024 * 1024, 25_000_000)); + ThumbnailServiceImpl service = new ThumbnailServiceImpl( + attachmentService, + storage, + generationService, + executor); + Attachment attachment = attachment(20L, "sample.png", "image/png"); + + when(storage.load(any())).thenThrow(new IllegalStateException("miss")); + when(attachmentService.getInputStream(attachment)).thenReturn(new ByteArrayInputStream(imageBytes())); + + assertThat(service.getOrCreate(attachment, 64, "png")).isPresent(); + assertThat(service.getOrCreate(attachment, 96, "png")).isPresent(); + + assertThat(executor.taskCount()).isEqualTo(1); + verify(attachmentService, never()).getInputStream(attachment); + + executor.runNext(); + + verify(attachmentService, times(1)).getInputStream(attachment); + verify(storage).save(any(ThumbnailKey.class), any(InputStream.class)); + } + + 
@Test + void rejectedAsyncGenerationDoesNotRunOnRequestThread() throws Exception { + AttachmentService attachmentService = mock(AttachmentService.class); + ThumbnailStorage storage = mock(ThumbnailStorage.class); + ThumbnailServiceImpl service = new ThumbnailServiceImpl( + attachmentService, + storage, + new ThumbnailGenerationService( + new ThumbnailRendererFactory(List.of(new ImageThumbnailRenderer())), + new ThumbnailGenerationOptions(128, "png", 16, 512, 1024 * 1024, 25_000_000)), + runnable -> { + throw new RejectedExecutionException("queue full"); + }); + Attachment attachment = attachment(19L, "sample.png", "image/png"); + + when(storage.load(any())).thenThrow(new IllegalStateException("miss")); + + assertThat(service.getOrCreate(attachment, 64, "png")).isEmpty(); + + verify(attachmentService, never()).getInputStream(attachment); + verify(storage, never()).save(any(), any()); + } + + @Test + void deleteAllPreventsQueuedBackgroundTaskFromSavingStaleThumbnail() throws Exception { + AttachmentService attachmentService = mock(AttachmentService.class); + ThumbnailStorage storage = mock(ThumbnailStorage.class); + CapturingExecutor executor = new CapturingExecutor(); + ThumbnailServiceImpl service = new ThumbnailServiceImpl( + attachmentService, + storage, + new ThumbnailGenerationService( + new ThumbnailRendererFactory(List.of(new ImageThumbnailRenderer())), + new ThumbnailGenerationOptions(128, "png", 16, 512, 1024 * 1024, 25_000_000)), + executor); + Attachment attachment = attachment(21L, "sample.png", "image/png"); + + when(storage.load(any())).thenThrow(new IllegalStateException("miss")); + when(attachmentService.getInputStream(attachment)).thenReturn(new ByteArrayInputStream(imageBytes())); + + assertThat(service.getOrCreate(attachment, 64, "png")).isPresent(); + + service.deleteAll(attachment); + executor.runNext(); + + verify(storage).deleteAll(2001, 21L); + verify(storage, never()).save(any(), any()); + } + + @Test + void 
unsupportedAttachmentIsMemoizedAfterCacheMiss() throws Exception { + AttachmentService attachmentService = mock(AttachmentService.class); + ThumbnailStorage storage = mock(ThumbnailStorage.class); + ThumbnailServiceImpl service = newService(attachmentService, storage); + Attachment attachment = attachment(16L, "sample.txt", "text/plain"); + + when(storage.load(any())).thenThrow(new IllegalStateException("miss")); + when(attachmentService.getInputStream(attachment)).thenReturn(new ByteArrayInputStream("text".getBytes())); + + assertThat(service.getOrCreate(attachment, 64, "png")).isPresent(); assertThat(service.getOrCreate(attachment, 64, "png")).isEmpty(); + + verify(attachmentService, times(1)).getInputStream(attachment); + verify(storage, never()).save(any(), any()); } @Test @@ -92,6 +294,7 @@ void legacyConstructorStillSupportsImageThumbnails() throws Exception { var result = service.getOrCreate(attachment, 64, "png"); assertThat(result).isPresent(); + assertThat(result.get().getStatus()).isEqualTo("pending"); verify(storage).save(any(ThumbnailKey.class), any(InputStream.class)); } @@ -132,4 +335,35 @@ private static byte[] pdfBytes() throws IOException { return out.toByteArray(); } } + + private static final class FailingRenderer implements ThumbnailRenderer { + @Override + public boolean supports(ThumbnailSource source) { + return true; + } + + @Override + public studio.one.platform.thumbnail.ThumbnailResult render( + ThumbnailSource source, + ThumbnailOptions options) { + throw new ThumbnailGenerationException("conversion failed"); + } + } + + private static final class CapturingExecutor implements Executor { + private final Queue<Runnable> tasks = new ArrayDeque<>(); + + @Override + public void execute(Runnable command) { + tasks.add(command); + } + + private void runNext() { + tasks.remove().run(); + } + + private int taskCount() { + return tasks.size(); + } + } } diff --git 
a/studio-application-modules/attachment-service/src/test/java/studio/one/application/web/controller/AttachmentControllerTest.java b/studio-application-modules/attachment-service/src/test/java/studio/one/application/web/controller/AttachmentControllerTest.java index 1e7095e9..93e74b8e 100644 --- a/studio-application-modules/attachment-service/src/test/java/studio/one/application/web/controller/AttachmentControllerTest.java +++ b/studio-application-modules/attachment-service/src/test/java/studio/one/application/web/controller/AttachmentControllerTest.java @@ -100,6 +100,7 @@ void thumbnailReturnsSameResponseShape() throws Exception { assertEquals(200, response.getStatusCode().value()); assertEquals("image/png", response.getHeaders().getContentType().toString()); + assertEquals("ready", response.getHeaders().getFirst("X-Thumbnail-Status")); assertEquals(2L, response.getHeaders().getContentLength()); ByteArrayOutputStream out = new ByteArrayOutputStream(); ((org.springframework.web.servlet.mvc.method.annotation.StreamingResponseBody) response.getBody()) @@ -107,6 +108,25 @@ void thumbnailReturnsSameResponseShape() throws Exception { assertEquals(2, out.toByteArray().length); } + @Test + void thumbnailPendingReturnsImageWithNoStoreAndRetryAfter() throws Exception { + AttachmentController controller = new AttachmentController(attachmentService, thumbnailServiceProvider); + Attachment attachment = mock(Attachment.class); + ThumbnailService thumbnailService = mock(ThumbnailService.class); + + when(thumbnailServiceProvider.getIfAvailable()).thenReturn(thumbnailService); + when(attachmentService.getAttachmentById(88L)).thenReturn(attachment); + when(thumbnailService.getOrCreate(attachment, 128, "png")) + .thenReturn(Optional.of(new ThumbnailData(new byte[] { 1, 2 }, "image/png", "pending"))); + + ResponseEntity response = controller.thumbnail(88L, 128, "png"); + + assertEquals(200, response.getStatusCode().value()); + assertEquals("pending", 
response.getHeaders().getFirst("X-Thumbnail-Status")); + assertEquals("3", response.getHeaders().getFirst("Retry-After")); + assertEquals("no-store", response.getHeaders().getCacheControl()); + } + @Test void thumbnailOmittedSizeAndFormatUseServiceDefaults() throws Exception { AttachmentController controller = new AttachmentController(attachmentService, thumbnailServiceProvider); @@ -121,6 +141,8 @@ void thumbnailOmittedSizeAndFormatUseServiceDefaults() throws Exception { ResponseEntity response = controller.thumbnail(88L, null, null); assertEquals(204, response.getStatusCode().value()); + assertEquals("unavailable", response.getHeaders().getFirst("X-Thumbnail-Status")); + assertEquals("no-store", response.getHeaders().getCacheControl()); verify(thumbnailService).getOrCreate(attachment, 0, null); } } diff --git a/studio-platform-textract/src/main/java/studio/one/platform/textract/extractor/impl/DocxFileParser.java b/studio-platform-textract/src/main/java/studio/one/platform/textract/extractor/impl/DocxFileParser.java index fbc1fb06..8906d260 100644 --- a/studio-platform-textract/src/main/java/studio/one/platform/textract/extractor/impl/DocxFileParser.java +++ b/studio-platform-textract/src/main/java/studio/one/platform/textract/extractor/impl/DocxFileParser.java @@ -6,6 +6,7 @@ import java.util.LinkedHashMap; import java.util.List; import java.util.Map; +import java.util.zip.ZipInputStream; import java.util.stream.Collectors; import org.apache.poi.UnsupportedFileFormatException; @@ -35,6 +36,10 @@ public class DocxFileParser extends AbstractFileParser implements StructuredFileParser { + private static final int ZIP_BUFFER_SIZE = 8192; + private static final long MAX_ZIP_ENTRY_BYTES = 16L * 1024L * 1024L; + private static final long MAX_ZIP_TOTAL_BYTES = 64L * 1024L * 1024L; + @Override public boolean supports(String contentType, String filename) { @@ -46,6 +51,12 @@ public boolean supports(String contentType, String filename) { @Override public ParsedFile 
parseStructured(byte[] bytes, String contentType, String filename) throws FileParseException { + try { + validateZipBounds(bytes, filename); + } catch (IOException e) { + throw new FileParseException("Failed to parse DOCX file: " + safeFilename(filename), e); + } + try (ByteArrayInputStream in = new ByteArrayInputStream(bytes); XWPFDocument doc = new XWPFDocument(in)) { @@ -260,6 +271,28 @@ private Integer toPixels(double emu) { return Math.max(1, (int) Math.round(emu / Units.EMU_PER_PIXEL)); } + private void validateZipBounds(byte[] bytes, String filename) throws IOException { + byte[] buffer = new byte[ZIP_BUFFER_SIZE]; + long totalBytes = 0L; + try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(bytes))) { + while (zip.getNextEntry() != null) { + long entryBytes = 0L; + int read; + while ((read = zip.read(buffer)) != -1) { + entryBytes += read; + totalBytes += read; + if (entryBytes > MAX_ZIP_ENTRY_BYTES) { + throw new IOException("DOCX zip entry exceeds limit: " + safeFilename(filename)); + } + if (totalBytes > MAX_ZIP_TOTAL_BYTES) { + throw new IOException("DOCX zip content exceeds limit: " + safeFilename(filename)); + } + } + zip.closeEntry(); + } + } + } + private BlockType resolveParagraphType(XWPFParagraph paragraph, BlockType containerType) { if (containerType == BlockType.HEADER || containerType == BlockType.FOOTER || containerType == BlockType.FOOTNOTE) { return containerType; diff --git a/studio-platform-textract/src/main/java/studio/one/platform/textract/extractor/impl/HwpHwpxFileParser.java b/studio-platform-textract/src/main/java/studio/one/platform/textract/extractor/impl/HwpHwpxFileParser.java index 1f9b1888..32261e8b 100644 --- a/studio-platform-textract/src/main/java/studio/one/platform/textract/extractor/impl/HwpHwpxFileParser.java +++ b/studio-platform-textract/src/main/java/studio/one/platform/textract/extractor/impl/HwpHwpxFileParser.java @@ -51,6 +51,9 @@ */ public class HwpHwpxFileParser extends AbstractFileParser 
implements StructuredFileParser { + private static final int BUFFER_SIZE = 8192; + private static final int DEFAULT_MAX_CONTAINER_ENTRY_BYTES = 16 * 1024 * 1024; + private static final int DEFAULT_MAX_CONTAINER_TOTAL_BYTES = 64 * 1024 * 1024; private static final int HWPTAG_BEGIN = 0x010; private static final int HWPTAG_BIN_DATA = HWPTAG_BEGIN + 2; private static final int HWPTAG_PARA_HEADER = HWPTAG_BEGIN + 50; @@ -59,6 +62,18 @@ public class HwpHwpxFileParser extends AbstractFileParser implements StructuredF (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0, (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1 }; + private final int maxContainerEntryBytes; + private final int maxContainerTotalBytes; + + public HwpHwpxFileParser() { + this(DEFAULT_MAX_CONTAINER_ENTRY_BYTES, DEFAULT_MAX_CONTAINER_TOTAL_BYTES); + } + + HwpHwpxFileParser(int maxContainerEntryBytes, int maxContainerTotalBytes) { + this.maxContainerEntryBytes = maxContainerEntryBytes; + this.maxContainerTotalBytes = maxContainerTotalBytes; + } + @Override public boolean supports(String contentType, String filename) { String name = lower(filename); @@ -308,8 +323,9 @@ private Optional findHwpxImageRef(Element picture) { private ParsedFile parseHwp(byte[] bytes, String contentType, String filename) throws FileParseException { try (POIFSFileSystem fs = new POIFSFileSystem(new ByteArrayInputStream(bytes))) { + ExtractionBudget budget = new ExtractionBudget(maxContainerTotalBytes); DirectoryEntry root = fs.getRoot(); - byte[] header = readDocument(root, "FileHeader"); + byte[] header = readDocument(root, "FileHeader", budget); HwpFlags flags = parseHwpFlags(header); List warnings = new ArrayList<>(); if (flags.encrypted()) { @@ -327,18 +343,23 @@ private ParsedFile parseHwp(byte[] bytes, String contentType, String filename) t Map.of())); } - List binDataRefs = readHwpBinDataRefs(root, flags.compressed(), warnings); + List binDataRefs = readHwpBinDataRefs(root, flags.compressed(), warnings, budget); List blocks = new 
ArrayList<>(); StringBuilder plain = new StringBuilder(); for (int sectionIndex = 0; ; sectionIndex++) { - Optional section = readHwpSection(root, sectionIndex, flags.compressed(), flags.distribution()); + Optional section = readHwpSection( + root, + sectionIndex, + flags.compressed(), + flags.distribution(), + budget); if (section.isEmpty()) { break; } parseHwpSection(section.get(), sectionIndex, plain, blocks, warnings); } - List images = readHwpImages(root, binDataRefs); + List images = readHwpImages(root, binDataRefs, budget); return new ParsedFile( DocumentFormat.HWP, cleanText(plain.toString()), @@ -439,14 +460,18 @@ private boolean isHwpExtendedControl(int ch) { return (ch >= 1 && ch <= 8) || (ch >= 11 && ch <= 12) || (ch >= 14 && ch <= 23); } - private List readHwpBinDataRefs(DirectoryEntry root, boolean compressed, List warnings) { + private List readHwpBinDataRefs( + DirectoryEntry root, + boolean compressed, + List warnings, + ExtractionBudget budget) { if (!hasEntry(root, "DocInfo")) { return List.of(); } try { - byte[] docInfo = readDocument(root, "DocInfo"); + byte[] docInfo = readDocument(root, "DocInfo", budget); if (compressed) { - docInfo = inflate(docInfo); + docInfo = inflate(docInfo, budget); } List records = readHwpRecords(docInfo, warnings); List refs = new ArrayList<>(); @@ -493,7 +518,10 @@ private Optional readHwpString(byte[] data, int offset) { return Optional.of(new String(data, start, byteLength, StandardCharsets.UTF_16LE)); } - private List readHwpImages(DirectoryEntry root, List binDataRefs) throws IOException { + private List readHwpImages( + DirectoryEntry root, + List binDataRefs, + ExtractionBudget budget) throws IOException { if (!hasEntry(root, "BinData") || !(root.getEntry("BinData") instanceof DirectoryEntry binDataDir)) { return List.of(); } @@ -506,7 +534,7 @@ private List readHwpImages(DirectoryEntry root, List if (entry.isDirectoryEntry()) { continue; } - byte[] data = readDocument(binDataDir, entry.getName()); + byte[] 
data = readDocument(binDataDir, entry.getName(), budget); String ext = extensionOf(entry.getName()); int id = storageIdFromBinName(entry.getName()).orElse(images.size() + 1); BinDataRef ref = refsByStorageId.get(id); @@ -525,16 +553,21 @@ private List readHwpImages(DirectoryEntry root, List return images; } - private Optional readHwpSection(DirectoryEntry root, int sectionIndex, boolean compressed, boolean distribution) + private Optional readHwpSection( + DirectoryEntry root, + int sectionIndex, + boolean compressed, + boolean distribution, + ExtractionBudget budget) throws IOException { String[] candidates = distribution ? new String[] { "ViewText/Section" + sectionIndex, "BodyText/Section" + sectionIndex, "Section" + sectionIndex } : new String[] { "BodyText/Section" + sectionIndex, "Section" + sectionIndex }; for (String candidate : candidates) { - Optional raw = readDocumentPath(root, candidate); + Optional raw = readDocumentPath(root, candidate, budget); if (raw.isPresent()) { return Optional.of(compressed && !candidate.startsWith("ViewText/") - ? inflate(raw.get()) + ? 
inflate(raw.get(), budget) : raw.get()); } } @@ -588,11 +621,17 @@ private List readHwpRecords(byte[] data, List warnings) private Map readZipEntries(byte[] bytes) throws IOException { Map entries = new LinkedHashMap<>(); + ExtractionBudget budget = new ExtractionBudget(maxContainerTotalBytes); try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(bytes))) { ZipEntry entry; while ((entry = zip.getNextEntry()) != null) { if (!entry.isDirectory()) { - entries.put(entry.getName(), zip.readAllBytes()); + byte[] entryBytes = readBounded( + zip, + maxContainerEntryBytes, + entry.getName(), + budget); + entries.put(entry.getName(), entryBytes); } } } @@ -757,7 +796,8 @@ private boolean hasEntry(DirectoryEntry dir, String name) { } } - private Optional readDocumentPath(DirectoryEntry root, String path) throws IOException { + private Optional readDocumentPath(DirectoryEntry root, String path, ExtractionBudget budget) + throws IOException { String[] parts = path.split("/"); DirectoryEntry dir = root; for (int i = 0; i < parts.length - 1; i++) { @@ -769,28 +809,74 @@ private Optional readDocumentPath(DirectoryEntry root, String path) thro if (!hasEntry(dir, parts[parts.length - 1])) { return Optional.empty(); } - return Optional.of(readDocument(dir, parts[parts.length - 1])); + return Optional.of(readDocument(dir, parts[parts.length - 1], budget)); } - private byte[] readDocument(DirectoryEntry dir, String name) throws IOException { + private byte[] readDocument(DirectoryEntry dir, String name, ExtractionBudget budget) throws IOException { try (DocumentInputStream in = new DocumentInputStream((DocumentEntry) dir.getEntry(name))) { - return in.readAllBytes(); + return readBounded(in, maxContainerEntryBytes, name, budget); } } - private byte[] inflate(byte[] raw) throws IOException { + private byte[] inflate(byte[] raw, ExtractionBudget budget) throws IOException { try { - return inflate(raw, true); + return inflate(raw, true, budget); } catch (IOException e) { 
- return inflate(raw, false); + return inflate(raw, false, budget); } } - private byte[] inflate(byte[] raw, boolean nowrap) throws IOException { + private byte[] inflate(byte[] raw, boolean nowrap, ExtractionBudget budget) throws IOException { try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(raw), new Inflater(nowrap)); ByteArrayOutputStream out = new ByteArrayOutputStream()) { - in.transferTo(out); - return out.toByteArray(); + byte[] buffer = new byte[BUFFER_SIZE]; + int total = 0; + int read; + while ((read = in.read(buffer)) != -1) { + total += read; + if (total > maxContainerEntryBytes) { + throw new IOException("HWP compressed stream exceeds max extracted bytes: " + + maxContainerEntryBytes); + } + out.write(buffer, 0, read); + } + byte[] inflated = out.toByteArray(); + budget.add(inflated.length, "HWP compressed stream"); + return inflated; + } + } + + private byte[] readBounded(InputStream in, int maxBytes, String name, ExtractionBudget budget) + throws IOException { + ByteArrayOutputStream out = new ByteArrayOutputStream(); + byte[] buffer = new byte[BUFFER_SIZE]; + int total = 0; + int read; + while ((read = in.read(buffer)) != -1) { + total += read; + if (total > maxBytes) { + throw new IOException("HWP/HWPX entry exceeds max extracted bytes: " + name + " " + maxBytes); + } + budget.add(read, name); + out.write(buffer, 0, read); + } + return out.toByteArray(); + } + + private static final class ExtractionBudget { + private final int maxBytes; + private long usedBytes; + + private ExtractionBudget(int maxBytes) { + this.maxBytes = maxBytes; + } + + private void add(long bytes, String source) throws IOException { + usedBytes += bytes; + if (usedBytes > maxBytes) { + throw new IOException("HWP/HWPX container exceeds max extracted bytes: " + + source + " " + maxBytes); + } } } diff --git a/studio-platform-textract/src/test/java/studio/one/platform/textract/extractor/impl/DocxFileParserTest.java 
b/studio-platform-textract/src/test/java/studio/one/platform/textract/extractor/impl/DocxFileParserTest.java index 4ee10898..d2279b13 100644 --- a/studio-platform-textract/src/test/java/studio/one/platform/textract/extractor/impl/DocxFileParserTest.java +++ b/studio-platform-textract/src/test/java/studio/one/platform/textract/extractor/impl/DocxFileParserTest.java @@ -1,12 +1,15 @@ package studio.one.platform.textract.extractor.impl; import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertThrows; import static org.junit.jupiter.api.Assertions.assertTrue; import java.io.ByteArrayInputStream; import java.io.ByteArrayOutputStream; import java.math.BigInteger; import java.util.Base64; +import java.util.zip.ZipEntry; +import java.util.zip.ZipOutputStream; import org.apache.poi.util.Units; import org.apache.poi.xwpf.usermodel.Document; @@ -19,6 +22,7 @@ import org.junit.jupiter.api.Test; import studio.one.platform.textract.extractor.DocumentFormat; +import studio.one.platform.textract.extractor.FileParseException; import studio.one.platform.textract.model.BlockType; import studio.one.platform.textract.model.ExtractedImage; import studio.one.platform.textract.model.ExtractedTable; @@ -150,6 +154,26 @@ void parseStructuredExtractsTableCellEmbeddedImageWithCellSourceRef() throws Exc assertEquals("image1.png", image.binDataRef()); } + @Test + void parseStructuredRejectsOversizedZipEntryBeforePoiParsing() throws Exception { + byte[] bytes = oversizedZipEntry(); + + FileParseException exception = assertThrows(FileParseException.class, () -> new DocxFileParser() + .parseStructured(bytes, "application/vnd.openxmlformats-officedocument.wordprocessingml.document", "large.docx")); + + assertTrue(exception.getCause().getMessage().contains("DOCX zip entry exceeds limit")); + } + + @Test + void parseStructuredRejectsOversizedZipTotalBeforePoiParsing() throws Exception { + byte[] bytes = oversizedZipTotal(); + + 
FileParseException exception = assertThrows(FileParseException.class, () -> new DocxFileParser() + .parseStructured(bytes, "application/vnd.openxmlformats-officedocument.wordprocessingml.document", "large.docx")); + + assertTrue(exception.getCause().getMessage().contains("DOCX zip content exceeds limit")); + } + private byte[] docxWithParagraphAndTable() throws Exception { try (XWPFDocument document = new XWPFDocument(); ByteArrayOutputStream out = new ByteArrayOutputStream()) { @@ -242,4 +266,34 @@ private byte[] docxWithTableCellImage() throws Exception { return out.toByteArray(); } } + + private byte[] oversizedZipEntry() throws Exception { + try (ByteArrayOutputStream out = new ByteArrayOutputStream(); + ZipOutputStream zip = new ZipOutputStream(out)) { + zip.putNextEntry(new ZipEntry("word/document.xml")); + byte[] chunk = new byte[1024]; + for (int i = 0; i < 17 * 1024; i++) { + zip.write(chunk); + } + zip.closeEntry(); + zip.finish(); + return out.toByteArray(); + } + } + + private byte[] oversizedZipTotal() throws Exception { + try (ByteArrayOutputStream out = new ByteArrayOutputStream(); + ZipOutputStream zip = new ZipOutputStream(out)) { + byte[] chunk = new byte[1024]; + for (int entry = 0; entry < 65; entry++) { + zip.putNextEntry(new ZipEntry("word/part-" + entry + ".xml")); + for (int i = 0; i < 1024; i++) { + zip.write(chunk); + } + zip.closeEntry(); + } + zip.finish(); + return out.toByteArray(); + } + } } diff --git a/studio-platform-textract/src/test/java/studio/one/platform/textract/extractor/impl/HwpHwpxFileParserTest.java b/studio-platform-textract/src/test/java/studio/one/platform/textract/extractor/impl/HwpHwpxFileParserTest.java index 465a5088..b0fcaef2 100644 --- a/studio-platform-textract/src/test/java/studio/one/platform/textract/extractor/impl/HwpHwpxFileParserTest.java +++ b/studio-platform-textract/src/test/java/studio/one/platform/textract/extractor/impl/HwpHwpxFileParserTest.java @@ -4,6 +4,7 @@ import static 
java.nio.charset.StandardCharsets.UTF_8; import static org.junit.jupiter.api.Assertions.assertEquals; import static org.junit.jupiter.api.Assertions.assertFalse; +import static org.junit.jupiter.api.Assertions.assertThrows; import static org.junit.jupiter.api.Assertions.assertTrue; import java.io.ByteArrayInputStream; @@ -17,6 +18,7 @@ import org.junit.jupiter.api.Test; import studio.one.platform.textract.extractor.DocumentFormat; +import studio.one.platform.textract.extractor.FileParseException; import studio.one.platform.textract.model.BlockType; import studio.one.platform.textract.model.ParseWarningSeverity; import studio.one.platform.textract.model.ParsedFile; @@ -82,6 +84,26 @@ void parseStructuredMarksEncryptedHwpAsErrorWarning() throws Exception { assertFalse(result.warnings().get(0).partialParse()); } + @Test + void parseHwpxRejectsOversizedExtractedZipEntry() throws Exception { + FileParseException ex = assertThrows( + FileParseException.class, + () -> parser.parseStructured(hwpxBytesWithLargeEntry(), "application/hwpx", "large.hwpx")); + + assertTrue(ex.getCause().getMessage().contains("exceeds max extracted bytes")); + } + + @Test + void parseHwpRejectsAggregateExtractedBytesOverLimit() throws Exception { + HwpHwpxFileParser boundedParser = new HwpHwpxFileParser(1024, 300); + + FileParseException ex = assertThrows( + FileParseException.class, + () -> boundedParser.parseStructured(hwpBytesWithSections(3), "application/x-hwp", "large.hwp")); + + assertTrue(ex.getCause().getMessage().contains("exceeds max extracted bytes")); + } + private byte[] hwpxBytes() throws Exception { Map entries = Map.of( "Contents/content.hpf", """ @@ -145,16 +167,39 @@ private byte[] hwpxBytesWithMissingSection() throws Exception { return out.toByteArray(); } + private byte[] hwpxBytesWithLargeEntry() throws Exception { + ByteArrayOutputStream out = new ByteArrayOutputStream(); + try (ZipOutputStream zip = new ZipOutputStream(out)) { + zip.putNextEntry(new 
ZipEntry("Contents/content.hpf")); + zip.write("".getBytes(UTF_8)); + zip.closeEntry(); + zip.putNextEntry(new ZipEntry("Contents/section0.xml")); + zip.write(new byte[17 * 1024 * 1024]); + zip.closeEntry(); + } + return out.toByteArray(); + } + private byte[] hwpBytes() throws Exception { return hwpBytesWithFlags(0); } private byte[] hwpBytesWithFlags(int flags) throws Exception { + return hwpBytesWithSectionsAndFlags(1, flags); + } + + private byte[] hwpBytesWithSections(int sectionCount) throws Exception { + return hwpBytesWithSectionsAndFlags(sectionCount, 0); + } + + private byte[] hwpBytesWithSectionsAndFlags(int sectionCount, int flags) throws Exception { try (POIFSFileSystem fs = new POIFSFileSystem(); ByteArrayOutputStream out = new ByteArrayOutputStream()) { fs.getRoot().createDocument("FileHeader", new ByteArrayInputStream(fileHeader(flags))); DirectoryEntry bodyText = fs.getRoot().createDirectory("BodyText"); - bodyText.createDocument("Section0", new ByteArrayInputStream(section("한글 본문"))); + for (int i = 0; i < sectionCount; i++) { + bodyText.createDocument("Section" + i, new ByteArrayInputStream(section("한글 본문 " + i))); + } DirectoryEntry binData = fs.getRoot().createDirectory("BinData"); binData.createDocument("BIN0001.png", new ByteArrayInputStream(new byte[] { (byte) 0x89, 'P', 'N', 'G' })); fs.writeFilesystem(out); diff --git a/studio-platform-thumbnail/README.md b/studio-platform-thumbnail/README.md index 47796278..269f0817 100644 --- a/studio-platform-thumbnail/README.md +++ b/studio-platform-thumbnail/README.md @@ -15,5 +15,6 @@ A thumbnail generation SPI module that works independently of Attachment. - `ImageThumbnailRenderer`: ImageIO-based image resize - `PdfThumbnailRenderer`: when PDFBox is on the classpath, renders the first page to an image and then resizes it +- `PptxThumbnailRenderer`: renders the representative slide with Apache POI and then resizes it -Documents such as DOCX/PPTX/HWP/HWPX are to be supported by follow-up renderer implementations. +All document renderers are opt-in. PDF requires `pdfbox` and PPTX requires `poi-ooxml` on the classpath. 
The DOCX/HWP/HWPX preview renderer is registered by the starter when a `FileContentExtractionService` bean is present. diff --git a/studio-platform-thumbnail/build.gradle.kts b/studio-platform-thumbnail/build.gradle.kts index 2460f5a4..05b4779c 100644 --- a/studio-platform-thumbnail/build.gradle.kts +++ b/studio-platform-thumbnail/build.gradle.kts @@ -19,6 +19,8 @@ tasks.named("bootJar") { dependencies { api(project(":studio-platform")) compileOnly("org.apache.pdfbox:pdfbox:${property("apachePdfBoxVersion")}") + compileOnly("org.apache.poi:poi-ooxml:${property("apachePoiVersion")}") testImplementation("org.apache.pdfbox:pdfbox:${property("apachePdfBoxVersion")}") + testImplementation("org.apache.poi:poi-ooxml:${property("apachePoiVersion")}") } diff --git a/studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/ThumbnailGenerationService.java b/studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/ThumbnailGenerationService.java index 5c43895e..427df223 100644 --- a/studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/ThumbnailGenerationService.java +++ b/studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/ThumbnailGenerationService.java @@ -27,7 +27,11 @@ public ThumbnailOptions resolveOptions(int size, String format) { int resolvedSize = size > 0 ? 
size : generationOptions.defaultSize(); resolvedSize = Math.max(generationOptions.minSize(), Math.min(generationOptions.maxSize(), resolvedSize)); String resolvedFormat = ThumbnailFormats.normalizeOrDefault(format, generationOptions.defaultFormat()); - return new ThumbnailOptions(resolvedSize, resolvedFormat, generationOptions.maxSourcePixels()); + return new ThumbnailOptions( + resolvedSize, + resolvedFormat, + generationOptions.maxSourcePixels(), + generationOptions.maxSourceBytes()); } public Optional generate( diff --git a/studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/ThumbnailOptions.java b/studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/ThumbnailOptions.java index 60f03edb..2e440753 100644 --- a/studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/ThumbnailOptions.java +++ b/studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/ThumbnailOptions.java @@ -1,6 +1,8 @@ package studio.one.platform.thumbnail; -public record ThumbnailOptions(int size, String format, long maxSourcePixels) { +public record ThumbnailOptions(int size, String format, long maxSourcePixels, long maxSourceBytes) { + + private static final long DEFAULT_MAX_SOURCE_BYTES = 50L * 1024L * 1024L; public ThumbnailOptions { if (size <= 0) { @@ -9,6 +11,13 @@ public record ThumbnailOptions(int size, String format, long maxSourcePixels) { if (maxSourcePixels <= 0) { throw new IllegalArgumentException("Thumbnail max source pixels must be positive"); } + if (maxSourceBytes <= 0) { + throw new IllegalArgumentException("Thumbnail max source bytes must be positive"); + } format = ThumbnailFormats.normalize(format); } + + public ThumbnailOptions(int size, String format, long maxSourcePixels) { + this(size, format, maxSourcePixels, DEFAULT_MAX_SOURCE_BYTES); + } } diff --git a/studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/renderer/DocxThumbnailRenderer.java 
b/studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/renderer/DocxThumbnailRenderer.java new file mode 100644 index 00000000..902e409b --- /dev/null +++ b/studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/renderer/DocxThumbnailRenderer.java @@ -0,0 +1,6 @@ +package studio.one.platform.thumbnail.renderer; + +import studio.one.platform.thumbnail.ThumbnailRenderer; + +public interface DocxThumbnailRenderer extends ThumbnailRenderer { +} diff --git a/studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/renderer/HwpThumbnailRenderer.java b/studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/renderer/HwpThumbnailRenderer.java new file mode 100644 index 00000000..167b1685 --- /dev/null +++ b/studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/renderer/HwpThumbnailRenderer.java @@ -0,0 +1,6 @@ +package studio.one.platform.thumbnail.renderer; + +import studio.one.platform.thumbnail.ThumbnailRenderer; + +public interface HwpThumbnailRenderer extends ThumbnailRenderer { +} diff --git a/studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/renderer/HwpxThumbnailRenderer.java b/studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/renderer/HwpxThumbnailRenderer.java new file mode 100644 index 00000000..a9917f09 --- /dev/null +++ b/studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/renderer/HwpxThumbnailRenderer.java @@ -0,0 +1,6 @@ +package studio.one.platform.thumbnail.renderer; + +import studio.one.platform.thumbnail.ThumbnailRenderer; + +public interface HwpxThumbnailRenderer extends ThumbnailRenderer { +} diff --git a/studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/renderer/PptxThumbnailRenderer.java b/studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/renderer/PptxThumbnailRenderer.java new file mode 100644 index 00000000..d79a16a3 --- /dev/null +++ 
b/studio-platform-thumbnail/src/main/java/studio/one/platform/thumbnail/renderer/PptxThumbnailRenderer.java @@ -0,0 +1,119 @@ +package studio.one.platform.thumbnail.renderer; + +import java.awt.Color; +import java.awt.Dimension; +import java.awt.Graphics2D; +import java.awt.RenderingHints; +import java.awt.image.BufferedImage; +import java.io.ByteArrayInputStream; +import java.io.IOException; +import java.io.InputStream; +import java.util.Locale; +import java.util.zip.ZipEntry; +import java.util.zip.ZipInputStream; + +import org.apache.poi.xslf.usermodel.XMLSlideShow; +import org.apache.poi.xslf.usermodel.XSLFSlide; + +import studio.one.platform.thumbnail.ThumbnailFormats; +import studio.one.platform.thumbnail.ThumbnailGenerationException; +import studio.one.platform.thumbnail.ThumbnailImages; +import studio.one.platform.thumbnail.ThumbnailOptions; +import studio.one.platform.thumbnail.ThumbnailRenderLimits; +import studio.one.platform.thumbnail.ThumbnailRenderer; +import studio.one.platform.thumbnail.ThumbnailResult; +import studio.one.platform.thumbnail.ThumbnailSource; + +public class PptxThumbnailRenderer implements ThumbnailRenderer { + + private static final int BUFFER_SIZE = 8192; + private static final long MAX_ENTRY_BYTES = 16L * 1024L * 1024L; + + private final int slide; + + public PptxThumbnailRenderer(int slide) { + this.slide = Math.max(0, slide); + } + + @Override + public boolean supports(ThumbnailSource source) { + String contentType = source.contentType().toLowerCase(Locale.ROOT); + if (contentType.equals("application/vnd.openxmlformats-officedocument.presentationml.presentation")) { + return true; + } + return source.filename().toLowerCase(Locale.ROOT).endsWith(".pptx"); + } + + @Override + public ThumbnailResult render(ThumbnailSource source, ThumbnailOptions options) { + validatePackageBounds(source.bytes(), options.maxSourceBytes(), source.filename()); + try (XMLSlideShow presentation = new XMLSlideShow(new 
ByteArrayInputStream(source.bytes()))) { + if (presentation.getSlides().isEmpty()) { + throw new ThumbnailGenerationException("PPTX thumbnail source has no slides"); + } + int slideIndex = Math.min(slide, presentation.getSlides().size() - 1); + Dimension pageSize = presentation.getPageSize(); + int width = Math.max(1, pageSize.width); + int height = Math.max(1, pageSize.height); + ThumbnailRenderLimits.requirePixelsWithinLimit(width, height, options.maxSourcePixels(), source.filename()); + + BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB); + Graphics2D graphics = image.createGraphics(); + try { + graphics.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON); + graphics.setRenderingHint(RenderingHints.KEY_RENDERING, RenderingHints.VALUE_RENDER_QUALITY); + graphics.setPaint(Color.WHITE); + graphics.fillRect(0, 0, width, height); + XSLFSlide selected = presentation.getSlides().get(slideIndex); + selected.draw(graphics); + } finally { + graphics.dispose(); + } + BufferedImage scaled = ThumbnailImages.scale(image, options.size()); + byte[] bytes = ThumbnailImages.write(scaled, options.format()); + return new ThumbnailResult(bytes, ThumbnailFormats.contentType(options.format()), options.format()); + } catch (IOException ex) { + throw new ThumbnailGenerationException("Failed to render PPTX thumbnail source", ex); + } catch (RuntimeException ex) { + if (ex instanceof ThumbnailGenerationException thumbnailEx) { + throw thumbnailEx; + } + throw new ThumbnailGenerationException("Failed to render PPTX thumbnail source", ex); + } + } + + private void validatePackageBounds(byte[] bytes, long maxSourceBytes, String filename) { + long maxEntryBytes = Math.min(MAX_ENTRY_BYTES, maxSourceBytes); + long totalBytes = 0; + try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(bytes))) { + ZipEntry entry; + while ((entry = zip.getNextEntry()) != null) { + if (entry.isDirectory()) { + continue; + } + long 
entryBytes = readBounded(zip, maxEntryBytes, entry.getName()); + totalBytes += entryBytes; + if (totalBytes > maxSourceBytes) { + throw new ThumbnailGenerationException("PPTX package exceeds max extracted bytes " + + maxSourceBytes + ": " + filename); + } + } + } catch (IOException ex) { + throw new ThumbnailGenerationException("Failed to validate PPTX package bounds", ex); + } + } + + private long readBounded(InputStream in, long maxBytes, String entryName) throws IOException { + byte[] buffer = new byte[BUFFER_SIZE]; + long total = 0; + int read; + while ((read = in.read(buffer)) != -1) { + total += read; + if (total > maxBytes) { + throw new ThumbnailGenerationException("PPTX package entry exceeds max extracted bytes " + + maxBytes + ": " + entryName); + } + } + return total; + } +} diff --git a/studio-platform-thumbnail/src/test/java/studio/one/platform/thumbnail/DocumentThumbnailRendererTest.java b/studio-platform-thumbnail/src/test/java/studio/one/platform/thumbnail/DocumentThumbnailRendererTest.java new file mode 100644 index 00000000..67cfa330 --- /dev/null +++ b/studio-platform-thumbnail/src/test/java/studio/one/platform/thumbnail/DocumentThumbnailRendererTest.java @@ -0,0 +1,117 @@ +package studio.one.platform.thumbnail; + +import static org.assertj.core.api.Assertions.assertThat; +import static org.assertj.core.api.Assertions.assertThatThrownBy; + +import java.awt.Color; +import java.awt.Dimension; +import java.awt.Rectangle; +import java.awt.image.BufferedImage; +import java.io.ByteArrayInputStream; +import java.io.ByteArrayOutputStream; +import java.util.List; +import java.util.zip.ZipEntry; +import java.util.zip.ZipOutputStream; + +import javax.imageio.ImageIO; + +import org.apache.poi.xslf.usermodel.XMLSlideShow; +import org.apache.poi.xslf.usermodel.XSLFSlide; +import org.apache.poi.xslf.usermodel.XSLFTextBox; +import org.junit.jupiter.api.Test; + +import studio.one.platform.thumbnail.renderer.PptxThumbnailRenderer; + +class 
DocumentThumbnailRendererTest { + + @Test + void pptxRendererGeneratesThumbnailFromSlide() throws Exception { + ThumbnailGenerationService service = service(List.of(new PptxThumbnailRenderer(0))); + + ThumbnailResult result = service.generate( + new ThumbnailSource( + "application/vnd.openxmlformats-officedocument.presentationml.presentation", + "sample.pptx", + pptxFixture(new Dimension(960, 540))), + 96, + "png").orElseThrow(); + + BufferedImage image = ImageIO.read(new ByteArrayInputStream(result.bytes())); + assertThat(result.contentType()).isEqualTo("image/png"); + assertThat(image.getWidth()).isLessThanOrEqualTo(96); + assertThat(image.getHeight()).isLessThanOrEqualTo(96); + } + + @Test + void pptxRendererRejectsOversizedSlidePixels() throws Exception { + ThumbnailGenerationService service = new ThumbnailGenerationService( + new ThumbnailRendererFactory(List.of(new PptxThumbnailRenderer(0))), + new ThumbnailGenerationOptions(128, "png", 16, 512, 1024 * 1024, 1_000)); + + assertThatThrownBy(() -> service.generate( + new ThumbnailSource( + "application/vnd.openxmlformats-officedocument.presentationml.presentation", + "large.pptx", + pptxFixture(new Dimension(4000, 4000))), + 96, + "png")) + .isInstanceOf(ThumbnailGenerationException.class) + .hasMessageContaining("exceed max pixels"); + } + + @Test + void pptxRendererRejectsOversizedExtractedPackageEntryBeforePoiParsing() { + ThumbnailGenerationService service = new ThumbnailGenerationService( + new ThumbnailRendererFactory(List.of(new PptxThumbnailRenderer(0))), + new ThumbnailGenerationOptions(128, "png", 16, 512, 1024, 25_000_000)); + + assertThatThrownBy(() -> service.generate( + new ThumbnailSource( + "application/vnd.openxmlformats-officedocument.presentationml.presentation", + "large.pptx", + zipWithEntry("ppt/slides/slide1.xml", new byte[2048])), + 96, + "png")) + .isInstanceOf(ThumbnailGenerationException.class) + .hasMessageContaining("entry exceeds max extracted bytes"); + } + + @Test + void 
pptxRendererDoesNotSupportNonPptxSource() { + PptxThumbnailRenderer renderer = new PptxThumbnailRenderer(0); + + assertThat(renderer.supports(new ThumbnailSource("application/pdf", "sample.pdf", new byte[] {1}))) + .isFalse(); + } + + private static ThumbnailGenerationService service(List renderers) { + return new ThumbnailGenerationService( + new ThumbnailRendererFactory(renderers), + new ThumbnailGenerationOptions(128, "png", 16, 512, 1024 * 1024, 25_000_000)); + } + + private static byte[] pptxFixture(Dimension pageSize) throws Exception { + try (XMLSlideShow presentation = new XMLSlideShow(); + ByteArrayOutputStream output = new ByteArrayOutputStream()) { + presentation.setPageSize(pageSize); + XSLFSlide slide = presentation.createSlide(); + XSLFTextBox title = slide.createTextBox(); + title.setAnchor(new Rectangle(60, 60, Math.max(120, pageSize.width - 120), 120)); + title.setFillColor(Color.WHITE); + title.setLineColor(Color.WHITE); + title.setText("Thumbnail preview"); + presentation.write(output); + return output.toByteArray(); + } + } + + private static byte[] zipWithEntry(String name, byte[] data) throws Exception { + ByteArrayOutputStream output = new ByteArrayOutputStream(); + try (ZipOutputStream zip = new ZipOutputStream(output)) { + zip.putNextEntry(new ZipEntry(name)); + zip.write(data); + zip.closeEntry(); + } + return output.toByteArray(); + } +}