Commit 26caeda

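docs(blog): add OKAIBOX Dev Diary Day 2 post
A reflection on development direction in the AI era - a post covering the lack of document-editing infrastructure and the pivot to building an HWP/HWPX editing engine (ko/en/ja)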
1 parent eceb7b2 commit 26caeda

4 files changed

Lines changed: 743 additions & 0 deletions

Lines changed: 247 additions & 0 deletions
@@ -0,0 +1,247 @@
1+
---
2+
layout: post
3+
title: "AI Can Do Everything Now - So What Should We Build?"
4+
date: 2026-02-19 09:00:00 +0900
5+
categories: [Development, DevDiary]
6+
tags: [OKAIBOX, AIAgent, MCP, HWP, DocumentEditing, DevDiary, KoreanAI, HWPX, AIInfrastructure]
7+
author: "Kevin Park"
8+
lang: en
9+
excerpt: "In an age where AI can build almost anything with a click, what should developers actually work on? A document editing failure pointed me toward the answer."
10+
image: "/assets/images/posts/okaibox-dev-diary-day2/hero.jpg"
11+
mermaid: true
12+
series: "OKAIBOX Dev Diary"
13+
---
14+
15+
![Hero Image](/assets/images/posts/okaibox-dev-diary-day2/hero.jpg)
16+
*OKAIBOX Dev Diary Day 2 - What Should Developers Build in the Age of AI?*
17+
18+
I was supposed to unbox the LattePanda IOTA today.
19+
20+
In [Day 1](/2026/02/14/okaibox-dev-diary-day1-en/), I even promised I'd crack open the hardware and share a detailed BOM breakdown. But before touching hardware, I need to sort out some thoughts first.
21+
22+
## When AI Can Do Everything, What Should We Build?
23+
24+
I've been having a bit of an existential crisis lately.
25+
26+
It's 2026. Fire up Cursor, say "build me this app," and it just... appears. AI agents like OpenClaw, which control PCs directly, have crossed 160K GitHub stars. MCP is becoming a standard protocol. AI reads files, fixes code, searches the web... anyone can do this now.
27+
28+
So the question that keeps nagging at me:
29+
30+
**"What should I actually be building?"**
31+
32+
When AI writes code for you, does writing code still matter? Even if I build an AI agent like OKAIBOX, OpenClaw is already doing it well. What kind of development actually means something in this era?
33+
34+
I sat with this question for a few days, trying different things. And I found my answer in an unexpected place.
35+
36+
## AI Can't Even Edit a Single Document
37+
38+
I have this quotation template we use at work - a .docx file. I asked AI to "update the amount in item 3."
39+
40+
The text changed. But all the formatting was gone.
41+
42+
Table borders disappeared, fonts changed, merged cells came apart. The clean, professional quotation became a blob of plain text. I connected an MCP server directly and tried manipulating the XML structure, but style information kept getting stripped away.
43+
44+
HWP files (Korea's standard word processor format)? Can't even open them. "Unsupported format." That's it.
45+
46+
This got me thinking deeply.
47+
48+
## Understanding "Content" vs. Handling "Form"
49+
50+
AI has reached near-perfect capability with **content**. Understanding text, grasping context, generating new content - it genuinely excels at this.
51+
52+
But **form** is an entirely different problem.
53+
54+
A document isn't just a sequence of text. Fonts, margins, tables, merged cells, page breaks, headers/footers, style references... these intertwine in complex ways to create what we call a "document."
55+
56+
Crack open a single DOCX file and you see the reality. It looks like one file, but it's actually a ZIP archive containing a dozen or more XML parts: `word/document.xml`, `word/styles.xml`, `word/numbering.xml`, `word/_rels/document.xml.rels`... Modifying a single piece of text can mean keeping style references, numbering, and relationship data consistent across several of those parts.
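
To make the "ZIP archive full of XML" point concrete, here is a minimal sketch using only Python's standard library. It builds a small stand-in archive in memory and lists what is inside. The part names follow the usual DOCX layout, but the contents are placeholders, not a valid Word document, and `list_parts` and `build_stand_in` are names made up for this example.

```python
import io
import zipfile


def list_parts(docx_bytes: bytes) -> list[str]:
    """Return the names of every part stored in a DOCX-style ZIP archive."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        return sorted(archive.namelist())


def build_stand_in() -> bytes:
    """Build a placeholder archive that mimics the layout of a DOCX file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in (
            "[Content_Types].xml",
            "_rels/.rels",
            "word/document.xml",
            "word/styles.xml",
            "word/numbering.xml",
            "word/_rels/document.xml.rels",
        ):
            # Placeholder content only; a real part holds WordprocessingML.
            archive.writestr(name, "<placeholder/>")
    return buffer.getvalue()


if __name__ == "__main__":
    for part in list_parts(build_stand_in()):
        print(part)
```

Pointing `list_parts` at the bytes of a real `.docx` file should show the same kind of layout, usually with more parts (themes, settings, fonts, and so on).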
57+
58+
```mermaid
59+
graph TD
60+
subgraph "Where AI excels"
61+
A[Plain Text] --> B[Content Understanding]
62+
B --> C[Text Modification]
63+
C --> D[Result Output]
64+
end
65+
66+
subgraph "Where AI struggles"
67+
E[Formatted Document] --> F[Extract Archive]
68+
F --> G[Parse XML]
69+
G --> H[Map Styles]
70+
H --> I[Modify Content]
71+
I --> J[Reapply Styles]
72+
J --> K[Reassemble XML]
73+
K --> L[Recompress Archive]
74+
end
75+
76+
style A fill:#e8f5e8
77+
style D fill:#e8f5e8
78+
style E fill:#ffebee
79+
style H fill:#fff3e0
80+
style J fill:#fff3e0
81+
```
82+
83+
Left side: 4 steps. Right side: 8 steps. But it's not just about the number of steps.
84+
85+
The critical points are **"Map Styles"** and **"Reapply Styles."** If even one piece of information gets lost here, the output breaks. And this isn't a problem of "intelligence" - it's a problem of **tooling**.
86+
87+
The AI model itself is smart enough. Explain a document structure and it understands. But there's no intermediate tool that connects that understanding to actual file manipulation. Like having the world's best hammer but no nails.
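
As a rough illustration of what such an intermediate tool has to do, here is a minimal sketch of the extract, modify, reassemble round trip using only Python's standard library. It is not the engine this post is about, just the smallest version of the idea: change the text inside `w:t` elements, leave the run properties (`w:rPr`) alone, and copy every other part of the archive through byte for byte. The sample document and the function names `build_sample` and `replace_text` are invented for this example.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)  # keep the "w:" prefix when serializing

# A tiny stand-in document: one bold run containing the amount.
SAMPLE_DOCUMENT = (
    f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
    "<w:rPr><w:b/></w:rPr><w:t>Amount: 1000</w:t>"
    "</w:r></w:p></w:body></w:document>"
)
SAMPLE_STYLES = f'<w:styles xmlns:w="{W_NS}"/>'


def build_sample() -> bytes:
    """Build an in-memory archive with a document part and a styles part."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("word/document.xml", SAMPLE_DOCUMENT)
        archive.writestr("word/styles.xml", SAMPLE_STYLES)
    return buffer.getvalue()


def replace_text(docx_bytes: bytes, old: str, new: str) -> bytes:
    """Replace text inside w:t elements; copy all other parts unchanged."""
    output = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as source:
        with zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
            for name in source.namelist():
                data = source.read(name)
                if name == "word/document.xml":
                    root = ET.fromstring(data)
                    # Only the text nodes change; w:rPr siblings are untouched.
                    for text_node in root.iter(f"{{{W_NS}}}t"):
                        if text_node.text and old in text_node.text:
                            text_node.text = text_node.text.replace(old, new)
                    data = ET.tostring(root, encoding="utf-8", xml_declaration=True)
                target.writestr(name, data)
    return output.getvalue()


if __name__ == "__main__":
    edited = replace_text(build_sample(), "1000", "2500")
    with zipfile.ZipFile(io.BytesIO(edited)) as archive:
        print(archive.read("word/document.xml").decode("utf-8"))
```

Real documents are much less forgiving than this sample. A phrase is often split across several runs, and `xml.etree.ElementTree` rewrites namespace declarations it does not know about, which can be enough to make Word reject the file. Those gaps are exactly where the tooling problem lives.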
88+
89+
## The Empty Layer in AI Development
90+
91+
Let's zoom out.
92+
93+
The AI development ecosystem right now is buzzing with activity - chatbots, RAG systems, AI agents, automation tools. Everyone's building valuable things on top of AI.
94+
95+
But something stands out. There's a lot of action above AI, and it's pretty quiet below.
96+
97+
Think about it this way:
98+
99+
```mermaid
100+
graph TB
101+
subgraph "Where most AI development happens"
102+
App1[Chatbots]
103+
App2[RAG Systems]
104+
App3[AI Agents]
105+
App4[Automation Tools]
106+
end
107+
108+
subgraph "AI Model Layer"
109+
LLM[GPT / Claude / Gemini]
110+
end
111+
112+
subgraph "The empty layer"
113+
Infra1[Document Format Editing Engine]
114+
Infra2[Local File System Bridge]
115+
Infra3[Regional Service API Adapters]
116+
Infra4[Legacy Format Converters]
117+
end
118+
119+
App1 --> LLM
120+
App2 --> LLM
121+
App3 --> LLM
122+
App4 --> LLM
123+
LLM --> Infra1
124+
LLM --> Infra2
125+
LLM --> Infra3
126+
LLM --> Infra4
127+
128+
style App1 fill:#e1f5fe
129+
style App2 fill:#e1f5fe
130+
style App3 fill:#e1f5fe
131+
style App4 fill:#e1f5fe
132+
style LLM fill:#fff3e0
133+
style Infra1 fill:#ffebee
134+
style Infra2 fill:#ffebee
135+
style Infra3 fill:#ffebee
136+
style Infra4 fill:#ffebee
137+
```
138+
139+
Plenty of apps are being built on top. Big tech companies are competing to improve the AI models in the middle. But **the bottom layer** - the infrastructure that connects AI to the real world - is largely empty.
140+
141+
Even if AI perfectly understands "change the amount in quotation item 3," without **a docx editing engine that preserves formatting**, it simply can't execute.
142+
143+
This is the gap in today's AI development ecosystem. Vibrant on top, hollow underneath.
144+
145+
## In Korea, This Gap Is Even Wider
146+
147+
This infrastructure layer is thin globally, but in Korea the situation is more severe.
148+
149+
| Area | Global Status | Korea Status |
150+
| ------ | --------- | -------- |
151+
| Document Formats | DOCX editing libraries exist (incomplete) | No mature HWP/HWPX editing libraries |
152+
| Messaging | WhatsApp/Telegram APIs well-established | KakaoTalk bot API limited |
153+
| Government Automation | Mostly web-standard based | ActiveX/security programs required |
154+
| Financial Services | Open banking APIs common | Complex certificate/security modules |
155+
156+
HWP is the prime example. It's Korea's proprietary document format by Hancom, with a binary structure that's inherently difficult to parse. The official specification is only partially public, and building a proper editing library sometimes requires reverse engineering.
157+
158+
HWPX is better - it's the next-gen format based on XML with an open structure. But a proper read-write-edit library? I couldn't find a mature one. Not in Python. Not in JavaScript.
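
The difference between the two formats shows up in the very first bytes of a file. Below is a minimal sketch, assuming HWP 5.x files (stored in a Compound File container, the same one older `.doc` files use) and HWPX files (stored as a ZIP package). It only identifies the container, so a DOCX file also reports as `zip`. The function name `sniff_container` is made up for this example.

```python
CFB_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")  # Compound File Binary signature
ZIP_MAGIC = b"PK\x03\x04"  # local file header of a non-empty ZIP archive


def sniff_container(head: bytes) -> str:
    """Classify a file by its leading bytes: 'cfb', 'zip', or 'unknown'."""
    if head.startswith(CFB_MAGIC):
        return "cfb"  # binary HWP 5.x lives in this kind of container
    if head.startswith(ZIP_MAGIC):
        return "zip"  # HWPX (and DOCX) are ZIP packages of XML parts
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 8 bytes of a file on disk and classify it."""
    with open(path, "rb") as handle:
        return sniff_container(handle.read(8))


if __name__ == "__main__":
    print(sniff_container(CFB_MAGIC + b"\x00" * 8))
    print(sniff_container(ZIP_MAGIC + b"mimetype"))
```

A `zip` result means the standard `zipfile` module can at least open the package. A `cfb` result means a compound-file reader is needed before any HWP-specific parsing can start.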
159+
160+
Will any global AI company build HWP support? No. Korea is the only country that uses it. OpenAI, Google, Anthropic - none of them will ever touch HWP.
161+
162+
**This isn't a problem that solves itself by waiting.**
163+
164+
## OKAIBOX Changes Direction
165+
166+
In [Day 1](/2026/02/14/okaibox-dev-diary-day1-en/), I introduced OKAIBOX as "Korea's OpenClaw." A hardware-based AI agent. KakaoTalk integration, native Windows, Korean service optimization.
167+
168+
But that alone would just make it a Korean version of OpenClaw. Playing on the surface.
169+
170+
What's really needed is building the layer underneath.
171+
172+
```mermaid
173+
graph LR
174+
subgraph "Day 1: Above the Surface"
175+
A1[OKAIBOX Hardware] --> A2[Windows 11 IoT]
176+
A2 --> A3[KakaoTalk Integration]
177+
A2 --> A4[Korean Web Automation]
178+
A2 --> A5["Open HWP<br/>(via Hangul program)"]
179+
end
180+
181+
subgraph "Day 2: Below the Surface"
182+
B1[HWP/HWPX Editing Engine] --> B2[MCP Server]
183+
B2 --> B3[OKAIBOX]
184+
B2 --> B4[Claude / GPT]
185+
B2 --> B5[OpenClaw and<br/>all AI agents]
186+
B1 --> B6[Format-Preserving Edit]
187+
B1 --> B7[Table/Chart Manipulation]
188+
end
189+
190+
style B1 fill:#e8f5e8
191+
style B2 fill:#e1f5fe
192+
```
193+
194+
In Day 1, I planned to "run the Hangul program to handle HWP files." But that's just depending on the Hangul application. It's not AI handling documents directly - it's mimicking what humans already do.
195+
196+
The new direction: **build an engine that directly parses and edits HWP/HWPX files**, wrap it as an MCP server, and make it callable from any AI. Not just OKAIBOX - Claude, GPT, OpenClaw, anything can use this engine to handle Korean documents.
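
To sketch what "callable from any AI" could look like on the wire: MCP messages are JSON-RPC 2.0, and a client discovers tools with `tools/list` and invokes one with `tools/call`. The handler below is a standard-library illustration of that request and response shape only. It skips the initialization handshake and the transport, and a real server would more likely be built on an official MCP SDK. The tool name `replace_text`, its arguments, and its stub body are hypothetical.

```python
import json


def replace_text_tool(arguments: dict) -> str:
    """Hypothetical tool body; a real one would call the editing engine."""
    return (
        f"Would replace {arguments['old']!r} with {arguments['new']!r} "
        f"in {arguments['path']}"
    )


# Hypothetical tool registry: name -> description, JSON Schema, handler.
TOOLS = {
    "replace_text": {
        "description": "Replace text in a document while preserving formatting.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "path": {"type": "string"},
                "old": {"type": "string"},
                "new": {"type": "string"},
            },
            "required": ["path", "old", "new"],
        },
        "handler": replace_text_tool,
    }
}


def error(request_id, code: int, message: str) -> str:
    """Build a JSON-RPC 2.0 error response."""
    body = {"jsonrpc": "2.0", "id": request_id, "error": {"code": code, "message": message}}
    return json.dumps(body)


def handle_request(raw: str) -> str:
    """Answer a single tools/list or tools/call request."""
    request = json.loads(raw)
    request_id = request.get("id")
    method = request.get("method")
    if method == "tools/list":
        result = {
            "tools": [
                {"name": name, "description": spec["description"], "inputSchema": spec["inputSchema"]}
                for name, spec in TOOLS.items()
            ]
        }
    elif method == "tools/call":
        params = request.get("params", {})
        spec = TOOLS.get(params.get("name"))
        if spec is None:
            return error(request_id, -32602, "Unknown tool")
        text = spec["handler"](params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        return error(request_id, -32601, "Method not found")
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "result": result})


if __name__ == "__main__":
    call = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {
            "name": "replace_text",
            "arguments": {"path": "quote.hwpx", "old": "1000", "new": "2500"},
        },
    }
    print(handle_request(json.dumps(call)))
```

The point of the design is that the engine sits behind one small, model-agnostic interface. Any client that speaks the protocol gets the same tool, which is what makes it infrastructure rather than a feature of one agent.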
197+
198+
Not an app. **Infrastructure.**
199+
200+
From "Korea's OpenClaw" to "Korean document infrastructure for AI." That's the pivot.
201+
202+
## My Answer: Build the Infrastructure
203+
204+
Making it possible for AI to do what it currently **can't**. Not touching the AI model itself, but expanding the **contact surface** where AI meets the real world. That's the development I want to do.
205+
206+
Build an HWP parsing library, and every AI in the world can handle Korean documents. Build a format-preserving document editing engine, and AI can work with real business documents.
207+
208+
It's not easy. You have to read file format specs, implement binary parsing, handle edge cases one by one. It's not glamorous either. But one piece of infrastructure like this opens up countless possibilities for everything built on top.
209+
210+
This is the direction I found in the age of one-click everything.
211+
212+
## Updated Roadmap
213+
214+
I've revised the plan from Day 1.
215+
216+
**Priority 1: HWP/HWPX Editing Engine**
217+
- Analyze the HWPX file format (starting with the XML-based format)
218+
- Implement read/write/edit library
219+
- Format preservation as the key goal
220+
221+
**Priority 2: MCP Server**
222+
- Wrap the engine as an MCP server
223+
- Make it immediately usable from Claude, GPT, etc.
224+
- Include DOCX format-preserving editing
225+
226+
**Priority 3: OKAIBOX Hardware (in parallel)**
227+
- LattePanda IOTA setup continues
228+
- Hardware and software development in parallel
229+
230+
The LattePanda unboxing has been pushed to Day 3 or Day 4. I'll be tearing apart the HWPX file format in Day 3.
231+
232+
## Wrapping Up
233+
234+
What's the most valuable thing a developer can do in the age of AI?
235+
236+
Building tools that let AI reach places it currently can't. That's the answer I found.
237+
238+
OKAIBOX started with a shallow direction - "Korea's OpenClaw." But hitting real walls showed me what's actually needed. The reality that AI can't edit a single HWP file. Filling that gap feels like the work I should be doing.
239+
240+
Next up, I'll be dissecting actual HWPX files. It's XML-based, so it should be more approachable than binary HWP... right?
241+
242+
---
243+
244+
**Series**: OKAIBOX Dev Diary
245+
- **Previous**: [Day 1 - Everyone's Talking About OpenClaw, But I Was Already Building the Korean Version](/2026/02/14/okaibox-dev-diary-day1-en/)
246+
- **Current**: Day 2 - AI Can Do Everything Now - So What Should We Build?
247+
- **Next**: Day 3 - Tearing Apart the HWPX File Format (Coming Soon)
