31 changes: 30 additions & 1 deletion README.md
@@ -1,6 +1,6 @@
# ListenHub CLI

-Command-line interface for [ListenHub](https://listenhub.ai) — create podcasts, text-to-speech, explainer videos, slides, AI images, and music from your terminal.
+Command-line interface for [ListenHub](https://listenhub.ai) — create podcasts, text-to-speech, explainer videos, slides, AI images, music, and videos from your terminal.

[中文文档](README.zh-CN.md)

@@ -76,6 +76,15 @@ listenhub tts create --text "Hello, world" --lang en
| `listenhub image list` | List AI images |
| `listenhub image get <id>` | Get image details |

### Video Generation

| Command | Description |
| -------------------------- | ------------------------------ |
| `listenhub video create` | Create a video generation task |
| `listenhub video list` | List video tasks |
| `listenhub video get <id>` | Get video task details |
| `listenhub video estimate` | Estimate credit cost |

### Other

| Command | Description |
@@ -109,6 +118,9 @@ listenhub music cover --audio ./song.mp3
# Local image for reference (jpg, png, webp, gif; max 10MB)
listenhub image create --prompt "inspired by this" --reference ./photo.jpg

# Local video for reference (mp4, mov; max 50MB)
listenhub video create --prompt "same style" --reference-video ./clip.mp4 --input-video-duration 5

# URLs are passed through directly
listenhub music cover --audio https://example.com/song.mp3
```
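
The local-file examples above depend on two decisions: whether the argument is a URL or a path, and whether a local file passes the format and size limits. A minimal sketch of those checks follows. The extension lists and byte limits are copied from `source/_shared/upload.ts` in this diff; the helper names (`isRemoteUrl`, `extensionOf`, `checkLocalFile`) and the `http(s)://` prefix test are assumptions made for illustration, not the CLI's actual implementation.

```typescript
type FileAcceptType = 'audio' | 'image' | 'video';

// Extensions and limits mirror upload.ts in this diff; the structure is illustrative.
const rules: Record<FileAcceptType, {extensions: string[]; maxBytes: number}> = {
  audio: {extensions: ['.mp3', '.wav', '.flac', '.m4a', '.ogg', '.aac'], maxBytes: 20 * 1024 * 1024},
  image: {extensions: ['.jpg', '.jpeg', '.png', '.webp', '.gif'], maxBytes: 10 * 1024 * 1024},
  video: {extensions: ['.mp4', '.mov'], maxBytes: 50 * 1024 * 1024},
};

// Assumption: anything starting with http:// or https:// is passed through as a URL.
function isRemoteUrl(input: string): boolean {
  return /^https?:\/\//i.test(input.trim());
}

// Lower-cased extension including the dot, or '' when the file name has none.
function extensionOf(filePath: string): string {
  const lastSeparator = Math.max(filePath.lastIndexOf('/'), filePath.lastIndexOf('\\'));
  const name = filePath.slice(lastSeparator + 1);
  const dot = name.lastIndexOf('.');
  return dot > 0 ? name.slice(dot).toLowerCase() : '';
}

// Returns an error message, or undefined when the file passes both checks.
function checkLocalFile(
  filePath: string,
  sizeBytes: number,
  accept: FileAcceptType,
): string | undefined {
  const rule = rules[accept];
  const ext = extensionOf(filePath);
  if (rule.extensions.indexOf(ext) === -1) {
    return `Unsupported ${accept} format "${ext}"; expected one of ${rule.extensions.join(', ')}`;
  }

  if (sizeBytes > rule.maxBytes) {
    return `File is ${sizeBytes} bytes; the ${accept} limit is ${rule.maxBytes} bytes`;
  }

  return undefined;
}
```

Passing the size in as a number keeps the sketch free of file-system calls; a caller would typically obtain it from `stat` before uploading.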
@@ -158,6 +170,23 @@ listenhub image create \
  --aspect-ratio 16:9 --size 4K
```

### Video generation

```bash
# Text-to-video
listenhub video create --prompt "A cat playing piano in a jazz bar"

# Image-to-video (first frame)
listenhub video create --prompt "Camera slowly zooms out" --first-frame ./scene.png

# With reference video
listenhub video create --prompt "Same style dancing" \
  --reference-video ./clip.mp4 --input-video-duration 8

# Estimate credits
listenhub video estimate --model doubao-seedance-2-pro --resolution 1080p --duration 10
```

### JSON output for scripting

```bash
33 changes: 31 additions & 2 deletions README.zh-CN.md
@@ -1,6 +1,6 @@
# ListenHub CLI

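-Command-line tool for [ListenHub](https://listenhub.ai): create podcasts, text-to-speech, explainer videos, slides, AI images, and music from your terminal
+Command-line tool for [ListenHub](https://listenhub.ai): create podcasts, text-to-speech, explainer videos, slides, AI images, music, and videos from your terminal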

[English](README.md)

@@ -76,6 +76,15 @@ listenhub tts create --text "你好世界" --lang zh
| `listenhub image list` | List images |
| `listenhub image get <id>` | Get image details |

### Video generation

| Command | Description |
| -------------------------- | ------------------------------ |
| `listenhub video create` | Create a video generation task |
| `listenhub video list` | List video tasks |
| `listenhub video get <id>` | Get video task details |
| `listenhub video estimate` | Estimate credit cost |

### Other

| Command | Description |
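@@ -100,7 +109,7 @@ listenhub tts create --text "你好世界" --lang zh

## Local file upload

-`music cover` and `image create` support referencing local files. The CLI automatically detects local paths, validates format and size, uploads the file to cloud storage, and then passes it to the API.
+`music cover`, `image create`, and `video create` support referencing local files. The CLI automatically detects local paths, validates format and size, uploads the file to cloud storage, and then passes it to the API.

```bash
# Local audio file for a cover (mp3, wav, flac, m4a, ogg, aac; max 20MB)
@@ -109,6 +118,9 @@ listenhub music cover --audio ./song.mp3
# Local image for reference (jpg, png, webp, gif; max 10MB)
listenhub image create --prompt "inspired by this" --reference ./photo.jpg

# Local video for reference (mp4, mov; max 50MB)
listenhub video create --prompt "same style" --reference-video ./clip.mp4 --input-video-duration 5

# URLs are passed through directly
listenhub music cover --audio https://example.com/song.mp3
```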
@@ -158,6 +170,23 @@ listenhub image create \
  --aspect-ratio 16:9 --size 4K
```

### Video generation

```bash
# Text-to-video
listenhub video create --prompt "A cat playing piano in a jazz bar"

# Image-to-video (first frame)
listenhub video create --prompt "Camera slowly zooms out" --first-frame ./scene.png

# With reference video
listenhub video create --prompt "Same style dancing" \
  --reference-video ./clip.mp4 --input-video-duration 8

# Estimate credits
listenhub video estimate --model doubao-seedance-2-pro --resolution 1080p --duration 10
```

### JSON output for scripting

```bash
4 changes: 2 additions & 2 deletions package.json
@@ -1,6 +1,6 @@
{
  "name": "@marswave/listenhub-cli",
-  "version": "0.0.4",
+  "version": "0.0.5",
  "description": "Command-line interface for ListenHub",
  "license": "MIT",
  "repository": "marswaveai/listenhub-cli",
@@ -25,7 +25,7 @@
    "prepublishOnly": "pnpm run build"
  },
  "dependencies": {
-    "@marswave/listenhub-sdk": "^0.0.4",
+    "@marswave/listenhub-sdk": "^0.0.6",
    "commander": "^14.0.3",
    "open": "^10.0.0",
    "ora": "^8.0.0"
10 changes: 5 additions & 5 deletions pnpm-lock.yaml

Some generated files are not rendered by default.

91 changes: 91 additions & 0 deletions source/_shared/mp4-duration.ts
@@ -0,0 +1,91 @@
import {open} from 'node:fs/promises';

// Returns the movie duration in whole seconds, read from the mvhd atom inside moov.
export async function getMp4Duration(filePath: string): Promise<number> {
  const file = await open(filePath, 'r');
  try {
    const moovOffset = await findAtom(file, 'moov', 0, await fileSize(file));
    if (moovOffset === undefined) {
      throw new Error(`Cannot read video duration: moov atom not found in ${filePath}`);
    }

    // Child atoms of moov start right after its 8-byte header.
    const moovHeader = await readAtomHeader(file, moovOffset);
    const moovEnd = moovOffset + moovHeader.size;
    const mvhdOffset = await findAtom(file, 'mvhd', moovOffset + 8, moovEnd);
    if (mvhdOffset === undefined) {
      throw new Error(`Cannot read video duration: mvhd atom not found in ${filePath}`);
    }

    const dataOffset = mvhdOffset + 8;

    const versionBuf = Buffer.alloc(1);
    await file.read(versionBuf, 0, 1, dataOffset);
    const version = versionBuf[0]!;

    let timescale: number;
    let duration: bigint;

    if (version === 0) {
      // Skip version/flags (4 bytes) and the 32-bit creation/modification times (8 bytes).
      const buf = Buffer.alloc(8);
      await file.read(buf, 0, 8, dataOffset + 4 + 8);
      timescale = buf.readUInt32BE(0);
      duration = BigInt(buf.readUInt32BE(4));
    } else if (version === 1) {
      // Skip version/flags (4 bytes) and the 64-bit creation/modification times (16 bytes).
      const buf = Buffer.alloc(12);
      await file.read(buf, 0, 12, dataOffset + 4 + 16);
      timescale = buf.readUInt32BE(0);
      duration = buf.readBigUInt64BE(4);
    } else {
      throw new Error(`Cannot read video duration: unsupported mvhd version ${String(version)}`);
    }

    if (timescale === 0) {
      throw new Error(`Cannot read video duration: timescale is 0`);
    }

    // duration is expressed in timescale units per second.
    return Math.round(Number(duration) / timescale);
  } finally {
    await file.close();
  }
}

interface AtomHeader {
  size: number;
  type: string;
}

// Reads the 8-byte atom header (32-bit size, 4-character type).
// A short read, such as at end of file, yields a zero-size header.
async function readAtomHeader(
  file: Awaited<ReturnType<typeof open>>,
  offset: number,
): Promise<AtomHeader> {
  const buf = Buffer.alloc(8);
  const {bytesRead} = await file.read(buf, 0, 8, offset);
  if (bytesRead < 8) {
    return {size: 0, type: ''};
  }

  const size = buf.readUInt32BE(0);
  const type = buf.toString('ascii', 4, 8);
  return {size, type};
}

// Walks sibling atoms in [start, end) and returns the offset of the first match.
// Note: 64-bit extended sizes (a size field of 1) are not handled here.
async function findAtom(
  file: Awaited<ReturnType<typeof open>>,
  target: string,
  start: number,
  end: number,
): Promise<number | undefined> {
  let offset = start;
  while (offset < end) {
    const header = await readAtomHeader(file, offset); // eslint-disable-line no-await-in-loop
    if (header.size === 0) break;
    if (header.type === target) return offset;
    offset += header.size;
  }

  return undefined;
}

async function fileSize(file: Awaited<ReturnType<typeof open>>): Promise<number> {
  const stat = await file.stat();
  return stat.size;
}
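
To make the byte offsets in `getMp4Duration` easier to check, here is a small sketch that decodes the same two mvhd layouts from an in-memory byte array instead of a file handle. The names `parseMvhdDurationSeconds` and `buildMvhdPayload` are hypothetical and exist only for this illustration. Unlike the function in the diff, the sketch returns unrounded seconds and reads the 64-bit duration as two 32-bit halves rather than a `bigint`.

```typescript
// payload starts at the version byte, i.e. right after the 8-byte mvhd atom header.
function parseMvhdDurationSeconds(payload: Uint8Array): number {
  const view = new DataView(payload.buffer, payload.byteOffset, payload.byteLength);
  const version = view.getUint8(0);
  let timescale: number;
  let duration: number;

  if (version === 0) {
    // version/flags (4) + creation (4) + modification (4) = 12 bytes before timescale.
    timescale = view.getUint32(12);
    duration = view.getUint32(16);
  } else if (version === 1) {
    // version/flags (4) + creation (8) + modification (8) = 20 bytes before timescale.
    timescale = view.getUint32(20);
    const high = view.getUint32(24);
    const low = view.getUint32(28);
    duration = high * 2 ** 32 + low;
  } else {
    throw new Error(`Unsupported mvhd version ${version}`);
  }

  if (timescale === 0) {
    throw new Error('timescale is 0');
  }

  return duration / timescale;
}

// Builds a minimal mvhd payload for testing; all other fields are left as zero.
function buildMvhdPayload(version: 0 | 1, timescale: number, duration: number): Uint8Array {
  const bytes = new Uint8Array(version === 0 ? 20 : 32);
  const view = new DataView(bytes.buffer);
  view.setUint8(0, version);
  if (version === 0) {
    view.setUint32(12, timescale);
    view.setUint32(16, duration);
  } else {
    view.setUint32(20, timescale);
    view.setUint32(24, Math.floor(duration / 2 ** 32));
    view.setUint32(28, duration >>> 0);
  }

  return bytes;
}
```

`DataView` reads big-endian by default, which matches the `readUInt32BE` calls in the diff.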
37 changes: 37 additions & 0 deletions source/_shared/polling.ts
@@ -4,6 +4,7 @@ import type {
  ListenHubClient,
  LyricsTaskDetail,
  MusicTaskDetail,
  VideoGenerationTaskDetail,
} from '@marswave/listenhub-sdk';
import ora from 'ora';
import {CliTimeoutError} from './output.js';
@@ -127,6 +128,42 @@ export async function pollMusicTaskUntilDone(
  throw new CliTimeoutError(`Timed out after ${timeoutS}s`);
}

export async function pollVideoTaskUntilDone(
  client: ListenHubClient,
  taskId: string,
  options: {timeout?: number; json?: boolean},
): Promise<VideoGenerationTaskDetail> {
  const timeoutS = options.timeout ?? 1200;
  const maxAttempts = Math.ceil(timeoutS / (pollIntervalMs / 1000));
  const spinner = options.json
    ? undefined
    : ora({text: `Generating video... (1/${maxAttempts})`}).start();

  for (let i = 0; i < maxAttempts; i++) {
    if (i > 0) {
      await sleep(pollIntervalMs); // eslint-disable-line no-await-in-loop
    }

    const task = await client.getVideoGenerationTask(taskId); // eslint-disable-line no-await-in-loop
    if (task.status === 'success') {
      spinner?.succeed('Video created successfully');
      return task;
    }

    if (task.status === 'failed') {
      spinner?.fail('Video creation failed');
      throw new Error('Video creation failed');
    }

    if (spinner) {
      spinner.text = `Generating video... (${String(i + 2)}/${maxAttempts})`;
    }
  }

  spinner?.fail('Timed out');
  throw new CliTimeoutError(`Timed out after ${timeoutS}s`);
}
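
The attempt-count arithmetic and loop shape of `pollVideoTaskUntilDone` can be exercised without the SDK or a spinner. The sketch below is a simplified stand-in, not the CLI's code: `pollUntilDone`, `maxAttemptsFor`, the `PollableTask` type, and the injectable `sleep` are all hypothetical names, and the interval is passed in explicitly because the value of `pollIntervalMs` is not visible in this diff.

```typescript
interface PollableTask {
  status: 'pending' | 'success' | 'failed';
}

// Number of polls that fit in the timeout, rounded up (same formula as the diff).
function maxAttemptsFor(timeoutS: number, intervalMs: number): number {
  return Math.ceil(timeoutS / (intervalMs / 1000));
}

async function pollUntilDone<T extends PollableTask>(
  fetchTask: () => Promise<T>,
  options: {timeoutS: number; intervalMs: number; sleep?: (ms: number) => Promise<void>},
): Promise<T> {
  // Default sleep uses a timer; tests can inject a no-op to run instantly.
  const sleep =
    options.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  const maxAttempts = maxAttemptsFor(options.timeoutS, options.intervalMs);

  for (let i = 0; i < maxAttempts; i++) {
    // Poll immediately on the first attempt, then wait between attempts.
    if (i > 0) {
      await sleep(options.intervalMs);
    }

    const task = await fetchTask();
    if (task.status === 'success') return task;
    if (task.status === 'failed') throw new Error('Task failed');
  }

  throw new Error(`Timed out after ${options.timeoutS}s`);
}
```

Injecting `sleep` is a design choice for testability only; the function in the diff calls its module-level `sleep` directly.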

const lyricsIntervalMs = 5000;

export async function pollLyricsTaskUntilDone(
15 changes: 11 additions & 4 deletions source/_shared/upload.ts
@@ -2,19 +2,22 @@ import {access, readFile, stat} from 'node:fs/promises';
import path from 'node:path';
import type {ListenHubClient} from '@marswave/listenhub-sdk';

-type FileAcceptType = 'audio' | 'image';
+type FileAcceptType = 'audio' | 'image' | 'video';

const audioExtensions = new Set(['.mp3', '.wav', '.flac', '.m4a', '.ogg', '.aac']);
const imageExtensions = new Set(['.jpg', '.jpeg', '.png', '.webp', '.gif']);
const videoExtensions = new Set(['.mp4', '.mov']);

const maxSizeBytes: Record<FileAcceptType, number> = {
  audio: 20 * 1024 * 1024,
  image: 10 * 1024 * 1024,
  video: 50 * 1024 * 1024,
};

const categoryForType: Record<FileAcceptType, string> = {
  audio: 'episode',
  image: 'banana',
  video: 'episode',
};

const mimeTypes = new Map<string, string>([
@@ -29,16 +32,20 @@ const mimeTypes = new Map<string, string>([
  ['.png', 'image/png'],
  ['.webp', 'image/webp'],
  ['.gif', 'image/gif'],
  ['.mp4', 'video/mp4'],
  ['.mov', 'video/quicktime'],
]);

function allowedExtensions(accept: FileAcceptType): Set<string> {
-  return accept === 'audio' ? audioExtensions : imageExtensions;
+  if (accept === 'audio') return audioExtensions;
+  if (accept === 'video') return videoExtensions;
+  return imageExtensions;
}

export async function resolveFileOrUrl(
  client: ListenHubClient,
  input: string,
-  options: {accept: FileAcceptType},
+  options: {accept: FileAcceptType; category?: string},
): Promise<string> {
  const trimmed = input.trim();

@@ -77,7 +84,7 @@ export async function resolveFileOrUrl(
  // Get presigned upload URL
  const contentType = mimeTypes.get(ext)!;
  const fileKey = path.basename(filePath);
-  const category = categoryForType[options.accept];
+  const category = options.category ?? categoryForType[options.accept];
  const {presignedUrl, fileUrl} = await client.createFileUpload({
    fileKey,
    contentType,
2 changes: 2 additions & 0 deletions source/cli.ts
@@ -10,6 +10,7 @@ import {register as registerPodcast} from './podcast/_cli.js';
import {register as registerSlides} from './slides/_cli.js';
import {register as registerSpeakers} from './speakers/_cli.js';
import {register as registerTts} from './tts/_cli.js';
import {register as registerVideo} from './video/_cli.js';

const program = new Command();
program.name('listenhub').description('ListenHub CLI').version('0.1.0');
@@ -23,6 +24,7 @@ registerImage(program);
registerMusic(program);
registerLyrics(program);
registerSpeakers(program);
registerVideo(program);
registerCreation(program);

program.parse();