[Bug] longbenchv2 accuracy evaluation produces no results #220

### Operating system and version

Ubuntu 20.04

### Python environment used to install the tool

A Python virtual environment created with Anaconda/Miniconda

### Python version

3.10

### AISBench version

Latest master branch

### AISBench command

ais_bench --models vllm_api_general_chat --datasets longbenchv2_gen --debug --num-warmups 0 --num-prompts 98

### Model configuration file or custom configuration file contents

The API configuration is as follows:
```
root@localhost:~/benchmark# cat /root/benchmark/ais_bench/benchmark/configs/models/vllm_api/vllm_api_general_chat.py
from ais_bench.benchmark.models import VLLMCustomAPIChat
from ais_bench.benchmark.utils.postprocess.model_postprocessors import extract_non_reasoning_content

models = [
    dict(
        attr="service",
        type=VLLMCustomAPIChat,
        abbr="vllm-api-general-chat",
        path="/mnt/share/weights/Qwen3.5-397B-A17B-w8a8-org",
        model="qwen3.5",
        stream=False,
        request_rate=0,
        use_timestamp=False,
        retry=2,
        api_key="",
        host_ip="141.61.81.51",
        host_port=8010,
        url="",
        max_out_len=32768,
        batch_size=16,
        trust_remote_code=False,
        generation_kwargs=dict(
            temperature=0,
            ignore_eos=False,
        ),
        pred_postprocessor=dict(type=extract_non_reasoning_content),
    )
]

```

### Expected behavior

_No response_

### Actual behavior

The evaluation prints the following to the console:
```
root@localhost:~/benchmark# ais_bench --models vllm_api_general_chat --datasets longbenchv2_gen --debug --num-warmups 0 --num-prompts 98
[2026-03-27 13:03:08,618] [ais_bench] [INFO] Loading vllm_api_general_chat: /root/benchmark/ais_bench/benchmark/configs/./models/vllm_api/vllm_api_general_chat.py
[2026-03-27 13:03:08,623] [ais_bench] [INFO] Loading longbenchv2_gen: /root/benchmark/ais_bench/benchmark/configs/./datasets/longbenchv2/longbenchv2_gen.py
[2026-03-27 13:03:08,625] [ais_bench] [INFO] Loading example: /root/benchmark/ais_bench/benchmark/configs/./summarizers/example.py
[2026-03-27 13:03:08,650] [ais_bench] [INFO] Current exp folder: outputs/default/20260327_130300
[2026-03-27 13:03:08,651] [ais_bench] [INFO] Keeping the first 98 prompts for dataset [LongBenchv2]
[2026-03-27 13:03:08,705] [ais_bench] [INFO] Starting inference tasks...
[2026-03-27 13:03:08,708] [ais_bench] [INFO] Partitioned into 1 tasks.
[2026-03-27 13:03:08,738] [ais_bench] [INFO] Launch TasksMonitor, PID: 138125, Refresh interval: 0.5, Run in background: True
[2026-03-27 13:03:17,741] [ais_bench] [INFO] Debug mode, print progress directly
[2026-03-27 13:03:17,742] [ais_bench] [INFO] Task [vllm-api-general-chat/LongBenchv2]
[2026-03-27 13:05:04,279] [ais_bench] [INFO] Zero Retriever initialized, returning empty shot case for all queries
[2026-03-27 13:05:04,822] [ais_bench] [INFO] Apply ice template finished
[2026-03-27 13:05:04,826] [ais_bench] [INFO] Warmup size is 0, skip...
[2026-03-27 13:05:04,879] [ais_bench] [INFO] Dataset needed memory size: 64.17680168 MB
[2026-03-27 13:05:04,879] [ais_bench] [INFO] Memory usage check passed: 13.02% < 80% (Available: 1751.38 GB)
/usr/local/python3.11.10/lib/python3.11/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
Bus error (core dumped)
[2026-03-27 13:05:10,060] [ais_bench] [INFO] Inference tasks completed.
[2026-03-27 13:05:10,066] [ais_bench] [INFO] Starting evaluation tasks...
[2026-03-27 13:05:10,069] [ais_bench] [INFO] Partitioned into 1 tasks.
[2026-03-27 13:05:10,088] [ais_bench] [INFO] Launch TasksMonitor, PID: 139489, Refresh interval: 0.5, Run in background: True
[2026-03-27 13:05:19,747] [ais_bench] [INFO] Debug mode, print progress directly
/usr/local/python3.11.10/lib/python3.11/site-packages/urllib3/connectionpool.py:1097: InsecureRequestWarning: Unverified HTTPS request is being made to host '141.0.180.100'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
[2026-03-27 13:07:04,262] [ais_bench] [WARNING] Task vllm-api-general-chat/LongBenchv2: No predictions found.
[2026-03-27 13:07:04,263] [ais_bench] [INFO] Evaluation task time elapsed: 104.52s
[2026-03-27 13:07:05,619] [ais_bench] [INFO] Evaluation tasks completed.
[2026-03-27 13:07:05,622] [ais_bench] [INFO] Summarizing evaluation results...
dataset      version    metric    mode    vllm-api-general-chat
-----------  ---------  --------  ------  -----------------------
LongBenchv2  -          -         -       -
[2026-03-27 13:07:05,626] [ais_bench] [INFO] write summary to /root/benchmark/outputs/default/20260327_130300/summary/summary_20260327_130300.txt
[2026-03-27 13:07:05,626] [ais_bench] [INFO] write csv to /root/benchmark/outputs/default/20260327_130300/summary/summary_20260327_130300.csv


The markdown format results is as below:

| dataset | version | metric | mode | vllm-api-general-chat |
|----- | ----- | ----- | ----- | -----|
| LongBenchv2 | - | - | - | - |

[2026-03-27 13:07:05,626] [ais_bench] [INFO] write markdown summary to /root/benchmark/outputs/default/20260327_130300/summary/summary_20260327_130300.md
```

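With `--num-prompts 96` the run completes normally; with `--num-prompts 98` the problem above occurs.

**No progress bar and no error log.** The preliminary suspicion is insufficient shared memory: the longbenchv2 dataset has long contexts, so the data may exceed the available shared memory size. This still needs to be investigated and confirmed.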
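One way to test the shared-memory suspicion is to compare the free space under `/dev/shm` with the "Dataset needed memory size" value from the log. The sketch below is not part of AISBench. The helper name `shm_has_room`, the 1.2 headroom factor, and the reading of "MB" as 10^6 bytes are all assumptions made for illustration. If the run happens inside a container, note that Docker's default `/dev/shm` size is 64 MB, which is close to the 64.18 MB reported in the log; whether that applies here is unconfirmed.

```python
import os
import shutil


def shm_has_room(required_bytes: int, path: str = "/dev/shm", headroom: float = 1.2) -> bool:
    """Return True if `path` has at least required_bytes * headroom bytes free.

    `headroom` is an arbitrary safety factor chosen for this sketch.
    """
    free = shutil.disk_usage(path).free
    return free >= required_bytes * headroom


if __name__ == "__main__":
    # 64.17680168 MB is the "Dataset needed memory size" printed in the log;
    # treating "MB" as 10**6 bytes is an assumption.
    needed = int(64.17680168 * 10**6)
    if os.path.isdir("/dev/shm"):
        usage = shutil.disk_usage("/dev/shm")
        print(f"/dev/shm total={usage.total / 1e6:.1f} MB, free={usage.free / 1e6:.1f} MB")
        print("enough room for the dataset:", shm_has_room(needed))
    else:
        print("/dev/shm not found on this system")
```

If the check reports too little room, enlarging shared memory (for example, starting the container with a larger `--shm-size`) and rerunning with `--num-prompts 98` would confirm or rule out this cause.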

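### Pre-submission checks

- [x] I have read the quick start in the main documentation and it did not resolve the problem
- [x] I have searched the FAQ and found no duplicate
- [x] I have searched existing issues and found no duplicate
- [x] I have updated to the latest version and the problem persists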
