HiThink Research

HiThink-Research

Popular repositories

  1. BizFinBench Public

    A Business-Driven Real-World Financial Benchmark for Evaluating LLMs

    Python, 218 stars, 8 forks

  2. MME-Finance Public

    [MM 2025] A Multimodal Finance Benchmark for Expert-level Understanding and Reasoning

    Python, 43 stars, 4 forks

  3. GAGE Public

    General AI evaluation and Gauge Engine. A unified evaluation engine for LLMs, MLLMs, audio, and diffusion models.

    Python, 29 stars, 5 forks

  4. BizFinBench.v2 Public

    BizFinBench.v2: A Unified Offline–Online Bilingual Benchmark for Expert-Level Financial Capability Evaluation of LLMs

    Python, 24 stars, 1 fork

  5. FinMTM Public

    FinMTM: A Multi-Turn Multimodal Benchmark for Financial Reasoning and Agent Evaluation

    Python, 15 stars

  6. PuzzleClone Public

    PuzzleClone: An SMT-Powered Framework for Synthesizing Verified Mathematical Reasoning Data

    Python, 5 stars

Repositories

Showing 10 of 10 repositories
  • FinMTM Public

    FinMTM: A Multi-Turn Multimodal Benchmark for Financial Reasoning and Agent Evaluation

    Python 15 0 0 0 Updated Jan 15, 2026
  • GAGE Public

    General AI evaluation and Gauge Engine. A unified evaluation engine for LLMs, MLLMs, audio, and diffusion models.

    Python 29 5 4 0 Updated Jan 15, 2026
  • CCPO Public

    Compress2Focus: Efficient Coordinate Compression for Policy Optimization in Multi-Turn GUI Agents

    Python 5 0 0 0 Updated Jan 13, 2026
  • BizFinBench.v2 Public

    BizFinBench.v2: A Unified Offline–Online Bilingual Benchmark for Expert-Level Financial Capability Evaluation of LLMs

    Python 24 1 0 0 Updated Jan 13, 2026
  • BizFinBench Public

    A Business-Driven Real-World Financial Benchmark for Evaluating LLMs

    Python 218 8 0 0 Updated Jan 9, 2026
  • PuzzleClone Public

    PuzzleClone: An SMT-Powered Framework for Synthesizing Verified Mathematical Reasoning Data

    Python 5 Apache-2.0 0 1 0 Updated Jan 9, 2026
  • MME-Finance Public

    [MM 2025] A Multimodal Finance Benchmark for Expert-level Understanding and Reasoning

    Python 43 Apache-2.0 4 0 1 Updated Jan 8, 2026
  • NEXUS-O Public

    [MM 2025] NEXUS-O: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision

    4 0 0 0 Updated Oct 20, 2025
  • PolyhedronEvaluator Public

    PolyhedronEvaluator

    Python 1 0 0 0 Updated Sep 19, 2025
  • Published_Papers
    0 0 0 0 Updated Feb 17, 2025