Google Gemini 3 Sets New SOTA on OmniDocBench: The New Standard for Document AI

The verdict is in: The world's smartest AI models are now being graded on data built by Abaka AI.

Yesterday, the AI world shifted with Google DeepMind’s release of Gemini 3, its most capable multimodal model to date. While the headlines focus on its reasoning capabilities, a closer look at the technical report reveals a critical detail for the data industry:

OmniDocBench 1.5, a benchmark co-developed by 2077AI with contributions from Abaka AI, was selected as the core standard to evaluate Gemini 3’s Optical Character Recognition (OCR) and document understanding performance.

This follows closely on the heels of DeepSeek-OCR citing the same benchmark. The pattern is undeniable: when top-tier labs need to prove their models can handle the messy, complex reality of the visual world, they turn to OmniDocBench.

Gemini 3's Performance: A Leap Forward

According to Google’s report, Gemini 3 Pro achieved an Edit Distance of 0.115 on OmniDocBench 1.5 (lower is better).

This score isn't just a number; it represents a new State-of-the-Art (SOTA), outperforming formidable competitors like GPT-5.1 (0.147) and Claude Sonnet 4.5 (0.145).
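For readers unfamiliar with the metric, here is a minimal sketch of a normalized edit distance between a model's transcription and the ground-truth text. It assumes character-level Levenshtein distance divided by the length of the longer string; the exact normalization and matching procedure used by OmniDocBench is not described in this article, and the function names are illustrative only.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # keep the rolling row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def normalized_edit_distance(prediction: str, reference: str) -> float:
    """Assumed normalization: divide by the longer string's length,
    so 0.0 is a perfect match and 1.0 is a complete mismatch."""
    longest = max(len(prediction), len(reference))
    if longest == 0:
        return 0.0
    return levenshtein(prediction, reference) / longest
```

Under this assumed normalization, a score of 0.115 would correspond to roughly 11 to 12 character edits per 100 characters of text.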

Gemini 3 Pro sets a new record on OmniDocBench 1.5, validating its superior document processing capabilities against the industry's toughest test.

The Engine Behind the Benchmark: Abaka AI's Data Pipeline

Why has OmniDocBench become the de facto industry standard so quickly? Because it was built to be unbreakable by simple models.

Building a benchmark of this caliber isn't just about collecting PDFs; it’s about extreme data engineering. As one of the core data partners of the 2077AI Open Source Foundation, our advanced data construction pipeline was the engine that powered OmniDocBench’s creation.

To challenge models like Gemini 3 and GPT-5, we couldn't rely on clean, digital-born academic papers. We had to engineer complexity.

1. Engineering Diversity

We constructed a dataset spanning 9 distinct document types, including notoriously difficult formats like:

Handwritten Notes: Testing the limits of vision-text alignment.
Multi-Column Newspapers: Challenging layout analysis and reading order logic.
Financial Reports: Requiring precise extraction from dense, borderless tables.

2. Engineering Granularity

Standard OCR datasets give a "pass/fail." Abaka AI’s pipeline annotated 19 layout categories and 15 attribute labels for every single page. This granular labeling allows researchers at Google and DeepSeek to diagnose exactly why a model fails—whether it's a rotated table header or a complex mathematical formula.
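To illustrate how attribute labels support this kind of diagnosis, here is a small sketch that averages per-page scores by attribute. The field names (`attributes`, `edit_distance`) and the record layout are hypothetical and chosen for illustration; they are not the actual OmniDocBench annotation schema.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_attribute(pages):
    """Group per-page edit distances by attribute label and average them.

    `pages` is a list of dicts with two hypothetical keys:
      "attributes"    -> list of attribute labels attached to the page
      "edit_distance" -> that page's score (lower is better)
    """
    buckets = defaultdict(list)
    for page in pages:
        for attribute in page["attributes"]:
            buckets[attribute].append(page["edit_distance"])
    return {attribute: mean(scores) for attribute, scores in buckets.items()}


# Made-up example records, not real benchmark data.
sample_pages = [
    {"attributes": ["rotated_table", "newspaper"], "edit_distance": 0.5},
    {"attributes": ["newspaper"], "edit_distance": 0.25},
    {"attributes": ["formula"], "edit_distance": 0.125},
]
breakdown = mean_score_by_attribute(sample_pages)
```

An attribute whose average is much worse than the overall score points to the kind of page where the model struggles.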

3. Engineering Precision

Our "Human-in-the-Loop" pipeline ensures ground-truth accuracy that meets the rigorous standards of top research labs. When Google reports an Edit Distance gap as small as 0.002, the margin separating GPT-5.1 from Claude Sonnet 4.5, they need to know that the benchmark itself is pixel-perfect. Abaka AI delivered that precision.

Great AI Starts with Great Data

The adoption of OmniDocBench by Google DeepMind validates a core truth of the Generative AI era: Model architecture is converging; data is the differentiator.

Whether you are training the next GPT-5 or fine-tuning a specialized vertical model, the ceiling of your performance is defined by the quality and complexity of your data.

At Abaka AI, we don't just build datasets; we build rulers that measure intelligence. If our data pipelines can challenge Gemini 3, imagine what they can do for your models.

🚀 Explore the Industry Standard

See the benchmark that Google and DeepSeek are using to test their frontiers.

🌐 Visit OmniDocBench Homepage: Click Here to Explore the Data
📄 Read the Technical Deep Dive: The Science Behind the Benchmark
🤝 Partner with Abaka AI: Contact Us for Custom Data Solutions
