Abaka AI - AI Data Annotation & Solution - Your Data Partner In The AI Industry

Off-the-Shelf Datasets

High-quality, ready-to-use datasets
for your AI models

Popular Datasets List

CODING

Repo Patch Dataset

Competitive Coding Dataset

Code Understanding Dataset

REASONING

Multimodal Chart-Analysis

AB-HLE Math QA

IMO Level Lean4 Formalization

IMAGE

Interleaved/Image-Editing

Animation Dataset

Stock Images With Dense Captions

VIDEO

Stock Videos With Dense Captions

360-degree Camera and Body Motion Videos

Talking Head Videos

AGENTIC

Browser Recording Dataset

Long Chain Agentic Dataset

Real World Tasks Dataset

AUDIO

Conversational Dataset

Multilingual TTS

Multilingual ASR

3D

Indoor Scenes and Floor Plans

3D Objects

Multi-Format 3D

Dataset Types

Coding

Datasets with repository patches, programming problems, and code comprehension tasks.

Reasoning

Datasets with math, logic, and multimodal problems, built to challenge multi-step problem solving and proof construction.

Image

Datasets for editing, medical imaging, and visual understanding, enabling context-rich perception and realistic image generation.

Video

Datasets with stock footage, motion, and talking heads, improving spatiotemporal reasoning and video generation.

Audio

Datasets for speech recognition, emotion detection, and speech synthesis.

Agentic

Datasets of browser use, digital tasks, and demonstrations, training models to plan, act, and adapt in real environments.

3D

Datasets of indoor scenes and object libraries, supporting robotics, AR/VR, and 3D content creation.

Off-the-Shelf Datasets
Your Shortcut to Smarter AI

Contact Us