Name: jasperai/monet
Creator: jasperai
License: apache-2.0
Keywords: huggingface, task_categories:text-to-image, task_categories:image-feature-extraction, task_categories:zero-shot-image-classification, language:en, license:apache-2.0, size_categories:100M<n<1B, modality:image, arxiv:2605.21272, text-to-image, image-feature-extraction, zero-shot-image-classification

About jasperai/monet

MONET (Massive, Open, Non-redundant and Enriched Text-to-image dataset) is a large-scale, curated image-text dataset for training text-to-image (T2I) systems. It contains 104.9 million high-quality image-text pairs distilled from 2.9 billion raw pairs drawn from nine heterogeneous open sources (six real, three synthetic). Curation proceeds in successive stages: safety filtering, domain-based filtering, exact and near-duplicate removal, and re-captioning with multiple vision-language models. The dataset is further augmented with synthetically generated samples. Each image is released with pre-computed embeddings, structured annotations, and pre-encoded VAE latents to accelerate downstream use.
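
To put the two headline figures in proportion, the short sketch below computes the size of the released set relative to the raw pool. It uses only the counts quoted in the description. The function name `retention_rate` is hypothetical and chosen for illustration. Because the final set also includes synthetically generated samples, the ratio is an approximate indication of how aggressive the curation was, not an exact per-stage retention figure.

```python
# Figures quoted in the dataset description.
RAW_PAIRS = 2.9e9      # raw image-text pairs before curation
FINAL_PAIRS = 104.9e6  # image-text pairs in the released dataset


def retention_rate(final_count: float, raw_count: float) -> float:
    """Return the final set size as a fraction of the raw pool size."""
    if raw_count <= 0:
        raise ValueError("raw_count must be positive")
    return final_count / raw_count


rate = retention_rate(FINAL_PAIRS, RAW_PAIRS)
print(f"final set relative to raw pool: {rate:.2%}")
print(f"about 1 released pair per {1 / rate:.1f} raw pairs")
```

Under these numbers the released set is roughly 3.6% of the raw pool, or about one released pair for every 28 raw pairs.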
