apple/DFN5B-CLIP-ViT-H-14-378

Tags: open_clip, apple-amlr, arxiv:2309.17425, license:apple-amlr, region:us

A CLIP (Contrastive Language-Image Pre-training) model trained on DFN-5B. Data Filtering Networks (DFNs) are small networks used to automatically filter large pools of uncurated data. This model was trained on 5B images that were filtered from a pool of 43B uncurated image-text pairs (12.8B image-text pairs from CommonPool-12.8B + 30B additional public image-text pairs).
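As an illustration of how a CLIP model like this one is typically used, the sketch below scores an image against a few candidate captions (zero-shot classification). It is a minimal sketch under stated assumptions, not the authors' reference code: the hub identifier `hf-hub:apple/DFN5B-CLIP-ViT-H-14-378` is inferred from the model name on this page, the `ViT-H-14` tokenizer name is assumed from the architecture in that name, and the helper names `zero_shot_probs` and `classify_image` are hypothetical. The scoring step is written in plain Python so the arithmetic (cosine similarity, scaling, softmax) is visible.

```python
import math

# Assumption: hub id inferred from the model name on this page.
MODEL_ID = "hf-hub:apple/DFN5B-CLIP-ViT-H-14-378"


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_probs(image_emb, text_embs, logit_scale=100.0):
    """Hypothetical helper: CLIP-style scoring of one image against N texts.

    Each embedding is L2-normalized, the cosine similarities are multiplied
    by logit_scale, and a softmax turns them into probabilities.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    img = normalize(image_emb)
    logits = [
        logit_scale * sum(a * b for a, b in zip(img, normalize(t)))
        for t in text_embs
    ]
    return softmax(logits)


def classify_image(image_path, labels):
    """Hypothetical helper: embed an image and labels, then score them.

    Needs the open_clip_torch, torch, and pillow packages, and downloads
    the model weights on first use (about 15 GB per the details below).
    """
    import torch
    import open_clip
    from PIL import Image

    model, preprocess = open_clip.create_model_from_pretrained(MODEL_ID)
    tokenizer = open_clip.get_tokenizer("ViT-H-14")  # assumed tokenizer name
    model.eval()

    image = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    text = tokenizer(labels)
    with torch.no_grad():
        image_emb = model.encode_image(image)[0].tolist()
        text_embs = model.encode_text(text).tolist()
        scale = model.logit_scale.exp().item()

    return dict(zip(labels, zero_shot_probs(image_emb, text_embs, scale)))
```

A call such as `classify_image("cat.jpg", ["a photo of a cat", "a photo of a dog"])` (the file name is a placeholder) would return a label-to-probability mapping. In practice the scoring is usually done directly on tensors; the list-based version here only makes each step explicit.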

# pull & run locally
pip install mlforge-sdk && mlforge pull apple/DFN5B-CLIP-ViT-H-14-378

Model details

Task: Other
Provider: apple
Framework: open_clip
Size: 15 GB
License: apple-amlr
Downloads: 863.0K
Likes: 109
Paper: arXiv:2309.17425
Updated: 2025-02-28
