The SA-V Dataset is an open-world video dataset designed for training general-purpose object segmentation models. It contains 51K diverse videos and 643K spatio-temporal segmentation masks (masklets). The dataset is intended for computer vision research and is released under the CC BY 4.0 license. The videos cover varied subjects such as places, objects, and scenes, and the masks range from large-scale objects such as buildings to fine details such as interior decorations.
emo-visual-data is a public dataset of visually annotated emoticons. It contains 5,329 emoticons, annotated using the glm-4v and step-free-api projects. The dataset can be used to train and test large multimodal models and is useful for studying the relationship between image content and text descriptions.
ImageInWords (IIW) is a human-in-the-loop iterative annotation framework for curating hyper-detailed image descriptions, together with a new dataset produced by that framework. The dataset achieves state-of-the-art results on both automated and human side-by-side (SxS) evaluations. IIW descriptions significantly improve on previous datasets and on GPT-4V output across multiple dimensions, including readability, comprehensiveness, specificity, hallucination, and human-likeness. In addition, models fine-tuned on IIW data perform well on text-to-image generation and vision-language reasoning, and generate descriptions that are closer to the original images.
CelebV-Text is a large-scale, high-quality, and diverse facial text-video dataset designed to promote research on facial text-to-video generation. The dataset contains 70,000 in-the-wild face video clips, each paired with 20 texts, covering 40 general appearances, 5 detailed appearances, 6 lighting conditions, 37 actions, 8 emotions, and 6 light directions. Comprehensive statistical analysis of the videos, the texts, and text-video relevance is used to demonstrate the dataset's quality, and a benchmark is provided to standardize the evaluation of facial text-to-video generation tasks.