AI model inference and training

Found 4 AI tools

Primary Category: image

Subcategory: AI model inference and training

Related AI Tools

SMPLer-X

SMPLer-X is a large-scale model for expressive human pose and shape estimation (EHPS) that captures body, hand, and face motion in a single unified framework. It improves EHPS performance by systematically studying datasets from 32 different scenarios and using the findings to optimize the training scheme and dataset selection. SMPLer-X scales up the model with Vision Transformer backbones and is then fine-tuned into specialist models for further gains. Reported benchmark results include AGORA (107.2 mm NMVE), UBody (57.4 mm PVE), EgoBody (63.6 mm PVE), and EHF (62.3 mm PVE without fine-tuning). Its main strengths are the ability to learn from diverse data sources, strong generalization, and transferability to new settings.
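The PVE figures above are per-vertex errors in millimetres. As a rough illustration only, PVE is commonly computed as the mean Euclidean distance between corresponding predicted and ground-truth mesh vertices. The sketch below assumes both meshes are already aligned and expressed in millimetres; benchmark protocols usually apply an alignment step first, which is omitted here. The function name `per_vertex_error` and the toy vertices are hypothetical and are not taken from the SMPLer-X code.

```python
import math

def per_vertex_error(pred_vertices, gt_vertices):
    """Mean Euclidean distance between corresponding 3D vertices.

    Assumption: both lists are in the same order, already aligned,
    and in the same unit (e.g. millimetres).
    """
    if not pred_vertices or len(pred_vertices) != len(gt_vertices):
        raise ValueError("vertex lists must be non-empty and of equal length")
    total = sum(math.dist(p, g) for p, g in zip(pred_vertices, gt_vertices))
    return total / len(pred_vertices)

# Toy example with two vertices (made-up values, not benchmark data).
pred = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
pve = per_vertex_error(pred, gt)  # distances are 0 and 5, so the mean is 2.5
```

A lower value means the predicted mesh lies closer to the ground truth.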

Large models, big data, human pose estimation +2

DreamLLM

DreamLLM is a learning framework that, for the first time, enables synergy between multimodal understanding and multimodal creation in multimodal large language models (MLLMs). It models the posteriors of both language and images generatively by sampling directly in the raw multimodal space. This avoids the limitations and information loss of external feature extractors such as CLIP and yields a more thorough multimodal understanding. DreamLLM is also trained on raw interleaved documents, modeling text and image content together with their unstructured layouts, which lets it learn all conditional, marginal, and joint multimodal distributions effectively. As a result, DreamLLM is the first MLLM capable of generating free-form interleaved content. Comprehensive experiments show strong performance as a zero-shot multimodal generalist, which the authors attribute to this learning synergy.

Image generation, language models, multimodal

DINOv2

DINOv2 is a self-supervised learning method that produces high-performance visual features for computer vision tasks without labeled data. The features can be used without fine-tuning and remain robust across domains.

Computer vision, visual features, unsupervised learning +1

CelebV-Text

CelebV-Text is a large-scale, high-quality, and diverse facial text-video dataset designed to support research on text-to-video generation of faces. It contains 70,000 in-the-wild face video clips, each paired with 20 text descriptions, covering 40 general appearances, 5 detailed appearances, 6 lighting conditions, 37 actions, 8 emotions, and 6 light directions. Comprehensive statistical analysis of the videos, the texts, and text-video relevance is used to demonstrate the dataset's quality, and a benchmark is provided to standardize evaluation of facial text-to-video generation.

Video, video creation, datasets +5
