Found 4 AI tools
InternVL2_5-1B-MPO is a multimodal large language model (MLLM) built on InternVL 2.5 and trained with Mixed Preference Optimization (MPO) for stronger overall performance. The model integrates the incrementally pretrained InternViT with pretrained large language models (LLMs) such as InternLM 2.5 and Qwen 2.5 through randomly initialized MLP projectors. InternVL2.5-MPO keeps the "ViT-MLP-LLM" architecture of InternVL 2.5 and its predecessors and adds support for multi-image and video data. It handles a range of vision-language tasks, including image description and visual question answering.
InternVL 2.5 is an advanced multimodal large language model series that builds on InternVL 2.0, keeping its core model architecture while significantly improving training and testing strategies and data quality. The models integrate the incrementally pretrained InternViT with pretrained large language models such as InternLM 2.5 and Qwen 2.5 through randomly initialized MLP projectors. InternVL 2.5 supports multi-image and video data, and its dynamic high-resolution training improves performance on multimodal data.
InternVL2_5-26B is an advanced multimodal large language model (MLLM) that builds on InternVL 2.0 with significantly improved training and testing strategies and data quality. It keeps its predecessor's "ViT-MLP-LLM" architecture and integrates the incrementally pretrained InternViT with pretrained large language models (LLMs) such as InternLM 2.5 and Qwen 2.5 through randomly initialized MLP projectors. The InternVL 2.5 series performs well on multimodal tasks, particularly those that depend on visual perception.
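The "ViT-MLP-LLM" layout named in these descriptions can be pictured with a toy sketch. This is not InternVL's actual code: the class and function names, the tiny dimensions, the GELU activation, the two-layer projector, and the choice to place visual tokens before text tokens are all assumptions made for illustration. It only shows the data flow: vision-encoder patch embeddings pass through a randomly initialized MLP projector so that they match the LLM's embedding width, then join the text embeddings in one input sequence.

```python
import math
import random


def init_linear(in_dim, out_dim, rng):
    """Randomly initialized weights (out_dim rows of in_dim values) and zero biases."""
    weights = [[rng.gauss(0.0, 0.02) for _ in range(in_dim)] for _ in range(out_dim)]
    biases = [0.0] * out_dim
    return weights, biases


def linear(x, weights, biases):
    """Compute y = W x + b for a single input vector."""
    return [sum(xi * wi for xi, wi in zip(x, row)) + b for row, b in zip(weights, biases)]


def gelu(v):
    """Exact GELU activation (the activation choice is an assumption)."""
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


class MLPProjector:
    """Hypothetical two-layer MLP mapping ViT-width vectors to LLM-width vectors."""

    def __init__(self, vit_dim, llm_dim, seed=0):
        rng = random.Random(seed)
        self.fc1 = init_linear(vit_dim, llm_dim, rng)
        self.fc2 = init_linear(llm_dim, llm_dim, rng)

    def __call__(self, patch_embeddings):
        projected = []
        for vec in patch_embeddings:
            hidden = [gelu(v) for v in linear(vec, *self.fc1)]
            projected.append(linear(hidden, *self.fc2))
        return projected


def build_llm_input(visual_tokens, text_embeddings):
    """Concatenate projected visual tokens with text embeddings (ordering is an assumption)."""
    return visual_tokens + text_embeddings


# Toy sizes, chosen only to keep the example small.
VIT_DIM, LLM_DIM = 8, 16
rng = random.Random(1)
patches = [[rng.random() for _ in range(VIT_DIM)] for _ in range(4)]  # stand-in for ViT output
text = [[rng.random() for _ in range(LLM_DIM)] for _ in range(3)]     # stand-in for token embeddings

projector = MLPProjector(VIT_DIM, LLM_DIM)
sequence = build_llm_input(projector(patches), text)
print(len(sequence), len(sequence[0]))  # 7 16
```

Because the projector is the only part that starts from random weights in this picture, it is the piece that has to learn how to translate between the two pretrained components.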
InternVL 2.5 is a series of advanced multimodal large language models (MLLMs) that builds on InternVL 2.0 with significantly improved training and testing strategies and data quality. The series is optimized for visual perception and multimodal capabilities, takes images and text as input and produces text, and suits complex tasks that require processing both visual and language information.
Multimodal model is a popular subcategory under image, with 4 AI tools listed.