Programming Category

AI image detection and recognition

Found 7 AI tools

Related AI Tools

EAGLE

EAGLE is a family of vision-centric, high-resolution multimodal large language models (LLMs) that strengthens the perceptual capabilities of multimodal LLMs by mixing vision encoders and input resolutions. It uses a channel-concatenation-based 'CLIP+X' fusion that accommodates vision experts with different architectures (ViT/ConvNets) and different knowledge (detection/segmentation/OCR/SSL). The EAGLE family supports input resolutions above 1K and performs strongly on multimodal LLM benchmarks, especially on resolution-sensitive tasks such as optical character recognition and document understanding.
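The channel-wise fusion mentioned above can be illustrated with a minimal sketch. It assumes each encoder has already produced the same number of visual tokens, and it uses plain Python lists rather than real model outputs. The function name `channel_concat` and the sample values are hypothetical and are not taken from the EAGLE code base.

```python
from typing import List

# Hypothetical feature layout: features[token_index][channel_index]
Features = List[List[float]]


def channel_concat(*encoders: Features) -> Features:
    """Fuse per-token features from several encoders along the channel axis."""
    if not encoders:
        raise ValueError("need at least one encoder output")
    num_tokens = len(encoders[0])
    for feats in encoders:
        # Assumption: token grids were aligned (e.g. resized) beforehand.
        if len(feats) != num_tokens:
            raise ValueError("all encoders must yield the same number of tokens")
    fused: Features = []
    for i in range(num_tokens):
        token: List[float] = []
        for feats in encoders:
            token.extend(feats[i])  # append this encoder's channels
        fused.append(token)
    return fused


# Illustrative values only: 2 tokens from a CLIP-like encoder (2 channels)
# and 2 tokens from a second vision expert (3 channels).
clip_feats = [[0.1, 0.2], [0.3, 0.4]]
expert_feats = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
fused = channel_concat(clip_feats, expert_feats)
print(len(fused), len(fused[0]))
```

The token count stays the same while the channel width grows to the sum of the encoders' widths, which is why the encoders may differ in architecture as long as their token grids line up.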

Large language models, Multimodal learning, Document understanding +2

Programming

labelU-Kit

labelU-Kit is an open-source front-end annotation component library for images, video, and audio. It supports multiple annotation methods, including 2D boxes, points, lines, polygons, and 3D boxes. It is distributed as an NPM package, so developers can integrate it into their own annotation platforms to make data annotation more efficient and flexible.

Artificial intelligence, Machine learning, Data annotation +1

Programming

OnnxOCR

OnnxOCR is a lightweight OCR project rebuilt from PaddleOCR. It is decoupled from the PaddlePaddle deep learning training framework, so it can be deployed directly without that framework. It supports inference in over 80 languages, and after conversion to an ONNX model, inference is stated to be 5 times faster than with the PaddlePaddle framework. It suits scenarios where computing power is limited but accuracy must be maintained, and it can be deployed on both ARM and x86 machines.

Multilingual support, OCR, ONNX +2

Programming

JavaVision

JavaVision is a general-purpose visual recognition project written in Java. It implements core functions such as PaddleOCR-V4 text recognition, YOLOv8 object recognition, face recognition, and image search, and it can be extended to other areas such as speech recognition, animal recognition, and security inspection. The project is built on the Spring Boot framework and emphasizes versatility, performance, reliability, easy integration, and extensibility. JavaVision aims to give Java developers a comprehensive visual recognition solution so that they can build reliable, easy-to-integrate AI applications in a language they already know.

Artificial intelligence, Open source, Computer vision +2

Programming

PetThoughts

PetThoughts is an image recognition application built on the Gemini API. Users upload a photo of their pet, and the app analyzes the pet's facial expression and surroundings to guess what it may be thinking. Its functions include image recognition, facial analysis, and environmental analysis: it identifies the pet's facial expression, estimates its likely emotional state, and infers its activity from the environment. Natural language processing then turns the recognition results into a readable text description. The interface is simple, so users can upload photos and read the analysis easily, which helps them better understand their pets' emotions and preferences.

Natural language processing, Image recognition, Pets +2

Programming

Surya

Surya is a multilingual document OCR toolkit with accurate line-level text detection. It works across a range of documents and languages. Surya is named after the Hindu sun god, who is associated with universal vision. It is implemented with Python 3.9+ and PyTorch and is characterized by efficient OCR processing and multi-language support.

Multilingual support, Text processing, OCR

Programming

MakeML

MakeML is a development tool for building image object detection neural networks without writing any code. It provides an easy-to-use graphical interface: users upload training images, draw bounding boxes, and set parameters to train an object detection model, then export it in CoreML format for use in iOS apps. MakeML lowers the high barrier to neural network development, since it requires no machine learning or programming knowledge.

Deep learning, No-code development, Programming +2

Programming

Related Subcategories

Other subcategories under Programming:

Development and Tools: 768 tools

AI model: 465 tools

Code assistant: 368 tools

AI development assistant: 294 tools

Model training and deployment: 140 tools

AI code assistant: 85 tools

Development platform: 66 tools

Research tools: 61 tools
