ComfyUI-Fluxtapoz is a collection of nodes designed for Flux to edit images in ComfyUI. It allows users to edit and style images through a series of node operations, and is especially suitable for professionals doing image processing and creative work. The project is open source under the GPL-3.0 license, meaning users may freely use, modify, and distribute the software as long as they comply with the license terms.
FaceFusion Labs is a leading platform focused on facial manipulation, leveraging advanced technology to enable the fusion and editing of facial features. Its main advantages include high-precision facial recognition and fusion capabilities, as well as a developer-friendly API. The project's initial commit was made on October 15, 2024, by developer Henry Ruhs. It is positioned as an open source project that encourages community contributions and collaboration.
DisEnvisioner is an advanced image generation technology that isolates and enhances subject features to generate customized images without tedious adjustments or reliance on multiple reference images. It effectively distinguishes and enhances subject features while filtering out irrelevant attributes, achieving superior personalization quality in both editability and identity preservation. DisEnvisioner addresses a current need in image generation, extracting subject features from visual cues, and solves the challenges existing methods face in this area through an innovative approach.
FacePoke is an AI-powered real-time head and face transformation tool that allows users to manipulate facial features through an intuitive drag-and-drop interface, breathing life into portraits for realistic animations and expressions. FacePoke utilizes advanced AI technology to ensure that all edits maintain a natural and realistic appearance, while automatically adjusting surrounding facial areas to maintain the overall integrity of the image. This tool stands out for its user-friendly interface, real-time editing capabilities, and advanced AI-driven adjustments, making it suitable for users of all skill levels, whether they are professional content creators or beginners.
Pic Pic AI Editor is a powerful AI picture editing tool that provides functions such as photo enhancement, background removal, and object removal, allowing users to easily edit photos at a professional level. Built on a user-friendly interface and efficient AI technology, it aims to simplify the image editing process and improve editing efficiency while ensuring output image quality. Pic Pic AI Editor suits users of all levels: social media users, e-commerce sellers, and professional photographers alike can improve their image processing workflow through the platform.
photo4you is an online ID photo production website based on artificial intelligence technology. Users can easily create ID photos without downloading or installing any software. The website supports a variety of standard sizes for official documents such as passports, visas, and driver's licenses. Its intelligent background removal automatically strips the photo background, ensuring ID photos have a clean, professional look. Users can download the finished ID photos immediately, saving time and effort. photo4you provides high-resolution output suitable for printing or digital submission.
PMRF (Posterior-Mean Rectified Flow) is a newly proposed image restoration algorithm designed to address the distortion-perception trade-off in image restoration tasks. It proposes a novel image restoration framework combining the posterior mean with a rectified flow, which reduces image distortion while preserving the perceptual quality of the image.
DepthFlow is a highly customizable parallax shader for animating your images. It is a free and open source ImmersityAI alternative capable of converting images into videos with a 2.5D parallax effect. The tool renders quickly and supports a variety of post-processing effects, such as vignette, depth of field, and lens distortion. It offers extensive parameter adjustment for creating flexible motion effects and ships with a variety of preset animations. It also supports video encoding and export, including H264, HEVC, and AV1 formats, and produces watermark-free output.
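At its core, a 2.5D parallax effect shifts each pixel in proportion to its depth as a virtual camera moves. A minimal pure-Python sketch of that per-pixel offset (the `strength` and `focus` parameters here are illustrative, not DepthFlow's actual API):

```python
def parallax_offset(depth, camera_x, strength=0.1, focus=0.5):
    """Horizontal pixel shift for one pixel given its normalized depth.

    Pixels closer than the `focus` plane move with the camera, pixels
    farther away move against it, producing the 2.5D parallax illusion.
    """
    return camera_x * strength * (depth - focus)

# A near pixel (depth 0.0) and a far pixel (depth 1.0) shift in
# opposite directions as the virtual camera pans right.
near = parallax_offset(depth=0.0, camera_x=1.0)
far = parallax_offset(depth=1.0, camera_x=1.0)
print(near, far)  # -0.05 0.05
```

A real shader evaluates this per fragment on the GPU and resamples the image at the shifted coordinates, but the depth-proportional displacement is the same idea.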
Minionverse is an AI-based creative workflow that generates images by using different nodes and models. This workflow is inspired by an online glif application and provides a video tutorial to guide users on how to use it. It contains a variety of custom nodes that can perform text replacement, conditional loading, image saving and other operations. It is very suitable for users who need to generate and edit images.
PuLID-Flux ComfyUI implementation is an image processing model based on ComfyUI, which uses PuLID technology and Flux model to achieve advanced customization and processing of images. This project was inspired by cubiq/PuLID_ComfyUI and is a prototype that uses some handy model tricks to handle the encoder part. The developers wish to test the quality of the model before re-implementing it more formally. For better results, it is recommended to use the 16-bit or 8-bit version of the GGUF model.
Posterior-Mean Rectified Flow (PMRF) is a novel image restoration algorithm that minimizes the mean square error (MSE) by optimizing the posterior mean and rectified flow model while ensuring image fidelity. The PMRF algorithm is simple and efficient, and its theoretical basis is to optimize the posterior mean prediction (minimum mean square error estimate) to match the real image distribution. This algorithm performs well in image restoration tasks, can handle various degradation problems such as noise and blur, and has good perceptual quality.
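The first half of PMRF's recipe, predicting the posterior mean, is the classical MMSE estimate; the rectified flow then transports that (typically blurry) estimate toward the clean-image distribution. A toy scalar Gaussian example, purely illustrative and not the paper's model, shows why the posterior mean minimizes MSE compared with the raw noisy observation:

```python
import random

random.seed(0)

# Toy setup: clean signal x ~ N(0, s2), observation y = x + n, n ~ N(0, n2).
# For this Gaussian case the posterior mean has the closed form
# E[x | y] = s2 / (s2 + n2) * y, i.e. shrinkage toward zero.
s2, n2 = 4.0, 1.0
shrink = s2 / (s2 + n2)

mse_raw, mse_pm = 0.0, 0.0
trials = 20000
for _ in range(trials):
    x = random.gauss(0.0, s2 ** 0.5)
    y = x + random.gauss(0.0, n2 ** 0.5)
    mse_raw += (y - x) ** 2          # error of the raw observation
    mse_pm += (shrink * y - x) ** 2  # error of the posterior mean

mse_raw /= trials
mse_pm /= trials
print(mse_raw, mse_pm)  # posterior mean attains lower MSE (~0.8 vs ~1.0)
```

The theoretical values here are n2 = 1.0 for the raw observation and s2*n2/(s2+n2) = 0.8 for the MMSE estimate; PMRF's contribution is what happens after this step, rectifying the MMSE output so it also looks like a natural image.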
FaceFusion is an industry-leading facial manipulation platform specializing in face swapping, lip sync, and deep manipulation technologies. It utilizes advanced artificial intelligence technology to provide users with a highly realistic facial operation experience. FaceFusion has a wide range of applications in image processing and video production, especially in the entertainment and media industries.
Light and Shadow Magic Hand is a feature-rich image processing software that provides a variety of photo editing tools and AI technology to help users easily edit and beautify photos. The software has a friendly interface, simple operation, supports a variety of image formats, and is suitable for users of all levels.
StableDelight is an advanced model focused on removing specular reflections from textured surfaces. It builds on the success of StableNormal, which focuses on improving the stability of monocular normal estimation, and applies the same concept to the challenging task of reflection removal. The training data includes Hypersim, Lumos, and various specular highlight removal datasets from TSHRNet. In addition, a multi-scale SSIM loss and stochastic conditional scaling are integrated during diffusion training to improve the clarity of one-step diffusion predictions.
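The SSIM index underlying such a loss compares two images through their means, variances, and covariance. A single-window pure-Python sketch (the multi-scale variant used in training averages this over several downsampled resolutions; constants follow the common C1 = (0.01)², C2 = (0.03)² choice for unit-range intensities):

```python
def ssim(x, y, c1=1e-4, c2=9e-4):
    """Global SSIM between two equal-length intensity lists in [0, 1]."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Luminance/contrast/structure terms combined into one ratio.
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

img = [0.2, 0.4, 0.6, 0.8]
assert abs(ssim(img, img) - 1.0) < 1e-9        # identical images score 1
assert ssim(img, [0.8, 0.2, 0.6, 0.4]) < 1.0   # shuffled image scores lower
```

A training loss would use `1 - ssim(...)` computed over local windows; this sketch only shows the index itself.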
Colorful Diffuse Intrinsic Image Decomposition is an image processing technique that decomposes photos taken in the wild into albedo, diffuse shading, and a non-diffuse residual component. By progressively relaxing the monochromatic-lighting and Lambertian-world assumptions, it can estimate colorful diffuse shading in images, accounting for multiple light sources and secondary reflections in the scene while modeling specular highlights and visible light sources. The technique is important for image editing applications such as specular removal and pixel-level white balancing.
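The decomposition implies a simple per-pixel composition model: image = albedo × diffuse shading + non-diffuse residual. A toy sketch (with made-up pixel values, not the paper's method) showing how specular removal falls out of the model by recomposing without the residual:

```python
# Toy per-pixel intrinsic model: pixel = albedo * diffuse_shading + residual,
# where the residual holds the non-diffuse (e.g. specular) contribution.
albedo   = [(0.8, 0.2, 0.2), (0.1, 0.5, 0.9)]
shading  = [(0.6, 0.5, 0.4), (1.0, 0.9, 0.8)]   # colorful, not monochrome
residual = [(0.2, 0.2, 0.2), (0.0, 0.0, 0.0)]   # highlight on pixel 0 only

def compose(a, s, r):
    return tuple(ai * si + ri for ai, si, ri in zip(a, s, r))

image = [compose(a, s, r) for a, s, r in zip(albedo, shading, residual)]

# "Specular removal" in this model is recomposing with the residual zeroed.
matte = [compose(a, s, (0.0, 0.0, 0.0)) for a, s in zip(albedo, shading)]
assert matte[1] == image[1]                              # no highlight: unchanged
assert all(m <= p for m, p in zip(matte[0], image[0]))   # highlight removed
```

The hard part the paper solves is estimating the three components from a single photo; once estimated, edits like this recomposition are trivial.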
This is a method for creating relightable radiance fields by leveraging priors extracted from 2D image diffusion models. It converts multi-view data captured under a single illumination condition into a dataset with multiple illumination effects, and represents the relightable radiance field with 3D Gaussian splats. Because it does not rely on precise geometry or surface normals, it is better suited to cluttered scenes with complex geometry and reflective BRDFs.
opencv_contrib is an additional module library for OpenCV used to develop and test new image processing functions. These modules are usually integrated into the OpenCV core library after the API is stable, fully tested, and widely accepted. This library allows developers to use the latest image processing technology, driving innovation in the field of computer vision.
OpenCV is a cross-platform open source computer vision and machine learning software library that provides a range of programming functions, including but not limited to image processing, video analysis, feature detection, machine learning, etc. This library is widely used in academic research and commercial projects, and is favored by developers because of its powerful functionality and flexibility.
Diffusers Image Outpaint is an image outpainting technique based on diffusion models, which can generate additional content extending an image beyond its existing borders. The technology has broad application prospects in image editing, game development, virtual reality, and other fields. It uses advanced machine learning algorithms to make the generated extensions natural and realistic, providing users with an innovative image processing method.
FLUX.1-dev-Controlnet-Inpainting-Alpha is an AI image inpainting model released by the AlimamaCreative Team, designed to repair and fill in missing or damaged parts of images. The model performs best at 768x768 resolution and achieves high-quality restoration. As an alpha release it demonstrates advanced inpainting technology, and it is expected to deliver even better performance with further training and optimization.
FLUX-Controlnet-Inpainting is an image inpainting tool based on the FLUX.1-dev model, released by Alimama's creative team. It uses deep learning to repair images and fill in missing parts, and is suited to image editing and enhancement. It performs best at 768x768 resolution and provides high-quality inpainting results. The tool is currently in alpha testing, with an updated version to be released in the future.
finegrain-object-cutter is an image editing tool based on the Hugging Face Spaces platform. It uses advanced machine learning technology to achieve fine-grained cutting of objects in images. The main advantages of this tool are its high accuracy and ease of use, which allow users to achieve complex image editing tasks with simple operations. It is especially suitable for designers and developers who need to perform fine processing of images, and can be widely used in image editing, augmented reality, virtual reality and other fields.
FlexClip AI Image to Image Generator is an online image conversion tool that uses advanced AI technology to convert user-uploaded images into different artistic styles. This product ensures high-quality image style conversion through continuously updated AI models, and is suitable for professional and personal use. It also provides rich AI features such as AI text to image, AI text to video, and AI background remover to speed up the photo and video creation process.
RapidLayoutRecover is a layout restoration tool specifically for document images. It can integrate the results of layout analysis, text recognition, table recognition and formula recognition to restore the original layout information of the document. This tool is of great value in the fields of document digitization, archives management, and academic research, and can significantly improve the efficiency and accuracy of document processing.
Visual Try-On Chrome Extension is a Chrome browser extension that uses AI image processing to let users virtually try on clothes on any e-commerce website. The extension captures the main product image with OpenAI GPT-4, uploads user images to Cloudinary, runs AI processing with the Kolors model on Hugging Face, and caches results in the browser to improve usability. It protects user privacy: no personal data or pictures are sent to a server, except when Hugging Face performs the AI processing.
ComfyUI-AdvancedLivePortrait is an advanced tool for real-time preview and editing of facial expressions. It allows users to track and edit faces in videos, insert expressions into videos, and even extract expressions from sample photos. This project simplifies the installation process by automating the installation using ComfyUI-Manager. It combines image processing and machine learning technologies to provide users with a powerful tool for creating dynamic and interactive media content.
Flux Latent Detailer is an experimental tool capable of producing finer details in images through Flux's latent space interpolation. It works in multiple passes, attempting to enhance image details without ruining the overall composition or creating an over-processed look. The developer emphasizes that this is an experimental project shared as-is, with no support provided.
Dark Gray Photography is an image generation model focused on generating images of dark gray tones and East Asian women. This model is based on LoRA technology and is trained through deep learning to generate images with consistent style and bright colors. It is particularly suitable for users who need to use dark gray tones in portrait, product, architectural and nature landscape photography.
bonding_w_geimini is an image processing application developed based on the Streamlit framework. It allows users to upload pictures, perform object detection through the Gemini API, and draw the bounding box of the object directly on the picture. This application uses machine learning models to identify and locate objects in pictures, which is of great significance to fields such as image analysis, data annotation, and automated image processing.
Magnifier Lens Effect is a JavaScript library that allows users to add a magnifying glass effect to any image and adjust the magnification by scrolling the mouse wheel. The library is easy to integrate and customize, and is suitable for web pages that need to display images in detail.
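The math behind a magnifier lens is a coordinate mapping: scale the image by the zoom factor, then offset it so the pixel under the cursor stays centered in the lens. A pure-Python sketch of that mapping (function name and parameters are illustrative, not this library's API):

```python
def magnifier_window(cursor_x, cursor_y, img_w, img_h, zoom, lens):
    """Top-left corner of the zoomed crop shown inside a square lens.

    The zoomed image is the original scaled by `zoom`; the offset keeps
    the pixel under the cursor centered in a lens of side `lens`.
    """
    bg_x = cursor_x * zoom - lens / 2
    bg_y = cursor_y * zoom - lens / 2
    # Clamp so the lens never shows past the image border.
    bg_x = max(0, min(bg_x, img_w * zoom - lens))
    bg_y = max(0, min(bg_y, img_h * zoom - lens))
    return bg_x, bg_y

print(magnifier_window(50, 50, 100, 100, zoom=2, lens=40))  # (80.0, 80.0)
```

In the browser this offset typically becomes a negative `background-position` on the lens element; the wheel handler just changes `zoom` and recomputes.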
DiPIR is a physics-based method jointly developed by the Toronto AI Lab and NVIDIA Research that enables virtual objects to be realistically inserted into indoor and outdoor scenes by recovering scene lighting from a single image. The technology not only optimizes materials and tone mapping, but also automatically adjusts to different environments to improve the realism of images.
MagicFixup is an open source image editing model from Adobe Research that simplifies the photo editing process by learning from dynamic video. The model uses deep learning to automatically identify and repair defects in images, improving editing efficiency and reducing the need for manual work. It is trained on top of the Stable Diffusion 1.4 model and has powerful image processing capabilities, making it suitable for professional image editors and enthusiasts alike.
AI Photo Editor is an advanced photo editing app powered by AI technology that provides a seamless and intuitive experience for both beginners and professionals. It is a one-stop design studio that can remove unwanted objects from photos, enhance image quality, apply amazing filters, and even convert photos into anime-style portraits, all with AI precision. Whether you're editing photos for fun or looking for professional-quality results, this app makes the process easy and free.
TurboEdit is a technology developed by Adobe Research to tackle the challenges of precise image inversion and disentangled image editing. It achieves precise image editing in only a few steps through an iterative inversion technique and conditional control based on text prompts. The technique is not only fast but also outperforms existing multi-step diffusion model editing methods.
birefnet for background removal is a deep-learning image processing model that automatically identifies and removes the background in images while retaining foreground objects. The technology has important applications in image editing, advertising design, product display, and other fields. Its main advantages include easy operation, fast processing speed, and natural-looking results.
Amazon Titan Image Generator v2 is an AI image generation model launched by AWS. It streamlines workflows and increases productivity by letting users guide generation with reference images, edit existing visuals, remove backgrounds, generate image variations, and safely customize the model to keep brand style and themes consistent.
PanoFree is an innovative panoramic multi-view image generation technique that solves consistency and cumulative error issues through iterative warping and inpainting, without any additional tuning. In experiments the technique shows significant error reduction and improved global consistency and image quality. Compared with existing methods, PanoFree is more efficient in time and GPU memory usage while maintaining diverse results.
Matting by Generation is an online tool that uses artificial intelligence technology for image segmentation. It can identify the foreground and background in images and achieve precise separation. It is widely used in fields such as design, video production and image editing. The main advantages of the product include high efficiency, easy operation and high-quality segmentation results.
Fai-Fuzer is an image editing tool based on AI technology, which can achieve precise editing and control of images through advanced control network technology. The main advantage of this tool is its high flexibility and accuracy, which can be widely used in image repair, beautification, creative editing and other fields.
ComfyUI-segment-anything-2 is an image segmentation library based on the segment-anything-2 model, which lets users easily run image segmentation through ComfyUI nodes. The library is still under development, but its core functionality is already usable. It automatically downloads the model and integrates it into ComfyUI, giving users a simple, easy-to-use image segmentation solution.
Meta Segment Anything Model 2 (SAM 2) is a next-generation model developed by Meta for real-time, promptable object segmentation in videos and images. It achieves state-of-the-art performance and supports zero-shot generalization, i.e., no need for custom adaptation to apply to previously unseen visual content. The release of SAM 2 follows an open science approach, with the code and model weights shared under the Apache 2.0 license, and the SA-V dataset also shared under the CC BY 4.0 license.
Alchemist is a technology that leverages pre-trained text-to-image models and synthetic data to let users edit the material properties of objects in images. By fine-tuning on a synthetic dataset, it enables parametric editing control over an object's specific material properties such as roughness, metallicity, base color saturation, and transparency. Its main advantages include the ability to change an object's material while preserving the object's geometry and the image's lighting, and even to realistically fill in the background, hidden internal structures, and refracted light effects when the object is made transparent.
Stable-Hair is a novel diffusion model-based hairstyle transfer method that can robustly transfer real-world diverse hairstyles to user-provided facial images for virtual try-on. The method excels when dealing with complex and diverse hairstyles, maintaining original identity content and structure while achieving highly detailed and high-fidelity transfer effects.
Diffree is a text-guided image inpainting model that can add new objects to images through text descriptions while maintaining background consistency, spatial suitability, and object relevance and quality. Trained on the OABench dataset using a Stable Diffusion model and an additional mask prediction module, the model can uniquely predict the location of new objects, enabling object addition guided only by text.
image-matting is an AI matting project based on the open source model briaai/RMBG-1.4. The project implements local-model image matting while serving as a learning exercise in AI techniques, GUI development, front-end work, and i18n internationalization. It supports single and batch matting, and users can process images quickly via drag-and-drop or paste. A packaged executable is also provided for download for user convenience.
ComfyUI-LivePortraitKJ is an open source project that provides support for LivePortrait through the ComfyUI node. It allows users to capture and animate facial features in real-time videos and pictures, and supports multiple facial detection technologies, including Insightface and MediaPipe. The project is licensed under an MIT license, provides better Mac support, and optimizes performance and efficiency, allowing for a near-real-time viewing experience in the ComfyUI environment.
TruthPix is an AI image detection tool designed to help users identify photos that have been tampered with by AI. Using advanced AI technology, the app can quickly and accurately spot traces of cloning and tampering in images, helping users avoid being misled by false information on social media and other platforms. Its main advantages include: high security, with all detection completed on the device and no data uploaded; fast detection, analyzing an image in under 400 milliseconds; and support for detecting images produced by a variety of generative techniques, such as GANs and diffusion models.
ViTMatte is an image matting system based on pre-trained Plain Vision Transformers (ViTs). It utilizes a hybrid attention mechanism and convolution neck to optimize the balance between performance and computation, and introduces a detail capture module to supplement the detail information required for matting. ViTMatte is the first work to unleash the potential of ViT in the field of image matting through simple adaptation, inheriting ViT's advantages in pre-training strategies, simple architecture design, and flexible inference strategies. On Composition-1k and Distinctions-646, the two most commonly used image matting benchmarks, ViTMatte achieves state-of-the-art performance and surpasses previous work by a large margin.
PaintsUndo is an AI model focused on digital painting behavior, capable of simulating and reproducing the brushstrokes and steps in the painting process. It extracts sketches of paintings by analyzing input static images, implements interpolation from external sketches, and can even convert anime-style works into sketch styles. This model is important in the field of image processing and can be widely used in artistic creation, education and entertainment.
UltraPixel is an advanced ultra-high-definition image synthesis technology designed to push image resolution to new heights. This technology was jointly developed by Hong Kong University of Science and Technology (Guangzhou), Huawei's Noah's Ark Laboratory, Max Planck Institute of Informatics and other institutions. It has significant advantages in image synthesis, text-to-image conversion, and personalization. It can generate images with resolutions up to 4096x4096 to meet the needs of professional image processing and visual arts.
ControlNet++ is a new network design based on the ControlNet architecture that supports more than 10 control types for conditional text-to-image generation and can generate high-resolution images visually comparable to Midjourney's. It extends the original ControlNet with two new modules, supports different image conditions with the same network parameters, and accepts multi-condition input without increasing the computational burden. The model has been open sourced so that more people can enjoy the convenience of image generation and editing.
Magic Insert is an innovative image editing technique that lets users drag and drop a subject of any style into a target image of another style, achieving style-aware, realistic insertion. The work formally defines the problem of style-aware drag-and-drop and solves it by tackling two sub-problems: style-aware personalization and realistic object insertion in stylized images. Magic Insert's approach significantly outperforms traditional inpainting-based approaches. Additionally, a dataset called SubjectPlop is provided to facilitate evaluation and future development in this field.
OccFusion is an innovative human body rendering technology that uses 3D Gaussian splatting and pre-trained 2D diffusion models to render complete human body images efficiently and with high fidelity even when the body is partially occluded. Through a three-stage process of initialization, optimization, and refinement, it significantly improves the accuracy and quality of human rendering in complex environments.
InstantStyle-Plus is an advanced image generation model focused on enabling style transfer during text-to-image generation while maintaining the integrity of the original content. It decomposes the style transfer task into three subtasks: style injection, spatial structure maintenance, and semantic content maintenance, and uses the InstantStyle framework to implement style injection in an efficient and lightweight way. The model maintains spatial composition by inverting content latent noise and using Tile ControlNet, and enhances semantic content fidelity through global semantic adapters. In addition, a style extractor is used as a discriminator to provide additional style guidance. The main advantage of InstantStyle-Plus is its ability to harmonize style and content without sacrificing content integrity.
ComfyUI-Fast-Style-Transfer is a fast neural style transfer plug-in developed based on the PyTorch framework. It allows users to achieve image style transfer through simple operations. This plug-in is based on the fast-neural-style-pytorch project, and currently only basic inference functions have been ported. Users can customize styles and achieve unique style transfer effects by training their own models.
AutoStudio is a multi-round interactive image generation framework based on large language models, generating high-quality images through three LLM-based agents together with a Stable Diffusion-based drawing agent. The technique makes significant progress on multi-subject consistency, improving the quality and consistency of generated images through a parallel UNet structure and a subject initialization generation method.
MimicBrush is an innovative image editing model that enables zero-shot image editing: the user specifies an editing region in the source image and provides an in-the-wild reference image, and the model automatically captures the semantic correspondence between the two and completes the edit in one pass. MimicBrush builds on diffusion priors and learns the semantic relationships between different images through self-supervised learning. Experiments demonstrate its effectiveness and superiority across a variety of test cases.
AI Playground is a desktop client application launched by Intel for Arc GPU users, designed to simplify the process of AI image creation, editing and AI-driven answer acquisition. It leverages Intel Xe-cores and the XMX engine designed specifically for AI, providing users with an easy way to use AI without in-depth knowledge of AI. The app, expected to be available for free download this summer, supports local control, protects user data privacy, and is user-friendly and easy to operate. In addition, AI Playground also provides model flexibility and open projects, encouraging developers and AI enthusiasts to experiment and innovate.
ComfyUI_omost is an Omost model implemented based on the ComfyUI framework, which allows users to interact with large language models (LLM) to obtain JSON-like structured layout hints. The model is currently under development and its node structure is subject to change. It uses two parts, LLM Chat and Region Condition, to convert JSON conditions into ComfyUI's regional format for image generation and editing.
InstaDrag is a fast, high-quality drag-based image editing technique that trains on information from videos and enables pixel-level control in about 1 second. It improves editing speed and accuracy by eliminating time-consuming operations such as gradient-based guidance. The technique can be widely applied in the field of image editing.
TryOnDiffusion is an innovative image synthesis technology that simultaneously maintains clothing details and adapts to significant body posture and shape changes in a single network through the combination of two UNets (Parallel-UNet). This technology can adapt to different body postures and shapes while maintaining clothing details, solving the shortcomings of previous methods in detail maintenance and posture adaptation, and achieving industry-leading performance.
krita-ai-diffusion is an open source Krita plugin designed to simplify the AI image generation process. It allows users to inpaint selected areas of an image, expand the canvas, and create new images from scratch with AI directly inside Krita. The plugin supports text prompts and provides powerful customization options suitable for advanced users. It builds on Stable Diffusion combined with the ComfyUI backend to provide a local, no-tuning-required image generation experience.
Anyline is a ControlNet line preprocessor capable of accurately extracting object edges, image details and text content from most images. It is based on the innovative efforts of the "Tiny and Efficient Model for the Edge Detection Generalization (TEED)" paper and is one of the most advanced vision algorithms currently. Anyline is combined with the Mistoline ControlNet model to form a complete SDXL workflow, maximizing precise control and leveraging the SDXL model generation capabilities.
Xinghui is an application offering rich AI image generation features: users can upload pictures, enter keywords, and freely switch between styles such as pixel art, cyberpunk, and Japanese comics to instantly step into a virtual life. Users can explore parallel worlds, freely generate AI images, and perform AI cosplay, while also enjoying AI photo and AI stylist functions. The application also supports picture stylization, such as classical oil painting, street graffiti, and Chinese ink styles.
ComfyUI-IC-Light is a native ComfyUI plug-in that implements IC-Light technology. It allows users to generate backgrounds and relight images through a series of workflows, enhancing the visual impact of images. Its importance lies in providing more natural and realistic image processing results, especially for users who need advanced image editing capabilities.
Stable Artisan is a Discord bot that leverages the Stability AI platform API to transform users' ideas into striking images from natural language prompts. It supports multi-subject prompts, high image quality, and accurate text spelling, making it a powerful tool for creative image generation.
The IC-Light project aims to use advanced machine learning technology to manipulate the lighting conditions of images to achieve consistent lighting effects. It provides two types of models: text-conditional relighting models and background-conditional models, both of which take foreground images as input. The importance of this technology lies in its ability to achieve precise control of image lighting through simple text descriptions or background conditions without relying on complex prompts, which is of great significance to image editing, augmented reality, virtual reality and other fields.
IntrinsicAnything is an advanced image inverse rendering technology that optimizes the material recovery process by learning a diffusion model, solving the problem of object material recovery in images captured under unknown static lighting conditions. This technology learns material priors through a generative model, decomposes the rendering equation into diffuse reflection and specular reflection terms, and uses the existing rich 3D object data for training, effectively solving the ambiguity problem in the inverse rendering process. Additionally, the technique develops a coarse-to-fine training strategy that leverages the estimated material-guided diffusion model to produce multi-view consistency constraints, resulting in more stable and accurate results.
PuLID is a deep learning model focused on face identity customization, achieving high-fidelity identity editing through contrastive alignment. The model minimizes disruption to the base model's behavior while supporting a variety of applications, such as style changes, IP fusion, and accessory modification.
AI Image Eraser is a tool based on artificial intelligence technology that can quickly and easily remove unwanted content from photos and improve the overall quality of your photos. The tool is easy to operate, free to use, and suitable for both personal and professional users.
The image watermark removal tool uses powerful AI technology to help users quickly remove watermarks from images, enhancing creative freedom and social media presence. The product is positioned as a convenient watermark removal service aimed at improving the user experience.
AI Background Remover uses artificial intelligence to detect the subject of your picture, create a mask, and remove the background. It supports PNG, JPG, and WebP formats without compromising image size or quality, letting you easily create transparent-background images.
ZeST is an image material transfer technology jointly developed by research teams at the University of Oxford, Stability AI, and MIT CSAIL. It can transfer an object's material from one image to another without any prior training. ZeST supports transferring a single material and can also edit multiple materials within a single image, so users can easily apply one material to several objects. In addition, ZeST supports fast on-device image processing, removing the dependence on cloud or server-side computation and greatly improving efficiency.
Super Canvas is an AI creative generation tool produced by Baidu Netdisk. Based on an uploaded portrait photo, it automatically generates creative images in various styles, such as realistic, aesthetic, and fantasy, helping photographers work more efficiently and putting image creativity within everyone's reach. The tool offers a free trial and flexible payment options to meet different needs.
HairFastGAN is a hairstyle transfer method that achieves high resolution, near-real-time performance, and excellent reconstruction. The approach includes a new architecture operating in StyleGAN's FS latent space, enhanced inpainting, and improved encoders for better alignment, color transfer, and post-processing. Even in the most difficult cases, it can transfer hairstyle shape and color from one picture to another in under a second.
DreamWalk is a text-guided image generation method based on diffusion guidance that allows fine-grained control over the style and content of images without fine-tuning the diffusion model or modifying its internal layers. It supports interpolation between multiple styles and spatially varying guidance functions, and can be applied broadly across diffusion models.
SwapAnything is a novel framework that can swap arbitrary objects in an image for a personalized concept given by a reference, while keeping the context intact. Compared with existing personalized subject-swapping methods, SwapAnything has three unique advantages: (1) precise control over arbitrary objects and parts rather than only the main subject, (2) more faithful preservation of context pixels, and (3) better adaptation of the personalized concept into the image. It achieves region control on latent feature maps through targeted variable swapping, exchanging masked variables to maintain faithful context and perform the initial semantic concept swap. The semantic concept is then seamlessly adapted to the original image through appearance adaptation of target location, shape, style, and content. Extensive results under both human and automatic evaluation show significant improvements over baseline methods in personalized swapping. SwapAnything also demonstrates precise and faithful swapping on single-object, multi-object, partial-object, and cross-domain swapping tasks, as well as strong performance on text-based swapping and tasks beyond swapping, such as object insertion.
Cos Stable Diffusion XL 1.0 Base is tuned to use Cosine-Continuous EDM VPred scheduling. Its most notable feature is producing a full color range, from pure black to pure white, along with subtler improvements in the image's rate of change at each step. Cos Stable Diffusion XL 1.0 Edit uses the same scheduling and is further tuned for image editing: it takes a source image and a prompt as input, interpreting the prompt as instructions for how to change the image. Pricing: free to use. Positioning: for generating artworks and designs in creative workflows, for applications in educational or creative tools, for research on generative models, for the safe deployment of models that have the potential to generate harmful content, and for probing and understanding the limitations and biases of generative models.
DesignEdit is a unified framework that integrates various spatially aware image editing functions. It decomposes the spatially aware editing task into two subtasks: decomposition and fusion of multi-layer latent representations. First, the latent representation of the source image is segmented into multiple layers, including several target layers and an incomplete background layer that needs reliable inpainting. To avoid additional tuning, we further explore the inpainting capability within the self-attention mechanism and introduce a key-masking self-attention scheme that propagates surrounding context into the masked region while reducing its influence outside it. Second, we propose an instruction-guided latent fusion method that pastes the multi-layer latent representations onto a canvas latent space, together with a latent-space artifact suppression mechanism that improves inpainting quality. Thanks to the inherent modularity of this multi-layer representation, precise image editing is possible, and our method achieves excellent performance across multiple editing tasks, surpassing state-of-the-art spatial editing methods.
Cleanup Pictures by iFoto is an online picture repair tool that can easily remove unwanted objects, people, text and watermarks from photos. Suitable for quickly improving the quality of e-commerce images.
InstantStyle is a general framework that leverages two simple but powerful techniques to effectively separate style and content in reference images. Its principles include separating content from the image and injecting features only into style-specific blocks, and it offers capabilities such as style composition and image generation. InstantStyle helps users maintain a consistent style during text-to-image generation, providing a better generation experience.
This algorithm is designed to simplify the animation coloring process. Traditionally, digital artists must manually color line-art animations frame by frame, which is very time-consuming and labor-intensive. This algorithm only requires the painter to color the first frame; it then automatically propagates the colors to all subsequent frames, greatly improving work efficiency. Its core is a novel inclusion-relationship matching module that accurately captures details such as object deformation and occlusion in animations to ensure coloring accuracy. A dedicated dataset was developed to train the algorithm and fully exploit its coloring capability. Compared with existing techniques, it shows excellent coloring quality and robustness.
ComfyUI-PixelArt-Detector is an open source tool for detecting pixel art, which can be integrated into ComfyUI to help users identify and process pixel art images.
SwinIR is an official PyTorch implementation of image restoration based on Swin Transformer, achieving state-of-the-art performance in tasks such as classic, lightweight, and real-world image super-resolution, grayscale/color image denoising, and JPEG compression artifact removal. It consists of shallow feature extraction, deep feature extraction and high-quality image reconstruction, with excellent performance and parameter optimization.
ObjectDrop is a supervised method for photorealistic object removal and insertion, leveraging a counterfactual dataset and bootstrap supervision. Its main capabilities are removing objects from an image together with their effects on the scene (such as occlusions, shadows, and reflections), and inserting objects into an image in a highly realistic way. Object removal is achieved by fine-tuning a diffusion model on a small, specially captured dataset. For object insertion, it uses bootstrap supervision: the removal model synthesizes a large-scale counterfactual dataset, a model is trained on it, and that model is then fine-tuned on the real dataset to obtain a high-quality insertion model. Compared with prior methods, ObjectDrop significantly improves the realism of both object removal and insertion.
ComfyUI - SuperBeasts is an image processing application for enhancing the dynamic range and visual appeal of images. It provides a set of adjustable parameters for fine-tuning HDR effects according to user preferences. Its features include: adjusting the intensity of shadows, highlights, and the overall HDR effect; applying gamma correction to control overall brightness and contrast; enhancing contrast and color saturation for more vivid results; preserving color accuracy by processing images in the LAB color space; using luminance-based masks for targeted adjustments; and blending the adjusted luminance with the original for a balanced effect.
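Two of the steps listed above, gamma correction and luminance blending, can be sketched in a few lines. This is a rough illustration under simple assumptions (scalar luminance values in [0, 1]), not the SuperBeasts implementation; the function names are invented for the sketch.

```python
# Minimal sketch of gamma correction plus luminance blending, the kind
# of per-pixel math an HDR-effect node applies (hypothetical helpers,
# not the actual SuperBeasts code).

def gamma_correct(v, gamma):
    """Apply gamma correction to a luminance value in [0, 1].

    gamma > 1 lifts shadows; gamma < 1 darkens midtones."""
    return v ** (1.0 / gamma)

def blend_luminance(original, adjusted, strength):
    """Mix adjusted luminance back with the original; strength in [0, 1]."""
    return (1.0 - strength) * original + strength * adjusted

lum = 0.25                        # a dark midtone pixel
lifted = gamma_correct(lum, 2.0)  # 0.25 ** 0.5 == 0.5
out = blend_luminance(lum, lifted, 0.5)  # halfway between 0.25 and 0.5
```

Blending at `strength=0.5` keeps half of the original tonality, which is why the node's result looks "balanced" rather than fully pushed toward the adjusted version.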
FlashFace encodes face identity through feature maps and introduces a decoupled integration strategy, excelling at detail retention and instruction following. It suits applications such as language-prompted face swapping.
The Stability AI developer platform now provides a comprehensive set of API services, including image generation, enhancement, outpainting, and editing, aimed at improving the quality and efficiency of media creation.
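As a hypothetical sketch of what calling such an image-generation API looks like, the snippet below only assembles the request URL and form fields. The endpoint path, field names, and the commented `requests` usage are assumptions for illustration; consult the official Stability AI API reference for the real contract.

```python
# Hypothetical request-building sketch for an image-generation REST API.
# Endpoint path and field names are assumed, not authoritative.

API_HOST = "https://api.stability.ai"  # assumed host

def build_generation_request(prompt, aspect_ratio="1:1", output_format="png"):
    """Assemble the URL and form fields for a text-to-image call."""
    url = f"{API_HOST}/v2beta/stable-image/generate/core"  # assumed path
    fields = {
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "output_format": output_format,
    }
    return url, fields

url, fields = build_generation_request("a lighthouse at dusk")
# The actual network call (not executed here) might look like:
# resp = requests.post(url, data=fields,
#                      headers={"Authorization": f"Bearer {API_KEY}"})
```

Keeping request assembly separate from the network call makes the payload easy to inspect and test before spending API credits.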
This project translates the text in comics and pictures. Its main functions include text detection, optical character recognition (OCR), machine translation, and image inpainting. It supports multiple languages, including Japanese, Chinese, English, and Korean, and achieves near-perfect translation results. The project is aimed mainly at comic lovers and image processing workers, who can use it to read foreign-language comics or process images in multiple languages. It also offers several usage modes, including a web service, online demos, and command-line tools, with good usability. The project code is open source, and contributions are welcome.
Freepik Reimagine is an AI-based image creation tool that leverages advanced algorithms to create new versions and styles of your existing images. There is no need for tedious editing: simply upload an image, set the desired changes, and the AI automatically generates brand-new variations. The tool can change an image's style, composition, color, and other elements according to your needs, opening up unlimited creative possibilities. It is also easy to operate, so even users without a professional background can get started quickly. Whether you are a designer, artist, or creative enthusiast, Freepik Reimagine can spark endless creativity and improve your efficiency. The tool is currently in public beta and free to use.
img2img-turbo is an open source project that is an improvement on the original img2img project and aims to provide faster image-to-image conversion. The project uses advanced deep learning technology and is able to handle various image conversion tasks, such as style transfer, image colorization, image restoration, etc.
The Upscale.media plug-in uses advanced AI technology to provide image enlargement and enhancement capabilities to streamline your image processing workflow in just a few clicks. Thousands of users already use Upscale.media to save time and get great results.
Blur ID is an automatic redaction tool that detects private text, profile avatars, and QR codes in photos and screenshots and blurs them automatically to protect privacy. Users can customize avatar overlays for an immersive masking effect. The app runs entirely locally with no server, ensuring privacy and security. Supported redaction targets include faces, sensitive text, avatars, QR codes, and barcodes. Recognition accuracy improves through continuous model optimization. Blur ID offers a free version and a paid subscription, with the paid version providing more advanced features.
Polycam's Gaussian Splatting creation tool lets you convert images into immersive 3D splats for free, which you can preview, share, and export. The tool accepts 20-200 input images in PNG or JPG format; inputs should follow photogrammetry best practices to ensure sharp images, uniform exposure, and no motion blur. The generated 3D splats can be used in engines such as Unity and Unreal, and the plugins are continually updated to support more software. The tool also provides a Gallery for browsing and sharing community creations.
FineControlNet is an official PyTorch implementation for image generation that controls the shape and texture of image instances through spatially aligned control inputs (such as 2D human poses) and instance-specific text descriptions. It can use anything from simple line drawings to complex human poses as spatial input. FineControlNet ensures natural interaction and visual harmony between instances and the environment while retaining Stable Diffusion's quality and generalization, but with finer control.
remove-background-webgpu is a small browser-based program that uses WebGPU to quickly remove image backgrounds, letting users obtain background-free images without installing any additional software.
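The final compositing step behind background removers like this one can be sketched simply: given a per-pixel foreground mask from a segmentation model, write it into the image's alpha channel so background pixels become transparent. The helper below is a minimal illustrative sketch (flat pixel lists, binary mask), not code from the tool itself.

```python
# Minimal sketch of mask-to-alpha compositing, the last step of
# background removal once a segmentation mask is available.

def apply_mask_as_alpha(pixels, mask):
    """pixels: list of (r, g, b) tuples; mask: list of 0/1 foreground flags.

    Returns RGBA pixels where background (mask == 0) is fully transparent."""
    return [(r, g, b, 255 if keep else 0)
            for (r, g, b), keep in zip(pixels, mask)]

pixels = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
mask = [1, 0, 1]  # middle pixel classified as background
rgba = apply_mask_as_alpha(pixels, mask)
```

Real tools typically use a soft (0-255) matte rather than a binary mask, which gives smoother edges around hair and other fine detail.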
StableDrag is a point-based image editing framework that addresses the inaccurate point tracking and incomplete motion supervision found in existing drag-based editing methods. It designs a discriminative point tracking method and a confidence-based latent enhancement strategy: the former accurately locates updated handle points, improving the stability of long-distance manipulation, while the latter ensures that the optimized latent representations remain as high-quality as possible at every manipulation step. The framework instantiates two image editing models, StableDrag-GAN and StableDrag-Diff, which achieve more stable drag performance, as shown by extensive qualitative experiments and quantitative evaluation on DragBench.
sd-forge-layerdiffuse is a work-in-progress extension for generating transparent images and layers, utilizing latent transparency techniques. The tool currently supports image generation and basic layer functionality, but transparent image-to-image conversion is not yet complete. The codebase is highly dynamic and there may be a lot of changes in the coming month.
sd-forge-layerdiffusion is a work-in-progress extension for generating transparent images and layers for SD WebUI via Forge. This extension supports native transparent diffusion processing, which can generate complex effects such as transparent glass and translucent glow effects. Currently, image generation and basic layer functions are available, but the transparent img2img function is not yet complete. The codebase is highly dynamic and there may be a lot of changes in the coming month.
Glif is a plugin that uses artificial intelligence to remix any image on the web. It offers a variety of AI workflows: restyle an image by right-clicking it, write your own prompts, or let the AI improvise. Glif is powered by AI workflows that anyone can build on. Please use it responsibly; it is recommended for public-domain resources such as the Public Domain Review or Artvee. Check the official website for pricing information.