Found 47 AI tools
Swarm is an experimental framework, managed by the OpenAI Solutions team, for building, orchestrating, and deploying multi-agent systems. It coordinates execution between agents through two abstractions: agents and handoffs. The framework emphasizes being lightweight, highly controllable, and easy to test. It suits scenarios that involve a large number of independent functions and instructions, giving developers full transparency and fine-grained control over context, steps, and tool calls. Swarm is currently experimental and is not recommended for production use.
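The agent and handoff idea can be illustrated with a small plain-Python sketch. This is not Swarm's actual API: the `Agent` class, the keyword-based routing, and the `run` loop are hypothetical stand-ins, and a real system would let a language model decide which function to call. The only point shown is that a function returning another agent transfers control to it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union


@dataclass
class Agent:
    """A named bundle of instructions plus the functions it may call."""
    name: str
    instructions: str
    # Hypothetical routing table: keyword -> function. A real framework
    # would let the model choose the function instead of keyword matching.
    functions: Dict[str, Callable[[], Union[str, "Agent"]]] = field(default_factory=dict)


def run(agent: Agent, user_message: str, max_steps: int = 5) -> List[str]:
    """Run agents until one replies with text; return a trace of what happened."""
    trace: List[str] = []
    for _ in range(max_steps):
        trace.append(agent.name)
        result = None
        # Call the first function whose keyword appears in the message.
        for keyword, fn in agent.functions.items():
            if keyword in user_message.lower():
                result = fn()
                break
        if isinstance(result, Agent):
            agent = result  # handoff: the returned agent takes over
            continue
        # No handoff: record a text reply and stop.
        trace.append(result if result is not None else f"{agent.name}: {agent.instructions}")
        break
    return trace


refunds = Agent("Refunds", "Process the refund.")
triage = Agent("Triage", "Ask what the user needs.", {"refund": lambda: refunds})
trace = run(triage, "I want a refund")
print(trace)
```

Because the handoff is just a return value, each agent can be tested in isolation by calling its functions directly.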
LiveKit Agents is an end-to-end framework that enables developers to build intelligent multi-modal voice assistants (AI agents) that can interact with users through voice, video and data channels. It provides a quick start guide to creating voice assistants by integrating OpenAI's real-time API and LiveKit's WebRTC infrastructure, including pipelines for speech recognition (STT), language models (LLM), and text-to-speech (TTS). Additionally, it supports the ability to create voice-to-voice agents, answer and respond to incoming calls, and make calls on behalf of users.
curiosity is a chatbot project based on the ReAct framework that aims to explore and build a Perplexity-like user experience with the LangGraph and FastHTML technology stack. The core of the project is a simple ReAct agent that uses Tavily search to enhance text generation. It supports three different LLMs (large language models): OpenAI's gpt-4o-mini, Groq's llama3-groq-8b-8192-tool-use-preview, and Ollama's llama3.1. The front end is built with FastHTML, which provides a fast user experience overall, although debugging may pose some challenges.
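A ReAct agent alternates between reasoning, acting (for example, calling a search tool), and observing the result until it can answer. The sketch below shows that loop in plain Python under loud assumptions: `fake_search` is a hypothetical stand-in for a search tool such as Tavily, and `scripted_model` replaces the LLM with fixed rules. It does not reflect the project's actual code or the LangGraph API.

```python
from typing import Callable, Dict, List, Tuple


def fake_search(query: str) -> str:
    """Hypothetical stand-in for a web search tool."""
    corpus = {"capital of france": "Paris is the capital of France."}
    return corpus.get(query.lower(), "no results")


def scripted_model(question: str, observations: List[str]) -> Tuple[str, str, str]:
    """Stand-in for an LLM: returns (thought, action, action_input)."""
    if not observations:
        return ("I should look this up.", "search", question)
    return ("I have enough information.", "finish", observations[-1])


def react(
    question: str,
    model: Callable[[str, List[str]], Tuple[str, str, str]],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 3,
) -> str:
    """Reason, act, observe; repeat until the model chooses to finish."""
    observations: List[str] = []
    for _ in range(max_steps):
        thought, action, arg = model(question, observations)
        if action == "finish":
            return arg
        # Act with the chosen tool and record what came back.
        observations.append(tools[action](arg))
    return "gave up"


answer = react("capital of France", scripted_model, {"search": fake_search})
print(answer)
```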
RD-Agent is an automated research and development tool from Microsoft Research Asia. Built on large language models, it aims to automate AI-driven R&D processes by integrating data-driven R&D systems. Beyond improving R&D efficiency, it uses intelligent decision-making and feedback mechanisms to support cross-field innovation and knowledge transfer.
Sentient is a framework/SDK that lets developers build intelligent agents that control browsers in 3 lines of code. It uses AI models to carry out complex web interactions and automation tasks through simple code. Sentient supports a variety of AI model providers, including OpenAI and Together AI, and can be customized to users' specific needs.
muAgent is an Agent framework driven by a knowledge graph engine that supports multi-Agent orchestration and collaboration. It uses LLM+EKG (Eventic Knowledge Graph, which carries industry knowledge) technology, combined with FunctionCall, CodeInterpreter, and similar capabilities, to automate complex SOP processes through drag-and-drop canvas editing and lightweight text authoring. muAgent is compatible with various Agent frameworks on the market and offers core functions such as complex reasoning, online collaboration, human interaction, and ready-to-use knowledge. The framework has been verified in multiple complex DevOps scenarios at Ant Group.
GenAgent is a framework for building collaborative AI systems by creating workflows and converting these workflows into code for better understanding by large language model (LLM) agents. GenAgent is able to learn from human-designed work and create new workflows. The generated workflows can be interpreted as collaborative systems to complete complex tasks.
Open-LLM-VTuber is an open source project designed to interact with large language models (LLM) via speech, with real-time Live2D facial capture and cross-platform long-term memory capabilities. The project supports macOS, Windows, and Linux platforms, allowing users to choose from different speech recognition and speech synthesis backends, as well as custom long-term memory solutions. It is particularly suitable for developers and enthusiasts who want to implement natural language conversations with AI on different platforms.
multi-agent-concierge is a multi-agent concierge system that uses multiple specialized agents to complete complex tasks and a "concierge" agent to guide users to the correct agent. Such systems are designed to handle multiple tasks with interdependencies, using hundreds of tools. The system demonstrates how to create implicit "chains" between agents through natural language instructions and manage these chains through "continuation" agents, while using global state to track users and their current status.
agent-service-toolkit is a complete toolkit for running AI agent services based on LangGraph, including LangGraph agent, FastAPI service, client and Streamlit application, providing complete settings from agent definition to user interface. It leverages the high degree of control and rich ecosystem of the LangGraph framework to support advanced features such as concurrent execution, graph looping, and streaming results.
AgentK is a self-evolving, modular, auto-agentic artificial general intelligence (AGI) project consisting of multiple cooperating agents that can build new agents based on user needs to complete tasks. It is built on the LangGraph and LangChain frameworks, can test and repair itself, and is designed as a minimal collection of agents and tools that can bootstrap itself and develop its own intelligence.
This is an open source project for remote control operation of the humanoid robot Unitree H1_2. It utilizes Apple Vision Pro technology to allow users to control the robot through a virtual reality environment. The project was tested on Ubuntu 20.04 and Ubuntu 22.04, and detailed installation and configuration guides are provided. The main advantages of this technology include the ability to provide an immersive remote control experience and support testing in a simulated environment, providing a new solution for the field of robot remote control.
RedCache-AI is a dynamic memory framework designed for large language models and agents, which allows developers to build a wide range of applications from AI-driven dating apps to medical diagnostic platforms. It solves the problem of existing solutions being expensive, closed source, or lacking extensive support for external dependencies.
Memory is an open source memory layer designed for autonomous agents. It improves an agent's reasoning and learning capabilities by imitating human memory. It uses the Neo4j graph database to store knowledge, and combines LlamaIndex and Perplexity models to enhance the query capabilities of the knowledge graph. Its main features include automatic memory generation, memory modules, system improvement, and retrospective memory. It is designed to integrate with existing agents with minimal developer effort and provides visual data for memory analysis and system improvement through a dashboard.
Agent Zero is a highly transparent, readable, understandable, customizable and interactive personal AI framework. It is not pre-programmed for specific tasks, but is designed as a general-purpose personal assistant capable of executing commands and code, cooperating with other agent instances, and completing tasks to the best of its ability. It has a persistent memory that remembers previous solutions, code, facts, instructions, etc. to solve tasks faster and more reliably in the future. Agent Zero uses the operating system as a tool to complete tasks; there are no pre-programmed single-purpose tools. Instead, it can write its own code and use the terminal to create and use its own tools as needed.
aiwaves-cn/agents is an open source framework focusing on data-driven adaptive language agents. It provides a systematic framework for training language agents through symbolic learning, inspired by the connectionist learning process used to train neural networks. The framework implements backpropagation and gradient-based weight updates using language-based losses, gradients, and weights, supporting the optimization of multi-agent systems.
Llama-agentic-system is a system-level agent component based on the Llama 3.1 model, which is capable of performing multi-step reasoning and using built-in tools such as search engines or code interpreters. The system also emphasizes security assessment, with input and output filtering through Llama Guard to ensure security needs are met under different usage scenarios.
Composio is a platform that provides high-quality tools and integrations for AI agents. It simplifies authentication and improves the accuracy and reliability of agents, allowing developers to integrate multiple tools and frameworks with a single line of code. It supports more than 100 tools, covering more than 90 platforms such as GitHub, Notion, and Linear, and provides a variety of functions including software operation, operating system interaction, browser functions, search, software development environment (SWE), and ad hoc agent data (RAG). Composio also supports six different authentication protocols, and it can significantly improve the accuracy of agents' tool calls. Additionally, Composio can be embedded into applications as a backend service to manage authentication and integration for all users and agents, maintaining a consistent experience.
AgentScope is a multi-agent platform designed to help developers build multi-agent applications using large-scale models. It is easy to use, highly robust, and supports Actor-based distribution, with custom fault-tolerance controls and retry mechanisms to enhance application stability.
IoA (Internet of Agents) is an intelligent agent interconnection framework that aims to achieve automated collaboration between different intelligent agents through a highly modular design. It allows developers to quickly integrate third-party intelligent agents and perform task allocation and execution through a unified interface. The core advantage of IoA is its flexibility and scalability, supporting a variety of application scenarios, including but not limited to collaborative paper writing, benchmark testing, and open instruction data sets.
AutoGPT is a powerful tool that allows users to create and run intelligent agents that can automate various tasks and make life easier. The goal of AutoGPT is to provide tools that allow users to focus on what matters. It pushes the frontiers of AI innovation by building and using AI agents.
Enchanted is an open source, Ollama-compatible app for macOS/iOS/visionOS that allows users to talk to private self-hosted language models such as Llama 2, Mistral, Vicuna, and more. It's basically a ChatGPT application interface connected to a private model. Enchanted's goal is to provide a product that allows for unfiltered, secure, private and multi-modal experiences across all devices in the iOS ecosystem (macOS, iOS, Watch, Vision Pro).
OmAgent is a complex multi-modal intelligent agent system dedicated to leveraging multi-modal large-scale language models and other multi-modal algorithms to complete engaging tasks. The project includes omagent_core, a lightweight intelligent agent framework carefully designed to address multi-modal challenges. OmAgent consists of three core components: Video2RAG, DnCLoop and Rewinder Tool, which are respectively responsible for long video understanding, complex problem decomposition and information review.
xLAM is an intelligent agent research project based on large language models (LLMs), developed by the Salesforce AI Research team. It aggregates intelligent agent trajectories from different environments and standardizes them into a consistent format, creating an optimized universal data loader specifically for training intelligent agents. xLAM-v0.1-r is version 0.1 of this model series, designed for research purposes and compatible with the vLLM and FastChat platforms.
llama-agents is an async-first framework for building, iterating on, and productionizing multi-agent systems, including multi-agent communication, distributed tool execution, human-in-the-loop, and more. Each agent is treated as a service that continuously processes incoming tasks. Agents pull messages from and publish messages to the message queue. At the top of the system is the control plane, which keeps track of ongoing tasks and the services in the network, and decides which service should handle the next step of a task.
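The service, message queue, and control plane arrangement described above can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the `shout` and `exclaim` services, the fixed `PIPELINE` plan, and the `next_service` routing rule are hypothetical and are not the llama-agents API, which runs services asynchronously and may use an LLM to choose the next step.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Optional, Tuple


def shout(text: str) -> str:
    """Hypothetical service: upper-cases its input."""
    return text.upper()


def exclaim(text: str) -> str:
    """Hypothetical service: appends an exclamation mark."""
    return text + "!"


SERVICES: Dict[str, Callable[[str], str]] = {"shout": shout, "exclaim": exclaim}
# Hypothetical fixed plan; a real control plane could decide dynamically.
PIPELINE: List[str] = ["shout", "exclaim"]


def next_service(steps_done: int) -> Optional[str]:
    """Control-plane decision: which service handles the next step, if any."""
    return PIPELINE[steps_done] if steps_done < len(PIPELINE) else None


def run_task(text: str) -> Tuple[str, List[str]]:
    """Drive one task through the queue; return the result and the services used."""
    queue: Deque[Tuple[str, str]] = deque()  # messages: (service name, payload)
    history: List[str] = []
    first = next_service(0)
    if first is None:
        return text, history
    queue.append((first, text))
    result = text
    while queue:
        service, payload = queue.popleft()    # a service pulls its message
        result = SERVICES[service](payload)   # ...and processes the task step
        history.append(service)
        upcoming = next_service(len(history))  # control plane picks the next step
        if upcoming is not None:
            queue.append((upcoming, result))   # publish the follow-up message
    return result, history


result, history = run_task("hello")
print(result, history)
```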
Agent-E is a system based on the AutoGen agent framework designed to automate operations on the user's computer, currently focusing on in-browser automation. It interacts with web browsers through natural language to perform operations such as filling out forms, searching and sorting e-commerce products, locating website content, managing playback settings, performing web searches, and managing project management platform tasks. Agent-E is growing and can handle a variety of tasks, but the best tasks are discovered by users themselves.
OpenAgents is an open platform designed to enable users and developers to use and host language agents in their daily lives. The platform has implemented three types of agents: Data Agent for data analysis, Plugins Agent for integrating 200+ daily tools, and Web Agent for automatic web browsing. OpenAgents enables ordinary users to interact with agent functionality through an optimized web UI, while providing developers and researchers with a seamless deployment experience on local settings, providing a foundation for the construction and real-world evaluation of innovative language agents.
Nerve is an LLM tool that can create stateful agents, allowing users to define and perform complex tasks without writing code. It enables agents to plan and step through the actions required to complete a task by dynamically updating system prompts and maintaining state across multiple inference processes. Nerve supports any model accessible through ollama, groq or the OpenAI API, with a high degree of flexibility and efficiency while focusing on memory safety.
Agent Mode is a feature of Warp AI that allows users to complete multi-step workflows in the terminal using natural language. It recognizes and interprets natural language instructions, provides context-specific guidance, and guides users through multi-step tasks. Agent Mode utilizes OpenAI's API but does not store or retain user input or output data.
agentUniverse is a multi-agent application development framework based on large language models, providing all the components needed to build single-agent and multi-agent collaboration mechanisms. Its Pattern Factory lets developers build and customize multi-agent collaboration patterns, assemble multi-agent applications easily, and share pattern practices across different technical and business fields.
TalkWithGemini is a cross-platform application that supports one-click free deployment. Users can interact with Gemini models through this application. It supports multi-modal interaction methods such as image recognition and voice dialogue to improve work efficiency.
Mentals AI is a tool designed to create and manipulate agents with loops, memories and various tools via simple Markdown syntax. It allows users to focus on the logic of the agent without writing low-level code in Python or other languages, thus redefining the basic framework for future AI applications.
gpt-computer-assistant is an application designed for Windows, macOS and Ubuntu operating systems to provide an alternative ChatGPT application. It allows users to easily install through Python libraries, and plans to provide native installation scripts (.exe). The product is powered by Upsonic Tiger, a platform that provides a feature center for large language model (LLM) agents. Key benefits of the product include cross-platform compatibility, ease of installation and use, and future support for native models.
ModelScope-Agent is a customizable and extensible agent framework with capabilities such as role playing, large language model invocation, tool usage, planning and memory. It simplifies the implementation of agent applications and offers rich model and tool interfaces, a unified interface design, high scalability, and low coupling, so developers can use the built-in tools, LLMs, memory and other components on their own without being tied to higher-level agents.
AutoGroq is an AI-powered conversational assistant designed to revolutionize the way users interact with AI tools by automatically generating expert agents. It overcomes the limitations of existing solutions and provides a user-friendly, powerful and configuration-free experience. The platform focuses on providing immediate and relevant assistance by automatically generating expert agents dedicated to any problem, regardless of its complexity.
WebLlama is an agent built on Meta Llama 3 and fine-tuned specifically for web navigation and conversation. It is designed to build effective human-centered agents that help users navigate the web, not replace them. The model outperforms zero-shot GPT-4V by 18% on the WebLINX benchmark, demonstrating strong performance on web navigation tasks.
AgentStudio is an open source tool suite that covers the entire life cycle of building a general virtual assistant. It provides environment implementations, a benchmark suite, a data collection pipeline and a graphical interface to support future research on general virtual assistants. AgentStudio provides a unified observation and action space consistent with human-computer interaction, allowing agents to be evaluated and data to be collected on any human-performed task. This greatly expands the potential task space. As a result, AgentStudio facilitates the development and evaluation of agents that span a variety of real-world use cases.
Skyvern is an automation tool that combines large language models (LLMs) and computer vision techniques to automate browser-based workflows. It provides a simple API endpoint to fully automate manual workflows, replacing brittle or unreliable automation solutions.
The Cradle framework is designed to enable foundation models to perform complex computer tasks through the same general interface that humans use (screen as input, keyboard and mouse operations as output). The game Red Dead Redemption II serves as a case study for the framework, demonstrating its generalization and adaptability in complex environments.
NavAIGuide is a scalable multi-modal agent framework that enables planning and user querying by accessing applications across mobile and desktop ecosystems. Its features include visual task detection, advanced code selectors, action-oriented execution, and robust error handling. It is positioned to provide users with efficient automation solutions.
AIlice is a lightweight AI agent designed to create a self-contained artificial intelligence assistant similar to JARVIS. It does this by building a "text computer" with a large language model (LLM) at its core. AIlice excels in topic research, coding, systems administration, literature reviews, and complex hybrid tasks beyond these basic abilities. AIlice leverages GPT-4 to achieve near-perfect performance in daily life tasks and is leveraging the latest open source models to move towards practical applications.
coze-discord-proxy is a plug-in that acts as a proxy for Discord bots. It can call a Discord bot hosted by Coze through an API to enable dialogue with AI such as ChatGPT. The plug-in supports streaming conversation responses, as well as text-to-image and image-to-text functions within conversations. It also supports creating channels, sub-channels, and threads, and designating dialogue channels to achieve isolation. It is compatible with OpenAI's dialogue interface, the GPT image recognition interface, and others, and is well suited for integration into panels such as NextChat and OneChat to provide AI chat capabilities.
AGI-Samantha is an autonomous agent that simulates Samantha in the movie "Her". It has dynamic voice capabilities and can speak autonomously according to context. Compared with general LLMs, it is not limited to answers and reactions. It also has real-time vision capabilities, external classification memory, and the ability to dynamically read and write, selecting the most relevant information. AGI-Samantha is evolving every moment, and the experiences stored in its memory can influence its subsequent behavior, such as personality, speaking frequency and style. AGI-Samantha is coordinated through a series of LLM calls, each with a different purpose, and these modules work together to simulate basic human brain workflows.
CrewAI is an open source library for developers that can help you build and coordinate a team of AI agents to solve complex tasks. It is built on LangChain and can seamlessly integrate various AI tools to empower your agents and allow them to complete specific work goals. You can combine different agents, tasks, and tools like building blocks to create an AI agent system that suits your needs.
InsActor is a character control system based on physics simulation. It can drive characters to complete various interactive tasks in complex environments through natural language instructions. The system uses conditional and adversarial diffusion models for multi-level planning, and is combined with low-level controllers to achieve stable and robust control. It has the advantages of smooth control and natural interaction, and is suitable for application scenarios such as creative content generation, interactive entertainment, and human-computer interaction.
AppAgent is a multi-modal agent framework based on large language models (LLMs), designed to operate smartphone applications. It uses a simplified action space (such as taps and swipes) to simulate human-like interaction and operate applications without requiring system back-end access. Agents learn how to use new applications through autonomous exploration or by observing human demonstrations, and build a knowledge base for performing complex tasks in different applications.
Suspicion-Agent is an implementation using GPT-4 with theory of mind awareness to play imperfect information games. It can train and evaluate agents and provide sample output.