# Agentic-RAG-R1

**Repository Path**: cangmj/Agentic-RAG-R1

## Basic Information

- **Project Name**: Agentic-RAG-R1
- **Description**: mirror:https://github.com/jiangxinke/Agentic-RAG-R1.git
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-12-01
- **Last Updated**: 2025-12-01

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# 🤖 Agentic RAG-R1: Enhance Agentic RAG Reasoning Capacity via Reinforcement Learning 🚀

## Table of Contents

- [🤖 Agentic RAG-R1: Enhance Agentic RAG Reasoning Capacity via Reinforcement Learning 🚀](#-agentic-rag-r1-enhance-agentic-rag-reasoning-capacity-via-reinforcement-learning-)
  - [Table of Contents](#table-of-contents)
  - [Introduction 🌟](#introduction-)
    - [What is Agentic RAG? 💡](#what-is-agentic-rag-)
    - [Architecture 🏗️](#architecture-️)
    - [Training Strategy 🧠](#training-strategy-)
    - [Rollout Generation 🔄](#rollout-generation-)
  - [Installation 🛠️](#installation-️)
    - [Tools Environment (Optional) 🧰](#tools-environment-optional-)
    - [Folder Structure 📁](#folder-structure-)
    - [Quick Start ⚡](#quick-start-)
      - [Training](#training)
      - [Inference](#inference)
  - [Features ✨](#features-)
  - [Results 📊](#results-)
    - [Experiment Log on Qwen 2.5-7B-Instruct](#experiment-log-on-qwen-25-7b-instruct)
    - [Results on MedQA Test Set 🏥](#results-on-medqa-test-set-)
  - [Roadmap 🗺️](#roadmap-️)
  - [Acknowledgements 🙏](#acknowledgements-)
  - [Contributors📝](#contributors)
  - [Citation 📝](#citation-)
  - [🌟 Star History](#-star-history)
  - [License 📄](#license-)

## Introduction 🌟

Agentic RAG-R1 is an open-source initiative to build an Agentic Retrieval-Augmented Generation (RAG) system by endowing a base language model with autonomous search and reasoning skills through reinforcement learning (currently using the GRPO algorithm).
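For readers unfamiliar with GRPO, its central idea is to score each rollout relative to the other rollouts sampled for the same question, instead of training a separate value model. The following is a minimal Python sketch of that group-relative advantage step only. It is not this repository's training code: the function name `group_relative_advantages`, the `eps` constant, and the use of the population standard deviation are illustrative assumptions.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize the rewards of one group of rollouts for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps), so rollouts
    that beat the group average get a positive advantage.
    NOTE: hypothetical helper; eps and the population std are assumptions.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Four hypothetical rollouts for one question, with made-up rewards.
    rewards = [0.0, 1.0, 2.0, 1.0]
    for r, a in zip(rewards, group_relative_advantages(rewards)):
        print(f"reward={r:.1f} advantage={a:+.3f}")
```

When every rollout in a group receives the same reward, all advantages are zero, so such a group contributes no learning signal in this formulation.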
**Chinese Language Version:**

![Chinese version results](https://github.com/user-attachments/assets/a6e42d35-4fec-43b9-9a04-3d102e544e20)

**English Language Version:**

![English version results](https://github.com/user-attachments/assets/40f11648-bf46-4cd3-873c-78ca63069499)

### What is Agentic RAG? 💡

Agentic RAG combines two powerful concepts:

- **Retrieval-Augmented Generation (RAG)**: Pairs a generative model with on-the-fly retrieval from external knowledge bases to keep answers factual and up to date.
- **Agentic AI**: Gives the model the ability to decide when to retrieve, what to retrieve, and how to weave the retrieved evidence into its reasoning.

![Agentic RAG concept](https://github.com/user-attachments/assets/7b4b6559-b395-4de0-8326-ad0fca2e671a)

### Architecture 🏗️

Our architecture is inspired by TC-RAG and features an agent memory stack that orchestrates the full deliberation loop, supporting the following actions:

1. Plan (❌)
2. Reasoning (✅)
3. Backtrack (✅)
4. Summary (✅)
5. Tool Observation – wiki/document/knowledge-graph search, etc. (✅)
6. Conclusion (✅)

![Architecture diagram](https://github.com/user-attachments/assets/53dfae56-6c59-488f-9313-7688d5839077)

### Training Strategy 🧠

Motivated by DeepSeek-R1, we apply GRPO (Group Relative Policy Optimization) to reinforce the agent's choice of reasoning steps and retrieval actions, improving both search depth and answer quality.

![Training strategy diagram](https://github.com/user-attachments/assets/9880394a-f16a-4acd-84c8-db9f4f7d8433)

### Rollout Generation 🔄

![Rollout generation diagram](https://github.com/user-attachments/assets/21d90097-f7a4-46ef-a442-c8a0a778bab4)

## Installation 🛠️

We use conda to manage the environment.
Follow these steps to set up:

```bash
conda create -n AgenticRAG python=3.11 -y
conda activate AgenticRAG
pip install -r requirements.txt
```

### Tools Environment (Optional) 🧰

We provide our search tool repository [ArtSearch](https://github.com/Artessay/ArtSearch) as the search engine, which supports retrieval of information from Wikipedia. You can follow the instructions in that repository to deploy a local instance of the search system.

### Folder Structure 📁

```
.
├── ArtSearch               # Search tool integration
├── checkpoints             # Model checkpoints
├── examples                # Example use cases
├── experiments
│   ├── evaluation          # Evaluation scripts and results
│   └── training            # Training configurations
├── README.md
├── requirements.txt
├── script
│   ├── evaluation          # Evaluation scripts
│   ├── run_server.sh       # Server deployment script
│   └── training            # Training scripts
├── service
│   ├── chat_client.py      # Client for interacting with the model
│   └── chat_server.py      # Server for hosting the model
├── src
│   ├── config              # Configuration files
│   ├── data                # Data processing utilities
│   ├── evaluation          # Evaluation metrics and tools
│   ├── models              # Model definitions
│   ├── train.py            # Main training script
│   └── utils               # Utility functions
```

### Quick Start ⚡

Follow the steps below to get up and running with Agentic RAG-R1.

Before you start, rename the file `.env_format` to `.env` and fill in the required OS environment variables.

#### Training

- **ZeRO-2 Mode**: `./script/training/train_zero2.sh`
- **ZeRO-3 Mode**: `./script/training/train_zero3.sh`

#### Inference

- **Example Mode**: coming soon
- **Server Mode**: Launch the chat server with `./script/run_server.sh`

## Features ✨

- **LoRA Tuning Support** 🔧: Fine-tune efficiently with Low-Rank Adaptation
- **Model Quant Support** 💻: Supports model quantization to nf4 and ..
- **Custom Agent Tools** 🛠️: Integrate your own tools and personal RAG datasets
- **Distributed Training** 🌐: Support for DeepSpeed ZeRO Stage 2 and Stage 3
- **Efficient Resource Usage** 💻: Support for models up to 32B parameters using only **2 A100 GPUs**
- **Tool Calling Reward** 🎯: Enhanced reward model that includes:
  - Accuracy reward
  - Format reward
  - RAG accuracy reward using the RAGAS framework

  The total reward is calculated as:

  $$r_{total} = r_{accuracy} + r_{format} + r_{rag}$$

- **TCRAG Integration** 🔗: Use [TCRAG](https://github.com/Artessay/TC-RAG) as the rollout generator

## Results 📊

### Experiment Log on Qwen 2.5-7B-Instruct

![Experiment log](https://github.com/user-attachments/assets/591d61aa-d4e4-45e9-a48a-77a01858a24b)

We have made our training logs publicly available at: [SwanLab Training Log](https://swanlab.cn/@devilran/xiaobeir1/runs/ipuoxctxo764rvub20d6h/chart)

### Results on MedQA Test Set 🏥

Our Qwen 2.5-7B-Instruct model was evaluated on the MedQA test set using Qwen-2.5-72B as the judge:

| Configuration                          | Format Accuracy | Answer Accuracy |
|----------------------------------------|-----------------|-----------------|
| Before fine-tuning                     | 39%             | 84%             |
| Before fine-tuning + search            | 56%             | 79%             |
| After fine-tuning (200 steps) + search | 92%             | 87%             |

## Roadmap 🗺️

- [ ] Add more tools
- [ ] [Additional planned features]

## Acknowledgements 🙏

The concept of Agentic-RAG-R1 is inspired by [Deepseek-R1](https://arxiv.org/abs/2501.12948) and [TC-RAG](https://arxiv.org/abs/2408.09199). We sincerely appreciate the efforts of these teams for their contributions to open-source research and development. This work was developed concurrently with Search-R1 and ReSearch.
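As a concrete reading of the total reward formula listed under Features, the sketch below sums the three components for a single rollout. It is illustrative only: the `RewardBreakdown` class, its field names, and the example values are assumptions, and the repository's actual reward functions (including the RAGAS-based score) are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardBreakdown:
    """Per-rollout reward components (hypothetical container)."""

    r_accuracy: float  # answer correctness reward
    r_format: float    # output-format compliance reward
    r_rag: float       # retrieval-quality reward (RAGAS-based in the README)

    @property
    def r_total(self) -> float:
        # Unweighted sum, matching r_total = r_accuracy + r_format + r_rag.
        return self.r_accuracy + self.r_format + self.r_rag


if __name__ == "__main__":
    # Made-up component values for one rollout.
    reward = RewardBreakdown(r_accuracy=1.0, r_format=0.5, r_rag=0.25)
    print(reward.r_total)
```

Keeping the components separate until the final sum makes it easy to log each term on its own, which helps when diagnosing whether format or retrieval quality is driving the training signal.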
## Contributors📝

Supervisors: Junfeng Zhao, Xu Chu, Yasha Wang

Affiliation: Key Laboratory of High Confidence Software Technologies (Peking University), School of Computer Science, Peking University, China

## Citation 📝

If you use this work in your research, please cite:

```bibtex
@misc{Agentic_RAG_R1,
  title        = {Agentic RAG-R1: Enhance Agentic RAG Reasoning Capacity via Reinforcement Learning},
  author       = {Xinke Jiang and Jiaran Gao and Rihong Qiu and Wentao Zhang and Yue Fang and Hongxin Ding and Yifan Dai},
  year         = {2025},
  howpublished = {\url{https://github.com/jiangxinke/Agentic-RAG-R1}},
  note         = {GitHub repository},
}
```

## 🌟 Star History

[![Star History Chart](https://api.star-history.com/svg?repos=jiangxinke/Agentic-RAG-R1&type=Date)](https://star-history.com/#jiangxinke/Agentic-RAG-R1&Date)

## License 📄

This project is licensed under the Apache License 2.0. See the [LICENSE](LICENSE) file for details.