pat-jj/Awesome-Adaptation-of-Agentic-AI

Repo for "Adaptation of Agentic AI"

Awesome Adaptation of Agentic AI

License: CC BY-NC-ND 4.0

A curated list of papers on adaptation strategies of agentic AI systems. This repository accompanies the paper "Adaptation of Agentic AI" (Ongoing Work).

Cite this paper:

@article{jiang2025adaptation,
  title={Adaptation of Agentic AI},
  author={Jiang, Pengcheng and Lin, Jiacheng and Shi, Zhiyi and Wang, Zifeng and He, Luxi and Wu, Yichen and Zhong, Ming and Song, Peiyang and Zhang, Qizheng and Wang, Heng and others},
  journal={arXiv preprint arXiv:2512.16301},
  year={2025}
}

Agent Adaptation

A1: Tool Execution Signaled Agent Adaptation

RL-based Methods

| Time | Method | Venue | Links | Task(s) | Tool(s) | Agent Backbone | Tuning |
|---|---|---|---|---|---|---|---|
| 2025.11 | Orion | arXiv | Paper | IR | Retrievers | LFM2 | GRPO |
| 2025.10 | olmOCR2 | arXiv | Paper, Code | Document OCR | Synthetic Document Verifier | Qwen2.5-VL | SFT, GRPO |
| 2025.10 | AlphaProof | Nature'25 | Paper | Formal Theorem Proving | Lean Compiler | Transformer (3B Enc-Dec) | SFT, AlphaZero, TTRL |
| 2025.10 | ToolExpander | arXiv | Paper | Tool-Calling | Various APIs | Qwen2.5 | SFT, GRPO |
| 2025.09 | BFS-Prover-V2 | arXiv | Paper, Code | Formal Theorem Proving | Lean Compiler | Qwen2.5 | BFS-Guided AlphaZero-like EI |
| 2025.09 | WebGen-Agent | arXiv | Paper, Code | Website Generation | VLM, GUI Agent, Code Executor | Various Models | SFT, Step-GRPO |
| 2025.09 | Tool-R1 | arXiv | Paper, Code | General Tool-Augmented Reasoning, Multimodal QA | Code Execution, Multimedia Tools | Qwen2.5 | GRPO |
| 2025.08 | FTRL | arXiv | Paper, Code | Multi-Step Tool-Use | Simulated APIs | Qwen3 | GRPO |
| 2025.08 | Goedel-Prover-V2 | arXiv | Paper, Code | Formal Theorem Proving | Lean Compiler | Qwen3 | SFT, GRPO |
| 2025.07 | Leanabell-Prover-V2 | arXiv | Paper, Code | Formal Theorem Proving | Lean Compiler | Qwen2.5 | SFT, AlphaZero-like EI |
| 2025.06 | Router-R1 | NeurIPS'25 | Paper, Code | Multi-Round Routing | LLM Routing Pool | Qwen2.5, LLaMA3.2 | PPO |
| 2025.05 | R1-Code-Interpreter | arXiv | Paper, Code | Coding | Code Execution Sandbox | Qwen2.5 | GRPO |
| 2025.05 | Tool-N1 | arXiv | Paper, Code | Tool-Calling | Various APIs | Qwen2.5 | GRPO |
| 2025.04 | DeepSeek-Prover-V2 | arXiv | Paper, Code | Formal Theorem Proving | Lean Compiler | DeepSeek-V2 | SFT, GRPO |
| 2025.04 | Kimina-Prover | arXiv | Paper, Code | Formal Theorem Proving | Lean Compiler | LLaMA-2 | SFT, AlphaZero-like EI |
| 2025.04 | SQL-R1 | NeurIPS'25 | Paper, Code | Text2SQL Search | SQL Engine | Qwen2.5, OmniSQL | SFT, GRPO |
| 2025.03 | Rec-R1 | TMLR'25 | Paper, Code | Recommendation Optimization | Recommendation System | Qwen2.5, LLaMA3.2 | GRPO |
| 2025.03 | ReZero | arXiv | Paper, Code | Web Search, IR | Web Search Engine | LLaMA3.2 | GRPO |
| 2025.03 | Code-R1 | --- | Code | Coding | Code Executor | Qwen2.5 | GRPO |
| 2025.02 | DeepRetrieval | COLM'25 | Paper, Code | Web Search, IR, Text2SQL | Search Engine, Retrievers, SQL exec. | Qwen2.5, LLaMA3.2 | PPO, GRPO |
| 2025.01 | DeepSeek-R1-Zero (Code) | Nature | Paper | Coding | Code Executor | DeepSeek-V3-Base | GRPO |
| 2024.10 | RLEF | ICML'25 | Paper | Coding | Code Executor | LLaMA3.1 | PPO |
| 2024.08 | DeepSeek-Prover-V1.5 | ICLR'25 | Paper, Code | Formal Theorem Proving | Lean 4 Prover | DeepSeek-Prover-V1.5-RL | SFT, GRPO |
| 2024.05 | LeDex | NeurIPS'24 | Paper | Coding | Code Executor | StarCoder & CodeLlaMA | SFT, PPO |

SFT & DPO Methods

| Time | Method | Venue | Links | Task(s) | Tool(s) | Agent Backbone | Tuning |
|---|---|---|---|---|---|---|---|
| 2024.12 | AWL | ICML'25 | Paper, Code | Scientific Reasoning, Adaptive Tool Usage | Scientific Simulators | Llama-3.1-8B, Qwen-2.5-{14/32}B | SFT, DPO |
| 2024.10 | LeReT | ICLR'25 | Paper, Code | IR | Dense Retriever | LLaMA3, Gemma2 | DPO-like (IPO) |
| 2024.10 | ToolFlow | NAACL'25 | Paper | Tool-Calling | Various APIs | LLaMA3.1 | SFT |
| 2024.06 | TP-LLaMA | NeurIPS'24 | Paper | Tool-Calling | Various APIs | LLaMA2 | SFT, DPO |
| 2024.05 | AutoTools | WWW'25 | Paper, Code | Automated Tool-Calling | Various APIs | GPT4, LLaMA3, Mistral | SFT |
| 2024.03 | CYCLE | OOPSLA'24 | Paper | Coding | Code Executor | CodeGen, StarCoder | SFT |
| 2024.02 | RetPO | NAACL'25 | Paper, Code | IR | Retriever | LLaMA2-7B | SFT, DPO |
| 2024.02 | CodeAct | ICML'24 | Paper, Code | Coding | Code Executor | LLaMA2, Mistral | SFT |
| 2024.01 | NExT | ICML'24 | Paper | Program Repair | Code Executor | PaLM2 | SFT |
| 2023.07 | ToolLLM | ICLR'24 | Paper, Code | Tool-Calling, API Planning, Multi-Tool Reasoning | Real-World APIs | LLaMA, Vicuna | SFT |
| 2023.06 | ToolAlpaca | arXiv | Paper, Code | Multi-Turn Tool-Use | Simulated APIs | Vicuna | SFT |
| 2023.05 | Gorilla | NeurIPS'24 | Paper, Code | Tool-Calling, API Retrieval | Various APIs | LLaMA | SFT |
| 2023.05 | TRICE | NAACL'24 | Paper, Code | Math Reasoning, QA, Multilingual QA, Knowledge Retrieval | Calculator, WikiSearch, Atlas QA Model, NLLB Translator | ChatGLM, Alpaca, Vicuna | SFT |
| 2023.02 | Toolformer | NeurIPS'23 | Paper, Code | QA, Math | Calculator, QA system, Search Engine, Translation System, Calendar | GPT-J | SFT |



A2: Agent Output Signaled Agent Adaptation

Methods with Tools

| Time | Method | Venue | Links | Task(s) | Tool(s) | Agent Backbone | Tuning |
|---|---|---|---|---|---|---|---|
| 2025.10 | TT-SI | arXiv | Paper | Tool Calling | Various APIs | Qwen2.5 | Test-Time Fine-Tuning |
| 2025.10 | A²FM | arXiv | Paper, Code | Web Navigation, Math, QA | Search Engine, Crawl, Code Executor | Qwen2.5 | APO, GRPO |
| 2025.09 | VerlTool | arXiv | Paper, Code | Math, QA, SQL, Visual, Web Search, Coding | Code Interpreter, Search Engine, SQL Executor, Vision Tools | Qwen2.5, Qwen3 | GRPO |
| 2025.08 | MedResearcher-R1 | arXiv | Paper, Code | Medical Multi-hop QA | Medical Retriever, Web Search API, Document Reader | MedResearcher-R1 | SFT, GRPO |
| 2025.08 | Agent Lightning | arXiv | Paper, Code | Text-to-SQL, RAG, Math | SQL Executor, Retriever, Calculator | LLaMA3.2 | LightningRL |
| 2025.07 | CodePRM | ACL'25 | Paper | Coding | Code Executor | Qwen2.5-Coder | SFT |
| 2025.07 | DynaSearcher | arXiv | Paper, Code | Multi-Hop QA, RAG | Document Search, KG Search | Qwen2.5, LLaMA3.1 | GRPO |
| 2025.06 | MMSearch-R1 | arXiv | Paper, Code | Web Browsing, QA, Multimodal Search | Image Search, Web Browsing, Retriever | Qwen2.5 | REINFORCE, SFT |
| 2025.06 | Self-Challenging | arXiv | Paper | Web Browsing, Calculation, Retail, Airline | Code Interpreter, Web Browser, Database APIs | LLaMA3.1 | REINFORCE, SFT |
| 2025.05 | StepSearch | EMNLP'25 | Paper, Code | Multi-Hop QA | Search Engine, Retriever | Qwen2.5 | StePPO |
| 2025.05 | ZeroSearch | arXiv | Paper, Code | Multi-Hop QA, QA | Search Engine, Web Search | Qwen2.5, LLaMA3.2 | REINFORCE, GRPO, PPO, SFT |
| 2025.05 | AutoRefine | NeurIPS'25 | Paper, Code | Multi-Hop QA, QA | Retriever | Qwen2.5 | GRPO |
| 2025.04 | ReTool | arXiv | Paper, Code | Math | Code Interpreter | Qwen2.5 | PPO |
| 2025.04 | ToolRL | arXiv | Paper, Code | Tool Calling | Various Tools | Various Models | GRPO |
| 2025.04 | DeepResearcher | arXiv | Paper, Code | QA, Multi-Hop Reasoning, Deep Research | Web Search API, Web Browser | Qwen2.5 | GRPO |
| 2025.03 | ReSearch | NeurIPS'25 | Paper, Code | QA | Search Engine, Retriever | Qwen2.5 | GRPO |
| 2025.03 | Search-R1 | COLM'25 | Paper, Code | QA | Search Engine, Retriever | Qwen2.5 | PPO, GRPO |
| 2025.03 | R1-Searcher | arXiv | Paper, Code | QA | Retriever | LLaMA3.1, Qwen2.5 | REINFORCE++ |
| 2025.02 | RAS | arXiv | Paper, Code | QA | Retriever | LLaMA2, LLaMA3.2 | SFT |
| 2025.01 | Agent-R | arXiv | Paper, Code | Various Tasks | Monte Carlo Tree Search | Qwen2.5, LLaMA3.2 | SFT |
| 2024.06 | Re-ReST | EMNLP'24 | Paper, Code | Multi-Hop QA, VQA, Sequential Decision, Coding | Various APIs | Various Models | DPO |
| 2024.06 | RPG | EMNLP'24 | Paper, Code | RAG, QA, Multi-hop Reasoning | Search Engine, Retriever | LLaMA2, GPT3.5 | SFT |
| 2023.10 | Self-RAG | ICLR'24 | Paper, Code | RAG, QA, Fact Verification | Retriever | LLaMA2 | SFT |
| 2023.10 | FireAct | arXiv | Paper, Code | QA | Search API | GPT3.5, LLaMA2, CodeLLaMA | SFT |

Methods without Tools

| Time | Method | Venue | Links | Task(s) | Tool(s) | Agent Backbone | Tuning |
|---|---|---|---|---|---|---|---|
| 2025.10 | Empower | arXiv | Paper, Code | Coding | --- | Gemma3 | SFT |
| 2025.10 | KnowRL | arXiv | Paper, Code | Knowledge calibration | --- | LLaMA3.1, Qwen2.5 | REINFORCE++ |
| 2025.10 | GRACE | arXiv | Paper, Code | Embedding Tasks | --- | Qwen2.5, Qwen3, LLaMA3.2 | GRPO |
| 2025.06 | Magistral | arXiv | Paper | Math, Coding | --- | Magistral | PPO, GRPO |
| 2025.05 | EHRMind | arXiv | Paper, Code | EHR-based Reasoning | --- | LLaMA3 | SFT, GRPO |
| 2025.01 | Kimi k1.5 | arXiv | Paper, Code | Math, Coding | --- | Kimi k1.5 | GRPO |
| 2025.01 | DeepSeek-R1-Zero (Math) | Nature | Paper | Math | --- | DeepSeek-V3 | GRPO |
| 2024.09 | SCoRe | ICLR'25 | Paper, Code | Math, Coding, QA | --- | Gemini1.0 Pro, Gemini1.5 Flash | REINFORCE |
| 2024.07 | RISE | NeurIPS'24 | Paper, Code | Math | --- | LLaMA2, LLaMA3, Mistral | SFT |
| 2024.06 | TextGrad | Nature | Paper, Code | Various Tasks | --- | GPT3.5, GPT4o | Prompt Tuning |
| 2023.03 | Self-Refine | NeurIPS'23 | Paper, Code | Dialogue, Math, Coding | --- | GPT3.5, GPT4, CODEX | Test-Time Prompting |

Tool Adaptation

T1: Agent-Agnostic Tool Adaptation

Foundational Systems and Architectures

| Year.Month | Method Name | Venue | Links | Paper Name |
|---|---|---|---|---|
| 2021.08 | Neural Operators | JMLR'23 | Paper | Neural Operator: Learning Maps Between Function Spaces |
| 2023.09 | HuggingGPT | NeurIPS'23 | Paper, Code | HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face |
| 2023.08 | ViperGPT | ICCV'23 | Paper, Code | ViperGPT: Visual Inference via Python Execution for Reasoning |
| 2025.07 | SciToolAgent | Nature Comp. Sci.'25 | Paper | SciToolAgent: A Knowledge-Graph-Driven Scientific Agent for Multitool Integration |

Categories and Training Methods

| Year.Month | Method Name | Venue | Links | Paper Name |
|---|---|---|---|---|
| 2021.01 | CLIP | ICML'21 | Paper, Code | Learning Transferable Visual Models from Natural Language Supervision |
| 2023.04 | SAM | ICCV'23 | Paper, Code | Segment Anything |
| 2024.06 | SAM-CLIP | CVPR'24 | Paper | SAM-CLIP: Merging Vision Foundation Models Towards Semantic and Spatial Understanding |
| 2022.12 | Whisper | ICML'23 | Paper, Code | Robust Speech Recognition via Large-Scale Weak Supervision |
| 2024.02 | CodeAct | ICML'24 | Paper, Code | Executable Code Actions Elicit Better LLM Agents |
| 2020.04 | DPR | EMNLP'20 | Paper, Code | Dense Passage Retrieval for Open-Domain Question Answering |
| 2020.04 | ColBERT | SIGIR'20 | Paper, Code | ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT |
| 2021.12 | Contriever | TMLR'22 | Paper, Code | Unsupervised Dense Information Retrieval with Contrastive Learning |
| 2022.12 | e5 | arXiv | Paper, Code | Text Embeddings by Weakly-Supervised Contrastive Pre-training |
| 2021.07 | AlphaFold2 | Nature | Paper, Code | Highly Accurate Protein Structure Prediction with AlphaFold |
| 2023.03 | ESMFold | Science | Paper | Evolutionary-Scale Prediction of Atomic-Level Protein Structure with a Language Model |



T2: Agent-Supervised Tool Adaptation

| Time | Method | Venue | Links | Task(s) | Tool Backbone | Agent Backbone | Tuning |
|---|---|---|---|---|---|---|---|
| 2025.10 | QAgent | arXiv | Paper, Code | QA, RAG | Qwen2.5-3B | Qwen-7B | GRPO |
| 2025.10 | AgentFlow | arXiv | Paper, Code | Web Search, Planning, Reasoning, Math | Qwen2.5-7B | Qwen2.5-7B | Flow-GRPO |
| 2025.10 | Advisor Models | arXiv | Paper, Code | Math, Reasoning | Qwen2.5-7B, Qwen3-8B | GPT-4o-Mini, GPT-5, Claude4-Sonnet, GPT-4.1-Mini | GRPO |
| 2025.10 | AutoGraph-R1 | arXiv | Paper, Code | KG Construction, RAG | KG Constructor (Qwen2.5-3B/7B) | Frozen RAG Generator (Qwen2.5-7B) | GRPO |
| 2025.10 | MAE | arXiv | Paper, Code | Math, Coding, Commonsense Reasoning | Qwen2.5-3B | Qwen2.5-3B | REINFORCE++ |
| 2025.09 | Mem-α | arXiv | Paper, Code | Retrieval, Test-Time Learning, Long-Range Understanding | Qwen3-4B | Qwen3-4B, Qwen3-32B, GPT-4.1-Mini | GRPO |
| 2025.08 | AI-SearchPlanner | arXiv | Paper | Web QA | Qwen3-32b | Qwen2.5-7B | PPO |
| 2025.08 | Memento | arXiv | Paper, Code | Long-Horizon Reasoning, Web Research, QA, Academic Reasoning | Q-function (two-layer MLPs) | GPT-4.1 | Soft Q-Learning |
| 2025.08 | R-Zero | arXiv | Paper, Code | Math, Reasoning | Qwen3-4B, Qwen3-8B, OctoThinker-3B, OctoThinker-8B | Qwen3-4B, Qwen3-8B, OctoThinker-3B, OctoThinker-8B | GRPO |
| 2025.06 | Sysformer | arXiv | Paper | QA, RAG | Small Transformer | LLaMA-2-7B, LLaMA-3.1-8B, Mistral-7B, Phi-3.5-mini, Zephyr-7B-beta | Supervised Learning |
| 2025.05 | s3 | EMNLP'25 | Paper, Code | QA, RAG | Qwen2.5-7B | Qwen2.5-7B, Qwen2.5-14B, Claude-3-Haiku | PPO |
| 2024.10 | Matryoshka Pilot | NeurIPS'25 | Paper, Code | Math, Planning, Reasoning | LLaMA3-8B, Qwen2.5-7B | GPT-4o-Mini, GPT-3.5-Turbo | DPO, IDPO |
| 2024.06 | CoBB | EMNLP'24 | Paper, Code | QA, Math | Mistral-7b-inst-v2 | GPT-3.5-Turbo, Claude-3-Haiku, Phi-3-mini-4k-inst, Gemma-1.1-7B-it, Mistral-7B-inst-v2 | SFT, ORPO |
| 2024.05 | Medadapter | EMNLP'24 | Paper, Code | Medical QA, NLI, RQE | BERT-Base-Uncased | GPT-3.5-Turbo | SFT, BPO |
| 2024.03 | BLADE | AAAI'25 | Paper, Code | Domain-Specific QA | BLOOMZ-1b7 | ChatGPT, ChatGLM, Baichuan, Qwen | SFT, BPO |
| 2024.02 | ARL2 | ACL'24 | Paper, Code | QA | LLaMA2-7B | GPT-3.5-Turbo | Contrastive Learning |
| 2024.02 | EVOR | EMNLP'24 | Paper, Code | RAG-based Coding | GPT-3.5-Turbo | GPT-3.5-Turbo, CodeLLaMA | Prompt Engineering |
| 2024.02 | Bbox-Adapter | ICML'24 | Paper, Code | QA | DeBERTa-v3-base (0.1B), DeBERTa-v3-large (0.3B) | GPT-3.5-Turbo, Mixtral-8x7B | Contrastive Learning |
| 2024.01 | Proxy-Tuning | COLM'24 | Paper, Code | QA, Math, Code | LLaMA2-7B | LLaMA2-70B | Proxy-Tuning |
| 2024.01 | BGM | ACL'24 | Paper | QA, Personalized Generation (NQ, HotpotQA, Email, Book) | T5-XXL-11B | PaLM2-S | SFT, PPO |
| 2023.10 | RA-DIT | ICLR'24 | Paper | Knowledge-Intensive Tasks (MMLU, NQ, TQA, ELI5, HotpotQA, etc.) | DRAGON+ | LLaMA-65B | SFT, LSR |
| 2023.06 | LLM-R | EACL'24 | Paper, Code | Zero-shot NLU (Reading Comprehension, QA, NLI, Paraphrase, Sentiment, Summarization) | E5-base | GPT-Neo-2.7B, LLaMA-13B, GPT-3.5-Turbo | Contrastive Learning |
| 2023.05 | AAR | ACL'23 | Paper, Code | Zero-Shot Generalization (MMLU, PopQA) | ANCE, Contriever | Flan-T5-Small, InstructGPT | Contrastive Learning |
| 2023.05 | ToolkenGPT | NeurIPS'23 | Paper, Code | Numerical Reasoning, QA, Plan Generation | Token Embedding | GPT-J 6B, OPT-6.7B, OPT-13B | Proxy-Tuning |
| 2023.03 | UPRISE | EMNLP'23 | Paper, Code | Zero-shot NLU (Reading Comprehension, QA, NLI, Paraphrase, Sentiment, Summarization) | GPT-Neo-2.7B | BLOOM-7.1B, OPT-66B, GPT-3-175B | Contrastive Learning |
| 2023.01 | REPLUG | NAACL'24 | Paper, Code | QA | Contriever | GPT3-175B, PaLM, Codex, LLaMA-13B | Proxy-Tuning, LSR |

Citation

If you find this repository useful, please consider citing our survey:

@article{jiang2025adaptation,
  title={Adaptation of Agentic AI},
  author={Jiang, Pengcheng and Lin, Jiacheng and Shi, Zhiyi and Wang, Zifeng and He, Luxi and Wu, Yichen and Zhong, Ming and Song, Peiyang and Zhang, Qizheng and Wang, Heng and others},
  journal={arXiv preprint arXiv:2512.16301},
  year={2025}
}

Contributing

We welcome contributions! Please feel free to submit a Pull Request to add new papers or update existing entries.



(ノ◕ヮ◕)ノ*:・゚✧ Keep exploring the awesome world of agentic AI! ✧゚・: *ヽ(◕ヮ◕ヽ)
