PapersFlow Research Brief

Physical Sciences · Computer Science

Adaptive Dynamic Programming Control
Research Guide

What is Adaptive Dynamic Programming Control?

Adaptive Dynamic Programming Control applies adaptive dynamic programming (ADP) and reinforcement learning techniques to optimal control problems in continuous-time nonlinear systems. Its core tools are neural networks, policy iteration, actor-critic algorithms, and H∞ formulations, which together enable online learning and feedback control.

The field encompasses 9,978 works focused on neural networks, policy iteration, actor-critic algorithms, and H∞ control applied to continuous-time nonlinear systems. These methods enable online learning and feedback control in domains including robotics, energy management, and multi-agent systems. Actor-critic approaches, such as the one in 'Continuous control with deep reinforcement learning' (Lillicrap et al., 2015; 5,352 citations), extend deep Q-learning to continuous action spaces.

Topic Hierarchy

Physical Sciences → Computer Science → Computational Theory and Mathematics → Adaptive Dynamic Programming Control
10.0K Papers · 5yr Growth: N/A · 175.9K Total Citations

Research Sub-Topics

Actor-Critic Algorithms for Continuous-Time ADP

This sub-topic develops actor-critic neural network architectures for solving Hamilton-Jacobi-Bellman equations in continuous-time nonlinear optimal control problems. Researchers focus on convergence guarantees, approximation errors, and real-time implementation.

15 papers
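The continuous-time HJB machinery these works build on can be made concrete in the scalar linear-quadratic case, where the HJB equation collapses to an algebraic Riccati equation and policy iteration (Kleinman's algorithm) alternates an exact policy evaluation with a policy-improvement step. The sketch below is illustrative only; the scalar setting and all numbers are invented, not drawn from any cited paper.

```python
# Policy iteration (Kleinman's algorithm) for a scalar continuous-time
# LQR problem: xdot = a*x + b*u, cost J = ∫ (q*x^2 + r*u^2) dt.
# The HJB equation reduces to the Riccati equation 2*a*p - (b*p)^2/r + q = 0,
# with value function V(x) = p*x^2 and optimal gain k* = b*p/r.
# All numbers are illustrative.

import math

a, b, q, r = 1.0, 1.0, 1.0, 1.0

def policy_iteration(k, iters=50):
    """Alternate policy evaluation (scalar Lyapunov solve) and improvement."""
    for _ in range(iters):
        a_cl = a - b * k                      # closed loop under u = -k*x
        assert a_cl < 0, "initial gain must be stabilizing"
        p = (q + r * k * k) / (-2 * a_cl)     # evaluate: 2*a_cl*p + q + r*k^2 = 0
        k = b * p / r                         # improve: minimize the Hamiltonian
    return k, p

k_star, p_star = policy_iteration(k=2.0)
p_exact = 1.0 + math.sqrt(2.0)                # analytic Riccati root for these numbers
print(round(k_star, 4), round(p_exact, 4))    # both ≈ 2.4142
```

The neural-network actor-critic schemes in this sub-topic replace the exact Lyapunov solve with a critic approximator and the gain formula with an actor approximator.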

Policy Iteration in Adaptive Dynamic Programming

Research explores iterative policy improvement and evaluation schemes using neural networks for infinite-horizon optimal control in continuous-time systems. Studies analyze stability, value function convergence, and handling of input constraints.

15 papers
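The evaluation/improvement loop is easiest to see on a tiny finite MDP; the continuous-time neural-network versions studied here replace the exact solves below with function approximation. The MDP itself (transitions, rewards, discount) is invented for illustration.

```python
# Policy iteration on a tiny 2-state, 2-action MDP: alternate iterative
# policy evaluation with greedy policy improvement until the policy is stable.

gamma = 0.9
T = [[0, 1], [0, 1]]            # T[s][a] = deterministic next state
R = [[0.0, 1.0], [2.0, 0.0]]    # R[s][a] = immediate reward

def evaluate(policy, sweeps=500):
    """Iterative policy evaluation: repeated Bellman backups under a fixed policy."""
    V = [0.0, 0.0]
    for _ in range(sweeps):
        V = [R[s][policy[s]] + gamma * V[T[s][policy[s]]] for s in range(2)]
    return V

def improve(V):
    """Greedy improvement with respect to the current value estimate."""
    return [max(range(2), key=lambda a: R[s][a] + gamma * V[T[s][a]])
            for s in range(2)]

policy = [0, 0]
for _ in range(10):
    V = evaluate(policy)
    new_policy = improve(V)
    if new_policy == policy:     # fixed point reached: policy is optimal
        break
    policy = new_policy
print(policy, [round(v, 2) for v in V])
```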

H∞ Control via Adaptive Dynamic Programming

This area integrates ADP with H∞ control frameworks to achieve robust optimal performance against worst-case disturbances in nonlinear systems. Work includes game-theoretic formulations, neural approximators, and multi-objective regulation.

14 papers
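The game-theoretic formulation can be sketched in the scalar linear case, where the H∞ problem reduces to a game algebraic Riccati equation with a minimizing controller and a maximizing disturbance. ADP methods approximate this solution online; here it is solved directly for transparency, and all parameters are invented for illustration.

```python
# Scalar H∞-style zero-sum game: xdot = a*x + b*u + d*w, with cost
# ∫ (q*x^2 + r*u^2 - gamma^2 * w^2) dt.  The game algebraic Riccati equation is
#   2*a*p + q - (b^2/r)*p^2 + (d^2/gamma^2)*p^2 = 0.
# Parameters are illustrative only.

import math

a, b, d, q, r, gamma = 0.0, 1.0, 1.0, 1.0, 1.0, 2.0

# Quadratic in p: A*p^2 + B*p + C = 0
A = d * d / gamma ** 2 - b * b / r
B = 2.0 * a
C = q
p = (-B - math.sqrt(B * B - 4 * A * C)) / (2 * A)   # positive root (A < 0)

k_u = b * p / r              # minimizing controller gain: u = -k_u * x
k_w = d * p / gamma ** 2     # worst-case disturbance gain: w = k_w * x

# Closed loop under the saddle point must be stable:
a_cl = a - b * k_u + d * k_w
print(round(p, 4), a_cl < 0)
```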

Neural Network Approximators in ADP

Investigations center on deep neural networks, radial basis functions, and sigmoid-weighted units for universal function approximation of value functions and policies in ADP schemes. Researchers study generalization, curse-of-dimensionality mitigation, and training stability.

15 papers
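As a minimal illustration of radial-basis-function value approximation, the sketch below fits a Gaussian RBF network to a known quadratic value function by stochastic gradient descent. Centers, width, and step size are invented, and real ADP schemes train against Bellman residuals rather than a known target.

```python
# Fit a Gaussian RBF network to V(x) = x^2 on [-1, 1] by stochastic
# gradient descent on the squared error (pure-Python illustrative setup).

import math

centers = [-1.0, -0.5, 0.0, 0.5, 1.0]   # RBF centers across the state range
width = 0.5
w = [0.0] * len(centers)                 # trainable output weights

def rbf(x):
    """RBF network output: weighted sum of Gaussian bumps."""
    return sum(wi * math.exp(-((x - c) / width) ** 2)
               for wi, c in zip(w, centers))

xs = [i / 10.0 for i in range(-10, 11)]  # training grid on [-1, 1]

for _ in range(5000):                    # SGD epochs over the grid
    for x in xs:
        err = rbf(x) - x * x             # target value function: V(x) = x^2
        for j, c in enumerate(centers):
            w[j] -= 0.1 * err * math.exp(-((x - c) / width) ** 2)

max_err = max(abs(rbf(x) - x * x) for x in xs)
print(round(max_err, 4))                 # residual on the training grid
```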

Multi-Agent Adaptive Dynamic Programming

This sub-topic addresses decentralized ADP for cooperative and competitive multi-agent systems, including mean-field approximations and distributed policy optimization. Applications span swarm robotics, power grids, and traffic networks.

15 papers
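A minimal sketch of the consensus-plus-gradient pattern underlying distributed policy optimization: each agent averages its parameter estimate with its ring neighbors, then takes a local gradient step, and the network converges near the global optimum. The topology, local costs, and step size are invented for illustration.

```python
# Decentralized optimization on a 4-agent ring: agent i holds a local
# quadratic cost f_i(t) = (t - c_i)^2.  Each round: (1) consensus averaging
# with doubly stochastic weights, (2) a local gradient step.

c = [0.0, 1.0, 2.0, 3.0]      # local minimizers; global optimum = mean = 1.5
theta = [0.0] * 4             # each agent's parameter estimate
alpha = 0.05

for _ in range(300):
    # consensus step: ring topology, weights (0.25, 0.5, 0.25)
    avg = [0.5 * theta[i] + 0.25 * theta[i - 1] + 0.25 * theta[(i + 1) % 4]
           for i in range(4)]
    # local gradient step: grad f_i = 2*(theta_i - c_i)
    theta = [avg[i] - alpha * 2 * (avg[i] - c[i]) for i in range(4)]

print([round(t, 3) for t in theta])   # all estimates cluster around 1.5
```

With a fixed step size the agents agree on the network average exactly but keep a small residual spread around it, which shrinks as the step size decreases.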

Why It Matters

Adaptive Dynamic Programming Control provides solutions for optimal control in complex systems such as robots and multi-agent networks. Lillicrap et al. (2015), in 'Continuous control with deep reinforcement learning' (5,352 citations), introduced an actor-critic algorithm that operates over continuous action spaces and succeeds on simulated robotic tasks. Vamvoudakis and Lewis (2010) developed an online actor-critic algorithm for continuous-time infinite-horizon optimal control problems, applicable to energy management systems (1,560 citations). Sutton et al. (1999) advanced policy gradient methods with function approximation, enabling scalable reinforcement learning, while in adaptive control Bechlioulis and Rovithakis (2008) guaranteed prescribed tracking performance for feedback linearizable MIMO nonlinear systems (2,430 citations).

Reading Guide

Where to Start

Start with 'Continuous control with deep reinforcement learning' by Lillicrap et al. (2015): it provides an accessible introduction to actor-critic methods extended from deep Q-learning to continuous actions, with a clear algorithm description, and its 5,352 citations reflect its influence.

Key Papers Explained

Sutton et al. (1999), in 'Policy Gradient Methods for Reinforcement Learning with Function Approximation', established theoretical foundations for direct policy optimization (4,951 citations), which Konda and Tsitsiklis (2002) built upon in 'Actor-critic algorithms' by analyzing two-time-scale convergence (1,811 citations). Silver et al. (2014) derived deterministic policy gradients in 'Deterministic policy gradient algorithms' (1,738 citations), which Lillicrap et al. (2015) combined with deep networks in 'Continuous control with deep reinforcement learning' (5,352 citations). Vamvoudakis and Lewis (2010) adapted the actor-critic architecture to continuous-time optimal control in 'Online actor–critic algorithm to solve the continuous-time infinite horizon optimal control problem' (1,560 citations).

Paper Timeline

  • 1991 · Systematic design of adaptive co… · 1.9K cites
  • 1999 · Policy Gradient Methods for Reinforcement Learning with Function Approximation · 5.0K cites
  • 2002 · Actor-critic algorithms · 1.8K cites
  • 2008 · Robust Adaptive Control of Feedb… · 2.4K cites
  • 2014 · Deterministic policy gradient algorithms · 1.7K cites
  • 2015 · Continuous control with deep reinforcement learning · 5.4K cites (most cited)
  • 2018 · Sigmoid-weighted linear units fo… · 1.7K cites

Papers ordered chronologically; the most-cited paper is marked.

Advanced Directions

Research emphasizes robust adaptations for MIMO nonlinear systems, as in Bechlioulis and Rovithakis (2008) guaranteeing prescribed performance (2,430 citations), and multi-agent extensions like Foerster et al. (2018) (1,537 citations). No recent preprints available.

Papers at a Glance

| # | Paper | Year | Venue | Citations | Open Access |
|---|-------|------|-------|-----------|-------------|
| 1 | Continuous control with deep reinforcement learning | 2015 | arXiv (Cornell Univers… | 5.4K | |
| 2 | Policy Gradient Methods for Reinforcement Learning with Function Approximation | 1999 | | 5.0K | |
| 3 | Robust Adaptive Control of Feedback Linearizable MIMO Nonlinea… | 2008 | IEEE Transactions on A… | 2.4K | |
| 4 | Systematic design of adaptive controllers for feedback lineari… | 1991 | IEEE Transactions on A… | 1.9K | |
| 5 | Actor-critic algorithms | 2002 | DSpace@MIT (Massachuse… | 1.8K | |
| 6 | Deterministic policy gradient algorithms | 2014 | HAL (Le Centre pour la… | 1.7K | |
| 7 | Sigmoid-weighted linear units for neural network function appr… | 2018 | Neural Networks | 1.7K | |
| 8 | Self-improving reactive agents based on reinforcement learning… | 1992 | Machine Learning | 1.6K | |
| 9 | Online actor–critic algorithm to solve the continuous-time infinite horizon optimal control problem | 2010 | Automatica | 1.6K | |
| 10 | Counterfactual Multi-Agent Policy Gradients | 2018 | Proceedings of the AAA… | 1.5K | |

Frequently Asked Questions

What is an actor-critic algorithm in Adaptive Dynamic Programming Control?

Actor-critic algorithms use two components: the actor updates the policy in the gradient direction of the expected return, while the critic estimates the value function using temporal difference learning. Konda and Tsitsiklis (2002) analyzed these two-time-scale algorithms with linear function approximation in 'Actor-critic algorithms', establishing convergence guarantees (1,811 citations). Such methods apply to continuous-time nonlinear systems for online optimal control.
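A minimal sketch of this two-component loop on a one-state, two-action problem, with invented rewards and step sizes (not the linear-approximation setting Konda and Tsitsiklis analyze):

```python
# Minimal actor-critic: the critic keeps a running value estimate (TD-style
# baseline), the actor adjusts softmax action preferences along the policy
# gradient scaled by the TD error.

import math, random

random.seed(0)
rewards = [0.0, 1.0]          # deterministic reward per action
h = [0.0, 0.0]                # actor: action preferences
V = 0.0                       # critic: value estimate
lr_actor, lr_critic = 0.1, 0.1

def softmax(h):
    e = [math.exp(x) for x in h]
    s = sum(e)
    return [x / s for x in e]

for _ in range(2000):
    pi = softmax(h)
    a = 0 if random.random() < pi[0] else 1
    r = rewards[a]
    delta = r - V                       # TD error (one step, no next state)
    V += lr_critic * delta              # critic update
    for j in range(2):                  # actor update: grad log pi = e_a - pi
        h[j] += lr_actor * delta * ((1.0 if j == a else 0.0) - pi[j])

pi = softmax(h)
print([round(p, 3) for p in pi])        # heavily favors the rewarding action
```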

How do policy gradient methods work in this field?

Policy gradient methods directly parameterize the policy and optimize it by gradient ascent on expected returns, rather than deriving the policy from an approximate value function. Sutton et al. (1999), in 'Policy Gradient Methods for Reinforcement Learning with Function Approximation', proved convergence for these methods (4,951 citations). They naturally handle continuous action spaces in nonlinear control problems.
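The estimator these methods rest on can be checked numerically: for a softmax policy with fixed rewards, the Monte Carlo score-function estimate of E[r · ∇ log π(a)] should match the analytic gradient of the expected return. The two-action setup below is invented for illustration.

```python
# Numerical check of the score-function (policy-gradient) identity
# grad_theta E[r] = E[ r * grad_theta log pi(a) ] for a softmax policy.

import math, random

random.seed(1)
rewards = [0.0, 1.0]
h = [0.2, -0.1]                                   # arbitrary policy parameters
e = [math.exp(x) for x in h]
pi = [x / sum(e) for x in e]

# Analytic gradient of E[r] = sum_a pi_a * r_a w.r.t. each preference h_k:
Er = sum(p * r for p, r in zip(pi, rewards))
analytic = [pi[k] * (rewards[k] - Er) for k in range(2)]

# Monte Carlo score-function estimate: average of r * (1[a=k] - pi_k)
N = 100_000
est = [0.0, 0.0]
for _ in range(N):
    a = 0 if random.random() < pi[0] else 1
    r = rewards[a]
    for k in range(2):
        est[k] += r * ((1.0 if k == a else 0.0) - pi[k]) / N

print([round(x, 4) for x in analytic],
      [round(x, 4) for x in est])                 # the two should agree closely
```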

What role do neural networks play in Adaptive Dynamic Programming Control?

Neural networks serve as function approximators for policies and value functions in high-dimensional continuous-time systems. Lillicrap et al. (2015) used deep neural networks in an actor-critic setup for continuous control tasks in 'Continuous control with deep reinforcement learning', earning 5,352 citations. Elfwing et al. (2018) introduced sigmoid-weighted linear units to improve approximation in reinforcement learning with 1,728 citations.
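The sigmoid-weighted linear unit from Elfwing et al. (2018) is simple to state: silu(x) = x · sigmoid(x), a smooth activation whose derivative follows from the product rule. The sketch below implements the formula and checks the gradient numerically.

```python
# SiLU activation: silu(x) = x * sigmoid(x).  Unlike ReLU it is smooth
# and non-monotonic for negative inputs.

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def silu(x):
    return x * sigmoid(x)

def silu_grad(x):
    """d/dx silu(x) = sigmoid(x) * (1 + x * (1 - sigmoid(x)))."""
    s = sigmoid(x)
    return s * (1.0 + x * (1.0 - s))

print(silu(0.0), round(silu(1.0), 4), round(silu(-1.0), 4))
```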

How is optimal control solved online in continuous-time systems?

Online actor-critic algorithms iteratively update policies and value functions using real-time data without prior system models. Vamvoudakis and Lewis (2010) proposed such an algorithm for the continuous-time infinite horizon optimal control problem in 'Online actor–critic algorithm to solve the continuous-time infinite horizon optimal control problem', with 1,560 citations. It relies on policy iteration adapted for neural network implementations.
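A drastically simplified sketch of the idea, assuming a scalar linear system with known dynamics and a quadratic critic tuned by descending the squared HJB residual. The actual algorithm in Vamvoudakis and Lewis (2010) tunes separate actor and critic approximators from measured trajectory data; this shows only the residual-driven weight update.

```python
# Online critic tuning sketch: for xdot = a*x + b*u with cost q*x^2 + r*u^2,
# a quadratic critic V(x) = p_hat*x^2 is tuned by gradient descent on the
# squared HJB residual, with the policy implied by u = -(b*p_hat/r)*x.
# All numbers are invented for illustration.

a, b, q, r = 0.5, 1.0, 1.0, 1.0
p_hat = 3.0                               # initial critic weight
alpha = 0.01

for _ in range(2000):
    x = 1.0                               # exciting state sample (held fixed here)
    u = -(b * p_hat / r) * x              # current policy from the critic
    # HJB residual: q*x^2 + r*u^2 + dV/dx * (a*x + b*u)
    e = q * x**2 + r * u**2 + 2 * p_hat * x * (a * x + b * u)
    de_dp = x**2 * (2 * a - 2 * b**2 * p_hat / r)   # residual gradient
    p_hat -= alpha * e * de_dp            # descend the squared residual

print(round(p_hat, 4))                    # ≈ 1.618, the positive Riccati root
```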

What applications exist in multi-agent systems?

Counterfactual policy gradients enable decentralized learning in cooperative multi-agent environments like network routing. Foerster et al. (2018) developed these gradients in 'Counterfactual Multi-Agent Policy Gradients' for efficient policy learning, cited 1,537 times. The method addresses non-stationarity in multi-agent reinforcement learning for continuous control.
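The counterfactual baseline at the heart of the method marginalizes one agent's action out of a joint action-value while holding the other agents' actions fixed. A minimal sketch, assuming an invented 2-agent payoff matrix and policy:

```python
# Counterfactual baseline in the style of COMA (Foerster et al., 2018):
# agent 1's advantage compares the joint value of its chosen action with the
# expectation over its own alternatives, keeping agent 2's action fixed.

# Joint action-value Q[u1][u2] for two agents with two actions each
Q = [[0.0, 1.0],
     [1.0, 3.0]]
pi1 = [0.5, 0.5]              # agent 1's current policy

def counterfactual_advantage_1(u1, u2):
    """Advantage of agent 1's action u1, holding agent 2's action u2 fixed."""
    baseline = sum(pi1[a] * Q[a][u2] for a in range(2))
    return Q[u1][u2] - baseline

# With agent 2 playing action 1, agent 1's action 1 beats its baseline:
adv = counterfactual_advantage_1(1, 1)
print(adv)    # 3.0 - (0.5*1.0 + 0.5*3.0) = 1.0
```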

Open Research Questions

  • How can actor-critic algorithms guarantee stability in partially observable continuous-time nonlinear systems?
  • What convergence rates do deterministic policy gradients achieve in high-dimensional robotic control tasks?
  • How do H∞ control integrations with adaptive dynamic programming handle worst-case disturbances in multi-agent systems?
  • Which function approximation architectures minimize bias in policy iteration for infinite-horizon optimal control?
  • How can online learning scale to real-time energy management in large-scale nonlinear networks?

Research Adaptive Dynamic Programming Control with AI

PapersFlow provides specialized AI tools for Computer Science researchers working on this topic.

See how researchers in Computer Science & AI use PapersFlow

Field-specific workflows, example queries, and use cases.

Computer Science & AI Guide

Start Researching Adaptive Dynamic Programming Control with AI

Search 474M+ papers, run AI-powered literature reviews, and write with integrated citations — all in one workspace.

See how PapersFlow works for Computer Science researchers