
arxiv: 1803.07696 · v4 · pith:35CK6ABJnew · submitted 2018-03-21 · 💻 cs.RO · cs.SY · eess.SY

Inverse Optimal Control from Incomplete Trajectory Observations

classification 💻 cs.RO · cs.SY · eess.SY
keywords features · trajectory · matrix · observations · recovery · weights · control · incomplete

This article develops a methodology that enables learning an objective function of an optimal control system from incomplete trajectory observations. The objective function is assumed to be a weighted sum of features (or basis functions) with unknown weights, and the observed data is a segment of a trajectory of system states and inputs. The proposed technique introduces the concept of the recovery matrix to establish the relationship between any available segment of the trajectory and the weights of given candidate features. The rank of the recovery matrix indicates whether a subset of relevant features can be found among the candidate features and the corresponding weights can be learned from the segment data. The recovery matrix can be obtained iteratively and its rank non-decreasing property shows that additional observations may contribute to the objective learning. Based on the recovery matrix, a method for using incomplete trajectory observations to learn the weights of selected features is established, and an incremental inverse optimal control algorithm is developed by automatically finding the minimal required observation. The effectiveness of the proposed method is demonstrated on a linear quadratic regulator system and a simulated robot manipulator.
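The rank test described in the abstract can be illustrated with a minimal sketch. It is not the paper's construction of the recovery matrix: the matrix `H` below is built synthetically so that an assumed weight vector `w_true` lies in its kernel, and the function names `kernel_weights` and `minimal_observation` are hypothetical. The sketch assumes the simplified setting where the weights are the only unknowns and are determined up to scale once the kernel of `H` is one-dimensional. Rows are appended one observation at a time, the rank never decreases, and the loop stops at the first observation count for which the weights can be read off.

```python
import numpy as np

def kernel_weights(H, tol=1e-8):
    """Return (rank, w); w is a unit kernel vector of H if the kernel is 1-D, else None."""
    _, s, vt = np.linalg.svd(H)  # default full_matrices=True, so vt is n x n
    rank = int(np.sum(s > tol * s[0])) if s.size and s[0] > 0 else 0
    if rank != H.shape[1] - 1:
        return rank, None
    w = vt[-1]                                 # right singular vector spanning the kernel
    w = w * np.sign(w[np.argmax(np.abs(w))])   # fix the sign ambiguity
    return rank, w / np.linalg.norm(w)

def minimal_observation(rows, tol=1e-8):
    """Append rows one by one; stop at the first count where the weights are recoverable."""
    H = np.empty((0, rows.shape[1]))
    ranks = []
    for k, row in enumerate(rows, start=1):
        H = np.vstack([H, row])
        rank, w = kernel_weights(H, tol)
        ranks.append(rank)
        if w is not None:
            return k, ranks, w
    return None, ranks, None

# Synthetic data (an assumption for illustration): rows orthogonal to w_true.
rng = np.random.default_rng(0)
w_true = np.array([1.0, 2.0, 0.5])
raw = rng.standard_normal((5, 3))
rows = raw - np.outer(raw @ w_true, w_true) / (w_true @ w_true)

k, ranks, w = minimal_observation(rows)
print(k, ranks, w)
```

The recovered `w` is compared with `w_true` only up to scale, since a kernel vector carries no information about magnitude; here both are normalized to unit length.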



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. From open-loop representations to closed-loop feedback implementations in differential games: A numerical case study

    eess.SY · 2026-05 · unverdicted · novelty 5.0

    Neural networks trained on open-loop data can approximate feedback strategies for a specific surveillance-evasion game, with analysis of sample-and-hold effects due to strategy discontinuities.