Two approaches to optimal adaptive control under large dimensionality

doi:10.15406/iratj.2017.03.00062

eISSN: 2574-8092

International Robotics & Automation Journal

Mini Review Volume 3 Issue 4

Peter Lohmander

Optimal Solutions, Sweden

Correspondence: Peter Lohmander, Optimal Solutions, Sweden

Received: October 29, 2017 | Published: November 13, 2017

Citation: Lohmander P. Two approaches to optimal adaptive control under large dimensionality. Int Rob Auto J. 2017;3(4):328-330. DOI: 10.15406/iratj.2017.03.00062

Abstract

The ambition of this study is to develop general methods for optimization of stochastic and dynamic decision problems. Applications can be found in most sectors of the economy.

Keywords: optimal adaptive decisions, high dimensionality, optimization of adaptive control functions

Introduction

Theoretical understanding of the relevant problem structure and consistent mathematical modelling are necessary keys to formulating operations research models to be used for optimization of decisions in real applications. The numbers of alternative models, methods and applications of operations research are very large. This paper presents fundamental and general decision and information structures, theories and two examples that can be expanded and modified in several directions. The discussed methods and examples are motivated from the points of view of empirical relevance and computability.

SDP with LP or QP subroutines and oil sector application

Stochastic dynamic programming, SDP, is often the optimal method. SDP can be extended to handle very large dimensionality in the decision space, as long as the dimensionality of the state space is not too large, since SDP can be combined with linear or quadratic programming subroutines for every state and stage. A typical application of relevance to this method is described by the following problem: We want to optimize the many interdependent activities in an industrial sector, such as the oil sector, in a country. The international oil price, for instance the Brent crude price, is impossible to predict with high precision. It may be regarded as highly stochastic and of central importance to the optimal decisions in the sector. The amount of oil in the oil reserves, in the country, is another variable of fundamental importance. This can be controlled by decisions within the oil producing country.

In a case of this type, we may define the optimization problem as a stochastic dynamic programming problem where the international oil price is an exogenous state variable. The size of the oil reserve is also a state variable, which depends on the optimal, oil price dependent extraction decisions. Hence, the reserve is endogenously controlled, but its optimal time path depends on the path of the oil price. With only these two state variables in the SDP problem, both can be handled in high resolution. For instance, with 100 different price states and 100 different oil reserve states, we have 10 000 combinations to consider in every time period. That can easily be handled. In every time period, for every combination of oil price level and oil reserve level, all of the logistics problems, production problems and so on in the sector are solved. For this purpose, we define a linear or quadratic programming problem. With LP or QP problems, we know that the optimal solution will be obtained in a finite number of iterations. Note that it would, in principle, be possible to handle all of the detailed logistics problems directly in the SDP problem. Then, however, the number of dimensions would be very large and the computation time would become extremely long. For instance, if the number of “state variables” in the logistics problem is 1000 and we want 100 different levels in each dimension, then the number of state combinations in the “one period problems” would be $100^{1000} = 10^{2000}$. If we handled all of these logistics states directly in the SDP, then the SDP, also including 100 oil price levels and 100 oil reserve levels, would have $100^{1000} \cdot 100^{2} = 10^{2004}$ state combinations. Such problems are impossible to solve in reasonable time with SDP. Hence, SDP (with a few state dimensions) with LP or QP subroutines gives us the advantage of very much faster calculations.
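
The state counts quoted above can be checked with exact integer arithmetic. The short Python sketch below only restates the numbers given in the text (100 levels per dimension, 1000 logistics dimensions); the variable and function names are illustrative and do not come from the cited studies.

```python
# State-space sizes from the text, computed with exact integers.
levels = 100            # resolution per state dimension (from the text)
logistics_dims = 1000   # number of logistics "state variables" (from the text)

sdp_states = levels * levels                   # oil price x oil reserve
logistics_states = levels ** logistics_dims    # one-period logistics grid
full_states = logistics_states * sdp_states    # everything handled inside the SDP


def order_of_magnitude(n: int) -> int:
    """Return k such that 10**k <= n < 10**(k + 1), for a positive integer n."""
    return len(str(n)) - 1


print(sdp_states)                            # 10000
print(order_of_magnitude(logistics_states))  # 2000
print(order_of_magnitude(full_states))       # 2004
```

The two-variable SDP therefore visits 10 000 states per period, while the fully expanded problem would have a state count with more than two thousand digits.
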
Furthermore, it is a considerable advantage to be able to handle the logistics decisions and “logistics dimensions” via continuous variables. With stochastic dynamic programming as the general tool, the detailed production and/or logistics problems can be handled via the classical mathematical programming tools LP and QP. We can solve many such problems, as sub problems, within the general stochastic dynamic programming formulation: for each state and stage, we solve the relevant sub problems and include them in the stochastic dynamic programming recursion equation (1). Problems of this kind have been defined and numerically solved by Lohmander.^1–4

$f (t, s, m) = \max_{u \in U (t, s, m)} \left( \max_{x_{1}, ..., x_{K}} \left\{ π (x_{1}, ..., x_{K}; u, t, s, m) \; : \; α_{11} x_{1} + ... + α_{1 K} x_{K} \leq C_{1}, \; ..., \; α_{L 1} x_{1} + ... + α_{L K} x_{K} \leq C_{L} \right\} + \sum_{n} τ (n | m) f (t + 1, s - u, n) \right) \quad \forall (t, s, m) \mid 0 \leq t \leq T$ (1)

$f (t, s, m)$ is the expected present value at time $t$ if the oil reserve level is $s$ and the oil price level is $m$. $u$ is the oil extraction level decision, which is adapted to the oil price, and $U$ is the feasible set. In each period, QP or LP problems are solved. The objective function of these problems is $π (x_{1}, ..., x_{K}; u, t, s, m)$, where $π$ denotes the present value of profit and $x_{1}, ..., x_{K}$ are the logistics decision variables.
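
Recursion (1) can be sketched in a few lines of Python. The block below is a minimal illustration under stated assumptions, not the implementation used in the cited studies. The horizon, price states, transition probabilities, channel costs and capacities are all hypothetical, and a zero value after the last period is assumed as terminal condition. The inner problem is a deliberately tiny LP (allocating the extracted quantity $u$ across delivery channels) that a greedy rule solves exactly; a realistic sector model would call a general LP or QP solver at that point.

```python
# All numbers below are hypothetical and chosen only for illustration.
T = 3                                 # last period; stages are t = 0..T
PRICES = [30.0, 50.0, 70.0]           # oil price states, indexed by m
TAU = [[0.6, 0.3, 0.1],               # TAU[m][n] = tau(n | m)
       [0.2, 0.6, 0.2],
       [0.1, 0.3, 0.6]]
U_MAX = 2                             # largest extraction per period
DISCOUNT = 0.95
CHANNELS = [(20.0, 1.0), (45.0, 1.0), (60.0, 1.0)]   # (unit cost, capacity)


def inner_lp(u, t, m):
    """max sum_k (price - cost_k) * x_k  s.t.  sum_k x_k <= u, 0 <= x_k <= cap_k.

    This LP is a continuous knapsack with unit weights, so filling channels
    in order of decreasing margin is optimal (stand-in for an LP/QP solver).
    """
    remaining, profit = float(u), 0.0
    for cost, cap in sorted(CHANNELS):          # lowest cost = highest margin
        margin = PRICES[m] - cost
        if margin <= 0.0 or remaining <= 0.0:
            break
        x = min(cap, remaining)
        profit += margin * x
        remaining -= x
    return DISCOUNT ** t * profit               # present value of period profit


def solve_sdp(s_max):
    """Backward recursion over stages t, reserve levels s and price states m."""
    n_m = len(PRICES)
    # value[T + 1] holds the assumed zero terminal condition.
    value = [[[0.0] * n_m for _ in range(s_max + 1)] for _ in range(T + 2)]
    policy = [[[0] * n_m for _ in range(s_max + 1)] for _ in range(T + 1)]
    for t in range(T, -1, -1):
        for s in range(s_max + 1):
            for m in range(n_m):
                best_value, best_u = float("-inf"), 0
                for u in range(min(s, U_MAX) + 1):      # feasible set U(t, s, m)
                    future = sum(TAU[m][n] * value[t + 1][s - u][n]
                                 for n in range(n_m))
                    total = inner_lp(u, t, m) + future
                    if total > best_value:
                        best_value, best_u = total, u
                value[t][s][m] = best_value
                policy[t][s][m] = best_u
    return value, policy


value, policy = solve_sdp(s_max=4)
print(policy[0][4])   # extraction decision at t = 0, s = 4, for each price state
```

Only the inner subroutine sees the detailed decision variables, so enlarging the LP adds work per state but does not enlarge the state space that the outer loops visit.
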

$α_{11} x_{1} + ... + α_{1 K} x_{K} \leq C_{1}, ..., α_{L 1} x_{1} + ... + α_{L K} x_{K} \leq C_{L}$ denote the linear constraints in the LP or QP problems. $τ (n | m)$ is the probability of transition from oil price state $m$ to oil price state $n$ in the following period. The SDP problem is defined with a finite horizon, $T$. It is solved via backward recursion. More details concerning this particular type of application are found in Lohmander.^4,5

Observe that (1) represents a very general and flexible way to formulate and solve applied stochastic multi period production and logistics problems of many kinds. The true sequential nature of decisions and information is explicitly handled, and stochastic market prices and very large numbers of decision variables and constraints may be consistently considered. Furthermore, many other stochastic phenomena may be consistently handled with this approach. Several examples of how different kinds of stochastic disturbances may be included in optimal dynamic decisions are found in Lohmander et al.^5–8

Adaptive control function optimization and typical application

When the number of decision variables is very large and the optimal decisions are dependent on detailed information in a state space of large dimensionality, SDP cannot be applied with reasonable computation time. Consider the following typical application problem: We want to optimize the management of a forest area with many trees. We have one hectare with 1000 trees. The dynamic development of each tree (diameter increment, height increment and so on) is dependent on the properties of the individual tree and the local competition (which is a function of the properties of the neighbour trees within a circle with radius ten meters, surrounding the individual tree). The optimal harvest decisions are price dependent and the price is exogenous and stochastic.
We instantly realize that, in order to follow the development of the forest, we have to keep track of the properties of every tree. In principle, we could handle this problem via SDP. The number of possible state combinations is, however, extremely high, so SDP cannot be used with reasonable computation time. Instead, optimal control functions for local decisions may be defined, and their parameters can be determined via stochastic full system simulation and multidimensional regression analysis (Figure 1). Lohmander^5,8 presents an approach to determining all local decisions from locally relevant state space information within stochastic, dynamic and spatially explicit production. The expected present value of all harvests, over time and space, in a forest area, is maximized. Each tree is affected by competition from neighbour trees. The harvest decisions, for each tree, are functions of the price in the stochastic market, the dimensions and qualities of the individual trees and the local competition. The expected present value of the forest, when optimal adaptive decisions are taken, is an increasing function of the level of price risk.
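
One possible reading of this procedure is sketched below in Python. It is a toy illustration, not the model of the cited studies. The stand (20 trees on a line), the growth rule, the price distribution, the forced harvest in the last period and the harvest rule "harvest tree $i$ when price $\geq θ_{0} - θ_{1} \cdot$ competition" are all hypothetical assumptions. The quadratic response surface fitted to the simulated expected present values is likewise only an assumed form of the regression step.

```python
import random

# Hypothetical toy stand: every number here is an illustrative assumption.
N_TREES, PERIODS, DISCOUNT, GROWTH = 20, 10, 0.97, 2.0


def simulate(theta, seed):
    """Present value of one simulated price path under control parameters theta."""
    rng = random.Random(seed)
    diameter = [10.0] * N_TREES                  # None marks a harvested tree
    total = 0.0
    for t in range(PERIODS):
        price = rng.uniform(0.5, 1.5)            # exogenous stochastic price
        standing = [d is not None for d in diameter]
        for i in range(N_TREES):
            if not standing[i]:
                continue
            # Local competition: number of standing neighbours (0, 1 or 2).
            comp = sum(standing[j] for j in (i - 1, i + 1) if 0 <= j < N_TREES)
            if price >= theta[0] - theta[1] * comp or t == PERIODS - 1:
                total += DISCOUNT ** t * price * diameter[i] ** 2
                diameter[i] = None
            else:
                diameter[i] += GROWTH / (1 + comp)
    return total


def expected_value(theta, n_paths=100):
    """Average over price paths; the same seeds are reused for every theta."""
    return sum(simulate(theta, seed) for seed in range(n_paths)) / n_paths


def features(theta):
    a, b = theta
    return [1.0, a, b, a * a, b * b, a * b]      # quadratic response surface


def least_squares(X, y):
    """Solve the normal equations (X'X) beta = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))   # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [v - factor * w for v, w in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


# 1. Simulate the full system on a coarse grid of control parameters.
grid = [(a / 10, b / 10) for a in range(6, 17, 2) for b in range(0, 4)]
y = [expected_value(th) for th in grid]
# 2. Regress simulated expected present value on the parameters.
beta = least_squares([features(th) for th in grid], y)


def fitted(theta):
    return sum(c * x for c, x in zip(beta, features(theta)))


# 3. Pick the parameters that maximize the fitted surface within the sampled range.
fine = [(0.6 + i * 0.02, j * 0.01) for i in range(51) for j in range(31)]
best_theta = max(fine, key=fitted)
print(best_theta)
```

The decision rule uses only the price and each tree's local competition, so the number of parameters to optimize stays at two however many trees the stand contains.
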

Figure 1 Adaptive control function parameter optimization when the dimensionality of the state space is too high for stochastic dynamic programming.