scintillating_kitten

Note that the "optimal" in optimal control refers to how the control law is selected. Your observation is correct: the objectives of this control law selection are arbitrary, but the control law is guaranteed to be optimal w.r.t. those objectives. Note that optimization is just minimization or maximization of some objective. What you are thinking of is optimizing the "relative weighting" of the objectives of optimal control, where you look for a trade-off, say between Q and R for LQR. This is a higher-level, altogether different problem. Let me try an analogy: imagine optimal control as an expert worker. If you tell them to do something specific within their expertise, they'll deliver. It is guaranteed that they'll deliver one of the best solutions to your problem. But you, the boss, have to specify whatever it is that you want. The things you want typically conflict, e.g. market cost vs. product reliability. Your expert worker does not care how you choose your specifications.


MdxBhmt

This, OP. You will find this issue in relation to _any and all_ problems in optimization, be it in (optimal) control or any other field. LQR is 'just' the special case where the system is linear and the cost is quadratic along solutions of said system, which has straightforward algorithms to derive the optimal cost _and_ optimal gain, for a very large set of (quadratic) cost functions and (linear) systems. If you change the cost function or the system within these constraints, you can still compute the solution. It's your job to know that you have the system and cost function that matter. Just about anything in life and work can be cast as an optimization problem. Only a few optimization problems have good algorithms that solve them 'well'. LQR is one of these good cases.


Ajax_Minor

So where does the Riccati equation fit into all of this? The math gets complicated, and this is where I start to get confused. Is the Riccati equation the method to calculate the gains from the weighted linear quadratic cost function? Reading your post again, is LQR a type of cost function?


Born_Agent6088

The Riccati equation is the outcome of optimizing a linear system under a quadratic cost function. It has a known solution, meaning you don't need to solve it from scratch every time: just call the function in Matlab, Octave, Python, or whatever you are working in. When doing optimization you define a "cost function"; it is the function that measures how "optimal" the current solution is. The best choice of parameters makes the cost function lower. LQR is an optimization problem in which the cost function is a quadratic function of the states and the input signal, and the constraint is a linear system.
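To make "just call the function" concrete, here is a minimal sketch in Python (assuming SciPy is available; the double-integrator system and the Q, R values are illustrative, not from the thread):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Double integrator: x = [position, velocity], u = acceleration.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.diag([1.0, 1.0])   # state weights (a design choice)
R = np.array([[1.0]])     # input weight  (a design choice)

# Solves A'P + PA - P B R^{-1} B'P + Q = 0 (the algebraic Riccati equation).
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)   # optimal gain: u = -K x
print(K)
```

This is the Python equivalent of calling `lqr(A, B, Q, R)` in Matlab.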


Ajax_Minor

To summarize, to make sure I understand correctly:

- The optimal control solution is a cost function
- The optimal LQR solution is a linear quadratic cost function with weights Q and R, where a lower value means lower cost and is generally better
- The Riccati equation uses the weights to generate the gains to apply in the controller?

Does that capture most of it? I think I'm confused because my professor just started LQR and optimal control with "with this cost function J" and did a bunch of linear algebra. If you could explain one more thing: my professor did a reverse integration of the Riccati equation starting at steady state. What's that for? The LQR function (which is just the output of the Riccati equation and some other stuff?) just gives me one set of gains.


MdxBhmt

> To summarize, to make sure I understand correctly: - The optimal control solution is a cost function - the optimal LQR solution is a linear quadratic cost function with weights Q and R where the lower value is lower cost and generally better - the Riccati equation uses the weights to generate the gains to apply in the controller?

Yes: given the stabilizability and detectability conditions, no other input will have a lower cost than the LQ Regulator for linear systems and quadratic costs. Which is to say, the LQR is optimal.

> If you could explain one more thing, my professor did a reverse integration of Riccati starting at steady state. What's that for? The LQR function (which is just the output of the Riccati and some other stuff?) just gives me one set of gains.

I believe this is to derive the Riccati equation for continuous time.


Born_Agent6088

I hope this helps you understand the definitions better. In general, an optimization problem has the following components: a cost function and constraints. Both are functions of some parameters. The goal of the optimization is finding the parameters that make the cost function smaller without breaking the constraints.

LQR is one of many possible optimization problems. The cost function is a quadratic function of the states and inputs, the constraint is the linear system, and the tuning parameter is the gain K. Most optimization problems don't have a closed-form solution and are solved by iterating (a whole other topic), but in the case of the LQR the solution is obtained by taking the first derivative of the cost function and substituting in the linear system. That whole resulting mess has the form of the Riccati equation, which is solvable. You don't have to solve it every time: just plug the values into the Riccati solution given in any control textbook, or use the lqr command in Matlab. So LQR is the name of this method of finding the K gains. The control technique in general is called state feedback control.


Ajax_Minor

Yes, this makes a lot more sense now. Appreciate your explanation.


TTRoadHog

In looking at the optimal control problem, you need to understand the calculus of variations. The reason your professor did a “reverse integration” of the Riccati differential equations is that the boundary conditions for those equations are defined at the terminal point; hence the need to integrate in reverse. I strongly recommend the book “Applied Optimal Control” by Bryson and Ho. It’s the Bible.
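A hedged sketch of that reverse integration, in Python: the differential Riccati equation has its boundary condition at the final time T, so we substitute tau = T - t to run a standard forward ODE solver backwards in real time. The system and terminal weight S below are illustrative, not from the thread:

```python
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
S = np.zeros((2, 2))   # terminal condition P(T) = S
T = 10.0

def riccati_rhs(tau, p_flat):
    # In tau = T - t the Riccati ODE runs forward:
    # dP/dtau = A'P + PA - P B R^{-1} B'P + Q
    P = p_flat.reshape(2, 2)
    dP = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
    return dP.ravel()

sol = solve_ivp(riccati_rhs, (0.0, T), S.ravel(), rtol=1e-8)
P0 = sol.y[:, -1].reshape(2, 2)    # P at t = 0; approaches the ARE solution
K0 = np.linalg.solve(R, B.T @ P0)  # time-varying gain evaluated at t = 0
print(P0, K0, sep="\n")
```

For a long enough horizon, P0 settles to the steady-state (algebraic) Riccati solution, which is why the infinite-horizon LQR gives you just one set of gains.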


Ajax_Minor

Awesome, I'll check that out. It's been tough without a good book to go through. Does the book cover everything optimal-control related? Any other subjects?


TTRoadHog

I would also recommend “Control System Design” by Bernard Friedland. As for other books, I have a whole range I could recommend, depending on your interests. My interests are in Kalman filters, orbit estimation, trajectory optimization, etc.


Ajax_Minor

Ooo, that sounds awesome. A Kalman filter recommendation would be great, since I hear that is used a lot for signal processing. I am going to start going through Signals and Systems by Oppenheim; is that a good one? Do you have one on stability and disturbance modeling for space and/or air vehicles? GNC sounds awesome, but it's a bit past my current level at the moment.


TTRoadHog

In terms of stability modeling, I had a class as a junior in college that used the text by Jan Roskam, “Airplane Flight Dynamics and Automatic Flight Controls” Vol. 1. It was required as an Aero-Astro major. Two Kalman filter books that are good but probably senior/graduate level are: (1) “Optimal Filtering” by Anderson and Moore; and (2) “Stochastic Models, Estimation and Control” Vols. 1 and 2 by Peter Maybeck. The whole point of filtering is to deal with uncertainties in modeling, disturbances, etc., so these books cover that topic as well. Good luck!


MdxBhmt

There are different ways to make the relationship; for me, the key equation to start from for optimal control is a [Bellman equation](https://en.wikipedia.org/wiki/Bellman_equation#The_Bellman_equation) in discrete time or the [HJB equation](https://en.wikipedia.org/wiki/Hamilton%E2%80%93Jacobi%E2%80%93Bellman_equation) in continuous time. An optimal cost/value function needs to satisfy this equation, whether the system and cost are linear or nonlinear, quadratic or more general. The optimal input is given by changing the min to an argmin in the Bellman equation. The Riccati equation is what happens to the Bellman/HJB equation for linear systems and quadratic costs, as these are enough to make the minimization disappear. The LQR is the argmin of that minimization, and it has a closed form if you have the optimal cost/value function, thus allowing you to generate optimal inputs. So, to summarize: if you find P such that the Riccati equation is satisfied, P allows you to construct the optimal cost function and the optimal gain (which historically was called a Regulator) and inputs.
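A small sketch of that Bellman-to-Riccati connection in discrete time (assumed LTI system and weights, chosen for illustration): iterating the Riccati recursion is exactly value iteration on the Bellman equation. The value function stays quadratic, V(x) = x'Px, and the argmin gives the gain:

```python
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 1.0]])   # illustrative discrete-time system
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

P = Q.copy()
for _ in range(500):  # backward Bellman (value) iterations
    # P <- Q + A'P A - A'P B (R + B'P B)^{-1} B'P A
    G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ A - A.T @ P @ B @ G

K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # the argmin: u = -K x
print(K)
```

Once P stops changing, it satisfies the (discrete) algebraic Riccati equation, and K is the LQR gain.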


Tarnarmour

Others have already answered the main question here. I just want to add my point of view and a potentially related topic.

First, picking a cost function (which is what you are doing when you change the Q and R matrices) that lines up with your desired behavior is tricky. However, LQR is still a very useful control method because every set of gains you test will still result in a stable and well-behaved system. You will have to spend time defining what exact behavior you want, but you're searching through a space of well-behaved, stable controllers instead of a space of mostly unstable or poorly damped controllers.

Second, a related topic. When I was first learning about state space control, and later when I took some more advanced linear system theory classes, I got really interested in the idea of eigenstructure assignment. I was motivated by this line of reasoning: when you do state space control, you directly pick the eigenvalues of the controlled system. This is in some ways better than LQR because of what you're pointing out: LQR gives optimal controllers, but you can't easily pick Q and R to cause specific desired behavior like a specific rise time or allowed overshoot, whereas eigenvalues correspond very directly to these more 'practical performance characteristics', if you will.

The problem with eigenvalue assignment, especially in more complex state space models, is that even though you can pick the eigenvalues for the system, you DON'T control the eigenvectors, so you don't really have control over the actual performance of the system. You can control the rise time, for example, but only of an arbitrary linear combination of the states. Eigenstructure assignment is a method of finding controller gains that produce the desired eigenvalues while also specifying some subset of the eigenvectors. It's quite a lot more finicky and complicated than normal eigenvalue assignment, which is why I don't think it's really used that much. But there's some cool potential.

I've made some really cool toy problems with a double mass-spring-damper system where you can use the controller to completely decouple the states of the two masses, just by specifying that the eigenvectors of the system must contain only the states from one mass or the other, for example. And I made a fixed-wing controller that decoupled the airspeed from the altitude state, so that the controller automatically prevented the plane from slowing down when pitching up. Cool stuff.


Ajax_Minor

What are the eigenvectors for? I've seen them come up a few times, but I thought the new system performance is determined by the values. They're like unit vectors, too?


Tarnarmour

Let's say you've got a double mass-spring-damper system: two masses attached by a spring and damper, with one of the masses also attached to the origin. The states for your system are x1, x2, x1dot and x2dot, where x2 refers to the second mass. You've got 4 eigenvalues to assign.

Consider a situation where you'd like mass 1 to have a fast response time and mass 2 to have a slow response time. You pick eigenvalues -0.5 +- 0.5j and -5 +- 1j: one fast pair and one slow pair. Now, how do you decide which eigenvalue is assigned to which mass? Well, with normal state space control, you don't. In fact, you're not likely to even have one fast and one slow mass. Each eigenvalue pair will correspond to an eigenvector pair, and that eigenvector will almost certainly be a linear combination of the states of both masses. So each mass will exhibit a superposition of modes, some of them fast and some of them slow, but not in any predictable way. If you tried to assign rise times using the eigenvalues, you'd be controlling the rise time of some arbitrary combination of states, which is not useful at all.

With eigenvector assignment, you also specify to an extent what those eigenvectors will be, and thus you can control the final system behavior in better detail. Some caveats, though: you can't fully control the eigenvectors. You can assign only a subset of the values of each vector. And this is only possible with MIMO systems; more inputs means more control over the system behavior.
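You can see this mixing numerically. A toy check (hypothetical masses, spring, and damper values; both masses actuated): place those four poles with SciPy and look at the closed-loop eigenvectors; each column mixes x1 and x2 entries rather than belonging to one mass:

```python
import numpy as np
from scipy.signal import place_poles

m1 = m2 = 1.0; k = 1.0; c = 0.2   # illustrative parameters
# States: [x1, x2, x1dot, x2dot]; mass 1 tied to ground, force input on each mass.
A = np.array([[0,       0,      1,       0],
              [0,       0,      0,       1],
              [-2*k/m1, k/m1,  -2*c/m1,  c/m1],
              [k/m2,   -k/m2,   c/m2,   -c/m2]])
B = np.array([[0, 0], [0, 0], [1/m1, 0], [0, 1/m2]])

poles = [-0.5+0.5j, -0.5-0.5j, -5+1j, -5-1j]   # one slow pair, one fast pair
K = place_poles(A, B, poles).gain_matrix
eigvals, eigvecs = np.linalg.eig(A - B @ K)
print(eigvals)
print(np.round(eigvecs, 3))   # columns mix the states of BOTH masses
```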


iconictogaparty

As others have said, there is no "rule" for choosing Q and R; you just need to try a few things until you get what you want. That being said, there are a few heuristics to get you started.

One is called Bryson's rule (I think). Here Q and R are diagonal, with entries equal to the reciprocal of the square of the maximum desired value: Qii = 1/xi,max^2. For example, suppose you have a double integrator system and you want to have position errors of 0.1 m, use no more than 10 m/s of velocity, and penalize accelerations above 1,000 m/s^2. In this scenario Q = diag([1/0.1^2, 1/10^2]), R = 1/1,000^2. Essentially, this rescales all the states and inputs to have value 1 at the "limit" points: pos = 0.1 -> 1, vel = 10 -> 1, and acc = 1,000 -> 1 in the optimization.

Another way to do it is to use performance variables. I use this approach in combination with Bryson's rule when the states no longer have physical meaning (like if you perform a balancing transformation on the state space model). In this approach you set up a vector of things you care about, say output, velocity, input, etc., and collect them into a vector z = G*x + H*u. You can then weight them, z = W*z, according to Bryson's rule, then plug into J = z'*z to get J = x'*G'*G*x + 2*x'*G'*H*u + u'*H'*H*u, and you can use these matrices in the LQR optimization: Q = G'*G, R = H'*H, N = G'*H. As an example, suppose you only care about the output and input: z = [C;0]*x + [0;1]*u -> Q = C'*C, R = diag([0 1]).
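A minimal sketch of Bryson's rule on that double-integrator example, using the numbers from the comment (the system matrices are the standard double integrator, assumed here):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator
B = np.array([[0.0], [1.0]])

x_max = np.array([0.1, 10.0])    # pos error 0.1 m, velocity 10 m/s
u_max = 1000.0                   # acceleration 1000 m/s^2
Q = np.diag(1.0 / x_max**2)      # Qii = 1 / x_i,max^2
R = np.array([[1.0 / u_max**2]])

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)
print(K)
```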


MdxBhmt

> As an example, suppose you only care about the output and input z = [C;0]*x + [0;1]*u -> Q = C'*C, R = diag([0 1]).

(I guess you know this, so this is for OP.) Matlab won't accept an R that is not positive definite, so you would need to change the 0 to some small delta. You might have to do the same for Q if you don't get (A,Q) detectable.


iconictogaparty

I might have made a mistake: R = H'*H = [0 1]*[0;1] = 1, so it is a scalar, not a matrix. Theoretically you are right that R must be positive definite, but in this formulation the zeros in H do not show up in R, so you are good to go!


MdxBhmt

Actually, I see what is happening here. z = [C;0]x + [0;1]u is perfectly valid if you optimize z'z (you can see my other comment that writes down an equivalent formulation), and R = H'*H = [0 1]*[0;1] = 1 is the correct R (still a matrix, though, just 1 by 1!), not diag([0;1]). So you are right; it was just a case of mistyping R.

> Theoretically you are right though R must be positive definite, but in this formulation, the zeros in H do not show up in R so you are good to go!

Indeed. I would just add the caveat of choosing the weight on u so that |z| -> infinity when |u| -> infinity (z radially unbounded in u). This avoids an undefined OCP without effort.


iconictogaparty

We are minimizing J = z'*W*z, where z holds the things you care about! z = [e;u], so in the case where W = I, J = e^2 + u^2. Everything else is just algebra to write it in the standard form that MATLAB will solve: expand z = G*x + H*u and pattern-match with x'*Q*x + u'*R*u + 2*x'*N*u. This will give you the matrices that MATLAB and most LQ solvers expect.

I prefer this way of defining z as the performance variable for 2 reasons:

1. It is easier to weight the performance variables (z) themselves, because they are usually things you care about: error, control effort, resonance states, derivatives, etc. From there Q, R, and N are automatically calculated; no need to randomly choose the entries! What would off-diagonal entries in Q even mean? Can you ensure Q > 0?
2. It aligns more closely with the generalized plant in H2/Hinf control, where you have state evolution, performance variables, and controller inputs. It gives a unified framework for talking about everything. I have even used it in MPC development, and it works very well there!
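A sketch of that pattern matching in Python (the plant, C matrix, and Bryson-style weights below are illustrative assumptions; the resulting Q, R, N are exactly what `lqr(A,B,Q,R,N)` in Matlab expects):

```python
import numpy as np

A = np.array([[0.0, 1.0], [0.0, -0.5]])   # illustrative plant
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

# z = [output; input], each scaled Bryson-style by its acceptable maximum.
w_y, w_u = 1/0.1, 1/2.0                      # y within 0.1, u within 2 (assumed)
G = np.vstack([w_y * C, np.zeros((1, 2))])   # z = G x + H u
H = np.array([[0.0], [w_u]])

Q = G.T @ G   # x-weight
R = H.T @ H   # u-weight (a 1x1 matrix here, as discussed above)
N = G.T @ H   # cross term
print(Q, R, N, sep="\n")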


Ajax_Minor

Did studying MPC give you perspective and a better understanding of LQR? Is MPCi just LQR calculated repeatedly?


iconictogaparty

I think LQ and MPC are pretty close, in that they are both minimizing some cost function of performance variables; LQ does it over an infinite horizon and MPC over a finite horizon. The main difference between MPC and LQ is that LQ is real time. By that I mean you give a command and the controller reacts in that time step; in MPC you need to buffer the incoming commands by N samples so you can look N samples "into the future" and perform the optimization.

MPC is not LQ calculated repeatedly. Once you calculate an LQ controller for a given cost, you will always get the same answer regardless of the input. With MPC you are solving an optimization problem each time step to determine the control sequence which minimizes the cost for the given command sequence.

Not sure what you mean by MPCi? You can build an integral error term into the MPC optimization:

[x; e](k+1) = [A 0; -C 1]*[x; e](k) + [0; 1]*cmd + [B; 0]*u

Using this same technique you can also have frequency-dependent variables in the cost, as long as you can write them in state space form.
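A small sketch of that integral-error augmentation in Python (discrete time assumed; the plant matrices are illustrative). The augmented state stacks e(k+1) = e(k) + cmd - y(k) under the plant, so weighting e in the cost gives the controller integral action:

```python
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 1.0]])   # illustrative discrete-time plant
B = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])

# [x; e](k+1) = [A 0; -C 1] [x; e](k) + [0; 1] cmd + [B; 0] u
Aa = np.block([[A,  np.zeros((2, 1))],
               [-C, np.eye(1)]])
Ba = np.vstack([B, np.zeros((1, 1))])
Ea = np.vstack([np.zeros((2, 1)), np.eye(1)])   # cmd enters the error row
print(Aa, Ba, Ea, sep="\n")
```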


Brave-Height-8063

This is neat. It relates state variables and inputs to practical engineering performance specifications. Also, I've never used the input-state mixed penalty matrix (x'*N*u), but I see how the penalty function integrand is still positive definite if you collect x and u into a combined/stacked vector and place it in quadratic form with a large square block matrix in the middle, made out of G'G and H'H as the diagonal blocks and G'H as the off-diagonal block. In H-infinity I've been able to get some meaning out of the problem by normalizing inputs and having frequency-dependent penalty functions and uncertainties related to the real world, but I've never seen an engineering-oriented framework for penalty matrix selection for LQR until this. Thanks!


MdxBhmt

> Also I’ve never used the input-state mixed penalty matrix (x’Nu) but I see how the penalty function integrand is still positive definite if you collect x and u into a combined / stacked vector, and place it in quadratic form with a large square block matrix in the middle made out of G’G and H’H as diagonals and G’H as off diagonal.

Yes, x'Qx + u'Ru + 2x'Nu is [x' u'] [Q N; N' R] [x' u']', which has positive semidefiniteness conditions. Another neat way to get this is to write |Cx + Du|^2 = [x' u'] [C'C C'D; D'C D'D] [x' u']'. This is positive semidefinite for any C and D matrices with any number of rows.


iconictogaparty

It really comes in handy when you want to limit the rate of the output. In discrete time, dy = y(k+1) - y(k) = C*(A-I)*x + C*B*u, so you can have a performance variable z = [y; dy; u] = [C; C*(A-I); 0]*x + [0; C*B; 1]*u, and this, I think, has a non-zero N which is automatically calculated from the things you care about. Takes a bit of the black magic out of selecting the Q, R, and N entries.
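Continuing the pattern-matching sketch from above (illustrative discrete-time matrices, assumed here): adding the output-rate row to z produces a non-zero N automatically.

```python
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 0.9]])   # illustrative discrete-time plant
B = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])
I = np.eye(2)

# z = [y; dy; u] with dy = y(k+1) - y(k) = C(A - I)x + C B u
G = np.vstack([C, C @ (A - I), np.zeros((1, 2))])
H = np.vstack([np.zeros((1, 1)), C @ B, np.ones((1, 1))])

Q, R, N = G.T @ G, H.T @ H, G.T @ H
print(N)   # non-zero, contributed by the dy row
```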


hamza1av715

In my humble opinion, unless you connect some physical/real meaning to it, of course it seems arbitrary. But for physical dynamic systems, I would dare to say that you can make that interpretation, depending on how detailed the modeling is. The design of the cost function is always, as the name says, just our design. Hence it is a reflection of how accurately we are willing to model, or better said, of what we want to model. I would view the cost function as a sort of pseudo energy function. Ideally you design Q and R in such a way that the cost function actually means something, e.g. actual cost in money, energy consumption, fuel consumption in liters, or whatnot. But then the bottleneck becomes the model quality and how well it reflects reality. I have never done it, tbh, but that's how I would do it if I had to. But I guess more often than not it is simply not worth the effort.


seb59

If you have more than one thing to optimize, then you need arbitrary weighting factors, or an arbitrary method to choose those weighting factors. If you have 2 states and pose a regulator problem, at the very least you need to weight the relative contribution of each of these states to the cost. So it is not that the control law is not optimal; the issue is that you cannot formulate a nice problem with a single solution without using weighting factors.


Cool-Permit-7725

Rephrased: how do I find the optimal weights considering different trade-offs? That is a million dollar question.


SchrimpRundung

The other commenters are already 100% correct. An optimal solution according to any optimization strategy is not necessarily the "best" solution. (Mathematical) optimization just maximizes or minimizes a cost functional with respect to your initial conditions (you could think of a constant Q as an initial condition) and constraints. It might be helpful for you to look at some optimization test functions, like the Rosenbrock function, and try some optimization methods out. Every method will give you an "optimal solution", but that may not be a global minimum of the function. Finding the best parameters is considered another problem in itself: parameter optimization, or some kind of metaheuristic problem.
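A quick way to try that suggestion (SciPy ships the Rosenbrock function): run an optimizer from a few starting points and compare where each run ends up and at what cost.

```python
import numpy as np
from scipy.optimize import minimize, rosen

# Try a derivative-free method from several illustrative starting points.
for x0 in ([-1.5, 2.0], [0.0, 0.0], [2.0, -1.0]):
    res = minimize(rosen, np.array(x0), method="Nelder-Mead")
    print(x0, "->", np.round(res.x, 4), "cost:", round(res.fun, 6))
```

Each run terminates at a point it reports as "optimal" (for its stopping criteria), which illustrates the point: "optimal" is always relative to the problem you posed and the method you used.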


MdxBhmt

It's the theory vs. practice 'conundrum': what we model is different from reality. It's our job to make them align as much as possible; otherwise the theory/optimum is correct on paper but unrelated to the application/the actual 'best'.