Inspired by Harsh Shrivastava

Algebra Level 5

Let M M be the maximal value of x y + 2 y z + 3 x z xy+2yz+3xz , subject to the constraint x 2 + y 2 + z 2 = 2 x^2+y^2+z^2=2 , where x , y x,y and z z are real numbers. If M 3 = a M + b M^3=aM+b for integers a a and b b , enter a + b a+b .

Hint: Use the Spectral Theorem from Linear Algebra, Theorem 3.1, (or, equivalently, Lagrange multipliers)

Bonus question : If m m is the minimal value, write m 3 = c m + d m^3=cm+d .


Inspiration .


The answer is 26.

This section requires Javascript.
You are seeing this because something didn't load right. We suggest you, (a) try refreshing the page, (b) enabling javascript if it is disabled on your browser and, finally, (c) loading the non-javascript version of this page . We're sorry about the hassle.

1 solution

Otto Bretscher
Nov 13, 2015

If we multiply x , y x,y and z z with 2 \sqrt{2} , the problems amounts to finding the maximum of 2 x y + 4 y z + 6 x z 2xy+4yz+6xz subject to x 2 + y 2 + z 2 = 1 x^2+y^2+z^2=1 . Our solution will be in terms of this version.

The most elegant and conceptual solution is based on the Spectral Theorem of Linear Algebra. If you are not familiar with this material, please read the excellent explanation here , up to Theorem 3.1. This is important and beautiful math, well worth learning.

The symmetric matrix A A of our quadratic form q ( x , y , z ) = 2 x y + 4 y z + 6 x z q(x,y,z)=2xy+4yz+6xz is A = [ 0 1 2 1 0 3 2 3 0 ] A=\begin{bmatrix}0&1&2\\1&0&3\\2&3&0\end{bmatrix}

The characteristic polynomial is p ( λ ) = det ( λ I 3 A ) = λ 3 14 λ 12 p(\lambda)=\det(\lambda I_3-A)=\lambda^3-14\lambda-12 , and the eigenvalues are the solutions of the equation λ 3 14 λ 12 = 0 \lambda^3-14\lambda-12=0 . Since the maximum M M we seek (as well as the minimum m m ) is an eigenvalue (see Theorem 3.1), we have the equation M 3 = 14 M + 12 = a M + b M^3=14M+12=aM+b , so that a + b = 26 a+b=\boxed{26} .

Equivalently, the problem can be solved with Lagrange multipliers. This is essentially the same solution since the equations involving the Lagrange multipliers λ \lambda are linear, with the coefficient matrix A A we found above.

i was expecting solution using elementary algebra

Dev Sharma - 5 years, 7 months ago

Log in to reply

Linear algebra is pretty elementary ;)

Otto Bretscher - 5 years, 7 months ago

@Lakshya Sinha how did you solve it?

Dev Sharma - 5 years, 7 months ago

I am not sure whether you can rigorously apply Lagrange multiplier technique here. Because the objective function is bilinear and non-convex. Thus KKT only implies local optimality in general. This is also evident from the fact that one of the solution corresponds to minima ( λ min \lambda_{\min} ). Thus we need to distinguish between them. Linear algebraic methods are more appropriate for this class of problems.

Abhishek Sinha - 5 years, 7 months ago

Log in to reply

I claim that the Lagrange multiplier method is indeed rigorous in this case, unrelated to convexity. The global extrema M M and m m are local as well, so that they are "Lagrange multiplier points". (The global extrema exist, of course, since we are dealing with a continuous function on a compact set.)

For the purpose of this problem, we don't need to distinguish between the max, the min, and the third solution, since they are all roots of the cubic equation we seek (I made that point clear with my "bonus question").

I agree with you that linear algebra methods are best suited for this problem, but many of our young comrades on Brilliant don't know these methods yet, so, reluctantly, I offered Lagrange multipliers as an alternative ;)

Otto Bretscher - 5 years, 7 months ago

Log in to reply

I agree that "for the purpose of this problem", the multiplier technique passes. However, if one is really trying to obtain the maximum (or minimum) of the stated constrained optimization problem, the multiplier technique alone is not sufficient.

Abhishek Sinha - 5 years, 7 months ago

Log in to reply

@Abhishek Sinha I claim that the multiplier technique is sufficient for the stated problem. In a nutshell, my argument is this: The global extrema are local extrema and thus they are Lagrange multiplier points. The highest value at a Lagrange multiplier point will be the max, and the lowest will be the min.

Otto Bretscher - 5 years, 7 months ago

Log in to reply

@Otto Bretscher By using Lagrange multipliers, you are actually assuming that the strong duality holds for this problem. This is a non-trivial requirement. There are sufficient conditions (convexity with Slater's condition being a common one) for strong duality to hold. If strong duality does not hold, we have duality gap where the lagrange multiplier technique gives a wrong value for the constrained optimization problem.

Abhishek Sinha - 5 years, 7 months ago

Log in to reply

@Abhishek Sinha It seems to me that you are overthinking this.

It is an exercise in elementary calculus to show that a local extremum of f ( x , y , z ) f(x,y,z) subject to a constraint g ( x , y , z ) = 0 g(x,y,z)=0 can be attained only at a Lagrange multiplier point. That is all we need to do this problem rigorously.

Otto Bretscher - 5 years, 7 months ago

Log in to reply

@Otto Bretscher Let me give a counter-example (albeit with two constraints) where what you just said is clearly false . Consider the problem min x 1 + x 2 \min x_1+x_2 Subject to, ( x 1 1 ) 2 + x 2 2 = 1 (x_1-1)^2+x_2^2=1 and, ( x 1 2 ) 2 + x 2 2 = 4 (x_1-2)^2+x_2^2=4 If you draw the circles corresponding to the constraints, you will find that the only feasible solution is x 1 = 0 , x 2 = 0 x_1=0,x_2=0 . Hence, the optimal value of the objective function is zero.

Now let's apply the Lagrange multiplier technique, with the multipliers being λ 1 , λ 2 \lambda_1,\lambda_2 . The Lagrangian is L ( x 1 , x 2 , λ 1 , λ 2 ) = x 1 + x 2 + λ 1 ( ( x 1 1 ) 2 + x 2 2 1 ) + λ 2 ( ( x 1 2 ) 2 + x 2 2 4 ) \mathcal{L}(x_1,x_2,\lambda_1,\lambda_2)=x_1+x_2+\lambda_1((x_1-1)^2+x_2^2-1)+\lambda_2((x_1-2)^2+x_2^2-4) Setting the gradients w.r.t. x 1 x_1 and x 2 x_2 to zero at the only feasible point ( 0 , 0 ) (0,0) yields the following equation ( 1 1 ) + λ 1 ( 2 0 ) + λ 2 ( 4 0 ) = ( 0 0 ) \begin{pmatrix}1\\1\end{pmatrix}+\lambda_1\begin{pmatrix}-2\\0\end{pmatrix}+\lambda_2\begin{pmatrix}-4\\0\end{pmatrix}=\begin{pmatrix}0\\0\end{pmatrix} However, it is clear that there can not be any solution for the multipliers λ 1 , λ 2 \lambda_1,\lambda_2 . Thus the only feasible point x 1 = 0 , x 2 = 0 x_1=0,x_2=0 is NOT a ``Lagrange Multiplier Point". What went wrong ?

Abhishek Sinha - 5 years, 7 months ago

Log in to reply

@Abhishek Sinha Last time I checked we were talking about one constraint... what you said about that case was "clearly wrong", to use your own words.

Otto Bretscher - 5 years, 7 months ago

Log in to reply

@Otto Bretscher It is even easier to construct a counter-example with a single constraint. Consider the problem min x 2 + y 2 \min x^2+y^2 Subject to, ( x 1 ) 2 = 0 (x-1)^2=0 Its optimal value is 1 1 and is uniquely attained at x = 1 , y = 0 x=1, y=0 . Now let's formulate the Lagrangian L ( x , y , λ ) = x 2 + y 2 + λ ( x 1 ) 2 \mathcal{L}(x,y, \lambda)= x^2+y^2+\lambda(x-1)^2 Setting the gradient of the Lagrangian with respect to the variables x x and y y to zero gives 2 x + 2 λ ( x 1 ) = 0 , 2 y = 0 2x+2\lambda(x-1)=0,\hspace{10pt} 2y=0 And we can clearly see that the optimal solution x = 1 , y = 0 x=1,y=0 is NOT a "Lagrange Multiplier Point", i.e., there does not exist any λ R \lambda\in \mathbb{R} for which the optimal point ( 1 , 0 ) (1,0) satisfies the local optimality condition of the Lagrangian.

Abhishek Sinha - 5 years, 7 months ago

Log in to reply

@Abhishek Sinha If you properly define a Lagrange multiplier point as a point where the two gradients are parallel, then (1,0) sure is a Lagrange multiplier point.

I just want beginning students to know that Lagrange multipliers, at its core, is a quiet intuitive and simple concept. It has nothing to do with convexity, KKT, "strong duality" and all these other fancy terms you throw out. The proof for one constraint (which I present every year to my calculus students) makes use of a few basic properties of the objective function and the constraint surface... and that's it.

Obviously, if you are at a boundary point of your constraint (as in your wacky example with the two constraints), you need to examine those point separately ... every introductory calculus student knows to "check the endpoints", even if they are not critical points.

Otto Bretscher - 5 years, 7 months ago

Log in to reply

@Otto Bretscher I am sorry but in the last example, the two gradients at the optimal point ( 1 , 0 ) (1,0) are given by ( 2 0 ) \begin{pmatrix} 2 \\ 0\end{pmatrix} and ( 0 0 ) \begin{pmatrix} 0 \\ 0\end{pmatrix} respectively and they are NOT parallel (i.e., one can't be written as a non-zero scalar multiplier of the other). That is precisely the point.

The fundamental reason as to why Lagrangian Multiplier fails at the above counter-examples is that they do not satisfy appropriate constraint qualifications. Strong Duality does not hold and we are in trouble. This is indeed a deep issue, not wacky as you have commented. There is a more-or-less complete characterization of sufficient conditions for convex problems (known as Slater's condition) under which strong duality holds. That is why convexity is such a treasured property. General non-convex optimization is intrinsically hard (more precisely, NP-hard in the language of complexity theory). So there is little hope that simple techniques such as Lagrange Multiplier would work in general.

Abhishek Sinha - 5 years, 7 months ago

Log in to reply

@Abhishek Sinha What you misconstrue as a "fundamental reason as to why Lagrange multipliers fail" is in fact just a matter of definition: Most people define two vectors to be parallel if they are linearly dependent.

Lagrange multipliers don't "fail" in this case... you just don't apply them the right way.

Otto Bretscher - 5 years, 7 months ago

@Abhishek Sinha In my calculus classes, I prove the following result (a short and easy proof):

If a smooth function f ( x , y , z ) f(x,y,z) attains a local extremum at an interior point P P of a smooth surface g ( x , y , z ) = 0 g(x,y,z)=0 , then the gradients of f f and g g are parallel at P P (i.e, linearly dependent).

This theorem applies to our problem where the constraint surface is a sphere.

The proof has nothing to do with convexity, "strong duality" and all that fancy stuff, of course.

More generally, if f f attains a local maximum subject to the constraints g j = 0 g_j=0 , for j = 1.. k j=1..k , then the differentials d f , d g 1 , . . . , d g k df,dg_1,...,dg_k are linearly dependent (or, equivalently, the gradients are dependent).

A proof can be found in a good calculus text such as "Second Year Calculus" by David Bressoud.

You started your first post by saying "I am not sure whether you can rigorously apply Lagrange multiplier technique here." Yes, we can. Have I convinced you yet?

Otto Bretscher - 5 years, 7 months ago

Log in to reply

@Otto Bretscher Thanks for the full statement.

Abhishek Sinha - 5 years, 6 months ago

0 pending reports

×

Problem Loading...

Note Loading...

Set Loading...