Theory Behind the Approximation Functions for Several Equations


Derivation of the Approximation Function for Real Solutions of Several Equations

In this section, we will use differentials to approximate the difference between a solution of several equations and an estimate of that solution. From this, we will derive the approximation functions for the system of equations. The derivation refers to a system of three equations, but it applies to any number of equations that relate an equal number of quantities.

Suppose that we have three equations describing the quantities w1, w2 and w3, and that the equations are restricted to equal zero. Because of the restriction, the quantities w1, w2 and w3 are also restricted to certain sets of numbers. These sets are the solutions of the equations. We'll define W as a set of numbers that is a valid solution:

W = (w1, w2, w3) = Solution of 3 equations

If the equations were not restricted to equal zero, but were free to equal any number, then they would be called "functions." In general, the quantities described by these functions would not be restricted, but could vary independently. Let us define Z as the set of variables, and F(Z) as the set of functions:

Z = (z1, z2, z3) = Unrestricted variables

F(Z) = ( F1(Z), F2(Z), F3(Z) ) = Unrestricted functions

If Z equals a solution W, then each of the functions would return the number zero. Thus, we could describe the three equations to be solved as F(W) = (0, 0, 0):

F(W) = ( F1(W), F2(W), F3(W) ) = (0, 0, 0)

Example 1. Suppose that we need to solve this set of equations:

7 w1 w2^2 - 3 w2 w3^4 + 7 = 0
w1^3 w2 + w1^4 w3^2 - 9 = 0
4 w1^2 w3^2 + 2 w2^3 w3 - 37 = 0

These equations can be described by F(W) = (0, 0, 0). The elements of unrestricted F(Z) are

F1 = 7 z1 z2^2 - 3 z2 z3^4 + 7
F2 = z1^3 z2 + z1^4 z3^2 - 9
F3 = 4 z1^2 z3^2 + 2 z2^3 z3 - 37

(End of Example 1)

We can assign numbers to Z as an estimate of solution W. Most likely, our estimate will not be perfect, so F(Z) will not equal (0, 0, 0). We will define the difference between Z and W as DZ, and the difference between F(Z) and F(W) as DF:

DZ = Z - W

DF = F(Z) - F(W)

Since F(W) = (0, 0, 0), the increment DF is reduced to F(Z):

DF = F(Z)
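
As a small numerical illustration of DF = F(Z), the sketch below evaluates the three functions of Example 1 at a trial estimate. The estimate Z = (1, 1, 1) and the name `F` are assumptions chosen for illustration; they are not part of the original procedure.

```python
def F(z):
    """Unrestricted functions of Example 1, evaluated at z = (z1, z2, z3)."""
    z1, z2, z3 = z
    f1 = 7*z1*z2**2 - 3*z2*z3**4 + 7
    f2 = z1**3*z2 + z1**4*z3**2 - 9
    f3 = 4*z1**2*z3**2 + 2*z2**3*z3 - 37
    return (f1, f2, f3)

# Trial estimate (an arbitrary choice, not a known solution).
Z = (1, 1, 1)
DF = F(Z)      # since F(W) = (0, 0, 0), the increment DF is simply F(Z)
print(DF)      # (11, -7, -31)
```

None of the three values is zero, so this particular estimate is not a solution of the equations.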

While F(Z) is easy to compute, we cannot compute DZ because we don't know solution W. However, we can approximate DZ by using the differentials of F(Z). The differential of each function is defined by the partial derivatives of that function and the differentials of Z:

dF1 = [dF1/dz1]dz1 + [dF1/dz2]dz2 + [dF1/dz3]dz3

dF2 = [dF2/dz1]dz1 + [dF2/dz2]dz2 + [dF2/dz3]dz3

dF3 = [dF3/dz1]dz1 + [dF3/dz2]dz2 + [dF3/dz3]dz3

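This system of equations can be written as a matrix product:

[dF1]   [dF1/dz1  dF1/dz2  dF1/dz3]   [dz1]
[dF2] = [dF2/dz1  dF2/dz2  dF2/dz3] * [dz2]
[dF3]   [dF3/dz1  dF3/dz2  dF3/dz3]   [dz3]
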
The matrix of partial derivatives is called the Jacobian of F(Z) with respect to Z. We will represent this matrix by J(Z), where the (Z) indicates that the elements are evaluated at estimate Z. If we also represent the column-vectors of differentials by dF and dZ, then we can express the matrix-equation in a single line:

dF = J(Z) * dZ

Example 2. The table below shows the sets of restricted equations and corresponding unrestricted functions presented in Example 1:

Restricted Equations                  Unrestricted Functions
7 w1 w2^2 - 3 w2 w3^4 + 7 = 0         F1 = 7 z1 z2^2 - 3 z2 z3^4 + 7
w1^3 w2 + w1^4 w3^2 - 9 = 0           F2 = z1^3 z2 + z1^4 z3^2 - 9
4 w1^2 w3^2 + 2 w2^3 w3 - 37 = 0      F3 = 4 z1^2 z3^2 + 2 z2^3 z3 - 37

The differentials of the functions are

dF1 = [7 z2^2] dz1 + [14 z1 z2 - 3 z3^4] dz2 + [-12 z2 z3^3] dz3

dF2 = [3 z1^2 z2 + 4 z1^3 z3^2] dz1 + [z1^3] dz2 + [2 z1^4 z3] dz3

dF3 = [8 z1 z3^2] dz1 + [6 z2^2 z3] dz2 + [8 z1^2 z3 + 2 z2^3] dz3

We can write the differentials as a matrix expression, dF = J(Z)*dZ:

[dF1]   [7 z2^2                     14 z1 z2 - 3 z3^4    -12 z2 z3^3       ]   [dz1]
[dF2] = [3 z1^2 z2 + 4 z1^3 z3^2    z1^3                 2 z1^4 z3         ] * [dz2]
[dF3]   [8 z1 z3^2                  6 z2^2 z3            8 z1^2 z3 + 2 z2^3]   [dz3]

(End of Example 2)
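
The partial derivatives in Example 2 can be checked numerically. The sketch below codes the Jacobian from the differentials of Example 2 and compares it with central finite differences of F. The function names, the step size `h`, and the test point (1, 1, 1) are assumptions made for illustration.

```python
def F(z):
    """Unrestricted functions of Example 1."""
    z1, z2, z3 = z
    return [7*z1*z2**2 - 3*z2*z3**4 + 7,
            z1**3*z2 + z1**4*z3**2 - 9,
            4*z1**2*z3**2 + 2*z2**3*z3 - 37]

def J(z):
    """Jacobian of Example 2: row i holds the partial derivatives of Fi."""
    z1, z2, z3 = z
    return [[7*z2**2, 14*z1*z2 - 3*z3**4, -12*z2*z3**3],
            [3*z1**2*z2 + 4*z1**3*z3**2, z1**3, 2*z1**4*z3],
            [8*z1*z3**2, 6*z2**2*z3, 8*z1**2*z3 + 2*z2**3]]

def J_numeric(z, h=1e-6):
    """Approximate the Jacobian column by column with central differences."""
    n = len(z)
    cols = []
    for k in range(n):
        up = list(z)
        dn = list(z)
        up[k] += h
        dn[k] -= h
        f_up, f_dn = F(up), F(dn)
        cols.append([(f_up[i] - f_dn[i]) / (2*h) for i in range(n)])
    # cols[k][i] approximates dFi/dzk, so transpose to get the rows of J.
    return [[cols[k][i] for k in range(n)] for i in range(n)]

Z = [1.0, 1.0, 1.0]
exact, approx = J(Z), J_numeric(Z)
max_diff = max(abs(exact[i][k] - approx[i][k])
               for i in range(3) for k in range(3))
print(exact)      # [[7.0, 11.0, -12.0], [7.0, 1.0, 2.0], [8.0, 6.0, 10.0]]
print(max_diff)   # small: only rounding and truncation error remain
```

Agreement between the two matrices is a quick way to catch a mistake in a hand-derived partial derivative before it is used in the iteration.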

Next, we will substitute the increments DZ and F(Z) for the differentials dZ and dF to get an approximation relating F(Z) and DZ:

F(Z) ~ J(Z) * DZ

At this stage, we'll restrict F(Z), J(Z), and DZ to real values. Assuming that matrix J(Z) is invertible, let 1/J be its inverse, and left-multiply both sides of the approximation by 1/J:

1/J * F(Z) ~ 1/J * J(Z) * DZ

This gives us an approximation of DZ:

1/J * F(Z) ~ DZ

Earlier we defined DZ = Z - W. We can use this definition to express W as

W = Z - DZ

Substituting the approximation of DZ into the above gives us an approximation of W:

W ~ Z - 1/J * F(Z)

This is the approximation function for computing real solutions of several equations. The function returns a new approximation of W to be used as the next estimate Z. The computations are repeated until Z converges, that is, until successive values of Z agree to the desired precision.
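
The iteration described above is commonly known as Newton's method (or the Newton-Raphson method) for systems of equations. The following is a minimal sketch of it, not the author's implementation: it solves J(Z)*DZ = F(Z) by Gaussian elimination instead of forming 1/J explicitly, which yields the same DZ. The function names, the tolerance, and the iteration cap are assumptions. The literal values of F(Z) and J(Z) are those of Examples 1 and 2 evaluated at the trial estimate Z = (1, 1, 1).

```python
def solve(A, b):
    """Solve A*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Bring the largest remaining entry of this column onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-14:
            raise ValueError("J(Z) is singular (or nearly so) at this estimate")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def newton_step(Fz, Jz, z):
    """One application of W ~ Z - (1/J)*F(Z), given numeric F(Z) and J(Z)."""
    dz = solve(Jz, Fz)                      # DZ ~ (1/J)*F(Z)
    return [zi - di for zi, di in zip(z, dz)]

def newton_real(F, J, z, tol=1e-12, max_iter=50):
    """Repeat the step until the estimate stops changing, or give up."""
    for _ in range(max_iter):
        z_new = newton_step(F(z), J(z), z)
        if max(abs(a - b) for a, b in zip(z_new, z)) < tol:
            return z_new
        z = z_new
    raise RuntimeError("no convergence; try a different starting estimate")

# One step for Example 1 from the trial estimate Z = (1, 1, 1).
Z = [1.0, 1.0, 1.0]
F_at_Z = [11.0, -7.0, -31.0]
J_at_Z = [[7.0, 11.0, -12.0],
          [7.0, 1.0, 2.0],
          [8.0, 6.0, 10.0]]
W_approx = newton_step(F_at_Z, J_at_Z, Z)
print(W_approx)   # roughly [1.19488, 2.26181, 3.18701]
```

The printed vector is only the next estimate, not a solution. Whether repeated steps converge depends on the starting estimate; from a poor estimate the iteration can diverge or reach a singular Jacobian, which is why the sketch caps the number of iterations.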


Derivation of the Approximation Functions for Complex Solutions of Several Equations

In the previous section, we derived an approximation function for real solutions of several equations. The derivation included the approximation

F(Z) ~ J(Z) * DZ

where F(Z) is a column-vector of functions based on the set of equations, J(Z) is the Jacobian matrix, and DZ is a column-vector of increments between estimate Z and solution W.

We will now derive complex approximation functions by allowing W, Z, DZ, F(Z) and J(Z) to have complex values:

W = U + iV = (u1, u2, u3) + i(v1, v2, v3)

Z = X + iY = (x1, x2, x3) + i(y1, y2, y3)

F(Z) = Fr + iFi = (Fr1, Fr2, Fr3) + i(Fi1, Fi2, Fi3)

J(Z) = Jr + iJi

DZ = DX + iDY = (Dx1, Dx2, Dx3) + i(Dy1, Dy2, Dy3)

Since DZ = Z - W, we can also write the relations

DX = X - U

DY = Y - V

Example 3. Referring to the functions in Example 1, we will resolve the first element of F(Z) into its real and imaginary parts. Note that R and q represent the absolute value and the argument (angle) of a variable, so that zk = Rk(Cos qk + i Sin qk):

F1 = 7 z1 z2^2 - 3 z2 z3^4 + 7 = Fr1 + iFi1

Fr1 = 7 R1 R2^2 Cos(q1 + 2 q2) - 3 R2 R3^4 Cos(q2 + 4 q3) + 7

Fi1 = 7 R1 R2^2 Sin(q1 + 2 q2) - 3 R2 R3^4 Sin(q2 + 4 q3)

The other elements of F(Z) and the elements of J(Z) are resolved in a similar manner.

(End of Example 3)
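
The resolution in Example 3 can be verified with complex arithmetic. The sketch below assumes that each variable is written in polar form, zk = Rk(Cos qk + i Sin qk), and compares Fr1 + iFi1 with a direct complex evaluation of F1. The function names and the sample values of Z are arbitrary assumptions.

```python
import cmath
import math

def F1(z1, z2, z3):
    """First function of Example 1, evaluated with complex arithmetic."""
    return 7*z1*z2**2 - 3*z2*z3**4 + 7

def F1_parts(z1, z2, z3):
    """Real and imaginary parts of F1 from the moduli R and arguments q."""
    R1, q1 = cmath.polar(z1)
    R2, q2 = cmath.polar(z2)
    R3, q3 = cmath.polar(z3)
    Fr1 = 7*R1*R2**2*math.cos(q1 + 2*q2) - 3*R2*R3**4*math.cos(q2 + 4*q3) + 7
    Fi1 = 7*R1*R2**2*math.sin(q1 + 2*q2) - 3*R2*R3**4*math.sin(q2 + 4*q3)
    return Fr1, Fi1

# Arbitrary complex estimate, chosen only to exercise the formulas.
Z = (1 + 2j, -0.5 + 1j, 2 - 1j)
direct = F1(*Z)
Fr1, Fi1 = F1_parts(*Z)
print(direct)
print(Fr1, Fi1)   # should match the real and imaginary parts of `direct`
```

In a program that supports complex numbers, the direct evaluation is usually simpler; the polar expressions are mainly useful where only real arithmetic is available, such as in spreadsheet cells.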

Now we will return to the approximation F(Z) ~ J(Z)*DZ, and substitute the complex quantities:

Fr + iFi ~ (Jr + iJi) * (DX + iDY)

Multiplying the terms on the right-hand-side yields

Fr + iFi ~ Jr*DX - Ji*DY + iJi*DX + iJr*DY

The real and imaginary terms can be separated into two approximations:

Fr ~ Jr*DX - Ji*DY

Fi ~ Ji*DX + Jr*DY
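
The separation into real and imaginary parts can be checked on a small numerical case. In the sketch below, the 2-by-2 matrices Jr and Ji and the increments DX and DY are made-up values, chosen only so that the arithmetic is easy to follow; the helper `matvec` is likewise an assumption.

```python
def matvec(A, x):
    """Matrix-vector product A*x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

Jr = [[2.0, 1.0], [1.0, 3.0]]
Ji = [[1.0, 0.0], [1.0, 1.0]]
DX = [1.0, 2.0]
DY = [-1.0, 0.5]

# First route: multiply the complex quantities (Jr + iJi) * (DX + iDY).
J = [[complex(r, i) for r, i in zip(row_r, row_i)]
     for row_r, row_i in zip(Jr, Ji)]
DZ = [complex(x, y) for x, y in zip(DX, DY)]
product = matvec(J, DZ)

# Second route: the two real expressions.
Fr = [a - b for a, b in zip(matvec(Jr, DX), matvec(Ji, DY))]   # Jr*DX - Ji*DY
Fi = [a + b for a, b in zip(matvec(Ji, DX), matvec(Jr, DY))]   # Ji*DX + Jr*DY

print(product)   # [(5-0.5j), (7.5+3.5j)]
print(Fr, Fi)    # [5.0, 7.5] [-0.5, 3.5]
```

The real parts of the complex product equal Fr and the imaginary parts equal Fi, which is the content of the two approximations.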

We will use these approximations to find expressions for DX and DY. To solve for DX, we left-multiply the approximation of Fr by 1/Ji, and left-multiply the approximation of Fi by 1/Jr (this assumes that both Jr and Ji are invertible):

(1/Ji)*Fr ~ (1/Ji)*Jr*DX - (1/Ji)*Ji*DY

(1/Jr)*Fi ~ (1/Jr)*Ji*DX + (1/Jr)*Jr*DY

This gives us

(1/Ji)*Fr ~ (1/Ji)*Jr*DX - DY

(1/Jr)*Fi ~ (1/Jr)*Ji*DX + DY

Adding these approximations will eliminate DY:

(1/Ji)*Fr + (1/Jr)*Fi ~ [(1/Ji)*Jr + (1/Jr)*Ji]*DX

The term [(1/Ji)*Fr + (1/Jr)*Fi] on the left-hand-side is a column vector, and the term [(1/Ji)*Jr + (1/Jr)*Ji] on the right-hand-side is a square matrix. We can invert the matrix, and then left-multiply both sides of the approximation by the inverse to get our approximation of DX:

DX ~ 1/[(1/Ji)*Jr + (1/Jr)*Ji] * [(1/Ji)*Fr + (1/Jr)*Fi]

Now we'll return to the approximations of Fr and Fi, and find an expression for DY. We left-multiply the approximation of Fr by 1/Jr, and left-multiply the approximation of Fi by 1/Ji:

(1/Jr)*Fr ~ (1/Jr)*Jr*DX - (1/Jr)*Ji*DY

(1/Ji)*Fi ~ (1/Ji)*Ji*DX + (1/Ji)*Jr*DY

This gives us

(1/Jr)*Fr ~ DX - (1/Jr)*Ji*DY

(1/Ji)*Fi ~ DX + (1/Ji)*Jr*DY

Subtracting the top approximation from the bottom will eliminate DX:

(1/Ji)*Fi - (1/Jr)*Fr ~ [(1/Ji)*Jr + (1/Jr)*Ji]*DY

The term [(1/Ji)*Fi - (1/Jr)*Fr] on the left-hand-side is a column vector, and the term [(1/Ji)*Jr + (1/Jr)*Ji] on the right-hand-side is the same square matrix that appeared in the solution of DX. We can left-multiply both sides of the above by the inverse of this matrix to get our approximation of DY:

DY ~ 1/[(1/Ji)*Jr + (1/Jr)*Ji] * [(1/Ji)*Fi - (1/Jr)*Fr]

Earlier we defined DX as DX = X - U, and DY as DY = Y - V. We can use these definitions to express U and V as

U = X - DX

V = Y - DY

Substituting the approximations of DX and DY into the above gives us approximations of U and V:

U ~ X - 1/[(1/Ji)*Jr + (1/Jr)*Ji] * [(1/Ji)*Fr + (1/Jr)*Fi]

V ~ Y - 1/[(1/Ji)*Jr + (1/Jr)*Ji] * [(1/Ji)*Fi - (1/Jr)*Fr]

These are the approximation functions for computing complex solutions of several equations. The functions return new approximations of U and V to be used as the next estimates X and Y. The computations are repeated until X and Y converge.

Comment: In the procedure for finding complex solutions of several equations, and in the Sample Excel Sheet accompanying this procedure, the vectors and matrices were assigned the following nicknames:

matrix Marie = [(1/Ji)*Jr + (1/Jr)*Ji]

matrix 1/Marie = 1/[(1/Ji)*Jr + (1/Jr)*Ji]

vector Vince = [(1/Ji)*Fr + (1/Jr)*Fi]

vector Vito = [(1/Ji)*Fi - (1/Jr)*Fr]
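
The complex approximation functions can be sketched in code using the nicknames above. This is an illustrative sketch under stated assumptions, not the accompanying Excel sheet: the function names are hypothetical, products with an inverse such as (1/Ji)*Fr are obtained by solving a linear system, and the 2-by-2 sample values are made up so that the exact increments are DX = (1, 2) and DY = (-1, 0.5).

```python
def solve(A, b):
    """Solve A*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular (or nearly so)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def left_divide(A, B):
    """Return (1/A)*B for square matrices, one column of B at a time."""
    n = len(A)
    cols = [solve(A, [B[i][j] for i in range(n)]) for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def complex_increments(Fr, Fi, Jr, Ji):
    """Approximate DX and DY from the real and imaginary parts of F and J."""
    n = len(Fr)
    a = left_divide(Ji, Jr)                      # (1/Ji)*Jr
    b = left_divide(Jr, Ji)                      # (1/Jr)*Ji
    marie = [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]
    vince = [p + q for p, q in zip(solve(Ji, Fr), solve(Jr, Fi))]
    vito = [p - q for p, q in zip(solve(Ji, Fi), solve(Jr, Fr))]
    return solve(marie, vince), solve(marie, vito)   # DX, DY

def complex_update(X, Y, Fr, Fi, Jr, Ji):
    """Return the new estimates U ~ X - DX and V ~ Y - DY."""
    DX, DY = complex_increments(Fr, Fi, Jr, Ji)
    U = [x - d for x, d in zip(X, DX)]
    V = [y - d for y, d in zip(Y, DY)]
    return U, V

# Made-up sample values (not taken from Example 1).
Jr = [[2.0, 1.0], [1.0, 3.0]]
Ji = [[1.0, 0.0], [1.0, 1.0]]
Fr = [5.0, 7.5]
Fi = [-0.5, 3.5]
DX, DY = complex_increments(Fr, Fi, Jr, Ji)
print(DX, DY)   # close to [1.0, 2.0] and [-1.0, 0.5], up to rounding
```

The formulas need Jr, Ji, and Marie to be invertible. In particular, if the equations have real coefficients and the estimate Z is purely real, then Ji is a zero matrix and 1/Ji does not exist, so a starting estimate with non-zero imaginary parts is needed in that case.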



Unpublished Work. © Copyright 2001 Pat Russell. Updated April 17, 2009.