
Linear Maps and the Jacobian Matrix

In general, we denote a vector in $n$-dimensional space by $\boldsymbol{x}_{n\times 1}$ and a vector in $m$-dimensional space by $\boldsymbol{y}_{m\times 1}$. A vector in $n$-dimensional space can be transformed into $m$-dimensional space through a linear map:

$$\boldsymbol{y}=A_{m\times n}\boldsymbol{x}$$

The matrix $A_{m\times n}$ is the Jacobian matrix of the map $\boldsymbol{f}$:

$$A_{m\times n}=\begin{bmatrix} \dfrac{\partial{\boldsymbol{f}}}{\partial{x_{1}}} & \dfrac{\partial{\boldsymbol{f}}}{\partial{x_{2}}} & \cdots & \dfrac{\partial{\boldsymbol{f}}}{\partial{x_{n}}} \end{bmatrix}=\begin{bmatrix} \dfrac{\partial{f_{1}}}{\partial{x_{1}}} & \dfrac{\partial{f_{1}}}{\partial{x_{2}}} & \cdots & \dfrac{\partial{f_{1}}}{\partial{x_{n}}} \\ \dfrac{\partial{f_{2}}}{\partial{x_{1}}} & \dfrac{\partial{f_{2}}}{\partial{x_{2}}} & \cdots & \dfrac{\partial{f_{2}}}{\partial{x_{n}}} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial{f_{m}}}{\partial{x_{1}}} & \dfrac{\partial{f_{m}}}{\partial{x_{2}}} & \cdots & \dfrac{\partial{f_{m}}}{\partial{x_{n}}} \end{bmatrix}$$

To stay consistent with common convention, we use $J$ to denote the Jacobian matrix, whose entries are

$$J_{i,j}=\dfrac{\partial{f_{i}}}{\partial{x_{j}}}$$
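As a quick sanity check on this definition, here is a minimal NumPy sketch (the function and values are my own illustration, not from the original post) that approximates each entry $J_{i,j}=\partial f_i/\partial x_j$ by central differences:

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Approximate J[i, j] = df_i/dx_j by central differences."""
    x = np.asarray(x, dtype=float)
    m = len(f(x))
    J = np.zeros((m, len(x)))
    for j in range(len(x)):
        e = np.zeros_like(x)
        e[j] = eps
        # j-th column of J: directional difference along the j-th axis
        J[:, j] = (f(x + e) - f(x - e)) / (2 * eps)
    return J

# Example map f: R^2 -> R^3
f = lambda x: np.array([x[0]**2, x[0] * x[1], np.sin(x[1])])
x0 = np.array([1.0, 2.0])
J = numerical_jacobian(f, x0)
# Analytic Jacobian at (1, 2): [[2, 0], [2, 1], [0, cos 2]]
```

The numerical result matches the analytic partial derivatives to within the finite-difference error.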

Derivation of the Jacobian Matrix

Square Matrices

Before deriving the Jacobian matrix, we first need to understand the difference between two mathematical concepts: functions and mappings.

Following the definitions in *Mathematical Analysis* (《数学分析》):

A map of the form $\mathbb{R}^m \to \mathbb{R}$ is called a function; a map of the form $\mathbb{R}^m \to \mathbb{R}^n$ is called a mapping in the generalized sense.

Functions

From the definitions above, a function is a special kind of mapping: it sends a vector in $m$-dimensional space to a single scalar value.

In the setting of functions, we write out the Taylor expansion in its general form.

Taylor expansion of a univariate function:

$$f(x)=f(x_{k})+(x-x_{k})f'(x_{k})+\frac{1}{2!}(x-x_{k})^2 f''(x_{k})+o\big((x-x_{k})^2\big)$$
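To make the $o\big((x-x_k)^2\big)$ remainder concrete, a small numerical check (my own example, using $f(x)=e^x$ expanded around $x_k=0$):

```python
import math

# Second-order Taylor approximation of f(x) = e^x around x_k = 0:
# f(x) ~ f(0) + x f'(0) + x^2/2 f''(0) = 1 + x + x^2/2
x = 0.1
taylor2 = 1 + x + x**2 / 2
exact = math.exp(x)
error = abs(exact - taylor2)
# The error is on the order of x^3/3! (~1.7e-4 here), i.e. o(x^2)
```

Halving the step $x$ shrinks the error by roughly a factor of eight, as expected for a cubic remainder.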

Taylor expansion of a bivariate function:

$$\begin{aligned} f(x,y)=\;& f(x_{k},y_{k})+(x-x_{k})f'_x(x_{k},y_{k})+(y-y_{k})f'_y(x_k,y_k) \\ &+\frac{1}{2!}\Big[(x-x_{k})^2 f''_{xx}(x_k,y_{k})+2(x-x_{k})(y-y_{k})f''_{xy}(x_k,y_{k})+(y-y_{k})^2 f''_{yy}(x_k,y_{k})\Big]+o(\rho^2) \end{aligned}$$

where $\rho^2=(x-x_k)^2+(y-y_k)^2$. Note the factor $2$ on the mixed term, which arises because $f''_{xy}=f''_{yx}$ contributes twice.

The Taylor expansion of an $n$-variate function follows the same pattern: ...

Writing the Taylor expansion in matrix form:

$$f(\boldsymbol{x})=f(\boldsymbol{x}_{k})+[\nabla f(\boldsymbol{x}_{k})]^T(\boldsymbol{x}-\boldsymbol{x}_{k})+\frac{1}{2!}[\boldsymbol{x}-\boldsymbol{x}_{k}]^T H(\boldsymbol{x}_{k})[\boldsymbol{x}-\boldsymbol{x}_{k}]+o\big(\|\boldsymbol{x}-\boldsymbol{x}_{k}\|^2\big)$$

where $H(\boldsymbol{x}_{k})$ is the Hessian matrix of second partial derivatives at $\boldsymbol{x}_{k}$.
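The matrix form can be checked numerically. A sketch with a hand-picked function (all names and values here are illustrative): compute the gradient and Hessian analytically, then compare the second-order expansion against the exact value at a nearby point.

```python
import numpy as np

# Second-order expansion of f(x) = x1^2 * x2 + sin(x2) around x_k
f = lambda x: x[0]**2 * x[1] + np.sin(x[1])
xk = np.array([1.0, 0.5])

grad = np.array([2 * xk[0] * xk[1],            # df/dx1 = 2 x1 x2
                 xk[0]**2 + np.cos(xk[1])])    # df/dx2 = x1^2 + cos(x2)
H = np.array([[2 * xk[1],  2 * xk[0]],         # Hessian of f at x_k
              [2 * xk[0], -np.sin(xk[1])]])

d = np.array([0.01, -0.02])                    # small step x - x_k
approx = f(xk) + grad @ d + 0.5 * d @ H @ d
exact = f(xk + d)
# approx agrees with exact up to the cubic remainder term
```

For a step of size $10^{-2}$ the discrepancy is of order $10^{-6}$, consistent with the $o(\|\boldsymbol{x}-\boldsymbol{x}_k\|^2)$ remainder.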

If we drop the higher-order terms of the Taylor expansion, then

$$f(\boldsymbol{x})\approx f(\boldsymbol{x}_{k})+[\nabla f(\boldsymbol{x}_{k})]^T(\boldsymbol{x}-\boldsymbol{x}_{k})$$

which further gives:

$$\Delta f=[\nabla f(\boldsymbol{x}_{k})]^T \Delta \boldsymbol{x} \tag{1}$$
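Equation (1) can be verified directly for a concrete function (my own example, not from the post):

```python
import numpy as np

# Check df ~ [grad f(x_k)]^T dx for f(x) = x1^2 + 3 x2
f = lambda x: x[0]**2 + 3 * x[1]
xk = np.array([2.0, 1.0])
grad = np.array([2 * xk[0], 3.0])     # grad f(x_k) = (4, 3)

dx = np.array([1e-3, -2e-3])          # small perturbation
df_exact = f(xk + dx) - f(xk)
df_linear = grad @ dx                 # right-hand side of equation (1)
# The gap is the dropped quadratic term, here dx1^2 = 1e-6
```

The linear prediction and the exact change differ only by the second-order term that was dropped.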

Here $\Delta f$ is a scalar, while $\Delta \boldsymbol{x}$ is an $n$-dimensional vector. Rewriting this in differential form, and comparing it with the differential of a multivariate function in the earlier post 微分与梯度 (Differentials and Gradients), makes it easier to understand:

$$df(x_1, x_2, \dots, x_n)=\begin{bmatrix} \dfrac{\partial{f}}{\partial{x_{1}}} & \dfrac{\partial{f}}{\partial{x_{2}}} & \cdots & \dfrac{\partial{f}}{\partial{x_{n}}} \end{bmatrix}\begin{bmatrix} dx_1 \\ dx_2 \\ \vdots \\ dx_n \end{bmatrix} \tag{*}$$

Mappings

A mapping is more general than a function: it takes $n$-dimensional space to $m$-dimensional space, that is, it maps an $n$-dimensional vector to an $m$-dimensional vector.

Comparing with equation (*), we replace the single scalar differential on the left-hand side with a vector of $m$ function differentials. Each component differential is obtained, exactly as in (*), as the inner product of the differential of the $n$-dimensional vector $\boldsymbol{x}$ with the corresponding row of partial derivatives. Since there are $m$ component functions, the scalar relation (*) is applied $m$ times, giving:

$$\begin{bmatrix} df_1 \\ df_2 \\ \vdots \\ df_m \end{bmatrix}=\begin{bmatrix} \dfrac{\partial{f_{1}}}{\partial{x_{1}}} & \dfrac{\partial{f_{1}}}{\partial{x_{2}}} & \cdots & \dfrac{\partial{f_{1}}}{\partial{x_{n}}} \\ \dfrac{\partial{f_{2}}}{\partial{x_{1}}} & \dfrac{\partial{f_{2}}}{\partial{x_{2}}} & \cdots & \dfrac{\partial{f_{2}}}{\partial{x_{n}}} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial{f_{m}}}{\partial{x_{1}}} & \dfrac{\partial{f_{m}}}{\partial{x_{2}}} & \cdots & \dfrac{\partial{f_{m}}}{\partial{x_{n}}} \end{bmatrix}\begin{bmatrix} dx_1 \\ dx_2 \\ \vdots \\ dx_n \end{bmatrix}$$
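The stacked relation says that for a small step $d\boldsymbol{x}$, the change in $\boldsymbol{f}$ is approximately $J\,d\boldsymbol{x}$. A minimal check with a hand-written Jacobian (the map here is my own example):

```python
import numpy as np

# For f: R^2 -> R^3, df ~ J(x) dx for a small step dx
f = lambda x: np.array([x[0] + x[1], x[0] * x[1], x[1]**2])
J = lambda x: np.array([[1.0,  1.0],
                        [x[1], x[0]],
                        [0.0,  2 * x[1]]])   # rows are partial derivatives of f_i

x = np.array([1.0, 2.0])
dx = np.array([1e-3, -1e-3])
df_exact = f(x + dx) - f(x)
df_linear = J(x) @ dx
# Componentwise agreement up to second-order terms in dx
```

Each component of `df_linear` matches the exact change to within the quadratic remainder, mirroring the $m$ scalar applications of (*).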

When the partial derivatives are constants, i.e. when the map is linear, integrating both sides of the equation yields the general relation taking an $n$-dimensional vector to an $m$-dimensional vector:

$$\begin{bmatrix} f_1 \\ f_2 \\ \vdots \\ f_m \end{bmatrix}=\begin{bmatrix} \dfrac{\partial{f_{1}}}{\partial{x_{1}}} & \dfrac{\partial{f_{1}}}{\partial{x_{2}}} & \cdots & \dfrac{\partial{f_{1}}}{\partial{x_{n}}} \\ \dfrac{\partial{f_{2}}}{\partial{x_{1}}} & \dfrac{\partial{f_{2}}}{\partial{x_{2}}} & \cdots & \dfrac{\partial{f_{2}}}{\partial{x_{n}}} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial{f_{m}}}{\partial{x_{1}}} & \dfrac{\partial{f_{m}}}{\partial{x_{2}}} & \cdots & \dfrac{\partial{f_{m}}}{\partial{x_{n}}} \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$$
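This closes the loop with the opening equation $\boldsymbol{y}=A_{m\times n}\boldsymbol{x}$: for a linear map, the Jacobian is exactly the matrix $A$, independent of $\boldsymbol{x}$. A small sketch (matrix and point chosen arbitrarily for illustration):

```python
import numpy as np

# For the linear map y = A x, the Jacobian is A itself:
# dy_i/dx_j = A[i, j], independent of x
A = np.array([[1.0,  2.0, 0.0],
              [0.0, -1.0, 3.0]])        # maps R^3 -> R^2
f = lambda x: A @ x

x = np.array([0.5, -1.0, 2.0])
eps = 1e-6
# Central differences along each coordinate axis recover A exactly
J = np.column_stack([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                     for e in np.eye(3)])
```

Because all higher derivatives of a linear map vanish, the finite-difference Jacobian agrees with $A$ up to floating-point rounding, at any base point $\boldsymbol{x}$.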
