\(Y_i=f(z_i,\theta)+\epsilon_i\), \(i=1,\ldots,n\),

and in the Gauss-Newton method, the update of the parameter \(\theta\) from step \(t\) to step \(t+1\) is the value of \(\theta\) that minimizes the sum of squares

\(\sum_{i=1}^{n}[Y_i-f(z_i,\theta^{(t)})-(\theta-\theta^{(t)})^Tf'(z_i,\theta^{(t)})]^2\).

How can we prove that (part 1) the update takes the following form:

\(\theta^{(t+1)}=\theta^{(t)}+[(A^{(t)})^TA^{(t)}]^{-1}(A^{(t)})^Tx^{(t)}\),

where (part 2) \(A^{(t)}\) is the matrix whose \(i\)-th row is \(f'(z_i,\theta^{(t)})^T\), and \(x^{(t)}\) is the column vector whose \(i\)-th entry is \(Y_i-f(z_i,\theta^{(t)})\)?
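
To make the claim concrete, here is a minimal numerical sketch in Python with NumPy. It is an illustration under stated assumptions, not a proof: the exponential model, the data, the starting point, and the helper name `gauss_newton_step` are all hypothetical, and \(f'\) is taken to be the gradient of \(f\) with respect to \(\theta\). The sketch builds \(A^{(t)}\) and \(x^{(t)}\) as described, minimizes the linearized sum of squares with a least-squares solver, and compares the result with the closed-form update.

```python
import numpy as np

def gauss_newton_step(f, grad_f, z, y, theta):
    """One Gauss-Newton update (hypothetical helper for illustration)."""
    # Row i of A is the gradient f'(z_i, theta)^T; entry i of x is Y_i - f(z_i, theta).
    A = np.array([grad_f(zi, theta) for zi in z])
    x = y - np.array([f(zi, theta) for zi in z])
    # lstsq minimizes ||x - A d||^2, i.e. the linearized sum of squares
    # written in terms of d = theta_new - theta.
    d, *_ = np.linalg.lstsq(A, x, rcond=None)
    return theta + d, A, x

# Assumed model, chosen only for this example: f(z, theta) = theta_0 * exp(theta_1 * z)
def f(z, theta):
    return theta[0] * np.exp(theta[1] * z)

def grad_f(z, theta):
    e = np.exp(theta[1] * z)
    return np.array([e, theta[0] * z * e])

# Synthetic data (made up for the example)
rng = np.random.default_rng(0)
z = np.linspace(0.0, 1.0, 20)
y = f(z, np.array([2.0, -1.0])) + 0.01 * rng.standard_normal(z.size)

theta_t = np.array([1.5, -0.5])
theta_next, A, x = gauss_newton_step(f, grad_f, z, y, theta_t)

# Closed form from the question: theta + (A^T A)^{-1} A^T x, via the normal equations
closed_form = theta_t + np.linalg.solve(A.T @ A, A.T @ x)
print(np.allclose(theta_next, closed_form))
```

The comparison only makes sense when \((A^{(t)})^TA^{(t)}\) is invertible, which is the case here because the two gradient columns are linearly independent.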

How can these relations be derived? Thanks in advance!