Suppose we are fitting a nonlinear regression model

$$Y_i=f(z_i,\theta)+\epsilon_i, \quad i=1,\dots,n.$$

In the Gauss-Newton method, the update of the parameter $\theta$ from step $t$ to step $t+1$ is obtained by minimizing, over $\theta$, the sum of squares

$$\sum_{i=1}^{n}\left[Y_i-f(z_i,\theta^{(t)})-(\theta-\theta^{(t)})^Tf'(z_i,\theta^{(t)})\right]^2.$$

How can we prove that the resulting update has the following form?

$$\theta^{(t+1)}=\theta^{(t)}+[(A^{(t)})^TA^{(t)}]^{-1}(A^{(t)})^Tx^{(t)},$$

where $A^{(t)}$ is the matrix whose $i$-th row is $f'(z_i,\theta^{(t)})^T$, and $x^{(t)}$ is the column vector whose $i$-th entry is $Y_i-f(z_i,\theta^{(t)})$.
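As a numerical sanity check (not a proof), here is a minimal Python sketch of one such update. The model $f(z,\theta)=\theta_1 e^{\theta_2 z}$, the data, and the starting value are all made up for illustration, and the function names are hypothetical. It builds $A^{(t)}$ and $x^{(t)}$ as defined above, computes the step from the formula with an explicit 2x2 inverse, and compares the linearized sum of squares at that step against nearby points.

```python
import math

def f(z, theta):
    # Hypothetical model, assumed for illustration: theta[0] * exp(theta[1] * z)
    return theta[0] * math.exp(theta[1] * z)

def grad_f(z, theta):
    # Gradient of f with respect to theta; plays the role of f'(z, theta)
    e = math.exp(theta[1] * z)
    return [e, theta[0] * z * e]

def gauss_newton_step(zs, ys, theta):
    # Row i of A is f'(z_i, theta)^T; entry i of x is Y_i - f(z_i, theta)
    A = [grad_f(z, theta) for z in zs]
    x = [y - f(z, theta) for z, y in zip(zs, ys)]
    # Entries of the symmetric 2x2 matrix A^T A and of the vector A^T x
    a11 = sum(r[0] * r[0] for r in A)
    a12 = sum(r[0] * r[1] for r in A)
    a22 = sum(r[1] * r[1] for r in A)
    b1 = sum(r[0] * xi for r, xi in zip(A, x))
    b2 = sum(r[1] * xi for r, xi in zip(A, x))
    det = a11 * a22 - a12 * a12
    # delta = (A^T A)^{-1} A^T x, using the closed-form 2x2 inverse
    delta = [(a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]
    return A, x, delta

def linearized_sse(A, x, d):
    # Sum of squares of the linearized residuals, with d = theta - theta^(t)
    return sum((xi - (r[0] * d[0] + r[1] * d[1])) ** 2 for r, xi in zip(A, x))

zs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.1, 2.9, 4.2, 6.1, 8.8]   # made-up observations
theta_t = [2.0, 0.7]             # made-up current iterate theta^(t)

A, x, delta = gauss_newton_step(zs, ys, theta_t)
theta_next = [theta_t[0] + delta[0], theta_t[1] + delta[1]]
best = linearized_sse(A, x, delta)

# Small perturbations of the step, to compare against the formula's value
perturbed = [
    linearized_sse(A, x, [delta[0] + e0, delta[1] + e1])
    for e0, e1 in [(1e-3, 0.0), (-1e-3, 0.0), (0.0, 1e-3), (0.0, -1e-3)]
]
print(theta_next, best, min(perturbed))
```

In this example every perturbed step gives a larger linearized sum of squares than the step from the formula, which is consistent with the formula being the minimizer, but it does not explain why.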


How can these relations be derived? Thanks in advance!