Deriving the update formula in the Gauss-Newton method for minimization

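Suppose we fit the nonlinear regression model

\(Y_i=f(z_i,\theta)+\epsilon_i\), \(i=1,\ldots,n\).

In the Gauss-Newton method, the update of the parameter \(\theta\) from step \(t\) to step \(t+1\) is obtained by minimizing, over \(\theta\), the sum of squares of the linearized residuals

\[\sum_{i=1}^{n}\left(Y_i-f(z_i,\theta^{(t)})-f'(z_i,\theta^{(t)})^T(\theta-\theta^{(t)})\right)^2.\]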

Can we prove, and explain why, (part 1) the update is given in the following form:

\[\theta^{(t+1)}=\theta^{(t)}+\left((A^{(t)})^TA^{(t)}\right)^{-1}(A^{(t)})^Tx^{(t)},\]

(part 2) where \(A^{(t)}\) is a matrix whose \(i\)-th row is \(f'(z_i,\theta^{(t)})^T\), and \(x^{(t)}\) is a column vector whose \(i\)-th entry is \(Y_i-f(z_i,\theta^{(t)})\).
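
To make the question concrete, here is a minimal numerical sketch of one such update, assuming the standard form \(\theta^{(t+1)}=\theta^{(t)}+\left((A^{(t)})^TA^{(t)}\right)^{-1}(A^{(t)})^Tx^{(t)}\). The exponential model `f`, its gradient `f_grad`, the helper `gauss_newton_step`, and the data are hypothetical and chosen only for illustration; they are not part of the problem statement.

```python
import numpy as np

def f(z, theta):
    # Hypothetical example model: f(z, theta) = theta[0] * exp(theta[1] * z)
    return theta[0] * np.exp(theta[1] * z)

def f_grad(z, theta):
    # Row i is the gradient of f(z_i, theta) with respect to theta, i.e. f'(z_i, theta)^T
    e = np.exp(theta[1] * z)
    return np.column_stack([e, theta[0] * z * e])

def gauss_newton_step(z, y, theta):
    A = f_grad(z, theta)   # n x p matrix A^{(t)}
    x = y - f(z, theta)    # residual vector x^{(t)}
    # Least-squares solution of A d = x; equals (A^T A)^{-1} A^T x
    # when A has full column rank
    d, *_ = np.linalg.lstsq(A, x, rcond=None)
    return theta + d

# Noiseless synthetic data (an assumption made purely for illustration)
z = np.linspace(0.0, 1.0, 20)
theta_true = np.array([2.0, -1.5])
y = f(z, theta_true)

theta = np.array([1.5, -1.0])  # starting value theta^{(0)}
for _ in range(20):
    theta = gauss_newton_step(z, y, theta)
```

Using `np.linalg.lstsq` rather than forming \((A^{(t)})^TA^{(t)}\) explicitly is only a numerical convenience; both give the same step when \(A^{(t)}\) has full column rank.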

How can these relations be derived? Thanks in advance!