Stata: ml versus nl

I need to estimate an equation using some iterative method to optimally choose the coefficients. I won't go into the reasons that I can't use simpler methods like OLS, GLS, or instrumental variables (it's a long, sad story).

My question is about whether I can expect equivalent results from ml and nl. Say I want to estimate

y_t = \alpha + \beta y_{t-1} + \gamma x_t + u_t

where I have autocorrelation in the error terms. Do I need to put in the effort to specify a likelihood function and ask Stata to maximize it, or will I get the same result if I ask Stata to minimize the sum of squares using nl?
What is the distribution of the error terms u_t? If they are i.i.d. Gaussian, then the ML coefficient estimates will be the same as those from NL. If not, they won't be the same, and ML will be more efficient when its distributional assumption is correct (but arguably NL will be more robust when the ML model is misspecified, although you may have a hard time getting Newey-West standard errors out of NL). If the errors are Cauchy, then the NL estimator may not be well defined or may not converge.
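
To see why the i.i.d. Gaussian case gives identical coefficients, note that the Gaussian log-likelihood is -(n/2) log(2 pi sigma^2) - SSR/(2 sigma^2); plugging in sigma^2 = SSR/n leaves a function that only decreases as SSR grows, so maximizing it is the same as minimizing the sum of squares. Below is a minimal sketch of that check in plain Python rather than Stata. The simulated data, the parameter values, and all function names (simulate, ssr, least_squares, and so on) are made up for illustration, and the errors are assumed i.i.d. N(0, 1), so it says nothing about the autocorrelated case. Because this toy model is linear in the parameters, the least-squares solution is computed in closed form instead of iteratively.

```python
import math
import random


def simulate(n=200, alpha=1.0, beta=0.5, gamma=2.0, seed=0):
    """Simulate y_t = alpha + beta*y_{t-1} + gamma*x_t + u_t with i.i.d. N(0,1) errors."""
    rng = random.Random(seed)
    y_prev = 0.0  # assumed starting value
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        u = rng.gauss(0.0, 1.0)
        y = alpha + beta * y_prev + gamma * x + u
        rows.append((y, y_prev, x))
        y_prev = y
    return rows


def ssr(params, rows):
    """Sum of squared residuals: the objective a least-squares fit minimizes."""
    a, b, g = params
    return sum((y - a - b * y_lag - g * x) ** 2 for y, y_lag, x in rows)


def gaussian_loglik(params, sigma2, rows):
    """Log-likelihood under i.i.d. N(0, sigma2) errors, conditional on the initial y."""
    n = len(rows)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - ssr(params, rows) / (2 * sigma2)


def concentrated_loglik(params, rows):
    """Log-likelihood with sigma2 replaced by its maximizer, SSR/n."""
    n = len(rows)
    s2 = ssr(params, rows) / n
    return -0.5 * n * (math.log(2 * math.pi * s2) + 1.0)


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def least_squares(rows):
    """Minimize SSR via the normal equations (possible because the model is linear)."""
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for y, y_lag, x in rows:
        z = (1.0, y_lag, x)
        for i in range(3):
            b[i] += z[i] * y
            for j in range(3):
                A[i][j] += z[i] * z[j]
    return solve_linear(A, b)


rows = simulate()
ls_hat = least_squares(rows)
print("least-squares estimates (alpha, beta, gamma):", ls_hat)
print("concentrated log-likelihood at those estimates:", concentrated_loglik(ls_hat, rows))

# Moving away from the least-squares solution raises SSR and lowers the likelihood.
shifted = [ls_hat[0] + 0.05, ls_hat[1], ls_hat[2]]
print("SSR at LS vs shifted:", ssr(ls_hat, rows), ssr(shifted, rows))
print("log-lik at LS vs shifted:", concentrated_loglik(ls_hat, rows), concentrated_loglik(shifted, rows))
```

Swap the Gaussian density for anything else (or let the errors be autocorrelated) and the two objectives stop being monotone transformations of each other, which is where the ML and NL estimates start to differ.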