Finding confidence interval for Maximum Likelihood Estimation on a probit model

I have reached a bit of an impasse in my work, and think I need some advice.

I am trying to find the optimal set of parameters for my data in a probit model, which (according to the literature) can be done through maximum likelihood estimation (MLE).
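So in this case the log-likelihood function is described as:

$$\ell = \sum_{i=1}^{N} \left[ y_i \log P_i + (1 - y_i) \log(1 - P_i) \right]$$

where y_i is the observed outcome for patient i and P_i is the model probability of an incident for that patient.
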
And it consists of three parameters that I wish to estimate.
I know there are probably optimization algorithms that can do this, but you can also brute-force it: choose a grid of values for each parameter, and then calculate the log-likelihood for every parameter set over all patients (in this case). I have about 500 patients, for whom I know whether y is equal to 0 or 1 (an incident). So I get one log-likelihood value when all 500 patients are used with one set of parameters, and in turn quite a few values, depending on the size of the grid.
I then end up with a huge grid/matrix of log-likelihood values. The idea is to find the largest value, which should correspond to the most likely set of parameters for this group of patients. That part is done.
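
The brute-force procedure described above can be sketched roughly as follows. This is only an illustration under assumptions that are not in the post: the three parameters are taken to be an intercept and two slopes in P = Phi(b0 + b1*x1 + b2*x2), the covariates are hypothetical, and the 500 "patients" are simulated rather than real data.

```python
import numpy as np
from scipy.stats import norm

def log_likelihood(beta, X, y):
    """Bernoulli log-likelihood of a probit model with P = Phi(X @ beta)."""
    z = X @ beta
    # log(P) = logcdf(z) and log(1 - P) = logcdf(-z); this avoids log(0) for extreme z
    return np.sum(y * norm.logcdf(z) + (1 - y) * norm.logcdf(-z))

def grid_search(X, y, grids):
    """Evaluate the log-likelihood on the Cartesian product of the 1-D grids."""
    shape = tuple(len(g) for g in grids)
    ll = np.empty(shape)
    for idx in np.ndindex(*shape):
        beta = np.array([g[i] for g, i in zip(grids, idx)])
        ll[idx] = log_likelihood(beta, X, y)
    # Position of the largest log-likelihood value in the grid
    best = np.unravel_index(np.argmax(ll), shape)
    beta_hat = np.array([g[i] for g, i in zip(grids, best)])
    return ll, beta_hat

# Simulated stand-in for the 500 patients (assumed model, not real data)
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
true_beta = np.array([-0.5, 1.0, 0.5])
y = (rng.random(n) < norm.cdf(X @ true_beta)).astype(float)

# 21 candidate values per parameter, so 21**3 parameter sets in total
grids = [np.linspace(-2, 2, 21)] * 3
ll, beta_hat = grid_search(X, y, grids)
print("grid maximum at", beta_hat, "with log-likelihood", ll.max())
```

The grid maximum can only be as precise as the grid spacing, so a finer grid (or a second, narrower grid around the first maximum) would be needed for a sharper estimate.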

My main problem now is: how do I define a confidence interval (e.g. 95%) for this most likely parameter set, given that I have the entire log-likelihood matrix with ALL log-likelihood values available? Is it even possible, or...?


Ambassador to the humans
You can use the Hessian matrix to get standard errors. If you use an optimization routine that provides the Hessian you could use that.
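
A minimal sketch of that suggestion, under assumptions that are not part of the thread: the same hypothetical three-parameter probit P = Phi(b0 + b1*x1 + b2*x2) fitted to simulated data, with the Hessian of the negative log-likelihood approximated by central finite differences at the optimum. Inverting that Hessian gives an approximate covariance matrix, the square roots of its diagonal are the standard errors, and estimate ± 1.96 standard errors is the usual large-sample (Wald) 95% interval.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def neg_log_likelihood(beta, X, y):
    """Negative Bernoulli log-likelihood of a probit model with P = Phi(X @ beta)."""
    z = X @ beta
    return -np.sum(y * norm.logcdf(z) + (1 - y) * norm.logcdf(-z))

def numerical_hessian(f, theta, h=1e-4):
    """Central-difference approximation of the second-derivative matrix of f at theta."""
    k = len(theta)
    H = np.empty((k, k))
    for i in range(k):
        for j in range(k):
            ei = np.zeros(k)
            ej = np.zeros(k)
            ei[i] = h
            ej[j] = h
            H[i, j] = (f(theta + ei + ej) - f(theta + ei - ej)
                       - f(theta - ei + ej) + f(theta - ei - ej)) / (4 * h * h)
    return H

# Simulated stand-in for the 500 patients (assumed model, not real data)
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
y = (rng.random(n) < norm.cdf(X @ np.array([-0.5, 1.0, 0.5]))).astype(float)

# Maximise the log-likelihood by minimising its negative
res = minimize(neg_log_likelihood, x0=np.zeros(3), args=(X, y), method="BFGS")

# Hessian of the negative log-likelihood at the estimate (observed information)
H = numerical_hessian(lambda b: neg_log_likelihood(b, X, y), res.x)
cov = np.linalg.inv(H)             # approximate covariance of the estimates
se = np.sqrt(np.diag(cov))         # standard errors
z975 = norm.ppf(0.975)             # about 1.96
ci = np.column_stack([res.x - z975 * se, res.x + z975 * se])
print("estimates:", res.x)
print("standard errors:", se)
print("95% Wald intervals:\n", ci)
```

With `method="BFGS"` the result object also carries `res.hess_inv`, but that is a running approximation built up during the iterations, so computing the Hessian directly at the optimum is usually the more reliable choice for standard errors.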
Yeah, I've read a bit about that, but I am unsure how to create that matrix in this case.

What I'm doing now is that I have a grid/matrix for each patient where P has been calculated for a range of parameters, i.e. it's a large grid whose size depends on the range of each parameter. For each of those P's in each patient I take the log, and then I sum them up according to the equation/figure in the OP. But the Hessian matrix is supposed to be the matrix of second derivatives of the log-likelihood function, right? So would I, instead of taking log(P) or log(1-P), just take 1/P or -1/P (the first derivative), or in fact the second derivative, i.e. -1/P^2 or 1/P^2?

The reason for the confusion is that the maximum of the log-likelihood is supposed to be found by taking the first derivative and setting it equal to zero. However, I am not taking the first derivative in my case, since I am just calculating the log-likelihood values and then picking the largest one. So I am not sure whether the Hessian should be the true second derivative, or whether the first derivative will be the same as the second in this case.

I apologize if my explanation is a bit confusing. Just ask away if you want something elaborated.

Thanks in advance.
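
For reference, a short worked note on the derivative question raised above. It assumes only that each P_i is a smooth function of the parameter vector, written here as θ = (θ_1, θ_2, θ_3); that symbol is introduced for illustration and does not appear in the post. The Hessian is taken with respect to the parameters, not with respect to P, so 1/P and 1/P^2 only appear as factors through the chain rule. When only grid values are available, the second derivatives can be approximated from neighbouring grid points with spacing h_j, provided the grid is fine enough near the maximum.

```latex
% Log-likelihood for binary outcomes y_i with model probabilities P_i(\theta)
\ell(\theta) = \sum_{i=1}^{N} \left[ y_i \log P_i(\theta) + (1 - y_i) \log\bigl(1 - P_i(\theta)\bigr) \right]

% First derivative (chain rule): 1/P_i appears, multiplied by dP_i/d\theta_j
\frac{\partial \ell}{\partial \theta_j}
  = \sum_{i=1}^{N} \left[ \frac{y_i}{P_i} - \frac{1 - y_i}{1 - P_i} \right]
    \frac{\partial P_i}{\partial \theta_j}

% Hessian entry: second derivative with respect to the parameters
H_{jk} = \frac{\partial^2 \ell}{\partial \theta_j \, \partial \theta_k}
  = \sum_{i=1}^{N} \left\{
      \left[ \frac{y_i}{P_i} - \frac{1 - y_i}{1 - P_i} \right]
      \frac{\partial^2 P_i}{\partial \theta_j \, \partial \theta_k}
      - \left[ \frac{y_i}{P_i^2} + \frac{1 - y_i}{(1 - P_i)^2} \right]
      \frac{\partial P_i}{\partial \theta_j} \frac{\partial P_i}{\partial \theta_k}
    \right\}

% Grid-based approximation of a diagonal entry at the grid maximum \hat\theta
H_{jj} \approx
  \frac{\ell(\hat\theta + h_j e_j) - 2\,\ell(\hat\theta) + \ell(\hat\theta - h_j e_j)}{h_j^2}

% Large-sample covariance and 95% interval
\widehat{\mathrm{Cov}}(\hat\theta) \approx \bigl(-H(\hat\theta)\bigr)^{-1},
\qquad
\hat\theta_j \pm 1.96 \sqrt{\bigl[(-H(\hat\theta))^{-1}\bigr]_{jj}}
```

Finding the maximum by scanning the grid instead of solving "first derivative = 0" does not change which derivative the Hessian is: it is still the matrix of second derivatives of the log-likelihood, evaluated at the maximizing parameter set.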