Compound Distribution Hyper-Parameters

Suppose a compound distribution is built from a gamma distribution with parameters (a,b), where b is itself gamma-distributed with parameters (c,d). Marginalising out b gives a compound distribution f(a,c,d).
My question is:
If I run optimization code on the log-likelihood of the compound distribution to estimate a, c and d, are the values I get equivalent to the values I would get if I solved for (a,b) and (c,d) separately using their original gamma distributions? I am asking because standard optimization algorithms do not find a solution for the compound function, but they do for the individual gamma functions, so I was wondering why one would use the compound distribution in the first place. I am not sure how to go about proving this mathematically.
Thanks for your responses!


TS Contributor
As you said, once you have marginalized out the parameter \( b \), you are left with a compound distribution with \( 3 \) parameters, and this is your model.
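
To make the marginalization concrete, here is a sketch of the integral. It assumes that \( a \) and \( c \) are shape parameters and that \( b \) and \( d \) are rate parameters; the question does not say which parameterization is meant, and the result changes if \( b \) is a scale instead.

\[
f(x \mid a, c, d) = \int_0^\infty \frac{b^{a} x^{a-1} e^{-bx}}{\Gamma(a)} \cdot \frac{d^{c} b^{c-1} e^{-db}}{\Gamma(c)} \, db
= \frac{x^{a-1} d^{c}}{\Gamma(a)\Gamma(c)} \int_0^\infty b^{a+c-1} e^{-(x+d)b} \, db
= \frac{d^{c} \, x^{a-1}}{B(a,c) \, (x+d)^{a+c}}, \qquad x > 0,
\]

where \( B(a,c) = \Gamma(a)\Gamma(c)/\Gamma(a+c) \). Under these assumptions \( b \) no longer appears anywhere in the density, so the data carry information about \( a, c, d \) only.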

I am not quite sure I understand the latter part of your question. You do not observe any realizations (data) of \( b \), do you? If not, I do not see how you could estimate \( c, d \) from the original gamma distribution.
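
A minimal sketch of what fitting the 3-parameter model could look like, in case it helps. Everything here is an assumption made for illustration rather than something taken from your setup: \( b \) and \( d \) are treated as rate parameters, each observation gets its own latent \( b \), the true values (2, 3, 4), the sample size, and the names `neg_log_lik`, `a_hat`, `c_hat`, `d_hat` are all made up. The parameters are optimized on the log scale so that they stay positive, which sometimes helps when a standard optimizer struggles.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import betaln


def neg_log_lik(theta, x):
    """Negative log-likelihood of the marginal (compound) density.

    theta holds log(a), log(c), log(d), so the search is unconstrained.
    Assumes a, c are shapes and d is a rate (illustrative choice).
    """
    a, c, d = np.exp(theta)
    log_f = (c * np.log(d) + (a - 1.0) * np.log(x)
             - betaln(a, c) - (a + c) * np.log(x + d))
    return -np.sum(log_f)


# Simulate from the hierarchy (hypothetical true values).
rng = np.random.default_rng(0)
a_true, c_true, d_true = 2.0, 3.0, 4.0
n = 5000
b = rng.gamma(shape=c_true, scale=1.0 / d_true, size=n)  # latent rates
x = rng.gamma(shape=a_true, scale=1.0 / b)               # observed data

# Only x enters the fit; b is treated as unobserved.
res = minimize(neg_log_lik, x0=np.zeros(3), args=(x,), method="Nelder-Mead")
a_hat, c_hat, d_hat = np.exp(res.x)
print("estimates (a, c, d):", a_hat, c_hat, d_hat)
```

Note that `b` is used only to generate `x` and never enters the likelihood. Fitting \( (c, d) \) with the original gamma density would need the values in `b`, which in a real data set you do not have.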