Why Do We Use Sample SD to Estimate Pop SD in NHST?

Hello. I'm trying to understand how we derive the properties of a null distribution in null hypothesis significance testing. A one-sample z-test scenario (testing whether a sample mean is significantly different from a population where µ and σ are known) makes sense to me. My understanding is that the central limit theorem states that the sampling distribution of the mean will have a mean of µ and an SD of σ/√n (i.e., the standard error), so those are the properties of the null distribution in that scenario. Once I've established the mean and SE of the null distribution, I understand how to work out the probability of observing a sample mean at least as extreme as the one I have.
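To make the scenario concrete, here's a minimal sketch of that z-test calculation using only the Python standard library. The values of µ, σ, n, and the sample mean are made up for illustration:

```python
import math
from statistics import NormalDist

# Hypothetical example: population parameters mu and sigma are known.
mu, sigma = 100.0, 15.0   # assumed population mean and SD
n, xbar = 25, 106.0       # assumed sample size and observed sample mean

se = sigma / math.sqrt(n)                # standard error from the CLT: sigma / sqrt(n)
z = (xbar - mu) / se                     # z statistic for the observed mean
p = 2 * (1 - NormalDist().cdf(abs(z)))   # two-tailed p-value from the standard normal

print(round(z, 3), round(p, 4))  # → 2.0 0.0455
```

Here the null distribution of the sample mean is Normal(µ = 100, SE = 15/√25 = 3), and the p-value is just the two-tailed normal-tail area beyond the observed mean.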

But if we take the same scenario, except that σ is unknown, the reasoning starts breaking down for me. My understanding is that we estimate σ using the sample standard deviation (s). That means the properties of the null distribution will be mean = µ and SE = s/√n, and you can work out the probability value from the t distribution. I can do the calculations, but what I don't understand is why s is taken as a trustworthy estimate of σ to plug into the SE formula in the first place. If the sample is representative of the population, then it makes perfect sense to me that s could estimate σ. But isn't the whole idea of a one-sample t-test that perhaps the sample doesn't belong to the population we're comparing it to? And if it doesn't, then why would s be a good estimate of σ? The reasoning only seems to make sense when the null is true.
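For completeness, here's the same calculation with σ unknown, i.e., the one-sample t-test I'm describing. The sample values and null mean are invented for illustration; the manual calculation is cross-checked against SciPy's built-in `ttest_1samp`:

```python
from statistics import stdev
from scipy import stats

# Hypothetical data: a small sample compared against a null mean mu0.
sample = [102.0, 98.5, 105.1, 99.2, 101.7, 103.3, 97.9, 104.4]  # made-up values
mu0 = 100.0   # null-hypothesis population mean
n = len(sample)

xbar = sum(sample) / n
s = stdev(sample)            # sample SD (n-1 denominator), used to estimate sigma
se = s / n ** 0.5            # estimated standard error: s / sqrt(n)
t = (xbar - mu0) / se        # t statistic with n-1 degrees of freedom
p = 2 * stats.t.sf(abs(t), df=n - 1)  # two-tailed p-value from the t distribution

# Sanity check against SciPy's one-sample t-test
t_check, p_check = stats.ttest_1samp(sample, mu0)
assert abs(t - t_check) < 1e-9 and abs(p - p_check) < 1e-9
```

This is exactly the calculation I can carry out mechanically: s is plugged in for σ in the SE formula, and the t distribution (rather than the normal) accounts for the extra uncertainty from estimating σ. My question is about the justification for that substitution, not the mechanics.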

I've looked through several intro stats books, but none of them give a justification for this. I've also tried Googling and searching this forum, but I haven't found this specific question posed anywhere. I would appreciate any insights! Is it just a limitation of the analysis, or am I misunderstanding how it works? Thanks!