As far as I know, you need the equivalence study if you want any measure of statistical reliability attached to your inference. Failing to find a difference and then declaring the groups similar is not valid, and it is a common mistake perpetuated by many. This is not minutiae or "technicalities." I'll offer a simple example, but I also direct you to this paper, points 4-6, 8.

Suppose a 95% CI for the mean change in HR is (-5, +5). This indicates nonsignificance at the 5% level on a two-tailed test of the null hypothesis H0: Δμ_HR = 0. The logic others put forth is that failing to find significance means there is no difference between the groups, and that the CI supports this because 0 is in the interval. This is a mistaken understanding of the method and its output. If we conclude "no difference" because the CI contains zero, why can we not equally conclude the difference is -4, -3, -2, -1, 1, 2, 3, 4, or any other value in (-5, +5)? The center of the interval is not necessarily the value most compatible with the data, so many hypotheses besides "no difference" are consistent with our observations; failing to reach significance therefore cannot establish that there is no difference. Finally, it is a logical fallacy to treat absence of evidence for an effect as evidence of the effect's absence.
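To make the CI point concrete, here is a minimal sketch (hypothetical numbers, normal approximation) showing that the same data yielding a 95% CI of (-5, +5) are also nonsignificant against many nonzero hypothesized differences, not just against zero:

```python
from math import erf, sqrt

def two_sided_p(obs_diff, se, hypothesized):
    """Two-sided p-value for H0: true difference = hypothesized,
    using a normal approximation to the test statistic."""
    z = abs(obs_diff - hypothesized) / se
    return 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))

# Hypothetical numbers matching the example: observed mean change 0,
# 95% CI of (-5, +5)  =>  standard error = 5 / 1.96
se = 5 / 1.96
for delta in [-4, -2, 0, 2, 4]:
    p = two_sided_p(0.0, se, delta)
    print(f"H0: difference = {delta:+d}  ->  p = {p:.3f}")
```

Every hypothesized difference inside the interval gives p > 0.05, so "the test was nonsignificant against zero" singles out zero no more than it singles out -4 or +4.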

You're quite right that the method above is flawed and outright incorrect. You will need an equivalence study; otherwise, the most you can conclude from frequentist testing is either that there is insufficient evidence of a true mean difference in HR between the populations at the chosen alpha level, or that there is sufficient evidence that a true mean difference in HR exists between them.
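For reference, the standard equivalence procedure is two one-sided tests (TOST). This is a minimal sketch under a normal approximation; the observed difference, standard error, and the ±5 bpm equivalence margin are all hypothetical stand-ins, and the margin must be pre-specified on clinical grounds, not chosen after seeing the data:

```python
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def tost_equivalence(obs_diff, se, margin, alpha=0.05):
    """Two one-sided tests (TOST) for equivalence within (-margin, +margin).

    Equivalence is declared only if BOTH one-sided nulls are rejected:
      H0a: true difference <= -margin
      H0b: true difference >= +margin
    Returns (overall p-value, equivalent?) under a normal approximation.
    """
    p_lower = 1 - norm_cdf((obs_diff + margin) / se)  # reject "diff <= -margin"?
    p_upper = norm_cdf((obs_diff - margin) / se)      # reject "diff >= +margin"?
    p = max(p_lower, p_upper)
    return p, p < alpha

# Hypothetical HR example: observed mean difference 1 bpm, SE 2, margin 5 bpm
p, equivalent = tost_equivalence(1.0, 2.0, 5.0)
print(f"TOST p = {p:.4f}, equivalent within +/-5 bpm: {equivalent}")
```

Equivalently, TOST at alpha = 0.05 declares equivalence exactly when the 90% CI for the difference lies entirely inside the margins, which is a separate, pre-specified criterion rather than a reinterpretation of a nonsignificant two-sided test.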