Cohen's d for one-sample/paired-samples vs. independent samples, and pooled SDs

I have a few questions for you all about Cohen's d and pooled SDs! This is my first post, so I hope it is in line with the forum guidelines.

1) One-sample / paired-samples vs. independent-samples Cohen's d calculation
It seems that to calculate Cohen's d (an estimate of effect size), you should divide the difference in means (or, in the one-sample case, the difference between the mean and the baseline) by the SD, if available:
(M1 - M2) / SD
For paired samples, you use the mean of the difference scores divided by the SD of the difference scores.

However, when you only have the t-statistic and the degrees of freedom (df), I cannot figure out how the formula for d is derived in the independent-samples case. In the one-sample/paired-samples case, it's intuitive how to derive d from t, since t is calculated as the difference in means divided by the standard error:
t = (M1 - M2) / se
which equals:
(M1 - M2) / (SD / sqrt(n))
so d equals:
d = (M1 - M2) / SD = t / sqrt(n)
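
To convince myself of the one-sample/paired relationship, here is a minimal sketch that computes d both ways. The function name and the difference scores are made up for illustration, and it assumes SD is the usual sample SD (n - 1 denominator).

```python
import math
import statistics

def one_sample_d(values, baseline=0.0):
    """Return (d computed directly, d recovered from t) for one sample.

    Hypothetical helper for illustration; `values` can be raw scores
    or paired difference scores.
    """
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)      # sample SD, n - 1 denominator
    se = sd / math.sqrt(n)             # standard error of the mean
    t = (mean - baseline) / se         # one-sample / paired t-statistic
    d_direct = (mean - baseline) / sd  # d from the mean and SD
    d_from_t = t / math.sqrt(n)        # d from t and n only
    return d_direct, d_from_t

# Made-up difference scores, not real data
diffs = [1.2, 0.4, -0.3, 2.1, 0.8, 1.5, -0.1, 0.9]
d_direct, d_from_t = one_sample_d(diffs)
print(d_direct, d_from_t)
```

The two printed values should agree up to floating-point rounding, since dividing t by sqrt(n) just undoes the sqrt(n) in the standard error.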

And that's what you find in online articles here (formula 3) and here.

But for independent samples, the equation is instead:
d = 2t / sqrt(df), as seen here and here
or even
d = t * sqrt(2/n), as seen here
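
For comparison, here is a rough sketch that checks these two conversions numerically against d computed directly from the data. It assumes equal group sizes (n per group) and the standard pooled-variance t-test; the function name and the data are hypothetical, and d is computed with both pooled-SD denominators (n1 + n2 - 2 and n1 + n2) so each shortcut can be compared with each.

```python
import math
import statistics

def independent_d_checks(group1, group2):
    """Compare direct Cohen's d with the two t-based shortcuts.

    Hypothetical helper for illustration; requires equal group sizes.
    """
    n1, n2 = len(group1), len(group2)
    if n1 != n2:
        raise ValueError("this sketch assumes equal group sizes")
    n = n1
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)

    ss = (n1 - 1) * v1 + (n2 - 1) * v2   # pooled sum of squares
    df = n1 + n2 - 2
    sd_df = math.sqrt(ss / df)           # pooled SD, df denominator
    sd_n = math.sqrt(ss / (n1 + n2))     # pooled SD, n1 + n2 denominator

    # Pooled-variance (Student) t-statistic for two independent groups
    t = (m1 - m2) / (sd_df * math.sqrt(1 / n1 + 1 / n2))

    return {
        "d_direct_df": (m1 - m2) / sd_df,
        "d_direct_n": (m1 - m2) / sd_n,
        "t_times_sqrt_2_over_n": t * math.sqrt(2 / n),
        "two_t_over_sqrt_df": 2 * t / math.sqrt(df),
    }

# Made-up scores, not real data
a = [5.1, 6.3, 4.8, 7.0, 5.9, 6.4]
b = [4.2, 5.0, 3.9, 5.5, 4.7, 4.4]
for name, value in independent_d_checks(a, b).items():
    print(name, value)
```

Comparing the four printed values shows which direct calculation each shortcut reproduces when the groups are the same size.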

So where does the "2" come from? I think I'm okay regardless, because I have the original data and can use the actual means and SDs, but I would really like to know why this difference exists and where this "2" comes from. Is it related to pooled SDs?

2) For independent samples, which Pooled SD is used for Cohen's d calculations?

If I have the means, SDs, and n's of two independent groups, I should be able to calculate Cohen's d without using t-statistics and df's, by using the pooled SD in the formula (M1 - M2) / SD above. However, I have seen both of these:
SDpooled1 = sqrt( ((n1 - 1)s1^2 + (n2 - 1)s2^2) / (n1 + n2) )
SDpooled2 = sqrt( ((n1 - 1)s1^2 + (n2 - 1)s2^2) / (n1 + n2 - 2) )
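
To make the comparison concrete, here is a small sketch that computes both pooled SDs, and the resulting d, from summary statistics alone. The function names and the numbers are invented for illustration; s1 and s2 are assumed to be sample SDs (n - 1 denominator).

```python
import math

def pooled_sds(n1, s1, n2, s2):
    """Return (SDpooled1, SDpooled2) from group sizes and sample SDs."""
    ss = (n1 - 1) * s1**2 + (n2 - 1) * s2**2    # pooled sum of squares
    sd_pooled1 = math.sqrt(ss / (n1 + n2))      # n1 + n2 denominator
    sd_pooled2 = math.sqrt(ss / (n1 + n2 - 2))  # n1 + n2 - 2 denominator
    return sd_pooled1, sd_pooled2

def effect_size(m1, m2, sd):
    """Standardized mean difference for a given choice of SD."""
    return (m1 - m2) / sd

# Invented summary statistics, not real data
m1, s1, n1 = 52.0, 4.0, 20
m2, s2, n2 = 49.0, 5.0, 25

sd1, sd2 = pooled_sds(n1, s1, n2, s2)
print("SDpooled1:", sd1, "d:", effect_size(m1, m2, sd1))
print("SDpooled2:", sd2, "d:", effect_size(m1, m2, sd2))
```

Because the numerator is the same and only the denominator differs, SDpooled1 is always the smaller of the two, so the d based on it is always slightly larger in magnitude; the gap shrinks as the groups get bigger.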

From what I've seen, SDpooled1 is a biased estimator of the SD and SDpooled2 is unbiased; as I understand it, that means the biased estimator (SDpooled1) underestimates the SD. It seems that Cohen's d uses SDpooled1, whereas Hedges' g uses SDpooled2 (see "Effect size" on Wikipedia, and formula (5) in this Rosnow et al. 2000 Psych Science article).

Is this true? And if so, why would one effect size estimator (Hedges' g) use the unbiased estimator, SDpooled2, while the other (Cohen's d) does not? Other sites list SDpooled as sqrt((s1^2 + s2^2) / 2), which is the same as SDpooled2 when sample sizes are equal (the (n - 1) factors cancel against the 2n - 2 denominator): here and here.

Thanks so much everyone!
Alon Hafri