weighted correlation

Lazar

Phineas Packard
#1
Hi Folks,

So I have data that looks like:
Code:
myData <- structure(list(total = c(0.154, 0.223, 0.252, 0.157, 0.196, 0.252, 
0.198, 0.162, 0.222, 0.135, 0.196, 0.193, 0.192, 0.274, 0.177, 
0.17, 0.214, 0.214, 0.107, 0.22, 0.177, 0.177, 0.165, 0.226, 
0.196, 0.267, 0.174, 0.12, 0.159), totalse = c(0.009, 0.014, 
0.009, 0.006, 0.02, 0.01, 0.012, 0.013, 0.008, 0.007, 0.012, 
0.01, 0.015, 0.012, 0.009, 0.013, 0.01, 0.011, 0.008, 0.019, 
0.012, 0.011, 0.01, 0.01, 0.011, 0.01, 0.01, 0.012, 0.006), strat = c(-1.043, 
1.817, 1.018, -1.321, -0.138, 1.621, 1.862, -0.087, -1.02, -0.87, 
-0.474, -1.043, -0.474, 1.421, -0.302, -0.063, 0.166, -0.474, 
0.072, 0.7, 0.937, -1.043, -0.419, -0.083, -0.327, 1.621, -0.138, 
1.201, -1.321)), .Names = c("total", "totalse", "strat"), row.names = c(1L, 
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 
16L, 17L, 18L, 19L, 20L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 
30L), class = "data.frame", na.action = structure(21L, .Names = "21", class = "omit"))
I need the correlation between total and strat but weighted by totalse such that cases with smaller totalse get more weight in the estimation of the correlation.

What would be the best way to do this?
 

Jake

Cookie Scientist
#2
I don't think I've seen anything like this before, but the first idea that comes to my mind is to just do a WLS regression and then multiply the regression coefficient by sd(x)/sd(y) to make it like a correlation coefficient.
 

Lazar

Phineas Packard
#3
I think it is meta-analysis-esque. The issue I am having is that I know how to adjust a mean but not a correlation.
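One way to carry the weighted-mean idea over to a correlation is to replace every mean in the Pearson formula with a weighted mean. The following is only a minimal sketch of that idea: `weighted_cor` is a hypothetical helper name, and the vectors are toy values for illustration, not the data from post #1.

```r
# Hypothetical helper: weighted Pearson correlation built from weighted means.
weighted_cor <- function(x, y, w) {
  mx <- weighted.mean(x, w)                          # weighted mean of x
  my <- weighted.mean(y, w)                          # weighted mean of y
  cov_xy <- weighted.mean((x - mx) * (y - my), w)    # weighted covariance
  var_x  <- weighted.mean((x - mx)^2, w)             # weighted variance of x
  var_y  <- weighted.mean((y - my)^2, w)             # weighted variance of y
  cov_xy / sqrt(var_x * var_y)
}

# Toy values for illustration only (not the data from post #1).
x  <- c(-1.0, 0.5, 1.2, -0.3, 0.8)
y  <- c(0.15, 0.21, 0.25, 0.17, 0.22)
se <- c(0.009, 0.014, 0.010, 0.006, 0.020)

r_w     <- weighted_cor(x, y, w = 1 / se)             # smaller se, larger weight
r_equal <- weighted_cor(x, y, w = rep(1, length(x)))  # equal weights
```

With equal weights the helper reduces to the ordinary `cor(x, y)`, which is a quick sanity check on the formula.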
 

Lazar

Phineas Packard
#4
How about something like the following:

Code:
inv.weight <- function(se) {
  # Inverse of the standard errors: a smaller se gets a larger weight
  u.weight <- 1/se
  # Rescale so the weights sum to the number of cases
  u.weight * length(se) / sum(u.weight)
}

w.cor <- function(y, x, weight) {
  sd.x <- sd(x, na.rm = TRUE)
  sd.y <- sd(y, na.rm = TRUE)
  # Slope from a weighted least squares regression of y on x
  coef1 <- unname(coef(lm(y ~ x, weights = weight))[2])
  # Standardize the slope so it is on a correlation-like scale
  coef1 * sd.x / sd.y
}

w.cor(myData$total, myData$strat, weight = inv.weight(myData$totalse))
EDIT: To sum up
Step 1 - Take the inverse of the standard errors
Step 2 - Adjust step 1 so the weights sum to the N in the data set
Step 3 - Run regression as Jake suggests
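The three steps can be sketched compactly as below. The vectors are toy values for illustration, not the data from post #1, and the variable names are made up for this sketch. It also fits the regression with the raw weights from step 1, because in `lm` multiplying all weights by a constant leaves the coefficients unchanged, so step 2 should not affect the slope.

```r
# Toy values for illustration only (not the data from post #1).
x  <- c(-1.0, 0.5, 1.2, -0.3, 0.8)
y  <- c(0.15, 0.21, 0.25, 0.17, 0.22)
se <- c(0.009, 0.014, 0.010, 0.006, 0.020)

w_raw    <- 1 / se                                # Step 1: inverse standard errors
w_scaled <- w_raw * length(se) / sum(w_raw)       # Step 2: rescale so weights sum to N

fit_raw    <- lm(y ~ x, weights = w_raw)          # Step 3 with the raw weights
fit_scaled <- lm(y ~ x, weights = w_scaled)       # Step 3 with the rescaled weights

slope_raw    <- unname(coef(fit_raw)["x"])
slope_scaled <- unname(coef(fit_scaled)["x"])
```

Both slopes should agree up to floating-point error, so the rescaling is harmless but not required for the coefficient itself.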
 

Jake

Cookie Scientist
#5
Yeah, that is what I was thinking, although it definitely feels like one of those solutions that is just the first not-crazy thing that popped into my head, rather than an optimal solution in any way (except perhaps by accident).
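One way to probe that doubt, as a sketch rather than a settled answer: the WLS slope equals the weighted covariance divided by the weighted variance of x, so standardizing it with weighted standard deviations reproduces the weighted Pearson correlation exactly. Standardizing with plain `sd()` gives a different number that is not guaranteed to stay within [-1, 1]. The example uses `cov.wt()` from the stats package and toy values for illustration, not the data from post #1.

```r
# Toy values for illustration only (not the data from post #1).
x  <- c(-1.0, 0.5, 1.2, -0.3, 0.8)
y  <- c(0.15, 0.21, 0.25, 0.17, 0.22)
se <- c(0.009, 0.014, 0.010, 0.006, 0.020)
w  <- 1 / se

# WLS slope of y on x
slope <- unname(coef(lm(y ~ x, weights = w))["x"])

# Weighted covariance and correlation matrices (column 1 = x, column 2 = y)
cw <- cov.wt(cbind(x, y), wt = w / sum(w), cor = TRUE, method = "ML")
sd_w_x <- sqrt(cw$cov[1, 1])
sd_w_y <- sqrt(cw$cov[2, 2])

r_direct        <- cw$cor[1, 2]              # weighted Pearson correlation
r_from_slope    <- slope * sd_w_x / sd_w_y   # slope standardized with weighted SDs
r_unweighted_sd <- slope * sd(x) / sd(y)     # slope standardized with plain SDs
```

Here `r_from_slope` matches `r_direct`, while `r_unweighted_sd` is the quantity computed in post #4 and will generally differ from both.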