Before asking this, I read similar questions, but none of them led to a satisfying answer for my specific case.

I want to homogenize a 64-year (1940-2003) precipitation time series from the Dominican Republic. For that, it is really important to select a good reference series from a group of candidates.

Let's say "sjo" is the base series, for which I want to find a good reference series; "bani", "plc" and "ra" are the reference candidates, because they are close to "sjo". In the attached map (JPG), the red point is the base station and the green points are the reference candidates.

I ran three correlation analyses (in R, with the function cor()) on these monthly variables: raw precipitation, normalized difference, and Box-Cox transformed values. These variables correspond, respectively, to the fields that begin with "p", "dian" and "pnorm".

The normalized difference comes from the first difference method (FDM) proposed by Peterson, and is computed as:

[Pm(t) - Pm(t-1)] / [Pm(t) + Pm(t-1)],

where Pm(t) is the precipitation for month m in year t, and Pm(t-1) is the precipitation for the same month one year earlier. I followed the remark in Peterson et al. (1998) that FDM applied to precipitation might work better with normalized differences.
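To make the formula concrete, here is a minimal Python sketch of the normalized difference for a monthly series stored as a flat list (January of the first year first). The function name `normalized_difference` and the choice to return `None` when both months are dry (the 0/0 case) are my own assumptions for illustration; they are not taken from Peterson et al.

```python
def normalized_difference(monthly, lag=12):
    """Return [P(t) - P(t-lag)] / [P(t) + P(t-lag)] for each value from index `lag` on.

    `monthly` is a flat list of monthly totals; lag=12 compares each month
    with the same month one year earlier.
    Assumption: when both values are zero the ratio is undefined (0/0),
    so None is stored instead of a number.
    """
    result = []
    for i in range(lag, len(monthly)):
        current, previous = monthly[i], monthly[i - lag]
        total = current + previous
        result.append(None if total == 0 else (current - previous) / total)
    return result


# Tiny made-up example with lag=2 so it is easy to check by hand:
# (30 - 10) / (30 + 10) = 0.5, and the second pair is 0/0.
print(normalized_difference([10.0, 0.0, 30.0, 0.0], lag=2))
```

The result is bounded between -1 and 1, which is part of why it behaves better than a plain difference for a variable like precipitation.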

As can be seen on page 1 of the attached PDF, the correlation was first calculated for the whole time series (1940-2003). For raw precipitation and Box-Cox transformed values, "bani" is the series best correlated with "sjo" (cells with a yellow background show the maximum correlation coefficient). Notice that for raw precipitation, "bani" is considerably more correlated than the others. For normalized difference, "ra" is only slightly more correlated than the rest. However, every candidate station has a statistically significant correlation with "sjo" at the 0.05 significance level, which suggests that ANY of them could be used as a reference series.
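A rough sketch of why the significance test alone does not discriminate between candidates: with a long series, even a small correlation passes the 0.05 threshold. The Python code below computes Pearson's r and the smallest |r| that would be significant for a given sample size. It assumes 768 complete monthly values (64 years x 12, no gaps), independent observations, and a normal approximation to the t distribution; all three are my assumptions, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def critical_r(n, alpha=0.05):
    """Smallest |r| that is significant at level alpha (two-sided).

    Derived from t = r * sqrt(n - 2) / sqrt(1 - r^2), with the t quantile
    replaced by the normal quantile (an approximation that is reasonable
    only for large n).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z / sqrt(n - 2 + z * z)


# With 768 monthly values the threshold is roughly 0.07, so almost any
# nearby station would pass the test.
print(round(critical_r(768), 3))
```

In R, cor.test() reports the exact test for a pair of series.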

This left me unsatisfied, so I decided to make a more detailed analysis: I split the series into 5-year periods and evaluated the correlation between series in each period for the same 3 variables (raw precipitation, normalized difference and Box-Cox transformed values).

The tables on pages 2 to 8 of the attached PDF show the results of these per-period correlations; the last page summarizes how many times each station had the maximum correlation value for each variable. As can be seen, "bani" is the station that most often has the highest correlation for all 3 variables (in every case, in more than 7 of the twelve 5-year periods analyzed).
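The per-period procedure described above can be sketched in Python as follows. This is only an illustration of the counting logic, not the code used for the PDF tables: `count_window_winners` is a hypothetical name, the series are assumed to be aligned lists of equal length with no missing values, any incomplete trailing window is dropped, and a window where a series is constant (zero variance) is not handled.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def count_window_winners(base, candidates, window=60):
    """Count how often each candidate has the highest r with `base`.

    `candidates` maps a station name to its series; window=60 months
    corresponds to a 5-year period of monthly data. Only complete,
    non-overlapping windows are evaluated.
    """
    wins = {name: 0 for name in candidates}
    for start in range(0, len(base) - window + 1, window):
        stop = start + window
        scores = {
            name: pearson_r(base[start:stop], series[start:stop])
            for name, series in candidates.items()
        }
        # Ties go to the first candidate in dictionary order.
        wins[max(scores, key=scores.get)] += 1
    return wins


# Made-up toy data with 3-value windows, just to show the call.
base = [1, 2, 3, 1, 2, 3]
candidates = {"a": [1, 2, 3, 3, 2, 1], "b": [2, 4, 7, 1, 3, 4]}
print(count_window_winners(base, candidates, window=3))
```

Counting winners discards how large the gap between candidates is in each period, so it may be worth looking at the per-period coefficients themselves as well.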

With these results, I think that "bani" is the best candidate as a reference series for "sjo", but I'm not sure about it. Is the 5-year period analysis OK? Should I carry out some other analysis?

Thanks.

José
