I am evaluating whether utility-scale solar energy plants affect the surrounding climate (initially temperature).
An effect has been found in one paper using the approach described below/attached, but when I repeat this approach I find no effect for the same site.
I want to be sure that there isn't a test or approach that I am unaware of that might be more appropriate than the existing methodology and which might do a better job of identifying or quantifying any effect.
Please assume good background knowledge of stats at degree level (though I am not a statistician).
Existing approach
- A site is identified and Landsat 8 imagery for 12 consecutive months pre- and post-construction is identified.
- The panels, site boundary, and any potentially problematic areas (e.g. disturbed ground, infrastructure) within 2km of the site boundary are masked using geometry. (The working assumption is that these areas would increase the RMSE of the data, though this is theoretical and has yet to be quantified.)
- A series of 100m-wide buffers is created from the site boundary outwards to 2km.
- A series of scripts in Google Earth Engine is used to extract and process data for each site, eventually calculating the land surface temperature (LST) for each pixel.
- Pixel values in each 100m buffer are averaged.
- 'To normalise for the effect of changing temperatures on different days the percentage deviation from the average LST of all buffers [is] calculated for each buffer.'
- The difference in temperature deviation between adjacent buffers is calculated (e.g. Buffer 1 LST (0-100m) - Buffer 2 LST (100-200m)).
- Finally, a paired t-test is used to test the null hypothesis that there is no difference between pre- and post-construction LST around the installation.
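To make the last four steps concrete, here is a minimal sketch in Python of how I understand them. It is not the paper's code. The buffer means are made-up numbers, the function names are my own, and I am assuming that each period has already been reduced to one mean LST per buffer and that the t-test pairs the pre- and post-construction value of each adjacent-buffer difference. With 20 buffers that gives 19 pairs and 18 df, which is consistent with the df I report below, but the pairing itself is my assumption.

```python
import math
import statistics


def percent_deviation(buffer_means):
    """Percentage deviation of each buffer mean from the mean of all buffers."""
    overall = statistics.fmean(buffer_means)
    return [100.0 * (m - overall) / overall for m in buffer_means]


def adjacent_differences(values):
    """Inner buffer minus the next buffer outwards, e.g. (0-100m) - (100-200m)."""
    return [inner - outer for inner, outer in zip(values, values[1:])]


def paired_t(pre, post):
    """Paired t-statistic (post minus pre) and its degrees of freedom."""
    diffs = [q - p for p, q in zip(pre, post)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.fmean(diffs) / standard_error, n - 1


# Made-up buffer-mean LST values for 20 buffers (0-2km), NOT real data.
pre_means = [305.0 - 0.10 * i + 0.03 * ((i * 7) % 5) for i in range(20)]
post_means = [306.0 - 0.12 * i + 0.03 * ((i * 3) % 4) for i in range(20)]

pre_diffs = adjacent_differences(percent_deviation(pre_means))
post_diffs = adjacent_differences(percent_deviation(post_means))
t_stat, df = paired_t(pre_diffs, post_diffs)
print(f"t = {t_stat:.3f}, df = {df}")
```

Written out like this, the test only ever sees 19 numbers per period, however many pixels went into each buffer mean.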
The paper actually describes using ANOVA with month as a repeated measure and presence of the solar power station as the explanatory variable, though the lead author, upon checking, tells me a paired t-test was actually used.
Replicating this process, I calculated a t-statistic of 0.312 (3 dp, 18 df) for the sample site. The critical values are 1.73 (one-tailed) and 2.10 (two-tailed), which correspond to a 5% significance level, so I cannot reject the null hypothesis of no effect.
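For anyone who wants to check the critical values, here is a rough standard-library sketch. It integrates the Student's t density numerically (Simpson's rule) and finds the critical value by bisection. The helper names are my own, and the 5% level is inferred from the critical values quoted above rather than taken from the paper. In practice a stats package would do this in one call.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using symmetry about zero."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half = total * h / 3
    return 0.5 + half if x >= 0 else 0.5 - half


def t_critical(prob, df):
    """Value of t whose CDF equals prob (prob > 0.5), found by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


df = 18
one_tail = t_critical(0.95, df)    # 5% in the upper tail
two_tail = t_critical(0.975, df)   # 2.5% in each tail
p_two_sided = 2 * (1 - t_cdf(0.312, df))
print(f"one-tailed critical t = {one_tail:.2f}")
print(f"two-tailed critical t = {two_tail:.2f}")
print(f"two-sided p for t = 0.312: {p_two_sided:.2f}")
```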
This approach seems a bit reductive. I have a huge amount of detailed spatial data that seems to have been reduced because it is difficult to analyse.
Can anyone suggest an approach to investigate that might be able to better process spatial data or better use the available data statistically?
I have about 30,000 pixels per site, each of which holds around 200 values (including the calculated LST, its precursors, and significant additional data from the original Landsat images). I can process this as GeoTIFFs in GIS (QGIS or ArcGIS Pro), or I could extract the data (using Earth Engine/GIS) to process in R/SPSS/whatever.