- We estimate a simple linear regression model y = b0 + b1*x + error. The input data for the estimation are:
- y = 1 month stock returns (cross sectional) for month n
- x = market cap for each stock (cross sectional) for month n

- We test whether the coefficient b1 is statistically significant and record the result for month n. We also record the R squared.
- We repeat steps 1 and 2 for a total of N months.
- Finally, having recorded the results of N significance tests for b1, we calculate the percentage of months in which the coefficient was significant. For R squared, we calculate the average over all N months.
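
To make the procedure concrete, here is a minimal sketch of the steps above in plain Python. Everything in it is an assumption for illustration: the function names `ols_simple` and `aggregate`, the 5% significance level, the synthetic data standing in for real returns and market caps, and the use of a normal approximation for the p-value of b1 (reasonable only when each monthly cross section is large).

```python
import math
import random
from statistics import NormalDist, fmean


def ols_simple(x, y):
    """Fit y = b0 + b1*x by OLS; return (b1, two-sided p-value for b1, R squared)."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    b1 = sxy / sxx
    b0 = my - b1 * mx
    # Residual sum of squares and the standard error of b1.
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(sse / (n - 2) / sxx)
    t_stat = b1 / se_b1
    # Assumption: normal approximation to the t distribution (large cross section).
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))
    r2 = 1.0 - sse / syy
    return b1, p_value, r2


def aggregate(months, alpha=0.05):
    """months: list of (x, y) pairs, one per month.

    Returns (share of months with significant b1, average R squared).
    """
    results = [ols_simple(x, y) for x, y in months]
    share_sig = sum(p < alpha for _, p, _ in results) / len(results)
    mean_r2 = fmean(r2 for _, _, r2 in results)
    return share_sig, mean_r2


# Synthetic placeholder data: 24 months, 200 stocks per month.
rng = random.Random(0)
months = []
for _ in range(24):
    x = [rng.gauss(0.0, 1.0) for _ in range(200)]  # stand-in for market cap
    y = [0.01 - 0.002 * xi + rng.gauss(0.0, 0.05) for xi in x]  # stand-in for returns
    months.append((x, y))

share_sig, mean_r2 = aggregate(months)
print(f"share of months with significant b1: {share_sig:.2f}")
print(f"average R squared: {mean_r2:.4f}")
```

With real data, each `(x, y)` pair would hold the market caps and 1 month returns of all stocks in one month.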

Does this approach of aggregating over multiple cross-sectional regressions violate any best practices in statistical research?