# Absorbing features in panel data regression

#### mthelm

##### New Member
I'm using Julia to run regressions on some panel data. I believe the package I'm using is modeled after Stata, and I'm hoping someone can help me understand what its absorb feature does. My panel data record the median wage, median age, and percentage of the workforce made up of minorities for 13 industries across 11 years. Here's a sample:

Code:
 Row │ hryear4  prmjind1  prcnt_minority  median_wage  wage_10  median_age
     │ Int64    Int64     Float64         Float64      Float64  Float64
─────┼─────────────────────────────────────────────────────────────────────
   1 │    2010         1        0.549525      10.0      8.0           36.0
   2 │    2010         2        0.285844      18.0     11.0242        40.0
   3 │    2010         3        0.389158      16.0     10.0           37.0
   4 │    2010         4        0.372013      14.5      9.0           43.0
   5 │    2010         5        0.33117       10.0      7.5           34.0
If I don't absorb anything, and just model it as median_wage ~ median_age + prcnt_minority, I get:

Code:
Continuous Response Model
Number of observations: 143
Null Loglikelihood: -366.63
Loglikelihood: -325.47
R-squared: 0.4378
LR Test: 82.32 ∼ χ²(2) ⟹  Pr > χ² = 0.0000
Formula: median_wage ~ 1 + median_age + prcnt_minority
Variance Covariance Estimator: OIM
───────────────────────────────────────────────────────────────────────────────
                     PE         SE      t-value  Pr > |t|      2.50%     97.50%
───────────────────────────────────────────────────────────────────────────────
(Intercept)       6.0373    2.52437     2.3916     0.0181    1.04648  11.0281
median_age        0.380779  0.0465285   8.18378    <1e-12    0.28879   0.472769
prcnt_minority  -13.9391    3.37746    -4.12709    <1e-04  -20.6165   -7.26168
───────────────────────────────────────────────────────────────────────────────
However, if I absorb the industry variable (prmjind1), I get:

Code:
Continuous Response Model
Number of observations: 143
Null Loglikelihood: -366.63
Loglikelihood: -178.47
R-squared: 0.9285
Wald: 85.49 ∼ F(2, 128) ⟹ Pr > F = 0.0000
Formula: median_wage ~ 1 + median_age + prcnt_minority + absorb(prmjind1)
Variance Covariance Estimator: OIM
─────────────────────────────────────────────────────────────────────────────────
                    PE         SE      t-value  Pr > |t|         2.50%     97.50%
─────────────────────────────────────────────────────────────────────────────────
(Intercept)     -7.10788   3.64651    -1.94923    0.0535  -14.3231       0.107359
median_age       0.175965  0.0849071   2.07244    0.0402    0.00796144   0.343968
prcnt_minority  37.7827    2.8899     13.0741     <1e-24   32.0645      43.5008
─────────────────────────────────────────────────────────────────────────────────
Now the coefficient on percent minority has flipped from negative (which I expected) to positive. I think what's happening is this: over this period, the percentage of minorities in the workforce and the median age have increased in every industry, and so have nominal median wages. Absorbing industry therefore seems to stop the model from looking at differences across industries, so it only picks up changes within each industry over time. Does that sound correct?
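One way to check that reading is a small synthetic example. The sketch below is in Python rather than Julia and uses made-up numbers, not the data above. It assumes that absorbing a categorical variable amounts to subtracting each group's mean from every variable before fitting (the usual fixed-effects "within" transformation); that is how such features commonly work, but it is an assumption here, not something confirmed for this package. The helper names `slope` and `demean_by_group` are hypothetical.

```python
from statistics import mean

def slope(x, y):
    """OLS slope from regressing y on x with an intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def demean_by_group(values, groups):
    """Subtract each group's mean from its members (the 'within' transformation)."""
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    means = {g: mean(vs) for g, vs in by_group.items()}
    return [v - means[g] for v, g in zip(values, groups)]

# Synthetic data (assumed for illustration): 3 industries observed for 4 years.
# Industries with a higher minority share start at a lower wage level,
# but within every industry both share and wage drift upward over time.
base_share = {1: 0.50, 2: 0.35, 3: 0.20}
base_wage = {1: 10.0, 2: 15.0, 3: 20.0}
industry, share, wage = [], [], []
for g in (1, 2, 3):
    for t in range(4):
        industry.append(g)
        share.append(base_share[g] + 0.01 * t)
        wage.append(base_wage[g] + 0.5 * t)

# Pooled regression: dominated by differences in levels across industries.
pooled = slope(share, wage)

# "Absorbed" regression: only within-industry changes over time remain.
within = slope(demean_by_group(share, industry), demean_by_group(wage, industry))

print("pooled slope:", pooled)
print("within-industry slope:", within)
```

With these invented numbers the pooled slope is negative while the within-industry slope is positive, so a sign flip of this kind is at least possible when levels differ across industries but everything trends upward within them.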

I would really appreciate any guidance and/or suggestions for reading materials to help me get a better handle on panel data modelling techniques.

#### hlsmith

##### Less is more. Stay pure. Stay poor.
So this is a multilevel model? If so, perhaps absorb is placing a variable into a group. How would you like to model these data? What is nested in what, and what type of covariance structure do you want?

#### mthelm

##### New Member
Unfortunately, I don't know enough to answer your questions. I'm trying to learn this kind of econometric modelling and have very little knowledge or experience in this area.

In this case, my response variable and my predictor variables have all steadily increased over the 11-year period, and I'd like to account for that when modelling the data. I would also like to account for the fact that different industries have different workforce sizes and their own unique factors that may affect how wage rates change over time.
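A common way to account for both concerns is to remove industry-level and year-level averages together (two-way fixed effects). The sketch below, in Python with made-up numbers and hypothetical helper names (`slope`, `demean_by`), assumes a balanced panel, meaning every industry is observed in every year, which the 13 × 11 = 143 observations suggest is the case here. In a balanced panel, demeaning by industry and then by year is equivalent to the two-way within transformation. The synthetic wage is built with an assumed true coefficient of -10 on minority share plus a common upward time trend.

```python
from statistics import mean

def slope(x, y):
    """OLS slope from regressing y on x with an intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def demean_by(values, keys):
    """Subtract the mean of each key's group from its members."""
    by_key = {}
    for v, k in zip(values, keys):
        by_key.setdefault(k, []).append(v)
    means = {k: mean(vs) for k, vs in by_key.items()}
    return [v - means[k] for v, k in zip(values, keys)]

TRUE_BETA = -10.0  # assumed effect of minority share on wage in the fake data

# Synthetic balanced panel (assumed for illustration): 3 industries x 4 years.
industry, year, share, wage = [], [], [], []
for g in (1, 2, 3):
    for t in range(4):
        # Industry-year-specific wiggle, so share is not purely industry + year.
        bump = 0.005 * ((g * t) % 3)
        s = 0.2 * g + 0.01 * t + bump
        industry.append(g)
        year.append(t)
        share.append(s)
        # Wage = industry level + common time trend + TRUE_BETA * share.
        wage.append(5.0 * g + 0.8 * t + TRUE_BETA * s)

# Industry effects only: the shared upward trend still leaks into the slope.
industry_only = slope(demean_by(share, industry), demean_by(wage, industry))

# Industry and year effects: the common trend is removed as well.
share_tw = demean_by(demean_by(share, industry), year)
wage_tw = demean_by(demean_by(wage, industry), year)
two_way = slope(share_tw, wage_tw)

print("industry effects only:", industry_only)
print("industry + year effects:", two_way)
```

In this constructed example, removing only industry averages gives a positive slope because share and wage both trend upward, while removing year averages as well recovers the assumed -10. Whether the real data behave the same way is something only the actual regression can show.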