I am seeking your guidance on the following issue. I am estimating a "classic" difference-in-differences model by pooled OLS: Y = time + treated + time*treated, where Y is the average Medicare payment (amp) in dollars.
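To make the specification concrete, here is a minimal sketch of the regression in Python with NumPy. It uses simulated data only: the sample size, the coefficient values, and the names `y`, `time`, and `treated` are illustrative assumptions, not my actual data. The coefficient on the interaction term is the difference-in-differences estimate, and in this saturated two-by-two setup it coincides with the difference of the four cell means.

```python
import numpy as np

# Simulated data (assumption): binary period and treatment indicators.
rng = np.random.default_rng(0)
n = 4000
time = rng.integers(0, 2, n)      # 0 = pre-period, 1 = post-period
treated = rng.integers(0, 2, n)   # 0 = control, 1 = treated
true_effect = 5.0                 # illustrative treatment effect

y = (30.0 + 2.0 * time + 4.0 * treated
     + true_effect * time * treated
     + rng.normal(0.0, 1.0, n))

# Pooled OLS: y = b0 + b1*time + b2*treated + b3*time*treated + e
X = np.column_stack([np.ones(n), time, treated, time * treated])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
did_estimate = beta[3]            # coefficient on the interaction term


def cell_mean(t, d):
    """Mean outcome for period t and treatment group d."""
    return y[(time == t) & (treated == d)].mean()


# The same quantity computed directly from the four cell means.
did_from_means = ((cell_mean(1, 1) - cell_mean(0, 1))
                  - (cell_mean(1, 0) - cell_mean(0, 0)))

print(did_estimate, did_from_means)
```

The sketch ignores standard errors entirely; with repeated observations per unit, clustered standard errors would normally be needed.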

Here are the descriptive statistics for amp: Mean = 37.2, SD = 51.4, min = 0.2, max = 929.7, Median = 2.99. The distribution of amp is shown in the histogram below:

As a first attempt to deal with this "abnormal" distribution, I winsorize the outcome (i.e., values below the 1st percentile and above the 99th percentile are replaced by those percentiles). The descriptive statistics for amp_w: Mean = 36.8, SD = 49.6, min = 2.8, max = 154, Median = 2.99. Below is another histogram for amp_w:

As you can see, (1) there is a *substantial* number of values around $3; (2) there are a couple of groups between $10 and $100; (3) there is another group between $100 and $150; and (4) there are some extreme values between $200 and $900 (in the case of amp).
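For reference, the winsorizing step described above can be sketched as follows in Python with NumPy. This is a sketch under assumptions: the helper `winsorize` and the simulated right-skewed series `amp_sim` are hypothetical stand-ins, not my actual code or data. Values below the 1st percentile are raised to it and values above the 99th percentile are lowered to it, so the median is left unchanged.

```python
import numpy as np


def winsorize(values, lower_pct=1.0, upper_pct=99.0):
    """Clip values to the [lower_pct, upper_pct] percentile range."""
    arr = np.asarray(values, dtype=float)
    lo, hi = np.percentile(arr, [lower_pct, upper_pct])
    return np.clip(arr, lo, hi)


# Simulated right-skewed payments (assumption, not the real amp variable).
rng = np.random.default_rng(1)
amp_sim = rng.lognormal(mean=1.5, sigma=1.5, size=10_000)

amp_w_sim = winsorize(amp_sim)

# Only the tails change; the middle of the distribution is untouched.
print(amp_sim.min(), amp_sim.max(), np.median(amp_sim))
print(amp_w_sim.min(), amp_w_sim.max(), np.median(amp_w_sim))
```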

I am somewhat concerned about running pooled OLS with such an outcome. What do you think? Is there any particular estimator that you would recommend for an outcome like this? Any help or advice will be greatly appreciated.