I intend to perform an impact evaluation, but I am not sure whether my data are sufficient and, if they are, which method would be most appropriate. I could certainly use some help.
I want to evaluate the impact of a new law, which reduces the cost of formalization for the self-employed, on the number of contributors to a social insurance program. I have a panel of microdata that gives me the monthly number of contributors in different categories of insured workers [employees, domestic employees and three types of self-employed (SE): SE_1 (those who pay the highest contribution rate); SE_2 (those who pay an intermediate contribution rate); and SE_3, the category of interest in the study, created by the new law and entitled to the lowest contribution rate].
All categories experienced an increase in the number of contributors over time, but I need to isolate the effects to estimate whether, and to what extent, the new law had a specific impact on these results for the treatment group. Precisely: how much, if any, of the increase in the number of contributors is due to the reduction in contribution rates, and what portion of the SE_3 affiliates is due to formalization versus migration of workers already contributing in other categories.
One problem is that the level of informality in this labor market is high and there are almost no obstacles for workers to migrate towards the SE group with lower contribution rates. So, even though the new law is focused on SE with lower income levels, the transition between categories makes it much harder to isolate the effects and to specify a control group, since there are reasons to believe the new law may have affected other categories of insured workers as well, not only the informal workers who affiliated because of the new program.
The monthly data set covers a period of 10 years, including 48 months before the intervention started. The only variables available are: sex; state/geographic region; age; monthly value of contributions (although most workers contribute under the same minimum salary); and category of insured worker [employees, domestic employees, SE_1, SE_2 and SE_3]. I would like to use diff-in-diff, but I am not confident that it is adequate in this case, or about the model specification. I would appreciate any suggestions.


TL;DR: If you have multiple months of data, interrupted time series may be an option. Would you also have a potential control group not exposed to the law?
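To make the interrupted time series idea concrete, here is a minimal sketch of a segmented regression in Python. Everything in it is an assumption for illustration: the series is synthetic (120 months with the law taking effect at month 48, matching the window you describe), the function name `fit_its` is made up, and the model ignores autocorrelation and seasonality, both of which you would want to handle with real monthly data.

```python
import numpy as np

def fit_its(y, t0):
    """Segmented regression for a single interrupted time series.

    Model: y_t = b0 + b1*t + b2*post_t + b3*(t - t0)*post_t + e_t
    where post_t = 1 for months at or after the intervention month t0.
    Plain OLS; no correction for autocorrelation or seasonality.
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y), dtype=float)
    post = (t >= t0).astype(float)
    X = np.column_stack([np.ones_like(t), t, post, (t - t0) * post])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    names = ["baseline", "pre_trend", "level_change", "slope_change"]
    return dict(zip(names, coef))

# Synthetic monthly contributor counts (hypothetical numbers, not real data):
# a pre-existing upward trend, then a jump and a steeper trend after month 48.
rng = np.random.default_rng(0)
n_months, t0 = 120, 48
t = np.arange(n_months)
post = t >= t0
y = 1000 + 5 * t + 200 * post + 3 * (t - t0) * post + rng.normal(0, 5, n_months)

result = fit_its(y, t0)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

`level_change` estimates the immediate jump at the intervention and `slope_change` the change in the monthly trend afterwards. If a category that is plausibly unaffected by the law exists, the same model could be fitted to it and the two sets of coefficients compared, which is where the control group question comes in.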