## A regression problem with optional treatment

I have an interesting problem and would appreciate suggestions on how to analyze the data properly.

In 2012 we collected data on student retention from the first semester to the second, along with some other variables. Retention is a binary variable: 'retained' or 'not retained'. In 2013 we introduced a new system: student success advisers were assigned, and any student could go to them when needed. Along with the variables collected in 2012, we collected a few new variables associated with the introduction of the advisers.

Going to these advisers was not mandatory; students who felt they had a problem could go to them. That is why looking at the 2013 data alone makes it look like the advisers make things worse: the students they see do worse and are more likely to leave university. Probably only the students who faced problems went to the advisers, and those who were doing well did not. If 10 students went to the advisers with problems and 8 of them were retained to semester 2, that is surely a success. But if 80 out of 100 students who had no problems, and so did not go to an adviser, were also retained to semester 2, the two groups have the same odds of retention (8:2 and 80:20 are both 4:1). A naive comparison of log odds would then tell you that going to the advisers did not bring any change or had no significant effect!
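To make the 8-of-10 versus 80-of-100 comparison concrete, here is a small sketch of the arithmetic. The counts are the illustrative ones from the paragraph above, not real data, and the function name `log_odds` is just a label chosen for this example.

```python
import math

def log_odds(retained, total):
    """Log odds of retention for a group of `total` students."""
    not_retained = total - retained
    return math.log(retained / not_retained)

# Illustrative counts from the example above (not real data).
advised = log_odds(8, 10)        # students who saw an adviser
not_advised = log_odds(80, 100)  # students who did not

# Odds ratio of retention, advised versus not advised.
odds_ratio = math.exp(advised - not_advised)

print(f"log odds (advised):     {advised:.3f}")
print(f"log odds (not advised): {not_advised:.3f}")
print(f"odds ratio:             {odds_ratio:.3f}")
```

Both groups have odds of 4:1, so the log odds are equal (about 1.386) and the odds ratio is 1. The comparison cannot distinguish "advisers had no effect" from "advisers brought struggling students up to the level of students who never needed help".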

Part of the reason for this is that there was a major policy change that affected how universities in our country took in students. Before 2013 there was something called the "cap" in place: a limit on the number of students a university could take on and expect government funding for. In 2013 that cap was removed. That meant that successful, desirable universities started opening their doors to more students, and more students were suddenly able to get into desirable universities. Our university is not such a desirable university. So what probably happened (I don't know for sure) is that in 2013 our university was forced to take significantly "worse" students, which could well lead to greater retention difficulty.

My bosses think that we need to weight the 2012 data so that it matches the 2013 data on some of the variables that may have been sensitive to the removal of the cap on student entry (or, alternatively, weight the 2013 data so that it matches the 2012 data). Then we could compare the two cohorts as if they were a proper control group for each other. In fact, this is illusory. Some people suggested that I use SPSSINC RAKE, but I do not understand why.
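As far as I understand it, the weighting idea described above is raking (iterative proportional fitting), which is what SPSSINC RAKE is meant to do. The sketch below is only an illustration of that general technique in plain Python, not the SPSS implementation. The variable names (`score_band`, `mode`), the made-up 2012 records, and the 2013 target proportions are all hypothetical.

```python
def rake(records, targets, n_iter=50, tol=1e-10):
    """Adjust weights until the weighted margins of `records` match `targets`.

    records: list of dicts, one per student, mapping variable -> category.
    targets: dict mapping variable -> {category: target proportion};
             the proportions for each variable should sum to 1, and every
             category present in `records` must appear in `targets`.
    Returns one weight per record; the weights sum to len(records).
    """
    weights = [1.0] * len(records)
    for _ in range(n_iter):
        max_change = 0.0
        for var, target in targets.items():
            total = sum(weights)
            # Current weighted share of each category of this variable.
            current = {}
            for rec, w in zip(records, weights):
                current[rec[var]] = current.get(rec[var], 0.0) + w / total
            # Scale each weight so this variable's margin hits its target.
            for i, rec in enumerate(records):
                factor = target[rec[var]] / current[rec[var]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:
            break
    return weights

def weighted_share(records, weights, var, category):
    """Weighted proportion of records whose `var` equals `category`."""
    hit = sum(w for rec, w in zip(records, weights) if rec[var] == category)
    return hit / sum(weights)

# Hypothetical 2012 cohort of 100 students (made-up counts).
records_2012 = (
    [{"score_band": "high", "mode": "full_time"}] * 40
    + [{"score_band": "high", "mode": "part_time"}] * 20
    + [{"score_band": "low", "mode": "full_time"}] * 25
    + [{"score_band": "low", "mode": "part_time"}] * 15
)

# Hypothetical 2013 margins that the 2012 data should be weighted to match.
targets_2013 = {
    "score_band": {"high": 0.45, "low": 0.55},
    "mode": {"full_time": 0.60, "part_time": 0.40},
}

weights = rake(records_2012, targets_2013)
print(weighted_share(records_2012, weights, "score_band", "low"))
print(weighted_share(records_2012, weights, "mode", "full_time"))
```

Raking of this kind only matches the two cohorts on the marginal distributions of the variables that were measured and chosen. It does nothing about differences on variables that were not collected.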

What we want to find out (very broadly) is whether the system (introducing the advisers etc.) worked, and the degree to which it worked.

Thanks in advance for any suggestions.