I am conducting a quantitative research project that looks at patenting over time in companies. Specifically, I look at three years prior to an event and nine years after it, to see whether companies patent more or less after the event. My data is set up with one observation per case and 12 variables (-3 to +9), each holding the observed patenting amount for that year for each case (what would be a wide data set, I suppose?). My questions are the following:

1. Each variable is not normally distributed because in a given year, many cases will have zero patents. How can I correct this, or is it even necessary to do so?

2. Is there a way to time-order the variables so that I can analyse a pattern over time? Or should I put the mean of each variable into its own case in a new data sheet, and go from there?

3. I planned to look at a non-linear regression. Is that correct? If not, which type of analysis would be correct in this case? Someone had suggested moving averages, but this seems a bit simplistic.
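To make question 2 concrete, here is a minimal Python sketch of what reshaping from a wide layout (one row per case, one column per year) to a long, time-ordered layout (one row per case and year) could look like. The firm names, the year offsets shown, and the patent counts are all made up for illustration; only the general idea of the layout comes from my data.

```python
from statistics import mean

# Hypothetical wide data: one entry per case, one value per event-year offset.
# Names, offsets, and counts are invented for illustration only.
wide = {
    "firm_a": {-3: 0, -2: 1, -1: 0, 1: 2, 2: 3},
    "firm_b": {-3: 0, -2: 0, -1: 0, 1: 0, 2: 1},
    "firm_c": {-3: 2, -2: 0, -1: 1, 1: 0, 2: 0},
}


def wide_to_long(wide_data):
    """Return one (case, event_year, patents) row per case-year,
    sorted by case and then by event year."""
    return [
        (case, year, patents)
        for case, by_year in sorted(wide_data.items())
        for year, patents in sorted(by_year.items())
    ]


def mean_by_year(long_rows):
    """Average patent count per event year across all cases."""
    grouped = {}
    for _case, year, patents in long_rows:
        grouped.setdefault(year, []).append(patents)
    return {year: mean(values) for year, values in sorted(grouped.items())}


long_rows = wide_to_long(wide)
yearly_means = mean_by_year(long_rows)

for year, value in yearly_means.items():
    print(year, round(value, 2))
```

In the long layout the event year is an ordinary variable, so it can be sorted on or used as a predictor. The per-year means are only a descriptive summary and discard the case-level variation.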

Thank you in advance for your help!

This is the first time I am using Stata. I am trying to analyse the change in household expenditure with respect to the head of the household.

I have been trying to merge different data sets together using the same key variable, but after the first merge it shows it's already merged and the third one is happening.

Could someone please help me with that?

Thanks :)

I'm using logistic regression to analyse a dataset on managerial decision making. Managers have made decisions to fire 2 employees out of a group of 8 from a fictional company used in the research setup. This data is coded as a binary outcome (0 = chosen to retain, 1 = fired). To find out which variables have an impact on the fictional employees getting fired, I gathered data on their opinions using 7-point Likert scale questions (e.g. "academic background is important", 1-7), and ran binomial logistic regressions in SPSS with a number of these kinds of variables on the decision to fire a certain employee or not.

The problem I'm having is that in the complete model, some variables are not even close to significant (with values in the Sig. column of more than 0,500). However, when I run logistic regressions with just one variable at a time, in some cases they are significant, with values in the Sig. column of <0,05. The omnibus chi-square is sometimes significant for these models and sometimes not; I'm not sure how important this is for the relevance of the variable.

I've tried looking at correlations between variables, but in some cases a variable that doesn't correlate with any of the other variables will become significant if tested individually.

Now, I'm not sure how to find out, or how to interpret, which of my variables influence the decision whether or not to fire a certain member, and which don't.

Also, I don't know how to interpret the unstandardized B-values and the Exp(B) values, because testing a variable individually results in a different B-value from those found when I include all the variables in the model. This means I'm not sure what to report on the actual effect (Exp(B)) of these variables on the decision.
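As a reference point for the question above: Exp(B) is e raised to the power B, which is the factor by which the odds of being fired are multiplied for a one-point increase on the item, with the other variables in that particular model held fixed. A minimal Python sketch of that relationship follows. The coefficient 0.40 is a made-up value for illustration, not one taken from my SPSS output.

```python
import math


def odds_ratio(b):
    """Exp(B): multiplicative change in the odds for a one-unit increase."""
    return math.exp(b)


def odds_ratio_for_change(b, delta):
    """Multiplicative change in the odds for a `delta`-unit increase."""
    return math.exp(b * delta)


# Hypothetical unstandardized coefficient for one Likert item.
b_hypothetical = 0.40

print(round(odds_ratio(b_hypothetical), 3))                # one-point increase
print(round(odds_ratio_for_change(b_hypothetical, 2), 3))  # two-point increase
```

Because B is estimated conditional on whatever else is in the model, the same variable can have a different B, and so a different Exp(B), in the single-variable model and in the full model.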

Thank you for any and all help! :)