I would do this as:
rename Product* Price*
reshape long Price, i(Date) j(Product)
I am master finance student and i have a dataset which have to be reshaped in stata. I know that this is possible with stata, and that one can use ID numbers for each variable in the dataset. However, those ID numbers can be created from the row, and i have the variables in my dataset in colums with time series of the prices. My dataset looks like this (a simple version):
Date Product1 Product2
1/1/2004 100 98
1/2/2004 101 97
1/3/2004 102 96
..... ..... ......
4/16/2012
Here the dates are daily data ranging from 1/1/2004 till 4/16/2012. Furthermore, in the other columns I have the prices of the products. In total I have 1662 products.
Now, my teacher wants me to reshape this dataset in an other way. Looking like this:
Productid Date Price
Product1 1/1/2004 100
Product1 1/2/2004 101
Product1 1/3/2004 102
..... ..... ...
Product1 4/16/2012 .....
Product2 1/1/2004 98
Product2 1/2/2004 97
Product2 1/3/2004 96
...... ...... ...
Product2 4/16/2012 .....
So the problem is that I don't know how to create those productid's, since in the original dataset those products are displayed in the columns. furthermore, after I have created the productid's, is it possible to reshape it to this format? And if so, what would be the codes to type in? I have tried this using the help function in stata, and I read some articles on the internet. But most of the productid's are created when the variables are displayed in rows and not in the columns. So reshaping couldn't help me.
I hope you guys could help me with this problem. It would really help my research.
Looking forword to response.
I would do this as:
rename Product* Price*
reshape long Price, i(Date) j(Product)
goosey (03-05-2013)
Thank you for your help, I did the reshaping and it worked. It took my pc about 3 hours for reshaping the dataset.
Unfortunately, I am facing a new problem with my dataset. After the reshape I generated some additional variables. My dataset looks like the following (a simplified version):
t Date Productid ProductReturn Traded
1 1/1/2004 2 . 0
2 1/2/2004 2 . 0
3 1/3/2004 2 0.0006789 1
.....
200 5/6/2006 2 . 0
1 1/1/2004 3 . 0
etc.
I created t to sort on date since date is a string variable. Furthermore, I sorted by productid. The products were introduced at some point in time, and some of them were eliminated before the end of the dataset. This 'dead product' has a 0 for traded at the last observation, the dataset has also current traded products which have a return for the last observation (note: has also a 1 for the 'traded-variable' at the last observation). The aim is to compare the 'dead products' with the 'current products'. I want to check whether the bad performance of some products caused eliminating the products from the exchanges. I know that I have to use logit regression, however I did this and I got an error. I think I have to make a dummy-variable which checks whether a product was not traded at the end of the observation period, and if this is true it has to create 1 on all observation on that specific productid. If the product was traded at the end of the observation period, it has to create all zeros on that specific productid.
Does someone know how to deal with this?
Here's a startquietly gen Tradedattheend=0
forvalues i='first product'(1)'number of products' {
sort `i' t
quietly replace Tradedattheend=1 if Traded==1 & t==200 & Productid==`i'
}
What do you mean with 'i'?
`i' is used in the loop. When writing "forvalues i=1(1)666", I am telling the program to run the code for i=1, i=2, i=3,...,i=666. i has to be written `i' when used in the loop. Doh, I see now that the sort command is unnecessary. In writing the following code
quietly gen Tradedattheend=0
forvalues i=1(1)666 {
quietly replace Tradedattheend=1 if Traded==1 & t==200 & Productid==`i'
}
you´re telling STATA to give the variable Tradedattheend the value 1 if it has been a trade at t=200, which is done for every product. If you get this right then it's easier to fix the rest. Just make sure to replace 666 with the total number of Products that you've got in your data set.
****. You don't even have to use the forvalues loop. You only have to write the following
quietly gen Tradedattheend=0
quietly replace Tradedattheend=1 if Traded==1 & t==200
When that is done, you write the following
forvalues i=1(1)666 {
sum Tradedattheend if Prouctid==`i'
qui replace Tradedattheend=1 if r(mean)>0 & Productid==`i'
}
and the variable Tradedattheend should have the value 1 in all cells for all the variables that has been traded at t=200.
Last edited by Englund; 07-17-2012 at 01:56 PM.
Statahulk (07-17-2012)
Alright, I am going to try your latest post. The first step I already did, but indeed it creates only a 1 for the latest observation. That was my problem. Hopefully with the second part I will get ones for al observation if the product is eliminated. I'll keep you posted. Thank you in advance.
That is strange. I got the follow error:
'i' invalid name
r(198)
I am sorry. Stupid mistake for me. He's runnin right now. Just wait. If this works I only have to convert it, because I want ones for when they were not traded at the end of the observation period, and zeros if they were traded at the end of the observation period.
It worked. I filled the recode formula to transpose the zeros in ones and the ones in zeros. Now I have ones when the products are eliminated at the last observation date. Thank you very much!
Your welcome. I'm glad it worked. I suggest you learn how and when to use the forvalues code. It can be very useful![]()
|
|