Showing all categories in table even if no observations in some

#1
Dear all,

I have been having this problem for a while and tried a number of options after googling intensively without any success so I am hoping someone will be able to help!

I am outputting proportion tables using the proportion command and I would like all the categories to appear in the table even if there are no observations in some of the categories.

To be clearer, this is what I’d like to see:

Cat Prop
1 0.45
2 0.05
3 0
4 0.4
5 0.1

This is what I get:

Cat Prop
1 0.45
2 0.05
4 0.4
5 0.1

I have tried the fillin command but it didn't do the trick and I have downloaded the fulltable package but it doesn't deal with survey design (svy) and I need to use that.

In case someone has more time to read, I'll give more information. I am trying to create a table of HIV prevalence by region and civil status. This is saved in excel and will then be used to parameterise a model in matlab. I am using DHS data which involves accounting for survey design and the plan is to have this automated to be able to do it for all countries that have DHS data.

This is the code I prepared to do it:

matrix HIVMen3 = J(8,6,0) **create empty matrix (8 regions, 6 civil status groups)

**fill it with data from the table after converting this into a matrix too
**i (the row) corresponds to the region.

forvalues i=1/8{
svy: tab HIV group if region==`i'
matrix A_`i' = e(b)
matrix list A_`i'
matrix HIVMen3[`i',1] = A_`i'[1, 7]
matrix HIVMen3[`i',2] = A_`i'[1, 8]
matrix HIVMen3[`i',3] = A_`i'[1, 9]
matrix HIVMen3[`i',4] = A_`i'[1, 10]
matrix HIVMen3[`i',5] = A_`i'[1, 11]
matrix HIVMen3[`i',6] = A_`i'[1, 12]
}
matrix rownames HIVMen3 = "Nairobi" "Central" "Coast" "Eastern" "Nyanza" "Rift Valley" "Western" "Northeastern"
matrix colnames HIVMen3 = "Couple" "YMCirc" "YMNCirc" "WidowMCirc" "WidowMNCirc" "Not_sexActive"
matsave HIVMen3, dropall
outsheet using HIVMen3.csv,c

PROBLEM: In 2 of the regions there are no people for 1 of the groups so the matrix produced by Stata has 10 rows instead of 12 and so I can no longer fill the matrix in an automated way

Many thanks in advance!

Annick
 
#3
Hi Bukharin,

Thanks for your reply. I didn't know contract (I should have mentioned that I am new to Stata) and it could have been an option as I could have divided the counts of HIV+ over the counts of HIV- for each group in each region but unfortunately it is not supported by svy and I need to incorporate sample weights. (Also I was having trouble converting the output of contract into a matrix)
Is there a way to incorporate zero observations in the following commands?
svy: proportion HIV over (group, region)
or
svy: tab HIV group if region==1?

Many thanks again!

Annick
 

bukharin

RoboStataRaptor
#4
Oh I see, sorry, I missed the sample weights issue.

As you know the proportions are available in e(b), but the categories are also available in e(Row) and the number of categories in e(r) - you can use this information to fill in your matrix.

Here's an example:
Code:
webuse nhanes2f, clear
svyset psuid [pweight=finalwgt], strata(stratid)

* we're interested in the proportion of people in each houssiz
* which ranges from 1 to 14
* over the 4 categories of region
matrix housing=J(4, 14, 0) // 4 regions, 14 house sizes

forvalues r=1/4 { // r=region
	svy: tab houssiz, subpop(if region==`r')
	matrix row=e(Row)
	matrix prop=e(b)
	forvalues i=1/`e(r)' {  // i loops through the values of houssiz in this region
		local c=row[1, `i'] // c = column of full matrix
		matrix housing[`r', `c']=prop[1, `i']
	}
}

matrix list housing
 
#5
It works!!! Thank you so much! I was going a bit mental on this side.

In case anyone comes across this thread because they have a similar problem, the tab command will output proportion in each group so if you're looking for prevalence values (rather than the proportion of people who are HIV+/married/from region X out of the total population) you just need to add an extra step where you create a matrix containing the prevalence values. This is how I did it but there's probably a more elegant solution:

matrix HIVMen4=J(8,6,.)
forvalues r=1/8{
svy: tab HIV group if region==`r'

matrix prop=e(b)
svy: tab group HIV if region==`r'
matrix row=e(Row)
matrix pos=prop[1,e(r)+1..e(r)*2]
svy: tab group if region==`r' & (HIV==1| HIV==0)
matrix tot2=e(b)
matrix prev2=J(1,e(r),0)

forvalues j = 1/`e(r)' {
matrix prev2[1,`j']= pos[1,`j']/tot2[1,`j']
}

forvalues i=1/`e(r)' { // i loops through the values of group in this region
local c=row[1, `i'] // c = column of full matrix
matrix HIVMen4[`r', `c']=prev2[1, `i']
}
}

Thanks a lot for your help once again,

Annick
 
#6
Hello again,

I am back with an extension of this problem for which I haven't managed to find a solution so I hope someone will be able to help: I am trying to produce a table with the weighted mean of a variable (duration of sexual activity) for non-married individuals in each of the 8 regions in the country and I am using this:

svy: mean(DurationSexActivity) if civil_status==1, over(region)

The problem is I have no observations for region 7 and I was trying to use e(Row) to assign each value to the corresponding column and leave region 7 blank (as done to produce a table of proportions above in this same thread). However, e(Row) does not exist for "mean" and I have found no equivalent when using "ereturn list".

Any ideas will be greatly appreciated!

Many thanks,

Annick
 
#7
You might use the following version of a foreach loop:

Code:
foreach region of numlist 1/6 8 {
  * Operations
  }
Alternatively, you could do something like this with a forval loop:

Code:
forval region = 1/8 {
  if `region' == 7 continue
  * Operations
  }
 
Last edited: