Keep IDs - "too many literals" error

#1
Hey guys,

I want to keep only a certain subsample of my population based on agency_id. My attempt was: keep if agency_id==1 | agency_id==2 | agency_id==3 | agency_id==4 | agency_id==5.
The list is pretty big: 93 IDs. When I run this, I get the error message "too many literals".

How can I solve this?

Thank you in advance,
Marvin
 
#2
I suspect you are running into a problem where there are just too many characters in your command.

Try this:

keep if inlist(agency_id, 1, 2, 3, 4, 5, ...)

where, of course, you fill in the "..." That might still have too many characters. If it does, you can try ranges, like

keep if agency_id >= 1 & agency_id <= 93
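
To make the two suggestions concrete, here is a minimal sketch on made-up data (the 200 observations and the specific IDs are assumptions for illustration only). As far as I know, inlist() accepts up to 250 numeric arguments but only 10 string arguments in recent Stata versions, so for scattered numeric IDs several inlist() calls can be combined with |.

```stata
* toy data standing in for the real file (hypothetical)
clear
set obs 200
gen agency_id = _n

* contiguous block of IDs: inrange() is the same as >= 1 & <= 93
keep if inrange(agency_id, 1, 93)

* scattered IDs: combine inlist() calls with |
keep if inlist(agency_id, 1, 2, 3, 4, 5) | inlist(agency_id, 40, 41, 93)

* 8 observations should remain
count
```

inrange() only helps when the wanted IDs really are contiguous; otherwise the inlist() form is the one to use.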
 
#5
Guys, inlist and inrange only accept a limited number of values, and I have too many IDs. Please help!

keep if inrange(agencyid, "AK0011",
"AK0012" ,
"AL0080" ,
"AR0015" ,
"AR0081" ,
"AZ0016" ,
"CA0006" ,
"CA0017" ,
"CA0020" ,
"CA0083" ,
"CT0144" ,
"DC0024" ,
"DE0087" ,
"FL0025" ,
"FL0088" ,
"FL0126" ,
"GA0026" ,
"HI0142" ,
"IL0029" ,
"IN0032" ,
"IN0121" ,
"KS0092" ,
"KS0093" ,
"KY0153" ,
"LA0035" ,
"MA0037" ,
"MA0040" ,
"MA0094" ,
"MA0147" ,
"ME0096" ,
"MI0042" ,
"MN0045" ,
"MO0047" ,
"MS0097" ,
"MT0098" ,
"NC0136" ,
"ND0099" ,
"NE0100" ,
"NJ0102" ,
"NJ0138" ,
"NM0050" ,
"NM0051" ,
"NY0009" ,
"NY0010" ,
"NY0052" ,
"NY0056" ,
"OH0059" ,
"OK0062" ,
"OR0064" ,
"PA0066" ,
"RI0068" ,
"TN0106" ,
"TX0072" ,
"TX0073" ,
"TX0074" ,
"VA0110" ,
"VA0111" ,
"VT0114" ,
"WA0112" ,
"WA0113" ,
"WV0077" ,
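
For a long list of string IDs like the one above, one option is to loop over the IDs and mark the observations to keep, then keep them in a single step. This is only a sketch: the toy data, the local name keepids and the flag variable tokeep are assumptions, and only a handful of the IDs are spelled out. Note that repeating keep if agencyid == "..." once per ID would not work, because the first pass already removes every other ID.

```stata
* toy data standing in for the real file (hypothetical)
clear
input str6 agencyid
"AK0011"
"AK0013"
"CA0006"
"ZZ9999"
"WV0077"
end

* IDs to keep, taken from the list above (extend with the remaining IDs)
local keepids AK0011 AK0012 AL0080 CA0006 WV0077

* flag every observation whose ID is on the list
gen byte tokeep = 0
foreach id of local keepids {
    quietly replace tokeep = 1 if agencyid == "`id'"
}

* keep the flagged observations in one step, then clean up
keep if tokeep
drop tokeep
list
```

Each comparison is a short expression of its own, so the length of the ID list no longer matters to the expression parser. The local and the loop must be run together, for example from a do-file.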
 
#8
I have too many IDs, around 300. What can I do? I tried this:

# delimit ;
drop if resultId== 4932408 |
resultId == 4932409 |
resultId == 4932411 |
resultId == 4939170 |
resultId == 4902481 |
resultId == 4902480 |
resultId == 4902517 |
resultId == 4902521 |
resultId == 4902522 |
resultId == 4902518 |
resultId == 4902519 |
resultId == 4902520 |
resultId == 4902611 |
resultId == 4902693 |
resultId == 4927309 |
resultId == 4927308 |
resultId == 4927302 |
resultId == 4927310 |
resultId == 4927304 |
resultId == 4927305 |
resultId == 4927307 |
resultId == 4927303 |
resultId == 4927301 |
resultId == 4921480 |
resultId == 4921483 |
resultId == 4921485 |
resultId == 4921484 |
resultId == 4921481 |
resultId == 4921482 |
resultId == 4921726 |
resultId == 4921727 |
resultId == 4921725 |
resultId == 4921724 |
resultId == 4921963 |
resultId == 4921962 |
resultId == 4921960 |
resultId == 4921961 |
resultId == 4921970 |
resultId == 4921969 |
resultId == 4921968 |
resultId == 4922277 |
resultId == 4922278 |
resultId == 4922275 |
resultId == 4922276 |
resultId == 5437337 |
resultId == 5437157 |
resultId == 5437234 |
resultId == 5437132 |
resultId == 5437124 |
resultId == 5437313 |
resultId == 5437133 |
resultId == 4927306 |
resultId == 4932629 |
resultId == 4932667 |
resultId == 4932672 |
resultId == 4932670 |
resultId == 4932669 |
resultId == 4932673 |
resultId == 4932647 |
resultId == 4932661 |
resultId == 4932671 |
resultId == 4932654 |
resultId == 4932640 |
resultId == 4932809 |
resultId == 4932820 |
resultId == 4932819 |
resultId == 4932815 |
resultId == 4932813 |
resultId == 4932812 |
resultId == 4932818 |
resultId == 4932810 |
resultId == 4932817 |
resultId == 4932814 |
resultId == 4932811 |
resultId == 4932821 |
resultId == 4932816 |
resultId == 4939556 |
resultId == 4939557 |
resultId == 4939560 |
resultId == 4939554 |
resultId == 4939553 |
resultId == 4939555 |
resultId == 4939559 |
resultId == 4939561 | etc etc
 
#9
You tried something. Did it work? If not, why not?

Is -resultId- string or numeric? (Run -describe resultId- and post the result.) Do those "numbers" follow any pattern? If not, then you'll have to list them somewhere.
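
If the IDs do have to be listed somewhere, one place to list them is a small dataset of their own, which can then be matched against the main data with merge. A minimal sketch, assuming a numeric resultId; the temporary file name droplist, the variable score and all the values are made up for illustration:

```stata
* the list of IDs, stored as its own dataset (hypothetical values)
clear
input long resultId
4932408
4932409
4932411
end
tempfile droplist
save `droplist'

* toy main data (hypothetical)
clear
input long resultId byte score
4932408 1
4932410 2
4932411 3
5000000 4
end

* match on the ID; keep(master match) discards IDs that appear only in the list
merge m:1 resultId using `droplist', keep(master match)

* _merge == 3 means the ID is on the list
drop if _merge == 3
* use "keep if _merge == 3" instead to keep only the listed IDs
drop _merge
list
```

This avoids long expressions entirely, so it should not matter whether the list has 93 or 300 entries. In practice the list could also be typed into a spreadsheet or text file and imported rather than entered with input.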
 
#10
The error message that I see is "too many literals". My ID variable is numeric. I think the error occurs because I have too many IDs, so the command is too long.
 
#11
This reproduces your error:

Code:
clear all
set more off

*----- example data -----

set obs 1000
gen var = _n

// build "var == 1 | var == 2 | ... | var == 500" in a local
forvalues i = 1/500 {
    if `i' == 500 {
        local li `li' var == `i'
    }
    else {
        local li `li' var == `i' |
    }
}

*----- gives error -----

// a very long expression
drop if `li'
This is one way out of it:

Code:
clear all
set more off

*----- example data -----

set obs 1000
gen var = _n

*----- what you want -----

// put values to drop in a local
forvalues i = 1/500{
    local li `li' `i'
}

// now drop
foreach val of local li {
    quietly drop if var == `val'
}

summarize
A loop was already suggested in an earlier post.
 

maartenbuis

#13
If all you have is a list of IDs you want to drop, then why not:

Code:
# delimit ;
drop if resultId == 4932408 ;
drop if resultId == 4932409 ;
drop if resultId == 4932411 ;
drop if resultId == 4939170 ;
drop if resultId == 4902481 ;
...