# Predictor

#### noetsi

##### Fortran must die
I have a predictor with 49 distinct levels. It's not truly interval (possibly interval-like), but it's certainly not categorical either. I am not sure how this affects interpretation.

#### Dason

How exactly is it neither of those?

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Are they regions? If so (though we have no idea), could some type of covariance matrix be used?

#### noetsi

##### Fortran must die
They are unemployment rates, recorded by month and county over the course of a single year. But over that year there were only 49 distinct unemployment rates across all the counties.

#### noetsi

##### Fortran must die
> How exactly is it neither of those

I thought an interval variable had to have a very large number of distinct values, not just 49, even if a rate is on an interval scale.

#### noetsi

##### Fortran must die
I have a related question. I am testing a personal theory for my agency: that the median wage in a given county determines how high an income you earn when we place you (we are a state agency that finds people jobs in the various counties).

I have an ordinal scale that ranks the counties from highest to lowest median income (there are 67 in my state). I am trying to decide how to use that measure. One possibility is to just use the 67 rank values, but the regression will treat this as an interval measure when it is not (here it is clearly ordinal, unlike in my other question). Another is to build a dummy or a set of dummies, but I have never seen a discussion of how to build dummies in this case. Do I split the counties into bottom half and top half (one dummy)? Top ten percent versus bottom ninety percent? ...

Nothing I have read in the Vocational Administration literature (or anywhere else, actually) addresses how you should split the data if you build dummies.
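To make the two coding options concrete, here is a minimal sketch in plain Python. It assumes ranks run from 1 (highest median income) to 67 (lowest), and the helper names `rank_to_group` and `dummy_code` are hypothetical. The cut points (halves, quartiles) are arbitrary illustrations, not a recommendation on where to split.

```python
# Hypothetical setup: 67 counties ranked 1 (highest median income) to 67 (lowest).
N_COUNTIES = 67

def rank_to_group(rank, n_groups, n_total=N_COUNTIES):
    """Map a rank in 1..n_total to a group index 0..n_groups-1 of near-equal size."""
    return (rank - 1) * n_groups // n_total

def dummy_code(rank, n_groups, n_total=N_COUNTIES):
    """Return n_groups-1 indicator variables; group 0 is the reference level."""
    group = rank_to_group(rank, n_groups, n_total)
    return [1 if group == k else 0 for k in range(1, n_groups)]

ranks = list(range(1, N_COUNTIES + 1))

# Option 1: enter the rank itself as a single numeric predictor
# (the regression then treats equal rank gaps as equal income gaps).
numeric_predictor = ranks

# Option 2: one dummy (top half vs. bottom half) or three dummies (quartiles).
halves = [dummy_code(r, 2) for r in ranks]
quartiles = [dummy_code(r, 4) for r in ranks]
```

With more groups the dummies impose less structure but use more degrees of freedom, so the choice of split is a modeling decision rather than something the coding itself settles.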

#### Miner

##### TS Contributor
> They are unemployment rates. Over the course of a single year by month and county. But over that year there were only 49 distinct levels of unemployment in all the counties.

This is probably due to rounding to fewer decimal places. Can you get access to the data used to calculate the rates?
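A quick way to see how rounding alone can collapse a continuous rate into a few dozen levels is the sketch below. The data are simulated, not the real table: it assumes 67 counties by 12 months, rates between 3% and 8%, and publication to one decimal place.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Simulated (hypothetical) rates: 67 counties x 12 months, between 3% and 8%.
raw_rates = [random.uniform(3.0, 8.0) for _ in range(67 * 12)]

# Assumption: the published table reports one decimal place.
published = [round(r, 1) for r in raw_rates]

n_raw = len(set(raw_rates))        # distinct values before rounding
n_published = len(set(published))  # distinct values after rounding
print(n_raw, n_published)
```

Under these assumptions the rounded column can hold at most 51 distinct values (3.0 through 8.0 in steps of 0.1), however many observations there are.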

#### noetsi

##### Fortran must die
> This is probably due to round-up to fewer decimal places. Can you gain access to the data used to calculate the rates?

I don't think so, Miner, although I can try. They send you the table.

#### Miner

##### TS Contributor
You can still analyze it as continuous data, but it may show up as "chunky" on normality or residual plots. This can throw off the p-values in a normality test even though all the data points fall along a straight line. It may violate all sorts of assumptions and prevent you from publishing in a journal, but I have built many perfectly useful models with it.
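One way to check this for yourself is to simulate it. The sketch below uses made-up data (a true slope of 1.5, noise with standard deviation 1, and a predictor rounded to one decimal place, all of which are assumptions) and compares the least-squares slope from the exact predictor with the slope from the "chunky" rounded one. The helper `ols_slope` is hypothetical, written out by hand to avoid any library dependency.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

def ols_slope(x, y):
    """Slope of the simple least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

# Simulated (hypothetical) predictor and response.
x_raw = [random.uniform(3.0, 8.0) for _ in range(804)]
y = [2.0 + 1.5 * x + random.gauss(0, 1) for x in x_raw]

# The "chunky" version: same predictor rounded to one decimal place.
x_chunky = [round(x, 1) for x in x_raw]

slope_raw = ols_slope(x_raw, y)
slope_chunky = ols_slope(x_chunky, y)
print(slope_raw, slope_chunky)
```

In this setup the rounding error is small relative to the spread of the predictor, so the two slopes should land close together. That is only an illustration on simulated data, not a guarantee for any particular dataset.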