katerina
04-14-2009, 05:56 AM
Hi there
The following code is for a training dataset (trPima) and a testing dataset (tePima). I want to use k-nearest neighbours, but first I need to scale the data so that the range of each of the 7 explanatory variables is roughly 1. I have calculated the ranges,
but I don't know how to choose a suitable rough rescaling.
My code is:
> mins <- c(0, 56, 38, 7, 18.2, 0.085, 21)
> maxs <- c(14, 199, 110, 99, 47.9, 2.228, 63)
> rangs <- maxs - mins
> rangs
[1] 14.000 143.000 72.000 92.000 29.700 2.143 42.000
I want to fill in each (?) in the following:
scaletrPima<-data.frame(npreg=npreg/?,glu=glu/?,bp=bp/?,skin=skin/?,bmi=bmi/?,ped=ped/?,age=age/?)
> summary(tePima)
> mins <- c(0, 65, 24, 7, 19.4, 0.085, 21)
> maxs <- c(17, 197, 110, 63, 67.1, 2.42, 81)
> rangs <- maxs - mins
> rangs
[1] 17.000 132.000 86.000 56.000 47.700 2.335 60.000
scaletestx<-data.frame(npreg=npreg/?,glu=glu/?,bp=bp/?,skin=skin/?,bmi=bmi/?,ped=ped/?,age=age/?)
and again I need to fill in each (?).
Note that both the training and the test datasets need to be scaled using the same scale factors.
Any suggestions?
Thanks in advance
Katerina
xx
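One possible way to fill in the (?) is to divide every column by its training-set range and reuse those same factors for the test set. The sketch below is only an illustration under assumptions: `rescale` is a hypothetical helper (not from any package), the column names are taken from the data.frame call above, and `demo` is a stand-in built from the posted training mins and maxs rather than real rows of trPima.

```r
# Training-set ranges of the 7 explanatory variables (values from the post)
vars  <- c("npreg", "glu", "bp", "skin", "bmi", "ped", "age")
mins  <- c(0, 56, 38, 7, 18.2, 0.085, 21)
maxs  <- c(14, 199, 110, 99, 47.9, 2.228, 63)
rangs <- maxs - mins
names(rangs) <- vars

# Hypothetical helper: divide each named column by its scale factor
rescale <- function(df, factors) {
  m <- as.matrix(df[, names(factors)])
  as.data.frame(sweep(m, 2, factors, "/"))
}

# Stand-in data: one row at the training minimum, one at the maximum
demo <- data.frame(npreg = c(0, 14), glu = c(56, 199), bp = c(38, 110),
                   skin = c(7, 99), bmi = c(18.2, 47.9),
                   ped = c(0.085, 2.228), age = c(21, 63))

scaled <- rescale(demo, rangs)

# Range of each scaled column; with these factors each should be about 1
scaled_ranges <- sapply(scaled, function(x) diff(range(x)))
```

Assuming trPima and tePima both contain these seven columns, the same call would give `scaletrPima <- rescale(trPima, rangs)` and `scaletestx <- rescale(tePima, rangs)`, so both sets share the training factors. Since only a rough rescaling is needed, rounded versions of `rangs` should work just as well.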