formatting issues

noetsi

Fortran must die
#1
I am having trouble formatting data in R.
I run:
setwd("S:\\CIU\\Testfolder")
mydata<-read.csv(file="datacsv.csv")
str(mydata)

which shows that the field rehab.rate is a character column with values that look like "48.67%".

I tried
mydata1<-mydata
mydata1$rehab.rate=as.numeric((gsub("%","",mydata1$rehab.rate)))

but I end up with NA in the field. I am guessing the decimal point is the issue.
 

Dason

Ambassador to the humans
#2
The decimal place is not the issue. The as.numeric function can handle an actual decimal just fine. Are all values NA or are you just getting a warning that some values were coerced to NA?
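One way to check which case applies is to convert a copy and list the original strings that failed. This is only a sketch: the sample values below are hypothetical, since the real column is not shown in the thread.

```r
# Hypothetical sample values standing in for mydata$rehab.rate
rehab_rate <- c("48.67%", " 12.5 %", "N/A", "")

# Strip the percent sign and surrounding whitespace, then convert
cleaned <- trimws(gsub("%", "", rehab_rate, fixed = TRUE))
converted <- suppressWarnings(as.numeric(cleaned))

# Original strings that could not be converted to a number
bad <- rehab_rate[is.na(converted)]
print(converted)
print(bad)
```

If `bad` is empty, every value converted; if it lists only a few strings, those are the ones to clean up.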
 

noetsi

Fortran must die
#5
I found the answer after lots of searching. This works:

mydata$Month<-as.Date(mydata$Month, format="%d-%b-%y")

Apparently %m can't be used with non-numeric months: %m expects a month number, while abbreviated month names such as Dec need %b.
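A small sketch of the difference between the format codes, using two of the values shown in this thread. The locale line is an assumption added so that %b matches English month abbreviations; it may not be needed on every machine.

```r
# Assumption: force the "C" time locale so %b recognises "Dec", "Jan", ...
invisible(Sys.setlocale("LC_TIME", "C"))

x <- c("1-Dec-14", "1-Jan-15")

# Wrong: "/" does not match the "-" separator and %m expects a month number
wrong <- as.Date(x, format = "%d/%m/%y")

# Right: "-" separators, %b for the abbreviated month name, %y for a 2-digit year
right <- as.Date(x, format = "%d-%b-%y")

# A string like the one Excel displays would need a different format again
excel_style <- as.Date("12/1/2014", format = "%m/%d/%Y")

print(wrong)
print(right)
print(excel_style)
```

The format string has to describe the text exactly as R reads it, separators included; anything that does not match comes back as NA.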

I am baffled by this error
> mydata<-read.csv("S:\\CIU\\Testfolder\\DataforTS2.csv")
> head(mydata)
Month Spend
1 1-Dec-14 5790000
2 1-Jan-15 5114841
3 1-Feb-15 7240821
4 1-Mar-15 7482837
5 1-Apr-15 6640341
6 1-May-15 5476163
> mydata$Month<-as.Date(mydata$Month, format="%d/%m/%y")
> head(mydata)
Month Spend
1 <NA> 5790000
2 <NA> 5114841
3 <NA> 7240821
4 <NA> 7482837
5 <NA> 6640341
6 <NA> 5476163

I tried using - rather than / and the same thing occurred

This is how the data appears when the CSV is opened in Excel:
12/1/2014

but I tried mydata$Month<-as.Date(mydata$Month, format="%m-%d-%Y")
and mydata$Month<-as.Date(mydata$Month, format="%m-%d-%Y") and it made no difference
I still got NA.
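Excel can display a date differently from how it is stored in the file, so it may help to look at the raw text R actually reads. A minimal sketch, using a temporary file with two of the rows shown above in place of the real CSV:

```r
# Hypothetical stand-in for the real file: a temp CSV with rows from head(mydata)
tmp <- tempfile(fileext = ".csv")
writeLines(c("Month,Spend", "1-Dec-14,5790000", "1-Jan-15,5114841"), tmp)

# Read the first few lines as plain text, with no parsing at all
raw_lines <- readLines(tmp, n = 3)
print(raw_lines)
```

Whatever appears in `raw_lines` (here "1-Dec-14", not "12/1/2014") is what the format string passed to as.Date has to match.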
 

noetsi

Fortran must die
#7
I solved that problem: I needed to use mydata$Month<-as.Date(mydata$Month, format="%d-%b-%y"). The key is using %b and - as the separator rather than /, which is how the dates looked in Excel.

I am not sure I can send the data because I work for a state agency. I am asking about that now.

I have data stored this way

str(mydata$Month)
Date[1:66], format: "2014-12-01" "2015-01-01" "2015-02-01" "2015-03-01" "2015-04-01" "2015-05-01"

There are six years of data (and part of a seventh).

I convert it to a time series
tsmydataMonth<-ts(mydata$Month, start = 1, frequency = 12)

and it eliminates the year
1593302126033.png
I want the actual years to show up, not 1-6. This matters because when I plot the series, the numbers 1-6 appear on the axis instead of the years.

1593302329657.png
 

noetsi

Fortran must die
#9
I don't understand what "Check out the examples in ?ts" means.

I think this works.
tsspend=ts(mydata$Spend,start=c(2014,12),frequency=12)

I don't think you use months with ts objects per se; that is, there is no column for months, because the month is implied by the start period and the frequency you specify. I tried to define the month column and got strange errors.
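A quick sketch of how the year and month can be read back out of a ts object built this way. It uses only the six Spend values shown earlier in the thread, not the full data set.

```r
# First six Spend values from head(mydata) above
spend <- c(5790000, 5114841, 7240821, 7482837, 6640341, 5476163)
tsspend <- ts(spend, start = c(2014, 12), frequency = 12)

# Start and end as c(year, month position)
ts_start <- start(tsspend)
ts_end <- end(tsspend)

# Month position (1-12) of every observation, and the matching abbreviations
month_pos <- as.integer(cycle(tsspend))
month_names <- month.abb[month_pos]

print(ts_start)
print(ts_end)
print(month_names)
```

So the months are not lost; they are derived from `start` and `frequency` rather than stored as a column.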

How do you add the month labels to this? It is in the source data already.

I generate this with plot(tsspend):
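One possible way to get month labels on the axis is to suppress the default x axis and draw one by hand. This is a sketch under assumptions, not the only approach: the `dates` vector is rebuilt here from the known start month rather than taken from mydata$Month, and only the six Spend values shown earlier are used.

```r
# First six Spend values from head(mydata) above
spend <- c(5790000, 5114841, 7240821, 7482837, 6640341, 5476163)
tsspend <- ts(spend, start = c(2014, 12), frequency = 12)

# Assumption: rebuild the monthly dates from the start month
dates <- seq(as.Date("2014-12-01"), by = "month", length.out = length(tsspend))
labels <- format(dates, "%b %Y")  # month names depend on the locale

# Plot without the default x axis, then add one tick per month with its label
plot(tsspend, xaxt = "n", xlab = "Month", ylab = "Spend")
axis(1, at = as.numeric(time(tsspend)), labels = labels, las = 2)
```

With all 66 months the labels would overlap, so in practice it may be better to label only every third or sixth month by subsetting both `at` and `labels`.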


1593373573249.png

This is a baffling error to me - I copied this from an article

time<-c(“2019-04-17", "2019-03-21”)

but I get this. Not sure what the error is
Error: unexpected input in "time<-c(“"
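The likely cause is the quotation marks: text copied from an article often carries typographic ("curly") quotes, and R only accepts straight quotes as string delimiters. A sketch of the same line retyped with straight quotes; the variable name `dates_chr` is a hypothetical replacement, chosen to avoid masking the built-in time() function.

```r
# Same values as the copied line, retyped with straight double quotes
dates_chr <- c("2019-04-17", "2019-03-21")

# ISO-formatted strings convert without a format argument
dates <- as.Date(dates_chr)
print(dates)
```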
 

Dason

Ambassador to the humans
#10
It literally means type
Code:
?ts
and look at the example section.

I'm like 99.99999% sure we've shown you how to look at the help pages before so that should have made sense to you.
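For anyone else reading, the example section of a help page can also be pulled up from code rather than by scrolling through ?ts. A short sketch using the base example() function:

```r
# Fetch the code from the Examples section of the ts help page as text
ex_lines <- example("ts", give.lines = TRUE)

# Show the first few lines of the example code
cat(head(ex_lines, 10), sep = "\n")
```

Calling `example("ts")` without `give.lines` runs the examples instead of just returning them.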