Search results

  1.

    Automating VIF calculation on multiple linear models

    I came up with this code to calculate the VIF of the variables in each model I am evaluating: displayvif<-function(x){ for (i in 1:length(x)) vif(x[[i]]) } where x is models<-lapply(d1,function(data){lm(reformulate(termlabels=".",response=names(data)[1]),data)}) and where d1...
  2.

    regsubsets function in R

    I'm using regsubsets for variable selection, but when I set nvmax to a given value, let's say 2, it regresses 3 variables instead of 2. Does anyone know why that is the case? Thanks
  3.

    Stepwise Regression in R Forward

    I came across this post, which has some information about forward stepwise regression. However, I don't quite understand the solution. I was wondering if someone is able to explain what the code is doing? :)
  4.

    Comparing multiple PDF documents using R

    Is there a way to compare two or more documents using R to identify the differences/common content? I have about 60 PDF files, at most one or two pages each, that have similar content (but not the same). Some may be the same. Each PDF is like a schedule/invoice kind of document...
  5.

    Standard Deviation of Mean Absolute Error

    Hi guys, As basic as this question is, I'm looking for a way to calculate the standard deviation of the mean absolute error using a pivot table, or at least using Excel. I have fitted and predicted values. I calculated abs(y-yhat), and then when I used pivot tables I used "average" value to be...
  6.

    Group/bin/bucket data in R and get count per bucket and sum of values per bucket

    I wish to bucket/group/bin data: C1 C2 49488.01172 0.0512 268221.1563 0.0128 34775.96094 0.0128 13046.98047 0.07241 2121699.75 0.00453 71155.09375...
  7.

    Best book on R

    Sorry guys, this post may not belong here but I had no idea where else to post. I'm looking for some recommendations for an "excellent" book for R programming focused on uses of R in statistics, data analysis, and producing graphs, including making them look more than just vanilla, etc. I...
  8.

    Unique combination of variables

    Hi guys, I have 15000 observations across 7 variables. I need some help with the code to generate all possible unique combinations of variables. E.g. if I had 2 variables and 4 observations per variable, V1=a,b,c,d V2=1,2,3,4, then combinations = a 1, a 2, a 3, a 4, b 1, b 2, b 3, b 4, c 1, c...
  9.

    Regression model with no constant term & more

    Hi guys, There are a bunch of things that I'm getting confused about, and while I've researched online for resources to get clarification, I'm still not certain about the answers. So I'm hoping to get some assistance from the experts. 1.) Can we manually select a regression model that has no...
  10.

    SAS Missing Data Imputation: Arrays + Proc STDIZE

    Hi guys, I came across the following code: data dataset(drop=i); set data; array mi{*} mi_Ag mi_Inc mi_WR; array x{*} Ag Inc WR; do i=1 to dim(mi); mi{i}=(x{i}=.); end; run; I need to understand two things: 1.) there is a column...
  11.

    hclust function-cluster analysis-text/document-function creation

    Hi guys, I'm working on a text mining/clustering project and am trying to create a table which contains the number of clusters as rows and 6 columns representing the following 6 metrics: max.diameter, min.separation, average.within, average.between, avg.silwidth, dunn. I need to create the tables for 3...
  12.

    Logistic Regression - some expert advice!

    Hi guys, I've recently worked on a project involving Logistic Regression. Although I've managed to 'complete' the project, I'm unsure whether what I have done so far is correct and/or sufficient. It is a case of regular binary logistic regression. One dichotomous response variable and several...
  13.

    Logistic regression with predictor variables with both categorical & actual values

    I've got a data set with 40 variables, 30 of which each have certain observations that are coded (i.e. have values) from 1-6 describing certain situations, and some values that provide real values for what each of those variables represent as well. i.e. the variables are partially...