Search results

  1. hlsmith

    Seasonal Time Series

    Hey y'all, I am looking for suggestions for a seasonal time series. Ideally with at least four years (season cycles). I am soliciting suggestions for novel sets, not just the typical ones published in textbooks or used as online examples. Perhaps something current or perhaps esoteric. Thanks...
  2. hlsmith

    Multilevel regression with two clusters

    Well I am revisiting a project that is giving me mental fits. I am going to give a comparable example. Say I have a binary outcome for a picture interpreted as containing a man or woman and I know the truth so the DV is correct classification yes or no. Now I have fifty pictures reviewed by...
  3. hlsmith

    Dose Response Curve (log logistic regression)

    Hey, I had someone ask me if I could create a dose response curve for them. They have a dose amount and associated response which is a time to event. I believe everyone will have an event, so I have a continuous response value to work with (but I have a feeling there will only be 2-3 dose...
  4. hlsmith

    Continuous (Ordinal) Outcome with Middle Group as Target

    I had someone send me an idea for a project. The dependent variable is continuous, though they are interested in predicting the middle region. So imagine a value of 0-infinity and they want to trichotomize data into 0-10; 11-20; 20+, with 11-20 as the target. The context is that the middle...
  5. hlsmith

    Sad Shiny App won't work for Logistic Reg MLM Power Calc

    I am trying to use a shiny app for a simple logistic regression multilevel model power calculation. But every time I click on the 'Covariates' option to change it to '1', I get 'disconnected from server'. Is there anyone out there at all that can walk me through this. I have 47 subjects with...
  6. hlsmith

    Weighted Cohen Kappa (R)

    Well I am attempting to get some code ready to calculate two-rater Cohen Kappa inter-rater reliability with Fleiss-Cohen weights with 99% CIs. I am trying to use R and figured I would post this to see if you all can save me some time. Toy dataset: NB <- read.table(header = TRUE, text = "...
  7. hlsmith

    Logistic Reg Complete Separation

    Hey y'all, I was running a stacked ensemble (weighted combo of base machine learners) model yesterday (R: H2O: autoML). And I noticed the top contributing model had an accuracy of 99.995%. It was a gradient boosted model (per random grid search) for a classification problem. I thought hey maybe...
  8. hlsmith

    Permutation test with skewed data

    Below are my data - small samples (n=22; n=14). The gaps in the histograms are just due to small sample sizes. In the parent population, data are normal looking with a very long stretched out right-tail. (EDIT: Of note, distributions are left-side bounded by zero) What are the pros and cons to...
  9. hlsmith

    Thought experiment (standardized binary variable)

    OK, I am working on a corrected LASSO logistic model, which addresses the model building dependence (i.e., variables were not declared a priori but established via the modeling process) and all candidate covariates were standardized to unify their scales before entering them into the model. But...
  10. hlsmith

    Favorite/Important R Packages

    If you could only download R packages for a small window of time, which packages would you download: favorite/staples???
  11. hlsmith

    R package installation issues

    I have been having connectivity or firewall or something issues when trying to install.packages. I believe the former, because eventually it goes through. Does the "" portion mean it is trying to download from UK? I am in central United States. I ran the following code and it seems to...
  12. hlsmith

    Creating or editing Matrix

    The following code generates a correlation matrix. I would like to either edit the generated matrix (don't know how to do this but I am guessing it is very easy) or create my own matrix like this one, not using the cor function, but using a list of correlations and variables I already have...
  13. hlsmith

    R plot

    There is a function "plotcp" which you feed your rpart output into and it generates the following plot: I would like to replicate this using the following dataset. Of note, the whiskers are just the xerror +/- xstd. The xstd are actually SE values since they are taken from cross-validation. I...
  14. hlsmith

    Decision Tree Selection and Pruning based on Complexity

    Half-way down the page at this known site (see link below) is the code and example. The author selected 0.011 as the complexity penalty for their decision tree. I don't get why they didn't use 0.29 instead for a more parsimonious tree with a comparable error rate. I thought you always selected...
  15. hlsmith

    Jupyter notebook

    When I installed a jupyter notebook on my computer via anaconda, it auto-created a bunch of folders in it. Consisting of Music, photos, desktop, etc. Not sure what's up with that. Also, it tells me I can't delete some of the folders since they aren't empty, but they appear to be empty. Anybody...
  16. hlsmith

    Bayesian Posterior: SD = SE?

    I was curious if the standard deviation in a Bayesian posterior is equal to the standard error?
  17. hlsmith

    Bayesian MAP Estimates and Regression Assumptions

    I am new to Bayesian regression models. I am trying to learn how to assess regression model assumptions while within the Bayesian context. Much of the literature is about implementing Bayesian models, but little to no information on model assumptions. I am planning to get maximum a posteriori...
  18. hlsmith

    LASSO (binomial) p-value and confidence intervals (selectiveInference) error

    I was just trying to quickly check out the "selectiveInference" package to slap confidence intervals on my beta coefficients from a glmnet (LASSO: binomial) model. The package is supposed to provide more robust CIs that account for initially using LASSO to get a subset of features. So it is...
  19. hlsmith

    Sample Size Calc for Right Skewed Data

    I have a request from a person for a sample size calculation. When I requested information about the potential study data they provided two means and standard deviations from a prior study. Given the values (x1=7; s1=6 and x2=0.5; s2 = 2) it appears the data are right skewed. I can simulate...
  20. hlsmith

    ARIMA(0,1,0) random walk with drift

    What code do I use to fit the above model, or what type of differencing do I do? Something like a lag 1 subtraction? My model seemed like it needed differencing per plots and the auto.arima came back with the above structure. Thanks.
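
    For the ARIMA(0,1,0)-with-drift question above, a rough sketch of what the model amounts to: y[t] = c + y[t-1] + e[t], so "differencing" is just the lag-1 subtraction y[t] - y[t-1], and the drift c can be estimated as the mean of those differences. The sketch is in Python with NumPy rather than R, and the function names and simulated series are made up for illustration. In R, the forecast package (where auto.arima lives) should give the equivalent fit with something like Arima(y, order = c(0, 1, 0), include.drift = TRUE), though that is worth checking against the package docs.

```python
import numpy as np


def fit_random_walk_with_drift(y):
    """Estimate drift and innovation variance for y[t] = c + y[t-1] + e[t]."""
    y = np.asarray(y, dtype=float)
    d = np.diff(y)            # lag-1 differences: y[t] - y[t-1]
    drift = d.mean()          # estimate of the drift constant c
    sigma2 = d.var(ddof=1)    # sample variance of the differences
    return drift, sigma2


def forecast_random_walk_with_drift(y, drift, h):
    """Point forecasts h steps ahead: last value plus drift times step."""
    y = np.asarray(y, dtype=float)
    return y[-1] + drift * np.arange(1, h + 1)


if __name__ == "__main__":
    # Simulated series (an assumption for illustration, not real data).
    rng = np.random.default_rng(0)
    y = np.cumsum(0.5 + rng.normal(size=200))
    drift, sigma2 = fit_random_walk_with_drift(y)
    print("estimated drift:", drift)
    print("innovation variance:", sigma2)
    print("3-step forecast:", forecast_random_walk_with_drift(y, drift, 3))
```

    The point forecasts just extend a straight line from the last observation with slope equal to the estimated drift; if the differences still look autocorrelated, a plain random walk with drift is probably too simple.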