
Thread: Two way repeated measures with zeros, non integer values, and non-normal distribution

  1. #1

    I would really appreciate some help with a few models I am trying to run. Essentially my data looks at how often a subject was visited depending on treatment and subject type across two years. The data looks like this:

    Year | Subject | SubjectType | Treatment  | BlockNum | NumberOfVisits | DurationOfVisits
    -----+---------+-------------+------------+----------+----------------+-----------------
    1    | 1       | Type1       | Treatment1 | 1        | 14             | 15.6
    2    | 1       | Type1       | Treatment1 | 1        | 0              | 0
    1    | 2       | Type2       | Treatment2 | 2        | 3              | 4.3
    2    | 2       | Type2       | Treatment2 | 2        | 0              | 0

    and so on for 200 subjects with a measurement for each year.

    Essentially I want a model that tests whether the number of visits / duration of visits differ between treatments and subject types across both years, and whether there are any interactions. BlockNum refers to the experimental design being split into three randomised blocks (three blocks of plants growing in a greenhouse). I have tried a bunch of different models and can't seem to get a good resolution:

    Repeated Measures ANOVA:

    Code: 
    model1 <- aov(NumberOfVisits ~ SubjectType*Treatment*Year*BlockNum + Error(Subject/Year), data=dframe1)
    However, the issue with this is that the data are left-skewed and contain zeros (so a log transform won't work), and I cannot successfully transform them with any of the following:

    Code: 
    trans_Y <- (dframe1$NumberOfVisits)^3
    trans_Y <- (dframe1$NumberOfVisits)^(1/9) 
    trans_Y <- log(dframe1$NumberOfVisits) 
    trans_Y <- log(dframe1$NumberOfVisits+0.1)
    trans_Y <- log(dframe1$NumberOfVisits+0.000001)
    trans_Y <- log10(dframe1$NumberOfVisits) 
    trans_Y <- exp(dframe1$NumberOfVisits) 
    trans_Y <- abs(dframe1$NumberOfVisits) 
    trans_Y <- sin(dframe1$NumberOfVisits) 
    trans_Y <- asin(dframe1$NumberOfVisits)
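As a side note on the list above, here is a minimal base-R sketch of how a few of these transformations behave when the response contains zeros and values above 1. The vector `visits` is made up for illustration and is not the real data. `log1p()` and `sqrt()` are not in the list above; they are included only for comparison because both are defined at zero.

```r
# Hypothetical counts with zeros; a stand-in for dframe1$NumberOfVisits
visits <- c(0, 0, 3, 14, 1, 7)

log_raw   <- log(visits)         # zeros become -Inf
log_shift <- log(visits + 0.1)   # finite, but the result depends on the arbitrary constant
log_1p    <- log1p(visits)       # log(1 + x): zeros map to exactly 0
sqrt_v    <- sqrt(visits)        # defined at zero
asin_v    <- suppressWarnings(asin(visits))  # NaN for any value above 1

print(data.frame(visits, log_raw, log_shift, log_1p, sqrt_v, asin_v))
```

Whether any of these gives acceptable residuals still has to be checked on the real data; the sketch only shows which transformations are defined for zeros.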
    As such I then tried a generalised linear mixed model (GLMM):

    Code: 
    library(lme4)
    
    model1 <- glmer(NumberOfVisits ~ SubjectType*Treatment*BlockNum + (1|Year), family = gaussian (link = inverse), data = dframe1)
    However this returns:

    Code: 
        Warning message:
    In glmer(NumberOfVisits ~ SubjectType*Treatment*BlockNum + (1 | Year),  :
      calling glmer() with family=gaussian (identity link) as a shortcut to lmer() is deprecated; please call lmer() directly
    And so trying lmer:
    
    model1 <- lmer(NumberOfVisits ~ SubjectType*Treatment*BlockNum + (1|Year), family = gaussian (link = identity), data = dframe1)    
    Warning in lme4::lmer(formula = NumberOfVisits ~ SubjectType * Treatment * BlockNum +  :
      passing control as list is deprecated: please use lmerControl() instead
    Error in (function (optimizer = "bobyqa", restart_edge = TRUE, boundary.tol = 1e-05,  : 
      unused arguments (tolPwrss = 1e-07, compDev = TRUE, nAGQ0initStep = TRUE, checkControl = list(check.nobs.vs.rankZ = "ignore", check.nobs.vs.nlev = "stop", check.nlev.gtreq.5 = "ignore", check.nlev.gtr.1 = "stop", check.nobs.vs.nRE = "stop", check.rankX = "message+drop.cols", check.scaleX = "warning", check.formula.LHS = "stop", check.response.not.const = "stop"), checkConv = list(check.conv.grad = list(action = "warning", tol = 0.001, relTol = NULL), check.conv.singular = list(action = "ignore", tol = 1e-04), 
        check.conv.hess = list(action = "warning", tol = 1e-06)))
    In addition: Warning messages:
    1: In lmer(NumberOfVisits ~ SubjectType * Treatment * BlockNum + (1 | Year),  :
      calling lmer with 'family' is deprecated; please use glmer() instead
    2: In lme4::glmer(formula = NumberOfVisits ~ SubjectType * Treatment * BlockNum +  :
      calling glmer() with family=gaussian (identity link) as a shortcut to lmer() is deprecated; please call lmer() directly
    And when I use a log/inverse link function (I presume this won't work because of the zeros?):

    Code: 
    model1 <- glmer(NumberOfVisits ~ SubjectType*Treatment*BlockNum + (1|Year), family = gaussian (link = log), data = dframe1)    # random intercept
    Error in eval(expr, envir, enclos) : 
      cannot find valid starting values: please specify some
    A Poisson family like the following works for 'Number of visits', but not for 'Duration of visits' (as it contains non-integer values):

    Code: 
    model1 <- glmer(DurationOfVisits ~ SubjectType*Treatment*BlockNum + (1|Year), family = poisson, data = dframe1)
    However this returns the following, which doesn't tell me the significance of 'Treatment' itself, but rather the significance of each subset within treatment:

    Code: 
    summary(model1): 
    
     Call:
    lm(formula = DurationOfVisits ~ SubjectType*Treatment*BlockNum, data = dframe1)
    
    Residuals:
        Min      1Q  Median      3Q     Max 
    -8.3251 -3.9093 -0.5325  2.1748 18.9394 
    
    Coefficients:
                                              Estimate Std. Error t value Pr(>|t|)  
    (Intercept)                                 3.9967     1.6053   2.490   0.0138 *
    Treatment2                                  3.1750     2.1018   1.511   0.1328  
    Treatment3                                  0.1306     2.1018   0.062   0.9505  
    Treatment4                                 -0.7279     2.1018  -0.346   0.7295  
    ...
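On the point about `summary()` reporting each treatment level separately: a minimal base-R sketch, using simulated data and a plain `lm()` rather than the mixed models above, of the difference between per-level coefficients and a single test for the whole factor. The data frame `d` and all its values are made up for illustration.

```r
set.seed(1)

# Simulated stand-in data: 4 treatment levels, 10 observations each
d <- data.frame(
  Treatment = factor(rep(paste0("Treatment", 1:4), each = 10)),
  Duration  = rnorm(40, mean = 5)
)

fit <- lm(Duration ~ Treatment, data = d)

# summary(): one row per non-reference level, each compared with the
# reference level (Treatment1, absorbed into the intercept)
coef_table <- coef(summary(fit))
print(coef_table)

# anova(): one row for Treatment as a whole (F-test on 3 degrees of freedom)
term_table <- anova(fit)
print(term_table)
```

This only illustrates the idea for an ordinary linear model; for mixed models the term-level table comes from other functions.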
    I would really appreciate some help with this. I feel like I'm missing something obvious here; it has been quite a number of long days deep in R and my brain is a bit frazzled.

    I'm relatively new to R, so explanations in relatively simple terms would be appreciated!

    Thanks a lot ahead of time!

  2. #2

    Just to note: I think I have made some progress today. I have altered my models, and after a while I arrived at the following:

    Code: 
    model2 <- glmer(NumberOfVisits ~ SubjectType*Treatment*Year + (1|BlockNum)+ (1|Subject), family = poisson (link=sqrt), data = dframe1)
    and as this won't work for non-integer values, I am using the following for 'duration':

    Code: 
    model1 <- lmer(DurationOfVisits ~ SubjectType*Treatment*Year + (1|BlockNum)+ (1|Subject), data = dframe1)
    I can't get any other families or link functions to work in either of them for some reason.

    In addition, I have figured out that I can use
    Code: 
    library(car)  # Anova() comes from the car package
    Anova(model1, type = "III")  # the argument is lower-case 'type'
    to generate test statistics for treatment/subject type/year.

    Am I along the right lines? Essentially I am trying to test whether the dependent variable differs significantly between subject type / treatment / year, with any interactions between these. However, I generally have left-skewed non-normal distributions, some variables are non-integer values, and it involves repeated measures, so it is a little more complicated than I am used to!

    Thanks again, and have a nice evening!
