Recent content by staassis

  1. staassis

    Multiple scores per subject

    Try GEE and random effects models. Stata has powerful implementations of both.
  2. staassis

    Best way to present result from large dataframe (26x1522) in an easily understandable manner

    You can run PCA on rows and not columns. This would allow you to better understand how the diversity values of nucleotides move together. As your study conjectures, different genetic groups may have different pictures. Then you can create a heat map for [K principal components] * [26 populations].
  3. staassis

    Why is Kendall's Tau always so high?

    I do not quite understand the data you have shown. The data you have attached have only one variable, not two. What is W? Kendall's tau must take values in [-1,1]. In fact, the absolute value of Kendall's tau tends to be lower than that of Spearman's rho or Pearson's correlation. It is hard to...
  4. staassis

    Is my approach correct for this binomial distribution question?

    The answer is [(9 choose 3) * (6 choose 2) + (9 choose 4) * (6 choose 1) + (9 choose 5) * 1] / (15 choose 5). This is not a case of binomial distribution because the databases are sampled without replacement.
  5. staassis

    Analyse a survey?

    Try Repeated Measures ANOVA. It should have a chance of working as long as each time point is represented by a sufficient number of people.
  6. staassis

    Assessing statistical significance of spread.

    Dan, much depends on the type and size of the data you end up collecting. Generically speaking, you may end up using a time series model which depends on the group ID. It is hard to say more at this point, unfortunately... Once you have collected the data, you can post them here and we will...
  7. staassis

    Sample size for creating a reference interval

    Without seeing the data, it is impossible to say what sample size would be sufficient for a prespecified accuracy. However, based on many data analyses I have performed over the last decade, 120 observations is unlikely to be sufficient for studying very low concentrations of some of the...
  8. staassis

    Is my approach correct for this binomial distribution question?

    The original question does not display.
  9. staassis

    Factor Analysis Question

    Seems like a programmatic issue. Perhaps something related to in-memory variables. Try to reverse score in a separate Excel file and then load it into a fresh session of whatever software you are using.
  10. staassis

    Calculating a weighted mean/SD of x number of means/SDs?

    This statement is unclear: "Because of heterogeneity in patients, region X in one patient may have 40 separate data points, in another 90 points, in another 17 points." The whole thing may be a simple case for meta-analysis.
  11. staassis

    Is my hypothesis test correct?

    The hypotheses go the other way. "At least" means H0: ... >= ... And therefore H1: ... < ...
  12. staassis

    Is there a Mann Whitney test alternative when all variables are categorical?

    You can use a chi-square test for independence if the expected frequency in each cell (for each combination of the categories) is >= 5. Separately, you can always use a randomization test, which is a variation of the bootstrap.
  13. staassis

    Need help evaluating a PCA

    You do not. Factor loadings (if using the traditional definition) tell you how to represent the original variables in terms of factors. They do not tell you the reverse: how to calculate factors in terms of the original variables. The easiest approach is saving the factor scores (in SPSS, R...
  14. staassis


    Yes, you can. Choose the optimal penalty coefficient (λ) using leave-one-out cross-validation. It is likely to be substantial.
  15. staassis

    What statistical test for my data?

    You have to build a generalized linear model (GLM) of the form: Var3 ~ Var1 + Var2. GLM types to consider: Poisson regression, negative binomial regression, Poisson regression + zero-inflated component, negative binomial regression + zero-inflated component. You can choose the "optimal" GLM...
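
The counting answer in item 4 can be checked numerically. Below is a minimal Python sketch, assuming the setup implied by the binomial coefficients in that answer: 15 databases in total, 9 of one kind and 6 of another, 5 drawn without replacement, and at least 3 of the 9 required. The original question is not shown here, so these counts and the function name `prob_at_least` are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def prob_at_least(k_min, n_special, n_other, n_draw):
    """Hypergeometric tail: P(at least k_min 'special' items in n_draw draws).

    Sampling is without replacement, so each term counts the ways to pick
    k special items and (n_draw - k) other items.
    """
    total = comb(n_special + n_other, n_draw)
    favorable = sum(
        comb(n_special, k) * comb(n_other, n_draw - k)
        for k in range(k_min, n_draw + 1)
    )
    return Fraction(favorable, total)


# Assumed counts, read off the formula in item 4: 9 and 6 items, 5 draws.
p = prob_at_least(3, 9, 6, 5)
print(p, float(p))
```

With these assumed counts the numerator is 1260 + 756 + 126 = 2142 and the denominator is 3003, so the probability reduces to 102/143, roughly 0.713. A binomial calculation would give a different number because it treats the draws as independent.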