Recent content by Miner

  1. Miner

    Missing data satisfaction research

    While 75% is an excellent response rate, you still have the potential for non-response bias there as well. Regarding the other, it does make sense, but it is something that would vary depending on what you are studying. That example illustrates a worst case scenario, but there are other where...
  2. Miner

    Employee satisfaction database

    The other big problem is that there is no consistency in the satisfaction scale or in the questions asked across companies. This would have a different focus, but The American Customer Satisfaction Index is available by year and broken out by many sectors. You could match that up with public...
  3. Miner

    When is a difference between two percentages more likely to be statistically significant

    For a given absolute difference, the power of a proportions test decreases when the proportions are in the middle (p ~ 0.5), where the variance p(1 - p) is largest, so you have to compensate by increasing the sample size.
  4. Miner

    When is a difference between two percentages more likely to be statistically significant

    I believe there are also differences depending on whether you are in the middle (e.g., p ~ 0.5) versus the extremes (e.g., p ~ 0.1 or 0.9).
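
To make the middle-versus-extremes point concrete, here is a minimal sketch that compares the approximate sample size per group needed to detect the same 5-point difference near p ~ 0.5 and near p ~ 0.1. It assumes the usual normal-approximation formula for a two-sided test of two independent proportions with equal group sizes. The function name `n_per_group` and the chosen alpha, power, and proportions are illustrative assumptions, not values from the thread.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided two-proportion z-test.

    Uses the normal approximation with unpooled variances; treat the
    result as a rough planning figure, not an exact answer.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = z.inv_cdf(power)           # quantile for the target power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Same absolute difference (0.05), different locations on the scale.
n_middle = n_per_group(0.50, 0.55)   # proportions near the middle
n_extreme = n_per_group(0.10, 0.15)  # proportions near an extreme

print("n per group near 0.5:", n_middle)
print("n per group near 0.1:", n_extreme)
```

Because p(1 - p) peaks at 0.5, the middle case needs the larger sample for the same absolute difference.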
  5. Miner

    Kaplan-Meier usage for forecasting

    I recommend using a different approach. You are essentially trying to predict the probability of an event occurring at a point in the future. This is the basis of reliability analysis. I would use a survival analysis for arbitrary censored data. With the number of policies involved, you...
  6. Miner

    Why do experts continue to use stepwise regression?

    This is a good, easily read article from Minitab on the perils of stepwise and best subsets regression. They approached it empirically using a known model.
  7. Miner

    Help with understanding correlation ...

    r is an indicator of how strong the correlation is. Specifically, how strong the signal is relative to the noise. The "cutoff" for how good an r value needs to be depends on your needs and the purpose of your study. I work in industrial statistics, and an r of 0.5 would be of little practical...
  8. Miner

    Best way to run regression

    I don't know whether the following is technically correct, but I can tell you that it worked for me. We do an annual survey on a wide variety of measures, which use an ordinal scale of 1 - 10. I used multiple linear regression to determine the relationship between possible IVs and an important...
  9. Miner

    Type II error textbook question

    Are you confusing the concept of Power with the Type II error (Beta)? Power = 1 - Beta
  10. Miner

    Determination of optimal number of clusters (stopping rule) with similarity measure (Jaccard coefficient)

    I use the method described in this post. You always need to make sure that your number of clusters makes practical sense. You may find value in clustering at more than one level. For example, if you were to perform a cluster analysis using data on vehicles, you might cluster at a high level and...
  11. Miner

    Discrete normal distributions

    Technically, you should be using the Poisson distribution instead of the Normal distribution. If an approximation is good enough versus an exact fit, you could use a quincunx approach. Quality people have used this for decades to demonstrate the concept of variation and the effect of making...
  12. Miner

    Suggestions for analysis of my experiment

    If I understand your design, this sounds like a repeated measures design, so a repeated measures MANOVA should be the appropriate analysis.
  13. Miner

    Design of Experiments with existing data

    One-factor-at-a-time experiments are inefficient and often unable to detect interactions.
  14. Miner

    Design of Experiments with existing data

    I recommend that you analyze this using regression.
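
The point about interactions in the thread above can be illustrated with a small sketch. It assumes a made-up response function with two factors coded as -1/+1 and a deliberately large interaction term; the function `response` and its coefficients are hypothetical and chosen only for illustration. A one-factor-at-a-time plan starting from the low/low baseline never observes the high/high corner, so it has to assume the effects add up, while the full 2x2 factorial can estimate the interaction directly (the same coefficient a regression with an x1*x2 term would recover from these four runs).

```python
def response(x1, x2):
    # Hypothetical true process: main effects plus a strong interaction.
    return 10 + 2 * x1 + 3 * x2 + 4 * x1 * x2


# One factor at a time, starting from the (-1, -1) baseline.
baseline = response(-1, -1)
effect_x1 = response(+1, -1) - baseline  # change x1 only
effect_x2 = response(-1, +1) - baseline  # change x2 only

# OFAT must assume additivity to predict the untested (+1, +1) corner.
ofat_prediction = baseline + effect_x1 + effect_x2
actual = response(+1, +1)

# Full 2x2 factorial: all four corners are run, so the interaction
# coefficient can be estimated from the standard contrast.
runs = {(a, b): response(a, b) for a in (-1, +1) for b in (-1, +1)}
interaction = sum(a * b * y for (a, b), y in runs.items()) / 4

print("OFAT prediction at (+1, +1):", ofat_prediction)
print("Actual response at (+1, +1):", actual)
print("Interaction coefficient from factorial:", interaction)
```

In this constructed case the additive OFAT prediction misses the high/high corner badly, and the factorial contrast recovers the interaction coefficient that was built into `response`.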