Search results

  1. Miner

    Continuous (Ordinal) Outcome with Middle Group as Target

    Sorry for the delayed response. I just returned from a 10-day cruise from Vancouver to Hawaii. In an industrial situation, I would treat the response as a continuous response with a specification of 11-20. This maximizes the information content available in the data. For example, 0 is...
  2. Miner

    One way ANOVA or two way ANOVA? + non-parametric equivalents

    The 2-way ANOVA would be the appropriate parametric test. During the analysis, you can test the specific contrast in which you are interested and not test those that are not of interest. Don't worry about the distribution of the raw data. The distribution of the residuals is what is...
  3. Miner

    What P value level do you suggest for this study?

    I practice in industrial statistics where there is no pressure to publish. When I see p-values in that region, I will target that factor for further investigation. As an outsider, observing the reproducibility and replication crisis, the biggest failing that I see is the rush to publish based...
  4. Miner

    Compute prediction intervals in Random Forest Regression

    What is the problem that you are trying to solve? Beyond prediction, that is. What is unique about this process that you wouldn't use SPC to control the process?
  5. Miner

    Design of experiments across the time

    I'm not a SAS user, but if you are willing to share your data, I could look at it in Minitab.
  6. Miner

    Hello statisticians and data enthusiasts!

    Welcome, Cedric. I too am a practitioner in industrial statistics and am well versed in SPC and DOE. Not so much in sampling, though I can get by if necessary. I am new to machine learning but am accomplished in reliability, so I may be able to assist in applying that to your maintenance problems.
  7. Miner

    Question about autocorrelation

    It depends on the measure used in the time series. If you were tracking GDP, which is a measure of the performance of the economy, you would expect to see autocorrelation over shorter time periods. Whether you are in a boom or a depression/recession, consecutive measures of GDP would...
  8. Miner

    Generating info about a population from a sample

    Are the apartments of differing sizes (e.g., 1, 2 or 3 bedrooms), or quality? Or are there different levels of repair (e.g., painting, plumbing, HVAC)? Any of these might explain the grouping.
  9. Miner

    Generating info about a population from a sample

    It may be a little more complicated than that. A histogram of the data shows some gaps, and a probability (Q-Q) plot shows some dog leg bends that may indicate a mixture of 3 possible groups. See the attached graphs for this. If this is true, you will need to identify these groups and apply...
  10. Miner

    Correlation of two variables through others

    I think you are out of luck. The variability between each 1000 piece sample is likely to be much greater than the variability within each sample. This would overshadow any potential correlation that you are seeking. If the material were relatively homogeneous (e.g., steel, aluminum, etc.)...
  11. Miner

    Correlation of two variables through others

    Can you provide a little more context to your scenario? MOE = Modulus of Elasticity? What does frequency mean in this context? How were the samples collected? Are they all from the same batch of material? One group per batch? One/two piece(s) per batch?
  12. Miner

    Regression with error in the dependent var.

    In Minitab, use Orthogonal regression. This is designed for error in both independent and dependent variables.
  13. Miner

    Unfortunately, Minitab 18 took its Help files to the cloud and does not provide the formulas as...

    Unfortunately, Minitab 18 took its Help files to the cloud and does not provide the formulas as completely as before. However, notice that the SEs increase as the number At Risk decreases, so it is a function of sample size.
  14. Miner

    Kaplan-Meier Survival Analysis - advice wanted

    I use Minitab for reliability (survival) analysis. Because the data are left censored (no idea when the trees sprouted), right censored and interval censored (no idea when during a given year the tree died), I used a Turnbull estimate rather than Kaplan-Meier, but still duplicated your...
  15. Miner

    Statistical Validation of a Labelling Method

    Are these categories nominal or ordinal in nature? If nominal, try Fleiss' Kappa. If ordinal, use the Kendall rank correlation coefficient. Use a minimum of 50 samples. Pick a policy, then select samples that both pass the policy and fail the policy.
  16. Miner

    Group has lower overall % despite winning each category of %'s

    This is known as Simpson's Paradox.
  17. Miner

    How to deal with observations error in linear regression

    I think they are referring to errors in variables regression. Specifically, error in the independent measure.
  18. Miner

    Vector Autoregressive Models

    I thought I would throw this in from a course I teach in problem solving. There should be some overlap in statistics. Three rules of causality: (1) A correlation or association exists between the hypothesized cause and the effect. (2) Cause must precede effect in time. (3) The mechanism linking cause to...
  19. Miner

    Histograms Reveal Potential Problem

    Can you use nonparametric counterparts such as median, confidence interval for the median, quartiles, etc.?
  20. Miner

    Unusual distribution of the continuous outcome

    It appears there are 4 major groupings. Would it make sense to perform a cluster analysis of the HCPCS codes into 4 clusters then treat these clusters as an indicator variable in your regression? Would the results be interpretable?
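
    As a small illustration of the Simpson's Paradox answer in result 16, here is a minimal sketch in Python. The counts are illustrative only and are not taken from that thread; the names `counts`, `rate`, and `overall_rate` are made up for this example. It shows how group A can have the higher rate within each category and still have the lower pooled rate, because the two groups are spread very differently across the categories.

```python
from fractions import Fraction

# Illustrative (successes, trials) counts per category.
# These numbers are assumptions for the sketch, not data from the thread.
counts = {
    "A": {"cat1": (81, 87), "cat2": (192, 263)},
    "B": {"cat1": (234, 270), "cat2": (55, 80)},
}


def rate(successes, trials):
    """Exact success rate as a fraction (avoids float rounding)."""
    return Fraction(successes, trials)


def overall_rate(group):
    """Pooled rate for one group: total successes over total trials."""
    total_successes = sum(s for s, _ in counts[group].values())
    total_trials = sum(n for _, n in counts[group].values())
    return rate(total_successes, total_trials)


# A has the higher rate in every individual category...
a_wins_each = all(
    rate(*counts["A"][cat]) > rate(*counts["B"][cat]) for cat in counts["A"]
)
# ...yet the lower rate once the categories are pooled.
a_loses_overall = overall_rate("A") < overall_rate("B")

for group in counts:
    per_cat = {cat: float(rate(*sn)) for cat, sn in counts[group].items()}
    print(group, per_cat, "overall:", float(overall_rate(group)))
print("A wins each category:", a_wins_each)
print("A loses overall:", a_loses_overall)
```

    The reversal comes from the mix: most of A's trials fall in the category where both groups do worse, so A's pooled rate is pulled down even though it leads within each category.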