How to determine distribution for failure data?

#1
I am trying to find a structured way to deal with failure data. My dataset has about 100 observations of machine life; only 40 of the machines failed during the tests, and the rest are right-censored. It also includes several parameters used in testing the machines, such as temperature and pressure. I am using survival analysis on this data and have so far used nonparametric methods. But I think I am missing a step: understanding the distribution of the data before deciding which modeling approach to use. From what I have gathered online, the Weibull distribution comes up often, but I would like to know what I should do first to analyse the data properly and to predict when a machine will fail using survival analysis.

I hope to get some insight into this, because I feel I cannot currently justify my results: just fitting various nonparametric or semiparametric models is not enough without making valid assumptions and doing hypothesis tests.

Thank you.
 

Miner

TS Contributor
#2
There are several options for you to investigate. Some software packages (e.g., Minitab, Reliasoft) have distribution identification analyses that account for censored data. This is your best option. The next option is to use software to plot the data on probability plots for various distributions. Unfortunately, this will not take the censored data into account. The last is the manual version of the preceding option.

The Weibull distribution is frequently used by reliability practitioners because it is very flexible and can fit many data sets closely enough. However, it is not always the best fit. Several things to consider: What type of failures are you seeing: random or wearout? Do you have a mix of different failure modes that may each have their own distribution? Reliasoft can handle mixed failure modes, but few other packages can.
 
#3
Hi Miner, thanks for your input. I am not familiar with these software packages and am primarily using R and Python at the moment, but I will try Minitab.

I am observing time to wearout: the moment a machine reaches the wearout threshold, it is said to have failed and must be replaced. There is no failure mode in this sense, just the point at which the machine reaches its usable threshold.
 

Miner

TS Contributor
#4
Is the wearout threshold based on leading indicators such as temperature or vibration? I attached a table of how the more common reliability distributions are used.


[Attached image: 1569245322882.png]
 
#5
Yes, there are covariates involved, including temperature and vibration, and this is test data with replicates for each test condition. I have included an image of the data; note that it is only an example of how the data is structured. I have the same type of data for Machine Type B. The 100 Type A machines include the test replicates, with each test configuration replicated at least 3 times.

At the moment I am trying some nonparametric approaches, e.g. Kaplan-Meier and the log-rank test, but I am not entirely sure whether I am making the correct assumptions and running the right hypothesis tests.
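
For reference, a minimal sketch of the Kaplan-Meier estimate in plain Python, to make explicit what the method does with right-censored machines: they count in the at-risk set until their censoring time but never as failures. The function name `kaplan_meier` and the eight data points are made up for illustration and are not from the dataset discussed here.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct failure time.

    times  : observed time per machine (failure time or censoring time)
    events : 1 if the machine failed at that time, 0 if right-censored
    """
    failures = Counter(t for t, e in zip(times, events) if e == 1)
    curve = []
    surv = 1.0
    for t in sorted(failures):
        # Machines still under observation just before time t
        at_risk = sum(1 for u in times if u >= t)
        surv *= 1.0 - failures[t] / at_risk
        curve.append((t, surv))
    return curve


# Hypothetical data: hours on test; 1 = failed, 0 = still running (censored)
times = [120, 150, 150, 200, 230, 300, 300, 300]
events = [1, 1, 0, 1, 1, 0, 0, 0]

km = kaplan_meier(times, events)
for t, s in km:
    print(f"t = {t:>4}  S(t) = {s:.3f}")
```

The only assumption built into the estimate is that censoring is non-informative, i.e. machines removed from test are not systematically closer to (or further from) failure than those that remain. The curve stops changing after the last observed failure, so it says nothing about times beyond the test window.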
 


Miner

TS Contributor
#6
The big problem with nonparametric reliability is that you cannot make predictions beyond your data window. That is the big advantage of parametric reliability. It allows predictions beyond that window.

Since you have temperature and vibration data, you have another option called degradation analysis. This eliminates the censored data concerns and uses 100% of your data.
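
As a rough illustration of the degradation idea, the sketch below fits a straight line to repeated wear measurements on one machine and solves for the time at which the fitted line reaches the wearout threshold. Everything in it is an assumption for illustration: the linear degradation path, the names `fit_line` and `time_to_threshold`, the measurement values, and the threshold of 5.0. Real degradation paths are often nonlinear, so the model form would need to be checked against the data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def time_to_threshold(hours, wear, threshold):
    """Time at which the fitted line crosses the threshold, or None if wear is not increasing."""
    slope, intercept = fit_line(hours, wear)
    if slope <= 0:
        return None
    return (threshold - intercept) / slope


# Hypothetical inspection data for a single machine
hours = [0, 100, 200, 300, 400]
wear = [0.0, 0.6, 0.9, 1.6, 1.9]   # measured wear, arbitrary units
threshold = 5.0                    # assumed wearout threshold

predicted = time_to_threshold(hours, wear, threshold)
print(f"Projected time to threshold: {predicted:.0f} hours")
```

Because every machine contributes a projected failure time, including those that had not yet reached the threshold when the test ended, no observation is censored in this approach. The price is that each projection is an extrapolation of the fitted path.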
 
#7
That's true, but what I am trying to avoid is the need to extrapolate, because there is no way to validate an extrapolation beyond the time window of the censored data when there are no true failure times to confirm it. That is why I am using survival analysis, which as I understand it also takes censored observations into account by treating the unobserved failure time as a latent feature.
 

Miner

TS Contributor
#8
You should be using maximum likelihood estimation (MLE) in your parametric analysis. MLE uses the censored data in its estimates.
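
To show how censored observations enter the likelihood, here is a minimal sketch of a Weibull fit by maximum likelihood in plain Python. A failure at time t contributes the log density, log f(t); a machine still running at time t contributes the log survival probability, log S(t) = -(t/scale)^shape. The function names, the bisection bounds, and the data are assumptions for illustration; in practice a survival package in R or Python would do this fit and also report confidence intervals.

```python
import math


def weibull_loglik(shape, scale, times, events):
    """Weibull log-likelihood with right censoring (event = 1 failed, 0 censored)."""
    total = 0.0
    for t, e in zip(times, events):
        z = (t / scale) ** shape
        if e:
            # Failure: log of the density f(t)
            total += math.log(shape / scale) + (shape - 1.0) * math.log(t / scale) - z
        else:
            # Censored: log of the survival function S(t)
            total += -z
    return total


def weibull_mle_censored(times, events, lo=0.01, hi=100.0, iters=200):
    """Solve the profile score equation for the shape by bisection, then get the scale."""
    d = sum(events)
    if d == 0:
        raise ValueError("need at least one observed failure")
    tmax = max(times)
    s = [t / tmax for t in times]  # rescale to avoid overflow in s ** k
    mean_log_fail = sum(math.log(x) for x, e in zip(s, events) if e) / d

    def score(k):
        w = [x ** k for x in s]
        return 1.0 / k + mean_log_fail - sum(wi * math.log(x) for wi, x in zip(w, s)) / sum(w)

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    scale = tmax * (sum(x ** shape for x in s) / d) ** (1.0 / shape)
    return shape, scale


# Hypothetical data: hours on test; 1 = failed, 0 = right-censored
times = [120, 150, 150, 200, 230, 300, 300, 300]
events = [1, 1, 0, 1, 1, 0, 0, 0]

shape_hat, scale_hat = weibull_mle_censored(times, events)
print(f"shape = {shape_hat:.3f}, scale = {scale_hat:.1f}")
```

Dropping the censored machines, or treating their censoring times as failures, would bias the estimated life downward. Keeping them as survival terms is what lets the fit use all of the observations.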
 
#10
I have a follow-up question on the MLE suggestion, after looking into Weibull distributions for failure data. If my intention is to predict the time of failure for each machine, can I still use the parametric method with a Weibull distribution? I am now a little confused about whether I would be estimating the lifetime of the population or of a particular machine. Thanks
 

Miner

TS Contributor
#11
Weibull analysis would provide probabilities for the entire population of machines. If you want the expected life of an individual machine, use the degradation analysis approach.
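
A short sketch of the kind of population-level statements a fitted Weibull model supports: the fraction of machines expected to survive past a given time, the time by which a given fraction will have failed (the "B10 life" for 10%), and the mean life. The shape and scale values below are placeholders, not estimates from any data in this thread, and the function names are made up for illustration.

```python
import math


def reliability(t, shape, scale):
    """Fraction of the population expected to survive past time t."""
    return math.exp(-((t / scale) ** shape))


def b_life(p, shape, scale):
    """Time by which a fraction p of the population is expected to have failed."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


def mean_life(shape, scale):
    """Mean time to failure of the population."""
    return scale * math.gamma(1.0 + 1.0 / shape)


# Placeholder parameters (assumed, for illustration only)
shape, scale = 2.0, 1000.0

print(f"R(500 h)  = {reliability(500.0, shape, scale):.3f}")
print(f"B10 life  = {b_life(0.10, shape, scale):.1f} h")
print(f"Mean life = {mean_life(shape, scale):.1f} h")
```

None of these numbers says when one particular machine will fail; they describe the distribution that all machines of the same type and test condition are assumed to share.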
 
#12
Thanks for your reply. So if I still want to take censored data into account, would a Cox regression model be a better approach for predicting the life of each machine?
 

Miner

TS Contributor
#13
That might work provided that you have the values of the predictor variables for each machine. You should also consider the lower confidence limit rather than the point estimate of the life.
 
#14
Okay, I think I am going down a rabbit hole with this, because I am more familiar with machine learning models, where the predicted value is compared against the response value and the error is computed from how far off the prediction is.

I realised that with the Cox regression model, the output of the prediction is not actually the time to event but rather a hazard rate for each subject. So I am not sure now how to get the expected time to event for each machine using Cox regression. The documentation on this is quite misleading, or perhaps it is my own lack of understanding when trying to interpret the results of the model.
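
One common way to get from a Cox model to a time, sketched here in plain Python: under proportional hazards, a machine with covariates x has survival curve S(t | x) = S0(t) ^ exp(b . x), where S0 is the estimated baseline survival and b the fitted coefficients. A time such as the median can then be read off that curve. The baseline values, coefficients, and covariates below are invented for illustration; in a real analysis they would come from the fitted model.

```python
import math


def survival_curve(baseline, coefs, covariates):
    """Per-machine survival curve from baseline survival and Cox coefficients."""
    relative_risk = math.exp(sum(b * x for b, x in zip(coefs, covariates)))
    return [(t, s0 ** relative_risk) for t, s0 in baseline]


def median_time(curve):
    """First time at which predicted survival drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical baseline survival S0(t) on a grid of test hours
baseline = [(100, 0.95), (200, 0.85), (300, 0.70), (400, 0.55), (500, 0.40)]
# Hypothetical coefficients for centered temperature and centered vibration
coefs = [0.03, 0.5]

harsh = survival_curve(baseline, coefs, [10.0, 0.4])    # hotter, more vibration
mild = survival_curve(baseline, coefs, [-10.0, -0.4])   # cooler, less vibration

print("Median life, harsh condition:", median_time(harsh))
print("Median life, mild condition: ", median_time(mild))
```

For the mild condition the predicted survival never falls to 0.5 within the observed window, so no median can be reported without extrapolating. Predictions of this kind are usually scored with a concordance index or a censoring-aware error measure rather than a plain difference between predicted and observed time, since the true failure time of a censored machine is unknown.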
 
#15
I tried to send you a PM, but the site does not allow me to do so for now. I'd appreciate some expert advice on the research topic I am working on, so if you have permission to send me a message, I would appreciate that. Thank you very much.