Some help needed regarding how to conduct an external validation study

#1
Hello everyone. Hope you're well.

I'm a second-year surgical resident from Pakistan, and I'm hoping to conduct an external validation of a 2016 study from France. I've linked the study below.

https://pubmed.ncbi.nlm.nih.gov/27329073/

That paper describes the development of the score. I want to conduct an external validation of it.

So my initial questions are:

1. What sample size should I take for my study to get a good result? (and how would I go about this?)
2. What tests do I need to conduct for this external validation?
3. How do I run those tests? (all I'm familiar with is entering data in SPSS and running some very basic cross-tab stuff)

Any help here would be greatly appreciated, and please do ask if there are any questions. (I also have a PDF of the original paper.)
 

hlsmith

Less is more. Stay pure. Stay poor.
#2
See if you can post/attach the article - it is behind a paywall and I am too lazy to log in to access it.

Just a heads up: given the description in the abstract, the study seems, meh. The full-text version will help to evaluate its methods and to see whether they presented confidence intervals on the accuracy estimates.

Are you hoping to apply these criteria to your retrospective data and examine their utility? If so, this means your study is based on secondary data - so will you have access to accurate data for the listed covariates? If desired, you could also add additional information/predictors of your own. A big question is how common the outcome they modeled is. If it is rare, their model may be too saturated.
 
#3

This is also a reply to the other reply in the welcome post. My god, you have no idea how wide the smile was that this brought to my face. I'm a PGY2 myself. The hospital I work at is one of the three big ones in the city. I checked the definition of a Level 1 trauma center and I'm not sure any of the three hospitals here meet all those criteria, but we very rarely refer cases to other hospitals, so there's no dearth of cases here (I don't know if that's pertinent at all, but I thought it would help to show the place I'm working at).
I don't mind being questioned about the research at all, but I do need the help of a research methodologist to plan, and then a statistician (or anyone who understands the methods) to teach me. I don't know how much help I'll get here (always keep expectations low), but your response filled me with so much hope. Thank you for that!

I'm hoping to collect data prospectively, because even though our data collection has gotten better over the years, it's still not comprehensive (I suppose it'll be another 10 years before it is). So it'll all be prospective data collection.

Specifically in reply to your questions: I'm hoping to prospectively apply these criteria to new patients, so that should make it primary data. As far as additional information/predictors go, they already did that when developing the score itself. The outcome (difficulty level of cholecystectomy) is quite common, so rarity shouldn't be a big problem, I'd assume.

I have like a thousand thoughts racing through my mind and I don't want to type them all out and turn this post into verbal vomit. So if you've got any more questions, please do ask away!
 


hlsmith

#4
Well, I am out of the office due to the US holiday season. I'll skim the paper next week, but going prospectively you will be able to see whether the criteria work in real time. Is there a threat that you will change practices or actions in a way that would interfere with the consistency of data collection going forward? Or are all data collected prior to the outcome?

If the latter, your next concern would be, how do you define 'valid'?

Are you in Pakistan? I am in the Midwest of the US.
 
#5
Oh happy holidays!

And on the point of data collection, I don't think there's any threat of me (or my seniors, who would be the surgeons in this case) changing any practices. All we'll be doing is some extra data collection for the patients, an extra blood test on top of the baseline tests we already do, and timing the surgeons. The inclusion and exclusion criteria are the same as given in the paper. All data are indeed collected prior to the outcome.

"If the latter, your next concern would be, how do you define 'valid'?"

Alright, this is the first time I'm a little stuck. I'm not sure I understand the question.

And yes I'm in Pakistan. Peshawar specifically.
 
#6

Is the holiday over yet? Haha.
 

hlsmith

#7
How does your health system work - in that, would an extra blood test mean using already-collected blood for another test, collecting more blood, or collecting blood when it was not previously collected? In any of these scenarios, will you inform the patient of this via consent, and who pays for the test?

I'll warn you that just because a study is published doesn't mean it is without fault. The referenced study says "multivariate" when they mean multiple logistic regression, and they also use backwards stepwise regression; most stepwise approaches to variable selection are frowned upon. Also, they used the same dataset to create the model as to test its abilities, which will always over-represent its utility. They do a bootstrap correction, which is better than nothing, but they needed to test it on a holdout dataset to better know its utility.
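To see concretely why testing on the development data over-represents a model's utility, here is a toy simulation in Python (illustrative only - the data are pure noise, and nothing here comes from the paper). "Developing" a score by picking the best-looking predictor on the same data yields an apparent AUC above chance, while on fresh data it falls back toward 0.5:

```python
import numpy as np

def auc(scores, labels):
    """Mann-Whitney AUC: probability that a random positive case
    outscores a random negative case (ties count half)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()

rng = np.random.default_rng(0)
n, p = 120, 20
X = rng.normal(size=(n, p))        # 20 pure-noise "predictors"
y = rng.integers(0, 2, size=n)     # outcome unrelated to any of them

# "Develop" a score on this data: keep the single best-looking predictor.
best_j = max(range(p), key=lambda j: auc(X[:, j], y))
apparent_auc = auc(X[:, best_j], y)

# External validation: apply the chosen predictor to fresh patients.
X_new = rng.normal(size=(n, p))
y_new = rng.integers(0, 2, size=n)
external_auc = auc(X_new[:, best_j], y_new)

print(apparent_auc, external_auc)  # apparent looks good; external is near chance
```

The same mechanism, in milder form, inflates the performance of any model evaluated on its own development data; a bootstrap correction shrinks the optimism, but a genuinely separate dataset is the cleaner test.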

Of note, they did not provide confidence intervals around the AUC values, so we don't know their precision. I just skimmed the paper, but we also don't have estimates for sensitivity, specificity, PPV, or NPV at the cutoffs, with confidence intervals, so we don't know whether it tended to over- or under-predict the outcome.
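For your own validation study, confidence intervals are straightforward to get with a percentile bootstrap: resample your patients with replacement and recompute the statistic each time. A minimal Python sketch for the AUC (the same resampling idea works for sensitivity, specificity, PPV, and NPV at a chosen cutoff; the data below are simulated stand-ins, not the paper's):

```python
import numpy as np

def auc(scores, labels):
    """Mann-Whitney AUC (ties count half)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()

def bootstrap_auc_ci(scores, labels, n_boot=2000, alpha=0.05, seed=1):
    """Percentile-bootstrap CI: resample patients with replacement,
    recompute the AUC, and take the alpha/2 and 1-alpha/2 quantiles."""
    rng = np.random.default_rng(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = rng.integers(0, n, size=n)
        if labels[idx].min() == labels[idx].max():
            continue  # a resample needs both outcome classes
        stats.append(auc(scores[idx], labels[idx]))
    return tuple(np.quantile(stats, [alpha / 2, 1 - alpha / 2]))

# Simulated example: an informative score on 200 hypothetical patients.
rng = np.random.default_rng(7)
y = rng.integers(0, 2, size=200)
score = y + rng.normal(scale=0.7, size=200)
lo, hi = bootstrap_auc_ci(score, y)
print(round(auc(score, y), 3), (round(lo, 3), round(hi, 3)))
```

Reporting the interval, not just the point estimate, is what lets a reader judge whether an AUC of, say, 0.80 is precisely estimated or could plausibly be anywhere from 0.65 to 0.90.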

The study is also based on a little more than 400 patients, which gets sparser when they break them into risk categories, and you need to question whether those people represent all similar patients, or your patients.

So given all this, yes, you could attempt to examine its predictive utility in your patients, but how do you want to define validation? You could do it by requiring a certain level of precision on each category's estimate, then work out how many people you would need so the confidence intervals are not too wide. It will get tricky for the lowest risk category, since it is so close to the null.
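One rough way to turn "a certain level of precision" into a sample size is to invert the normal-approximation confidence interval for a proportion. A sketch (the proportions and widths below are arbitrary examples, not the paper's category rates; for proportions very near 0, like a lowest-risk category, the normal approximation is poor and an exact or Wilson interval is the better choice):

```python
from math import ceil

def n_for_proportion_ci(p, half_width, z=1.96):
    """Patients needed so a 95% normal-approximation CI for an event
    proportion p has the requested half-width: n = z^2 * p(1-p) / h^2."""
    return ceil(z**2 * p * (1 - p) / half_width**2)

print(n_for_proportion_ci(0.50, 0.05))  # 385 patients for 0.50 +/- 0.05
print(n_for_proportion_ci(0.05, 0.02))  # 457 patients for 0.05 +/- 0.02
```

Note that the second example needs more patients even though the interval is narrower in absolute terms - pinning down a rate near zero with useful relative precision is exactly where the lowest-risk category gets tricky.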