Thread: data analysis for retrospective cohort study

1. data analysis for retrospective cohort study

Hi!!

I'm embarrassed to be asking very basic questions.

I proposed a study topic for my group assignment, looking at cancer rates between overseas born (OB) residents and local born (LA) residents. I proposed looking at 10 years of data (1999 to 2009).

I was thinking of calculating odds ratio/ relative risks using the formula:

OR = (Cancer(OB)/NoCancer(OB)) / (Cancer(LA)/NoCancer(LA))

I will be able to obtain the number of new cancers each year and separate them by OB or LA. I will also be able to obtain numbers of OB and LA residents each year.

Should I be calculating an OR for each year of the 10 years and then averaging the 10 values, or should I be adding up the totals for the 10 years and calculate one OR based on the totals?

Also, what I proposed is to look at new cancer cases every year, split them into OB and LA, and find out whether being OB is a "risk factor" for developing cancer, compared to LA. We are looking retrospectively at previously collected cancer data, which was collected along with country of birth (COB) data.

Would this be a retrospective cohort study, or a case control study?

Getting majorly confused here. Help would be appreciated.

Thanks
Nile

2. Re: data analysis for retrospective cohort study

Originally Posted by niledee
Should I be calculating an OR for each year of the 10 years and then averaging the 10 values, or should I be adding up the totals for the 10 years and calculate one OR based on the totals?
Your question should drive this decision. Averaging over the 10 years will put an equal weight on each of the ten years. It's reasonable to me if you want to consider the years equally. Adding up the totals does not do this. Instead, it weights each year by its population.

So for example, if you had 10 million people in year 1 and 10 people in year 2, the OR from the two-year totals would be driven almost entirely by year 1 (the overall effect), while the average of the two yearly ORs would count each year's estimate equally (the marginal yearly effects).
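To make the weighting difference concrete, here is a minimal Python sketch. All counts are hypothetical and chosen only to exaggerate the contrast between a huge year and a tiny year. The helper `odds_ratio` is an illustrative name, not from any library.

```python
def odds_ratio(cases_ob, pop_ob, cases_la, pop_la):
    """OR = (Cancer(OB)/NoCancer(OB)) / (Cancer(LA)/NoCancer(LA))."""
    odds_ob = cases_ob / (pop_ob - cases_ob)
    odds_la = cases_la / (pop_la - cases_la)
    return odds_ob / odds_la

# Hypothetical data: (cases_ob, pop_ob, cases_la, pop_la) per year.
years = [
    (600, 4_000_000, 600, 6_000_000),  # year 1: about 10 million people
    (2, 4, 1, 6),                      # year 2: only 10 people
]

# Option A: one OR per year, then a plain average (equal weight per year).
yearly_ors = [odds_ratio(*year) for year in years]
mean_of_yearly = sum(yearly_ors) / len(yearly_ors)

# Option B: add up the counts first, then one OR (weight follows population).
totals = [sum(column) for column in zip(*years)]
pooled_or = odds_ratio(*totals)

print(f"yearly ORs:     {[round(x, 2) for x in yearly_ors]}")
print(f"mean of yearly: {mean_of_yearly:.2f}")
print(f"pooled OR:      {pooled_or:.2f}")
```

With these made-up numbers the pooled OR stays close to the year 1 value, while the plain average is pulled strongly toward the noisy 10-person year.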

Originally Posted by niledee
Also, what I proposed is to look at new cancer cases every year, split them into OB and LA, and find out whether being OB is a "risk factor" for developing cancer, compared to LA. We are looking retrospectively at previously collected cancer data, which was collected along with country of birth (COB) data.
I see no problem with this. However, if you really want to answer the question, you'll need to look for confounders that will bias your results. I'm sure OB people will have some traits that make them more vulnerable to cancer. Likewise, LA is known for its pollution. Residents are exposed to it from childhood, which may make them more vulnerable as well.

Originally Posted by niledee
Would this be a retrospective cohort study, or a case control study?
Please define each of these types of studies for me. You may find the answer while doing so.

3. Re: data analysis for retrospective cohort study

THANK YOU!!!!

It's a relief to know that what I thought of might work. I am currently studying a health information management course, and am doing an epidemiology unit. My stats experience has primarily been in RCTs and associated stats.

We have a group assignment in study design and I proposed the study which I mentioned in my last post. I was in the process of working out how the data could be analysed and got stuck, so thanks for the help.

I will go with calculating ORs for each year and averaging them over the 10 years. I just wasn't sure if it was the done thing, but now you have clarified that it would seem reasonable. The problem I have with summing the totals for each year and then calculating one OR based on the 10-year totals is totalling the population numbers. I would be quite happy to sum up the cancer incidence totals for each year, as the data would reflect the number of cancer diagnoses each year, regardless of which cancer, and it doesn't matter if they occurred in the same people more than once. I would still be getting the incidence of cancers each year, which is what I want. However, I was uncomfortable with summing up the population each year, as the large majority of people would be counted repeatedly. I.e. if the population was 10 million one year and 10.2 million the next, and the cancer incidence was 1000 one year and 900 the next, then if I summed them I would be saying that 1900 cancers were diagnosed in 20.2 million people, which is not correct. Whereas if I did 1000 cancers in 10 million people, and 900 cancers in 10.2 million people, and averaged it, I'd be happier.
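For what it's worth, the summed denominator can also be read as person-years rather than people, which is a common convention for incidence rates: 20.2 million person-years of observation, not 20.2 million distinct people. A small Python sketch using only the figures above (the variable names are just illustrative) shows how the two approaches compare:

```python
# Figures from the example above: two years of cases and population.
cases = [1000, 900]
population = [10_000_000, 10_200_000]

# Approach 1: a rate for each year, then the plain average of the rates.
yearly_rates = [c / p for c, p in zip(cases, population)]
mean_rate = sum(yearly_rates) / len(yearly_rates)

# Approach 2: total cases over total person-years.
# Each resident contributes one person-year for every year they are counted,
# so repeated counting of the same people is expected, not an error.
person_years = sum(population)
pooled_rate = sum(cases) / person_years

per_100k = 100_000
print(f"mean of yearly rates: {mean_rate * per_100k:.3f} per 100,000 per year")
print(f"pooled rate:          {pooled_rate * per_100k:.3f} per 100,000 person-years")
```

When the population changes little from year to year, as here, both approaches give almost the same rate.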

You really didn't have to read all that, but thanks for helping me figure it out.

As for confounding factors, I am aware that I have to consider them. For the purposes of this assignment, what I may do is assume that the significant known cancer risks (i.e. smoking, obesity etc.) are equal in both the OB and LA groups.

The other method that I am going to find out more about is logistic regression and how it may apply to what I am proposing.

For this study, I am looking at people who have the disease and looking back at the exposure factor (i.e. OB or LA), therefore I would say that it's a retrospective cohort study.

Case controls look at people with and without the disease, and then look back at the exposure factors which may be different between the two groups.

I hope I am right to say that. I just got a bit confused because the textbook which I am using listed odds ratios and relative risks as statistics used in case control studies, so I wasn't sure if I could also use the same stats for a cohort study design.
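As a rough check on using either statistic, here is a short Python sketch that computes both the relative risk and the odds ratio from one 2x2 table. The counts are hypothetical, made up purely for illustration. When the outcome is rare, as cancer is in a single year, the two values come out very close.

```python
# Hypothetical one-year counts (illustration only, not real data).
cases_ob, pop_ob = 120, 100_000   # overseas born: new cancers, residents
cases_la, pop_la = 100, 100_000   # local born: new cancers, residents

# Relative risk: ratio of the two incidence proportions.
risk_ob = cases_ob / pop_ob
risk_la = cases_la / pop_la
relative_risk = risk_ob / risk_la

# Odds ratio: ratio of the two odds, cancer versus no cancer.
odds_ob = cases_ob / (pop_ob - cases_ob)
odds_la = cases_la / (pop_la - cases_la)
odds_ratio_value = odds_ob / odds_la

print(f"relative risk: {relative_risk:.4f}")
print(f"odds ratio:    {odds_ratio_value:.4f}")
```

The relative risk needs the full population denominators, which is exactly what the yearly OB and LA resident counts provide.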

Once again, thanks a lot for your help. Very much appreciated.

Nile
