Matching in discovery nested case-control study

Hi, I want to identify differentially expressed RNAs in cases versus controls from a cohort study.

First, I would like to identify promising candidate RNAs in a study on 10 cases versus 10 controls. Then I would validate them in the larger cohort.

What would be the best way to match cases and controls in this discovery nested case-control study? Should I just match by age and sex? Or should I make cases and controls as similar as possible with matching on multiple possible confounders (e.g. by propensity score matching)?
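For concreteness, the simpler of the two options (exact match on sex, nearest match on age) could look roughly like the sketch below. It is only an illustration under stated assumptions: the `Subject` record, the `match_controls` helper, the 5-year age caliper, and the toy data are all made up for this example. It also ignores the risk-set (time-of-event) sampling that a nested case-control design would normally use.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    id: str
    sex: str
    age: float


def match_controls(cases, controls, max_age_gap=5.0):
    """Greedy 1:1 matching: exact on sex, nearest on age within a caliper.

    Hypothetical helper for illustration; max_age_gap is an assumed caliper.
    """
    available = list(controls)  # controls not yet used (matching without replacement)
    pairs = []
    for case in cases:
        candidates = [
            c for c in available
            if c.sex == case.sex and abs(c.age - case.age) <= max_age_gap
        ]
        if not candidates:
            pairs.append((case, None))  # no eligible control left for this case
            continue
        best = min(candidates, key=lambda c: abs(c.age - case.age))
        available.remove(best)
        pairs.append((case, best))
    return pairs


# Toy data, invented for the example.
cases = [Subject("case1", "F", 61), Subject("case2", "M", 55), Subject("case3", "M", 70)]
controls = [
    Subject("ctrl1", "M", 54),
    Subject("ctrl2", "F", 63),
    Subject("ctrl3", "F", 60),
    Subject("ctrl4", "M", 82),
]

pairs = match_controls(cases, controls)
for case, ctrl in pairs:
    print(case.id, "->", ctrl.id if ctrl else "unmatched")
```

Greedy matching depends on the order in which cases are processed, and every variable added to the matching rule shrinks the pool of eligible controls, which is part of the trade-off I am asking about.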


We don't know your context, so we can't make specific recommendations. Typically you would want to make the two groups as close as possible on all factors that may affect both the exposure and the outcome, as well as on other factors that affect the outcome, so that the only difference is the exposure of interest. However, I believe case-control designs are trickier: if you match cases (y=1) and controls (y=0) so closely that the exposure is effectively matched as well, then you will never see an effect, since the prevalence of the exposure will be perfectly balanced between cases and controls, correct?
Thank you for your reply. There is not much more context to it. Using total RNA sequencing, I would like to identify RNAs associated with being a case (a patient who will develop a cardiovascular event) versus being a control.

And I know that there is a trade-off: if cases and controls are made too similar, in the end there is not really any difference left between them. Further, for every covariate that I match on, I also need to adjust for it in the analysis. But with 10 cases vs 10 controls I cannot adjust for many variables.
So I am still unsure if I should match for sex and age only, or to include more variables in matching to make them more similar.
Any recommendations?
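On the adjustment concern above: one option, offered only as a sketch and not as a recommendation, is to analyse each RNA on the within-pair differences. A paired analysis conditions on the matched pair rather than spending a model term on each matching variable. The example below runs an exact sign-flip permutation test for a single RNA. The function name `paired_sign_flip_pvalue` and the expression values are invented for illustration.

```python
from itertools import product


def paired_sign_flip_pvalue(case_values, control_values):
    """Exact two-sided sign-flip permutation test on within-pair differences.

    Hypothetical helper: under the null hypothesis, each case-minus-control
    difference is equally likely to be positive or negative, so all 2**n
    sign patterns are enumerated.
    """
    diffs = [a - b for a, b in zip(case_values, control_values)]
    observed = abs(sum(diffs))
    n_extreme = 0
    n_total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        if stat >= observed - 1e-12:  # small tolerance for floating-point ties
            n_extreme += 1
        n_total += 1
    return n_extreme / n_total


# Made-up log-expression values for one RNA in 10 matched pairs (same pair order).
case_expr = [5.1, 6.3, 4.8, 5.9, 6.1, 5.5, 6.0, 5.2, 5.8, 6.4]
ctrl_expr = [4.9, 5.8, 5.0, 5.1, 5.7, 5.6, 5.2, 4.7, 5.9, 5.6]

p = paired_sign_flip_pvalue(case_expr, ctrl_expr)
print(f"two-sided p-value: {p:.4f}")
```

With 10 pairs there are only 2^10 = 1024 sign patterns, so the smallest two-sided p-value this test can return is 2/1024 (about 0.002). That is worth keeping in mind if thousands of RNAs are going to be screened.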