Blocked Randomization for varying sample sizes within sites

#1
Greetings,

I am reading about randomization techniques and have a curiosity: for multicenter trials, if one has a bunch of sites, each with varying sample sizes (for example, one site has only 4 participants enrolled, a few sites have no participants so far, another has 59, another has 68, another has 7 and so on), is it possible to use blocked randomization within each site to randomize the existing participants in 1:1 ratio to either group A or B?

For 59 participants, if you want a block of size 6, I guess you could do 9 blocks but then you'd have 5 participants left over. Is it customary to do block randomization for the 54 participants and then just simple randomization for the remaining 5? Or does it not make sense at all in such a situation to even bother with block randomization?

I have found plenty of examples, but they all seem to assume that every site has the same sample size (and, furthermore, that the sample size is an even number).

Thank you kindly for any advice/resources you can provide about this.
 

hlsmith

Less is more. Stay pure. Stay poor.
#2
Tell us more about your context. Typically you would draft the randomization scheme prior to starting. Do you already know your sample sizes at the sites? Note that they may not all volunteer if they are humans with autonomy.
 
#3
Oh I'm sorry, the context is the following: there is an ongoing multicenter trial (unrandomized trial) with participants enrolled at 17 different sites; these participants are receiving a drug per standard of care so no randomization was performed for this study. But as part of an exploratory objective, the investigator is now interested in randomizing the existing participants in a 1:1 ratio to either intervention A (watching a certain educational video) or group B (not watching that video) to see whether intervention A (i.e. watching the video) has any impact on the participants' decision to enroll in another study.

So, what I'm trying to do is produce a randomization list of existing participants at each site, with their respective assignments (either group A or B), while also accounting for an extra 10-20 participants at each site since enrollment is still ongoing for another 3 months.

What I've done so far is perform simple randomization for the existing participants: for the sites with an even number of participants I assigned half to one group and half to the other, whereas for the sites with an odd number I allocated in such a way that the difference between the two group sizes is no more than 1. Then I created a dataset of 340 additional participants (20 for each site), randomized this list (data 2), and finally combined the two lists to get my final list.

But I'm not sure whether this approach is valid or the best one, or whether I can/should also do blocked randomization for the existing participants. What confuses me about this case is that the sites have different sample sizes: anything from no participants enrolled yet to just 1 participant, 3, 59, 17, 68, 4, 6, 14, etc.

Thank you for any input/help!
 

hlsmith

#4
I still don't get how you know there are even or odd numbers at sites if the project is still enrolling?

There are three basic approaches when creating a randomization scheme. Simple random assignment: you expect 50 people and randomize all 50 at once. The limitation is that you can get runs of, say, five in one group, which is a problem if enrollment drops off or if patients enrolled close together in time are similar. Block randomization: the same thing, but you break the potential subjects into groups of, say, four and randomize within each group. This keeps the groups balanced if enrollment falls off or there is temporal covariance in subjects. Random block: the same again, but the sizes of the blocks are randomized as well, so they vary (4, 6, 2, 6, etc.). This helps keep participants and researchers blinded, since with a fixed block size they can figure out the pattern.
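
As an illustration of the third approach, here is a minimal Python sketch of a 1:1 schedule built from blocks whose sizes are drawn at random. The function name `random_block_schedule`, the candidate sizes (2, 4, 6) and the seed are assumptions made for the example, not anything prescribed in this thread.

```python
import random

def random_block_schedule(n_slots, block_sizes=(2, 4, 6), arms=("A", "B"), seed=None):
    """Hypothetical helper: permuted blocks with randomly chosen sizes, 1:1 allocation."""
    rng = random.Random(seed)
    schedule = []
    while len(schedule) < n_slots:
        size = rng.choice(block_sizes)             # draw the size of the next block
        block = list(arms) * (size // len(arms))   # equal number of each arm in the block
        rng.shuffle(block)                         # random order within the block
        schedule.extend(block)
    return schedule[:n_slots]                      # the final block may be cut short

schedule = random_block_schedule(20, seed=2024)    # seed value is arbitrary
print("".join(schedule))
```

Because every complete block is balanced, the difference between the two arms at any point in the list cannot exceed half of the largest block size (3 with the sizes assumed here).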

You are probably fine using the second option, with block sizes of maybe 4. And yes, you can do this for each site individually, and that would be preferred. So determine the expected number of subjects at each site, generate a block randomization scheme for, say, 40% more people than you expect to enroll, and use a different randomization seed for each location so that no two sites have the exact same scheme.
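
A rough Python sketch of that recipe, under stated assumptions: the site labels, the expected counts, the base seed and the helper name `blocked_schedule` are all hypothetical, and the list for each site is rounded up to whole blocks of 4 after adding the 40% margin.

```python
import math
import random

def blocked_schedule(n_slots, block_size=4, seed=None):
    """Hypothetical helper: 1:1 permuted blocks, returning whole blocks only."""
    rng = random.Random(seed)
    schedule = []
    while len(schedule) < n_slots:
        block = ["A", "B"] * (block_size // 2)   # balanced block
        rng.shuffle(block)                       # random order within the block
        schedule.extend(block)
    return schedule                              # may be slightly longer than n_slots

# Hypothetical expected enrollment per site (not the real trial's numbers).
expected = {"site_01": 4, "site_02": 59, "site_03": 7}
base_seed = 20240101                             # arbitrary; record it for reproducibility

site_lists = {}
for i, (site, n_expected) in enumerate(sorted(expected.items())):
    n_slots = math.ceil(n_expected * 1.4)        # about 40% more than expected
    site_lists[site] = blocked_schedule(n_slots, block_size=4, seed=base_seed + i)

for site, schedule in site_lists.items():
    print(site, len(schedule), "".join(schedule))
```

Participants are then assigned in enrollment order by reading down their site's list; unused slots at the end are simply left unassigned. Distinct seeds make identical lists unlikely but do not strictly rule them out for very short lists.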
 

hlsmith

#5
P.S. As you likely know, the primary purpose of randomization is to balance background characteristics that may influence the outcome, so that the two groups are balanced on known and unknown confounders.
 

hlsmith

#6
Thanks for the thumbs up. What I was trying to say, but didn't say directly, is that if there isn't really much confounding and there is a strong association between the intervention and the outcome, then slightly botching your randomization isn't that big of a deal. Also, given that you collected the known confounders, you can always put them in your outcome model after randomization and control for them with weights or covariates!
 
#7
Oh, when I say even or odd numbers at sites, I'm referring to the number of participants we currently have enrolled at those sites. Given that there are only 3 months left in which they can still enroll, and given the enrollment rates so far, we have reason to believe that not many more participants will enroll: definitely no more than 10 per site, if that. So it is quite possible that a site that now has, let's say, 7 participants enrolled will not enroll any others, hence the sample size at this site is odd, right? Or it may enroll 1 additional participant, making it 8 (even), or it might enroll 10 additional participants, in which case the sample size would still be odd (17).

So, taking this site with 17 participants (the current 7 plus at most 10 more) as an example, how would I do block randomization? I know that the block size has to be a multiple of the number of treatments, so in my case, with 2 treatments, I would need block sizes of 2, 4, 6, 8, etc. But 17 doesn't divide evenly by any of these, so I'm confused about how to do this.

Sorry for the potentially silly questions, I'm just confused because all the examples I'm reading about and the sample codes I could find seem to assume the ideal situation where you have something like here's 40 participants per site, do 10 random blocks of 4 lol.

Thank you very much for your helpful feedback, I greatly appreciate it!
 

hlsmith

#8
Yeah, for an odd number you will just have a slight imbalance in group sizes. There is nothing you can do about that except stop enrollment when you hit an even number, but that seems unnecessary. To limit the imbalance you could use blocks of two, I guess. Blocks that small may unblind the scheme to the study personnel, but it is up to you to decide if that is an issue.
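
To make the size of that imbalance concrete, here is a small Python sketch for the site that currently has 7 participants and may end anywhere between 7 and 17. It generates a blocked list, stops it at each possible final enrollment count, and records the largest difference between the two groups. The helper name `blocked_prefix` and the number of simulated lists are assumptions for the example.

```python
import random

def blocked_prefix(n, block_size, seed):
    """Hypothetical helper: first n assignments of a 1:1 permuted-block list."""
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        block = ["A", "B"] * (block_size // 2)   # balanced block
        rng.shuffle(block)                       # random order within the block
        out.extend(block)
    return out[:n]                               # last block left incomplete if needed

worst = {}
for block_size in (2, 4):
    worst[block_size] = 0
    for seed in range(200):                      # 200 simulated lists per block size
        schedule = blocked_prefix(17, block_size, seed)
        for n_final in range(7, 18):             # enrollment could stop anywhere from 7 to 17
            head = schedule[:n_final]
            diff = abs(head.count("A") - head.count("B"))
            worst[block_size] = max(worst[block_size], diff)

print(worst)
```

The difference can never exceed half the block size, so it is at most 1 with blocks of two and at most 2 with blocks of four. The list does not need to divide evenly into the final sample size; the last block is just left unfinished.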
 

hlsmith

#9
With these smaller numbers at multiple sites, I wonder whether you plan to control for site (e.g., with random effects) in the analyses.
 
#10
Yeah, blinding should not be an issue, as the investigators and study personnel are not blinded. Or would you say that in this situation I might be able to get away with simple randomization for each site? The reason I was thinking about block randomization is that I already did the simple randomization, but for sites with a relatively large number of participants (i.e., 50+) I noticed I'm getting sequences with up to 6 or 8 participants in a row assigned to the same group, like "AAAAAAAABB" for example, and that seemed a bit off.
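
For what it's worth, long runs like that are expected under simple randomization. The Python sketch below compares the longest run of identical assignments in simulated lists of 59 participants under independent coin-flip assignment and under permuted blocks of 4. The coin-flip model, the number of simulations and the seed are assumptions for the illustration; the fixed half-and-half allocation described earlier in the thread would behave similarly but not identically.

```python
import random
from itertools import groupby

def longest_run(seq):
    """Length of the longest stretch of identical consecutive assignments."""
    return max(len(list(group)) for _, group in groupby(seq))

def simple_schedule(n, rng):
    # Independent fair coin flip for every participant.
    return [rng.choice("AB") for _ in range(n)]

def blocked_schedule(n, block_size, rng):
    out = []
    while len(out) < n:
        block = ["A", "B"] * (block_size // 2)   # balanced block
        rng.shuffle(block)                       # random order within the block
        out.extend(block)
    return out[:n]

rng = random.Random(12345)                       # arbitrary seed
n, reps = 59, 2000                               # assumed site size and simulation count
simple_runs = [longest_run(simple_schedule(n, rng)) for _ in range(reps)]
blocked_runs = [longest_run(blocked_schedule(n, 4, rng)) for _ in range(reps)]

share_long = sum(run >= 6 for run in simple_runs) / reps
print("simple: share of lists with a run of 6 or more =", share_long)
print("simple: longest run seen =", max(simple_runs))
print("blocks of 4: longest run seen =", max(blocked_runs))
```

With 1:1 blocks of size 4, a run can contain at most two assignments from the end of one block and two from the start of the next, so it can never be longer than 4. Simple randomization has no such cap, which is why a sequence like "AAAAAAAABB" is not a sign that anything went wrong.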