First, I read an example in "Pseudoreplication is a pseudoproblem" in which we want to determine which of two urns contains the greater proportion of red to blue marbles. Each urn contains several thousand marbles. The authors suggest drawing 10 samples of 10 marbles from each urn, computing the colour frequency in each sample, and performing a two-sample t-test with 18 degrees of freedom.
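
To make sure I understand the procedure, here is a minimal sketch of it in plain Python (standard library only, to keep it self-contained). The red proportions 0.60 and 0.40, the seed, and the function names are my own assumptions for illustration, not values from the paper. I also assume the pooled-variance form of the t-test, since that is what gives 10 + 10 - 2 = 18 degrees of freedom.

```python
import random
from math import sqrt
from statistics import mean, variance

def sample_proportions(p_red, n_samples=10, sample_size=10, rng=random):
    """Draw n_samples samples of sample_size marbles; return the red proportion of each."""
    # The urn holds thousands of marbles, so draws are treated as independent.
    return [sum(rng.random() < p_red for _ in range(sample_size)) / sample_size
            for _ in range(n_samples)]

def pooled_t(x, y):
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / sqrt(sp2 * (1 / nx + 1 / ny))
    return t, df

rng = random.Random(1)                      # arbitrary seed
urn_a = sample_proportions(0.60, rng=rng)   # hypothetical red proportion
urn_b = sample_proportions(0.40, rng=rng)   # hypothetical red proportion
t, df = pooled_t(urn_a, urn_b)
print(f"t = {t:.2f}, df = {df}")
```

Here each sample of 10 marbles contributes a single number (its proportion), so the test sees 10 observations per urn rather than 100.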

Why can't we draw a single sample of 100 marbles from each urn, code each marble as binary (1 for blue, 0 for red), and fit a GLM with a binomial family?
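
Here is a rough sketch of what I mean, in plain Python with only the standard library. With a single two-level predictor (urn), the slope of a logit-link binomial GLM equals the log odds ratio of the 2x2 table, so I compute that directly instead of fitting a model. The blue proportions 0.40 and 0.60, the seed, and the function names are assumptions for illustration only.

```python
import random
from math import log, sqrt

def draw_binary(p_blue, n=100, rng=random):
    """Draw n marbles from one urn, coded 1 for blue and 0 for red."""
    return [1 if rng.random() < p_blue else 0 for _ in range(n)]

def log_odds_ratio(y_a, y_b):
    """Log odds ratio of blue (urn A vs urn B) and its Wald standard error.

    Fails if any cell of the 2x2 table is zero.
    """
    a, b = sum(y_a), len(y_a) - sum(y_a)   # blue, red in urn A
    c, d = sum(y_b), len(y_b) - sum(y_b)   # blue, red in urn B
    est = log((a * d) / (b * c))
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return est, se

rng = random.Random(1)            # arbitrary seed
urn_a = draw_binary(0.40, rng=rng)   # hypothetical blue proportion
urn_b = draw_binary(0.60, rng=rng)   # hypothetical blue proportion
est, se = log_odds_ratio(urn_a, urn_b)
print(f"log odds ratio = {est:.3f}, SE = {se:.3f}, z = {est / se:.2f}")
```

This treats each of the 100 marbles as an independent observation, which is exactly the step I am unsure about.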

If I were studying the diameter of the marbles instead, should I sample in the same way?

The authors wrote that the urn can be considered the experimental unit in a design without replication. Suppose we now have replication (4 urns, 2 per condition). How can I include the replication in my model? As a random effect?

Another example: I want to know whether fish mortality differs between 2 conditions. I have two water compartments, one per condition. Within each compartment, the fish are raised in 3 different cages. In each cage, I sample either 10 times 10 individuals or 100 individuals at once.

Should cage be a random effect? If I have replication (4 water compartments) as well as cages, how should I analyse the data? For information, I'm using R. Thank you.