Could you please elaborate on your design? How many Alanines are there in your protein sequence? How were the random sequences generated? And if they are random, why do you call them "normal"? I think a normal protein sequence cannot be random, right?

Then what is your question: the one stated in your title (the multiple comparison problem), or the one stated in your post (is the number of Alanines in your protein greater than expected)?

And do you have 9 amino acids or 9 proteins? I see you have said both.

-------------

As for some suggestions: I don't think you have a multiple comparison case if you are comparing a single sequence with 1000 sequences.

If you only care about the number of Alanines in your protein, you are comparing a single ratio with the ratio found in the 1000 normal sequences. Let's say your 9-AA sequence has 3 Alanines, so its Alanine ratio is 33.3%. You want to check whether this ratio differs from the one in those 1000 normal sequences, and you can use a chi-square test for that. For example, you have 3 Alanines in your 9-AA sequence and 1430 Alanines in your 9000-AA pooled random AA bank. The chi-square test would then compare 3 Alanines vs. 6 other residues against 1430 Alanines vs. 7570 (= 9000 - 1430) other residues. You won't need to correct for multiple comparisons.
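To make the 2x2 comparison concrete, here is a minimal sketch using the example numbers above (3 of 9 vs. 1430 of 9000). It assumes the plain Pearson chi-square without continuity correction; the function name `chi_square_2x2` is made up for illustration, and only the standard library is used.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail p-value equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    # Smallest expected cell count, to judge how far to trust the approximation.
    min_expected = min(row1, row2) * min(col1, col2) / n
    return chi2, p_value, min_expected

# Rows: your sequence, the pooled bank. Columns: Alanine, other residues.
chi2, p_value, min_expected = chi_square_2x2(3, 6, 1430, 9000 - 1430)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}, smallest expected count = {min_expected:.2f}")
```

Note that the smallest expected count here is well below 5, the usual rule of thumb for the chi-square approximation, so with a sequence this short an exact test (for example Fisher's exact test) may be the safer choice.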

You can also use a chi-square goodness of fit test.
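As a rough sketch of the goodness-of-fit version, the code below treats the Alanine proportion in the pooled bank (1430 / 9000, the example figure from above) as the expected proportion and compares the observed counts in the 9-AA sequence against it. The variable names are illustrative only.

```python
import math

observed = [3, 6]                    # Alanine, other residues in the 9-AA sequence
bank_ala, bank_total = 1430, 9000    # example counts from the pooled random bank

p_ala = bank_ala / bank_total        # expected Alanine proportion
n = sum(observed)
expected = [n * p_ala, n * (1 - p_ala)]

# Goodness-of-fit statistic: sum over categories of (O - E)^2 / E.
chi2_gof = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Two categories give 1 degree of freedom, so the p-value is erfc(sqrt(chi2 / 2)).
p_gof = math.erfc(math.sqrt(chi2_gof / 2))
print(f"chi2 = {chi2_gof:.3f}, p = {p_gof:.3f}")
```

The same caveat about small expected counts applies here, since the expected number of Alanines in 9 positions is below 5.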

Also, instead of generating 1000 random sequences, you can calculate the expected value directly from the probability of Alanine versus the other amino acids. You have 9 positions, each of which can take any of 20 amino acids, and only one of those is Alanine. This gives a finite set of possible sequences (20^9), of which a known number contain 0, 1, 2, ... Alanines. You can count the Alanines across those sequences and take the average as the expected value under randomness.
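The counting argument can be sketched as follows, under the assumption that all 20 amino acids are equally likely at each position (real proteins are not like that, so treat it as the simplest null model). Rather than listing all 20^9 sequences, it counts how many of them contain exactly k Alanines.

```python
from fractions import Fraction
from math import comb

LENGTH = 9            # positions in the sequence
N_AMINO_ACIDS = 20    # assumed equally likely at every position

total = N_AMINO_ACIDS ** LENGTH

# Sequences with exactly k Alanines: choose the k Alanine positions,
# then fill each remaining position with one of the 19 other amino acids.
counts = {k: comb(LENGTH, k) * (N_AMINO_ACIDS - 1) ** (LENGTH - k)
          for k in range(LENGTH + 1)}
assert sum(counts.values()) == total  # every sequence is counted exactly once

# Average number of Alanines over all possible sequences.
expected_ala = Fraction(sum(k * c for k, c in counts.items()), total)

# Fraction of all sequences that contain 3 or more Alanines.
p_at_least_3 = Fraction(sum(c for k, c in counts.items() if k >= 3), total)

print(f"expected Alanines = {float(expected_ala):.2f}")
print(f"P(at least 3 Alanines) = {float(p_at_least_3):.4f}")
```

The average works out to 9 x 1/20, which is just the binomial mean, and the tail fraction gives a direct answer to "is 3 Alanines more than expected?" without simulating anything.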

-----------------------------------

As long as you are concerned with the number of Alanines in your 9-AA sequence, you have actually only one single comparison. Is the chi-square still the best option? Or do I need something to correct for the 1 vs 1000 comparisons?