Confidence Intervals / Chi Square Methodology

I'm working on a project for school and I need some help determining how to best understand the following situation (any help is greatly appreciated):

I have a database of approximately 55,000 instances. Each of the instances has 3 possible outcomes - A, B and C. These aren't exact figures but about 49% of the instances are A, 48% are B and 3% are C.

I have created a number of theories that attempt to explain when you'd be more likely to expect to see B's, for instance. One theory in particular isolates 610 instances where A shows up 63% of the time, B's show up 33% of the time and 4% are C.

I'm not sure how best to determine a) the degree of confidence I should have that the theory has indeed isolated A's at a higher rate than random, and that I'm not just looking at noise, and b) what degree of confidence I can have that this higher rate of A's will continue into the future.

I'm sure these are basic confidence interval questions for many of you but I'm new to this world so any help is appreciated. Thanks.

I made a typo previously. I created theories to explain when you'd be more likely to see B's, for instance, and one theory isolates 610 instances where B (not A) shows up 63% of the time, A shows up 33% of the time and C shows up 4% of the time.

In other words, my theory seems to work but I don't know how to quantify just how well it's working or what confidence I can have that it will continue to work.


TS Contributor
I think it might be helpful to hear a little about how the theory works, or what kind of classification scheme it is.

You may have to worry about specificity and sensitivity of your test.
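
To make sensitivity and specificity concrete, here is a rough sketch that uses only the approximate figures from the question. It assumes (my assumption, not something stated in the thread) that the theory is treated as a rule that predicts B for the 610 isolated instances and not-B for everything else. The counts are reconstructed from rounded percentages, so treat the results as ballpark values only.

```python
# Approximate figures from the question (rounded percentages, not exact counts).
total = 55_000        # instances in the database
base_rate_b = 0.48    # overall share of B
flagged = 610         # instances isolated by the theory
hit_rate_b = 0.63     # share of B among the isolated instances

# Assumption: the theory "predicts B" for flagged instances, "not B" otherwise.
total_b = round(total * base_rate_b)   # all B's in the database
tp = round(flagged * hit_rate_b)       # flagged and actually B
fp = flagged - tp                      # flagged but A or C
fn = total_b - tp                      # B's the theory did not flag
tn = total - flagged - fn              # unflagged and not B

sensitivity = tp / (tp + fn)   # share of all B's that were flagged
specificity = tn / (tn + fp)   # share of non-B's that were left unflagged
precision = tp / (tp + fp)     # share of flagged instances that are B

print(f"sensitivity={sensitivity:.4f}")
print(f"specificity={specificity:.4f}")
print(f"precision={precision:.4f}")
```

Under these assumptions the rule flags only a small fraction of all B's (sensitivity of roughly 1.5%) while almost never flagging a non-B (specificity of roughly 99%), which is why it helps to know how the theory is meant to be used.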

If you only care about whether or not the classification was correct, you may be able to reduce this problem to a chi-square test, with the null hypothesis H0: the classification is correct 50% of the time.
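
As a minimal sketch of what such a test could look like with the numbers in the question: instead of a fixed 0.5, this version compares the 610 isolated instances against the overall base rates (49% A, 48% B, 3% C) with a chi-square goodness-of-fit test, and adds a Wilson confidence interval for the share of B. The counts are my reconstruction from the rounded percentages (nudged so they sum to 610), and the function names are just illustrative. With three categories the test has 2 degrees of freedom, for which the chi-square tail probability is exactly exp(-x/2), so no statistics library is needed.

```python
import math

def chi_square_gof(observed, expected_probs):
    """Pearson chi-square statistic for observed counts vs. expected proportions."""
    n = sum(observed)
    expected = [n * p for p in expected_probs]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi2_sf_df2(stat):
    """Upper-tail p-value of a chi-square distribution with 2 degrees of freedom."""
    return math.exp(-stat / 2)

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives about 95% coverage)."""
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half

# Assumed counts for A, B, C among the 610 instances (about 33%, 63%, 4%).
observed = [201, 384, 25]
baseline = [0.49, 0.48, 0.03]   # overall rates for A, B, C from the question

stat = chi_square_gof(observed, baseline)
p_value = chi2_sf_df2(stat)
lo, hi = wilson_interval(observed[1], sum(observed))

print(f"chi-square = {stat:.1f}, p = {p_value:.2e}")
print(f"95% Wilson interval for share of B: ({lo:.3f}, {hi:.3f})")
```

A tiny p-value here only says the 610 instances differ from the base rates more than chance alone would explain. It does not correct for having tried a number of theories before finding this one, and it says nothing by itself about future data; checking the theory on instances it was not developed on would speak more directly to that.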

Depending on how your test works you may want to construct a ROC curve.