independence from a distribution

dart

New Member
#1
Hello,
I have a distribution of observations according to 3 classes
classes 1, 2, 3
observation 119, 858, 314
these classes are related to the uncontrollable observation time,
respectively (in h) 1032, 5981, 1634.
Which test can I use to show whether or not there is a dependency,
either globally or pairwise (119 versus 1032 h; 314 versus 1634 h; ...)?
thank you for your attention to my question
 

dart

New Member
#3
Hello staassis, hello everyone,
Sorry for not being clear, English is not my language.
I will try to reformulate the question.

I have a distribution of observations (counts of an event) according to 3 modes of a variable, mode: 1, 2, 3.
The related observations (counting) are: 119, 858, 314.
But these modalities are linked to the uncontrollable observation time,
respectively (in hours): 1032, 5981, 1634.
which test can be used to show a distribution of observations depending on the modalities or not, despite the obvious effect of observation time?
Thanks for the time spent.
 

staassis

Active Member
#4
None. Mate, you have only 3 observations in your data set. Statistical science is not a joke. Rather than writing on this forum, you are better off spending time on getting more data.
 

dart

New Member
#5
Thank you Staasis for your quick reply,
which was very direct, maybe too direct. Perhaps you too would do well to take the time to read what is written before answering (the number of observation hours is almost one year, and I have more than 10 years of this type of observation). The number of classes is three because this factor (tides) is classified a priori into low, high, or in-between values: three modes is common.
And I'm not your mate. Your answer is not very polite - I don't know what your moderators think about it.
It is my turn to give advice: better not to answer at all than to give this kind of answer.
 

staassis

Active Member
#6
And again, my dear stranger, you have only 3 data points containing 3 variables (Class, Count, Observation Time). The duration of an observation period matters only if you record the aggregate count regularly and repeatedly. I have read your messages carefully. Over the years, I have seen many such messages, written by people who think that statistics is a joke and they can run regression on 2 data points. Best regards.
 

Dason

Ambassador to the humans
#7
I think it would be possible to test the hypothesis but it would require a pretty strong assumption. If we assume the observations come from a Poisson process we could test if the three different modes have different rate parameters.
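A minimal sketch of that idea, assuming the events within each mode come from a Poisson process with a constant rate and that the observation hours act as exposure. It uses the counts and hours from post #3; the function name `poisson_rate_lrt` is made up for illustration. Under the null hypothesis of one common rate, the likelihood-ratio statistic is approximately chi-square distributed with 2 degrees of freedom (number of modes minus one), and for exactly 2 degrees of freedom the upper tail probability has the closed form exp(-x/2), so no statistics library is needed.

```python
import math

# Data from post #3: event counts and observation hours for modes 1, 2, 3.
counts = [119, 858, 314]
hours = [1032, 5981, 1634]


def poisson_rate_lrt(counts, hours):
    """Likelihood-ratio test of H0: all three modes share one Poisson rate.

    Assumption (not from the thread): events are independent and the rate
    is constant within each mode, so hours are a pure exposure term.
    """
    if len(counts) != 3 or len(hours) != 3:
        raise ValueError("the closed-form p-value below is only valid for 3 classes")
    total_count = sum(counts)
    total_hours = sum(hours)
    # Expected counts under H0: the total split in proportion to exposure.
    expected = [total_count * h / total_hours for h in hours]
    # G statistic; a class with zero events contributes nothing to the sum.
    g = 2.0 * sum(o * math.log(o / e) for o, e in zip(counts, expected) if o > 0)
    # Chi-square upper tail with 2 degrees of freedom is exactly exp(-x / 2).
    p_value = math.exp(-g / 2.0)
    return g, p_value


# Estimated events per hour in each mode.
rates = [c / h for c, h in zip(counts, hours)]
g_stat, p_value = poisson_rate_lrt(counts, hours)

print("events per hour:", [round(r, 4) for r in rates])
print("G =", round(g_stat, 2), " p =", p_value)
```

A small p-value here only says the rates differ if the Poisson assumption holds; clustered or otherwise dependent events would make the test too optimistic.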
 
#8
Friends and participants!

Let us try to not be too polemic.
Let us try to be more friendly as @TheEcologist said. (Although I am one of those who have made mistakes about this.)

When I read the first post I did not understand what it was about. When I read the second post I thought that it could be about many events. Maybe Poisson distributed? But are they statistically independent? And what are they about? It is not meaningful to make a statistical model without taking into account the biological, social or chemical circumstances.

Let us remember that millions of 0/1 events (Bernoulli events) can be summarized in just two binomial numbers (n, y). And many events in just one Poisson variable. (But is the distribution such that it can be summarized in a sufficient statistic?)

There is no "holy number" for what n should be. Let us remember that in daily life we are satisfied with a sample size of one (n=1). You go to the "doctor" and they take just one blood sample. You check your eyes just once for new glasses.

On the other hand you might need thousands of observations to discover a rare side effect of a new medicine.

If the original poster returns and explains more about the data, then maybe someone might be interested in saying something about it.
 

Karabiner

TS Contributor
#9
Or, is this perhaps a matter for a one-sample chi2 test? Using simpler figures as an example:

Say, in classes a, b, and c we have counted 100, 200, and 300 events, respectively, which makes
a total of 600 events. But observation time for a was 10 hrs, for b it was 200 hrs, for c it was
also 200 hrs. Now, if the frequency of events was independent of class, then the 600 events
would be expected to be distributed according to 10/410, 200/410, and 200/410. We can
calculate the expected frequencies and compare them with the empirical frequencies using chi2.
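As a rough illustration of the calculation just described, here is a small sketch using the example figures above. The function name `chi2_gof_by_exposure` is invented for this sketch, and the p-value uses the fact that a chi-square variable with 2 degrees of freedom (three classes minus one) has upper tail probability exp(-x/2), so it would not carry over to a different number of classes.

```python
import math

# Example figures from the post: events and observation hours for classes a, b, c.
observed = [100, 200, 300]
hours = [10, 200, 200]


def chi2_gof_by_exposure(observed, hours):
    """Chi-square goodness of fit against counts expected pro rata to hours."""
    total_events = sum(observed)
    total_hours = sum(hours)
    # If the event frequency does not depend on class, each class should
    # receive a share of the events equal to its share of the hours.
    expected = [total_events * h / total_hours for h in hours]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, chi2


expected, chi2 = chi2_gof_by_exposure(observed, hours)
df = len(observed) - 1
# Exact upper tail of the chi-square distribution, valid for df == 2 only.
p_value = math.exp(-chi2 / 2.0)

print("expected:", [round(e, 2) for e in expected])  # [14.63, 292.68, 292.68]
print("chi2 =", round(chi2, 2), "df =", df)          # chi2 = 527.5 df = 2
print("p =", p_value)
```

The statistic is very large mainly because class a produced 100 events in 10 hours where about 15 would be expected pro rata.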

With kind regards

Karabiner
 
#10
Thank you Staasis for taking the time to explain again; I assure you, the form changes a lot. Thanks also to the others for taking the time.
The last reply is almost how I understand my problem, but I cannot find an adequate statistical test which excludes or controls for the pro rata time factor.

For Staasis

I am well aware that there are only 3 classes, but two percentages (that is, two modalities) are routinely compared with robust methods. This distribution over 3 modalities could be reduced to 3 tests, each comparing one modality, expressed as a percentage, against its complement (% of 1 versus % non-1; % of 2 versus % non-2, ...). So having only three modalities is not necessarily disqualifying, is it?

I could increase the number of modalities (for example, from small/medium/large to small/medium/medium-high/high), but this adds little: what is interesting is the effect of the extreme modalities, where the counts are already the lowest.
(I deliberately disguised a quantitative variable as a categorical variable by grouping values together, hence the ordering of the 3 modalities small/medium/large.)

I am also aware that the time factor (hence my initial title, prorata temporis) is a problem, but I do not control it: the modalities occur unevenly over the year, and I can only measure their effect on the fish stock.

Here is another lead; can you tell me what you think? If I artificially free myself from the time factor, for example by levelling it, drawing the same number of observation hours within each modality, would there be an appropriate statistical test on these three values (sums of fish), such as chi2 or something else?
Does that make sense?
I lose information, but I remove a variable, duration, which confounds my results:
a single factor tested, with 3 (possibly 4) modalities, the same collection effort (duration) for each, high numbers of fish per modality (well over 5), and possibly 2 series of values (2 random draws are potentially possible in each modality)? Or is this again naive?

thank you in advance,
 