Is a Chi Square an appropriate test for this situation?

Hello : ) I've been struggling for a while with this now; any help will be very much appreciated.

I'm working on a project which, in part, looks at whether the same whales are seen together more often than would be expected by random chance. Looking through my textbook, I thought the chi-square test sounded appropriate. (The book describes it as a test for an r x c contingency table.)

I am working in Excel. I've made a table with the whale names as row and column headings, and within the table how many times they meet. So, for example, following the column for Whale A and the row for Whale B shows you the frequency of their meetings. (I hope this makes sense).

I did the chi-square test manually, following my textbook. So I calculated the expected frequencies using (column total x row total) / grand total.

Next I made another copy of the table with the results of (O-E)^2/E (O = actual frequency and E = expected frequency).

Then I totalled up the results of this table, which gave 692.48.

I also calculated the degrees of freedom, as (rows-1)x(columns-1). I have 32 different whales, so I calculated that as 31 x 31 = 961.
This is a very high number and I can't find that many degrees of freedom in any table of critical values, which made me think I've done something wrong or maybe I've used the wrong test.
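For anyone who wants to check the arithmetic outside Excel, the steps described above can be sketched in Python roughly as follows. The 2 x 2 table is made up purely for illustration (it is not the whale data) and the function names are hypothetical. The second function uses the Wilson-Hilferty approximation to estimate a critical value when the degrees of freedom go beyond a printed table, so treat its output as approximate.

```python
from statistics import NormalDist


def chi_square_statistic(observed):
    """Return (chi-square total, degrees of freedom) for an r x c table.

    Assumes every row total and column total is positive, otherwise an
    expected frequency of 0 would cause a division by zero.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected frequency: (row total x column total) / grand total
            e = row_totals[i] * col_totals[j] / grand_total
            chi2 += (o - e) ** 2 / e

    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, dof


def approx_critical_value(dof, alpha=0.05):
    """Wilson-Hilferty approximation to the upper-alpha chi-square quantile."""
    z = NormalDist().inv_cdf(1 - alpha)
    c = 2 / (9 * dof)
    return dof * (1 - c + z * c ** 0.5) ** 3


# Made-up 2 x 2 table, for illustration only
toy_table = [[10, 20], [30, 40]]
toy_chi2, toy_dof = chi_square_statistic(toy_table)

# Approximate 5% critical value for (32 - 1) x (32 - 1) = 961 degrees of freedom
critical_961 = approx_critical_value(961)
print(toy_chi2, toy_dof, critical_961)
```

With 961 degrees of freedom this approximation gives a 5% critical value of roughly 1034. Whether a chi-square test suits a table like this one is a separate question.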

Most of the frequencies are 0 or 1. Some whale pairs met up to 8 times, so I'm trying to show that this is significantly higher than the others (or maybe it isn't...)

So if anyone can help guide me to the right path, thank you. And thank you for reading about my problem! ^ ^;;
Hi axispeace,

In general, a chi-square test tends to get less useful with larger degrees of freedom. If you get a significant result, you don't really know what caused it.

In your case, it's pretty extreme. Your null hypothesis is that all whales meet each other equally often. Your alternative hypothesis is that they don't. As you may see, this is not a very useful construction.

Your course of action will be determined by what you want to know. What is your research question?
Thanks for your help Junes : )

Well. I'm trying to determine whether pairs of whales meet significantly more often than can be explained by random chance. So, I've calculated that the mean number of meetings between them is 0.9, and some pairs have been seen together on up to 8 separate occasions.

I've been researching further and have found a 'half-weight index' used in a similar study with dolphins (it's described in BEJDER, FLETCHER & BRAGER, 1998: A method for testing association patterns of social animals).

Using this formula: \(HWI = \frac{x}{x + y_{ab} + 0.5(y_a + y_b)}\)
(where \(x\) = meetings with dolphin A and dolphin B in the same group, \(y_a\) = meetings with dolphin A and not B in the same group, \(y_b\) = vice versa, and \(y_{ab}\) = meetings of dolphin A and B in different groups at the same time).

That gives a number between 0 and 1, with 0 meaning no meetings between the pair and 1 meaning they were always together. It seems to work for my data, although it gave some unexpected results, which according to that paper can be caused by 'randomness in the data'. This is as far as I've got.
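As a quick check on the formula, here is a minimal Python sketch of the half-weight index as written above. The function name, argument names, and the example counts are hypothetical, and returning 0 for a pair with no sightings at all is my assumption rather than something taken from the paper.

```python
def half_weight_index(x, y_ab, y_a, y_b):
    """Half-weight index: x / (x + y_ab + 0.5 * (y_a + y_b)).

    x    : times A and B were seen in the same group
    y_ab : times A and B were seen at the same time in different groups
    y_a  : times only A was seen
    y_b  : times only B was seen
    """
    denominator = x + y_ab + 0.5 * (y_a + y_b)
    if denominator == 0:
        # Assumption: a pair that was never sighted gets an index of 0
        return 0.0
    return x / denominator


# Made-up counts: together 4 times, A alone twice, B alone twice
example_hwi = half_weight_index(x=4, y_ab=0, y_a=2, y_b=2)
print(round(example_hwi, 3))
```

A pair that is only ever seen together gets an index of 1, and a pair never seen together gets 0.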

I guess what I am trying to find out is quite complicated? I didn't think it would be so tricky when I planned my report!

Again, thank you for your guidance. I'll keep this in mind with future chi-square tests too!
To confirm your data first: are you saying you have a matrix in which both the row and column headings are the whale names and each cell holds the number of encounters? Or is each cell a 1 if there was an encounter and blank or zero if not, like in the dolphin study?
If I understand the first post correctly it's the number of encounters.

I would like to have a look at the paper (not sure whether I will understand it though :). Could you attach it to your message? (You can use the advanced post option). I don't have access to most journals :( And maybe your data too.

I'm not familiar with this kind of problem, but it's interesting. Another option would be to use some kind of Monte Carlo simulation.
Also, how does one measure this kind of data? I find it a little hard to visualize; maybe you can provide some context?

Do you just want to show that they socialize or do you also want to identify specific pairs of whales (over and above the "normal" socializing)?
One way might be to use combinations to determine the probability that a more extreme result would be found. You have 435 unique pairs of whales (30*29/2); you need to divide by two because you don't distinguish between A meeting B or vice versa.

Now, given your null hypothesis that meetings are random, the probability that a certain pair of whales meets exactly 8 times is:

\(P(8) = {{435}\choose{8}} {(\frac{1}{435})}^8 \cdot {(\frac{434}{435})}^{427}\)

You need to multiply that by 435 to get (approximately) the probability of any pair of whales meeting 8 times.

That gives:

\(435 \cdot P(8) = 435 \cdot {{435}\choose{8}} {(\frac{1}{435})}^8 \cdot {(\frac{434}{435})}^{427} \approx 0.0038\)

P(X>=8) is not much greater.
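The arithmetic can be reproduced with a few lines of Python. This is only a sketch of the calculation as set up above (435 pairs, 435 trials, success probability 1/435); the variable names are hypothetical, and multiplying by the number of pairs is an approximation to the chance that at least one pair reaches the count.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n_pairs = 435  # number of pairs used in the calculation above

# Probability that one particular pair meets exactly 8 times
p_single = binomial_pmf(8, n_pairs, 1 / n_pairs)

# Scaled by the number of pairs (approximation for "any pair")
p_any_exactly_8 = n_pairs * p_single

# Same idea for 8 or more meetings, i.e. P(X >= 8)
p_any_at_least_8 = n_pairs * sum(
    binomial_pmf(k, n_pairs, 1 / n_pairs) for k in range(8, n_pairs + 1)
)

print(round(p_any_exactly_8, 4))
print(round(p_any_at_least_8, 4))
```

The first printed value should agree with the 0.0038 above, and the second is only slightly larger.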

So, it does seem that the pairing is not random (if I did not make a mistake). Just out of curiosity, doesn't this seem a bit obvious? Obviously I'm not a marine zoologist, but aren't all mammals social creatures?
Hello, thanks for all your replies! : )

I'll try and give some more context here: I've been given a lot of photographs of whales and identified individuals from these. The photographs also contain the date they were taken on. So from that, I've worked out which individuals were seen in this area on each day, and assumed that if photos of two individuals were taken on the same day then those individuals met. (That might be incorrect, but with just photos to go on it's hard to be more exact.)

I posted the article and the table I started with, with the whale ID (it's the names I gave them, not very scientific I'm afraid! : P ). I don't mean for anyone to do the work for me, but maybe it will help.

Junes - yes, you're right with that, it doesn't seem to be random. Actually these are minke whales, which are usually solitary but can sometimes travel in groups. It might be because this is an area where they feed, so the same individuals keep returning but don't necessarily stay together after they leave. I'm still looking into their behaviour to try and work out what's going on here.

EDIT~ Ah, I also remembered that the paper mentioned using a Monte Carlo simulation, but I'm not sure what that is. Is it to generate random results for the data?

Again, thank you for your help! : )
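
On the Monte Carlo question: broadly yes, the idea is to generate many random versions of the data under the "no preference" assumption and see how often they look as extreme as the real data. Below is a minimal Python sketch, assuming the sightings are stored as a set of days for each whale. The sightings are invented for illustration, the function names are hypothetical, and the null model (each whale keeps its number of sighting days, but the days are drawn at random) is my assumption and not necessarily the permutation scheme used in the paper.

```python
import random
from itertools import combinations


def max_pair_count(days_by_whale):
    """Largest number of shared sighting days over all pairs of whales."""
    best = 0
    for a, b in combinations(sorted(days_by_whale), 2):
        shared = len(days_by_whale[a] & days_by_whale[b])
        best = max(best, shared)
    return best


def monte_carlo_p_value(days_by_whale, all_days, n_sim=2000, seed=0):
    """Estimate how often random data give a pair count >= the observed one."""
    rng = random.Random(seed)
    observed = max_pair_count(days_by_whale)
    hits = 0
    for _ in range(n_sim):
        # Null model (assumption): each whale keeps its number of sighting
        # days, but those days are picked at random from the survey days.
        shuffled = {
            whale: set(rng.sample(all_days, len(days)))
            for whale, days in days_by_whale.items()
        }
        if max_pair_count(shuffled) >= observed:
            hits += 1
    # Add 1 to both counts so the estimate is never exactly zero
    return observed, (hits + 1) / (n_sim + 1)


# Invented sightings over 40 survey days (not the real whale data)
survey_days = list(range(1, 41))
toy_sightings = {
    "A": {1, 2, 3, 4, 5, 6, 7, 8},
    "B": {1, 2, 3, 4, 5, 6, 7, 8},
    "C": {3, 11, 19, 27},
    "D": {5, 14, 22},
    "E": {9, 30},
}

observed_max, p_value = monte_carlo_p_value(toy_sightings, survey_days)
print(observed_max, p_value)
```

A small p-value here would suggest that the most frequent pair turns up together more often than this particular random model produces. A different null model could give a different answer.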