I'm making 2D and 3D charts for a presentation to illustrate how various television networks share their audiences. My source data records which networks are watched by each of some 80,000 people, and I'm tackling it with multidimensional scaling (MDS) routines.

The MDS routine requires a similarity matrix as input. For that purpose, I'm supplying a table of actual-to-expected ratios: for each pair of networks, the percentage of people who watch both, divided by the percentage we would expect if viewing were independent. BUT IS THAT A GOOD CHOICE TO SERVE AS A SIMILARITY MATRIX?

Example: We have a target audience of 118,649,704 people (projected from the sample, weighted). ABC is watched by 70,253,665 people and CBS is watched by 71,463,480. (See attachment, Network_counts.jpg)

Expressed as percentages of the target audience, ABC is viewed by 59.2% and CBS is viewed by 60.2%. (See attachment, Network_percents.jpg)
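For anyone who wants to check the arithmetic, here is a minimal Python sketch of how the percentages follow from the projected counts. The variable names are just illustrative; only the three counts come from my data.

```python
# Projected (weighted) counts quoted above
total = 118_649_704   # target audience
abc = 70_253_665      # people who watch ABC
cbs = 71_463_480      # people who watch CBS

# Share of the target audience reached by each network, in percent
pct_abc = 100 * abc / total
pct_cbs = 100 * cbs / total

print(f"ABC {pct_abc:.1f}%, CBS {pct_cbs:.1f}%")
```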

Now, if viewership of these networks were independent, we would expect the portion of the total who watch both ABC *and* CBS to be 35.6% (59.2% x 60.2%). In reality the overlap is more substantial, with 45.6% watching both. The actual-to-expected ratio is thus 45.6/35.6, or 1.28. (The completed table of actual-to-expected ratios is attached as Net_exp_act.jpg) Because network audiences can vary widely in size, especially once we look at cable networks, we need a measure that does not depend on audience size, and this ratio serves that purpose, I think.
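To make the calculation concrete, here is a rough sketch in Python with NumPy. The first part reproduces the ABC/CBS figure from the rounded percentages. The function `actual_to_expected` is a hypothetical helper (not part of any MDS package) showing how I understand the full table to be built, assuming the raw data is a 0/1 person-by-network array plus one projection weight per person.

```python
import numpy as np

# ABC/CBS example, using the rounded shares quoted above
p_abc, p_cbs, p_both = 0.592, 0.602, 0.456
expected = p_abc * p_cbs      # share expected to watch both under independence
ratio = p_both / expected     # actual-to-expected ratio, about 1.28

def actual_to_expected(watch, weights):
    """Hypothetical helper: pairwise actual-to-expected ratios.

    watch   : (n_people, n_networks) array of 0/1 viewing flags
    weights : (n_people,) projection weights
    Assumes every network has at least one viewer (otherwise division by zero).
    """
    watch = np.asarray(watch, dtype=float)
    w = np.asarray(weights, dtype=float)
    total = w.sum()
    reach = (watch * w[:, None]).sum(axis=0) / total   # weighted share per network
    both = (watch.T * w) @ watch / total               # weighted share per pair
    return both / np.outer(reach, reach)               # actual / expected

# Toy data: networks 0 and 1 overlap completely; network 2 is independent of network 0
toy = np.array([[1, 1, 1],
                [1, 1, 0],
                [0, 0, 1],
                [0, 0, 0]])
table = actual_to_expected(toy, np.ones(4))
```

In the toy table, a ratio above 1 means more overlap than independence would predict, and a ratio of exactly 1 means the two audiences overlap just as chance would suggest. Note that the diagonal equals 1 divided by the network's own share, so it is not constant across networks.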

So, can this serve as a similarity matrix? If not, what might be better?

Thank you!