Testing for similarity between countries

For my thesis, I am looking at the effect of environmental controversies on the profitability of Chinese and European firms. As part of my exploratory analysis, I have to check whether the European countries are similar enough to be treated as one group.

I collected data on economic development and cultural dimensions. However, I am unsure which test I can use to check for similarity between the countries. The data looks as follows:

The end goal would be to have a test confirming that the European countries are similar enough to consider as 1 group, and to have a test confirming that the European group is significantly different from China.

If anyone has any idea on which tests I can use for this, please let me know! All help is very much appreciated.


TS Contributor
"the goal would be to have a test confirming that the European countries are similar enough to consider as 1 group"
That is a matter of your informed judgement, since there is no test which can tell
"similar enough for hannahb97's purpose yes/no."

In 9 or 10 of the variables, the differences within Europe appear small, compared with the
huge gap between every European country and China, but some within-Europe differences
might nevertheless be important (Austria's power distance looks a bit surprising).

Maybe you could start by creating a box-and-whisker plot for each variable.
This could show you whether China is the only extreme/outlier and whether
the European countries appear as one group.
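
If you want a numeric counterpart to the plot, the whisker rule behind a box plot (Tukey fences at 1.5 times the interquartile range) can be applied per variable. This is only a rough sketch: the country labels and scores are invented placeholders, not your data, and `tukey_outliers` is a hypothetical helper name. With this few countries the quartiles are unstable, so treat the result as a hint rather than a test.

```python
from statistics import quantiles

# Placeholder scores for ONE variable (e.g. a cultural dimension).
# These numbers are invented for illustration; they are not the thread's data.
scores = {
    "EU_1": 35, "EU_2": 38, "EU_3": 40, "EU_4": 33, "EU_5": 36,
    "China": 80,
}

def tukey_outliers(values_by_country, k=1.5):
    """Return the countries that fall outside the box-plot whiskers."""
    q1, _, q3 = quantiles(list(values_by_country.values()), n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return sorted(c for c, v in values_by_country.items() if v < low or v > high)

outliers = tukey_outliers(scores)
print(outliers)
```

Running this once per variable gives a quick list of which countries sit outside the whiskers on which variables.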

A multivariate metric could be the Euclidean distance between all countries in the sample
(calculated after z-standardization of the variables).
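
A minimal sketch of that calculation, assuming one row per country and one column per variable. The numbers are invented placeholders, not your data, and the helper names are made up for this example. It standardizes each column and then compares the largest within-Europe distance with the smallest Europe-China distance.

```python
from math import dist
from statistics import mean, stdev

# Placeholder data: rows are countries, columns are variables.
# Invented for illustration; not the thread's data.
data = {
    "EU_1":  [35.0, 42.0, 60.0],
    "EU_2":  [38.0, 45.0, 65.0],
    "EU_3":  [40.0, 40.0, 62.0],
    "EU_4":  [33.0, 44.0, 58.0],
    "China": [80.0, 10.0, 20.0],
}

def z_standardize(rows):
    """Rescale every column to mean 0 and sample standard deviation 1."""
    columns = list(zip(*rows.values()))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]
    return {
        name: [(x - m) / s for x, m, s in zip(values, means, sds)]
        for name, values in rows.items()
    }

def distance_matrix(rows):
    """Euclidean distance between every pair of countries."""
    return {a: {b: dist(rows[a], rows[b]) for b in rows} for a in rows}

z = z_standardize(data)
d = distance_matrix(z)

europe = [c for c in data if c != "China"]
max_within_europe = max(d[a][b] for a in europe for b in europe)
min_europe_to_china = min(d["China"][c] for c in europe)
print(f"largest distance within Europe: {max_within_europe:.2f}")
print(f"smallest distance Europe-China: {min_europe_to_china:.2f}")
```

Whether the gap between those two numbers is large enough remains a judgement call; the distances only describe the data.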

With kind regards

Hello Hannah,
I would take the simplest approach that is still robust and sufficient. It sounds to me as though clustering (see comments below) would be an excellent choice, as staassis mentioned. Following it with an ANOVA would be a sound strategy.
You may want to consider hierarchical clustering, which should work well with averages or proportions (percentages). This method groups similar cases into clusters using, as Karabiner wrote, Euclidean distances.
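
To show what hierarchical clustering does under the hood, here is a small pure-Python sketch of agglomerative clustering with average linkage on Euclidean distances. The profiles are invented placeholders that are assumed to be z-standardized already, and the function names are made up for this example. In practice you would more likely use the clustering routines in R or SPSS.

```python
from itertools import combinations
from math import dist

# Placeholder profiles, assumed to be z-standardized already.
# Invented for illustration; not the thread's data.
profiles = {
    "EU_1":  (-0.5,  0.4),
    "EU_2":  (-0.4,  0.6),
    "EU_3":  (-0.3,  0.3),
    "EU_4":  (-0.6,  0.5),
    "China": ( 1.8, -1.8),
}

def average_linkage(a, b, points):
    """Mean Euclidean distance over all cross-cluster pairs."""
    total = sum(dist(points[x], points[y]) for x in a for y in b)
    return total / (len(a) * len(b))

def agglomerate(points):
    """Merge the two closest clusters until one is left; return the merges."""
    clusters = [(name,) for name in points]
    history = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], points),
        )
        height = average_linkage(a, b, points)
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        history.append((a, b, height))
    return history

history = agglomerate(profiles)
for a, b, height in history:
    print(f"{height:.2f}  {a} + {b}")
```

If the European countries merge at low heights and China joins only in the last step at a much larger height, that supports treating Europe as one group, at least descriptively.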
Another option I would recommend is to consider whether discriminant analysis fits the purpose of your analysis, in which case the Iris data set is the classic and elegant example. SPSS does a nice job at both. I personally prefer and use R, but I had to practise and practise before I felt confident using it. Whatever method you choose, look for literature that explains these topics without making them overly complicated. Clustering and discriminant analysis are sometimes described as rocket science, and there is no need for that.
All the best,