What is the effect of categorising categorical data as numerical data?

noetsi

Fortran must die
#2
I am not sure exactly what you mean. How you code the data, for example replacing text labels with numbers, does not change the basic nature of the data at all. If you have male and female and you code them as 1 and 0, the variable is still nominal; it does not become interval data.
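To make that concrete, here is a minimal Python sketch (the `colours` sample and both codings are made up for illustration, not taken from anyone's data). It codes the same nominal variable two equally arbitrary ways: the mean of the codes moves with the coding, while the category counts do not.

```python
from collections import Counter
from statistics import mean

# Hypothetical sample of a nominal variable with three unordered categories.
colours = ["red", "green", "blue", "green", "red", "green", "green"]

# Two equally arbitrary numeric codings of the same labels (assumptions).
coding_a = {"red": 1, "green": 2, "blue": 3}
coding_b = {"red": 3, "green": 1, "blue": 2}

codes_a = [coding_a[c] for c in colours]
codes_b = [coding_b[c] for c in colours]

# Interval-style summary: the value depends on which coding was picked.
mean_a = mean(codes_a)
mean_b = mean(codes_b)

# Nominal-style summary: decode back to labels and count categories.
decode_a = {code: label for label, code in coding_a.items()}
decode_b = {code: label for label, code in coding_b.items()}
counts_a = Counter(decode_a[x] for x in codes_a)
counts_b = Counter(decode_b[x] for x in codes_b)

print("mean under coding A:", mean_a)
print("mean under coding B:", mean_b)
print("same category counts:", counts_a == counts_b)
```

The numbers are only labels here, so any statistic that relies on their spacing or order is an artefact of the coding.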

PCA uses, I believe, Pearson correlations by default, which assume interval data. If you use nominal or ordinal data, the correlations generated will be incorrect. You should use polychoric correlations instead (tetrachoric correlations are the special case for binary variables).
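A rough illustration of why Pearson goes wrong on nominal data, again in plain Python with invented numbers (`region`, `income` and the two codings are hypothetical). The same data give a positive correlation under one coding and a negative one under another, so the coefficient reflects the coding rather than the data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: an unordered category and a numeric variable.
region = ["north", "south", "east", "south", "north", "east", "east", "south"]
income = [30, 45, 60, 50, 35, 65, 55, 40]

# Two arbitrary numeric codings of the same categories (assumptions).
coding_a = {"north": 1, "south": 2, "east": 3}
coding_b = {"north": 2, "south": 3, "east": 1}

r_a = pearson([coding_a[r] for r in region], income)
r_b = pearson([coding_b[r] for r in region], income)

print("r under coding A:", round(r_a, 3))
print("r under coding B:", round(r_b, 3))
```

Nothing about the respondents changed between the two runs; only the labels did.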
 

noetsi

Fortran must die
#4
R is the software; PCA is the method you run in R. What I meant is that the method you are using, PCA, uses Pearson's r to create a correlation matrix, and that matrix will be incorrect if the underlying data are not interval.
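As a sketch of how a bad correlation matrix carries through to the PCA itself, take the two-variable case, where the correlation matrix is [[1, r], [r, 1]] and its eigenvalues are 1 + |r| and 1 - |r|. This is plain Python rather than R, the data and codings are invented, and `first_component_share` is a name made up for this example, not any library's API.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def first_component_share(x, y):
    """Share of variance on the first principal component for two variables.

    The 2x2 correlation matrix [[1, r], [r, 1]] has eigenvalues
    1 + |r| and 1 - |r|, which sum to 2.
    """
    r = pearson(x, y)
    return (1 + abs(r)) / 2


# Hypothetical data: an unordered category and a numeric variable.
region = ["north", "south", "east", "south", "north", "east", "east", "south"]
income = [30, 45, 60, 50, 35, 65, 55, 40]

# Two arbitrary numeric codings of the same categories (assumptions).
coding_a = {"north": 1, "south": 2, "east": 3}
coding_b = {"north": 2, "south": 3, "east": 1}

share_a = first_component_share([coding_a[r] for r in region], income)
share_b = first_component_share([coding_b[r] for r in region], income)

print("first-component share, coding A:", round(share_a, 3))
print("first-component share, coding B:", round(share_b, 3))
```

The "variance explained" changes with the coding alone, which is the sense in which the PCA result cannot be trusted when the inputs are not interval.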

I don't work much with R, but I doubt that R, even if it knows your data are not interval, will have its PCA module automatically switch from Pearson to polychoric correlations. It's not an AI yet :p