# Thread: Searching for data reduc

1. ## Searching for data reduc

Hello,

I am so confused and so depressed; I hope someone can help.
In my database I have plenty of data on about 1000 machines.
It is too hard to recognize anything in a diagram, so I need to reduce the machine data.
Every machine has more than 5 variables (costs, energy, etc.).
None of the variables has a fixed maximum or minimum.
I want to reduce this data and find similar machines, as well as machines that are totally different from the others.

Now my problem:
I have been searching for two whole weeks(!) and can't find a mathematical method that can help me.
Maybe I did find some, but my mathematical background was too weak for me to recognise them as such.
Many times I found something, but after studying it I thought, "No, I can't use this method on my data set."
Then I am back at the beginning again, which is very frustrating.

For example, I found these:
- Principal component analysis (PCA)
- Factor analysis
- Clustering

I have read many papers, but I still do not know whether I can use them in my case.
Factor analysis especially seems not to work in my case, yet I read everywhere that it is almost the same as PCA...

Please, can someone who knows about these things name some methods/techniques that could help with this kind of data and my goal?
The more the better, because I want to try different methods to get the best results.

I would be deeply grateful!
Shari

2. ## Re: Searching for data reduc

So you have data for 1,000 machines, but how many observations per machine? And you want to see which machines are similar based on what (a single variable or a cluster)?

3. ## Re: Searching for data reduc

Hi hlsmith,

Each machine has at least 5 values (I'll add 2 or 3 more soon): costs, energy, width, length, and height.
I want to see which machines are similar/different based on all 5 values together (or at least 2 or 3 of them, if 5 is not possible).

4. ## Re: Searching for data reduc

No one has any idea?

5. ## Re: Searching for data reduc

Initially I would try hierarchical clustering and build a dendrogram. I think that is the easier way to go.
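A minimal sketch of the clustering idea, assuming Python and a handful of made-up machines (the names `A` to `F` and all numbers are hypothetical, not from the thread). The variables are first rescaled to z-scores so that costs do not swamp width or height, then the closest groups are merged step by step with average linkage. The printed merge history is the information a dendrogram draws: machines that merge early are similar, and a machine that only joins at the very end is the odd one out.

```python
import math
from statistics import mean, pstdev

# Hypothetical machines: (costs, energy, width, length, height). Values are made up.
machines = {
    "A": (100.0, 5.0, 1.0, 2.0, 1.5),
    "B": (110.0, 5.5, 1.1, 2.1, 1.4),
    "C": (105.0, 5.2, 1.0, 2.0, 1.6),
    "D": (500.0, 20.0, 3.0, 6.0, 2.5),
    "E": (520.0, 21.0, 3.2, 6.1, 2.6),
    "F": (2000.0, 90.0, 1.0, 15.0, 9.0),
}


def standardize(rows):
    """Rescale every column to mean 0 and standard deviation 1 (z-scores)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [tuple((x - m) / s for x, m, s in zip(r, mus, sds)) for r in rows]


def agglomerate(names, points):
    """Average-linkage agglomerative clustering; returns the merge history."""
    clusters = {i: [i] for i in range(len(points))}  # cluster id -> member indices
    merges = []
    next_id = len(points)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                # Average Euclidean distance over all cross-cluster pairs
                d = mean(math.dist(points[p], points[q])
                         for p in clusters[a] for q in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(names[i] for i in clusters[a]),
                       sorted(names[i] for i in clusters[b]), d))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


names = list(machines)
merges = agglomerate(names, standardize(list(machines.values())))
for left, right, dist in merges:
    print(f"{left} + {right} at distance {dist:.2f}")
```

For 1000 machines this brute-force loop would be slow; a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` together with `dendrogram` is probably the more practical choice, if SciPy is available.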

As a second option, you can use PCA and look at the score plots (groups of nearby points act as a visual form of clustering), or construct a biplot after PCA. You might want to google how to do that, because it is not the easiest thing to do.
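A rough sketch of the PCA route, again assuming Python and the same kind of made-up machine data (all names and numbers are hypothetical). It standardizes the five variables, builds their correlation matrix, and extracts the first two principal components with plain power iteration plus deflation, which is one simple way to get the leading eigenvectors rather than the method any particular package uses. Each machine then gets two scores that can be drawn as a 2-D scatter plot; the loadings in `pairs` are what a biplot would overlay as arrows.

```python
import math
from statistics import mean, pstdev

# Hypothetical machines: (costs, energy, width, length, height). Values are made up.
machines = {
    "A": (100.0, 5.0, 1.0, 2.0, 1.5),
    "B": (110.0, 5.5, 1.1, 2.1, 1.4),
    "C": (105.0, 5.2, 1.0, 2.0, 1.6),
    "D": (500.0, 20.0, 3.0, 6.0, 2.5),
    "E": (520.0, 21.0, 3.2, 6.1, 2.6),
    "F": (2000.0, 90.0, 1.0, 15.0, 9.0),
}


def standardize(rows):
    """Rescale every column to mean 0 and standard deviation 1 (z-scores)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [tuple((x - m) / s for x, m, s in zip(r, mus, sds)) for r in rows]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def top_components(sym, k, iters=500):
    """Leading k eigenpairs of a symmetric matrix: power iteration with deflation."""
    m = [row[:] for row in sym]
    size = len(m)
    pairs = []
    for _ in range(k):
        v = [1.0] * size
        for _ in range(iters):
            w = [dot(row, v) for row in m]
            norm = math.sqrt(dot(w, w))
            v = [x / norm for x in w]
        lam = dot(v, [dot(row, v) for row in m])  # Rayleigh quotient = eigenvalue
        pairs.append((lam, v))
        # Deflate: subtract the component just found so the next one can emerge
        m = [[m[i][j] - lam * v[i] * v[j] for j in range(size)] for i in range(size)]
    return pairs


names = list(machines)
z = standardize(list(machines.values()))
n, p = len(z), len(z[0])

# Correlation matrix of the standardized variables
corr = [[sum(r[i] * r[j] for r in z) / n for j in range(p)] for i in range(p)]

pairs = top_components(corr, k=2)
explained = [lam / p for lam, _ in pairs]  # share of total variance per component
scores = {name: tuple(dot(row, v) for _, v in pairs) for name, row in zip(names, z)}

print("explained variance share:", [round(e, 3) for e in explained])
for name, (pc1, pc2) in scores.items():
    print(f"{name}: PC1={pc1:.2f}  PC2={pc2:.2f}")
```

If the first two components explain most of the variance, the 2-D plot of the scores is a fair picture of all five variables at once; if they do not, the plot can hide real differences between machines.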

See if that will work.

6. ## The Following User Says Thank You to szm For This Useful Post:

Shari (11-24-2013)
