# Thread: Meanings of a set of data

1. ## Re: Meanings of a set of data

If you use sapply, I think you get output closer to what you want.

Code:
```
# one column of summary statistics per variable
sapply(dat, summary)
# then we can transpose it, giving one row per variable
t(sapply(dat, summary))
```
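To make the shape of the result concrete, here is a minimal sketch on a small made-up data frame. The real `dat` is not shown in this thread, so the column names `profit` and `time` and all values are assumptions for illustration only.

```r
# Hypothetical data frame standing in for `dat`; values are made up
dat <- data.frame(
  profit = c(1.2, 0.4, -0.3, 2.1, 0.9),
  time   = c(3.5, 8.0, 8.5, 18.0, 40.0)
)

# One column of summary statistics per variable
by_column <- sapply(dat, summary)

# Transposed: one row per variable, one column per statistic
by_row <- t(by_column)
print(by_row)
```

Note that sapply only simplifies to a matrix when every column yields a summary of the same length, which should hold for numeric columns without missing values.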

2. ## Re: Meanings of a set of data

Dason,
You're a gentleman and a scholar.

3. ## Re: Meanings of a set of data

Hi all,

I have reduced the problem from a table of dozens of columns by thousands of rows to a manageable table of 8 columns by a few dozen rows. Still, it is an intermediate product. I would like to reduce it further, down to a few numbers for the executive report.

Each row, the first row for example, can be read like this:
The experiment Exp1 has a success rate of 87.89% for time period 3.86 only, with a summary of productivity of min: -2.13, 1st quartile: 0.23, median: 0.57, mean: 0.71, 3rd quartile: 1.00, max: 12.26.

I can see this is a problem of probability rather than statistics, so maybe I need to post it in the probability forum. I posted here for continuity, but I will move it if you want me to.

Looking at the result table, I think I can make a decision, but it would be non-scientific, and the client would not hire me for that. Can you guide me in reducing each row to a probability number, so that I can then rank the rows? Thanks in advance!

The data are somewhat modified and the description is vague in order to protect the client's proprietary information.

Code:
```
> result
      success period   Min. 1st Qu. Median Mean 3rd Qu.  Max.
Exp1    87.89   3.86  -2.13    0.23   0.57 0.71    1.00 12.26
Exp31   70.15   8.23  -7.39   -0.17   0.79 0.86    1.75 15.98
Exp41   67.12   8.23  -8.81   -0.40   0.78 0.85    1.90 18.21
Exp12   91.08  18.14  -1.34    0.36   0.71 0.90    1.25  6.89
Exp42   78.81  18.14 -10.29    0.21   1.24 1.32    2.45  9.75
Exp13   95.04  40.32  -0.51    0.43   0.77 1.01    1.42  4.05
Exp23   85.95  40.32  -2.23    0.27   0.99 1.14    1.74  4.89
Exp33   85.95  40.32  -6.09    0.46   1.30 1.34    2.47  5.92
```

4. ## Re: Meanings of a set of data

I would suggest making a new thread. When you do so please be more descriptive about what you actually want to do. What you've asked for doesn't actually make sense at the moment. "Reducing a row to a probability number" is vague nonsense - what is it that you actually want to do?

5. ## Re: Meanings of a set of data

Hi guys, when you want to read data from a txt file like this:

Code:
``> num <- read.table(file, quote="\"")``
how do you define `file` in this code so that the data can be imported? Where do we keep the file? I installed RStudio after seeing it mentioned in this discussion.

6. ## Re: Meanings of a set of data

@viktus
You can define it like this:
Code:
``file <- "C:/path/to/the/file/num.txt"``
In my case, the contents of num.txt are like this:
Code:
```
v1 v2 v3 v4 v5
 1  2  3  4  5
 6  7  8  9 10
11 12 13 14 15
16 17 18 19 20
```
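For anyone who wants to try this without creating a file by hand, here is a small self-contained sketch. It writes the contents above to a temporary file (a stand-in for the real num.txt, so the path is an assumption) and reads it back. It also passes `header = TRUE`, which the call earlier in the thread did not: as far as I know, when the first line has the same number of fields as the data lines, read.table treats it as data unless told otherwise.

```r
# Write the example contents to a temporary file (stand-in for the real num.txt)
file <- tempfile(fileext = ".txt")
writeLines(c(
  "v1 v2 v3 v4 v5",
  "1  2  3  4  5",
  "6  7  8  9 10",
  "11 12 13 14 15",
  "16 17 18 19 20"
), file)

# header = TRUE tells read.table that the first line holds the column names
num <- read.table(file, header = TRUE)
str(num)
```

With a real file, replace the `tempfile()` line with the full path as shown above, or put the file in the working directory (see `getwd()`) and use just the file name.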

7. ## Re: Meanings of a set of data

Now that Ngungo has done basic quality control of the data and confirmed that the data are reasonable and correct, it is time to move on to other procedures: analysis of variance (ANOVA) and regression.

From the other thread it appeared that different components in profit and time were important.

If the data contain variables that might explain the response variables (such as the profit components and time), it can be useful to try linear regression.

Try the command:
Code:
``summary(lm(y ~ x1))``
However, since the data are from an industrial process and not from a designed experiment, you will need to include all the relevant explanatory variables in the model. Had it been a designed experiment, ngungo could have included only the experimental variables; the rest could be ignored because of randomisation.
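As a rough sketch of what a model with more than one explanatory variable looks like, here is an example on simulated data. Everything in it is an assumption for illustration: the variable names `y`, `x1`, `x2`, the data frame `process`, and the coefficients are made up, not taken from ngungo's data.

```r
set.seed(1)  # reproducible simulated data

# Simulated stand-in for the process data; all names and coefficients are made up
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n, sd = 0.3)
process <- data.frame(y, x1, x2)

# Multiple regression with both explanatory variables in the model
fit <- lm(y ~ x1 + x2, data = process)
summary(fit)

# Analysis of variance table for the same fit
anova(fit)
```

With the real data, each relevant explanatory variable would be added to the right-hand side of the formula in the same way.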

Ngungo needs to figure out what explains quality and profit.

It would be nice to hear how ngungo advances.