
Thread: Not sure what method to use with this data

  1. #1
    Points: 76, Level: 1
    Level completed: 52%, Points required for next Level: 24

    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Not sure what method to use with this data



    Hi,

    I hope I'm not asking a simple question, but I'm not exactly a statistician, so I was hoping someone can point me in the right direction.

    Let's say I have data on SAT scores, BMI, and 40-yard times of students in Wyoming, New York, and Texas, but the three metrics do not come from the same students. We can assume SAT scores, BMI, and 40-yard times are independent. The attached file shows what the data might look like.

    Obviously BMI, SAT, and 40y are on completely different scales, but if necessary we can assume they are each normally distributed.

    Now, here is where I start to get vague, and I apologize for not having better terms, but I want to figure out how "similar" states are based on these metrics. If all three metrics differ wildly between two states, the states are not similar, and if all three metrics are similarly distributed, then the states are similar. If SAT scores are similar but 40y times are different, the similarity measure should be somewhere in between.

    If someone can point me in the right direction on what kind of analysis I need to use, I would greatly appreciate it.

    Thank you in advance.
    Attached Files

  2. #2
    Points: 107, Level: 2
    Level completed: 14%, Points required for next Level: 43

    Location
    India
    Posts
    1
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Not sure what method to use with this data

    How about trying a chi-square test to establish a relationship there, in case you treat all your Xs and Y as discrete (categorical)?

  3. #3
    Points: 76, Level: 1
    Level completed: 52%, Points required for next Level: 24

    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Not sure what method to use with this data

    Quote Originally Posted by venkat5557 View Post
    How about trying a chi-square test to establish a relationship there, in case you treat all your Xs and Y as discrete (categorical)?
    I'm sorry, I'm not sure what you mean by that...

  4. #4
    Test of Gnomality
    Points: 8,295, Level: 61
    Level completed: 49%, Points required for next Level: 155
    hlsmith's Avatar
    Posts
    1,514
    Thanks
    99
    Thanked 255 Times in 248 Posts

    Re: Not sure what method to use with this data

    Your data is confusing. Are these averages or individual measurements? Why does the number of observations per variable vary (e.g., Texas has only one SAT score [hard not to make a Texas joke], but other states have more)? The previous reply was asking you about using a chi-square test (for comparing two categorical variables). But if we don't know what these data represent, it is hard to make suggestions. If they are means, standard deviations would be helpful in calculating t-tests. If these were means, then you may also be able to look at correlations. More information is needed.

  5. #5
    Points: 76, Level: 1
    Level completed: 52%, Points required for next Level: 24

    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Not sure what method to use with this data

    Quote Originally Posted by hlsmith View Post
    Your data is confusing. Are these averages or individual measurements? Why does the number of observations per variable vary (e.g., Texas has only one SAT score [hard not to make a Texas joke], but other states have more)? The previous reply was asking you about using a chi-square test (for comparing two categorical variables). But if we don't know what these data represent, it is hard to make suggestions. If they are means, standard deviations would be helpful in calculating t-tests. If these were means, then you may also be able to look at correlations. More information is needed.
    They are individual measurements, not means. That explains why the number of observations varies. Suppose the 40y, SAT, and BMI data came from different surveys. Then you would get a varying number of responses for each category (and yes, there is an implied Texas joke).

  6. #6
    Test of Gnomality
    Points: 8,295, Level: 61
    Level completed: 49%, Points required for next Level: 155
    hlsmith's Avatar
    Posts
    1,514
    Thanks
    99
    Thanked 255 Times in 248 Posts

    Re: Not sure what method to use with this data

    There just doesn't seem to be much data here. Is the dataset larger than this? Comparing a single BMI to three from another state does not make for a good comparison. Typically you would compare measures of central tendency while paying attention to their measures of dispersion (e.g., means with standard deviations). I do not have any direct suggestions with this small dataset, and I wonder about the representativeness of a couple of people from a larger state. If my age is 33 and my daughter is 1, can I say that our ages are different?
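
To make the point about central tendency and dispersion concrete, here is a minimal sketch of comparing two groups by their means while accounting for their standard deviations, using Welch's t statistic. The function names `summarize` and `welch_t` are hypothetical helpers for illustration, not something proposed in the thread. Note that a group with a single observation (like the one-BMI state or the 33-versus-1 ages) has no estimable spread, so the sketch refuses to compare it.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(values):
    """Return (mean, sample standard deviation, n) for one group."""
    if len(values) < 2:
        # One observation gives a center but no measure of dispersion.
        raise ValueError("need at least two observations to estimate spread")
    return mean(values), stdev(values), len(values)


def welch_t(a, b):
    """Welch's t statistic: difference in means scaled by its standard error.

    Assumes at least one group has non-zero variance.
    """
    mean_a, sd_a, n_a = summarize(a)
    mean_b, sd_b, n_b = summarize(b)
    standard_error = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error
```

A value near zero means the group means are close relative to their spread; larger absolute values mean the difference is large compared with the noise.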

  7. #7
    Points: 76, Level: 1
    Level completed: 52%, Points required for next Level: 24

    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Not sure what method to use with this data


    Quote Originally Posted by hlsmith View Post
    There just doesn't seem to be much data here. Is the dataset larger than this? Comparing a single BMI to three from another state does not make for a good comparison. Typically you would compare measures of central tendency while paying attention to their measures of dispersion (e.g., means with standard deviations). I do not have any direct suggestions with this small dataset, and I wonder about the representativeness of a couple of people from a larger state. If my age is 33 and my daughter is 1, can I say that our ages are different?
    Yes, this is only a sample of the data. The actual data is much larger (say 1M records total). I know I could do a goodness of fit test within each category to see if they are significantly different, but I'm not sure how to combine the results from each category to come up with one aggregate metric.
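
One possible way to combine per-category comparisons into a single number, offered only as a sketch and not as an answer given in the thread: compute a scale-free distance between the two states for each metric, then average the distances. The example below uses the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical distribution functions), which lies between 0 and 1 regardless of whether the metric is SAT, BMI, or 40y time. With around 1M records, p-values would be tiny for almost any difference, so the statistic itself is used rather than the p-value. The function names, the equal weighting of metrics, and the dictionary layout are all assumptions made for illustration.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: max gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    # The largest gap between two step functions occurs at an observed value.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def state_dissimilarity(state_a, state_b):
    """Average the per-metric KS statistics over the metrics both states share.

    Returns (aggregate, per_metric); 0 means identical samples on every
    metric, 1 means no overlap on any metric. Equal weighting is an assumption.
    """
    metrics = sorted(set(state_a) & set(state_b))
    per_metric = {m: ks_statistic(state_a[m], state_b[m]) for m in metrics}
    return sum(per_metric.values()) / len(per_metric), per_metric


# Made-up values for illustration only; these are not from the attached file.
wyoming = {"SAT": [1100, 1250, 1300], "BMI": [22.1, 27.4], "40y": [5.1, 5.4, 6.0]}
texas = {"SAT": [1200], "BMI": [24.0, 26.5, 31.2], "40y": [4.8, 5.2]}
aggregate, per_metric = state_dissimilarity(wyoming, texas)
```

Because each metric is compared separately, the samples do not need to come from the same students or have the same size. If SAT distributions match but 40y times differ, the aggregate lands between 0 and 1, which matches the "somewhere in between" behavior described in the first post.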
