Hi All,

I was wondering whether you could help me with this. I’m undertaking a short piece of coursework (assessed, pass/fail grading). Descriptive statistics alone would be sufficient, and that is the stage I'm at now, but for my own development I'd like to go a little further.

I’m looking at the impact of different factors on international students’ attainment at university. Forgive the lay language; I don’t have much background in statistics.

EDIT: The research question is "What factors influence the grade attainment of international students?". I also have a 'control' dataset of domestic students, on which I plan to run the same analysis, but the descriptions below are for the international students.

The measurements I have (amongst others, but these are the focus of my analysis) are:

(1) (ordinal, 5 questions, Likert 1-5) pedagogic familiarity

(2) (ordinal, 5 questions, Likert 1-5) language familiarity

(3) (continuous) 'additional' study hours

(4) (continuous, dependent variable) grade difference (pre-university GPA - current GPA)

In descriptive terms, if I look at the relationships between the above, I see that:

1 vs 2: no relationship.

1 (x) vs 3 (y) creates a loose y = log(x) curve.

2 (x) vs 3 (y) creates a linear relationship, y = a*x.

3 (x) vs 4 (y) creates a loose y = log(x) curve.

Looking into different methods and having some very basic familiarity with regression analysis (primarily from looking at results in journals), I thought that multiple regression analysis would be useful (with 4 as dependent and 1-3 as independent).
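To make that plan concrete, here is roughly what I have in mind, as a minimal sketch only. The numbers are invented placeholders (not my data), averaging the five Likert items into one score per scale is an assumption on my part, `fit_ols` is just a helper name I made up, and taking the log of study hours is only one way the 3 vs 4 curve might be handled.

```python
import math

def fit_ols(rows, y):
    """Least-squares coefficients for y ~ intercept + predictors (normal equations).

    Assumes the predictors are not perfectly collinear; otherwise a pivot is zero.
    """
    X = [[1.0] + list(r) for r in rows]            # prepend the intercept column
    k = len(X[0])
    # Augmented system [X'X | X'y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Invented placeholder values, NOT real survey data.
pedagogic  = [2.0, 3.2, 4.0, 1.6, 3.8, 2.4, 4.6, 3.0]   # (1) mean of 5 Likert items
language   = [3.0, 2.2, 4.4, 1.8, 3.4, 4.0, 2.6, 3.6]   # (2) mean of 5 Likert items
hours      = [4.0, 9.0, 15.0, 2.0, 12.0, 6.0, 20.0, 8.0]  # (3) additional study hours
grade_diff = [0.9, 0.5, 0.2, 1.2, 0.3, 0.7, 0.1, 0.6]   # (4) dependent variable

# (4) regressed on (1), (2) and log of (3); log needs hours > 0
rows = [(p, l, math.log(h)) for p, l, h in zip(pedagogic, language, hours)]
coefs = fit_ols(rows, grade_diff)
print("intercept, pedagogic, language, log(hours):", coefs)
```

In practice I would presumably use a statistics package rather than hand-rolled algebra; the point is only to show the model I am describing.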

My questions are:

1) 1 and 2 are not related to each other, but each of them is related to 3. Can I reduce this to just a nonlinear regression analysis of 3 (independent) and 4 (dependent)?

2) If not, is it ‘OK’ to mix ordinal and continuous data (with a warning that the end result of any regression can only be viewed as ‘approximate’)?

3) Should I even be thinking about nonlinear regression? Or is that overkill (should I be looking at a log transformation - I understand the concept but probably not as much as I need to)?

4) Is it useful or totally pointless to use multiple regression to examine the relationship between 1/2 and 3, and then run a separate regression for 3 and 4?

5) Is there any other statistical method I could/should use to explore the relationships here? I'm not dead-set on regression, but from investigating different methods it seemed to be the best fit for me.
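For question 3, this is how I understand the log-transformation idea, again as a rough sketch with invented placeholder numbers rather than my data. `simple_fit` is a hypothetical helper of my own; it fits a straight line and reports R², so the same linear regression can be run once on raw hours and once on log(hours) and the two fits compared.

```python
import math

def simple_fit(x, y):
    """Straight-line least squares: returns (slope, intercept, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot

# Invented placeholder values, NOT real survey data.
hours      = [2.0, 4.0, 6.0, 8.0, 9.0, 12.0, 15.0, 20.0]   # (3)
grade_diff = [1.2, 0.9, 0.7, 0.6, 0.5, 0.3, 0.2, 0.1]      # (4)

raw_fit = simple_fit(hours, grade_diff)
# log() needs strictly positive hours; zero hours would need e.g. log(h + 1)
log_fit = simple_fit([math.log(h) for h in hours], grade_diff)

print("R^2 on raw hours:  ", raw_fit[2])
print("R^2 on log(hours): ", log_fit[2])
```

If the log version fits clearly better, my understanding is that an ordinary linear regression on the transformed variable might be enough, without needing a true nonlinear regression, but I would welcome correction on that.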

Sorry if this all sounds stupid. I’ve recently read a couple of regression textbooks and I’ve been googling, but I can’t quite get my head around published guides and advice in the context of my research, so I would really appreciate any guidance.
