Multiple Linear Regression Problem

I want to run a multiple regression analysis.
I have one Y, which is numeric (sample size = 85), and the following Xs.

Sr No  Variable    Data type                         Data example
1      Region      4 factor levels                   Region ABC
2      Class       3 factor levels                   Class 1
3      Start time  9 factor levels (9 start times)   9am
4      End time    5 factor levels (5 end times)     5pm
5      Age         Numeric                           15 years
6      X1          4 factor levels
7      X2          3 factor levels
8      X3          8 factor levels
9      X4          6 factor levels
10     X5          2 factor levels
11     XImp        Numeric                           87
12     X6          9 factor levels

My goal is to find out how much XImp (one particularly important X) influences Y in the presence of the other 11 Xs.
I think I can use multiple linear regression for this, to see how much each X influences Y and what share of that influence belongs to XImp when the other Xs are in the model.
I am a bit confused about two things:
1. How to prepare the data for analysis in R. Do we need to create dummy variables by assigning 0 and 1 to the non-numeric Xs, as we have to in MS Excel? Or does R automatically treat them as dummy variables if the values are non-numeric text?
2. I have read about multiple regression analysis, step-wise regression (forward, backward) etc., and I am not sure which to use. What would be the best method for this analysis, and could you provide a link to a good 'how to' resource?

Thank you,

TS Contributor
I would look at Crawley's The R Book. Its chapter on regression answers both of your questions very nicely.
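
To give a rough idea of what this looks like in practice, here is a minimal sketch. The data below are simulated because the real data set is not shown in the post, so the level labels, the effect sizes and the reduced set of predictors are all assumptions made for illustration. The point is that once a text column is a factor, lm() builds the 0/1 dummy columns itself, and that one way (not the only one) to gauge the contribution of XImp is to compare the model with and without it.

```r
# Simulated stand-in data: level labels and effect sizes are made up,
# and only four of the twelve predictors are included to keep it short.
set.seed(1)
n <- 85
dat <- data.frame(
  Region = sample(c("ABC", "DEF", "GHI", "JKL"), n, replace = TRUE),
  Class  = sample(c("Class 1", "Class 2", "Class 3"), n, replace = TRUE),
  Age    = round(runif(n, 10, 60)),
  XImp   = round(runif(n, 50, 100))
)
dat$Y <- 5 + 0.3 * dat$XImp + 0.1 * dat$Age + rnorm(n)

# Text columns should be factors; no manual 0/1 coding is needed.
dat$Region <- factor(dat$Region)
dat$Class  <- factor(dat$Class)

# Look at the dummy columns R generates for the model
# (the first level of each factor is the baseline and gets no column).
head(model.matrix(Y ~ Region + Class + Age + XImp, data = dat))

# Full model, and the same model with XImp dropped.
full    <- lm(Y ~ Region + Class + Age + XImp, data = dat)
reduced <- update(full, . ~ . - XImp)

summary(full)         # coefficient of XImp, adjusted for the other predictors
anova(reduced, full)  # partial F-test: does adding XImp improve the fit?

# Extra share of the variance in Y explained once XImp is added.
delta_r2 <- summary(full)$r.squared - summary(reduced)$r.squared
delta_r2
```

One thing to keep in mind with the real data: counting one coefficient per numeric X and (levels - 1) per factor, the twelve predictors in the post need 45 coefficients plus an intercept, which is a lot to estimate from 85 observations, so it may be worth thinking about which factors can be merged or dropped before fitting everything at once.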