Hi - I am trying to find a good tool that will help me generate a probability of a sale for a list of 300,000 products. I have a table of historical sales data (about 300,000 records) that contains around 10 continuous variables, along with a dependent variable that has a yes/no (i.e., binary) value indicating whether each product in the list has had a sale in the past 12 months.

The historical data essentially looks like this.

Product1,2,3 etc
Variable 1
Variable 2
Variable 3
Variable 4
Variable 5
Variable 6
Variable 7
Variable 8
Variable 9
Variable 10
Sold in past 12 months (Yes or No)

The last variable in the list is of course the dependent variable.

All I want is a simple tool, ideally the easiest one to use, that lets me assign a probability to each product in the list. That would let me condense my list to the products with the highest likelihood of generating a sale, so that I can list those products instead of the others.

Ideally, the tool could do a quick logistic regression, or some other probability calculation based on the available variables, and thereby give me a (RVU-like) number (perhaps a probability ranging from 0 to 1) for each product. I could then quickly select the top 50,000 products to list on a website, since they would have the highest probability of generating a sale according to the available variables.
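To make the calculation described above concrete, here is a minimal sketch in plain Python of a logistic regression that scores each product and keeps the top-ranked ones. Everything in it is an assumption for illustration: the data is synthetic (3 variables and 500 products rather than 10 and 300,000), the helper names `fit_logistic`, `predict_proba` and `top_n` are hypothetical, and the fit uses simple batch gradient descent. A statistics package or library would normally do the fitting for you.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function: maps any real number into (0, 1).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this row: predicted probability minus 0/1 label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(X, w, b):
    # One probability of sale (between 0 and 1) per product.
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

def top_n(product_ids, probs, n):
    # Rank products by predicted probability, highest first, and keep n of them.
    ranked = sorted(zip(product_ids, probs), key=lambda t: t[1], reverse=True)
    return ranked[:n]

# Synthetic stand-in for the historical table (assumption: only the first
# variable actually influences whether a product sold).
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(500)]
y = [1 if random.random() < sigmoid(2.0 * x[0]) else 0 for x in X]
ids = [f"Product{i + 1}" for i in range(len(X))]

w, b = fit_logistic(X, y)
probs = predict_proba(X, w, b)
shortlist = top_n(ids, probs, 50)  # with the real list this would be 50,000
```

With real data, the variables would usually be standardized first so that a single learning rate works for all of them, and the model would be checked on records it was not fitted on before the ranking is trusted.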

I am of course assuming that the variables are somehow correlated with the outcome, but perhaps the tool will help me determine that.
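One simple way to check that assumption is to compute the correlation between each continuous variable and the yes/no outcome coded as 1/0 (the point-biserial correlation, which is just Pearson's r with a binary variable). The sketch below is only an illustration: the data is synthetic, the column names mirror the layout above, and `pearson` is a hypothetical helper written out by hand.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation; with a 0/1 ys this is the point-biserial correlation."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no information
    return cov / math.sqrt(vx * vy)

# Synthetic example (assumption): Variable 1 drives sales, Variable 2 is noise.
random.seed(1)
rows = 1000
columns = {
    "Variable 1": [random.gauss(0, 1) for _ in range(rows)],
    "Variable 2": [random.gauss(0, 1) for _ in range(rows)],
}
sold = [1 if v + random.gauss(0, 1) > 0 else 0 for v in columns["Variable 1"]]

# Correlation of each variable with the sold / not sold outcome.
r = {name: pearson(values, sold) for name, values in columns.items()}
for name, value in r.items():
    print(f"{name}: r = {value:.3f}")
```

A value near 0 suggests little linear association with the outcome, while values further from 0 in either direction suggest a variable worth keeping. This only looks at one variable at a time, so it would not pick up effects that appear only in combination.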

(1) Does anyone have any suggestions of a good tool to accomplish this? I would presume that there is a simple way to set this up in Microsoft Excel, but if not, then a piece of software that does this would of course be great too.

(2) I am also open to suggestions as to which type of regression analysis (or other analysis) is the best to accomplish this.

Thanks for any suggestions.