Let's say this is the model we built: f(x) = a + b1*X1 + b2*X2 + e, where X1 is awareness of our product (an indicator variable) and X2 is the number of inquiries (a continuous variable). We are trying to predict the likelihood of belonging to group 1.

My understanding of logistic regression is that the outcome is the probability of something happening or not, at the individual record level. However, my colleague thinks it can be used at the aggregate level as well. For example, at the aggregate level, our awareness is 40% (40% of respondents replied yes = 1, 60% replied no = 0), and our average number of inquiries is 3.8. So my questions are:

1) Can we use the group aggregate values, replacing X1 with 0.4 and X2 with 3.8 in this model? Or do we have to use 0 or 1 for X1?

2) What does the dependent variable mean if we use aggregate values? Say we plug in the aggregate values and the outcome is 0.58. Does it mean a) the probability of belonging to group 1 is 0.58, or b) 58% of people belong to group 1?
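
To make the two readings concrete, here is a minimal sketch of what I mean. The coefficients and the five respondents are made up for illustration (they are not our fitted model or our data); I picked the respondents only so that their averages come out to 40% awareness and 3.8 inquiries. I am also assuming that f(x) is the log-odds, so the probability is 1 / (1 + exp(-f(x))), and I leave out the e term.

```python
import math

def logistic(z):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, NOT our fitted values.
a, b1, b2 = -1.5, 0.8, 0.3

def predict(x1, x2):
    """Predicted probability of group 1 for one (X1, X2) input."""
    return logistic(a + b1 * x1 + b2 * x2)

# Made-up respondents as (X1 awareness, X2 inquiries).
# Chosen so that mean X1 = 0.4 and mean X2 = 3.8.
respondents = [(1, 10), (1, 1), (0, 2), (0, 3), (0, 3)]
n = len(respondents)

mean_x1 = sum(x1 for x1, _ in respondents) / n
mean_x2 = sum(x2 for _, x2 in respondents) / n

# Option A: plug the aggregate values into the model once.
p_at_means = predict(mean_x1, mean_x2)

# Option B: predict for each respondent, then average the probabilities.
mean_of_p = sum(predict(x1, x2) for x1, x2 in respondents) / n

print(f"mean X1 = {mean_x1:.2f}, mean X2 = {mean_x2:.2f}")
print(f"prediction at the means:     {p_at_means:.4f}")
print(f"mean of individual predictions: {mean_of_p:.4f}")
```

With these made-up numbers the two results are not the same, which is part of why I am unsure how to interpret the value we get from plugging in the aggregates.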

Thank you!