I'm trying to learn a bit here, so I'm experimenting with some data I have available. The data consists of user records, each with appended demographic info such as gender, location (ZIP), age (in ranges like 30-35), kids (y or n), etc. Each person also has a set of interest/behavior data points, such as interest in basketball. These are all yes or no, and there are around 40-50 of them per person.

Here's the question: what analysis should I be doing to understand what "clusters" (not sure if this is even the right term) exist in the database? For example, there might be a large group of women aged 30-35 who are interested in skiing, reading, and hang-gliding, and have at least one child. I was thinking that k-means might work, but almost all the data points are binary (y or n), and I believe I've read that k-means isn't appropriate for that kind of data. My next thought was some kind of hierarchical clustering. The additional challenge is that I don't know how many clusters to define. Ideally, I'd like the analysis to suggest that too, but I'd be satisfied with stating the number of clusters in advance (say, 5-6), which would be manageable to actually use in a marketing context.
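
To make the hierarchical idea concrete, here is a rough sketch of what I have in mind, written in Python just for illustration. Everything in it is an assumption on my part: the toy records are made up (not my real data), Jaccard distance and average linkage are only my guesses at reasonable choices for yes/no data, and the function names (jaccard_distance, average_linkage, agglomerate) are hypothetical.

```python
from itertools import combinations

def jaccard_distance(a, b):
    """1 - (count of both yes) / (count of either yes) for two 0/1 lists."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two all-"no" records are treated as identical (an assumption).
    return 0.0 if either == 0 else 1.0 - both / either

def average_linkage(c1, c2, records):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(jaccard_distance(records[i], records[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def agglomerate(records, n_clusters):
    """Start with one cluster per record; merge the closest pair until n_clusters remain."""
    clusters = [[i] for i in range(len(records))]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], records),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return clusters

# Made-up toy data; columns: female, has_kids, skiing, reading, basketball
toy_records = [
    [1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1],
]
toy_clusters = agglomerate(toy_records, n_clusters=2)
print(sorted(sorted(c) for c in toy_clusters))  # prints [[0, 1, 2], [3, 4, 5]]
```

This brute-force version is only meant to show the idea and would be far too slow for a real database. It also assumes the age ranges and ZIP codes have already been turned into yes/no columns somehow, which I haven't figured out how to do sensibly.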

The follow-up question: what's the best way to perform this analysis? I'm assuming the answer is R, but if you can point me to some resources on the particular analysis, that would be great.

Thanks!