# Distinct counts / How many different products did a client buy?

#### Londoner

##### New Member
Hi all,

I'm still pretty new to SPSS and trying to find my way around the different analyses that are possible. I'm a bit stuck on replicating one analysis I tend to do in Excel, involving distinct counts for a pair of string variables.

Imagine that I have the following data set:

Client - Product - Price
Adam - Apples - $2.00 Berta - Apples -$1.80
Doug - Apples - $1.50 Doug - Apples -$1.80
Carl - Bananas - $1.50 Adam - Oranges -$2.20
Doug - Oranges - $2.50 I would like to know: How many different products did each client buy? So in this case, I would expect the following results: Client - Distinct Product Count Adam - 2 Berta - 1 Carl - 1 Doug - 2 (Note that for Doug it should be 2, not 3, because he bought Oranges once and Apples twice but I am interested in distinct products) In Excel, this is possible by using pivot tables and distinct counts, but I have no idea how to implement this in SPSS (or if it is even possible). Any pointers will be highly appreciated #### Berley ##### Member The feature you are looking for is called a "crosstab." It is at Analyze > Descriptive Statistics > Crosstabs. #### Londoner ##### New Member Hi Berley, Thanks for your reply. I have tried to use crosstabs, but it does not seem to be able to do distinct counts or at least I did not manage to find the correct options. Please see below for the output that I am getting. The problem is that for Doug, it is showing "3" in the total column, while I would like to see "2", as in he bought two different products, although he did make three transactions. It would be great if you could help me achieve that. I have pasted the syntax as well, because maybe that is where I am going wrong. Thanks again Edit: I spelt Oranges wrong in one of the data lines, which is why it shows twice in the attached image, sorry about that Last edited: #### Karabiner ##### TS Contributor [untested:] First sort by client and product. Use Aggregate with break variables client and product and check the "count number of cases" box. This will give you a new variable containing the frequency for each combination of client and product. Then create a new variable: IF(name ne LAG(name) | product ne LAG(product)) flag = 1. You can then select or filter by flag. With kind regards K. #### Londoner ##### New Member Hi Karabiner, Thanks for your suggestion, seems to work perfectly well. Very elegant approach to the solution! In case that anyone else is stuck with a similar problem, this syntax following Karabiner's hints got me to the point where Crosstabs is showing the desired results for the sample data mentioned above: SORT CASES BY Client(A) Product(A). AGGREGATE /OUTFILE=* MODE=ADDVARIABLES /BREAK=Client Product /N_BREAK=N. IF (Client NE LAG(Client) OR Product NE LAG(PRODUCT)) flag=1. EXECUTE. USE ALL. COMPUTE filter_$=(flag=1).
VARIABLE LABELS filter_$'flag=1 (FILTER)'. VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMATS filter_$(f1.0). FILTER BY filter_$.
EXECUTE.

CROSSTABS
/TABLES=Client BY Product
/FORMAT=AVALUE TABLES
/CELLS=COUNT
/COUNT ROUND CELL.