We want to identify the categories for which we need to differentiate the assortment across stores. We want to prioritize categories from highest to lowest potential (since we deal with 200 categories, we cannot do them all at once).

One of the things we want to take into account is the variation of value (or volume) share across stores. If we see a lot of variation, we think there is more potential for customized store assortment plans than if there is little variation across stores.

We are having some discussion about how to calculate this variance, and I hope someone can help us out!

A colleague thinks it is necessary to normalize the variation per category. What he does is the following, for example for soup:

Calculate the value share per store (500 stores)

Delete outliers (values more than 3 standard deviations above or below the mean)

'Normalize' the values: set the minimum value share to 0 and the maximum to 1. So, e.g., if the minimum share for soup is 1% and the maximum is 5%, after normalizing these values are 0 and 1.

Calculate variation
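The steps above can be sketched as follows. This is only my reading of his procedure, not his actual code: the function names, the use of the population standard deviation and variance, and the example shares are all assumptions made for illustration.

```python
from statistics import mean, pstdev, pvariance


def drop_outliers(shares, k=3.0):
    """Keep only values within k standard deviations of the mean.

    Assumption: population standard deviation; the procedure as described
    does not say which variant is used.
    """
    m, s = mean(shares), pstdev(shares)
    return [x for x in shares if abs(x - m) <= k * s]


def min_max_scale(shares):
    """Rescale so the smallest share becomes 0 and the largest becomes 1."""
    lo, hi = min(shares), max(shares)
    if hi == lo:
        # All stores have the same share: there is nothing to spread out.
        return [0.0 for _ in shares]
    return [(x - lo) / (hi - lo) for x in shares]


def normalized_variation(shares, k=3.0):
    """Outlier removal, then min-max scaling, then variance."""
    kept = drop_outliers(shares, k)
    return pvariance(min_max_scale(kept))


# Hypothetical value shares for soup in four stores (fractions, not percent).
soup_shares = [0.01, 0.02, 0.03, 0.05]
print(normalized_variation(soup_shares))
```

In practice `soup_shares` would hold one value per store (500 values), and the function would be called once per category.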

He wants to normalize the values because otherwise categories with large value shares almost always come out with the highest variation. Without normalization, the result depends more on how we define categories, which is always a bit subjective...

I have never heard of this methodology and I don't feel comfortable with it at all. I think we need to use the variation without normalization. Value share is comparable across categories, and categories with, for example, a value share between 1% and 2.5% are more interesting to investigate further for differentiation than a category with a share between 0.5% and 0.7%.

In my opinion there is a high risk in this normalization, because it stretches out categories that initially have a low variation between stores and shrinks categories that initially have a high variation across stores. However, I have difficulty convincing him.
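To make my concern concrete, here is a minimal sketch with made-up numbers. It assumes two hypothetical categories whose stores are spread in exactly the same pattern between the category minimum and maximum, one ranging from 1% to 2.5% and the other from 0.5% to 0.7%. The store pattern, the variable names, and the choice of population variance are my own assumptions.

```python
from statistics import pvariance


def min_max_scale(values):
    """Rescale so the minimum becomes 0 and the maximum becomes 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical relative position of each store between the category's
# minimum (0.0) and maximum (1.0) share; the same pattern for both categories.
pattern = [0.0, 0.1, 0.3, 0.5, 0.6, 0.9, 1.0]

# Shares in percentage points: category A spans 1% to 2.5%,
# category B spans 0.5% to 0.7%.
cat_a = [1.0 + 1.5 * u for u in pattern]
cat_b = [0.5 + 0.2 * u for u in pattern]

raw_a, raw_b = pvariance(cat_a), pvariance(cat_b)
norm_a = pvariance(min_max_scale(cat_a))
norm_b = pvariance(min_max_scale(cat_b))

print("raw variance A vs B:       ", raw_a, raw_b)
print("normalized variance A vs B:", norm_a, norm_b)
```

The raw variances differ by a factor of (1.5 / 0.2)^2, while the normalized variances coincide up to rounding: min-max scaling removes the width of the range, so what remains only reflects how the stores are distributed within that range.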

Can someone help me with a more statistical explanation of why 'normalizing' these value shares is wrong? Or convince me that his way of working is right?

Kind regards,

Inge