Giant project - Need tips


Before I came to your forum, I had researched a lot for a solution to a problem I am struggling with, so I decided to register here in the hope that someone could give me some suggestions. I need an algorithm, but I don't know where or how to start. It is for my job, where the director expects me, even though I am still a new employee (!), to do something that hasn't been solved for the past 5 years. Yes, you read that correctly. I'm trying to set up a data analysis whose results I can refresh to the most recent values at any time. My current data environment is MS Excel with an add-on called BW (Business Warehouse). If you aren't familiar with this add-on, please briefly google it or look it up on YouTube to get a basic idea of how it works. It's all about filters applied to the data in Excel spreadsheets, and about how, when, and with which settings/parameters those filters are used. I'm not even sure whether I should look for a data analysis expert or an engineering quality assurance expert, perhaps both. No one has been able to build the data analysis method yet, but I would at least like to give it a try, although it seems impossible. Could have asked on and similar websites but don't have any money for this.

So to describe the situation extremely briefly:

I work in the quality department of a manufacturing company that produces different kinds of accessories for properties/apartments. I am currently in the data analysis subsection of quality. I'd prefer not to reveal what exactly is being produced (I will later if needed), although by the end of my post you will probably know what the accessories are. In real life something can always go wrong, no matter what kind of environment we work in. My project covers any kind of problem that occurs at the client's location (property/apartment). Clients might be end users, or resellers if there are intermediate locations. That isn't relevant for this project; what is relevant is that the project only includes situations where the accessories are already physically installed where they are supposed to operate, which is most likely the user's home. Within my project I plan to analyze only situations where users contact a repairman to physically come to their doorstep and fix a problem that occurred on an accessory we manufactured. Users might recognize the problem either by the failure itself or by its consequence, depending on the failure and the situation. Most of the time it is by the consequence, unless the failure is, e.g., noise (which happens very rarely; we all know noise is itself a consequence of something going wrong), which the user hears directly. Usually, though, the consequence is a red alert, which causes the user to contact a repairman. How the contact is made doesn't matter. The only important fact is that the user contacts a repairman. The entire data analysis project is based on this assumption. If a user tries to fix the problem on their own, even successfully, this does NOT count in the project. I don't even want to know about it; it has zero meaning. The only situations I include are those where users contact a repairman when something is wrong AND (!) the repairman physically comes to visit them in order to fix the problem.
Just those two conditions matter. It doesn't matter whether the repairman comes within the deadline set/wished by the user, nor whether the repairman fixes the problem on the first visit (usually they do). It is also irrelevant when the user contacts the repairman (although I might need to include that info in the analysis too). Some problems certainly have higher priority than others: e.g., failures related to an accessory's functionality usually have higher priority than failures related to its exterior shape/look. Everything is in the database and I can see all the data. I analyze it in Excel, although I might need to work with the data in other software such as MySQL. I know data from Business Warehouse (the Excel add-on) can be transferred there from Excel, but I don't know much about MySQL.

Currently I am working with a few hundred filters (I can list some of them on request) and a few thousand conditions/parameters inside those filters. Fortunately I do NOT need to care about how the data is entered into the database, who exactly enters it, or whether something is missing, though most likely everything is filled in. I'm just using the whole mass of data.

***So what I am trying to do:

We have different manufacturing departments, each for its own type of accessory. Several types are produced in our company. What I want is to be able to forecast how many repairman visits will happen in the FUTURE, expressed in % with an accuracy of two decimal places. This factor shall be the number of visits divided by the number of accessories produced. My forecast should be grouped by manufacturing department and also by the country where the visit happens (where the user calls the repairman). Many graph comparisons would need to be done. Different people have been working on this for months, adding up to years in total, and all have failed. I asked my colleagues what the main obstacle to implementing such a method was, and they said everyone got confused because too many factors are involved. A lot of the work toward this method involves looking at how particular graphs perform and comparing pieces of them with the same or similar pieces of new graphs. I would like some tips/suggestions on where and how to start developing such a complicated data analysis algorithm. I probably need to start with a few basic decisions such as:

- frequency of visits per X factors (type of failure, urgency, frequency of the consequence occurring again, ...)

- what I would actually analyze: accessories from the aspect of sold ones or from the aspect of produced ones, etc.

- average density of quantity of sold accessories per country

- what kind of assumption, if any, and with what probability, should be made that accessory A would perform differently in country Y than in country X.
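
As a starting point, the factor described above (number of visits divided by number of accessories produced, in % with two decimals, grouped by department and country) could be sketched roughly like this in Python. This is only a minimal sketch under assumptions: the function name, the record layout, and the example numbers are hypothetical, and it assumes a produced-units figure is available per department and country, which depends on the produced-versus-sold decision listed above. It computes the historical rate only; it is not a forecast.

```python
from collections import Counter


def visit_rates(visits, produced):
    """Return {(department, country): visit rate in %, rounded to 2 decimals}.

    visits:   iterable of (department, country) tuples, one per repairman visit
    produced: dict mapping (department, country) -> number of units produced
              (assumption: such a per-country denominator exists)
    """
    visit_counts = Counter(visits)
    rates = {}
    for key, units in produced.items():
        if units <= 0:
            continue  # no rate can be defined without produced units
        rates[key] = round(100.0 * visit_counts.get(key, 0) / units, 2)
    return rates


# Hypothetical example data, not taken from any real database
visits = [("A", "DE"), ("A", "DE"), ("A", "FR"), ("B", "DE")]
produced = {("A", "DE"): 400, ("A", "FR"): 250, ("B", "DE"): 1000}

rates = visit_rates(visits, produced)
for (department, country), rate in sorted(rates.items()):
    print(f"{department} / {country}: {rate:.2f} %")
```

Each visit is counted once per (department, country) pair, so the same function can be reused later if the grouping is extended by further factors such as type of failure.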

Any tip would be much appreciated.
I am still looking for some assistance here, but I'm unsure whether I should look for an Excel expert, a Business Warehouse expert (= Excel add-on; I assume you are at least familiar with what it is used for), a data analysis expert, or an engineering quality assurance expert. Hopefully I can get some suggestions on where and how to start this big project.


TS Contributor
A first piece of advice would be to pick ONE department and ONE product and work on that problem first. Pick the one where the problem is most pressing and stick to that. If anyone tells you that you need to work on all of them, you can simply refer to past failures. In my experience, it is better to solve one problem and then move on to the next than to try to solve all of them and never finish.

Second, be sure to define precisely what the problem is and how it is measured and check that it IS measured accurately. Do not be too surprised by what you might find:)

Once this is clear, it is time to think about stats and methods - but by then all will be much clearer.
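
To make the first two steps concrete, here is a minimal Python sketch, assuming the visit records can be exported as rows with named fields. The field names, the helper names, and the sample rows are all hypothetical; the point is only to narrow the data to one department and one product and then count how many of those records are incomplete before any statistics are attempted.

```python
# Hypothetical field names; the real export may use different ones.
REQUIRED_FIELDS = ("department", "product", "country", "visit_date")


def select_scope(records, department, product):
    """Keep only the visit records for one department and one product."""
    return [
        r for r in records
        if r.get("department") == department and r.get("product") == product
    ]


def incomplete_records(records, required=REQUIRED_FIELDS):
    """Return the records where any required field is missing or empty."""
    return [r for r in records if any(not r.get(field) for field in required)]


# Hypothetical sample rows, for illustration only
records = [
    {"department": "A", "product": "P1", "country": "DE", "visit_date": "2015-03-02"},
    {"department": "A", "product": "P1", "country": "", "visit_date": "2015-03-09"},
    {"department": "A", "product": "P2", "country": "FR", "visit_date": "2015-04-01"},
    {"department": "B", "product": "P1", "country": "DE", "visit_date": None},
]

scope = select_scope(records, "A", "P1")
bad = incomplete_records(scope)
print(f"{len(scope)} records in scope, {len(bad)} incomplete")
```

If a noticeable share of the records in scope turns out to be incomplete, that is worth resolving before any rate is computed from them.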

rogojel, this won't help. I need much more detailed and completely different instructions. I assume you didn't really understand what needs to be done: implement a method with which I can determine repairmen's service visits to users' homes as a % of products manufactured.


No cake for spunky
I tried to follow the comments, but got lost in all the detail. It would help if you said, in a very few sentences, what specifically you are trying to achieve here (you actually spend more time discussing what you are not trying to do).

I am not certain this is the right forum for this question, as you are actually asking a Six Sigma question rather than a statistics question (if I understand the issue). I have some background in that and might be able to help, but I remain unclear on what exactly you want to do.
It has been a while since I opened this topic. I have made no progress worth mentioning so far and am still looking to get started. There are so many parameters, filters, etc. related to engineering quality assurance, statistics, Excel, SAP/BW, graphs in millions of possible ways, etc. I'm not sure what to start with. I would also appreciate a private message or a reply here if someone could assist, so I could at least get to the first step.