I would like to compare the performance of my algorithms on these problem instances. Performance can be measured by

- runtime until proven optimality (can be infinity if the time limit is exceeded)
- runtime until optimal solution is found (if proven optimal)
- quality of the best solution found within the given time limit
- memory consumption
- etc.

**For each of these performance measures, I would like to do a solid statistical test to determine if one algorithm is significantly better than others or not.**

My questions are:

- Is it reasonable to do independent tests for each of the performance measures? (For example, it could very well happen that one algorithm is much faster finding good solutions, but can barely prove optimality. In this case, it depends on the circumstances which algorithm should be preferred and a general conclusion is not possible.)
- Which tests are suited for my problem?
- On how many problem instances should I run the algorithms to obtain performance data?
- How often should I ran the algorithms on each problem instance (most of the algorithms are non-deterministic)?
- How do I set reasonable time limits?
- What other useful performance measures could be tested?
- Any other ideas?