There is no general approach to choosing the number of clusters in an unsupervised (clustering) model. One reason is that your choice of k can depend on information beyond the features (variables, columns) used as input to the model. For instance, if you want the clusters to capture some structure in a variable Y that is not used in the model, you might cluster your features X into k clusters and then review how Y breaks down across those k clusters. Using something like cross-validation (as discussed in the link Lazar mentioned), you can find the k for which the clustering of X best relates to Y.

This, however, is not unlike TE's choice of "the level of k ... as to maximize satisfaction among my colleagues." Since the method is unsupervised, it will only do what it was designed to do: partition the feature space defined by X along linear boundaries so that the within-cluster deviations from the cluster centers are minimized. That objective may not match the objective of your application. Thus, the choice of k, or of any learning model, should be determined by the business objective for using the model in the first place.
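To make that first idea concrete, here is a rough sketch of choosing k against an external variable Y. Everything below is my own illustration, not a prescribed method: the toy data, the variable names, the bare-bones k-means, and the scoring rule (held-out squared error when predicting Y from cluster membership) are all assumptions for the example.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Minimal Lloyd's k-means with a k-means++-style initialization."""
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):  # ++-style init: sample points far from current centers
        d2 = ((X[:, None, :] - np.array(centers)[None])**2).sum(-1).min(1)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = ((X[:, None, :] - centers[None])**2).sum(-1).argmin(1)
        new = np.array([X[labels == c].mean(0) if (labels == c).any()
                        else centers[c] for c in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels

def heldout_y_error(X_tr, y_tr, X_te, y_te, k):
    """Cluster X_tr, then score how well cluster membership predicts Y on
    held-out data (mean squared error around per-cluster Y means)."""
    centers, labels = kmeans(X_tr, k)
    y_hat = np.array([y_tr[labels == c].mean() if (labels == c).any()
                      else y_tr.mean() for c in range(k)])
    te_labels = ((X_te[:, None, :] - centers[None])**2).sum(-1).argmin(1)
    return ((y_te - y_hat[te_labels])**2).mean()

# Toy data: X has three well-separated groups, and Y tracks group identity.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(m, 0.3, size=(40, 2)) for m in (0.0, 5.0, 10.0)])
y = np.repeat([0.0, 1.0, 2.0], 40) + rng.normal(0, 0.1, size=120)
tr = rng.permutation(120)[:80]
te = np.setdiff1d(np.arange(120), tr)
errors = {k: heldout_y_error(X[tr], y[tr], X[te], y[te], k) for k in range(1, 6)}
best_k = min(errors, key=errors.get)
```

The point of the holdout split is exactly the cross-validation idea above: k is judged not by the clustering objective itself but by how well the resulting clusters explain Y on data the clustering never saw.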
There are also many other methods, each with different parameter options, to consider. Some of those methods do have built-in ways of picking the number of clusters, but they optimize different objectives from k-means. I'd recommend checking out Coursera's machine learning and clustering courses. At the very least, they are informative.
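As one example of a built-in way to pick the number of clusters, here is a sketch of average silhouette width: cluster for several values of k and keep the k with the highest mean silhouette. The toy data and the bare-bones k-means are my own illustrative assumptions; libraries such as scikit-learn or R's cluster package provide polished versions of the silhouette computation.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Bare-bones Lloyd's k-means, keeping the best of a few random restarts."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(5):
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            labels = ((X[:, None, :] - centers[None])**2).sum(-1).argmin(1)
            new = np.array([X[labels == c].mean(0) if (labels == c).any()
                            else centers[c] for c in range(k)])
            if np.allclose(new, centers):
                break
            centers = new
        labels = ((X[:, None, :] - centers[None])**2).sum(-1).argmin(1)
        inertia = ((X - centers[labels])**2).sum()
        if best is None or inertia < best[0]:
            best = (inertia, labels)
    return best[1]

def mean_silhouette(X, labels):
    """Average silhouette width: s_i = (b_i - a_i) / max(a_i, b_i)."""
    D = np.sqrt(((X[:, None, :] - X[None, :, :])**2).sum(-1))
    s = np.empty(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        a = D[i, same].sum() / max(same.sum() - 1, 1)  # own cluster, excl. self
        b = min(D[i, labels == c].mean()               # nearest other cluster
                for c in set(labels) if c != labels[i])
        s[i] = (b - a) / max(a, b)
    return s.mean()

# Two well-separated groups: the mean silhouette should peak at k = 2.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, size=(30, 2)),
               rng.normal(6, 0.3, size=(30, 2))])
scores = {k: mean_silhouette(X, kmeans(X, k)) for k in (2, 3, 4)}
best_k = max(scores, key=scores.get)
```

Note that silhouette rewards compact, well-separated clusters, which is a different objective from the within-cluster variance that k-means itself minimizes.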
For clustering in R, I've found the flexclust package to be particularly well designed and very flexible. If you want something more advanced, kernel methods are worth a look; the kernlab package has a kernel k-means function (kkmeans).
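If R isn't a constraint, the idea behind kernel k-means is easy to sketch directly: run Lloyd's iterations in the feature space induced by a kernel, using only the Gram matrix. The RBF kernel, the toy data, and this bare-bones implementation are my own assumptions for illustration; kernlab's kkmeans is the polished version.

```python
import numpy as np

def kernel_kmeans(K, k, n_iter=50, seed=0):
    """Lloyd-style kernel k-means driven entirely by the Gram matrix K.
    Squared feature-space distance of point i to the implicit centroid of
    cluster c is  K[i,i] - 2*mean_j K[i,j] + mean_{j,l} K[j,l]  over j, l in c."""
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    labels = rng.integers(0, k, size=n)   # random initial assignment
    diag = np.diag(K)
    for _ in range(n_iter):
        dist = np.empty((n, k))
        for c in range(k):
            mask = labels == c
            if not mask.any():            # re-seed an empty cluster
                labels[rng.integers(n)] = c
                mask = labels == c
            m = mask.sum()
            dist[:, c] = (diag
                          - 2.0 * K[:, mask].sum(1) / m
                          + K[np.ix_(mask, mask)].sum() / m**2)
        new = dist.argmin(1)
        if np.array_equal(new, labels):
            break
        labels = new
    return labels

# Toy check: two well-separated groups under an RBF kernel.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, size=(20, 2)),
               rng.normal(5, 0.3, size=(20, 2))])
sq = ((X[:, None, :] - X[None, :, :])**2).sum(-1)
K = np.exp(-sq / 2.0)                     # RBF Gram matrix
labels = kernel_kmeans(K, 2)
```

Because the algorithm only ever touches K, swapping in a different kernel changes the notion of similarity without touching the clustering code, which is what makes the kernel variant more flexible than plain k-means.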