I was going to say this was a novel method, but unfortunately it isn't. The idea of splitting a data set into training and test sets non-randomly is one that is very appealing, seems quite reasonable at first, and leads to great results. And those results are completely misleading.
The safest procedure when doing the two-third/one-third split (or whatever) of a dataset into training and test sets is to choose the split randomly. As soon as you do anything else, the performance of your results on the test set will not be related to its real-life predictive performance.
Now and then a paper comes out where the training set is chosen to be a diverse set (whether by a Kennard-Stone type procedure, gridding the points, or initial clustering). At first this seems quite reasonable: you've made sure that you have chosen points that span the entire space of your dataset. In actual fact, what you've just done is to ensure that all of your test set points are close to a point in your training set. This means that the predictions on the test set are now woefully optimistic and unrepresentative of predictions on a true test set (which probably won't have the courtesy to be close to points in your training set, nor be distributed evenly across the whole space).
Here's another way of thinking about it. Let's imagine that our predictive model is meant to work on a particular chemical space of molecules (the domain of applicability or so). These molecules have a particular distribution of descriptor variables (or whatever). And both the training and test sets are supposed to be representative samples of this distribution. However, once you use a non-random method to split into training and test sets, both the training and test sets become less and less representative of the population and hence you will have little idea of the real-life predictive ability of your model.
I've never seen this idea written down anywhere, so I'd be interested to know whether anyone thinks there is justification for choosing a non-random training set in certain circumstances. My own advice though is to keep it real and keep it random.
Image credit: Arenamontanus