This submission provides a few tools for working with sets (or profiles) of permutations (orders, rankings). Basic theory and references are provided in the functions' help.
PermutationDistance computes the Kendall tau distance (or swap or bubble-sort distance) between permutations or sets of permutations.
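To make the definition concrete, here is a minimal Python sketch of the Kendall tau distance between two permutations of the same items. It is an illustration only, not the MATLAB code of PermutationDistance; the function name `kendall_tau_distance` and the list-of-items representation are assumptions made for this example.

```python
from itertools import combinations


def kendall_tau_distance(p, q):
    """Count the item pairs that p and q rank in opposite order.

    p and q are sequences listing the same items from first to last
    (a hypothetical representation chosen for this sketch).
    """
    # Position of each item in q, so each order comparison is O(1).
    pos_q = {item: i for i, item in enumerate(q)}
    # combinations(p, 2) yields pairs (a, b) with a ranked before b in p;
    # the pair is discordant when q ranks b before a.
    return sum(1 for a, b in combinations(p, 2) if pos_q[a] > pos_q[b])
```

For example, `kendall_tau_distance([1, 2, 3], [3, 2, 1])` counts all three pairs as discordant. This quadratic pair count is the simplest form; it equals the number of adjacent swaps a bubble sort needs to turn one order into the other.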
MedianPermutation computes the order (or orders) that minimizes the sum of Kendall tau distances to each order in the set. This is the Kemeny-Young order, which has desirable properties as a consensus order. However, finding it is a computationally hard problem. This implementation uses various ideas from the literature to be efficient, but it performs an exact computation and is therefore limited in the length and number of permutations it can handle. It ran successfully on our data (as part of the clustering algorithm below): sets of 20 or so permutations of length 15.
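The objective being minimized can be illustrated with a brute-force Python sketch that tries every candidate order. This is deliberately naive and is not how MedianPermutation works (the submission uses more efficient ideas from the literature); the names `median_permutations` and `kendall_tau_distance` are assumptions for this example, and the search is only practical for a handful of items.

```python
from itertools import combinations, permutations


def kendall_tau_distance(p, q):
    """Number of item pairs that p and q rank in opposite order."""
    pos_q = {item: i for i, item in enumerate(q)}
    return sum(1 for a, b in combinations(p, 2) if pos_q[a] > pos_q[b])


def median_permutations(profile):
    """Return (all minimizing orders, minimal total distance).

    profile is a list of permutations of the same items. Every
    candidate order is scored, so the cost grows as n! in the
    number of items n.
    """
    items = sorted(profile[0])
    best_cost, best = None, []
    for candidate in permutations(items):
        cost = sum(kendall_tau_distance(candidate, p) for p in profile)
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, [candidate]
        elif cost == best_cost:
            # Ties are kept: the median need not be unique.
            best.append(candidate)
    return best, best_cost
```

For the profile `[(1, 2, 3), (1, 2, 3), (3, 2, 1)]` the only minimizer is `(1, 2, 3)`, while the profile `[(1, 2), (2, 1)]` has two medians, which is why the description above speaks of "the order (or orders)".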
MallowsClustering runs an expectation-maximization (EM) algorithm with a parametric exponential model (Mallows' phi distribution) to find the "best" mixture model to represent the data. It uses the (weighted) median permutation function to compute the central parameter of each cluster.
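As a rough sketch of the model behind the clustering, the Python code below evaluates the Mallows probability of a permutation under the Kendall tau distance and performs one expectation step of a mixture. It assumes the common parametrization P(perm) proportional to exp(-theta * d(perm, center)) with theta > 0 (equivalently phi = exp(-theta)); the function names, argument layout, and this parametrization are assumptions for illustration and may differ from what MallowsClustering does internally.

```python
import math
from itertools import combinations


def kendall_tau_distance(p, q):
    """Number of item pairs that p and q rank in opposite order."""
    pos_q = {item: i for i, item in enumerate(q)}
    return sum(1 for a, b in combinations(p, 2) if pos_q[a] > pos_q[b])


def mallows_log_prob(perm, center, theta):
    """log P(perm) = -theta * d(perm, center) - log Z(theta), theta > 0."""
    n = len(center)
    phi = math.exp(-theta)
    # Closed-form normalizer for the Kendall tau distance:
    # Z = product over j = 1..n of (1 + phi + ... + phi**(j - 1)).
    log_z = sum(math.log(sum(phi ** k for k in range(j)))
                for j in range(1, n + 1))
    return -theta * kendall_tau_distance(perm, center) - log_z


def responsibilities(profile, centers, thetas, weights):
    """E-step: posterior probability of each cluster for each permutation."""
    result = []
    for perm in profile:
        joint = [w * math.exp(mallows_log_prob(perm, c, t))
                 for c, t, w in zip(centers, thetas, weights)]
        total = sum(joint)
        result.append([x / total for x in joint])
    return result
```

In a full EM loop, the maximization step would then update each cluster's weight and dispersion and recompute its center as the median permutation weighted by these responsibilities, which is where the weighted median function comes in.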
This code was developed and used for analyzing preference interview data, which will be presented in a future publication:
Lalancette, A. & Lalancette, M., (in progress). Understanding Fishing Preferences with a Novel Approach to Preference Interviews
It was also presented at a conference:
Preference Interviews: A useful Method for Engaging with Local Fishers, Quebec Center for Biodiversity Science Annual Symposium, Montreal, October 29 - 31, 2015
Marc Lalancette (2019). Expectation maximization clustering, median and distance for set of permutations (https://www.mathworks.com/matlabcentral/fileexchange/57931-expectation-maximization-clustering-median-and-distance-for-set-of-permutations), MATLAB Central File Exchange.