I am planning to convert my machine learning code from R to MATLAB. In that code I impute missing values using KNN. In the R version, I impute the missing data only after splitting it into training and testing sets, to prevent double dipping. The R code is simply as follows:
- Impute missing values in the training dataset (mltrain) only:
- mltrain2 <- DMwR::knnImputation(mltrain)
- Impute missing values in the testing dataset (mltest), passing the training dataset as distData so that the neighbours are searched for in the training data rather than in the testing data:
- mltest <- DMwR::knnImputation(mltest,distData = mltrain)
In MATLAB, I tried to use knnimpute on the training and testing datasets separately, in the same way as in the R code above. However, there is no option to pass the training data as the neighbour source when imputing the missing values of the testing dataset.
Any suggestions on how to solve this issue?
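
One possible workaround is to skip knnimpute for the testing set and do the neighbour search by hand. The block below is only a minimal sketch of that idea in base MATLAB, not a port of DMwR::knnImputation: the helper name imputeFromReference, the toy matrices, and k = 2 are all made up for illustration. It also assumes numeric matrices with observations in rows, uses an unweighted mean of the neighbours, and does no feature scaling, so its results will generally differ from the R output.

```matlab
% Toy data (illustrative only): rows are observations, columns are features.
mltrain = [1 2 3; 2 3 4; 10 11 12; 11 12 13];
mltest  = [1.5 2.5 NaN];

% Fill the NaN in mltest using neighbours taken from mltrain only.
mltestImputed = imputeFromReference(mltest, mltrain, 2);
disp(mltestImputed)   % 1.5000    2.5000    3.5000

% Hypothetical helper (not a MATLAB built-in): for each row of Xtest, fill
% NaNs with the mean of the k nearest complete rows of Xtrain.
% Local functions in a script need R2016b or later, as does the implicit
% expansion used in the subtraction below.
function Xfilled = imputeFromReference(Xtest, Xtrain, k)
    % Only training rows without missing values are neighbour candidates.
    ref = Xtrain(~any(isnan(Xtrain), 2), :);
    Xfilled = Xtest;
    for i = 1:size(Xtest, 1)
        miss = isnan(Xtest(i, :));
        if ~any(miss) || all(miss)
            continue  % nothing to impute, or no observed values to match on
        end
        % Euclidean distance computed on the columns observed in this row.
        diffs = ref(:, ~miss) - Xtest(i, ~miss);
        d = sqrt(sum(diffs .^ 2, 2));
        [~, order] = sort(d);
        nn = order(1:min(k, numel(order)));
        % Unweighted mean of the neighbours' values in the missing columns.
        Xfilled(i, miss) = mean(ref(nn, miss), 1);
    end
end
```

In the toy example the two closest training rows are [1 2 3] and [2 3 4], so the missing third value becomes 3.5. If the features are on different scales, standardising both sets with the training mean and standard deviation before calling the helper would keep the testing data out of the fit.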