# How to divide a data set randomly into training and testing data sets?

229 views (last 30 days)
chocho on 16 Apr 2018
Hello guys, I have a dataset stored as a 399*6 matrix of type double, and I want to divide it randomly into two subsets, a training set and a testing set, using cross-validation.
I have tried the code from this page but did not get what I wanted: https://www.mathworks.com/help/stats/cvpartition-class.html
Could anyone help me do that?
Expected outputs:
training_data: k*6 double
testing_data: l*6 double
##### 1 comment
chocho on 16 Apr 2018
Any help, please?

### Accepted Answer

KSSV on 16 Apr 2018
Edited: KSSV on 16 Apr 2018
Let A be your data matrix of size 399*6. To divide its rows randomly into training and testing sets with a given percentage:
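[m,n] = size(A) ;                        % m rows (samples), n columns (features)
P = 0.70 ;                               % fraction of rows used for training
idx = randperm(m) ;                      % random permutation of the row indices 1..m
Training = A(idx(1:round(P*m)),:) ;      % first round(P*m) shuffled rows
Testing = A(idx(round(P*m)+1:end),:) ;   % all remaining shuffled rows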
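If you need the same split every time you run the script, one option is to seed the random number generator before calling randperm. Here is a minimal sketch of that idea with a couple of sanity checks. The random matrix and the seed value 42 are placeholders I made up for illustration, not part of the original answer.

```matlab
% Illustrative data only: a random 399-by-6 matrix standing in for A.
rng(42);                          % fix the seed (42 is an arbitrary choice)
A = rand(399,6);

m = size(A,1);
P = 0.70;                         % fraction of rows used for training
nTrain = round(P*m);              % number of training rows
idx = randperm(m);                % shuffled row indices

trainIdx = idx(1:nTrain);
testIdx  = idx(nTrain+1:end);
Training = A(trainIdx,:);
Testing  = A(testIdx,:);

% Sanity checks: no row appears in both sets, and every row is used.
assert(isempty(intersect(trainIdx,testIdx)));
assert(size(Training,1) + size(Testing,1) == m);
```

Keeping trainIdx and testIdx around is handy if you later need to split a label vector in exactly the same way.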
##### 20 comments
Thank you!!
Abhijit Bhattacharjee on 4 Mar 2023
If it hasn't been covered already, you can also use cvpartition to split the dataset. See THIS answer for more details.

### More Answers (8)

Jeremy Breytenbach on 24 May 2019
Edited: Jeremy Breytenbach on 24 May 2019
Hi there.
If you have the Deep Learning toolbox, you can use the function dividerand: https://www.mathworks.com/help/deeplearning/ref/dividerand.html
[trainInd,valInd,testInd] = dividerand(Q,trainRatio,valRatio,testRatio) separates targets into three sets: training, validation, and testing.
##### 1 comment
Koosha on 30 Mar 2022
Thank you

ALDO on 2 Feb 2020
You can use the helper function 'helperRandomSplit', which performs the random split. helperRandomSplit accepts the desired split percentage for the training data and the data set (Data). It outputs two data sets along with a set of labels for each. Each row of trainData and testData is a signal. Each element of trainLabels and testLabels contains the class label for the corresponding row of the data matrices.
percent_train = 70;
[trainData,testData,trainLabels,testLabels] = ...
helperRandomSplit(percent_train,Data);
Make sure you have the proper toolbox to use it.
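As far as I know, helperRandomSplit is a helper file shipped with a MathWorks example rather than a general-purpose function, so it may not be on your path. If it is not available, the same kind of labeled split can be sketched in base MATLAB with randperm. This is only an approximation of the idea, not the helper's actual implementation. The variables Data and Labels below are made-up placeholders.

```matlab
% Illustrative data only: 10 signals of length 5 with made-up class labels.
rng(0);                                   % arbitrary seed for repeatability
Data   = rand(10,5);                      % each row is one signal
Labels = repmat({'A';'B'},5,1);           % 10-by-1 cell array of class labels

percent_train = 70;
n      = size(Data,1);
nTrain = round(percent_train/100 * n);    % number of training rows
idx    = randperm(n);                     % one permutation shared by data and labels

trainData   = Data(idx(1:nTrain),:);
trainLabels = Labels(idx(1:nTrain));
testData    = Data(idx(nTrain+1:end),:);
testLabels  = Labels(idx(nTrain+1:end));
```

Using the same idx for both Data and Labels is what keeps each label attached to its row.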
##### 1 comment
Lucrezia Cester on 7 Feb 2021

sidra ashiq on 23 Nov 2018
Training = A(idx(1:round(P*m)),:) ;
What is the A function?
##### 2 comments
Mohamed Marei on 17 Dec 2018
A is the vector or array being indexed by the elements inside the parentheses. It is not a function.
madhan ravi on 17 Dec 2018
A is a matrix

Mehernaz Savai on 26 May 2022
Edited: Mehernaz Savai on 26 May 2022
You can partition data in a number of ways:
Let X be your input matrix. You can also use a similar workflow for tables.
If you have the Statistics and Machine Learning Toolbox, you can use cvpartition as follows:
% Partition with 40% of the data held out for testing
hpartition = cvpartition(size(X,1),'Holdout',0.4);
% Extract logical indices for training and test
trainId = training(hpartition);
testId = test(hpartition);
% Use the indices to partition the matrix
trainData = X(trainId,:);
testData = X(testId,:);
If you have the Deep Learning Toolbox, you can use dividerand as follows:
% Partition with a 60:20:20 ratio for training, validation and testing
% respectively
[trainId,valId,testId] = dividerand(size(X,1),0.6,0.2,0.2);
% Use the indices to partition the matrix
trainData = X(trainId,:);
valData = X(valId,:);
testData = X(testId,:);
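Since the original question mentions cross-validation, it may also help to see cvpartition used with 'KFold' instead of 'Holdout'. Here is a minimal sketch. It assumes the Statistics and Machine Learning Toolbox is installed. The random matrix and the choice of 5 folds are placeholders of mine, not values from the question.

```matlab
% Illustrative data only: a random 399-by-6 matrix standing in for X.
rng(1);                                   % arbitrary seed for repeatability
X = rand(399,6);

k = 5;                                    % number of folds (an arbitrary choice)
c = cvpartition(size(X,1),'KFold',k);     % needs Statistics and Machine Learning Toolbox

for i = 1:k
    trainData = X(training(c,i),:);       % rows used for training in fold i
    testData  = X(test(c,i),:);           % rows held out in fold i
    fprintf('Fold %d: %d training rows, %d test rows\n', ...
            i, size(trainData,1), size(testData,1));
end
```

Each row lands in the test set of exactly one fold, so across the loop every observation is used for testing once.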

Pramod Hullole on 5 Mar 2019
Hello sir,
I'm new to neural networks. I am working on a project on leaf disease detection using image processing. I have finished feature extraction and am not sure what the next step is. I know that I should apply a neural network and divide the data into training and testing sets, but I don't understand how to proceed in practice. Please help me with this and describe each step in detail.
##### 1 comment
Savas Yaguzluk on 8 Mar 2019
Dear Pramod,

Hossein Amini on 15 Jul 2019
Hi there, it worked for me, but I have a problem with the rest of the code. The newrb documentation describes how to write the code, but no matter how many times I tried, I got an error like the one below.

Hossein Amini on 15 Jul 2019
% Ptrain is the training fraction (defined elsewhere; its value is not given in this post)
[z,r] = size(X);                               % z samples, r features
idx = randperm(z);                             % shuffled row indices
TrainX = (X(idx(1:round(Ptrain.*z)),:))';      % transposed so that each column is one sample
TrainY = (Y(idx(1:round(Ptrain.*z)),:))';
TestX = (X(idx(round(Ptrain.*z)+1:end),:))';
TestY = (Y(idx(round(Ptrain.*z)+1:end),:))';
If I'm not mistaken, the newrb documentation says the input and target data must have the same number of columns (for example 4x266 and 1x266), which is why I transposed the matrices. But the error I got refers to a zeros matrix, and I don't know how to deal with that.
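The column-per-sample layout described above can be checked before the data is passed on to newrb. Here is a minimal sketch with made-up data. The sizes (380 samples, 4 features) and Ptrain = 0.70 are assumptions chosen only so that the training set comes out as 4x266 and 1x266. The sketch does not call newrb and does not try to reproduce the error.

```matlab
% Illustrative data only: 380 samples, 4 input features, 1 target.
rng(3);                                   % arbitrary seed
X = rand(380,4);
Y = rand(380,1);
Ptrain = 0.70;                            % assumed training fraction

z = size(X,1);
nTrain = round(Ptrain*z);                 % number of training samples
idx = randperm(z);                        % shuffled row indices

TrainX = X(idx(1:nTrain),:)';             % features-by-samples
TrainY = Y(idx(1:nTrain),:)';             % targets-by-samples
TestX  = X(idx(nTrain+1:end),:)';
TestY  = Y(idx(nTrain+1:end),:)';

% Inputs and targets must agree in the number of columns (samples).
assert(size(TrainX,2) == size(TrainY,2));
assert(size(TestX,2)  == size(TestY,2));
```

If these assertions pass on your real X and Y, the split itself is probably not the source of the error.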

ranjana roy chowdhury on 15 Jul 2019
The dataset is the WS Dream dataset, a 339*5825 matrix. The entries have values between 0 and 0.1, and a few entries are -1. I want to set 96% of the entries in this dataset to 0, excluding the entries that are -1.
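One possible reading of this request is to pick 96% of the entries that are not -1 at random and set them to 0. Here is a minimal sketch under that assumption, again built on randperm. The small matrix D is made-up stand-in data, not the real dataset.

```matlab
% Illustrative data only: a small matrix standing in for the 339-by-5825 one.
rng(7);                                   % arbitrary seed
D = 0.1*rand(20,50);                      % values between 0 and 0.1
D(randperm(numel(D),30)) = -1;            % mark a few entries as -1

valid = find(D ~= -1);                    % linear indices of entries that may be zeroed
nZero = round(0.96*numel(valid));         % 96% of the eligible entries
pick  = valid(randperm(numel(valid),nZero));

M = D;
M(pick) = 0;                              % zero the selected entries; -1 entries stay untouched
```

The untouched 4% of eligible entries keep their original values, and every -1 entry is preserved.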
