Hello all, I am new to machine learning and want to use MATLAB for it. I am trying to form a training set in MATLAB on the basis of the following expression:
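$S = \{(x_m, y_m)\}_{m=1}^{M}$
where $S$ denotes the training set, $M = 10$, $m = 1, \dots, M$, $x_m$ is the training feature such that $x_m \in \mathbb{R}^2$, and $y_m$ denotes the training label such that $y_m \in \{-1, +1\}$.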
My query is what should be the dimension of my training set. I think it should be .
Any help in this regard will be highly appreciated.

1 Comment

chaaru datta on 14 May 2022
Any help in this regard would be highly appreciated...

Accepted Answer

the cyclist on 14 May 2022

If I understand all of your notation correctly, I think your training set needs to be an Mx3 matrix.
If your notation for x means that each observation of x has two components (epsilon minus and epsilon plus), then for each observation in the training set you need two values to represent x and one to represent y. So
M = [0.2 0.3 -1;
-0.3 0.4 1;
...
0.6 0.5 -1];
would be the representation in which
  • 1st column is x (epsilon minus)
  • 2nd column is x (epsilon plus)
  • 3rd column is y
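As a minimal sketch of that layout, the snippet below assembles an M-by-3 matrix from two feature columns and one label column, then splits it back into inputs and labels. The variable names (en_minus, en_plus, y, S, X, Y) and the numeric values are placeholders for illustration, not data from the question.

```matlab
% Placeholder values (assumption): replace with your own features and labels.
en_minus = [0.2; -0.3; 0.6];   % feature 1 (epsilon minus), one row per observation
en_plus  = [0.3;  0.4; 0.5];   % feature 2 (epsilon plus)
y        = [-1;   1;  -1];     % known label of each observation (-1 or +1)

% Combine into one training matrix with columns [feature 1, feature 2, label].
S = [en_minus, en_plus, y];

% Split it again when a function expects inputs and labels separately.
X = S(:, 1:2);
Y = S(:, 3);

disp(size(S))   % [3 3] here, i.e. M-by-3 with M = 3
```

Each row is one observation, so the label in column 3 must be the label that belongs to the features in columns 1 and 2 of the same row.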

16 Comments

chaaru datta on 14 May 2022
Edited: chaaru datta on 14 May 2022
Thank you so much for your answer, sir.
But I have a question about how to assign a label to each observation. My doubt arises because the first column of the training set relates to epsilon minus and the second column relates to epsilon plus, so how should I decide whether the label of that observation is minus or plus?
the cyclist on 14 May 2022
Is this a supervised learning task? If so, then you should know all the input features (x) and the label y.
You have to know the features and the labels, in order to train the model.
If you don't know the values of the features and the label, you might have an unsupervised learning task.
Maybe you could explain more about your problem, and post your data?
chaaru datta on 14 May 2022
Yes sir, it's a supervised learning task and I know all the input features (x).
Also, I know that the label is either -1 or 1.
But my doubt is this: if we consider the first row, what label should I put in the 3rd column? Plus 1 or minus 1?
the cyclist
I'm still not sure I understand your question. Do you want separate arrays for input and label?
X = [0.2 0.3;
-0.3 0.4;
...
0.6 0.5];
Y = [-1;
1;
...
-1];
chaaru datta on 14 May 2022
No sir, I don't want separate arrays for input and label.
Basically, I want the same array as the earlier one, i.e. M×3.
But my question is how one decides whether the label in the first row, third column is minus 1 or plus 1.
the cyclist on 14 May 2022
I'm confused.
You wrote "Also, I know that the label is either -1 or 1."
So, use the information you know. If you know the value is -1, put -1. If you know the value is +1, use 1.
chaaru datta on 14 May 2022
Ok sir, thanks a lot once again. I will implement it in MATLAB now.
chaaru datta on 15 May 2022
Hello sir, I implemented this training set (Mx3) for an SVM. However, I am getting an accuracy of around 50%, whereas I was expecting it to be around 98%.
the cyclist on 15 May 2022
Can you upload your data and code? (You can use the paperclip icon in the INSERT section of the toolbar.)
Without seeing your data/code, it's impossible to know whether you have implemented something incorrectly, or if you just are expecting too much accuracy.
chaaru datta on 16 May 2022
Hi sir,
I have shared my code and training set.
the cyclist
I'm confused again, because the code you uploaded ...
  • doesn't load the data
  • seems to just generate random data (maybe for testing the code?)
  • doesn't fit a statistical model
When you say you got low accuracy, I don't see where you have calculated that.
Also, I did fit a logistic regression model to the data in that file (and also looked at some scatter plots and correlation coefficients), and it doesn't look like En_minus or En_plus have much explanatory power at all for Target:
data = readtable("https://www.mathworks.com/matlabcentral/answers/uploaded_files/999375/Dataset_PIDpaper_7_pls15dB_prac18.xlsx");
data.Target = (data.Target+1)/2;
modelspec = 'Target ~ En_plus + En_minus';
mdl = fitglm(data,modelspec,'Distribution','binomial')
mdl =
Generalized linear regression model:
    logit(Target) ~ 1 + En_minus + En_plus
    Distribution = Binomial

Estimated Coefficients:
                    Estimate          SE          tStat      pValue
                   ___________    __________    ________    ________
    (Intercept)     -0.0073256     0.0094742    -0.77322     0.43939
    En_minus       -0.00045798    0.00088569    -0.51709     0.60509
    En_plus          0.0018022    0.00089727      2.0086    0.044583

100000 observations, 99997 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 4.04, p-value = 0.133
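A quick sanity check of whether features carry any information about a label is to look at their correlation with it. The sketch below uses synthetic stand-in data (not the uploaded file) in which the labels are drawn independently of the features; the variable names, the sample size, and the Gaussian features are all assumptions for illustration.

```matlab
rng(0);                                   % fixed seed so the run is repeatable
n = 1e5;
en_minus = randn(n, 1);                   % synthetic stand-in feature
en_plus  = randn(n, 1);                   % synthetic stand-in feature
target   = 2*randi([0 1], n, 1) - 1;      % labels (-1/+1) drawn independently of the features

% corrcoef treats each column as a variable and returns a 3-by-3 matrix.
R = corrcoef([en_minus, en_plus, target]);

% Correlation of each feature with the label; values near zero mean
% the features have essentially no linear relationship to the label.
disp(R(1:2, 3))
```

If real features show correlations this close to zero, a classifier trained on them is unlikely to do much better than chance.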
chaaru datta on 16 May 2022
Sir, I would like to answer your queries one by one.
1) I am using an SVM to classify wireless signals.
2) The data is not loaded because I used MATLAB to generate the data (training set) of dimension Mx3, where M = 10^5.
3) "Seems to just generate random data": the generated data has random values because it relates to wireless channels, which are random in nature.
4) "I don't see where you have calculated the accuracy": using this training set, I calculated the accuracy in Python.
I am also sharing the research paper that I am trying to implement.
the cyclist on 16 May 2022
I have to admit I can't spend the time to fully understand your code or that paper. But here is my impression.
In your code, it looks like Train_label_final is not just random, but random with no relationship to Train_set_features. In other words, this is the case where the signal-to-noise ratio is tiny [SNR(dB) very negative]. In the paper, notice that when SNR(dB) = -15, they also get an accuracy of about 50%. I think you are seeing exactly the same thing.
But I don't see anywhere in your code where you coded an example in which SNR is large, so you have never simulated a case where the accuracy would be high.
chaaru datta on 17 May 2022
Sir, in the code the large SNR of +15 dB is set on line 21, and its effect is included on line 44 and line 57.
the cyclist
I see that the signal is used in the calculation of the features, but it doesn't affect the label, right?
The label you generated is completely random, not affected by the features. Here is the code to generate the labels, with all other code removed:
M_train = 1*10^5; % for training iteration, given in paper as 10^5
M_train_detail = int32(randi([0, 1], [1, M_train])); % generating random tag symbols
Train_label_final = [];
for kk = 1:(M_train)
    if M_train_detail(kk) == 0
        lab = -1;
    else
        lab = 1;
    end
    Train_label = [lab];
    Train_label_final = [Train_label_final; Train_label];
end
This is random, with no reference to signal or the features. Therefore, it is no surprise that you cannot predict these labels from the features.
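To illustrate the difference, here is a minimal sketch in which one label vector is drawn once, used to generate the features, and then stored as the third column. The feature model is a toy assumption (the label raises the mean of one of two noisy energies) and is not the paper's Eq. 5 to 8; amp, unrelated, and the sign rule used as a classifier are likewise hypothetical stand-ins.

```matlab
rng(1);                                        % fixed seed so the run is repeatable
M_train = 1e5;
labels = 2*randi([0 1], M_train, 1) - 1;       % one label per observation, -1 or +1

% Toy feature model (assumption, NOT the paper's equations):
% the label decides which of the two features receives the signal.
amp = 3;                                       % hypothetical signal amplitude
en_minus = amp*(labels == -1) + randn(M_train, 1);
en_plus  = amp*(labels ==  1) + randn(M_train, 1);

% Store the SAME labels that generated the features.
Train_set = [en_minus, en_plus, labels];

% A second, independent draw: labels unrelated to the features.
unrelated = 2*randi([0 1], M_train, 1) - 1;

% Simple stand-in classifier: predict +1 when en_plus exceeds en_minus.
pred = sign(Train_set(:, 2) - Train_set(:, 1));
pred(pred == 0) = 1;                           % break exact ties toward +1

acc_matched   = mean(pred == Train_set(:, 3)); % labels that produced the features
acc_unrelated = mean(pred == unrelated);       % independently drawn labels
fprintf('matched: %.3f, unrelated: %.3f\n', acc_matched, acc_unrelated);
```

With matched labels the accuracy is high in this toy setup, while with independently drawn labels it sits near chance, which is the roughly 50% result discussed in this thread.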
chaaru datta on 17 May 2022
Yes sir, you are right. I am generating the labels, but they are not affected by the features.
Also, I would like to briefly describe the system model given in the paper.
1) The system model contains a radio frequency source, a tag, and a reader.
2) The tag reflects (backscatters) two types of signal, viz. -1 and +1.
3) When the reflected signal from the tag is -1, the epsilon minus feature is obtained at the reader; otherwise epsilon plus is obtained at the reader.
4) Thus my training set consists of epsilon minus, epsilon plus, and the label for each reflected signal from the tag.

More Answers (1)

the cyclist on 17 May 2022

I spent a little bit more time with the paper.
It seems to me that in the paper, the labels y are supposed to be used when generating s (Eq. 5 & 6) and then epsilon (Eq. 7 & 8).
But you don't use your labels as part of the calculation of the features.

7 Comments

chaaru datta on 18 May 2022
Yes sir, you are right, but I had also generated the features according to the labels.
For example, in the code, lines 44 to 54 are for label -1 and lines 57 to 67 are for label +1.
the cyclist on 18 May 2022
But the labels used to generate the feature are not what you use in the variable Train_label (which is the 3rd column of Train_set). Shouldn't they be the same labels? Instead, Train_label is just random noise.
Can you also post the Python code with the model, so I can see how you are using the output of the MATLAB program?
chaaru datta on 18 May 2022
Sir, I am sharing the Python code.
Sir, in this paper we have two features based on the energy of the signals, en_min and en_pls, as computed in the MATLAB code on lines 54 and 67. So how should I assign the label to these features?
chaaru datta on 18 May 2022
Hello sir, I would like to clarify a few of my doubts one by one.
1) As per our earlier discussion, the training set has to be M x 3. So if we assume M = 10, the training set will be 10 x 3, in which the first column holds en_min, the second column holds en_pls, and the last column holds the label. Is this correct, sir?
2) If the mth bit from tag to reader is -1, then only en_min will be obtained. What should the value of en_pls be in that case?
3) If the mth bit from tag to reader is 1, then only en_pls will be obtained. What should the value of en_min be in that case?
4) If we consider that both en_min and en_pls are available for the mth bit, what should we write in the corresponding label, i.e. in the third column?
Sir, I would also like to share a few observations.
1) I also tried the following: if the mth bit is -1, then en_min has a value, en_pls is zero, and the label is -1; if the mth bit is 1, then en_pls has a value, en_min is zero, and the label is 1. However, I get 100% accuracy, which is definitely not correct.
2) I also formed a training set in which I compare en_min and en_pls and assign the label 1 if en_pls is greater than en_min, and -1 otherwise. But here too I got 99.92% accuracy even at an SNR of -15 dB, which is again not correct.
the cyclist on 18 May 2022
I'm not sure I can spend enough time reviewing the paper and your code to be able to answer these for you.
But what is very clear to me is that in your current code, the labels you are using are completely unrelated to the energy level features, so they will be unpredictable.
It seems possible to me that in the training set the labels are supposed to be almost perfectly predictable, but that the testing set (with different labels) will not be as predictable. That is normally what happens in machine learning problems.
I can try to take another look, but probably not for a few days.
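The training-versus-testing distinction mentioned above can be checked by holding out part of the data and scoring the model on it separately. The following is a rough sketch on synthetic data, using a nearest-class-mean rule as a stand-in for the SVM; the feature model, the 70/30 split, and all variable names are assumptions for illustration. It relies on implicit expansion, so it needs MATLAB R2016b or later.

```matlab
rng(2);                                    % fixed seed so the run is repeatable
n = 2000;
y = 2*randi([0 1], n, 1) - 1;              % labels, -1 or +1
X = [randn(n, 1) - y, randn(n, 1) + y];    % toy features whose class means differ (assumption)

% Random 70/30 split into training and testing rows.
idx = randperm(n)';
nTrain = round(0.7*n);
tr = idx(1:nTrain);
te = idx(nTrain+1:end);

% "Training": estimate one mean vector per class from the training rows only.
muNeg = mean(X(tr(y(tr) == -1), :), 1);
muPos = mean(X(tr(y(tr) ==  1), :), 1);

% Predict +1 when a row is closer to muPos than to muNeg, otherwise -1.
predictFn = @(Z) 2*(sum((Z - muPos).^2, 2) < sum((Z - muNeg).^2, 2)) - 1;

accTrain = mean(predictFn(X(tr, :)) == y(tr));   % accuracy on rows used for fitting
accTest  = mean(predictFn(X(te, :)) == y(te));   % accuracy on held-out rows
fprintf('train: %.3f, test: %.3f\n', accTrain, accTest);
```

The number to report is accTest, since accTrain is measured on the same rows that were used to fit the rule.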
chaaru datta on 18 May 2022
It's ok, sir. Thank you so much for your wholehearted support. I will keep trying to implement this paper. Sir, please do let me know once you are free so that I can discuss this further with you.
Also, it would be helpful if you could suggest some links for solving such machine learning problems.
chaaru datta on 20 June 2022
Hello sir, can you please share your insights on forming the training set as done in this paper?
