How can I compute the amino acid composition of my protein sequence data?

How can I compute the amino acid composition of my protein sequences so that I can use it to train my SVM classifier?
For example, suppose one of my sequence samples is:
'AEYDDSLIDEEEDDEDLDEFKPIVQYDNFQDEENIGIYKELEDLIEKNE'

Accepted Answer

Tommy on 23 Apr 2020
Edited: Tommy on 23 Apr 2020
% The 20 standard amino acids, sorted so they can serve as bin edges for histc
allAA = sort('ARNDCQEGHILKMFPSTWYV');
seq = 'AEYDDSLIDEEEDDEDLDEFKPIVQYDNFQDEENIGIYKELEDLIEKNE';
counts = histc(seq, allAA);   % number of occurrences of each residue
freq = counts/numel(seq);     % fractional composition of the sequence
for aa = allAA
    % 100*freq converts the fraction to a percentage for display
    fprintf('%c: %d/%d (%.4f%%)\n', aa, counts(allAA==aa), numel(seq), 100*freq(allAA==aa));
end
%{
prints:
A: 1/49 (2.0408%)
C: 0/49 (0.0000%)
D: 10/49 (20.4082%)
E: 12/49 (24.4898%)
F: 2/49 (4.0816%)
G: 1/49 (2.0408%)
H: 0/49 (0.0000%)
I: 5/49 (10.2041%)
K: 3/49 (6.1224%)
L: 4/49 (8.1633%)
M: 0/49 (0.0000%)
N: 3/49 (6.1224%)
P: 1/49 (2.0408%)
Q: 2/49 (4.0816%)
R: 0/49 (0.0000%)
S: 1/49 (2.0408%)
T: 0/49 (0.0000%)
V: 1/49 (2.0408%)
W: 0/49 (0.0000%)
Y: 3/49 (6.1224%)
%}
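
Since the goal is to train an SVM, it may help to compute the composition for many sequences at once and collect the results in a feature matrix. The following is only a sketch under some assumptions: the cell array seqs and its second sequence are made-up example data, and X is a hypothetical variable name. It counts by exact character match instead of histc bins, so non-standard codes such as B, X or Z are ignored and not added to a neighbouring letter's bin, and each row is normalized by the number of standard residues found.

```matlab
% Hypothetical example data: replace with your own sequences.
seqs = {'AEYDDSLIDEEEDDEDLDEFKPIVQYDNFQDEENIGIYKELEDLIEKNE', ...
        'MKTAYIAKQR'};
allAA = 'ACDEFGHIKLMNPQRSTVWY';          % the 20 standard residues, alphabetical

X = zeros(numel(seqs), numel(allAA));    % one row per sequence, one column per residue
for k = 1:numel(seqs)
    s = upper(seqs{k});
    % Compare every residue (column vector) with every letter of allAA (row vector).
    % bsxfun keeps this working on releases without implicit expansion.
    counts = sum(bsxfun(@eq, s(:), allAA), 1);
    % Normalize by the number of standard residues counted; max(...,1) avoids 0/0
    X(k, :) = counts / max(sum(counts), 1);
end
```

Each row of X is then a 20-element composition vector. If you have the Statistics and Machine Learning Toolbox, a matrix like this together with a vector of class labels (which you would need to supply) is the kind of input that fitcsvm accepts.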

More Answers (1)

If you have the Bioinformatics Toolbox, there's also the aacount function: https://www.mathworks.com/help/bioinfo/ref/aacount.html
seq = 'AEYDDSLIDEEEDDEDLDEFKPIVQYDNFQDEENIGIYKELEDLIEKNE';
counts = aacount(seq)
% Optional: plotting included
aacount(seq, 'chart', 'bar')
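
aacount returns its counts as a structure with one field per residue, so for use as classifier input you may want a plain numeric vector. A minimal sketch, assuming the Bioinformatics Toolbox is installed; the variable names names, vals and freq are just illustrative, and the column order is whatever field order aacount reports:

```matlab
seq = 'AEYDDSLIDEEEDDEDLDEFKPIVQYDNFQDEENIGIYKELEDLIEKNE';
s = aacount(seq);                     % structure of counts (Bioinformatics Toolbox)
names = fieldnames(s);                % residue letters, in the order aacount reports them
vals = cell2mat(struct2cell(s)).';    % counts as a 1-by-N numeric row vector
freq = vals / sum(vals);              % fractional composition
```

Keeping names alongside freq records which column corresponds to which residue.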
