Arrange words and phrases separated by semicolons into a single column

Tejas on 29 Sep 2020
I am analyzing some journal articles using their keywords. I extracted the keywords from a publication, and they are saved in the form shown in the figure.
Keywords and key phrases are separated by a semicolon in each cell. Since my smallest time interval is one month, I'd like to arrange all the keywords and phrases in columns corresponding to each month of the year. For example, all the words and phrases in the image would go under a single column 'JAN 2001'. My ultimate goal is to do a frequency analysis of these keywords, and I want 'galaxies' to be counted separately from 'iras galaxies' or 'elliptic galaxies'. Repeated keywords within a month should be allowed as well, since repetition just shows how trendy that concept is. How can I split the strings at the semicolons and arrange them by month? Thank you!

Accepted Answer

Steve Eddins on 29 Sep 2020
% Simulated data covering two months.
Phrases = ["EARLY-TYPE GALAXIES; X-RAY; DENSE CLUSTERS"
"LOCAL CONVERGENCE DEPTH; TULLY_FISHER OBSERVATIONS; X_RAY"
"SEYFERT-GALAXIES; PERIODICITY; ASSOCIATIONS"
"LYMAN-LIMIT ABSORPTION; LUMINOSITY FUNCTION"];
Month = ["JAN" "JAN" "FEB" "FEB"]';
Year = [2001 2001 2001 2001]';
t = table(Phrases,Month,Year);
% Group all the phrase sets by month and year.
t2 = varfun(@(x) {x(:)},t,'GroupingVariables',["Month" "Year"]);
% Grab the grouped phrase sets from t2 as a cell array, one cell per
% month/year.
c = t2.Fun_Phrases;
% Join the individual phrase sets by a semicolon. Use UniformOutput = false
% to keep it in a cell array.
c2 = cellfun(@(x) join(x,";"),c,"UniformOutput",false);
% Now split by semicolon and remove leading and trailing blanks.
c3 = cellfun(@(x) strtrim(split(x,";")),c2,'UniformOutput',false);
% Put back in a table.
t3 = t2(:,["Month" "Year"]);
t3.Phrases = c3;
At this point, t3 has one row per month/year pair, and each cell of t3.Phrases holds that month's keywords as a single string column.
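From here, the frequency analysis mentioned in the question could be sketched as follows. This is only one possible approach, not part of the original answer: the small t3 below is a made-up stand-in with the same layout, and the names janWords, keywords, counts and freq are illustrative. It counts exact matches with unique and accumarray, so a keyword such as 'GALAXIES' would stay separate from 'EARLY-TYPE GALAXIES', and repeats within a month are counted.

```matlab
% Simulated stand-in for t3: one row per month/year, with that month's
% keywords stored as a string column inside the Phrases cell.
Month = ["FEB"; "JAN"];
Year = [2001; 2001];
Phrases = {["SEYFERT-GALAXIES"; "PERIODICITY"; "X-RAY"]
           ["EARLY-TYPE GALAXIES"; "X-RAY"; "DENSE CLUSTERS"; "X-RAY"]};
t3 = table(Month, Year, Phrases);

% Pull out the keyword column for one month.
janWords = t3.Phrases{t3.Month == "JAN" & t3.Year == 2001};

% Count exact matches only. idx maps each keyword to its row in the
% sorted list of unique keywords; accumarray tallies the repeats.
[keywords, ~, idx] = unique(janWords);
counts = accumarray(idx, 1);

% Most frequent keywords first.
freq = sortrows(table(keywords, counts), 'counts', 'descend');
disp(freq)
```

The same three lines (unique, accumarray, sortrows) could be wrapped in a cellfun call over t3.Phrases to get one frequency table per month.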
  3 Comments
Image Analyst on 1 Oct 2020
Try findgroups().
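A rough sketch of how the findgroups() suggestion might look, paired with splitapply. The comment itself gives no code, so this is an assumed reading of it; the data is simulated and the names G, gMonth, gYear, splitGroup and gPhrases are illustrative.

```matlab
% Simulated data covering two months.
Phrases = ["EARLY-TYPE GALAXIES; X-RAY; DENSE CLUSTERS"
           "LOCAL CONVERGENCE DEPTH; X-RAY"
           "SEYFERT-GALAXIES; PERIODICITY"];
Month = ["JAN"; "JAN"; "FEB"];
Year = [2001; 2001; 2001];

% G assigns each row a group number; the extra outputs hold one
% Month/Year value per group.
[G, gMonth, gYear] = findgroups(Month, Year);

% For each group: join its rows with ";", split on ";", and trim blanks.
% The result is wrapped in a cell because groups differ in length.
splitGroup = @(p) {strtrim(split(join(p, ";"), ";"))};
gPhrases = splitapply(splitGroup, Phrases, G);

% Put the grouped results in a table, one row per month/year.
t3 = table(gMonth, gYear, gPhrases, ...
    'VariableNames', {'Month', 'Year', 'Phrases'});
```

This gives the same layout as the varfun approach in the accepted answer, without building the intermediate grouped table.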
Walter Roberson on 1 Oct 2020
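% Build a datetime from month and year so rows sort chronologically
% rather than alphabetically by month name.
t3.Datetime = datetime(t3.Month + " " + t3.Year, 'InputFormat', 'MMM uuuu');
% Tables are sorted by variable with sortrows, not sort.
sortrows(t3, 'Datetime')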

Version

R2020a
