Creating a dummy variable from a string vector

Ruben Moreno on 13 May 2022
Hi, I want to create a dummy variable from a string column containing information about investors. I have identified the names of the investors that I deem to have experience and would like to create a dummy = 1 if the cell contains the name of said investor, and 0 if not. The dataset is quite large, so I can't do this manually, and I am not really sure how to proceed.
  2 comments
Walter Roberson on 13 May 2022
When you say "contains" do you mean that you want to compare the entries for exact equality? Or should "sam" be considered to be contained in "assam" or "samuel"? Should the comparison be case sensitive? Should "Billie-Joe" be considered the same as "Billie Joe"?
Ruben Moreno on 13 May 2022
I think I solved the problem in Excel, but basically: VLOOKUP one column (investor name) against another (experienced investor) and assign the value 1 / 0.
The format of the investor names is the same throughout the sample, so exact match.


Answers (2)

Steven Lord on 13 May 2022
Use the string manipulation and/or set membership functions. Let's start with a random set of names:
rng default
D = ["Doc"; "Grumpy"; "Happy"; "Sleepy"; "Bashful"; "Sneezy"; "Dopey"];
randomDwarfs = D(randi(numel(D), [20 1]))
randomDwarfs = 20×1 string array
    "Sneezy"
    "Dopey"
    "Doc"
    "Dopey"
    "Bashful"
    "Doc"
    "Grumpy"
    "Sleepy"
    "Dopey"
    "Dopey"
    "Grumpy"
    "Dopey"
    "Dopey"
    "Sleepy"
    "Sneezy"
    "Doc"
    "Happy"
    "Dopey"
    "Sneezy"
    "Dopey"
We can use matches to look for an exact match; startsWith, endsWith, or contains to look for text inside a name; or ismember to ask for members of a group of names.
onlyGrumpy = matches(randomDwarfs, "Grumpy");
startsWithD = startsWith(randomDwarfs, "D"); % Doc, Dopey
bashfulOrSleepy = ismember(randomDwarfs, ["Bashful"; "Sleepy"]);
results = table(randomDwarfs, onlyGrumpy, startsWithD, bashfulOrSleepy)
results = 20×4 table
    randomDwarfs    onlyGrumpy    startsWithD    bashfulOrSleepy
    ____________    __________    ___________    _______________
    "Sneezy"        false         false          false
    "Dopey"         false         true           false
    "Doc"           false         true           false
    "Dopey"         false         true           false
    "Bashful"       false         false          true
    "Doc"           false         true           false
    "Grumpy"        true          false          false
    "Sleepy"        false         false          true
    "Dopey"         false         true           false
    "Dopey"         false         true           false
    "Grumpy"        true          false          false
    "Dopey"         false         true           false
    "Dopey"         false         true           false
    "Sleepy"        false         false          true
    "Sneezy"        false         false          false
    "Doc"           false         true           false
I can extract the names from randomDwarfs using those logical vectors.
BorS = randomDwarfs(bashfulOrSleepy)
BorS = 3×1 string array
    "Bashful"
    "Sleepy"
    "Sleepy"
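If the names were not formatted consistently (the case and spacing questions raised in the comments), one possible variant is to trim whitespace and ignore case before matching. This is only a sketch: the sample names and the variables names, targets, cleaned, isTarget, and dummy are made up for illustration.

```matlab
% Made-up, inconsistently formatted names (illustration only).
names   = ["  Grumpy"; "grumpy "; "Happy"; "DOC"];
targets = ["Grumpy"; "Doc"];

% strip removes leading and trailing whitespace from each string.
cleaned = strip(names);

% matches returns true where an element equals any of the targets;
% the IgnoreCase option makes the comparison case-insensitive.
isTarget = matches(cleaned, targets, "IgnoreCase", true);

% Convert the logical vector to a numeric 0/1 dummy.
dummy = double(isTarget);
```

Differences such as "Billie-Joe" versus "Billie Joe" would still need their own normalization step, which this sketch does not attempt.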

Walter Roberson on 13 May 2022
ismember(investors, experiencedinvestors)
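A minimal sketch of how that call might be applied to the question, assuming the data sit in a table with a string column and that exact, case-sensitive matching is wanted. The names T, Investor, experiencedInvestors, and ExperiencedDummy, along with the sample data, are hypothetical.

```matlab
% Hypothetical sample data; replace with the real investor column.
Investor = ["Alpha Capital"; "Beta Partners"; "Gamma Fund"; "Alpha Capital"];
T = table(Investor);

% Hypothetical list of investors deemed to have experience.
experiencedInvestors = ["Alpha Capital"; "Gamma Fund"];

% ismember returns one logical value per row (exact, case-sensitive match);
% double() converts true/false to a numeric 1/0 dummy.
T.ExperiencedDummy = double(ismember(T.Investor, experiencedInvestors));
```

Because ismember is vectorized, no loop over rows is needed, which matters for a large dataset.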

Version

R2022a
