Creating a variable problem
I am trying to create an (instrumental) variable for my linear regression.
The variable is intended to be: number of generic drug products not offered by firm i. That is, I want to count all the generic products sold by all firms, except the firm's "own" ones.
The key variables are:
firm: A list of 44 firms numbered from 1 to 44
indicator: A dummy variable that takes the value 0 if the drug is generic and 1 if it is branded.
productid: The unique identifier of each product in my dataset.
The thing is that my dataset is panel data, so I want to count only the first instance of a generic for each firm and productid. Ideally, I would iterate over each productid for each firm, take the first generic observation for each firm/productid combination, and sum those to get a count per firm. Once I have that count, I just have to take the total number of generics in my dataset (82) and subtract each firm's count from it. This is what I have tried so far:
% Iterate over each firm
uniqueFirms = unique(m.firm);
for i = 1:length(uniqueFirms)
    firm = uniqueFirms(i);
    % Get unique product IDs for the current firm
    firmProductIDs = unique(m.productid(m.firm == firm));
    % Iterate over each productid for the firm
    for j = 1:length(firmProductIDs)
        pid = firmProductIDs(j);
        % Find the first generic product for the current productid within the firm
        firstGenericIndex = find(m.firm == firm & m.productid == pid & m.indicator == 0, 1, 'first');
        if ~isempty(firstGenericIndex)
            m.first_generic_by_firm(firstGenericIndex) = 1;
        end
    end
end
% Total number of generics in the dataset
totalGenerics = 82;
% Initialize a column to store the count of generics not offered by each firm
m.generics_not_offered_by_firm = zeros(height(m), 1);
% Iterate over each firm to perform the subtraction
for i = 1:length(uniqueFirms)
    firm = uniqueFirms(i);
    % Count the first instances of generics for the firm
    countGenericsByFirm = sum(m.first_generic_by_firm(m.firm == firm));
    % Subtract from total and assign to the relevant rows
    m.generics_not_offered_by_firm(m.firm == firm) = totalGenerics - countGenericsByFirm;
end
The final result is just a vector of zeros in the variable
m.generics_not_offered_by_firm
Also the variable
firstGenericIndex
only stores a vector of zeros.
Could anyone help me with this? Maybe you can propose another approach. If you need further information, just let me know.
Thanks,
Alejandro.
Accepted Answer
Shivam
on 14 Jan 2024
Hi,
Based on the information provided, I understand that you want to calculate the number of generic drug products not offered by firm i. This involves identifying the first generic observation for each distinct firm/productid combination in the data, counting those per firm, and subtracting each firm's count from the overall number of generics.
You can use the following approach:
% Sort the table by firm, productid, and then by indicator to ensure generics come first
m = sortrows(m, {'firm', 'productid', 'indicator'});
% Row numbers (in m) of the generic observations (indicator == 0)
genericRows = find(m.indicator == 0);
% First occurrence of each unique firm/productid combination among the generics
% (ia indexes into the generic subset, not into m)
[~, ia] = unique(m(genericRows, {'firm', 'productid'}), 'rows', 'stable');
% Create a logical index into m for the first instance of each unique combination
firstGenericIndex = false(height(m), 1);
firstGenericIndex(genericRows(ia)) = true;
% Use accumarray to count the first generics for each firm; the size argument keeps
% an entry for firms without generics (assumes firm holds positive integer IDs)
uniqueFirms = unique(m.firm);
countGenericsByFirm = accumarray(m.firm(firstGenericIndex), 1, [max(uniqueFirms) 1]);
% Total number of generics in the dataset
totalGenerics = 82;
% Initialize a column to store the count of generics not offered by each firm
m.generics_not_offered_by_firm = zeros(height(m), 1);
% Use countGenericsByFirm to fill in generics_not_offered_by_firm
for i = 1:numel(uniqueFirms)
    firm = uniqueFirms(i);
    m.generics_not_offered_by_firm(m.firm == firm) = totalGenerics - countGenericsByFirm(firm);
end
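The same count can probably be obtained without the loop. Below is a minimal sketch on a made-up toy table (the data values are hypothetical, not from your dataset). It assumes that firm holds positive integer IDs, that indicator is numeric with 0 meaning generic, and that productid identifies a product uniquely across firms. Here the total number of generics is derived from the data instead of being hard-coded to 82.

```matlab
% Hypothetical toy panel: 3 firms, some products observed in several periods
firm      = [1; 1; 1; 1; 2; 2; 3; 3; 3];
productid = [10; 10; 11; 12; 20; 20; 30; 31; 31];
indicator = [0; 0; 1; 0; 0; 0; 1; 1; 0];   % 0 = generic, 1 = branded
m = table(firm, productid, indicator);

% Keep one row per generic firm/productid pair (drops repeated panel observations)
generics = unique(m(m.indicator == 0, {'firm', 'productid'}));

% Number of distinct generic products per firm, with zeros for firms without generics
nFirms = max(m.firm);
countByFirm = accumarray(generics.firm, 1, [nFirms 1]);

% Total number of distinct generic products in the data
totalGenerics = numel(unique(generics.productid));

% Generics not offered by each row's firm: index the per-firm counts by firm ID
m.generics_not_offered_by_firm = totalGenerics - countByFirm(m.firm);
```

Indexing countByFirm with m.firm assigns every row its firm's value in one step, so no per-firm loop is needed. If your firm IDs are not consecutive positive integers, findgroups could be used to map them to group numbers first.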
I hope it helps.
Thanks
2 Comments
Shivam
on 17 Jan 2024
Hey,
Can you attach your files so that I can debug the issue? I tried it with dummy data and it worked.
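In the meantime, a few quick checks might narrow things down. The sketch below runs on a hypothetical toy table standing in for your data. It only assumes the column names from your post, and the variables indicatorClass, nGenericRows and indicatorValues are names introduced here for illustration. If indicator turned out not to be numeric or logical, or if no row satisfied indicator == 0, that could explain a result that is all zeros.

```matlab
% Hypothetical toy table standing in for the real data
m = table([1; 1; 2], [10; 10; 20], [0; 1; 0], ...
    'VariableNames', {'firm', 'productid', 'indicator'});

% Storage type of indicator; the comparison m.indicator == 0 expects numeric or logical
indicatorClass = class(m.indicator);

% Number of rows flagged as generic; 0 here would mean no generic is ever found
nGenericRows = nnz(m.indicator == 0);

% Distinct values actually present; only 0 and 1 are expected
indicatorValues = unique(m.indicator);

fprintf('indicator class: %s, generic rows: %d\n', indicatorClass, nGenericRows);
disp(indicatorValues);
```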