Bias Mitigation in Credit Scoring by Reweighting

Bias mitigation is the process of removing or reducing bias in a data set or a model to make its outcomes fairer. Bias mitigation usually follows a bias detection step, in which a series of metrics is computed from a data set or from model predictions. Bias mitigation methods fall into three categories, according to where they act in the workflow: pre-processing, in-processing, and post-processing. This example demonstrates a pre-processing method to mitigate bias in a credit scoring workflow. The example uses bias detection and bias mitigation functionality from the Statistics and Machine Learning Toolbox™. For a detailed example on bias detection, see Explore Fairness Metrics for Credit Scoring Model.

The bias mitigation method in this example is reweighting, which assigns a weight to each observation in a data set so that the weighted data is balanced across the subgroups of a sensitive attribute. After reweighting, the Statistical Parity Difference (SPD) of every subgroup goes to `0` and the Disparate Impact (DI) metric becomes `1`. This example demonstrates how reweighting works in a credit scoring workflow.
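The effect of reweighting on the two metrics can be sketched with a short derivation. The notation here is an assumption introduced for illustration: $S$ is the sensitive attribute with subgroups $s_i$, $s_{\mathrm{ref}}$ is the reference subgroup, $Y$ is the binary label with positive class $1$, and $w$ is the standard reweighing factor, which is assumed (not confirmed by this example) to be what the toolbox computes.

$$
\mathrm{SPD}_i = P(Y = 1 \mid S = s_i) - P(Y = 1 \mid S = s_{\mathrm{ref}}),
\qquad
\mathrm{DI}_i = \frac{P(Y = 1 \mid S = s_i)}{P(Y = 1 \mid S = s_{\mathrm{ref}})}
$$

$$
w(s_i, y) = \frac{P(S = s_i)\,P(Y = y)}{P(S = s_i,\, Y = y)}
$$

$$
P_w(Y = 1 \mid S = s_i)
= \frac{w(s_i, 1)\,P(S = s_i,\, Y = 1)}{\sum_{y} w(s_i, y)\,P(S = s_i,\, Y = y)}
= \frac{P(S = s_i)\,P(Y = 1)}{P(S = s_i)\sum_{y} P(Y = y)}
= P(Y = 1)
$$

Under these assumptions, the weighted positive rate is the same for every subgroup, so each $\mathrm{SPD}_i$ is $0$ and each $\mathrm{DI}_i$ is $1$ on the reweighted data.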

Load the `CreditCardData` data set and discretize the `'CustAge'` predictor.

```
load CreditCardData.mat
AgeGroup = discretize(data.CustAge,[min(data.CustAge) 30 45 60 max(data.CustAge)], ...
    'categorical',{'Age < 30','30 <= Age < 45','45 <= Age < 60','Age >= 60'});
data = addvars(data,AgeGroup,'After','CustAge');
head(data)
```
```
    CustID  CustAge  AgeGroup        TmAtAddress  ResStatus   EmpStatus  CustIncome  TmWBank  OtherCC  AMBalance  UtilRate  status
    ______  _______  ______________  ___________  __________  _________  __________  _______  _______  _________  ________  ______

    1       53       45 <= Age < 60  62           Tenant      Unknown    50000       55       Yes      1055.9     0.22      0
    2       61       Age >= 60       22           Home Owner  Employed   52000       25       Yes      1161.6     0.24      0
    3       47       45 <= Age < 60  30           Tenant      Employed   37000       61       No       877.23     0.29      0
    4       50       45 <= Age < 60  75           Home Owner  Employed   53000       20       Yes      157.37     0.08      0
    5       68       Age >= 60       56           Home Owner  Employed   53000       14       Yes      561.84     0.11      0
    6       65       Age >= 60       13           Home Owner  Employed   48000       59       Yes      968.18     0.15      0
    7       34       30 <= Age < 45  32           Home Owner  Unknown    32000       26       Yes      717.82     0.02      1
    8       50       45 <= Age < 60  57           Other       Employed   51000       33       No       3041.2     0.13      0
```

Split the data set into training and testing data. Use the training data to fit the model and the testing data to predict from the model.

```
rng('default');
c = cvpartition(size(data,1),'HoldOut',0.3);
data_Train = data(c.training(),:);
data_Test = data(c.test(),:);
```

Compute Fairness Metrics at Predictor and Model Level

Compute the fairness metrics for the training data by creating a `fairnessMetrics` object and then generating a metrics report using `report`. Because you are working only with data and there is no fitted model, only two bias metrics are computed: `StatisticalParityDifference` and `DisparateImpact`. The two group metrics computed are `GroupCount` and `GroupSizeRatio`. The fairness metrics are computed for two sensitive attributes, age (`'AgeGroup'`) and residential status (`'ResStatus'`).

```
trainingDataMetrics = fairnessMetrics(data_Train, 'status', ...
    'SensitiveAttributeNames',{'AgeGroup', 'ResStatus'});
tdmReport = report(trainingDataMetrics)
```
```
tdmReport=7×4 table
    SensitiveAttributeNames    Groups            StatisticalParityDifference    DisparateImpact
    _______________________    ______________    ___________________________    _______________

    AgeGroup                   Age < 30           0.039827                      1.1357
    AgeGroup                   30 <= Age < 45     0.096324                      1.3282
    AgeGroup                   45 <= Age < 60     0                             1
    AgeGroup                   Age >= 60         -0.19181                       0.34648
    ResStatus                  Home Owner         0                             1
    ResStatus                  Tenant             0.01689                       1.0529
    ResStatus                  Other             -0.02108                       0.93404
```
```
figure
tiledlayout(2,1)
nexttile
plot(trainingDataMetrics,'spd')
nexttile
plot(trainingDataMetrics,'di')
```

Looking at the `DisparateImpact` bias metric for both `AgeGroup` and `ResStatus`, you can see that the values are spread much farther from `1` across the `AgeGroup` subgroups (0.34648 to 1.3282) than across the `ResStatus` subgroups (0.93404 to 1.0529). This suggests that customers are treated more unfairly with respect to their age than with respect to their residential status. This example focuses on the `AgeGroup` predictor and attempts to reduce bias among its subgroups.
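As a rough illustration of what these two metrics measure, the following sketch computes them by hand on a small made-up data set. It assumes that the positive class is `1`, that the reference group is the group with the most observations, and that SPD is the group rate minus the reference rate while DI is their ratio. The variable names (`grp`, `y`, `rate`, `toyReport`) and the toy data are hypothetical and are not part of the toolbox.

```matlab
% Toy data (hypothetical): a sensitive attribute and a binary label.
grp = categorical(["A";"A";"A";"A";"B";"B";"B";"B";"B";"B"]);
y   = [1;0;0;0;1;1;1;0;0;0];

% Rate of the positive class (y == 1) within each group.
[G, names] = findgroups(grp);
rate = splitapply(@mean, y, G);        % A: 0.25, B: 0.50

% Assumption: the reference group is the one with the most observations.
[~, ref] = max(splitapply(@numel, y, G));

spd = rate - rate(ref);                % difference from the reference rate
di  = rate ./ rate(ref);               % ratio to the reference rate
toyReport = table(names, rate, spd, di)
```

To try the same calculation on the training data, you could replace `grp` and `y` with `data_Train.AgeGroup` and `data_Train.status`; if the assumptions above hold, the result should be comparable to the `AgeGroup` rows of `tdmReport`.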

To begin, fit a credit scoring model and compute the model-level bias metrics. This provides a baseline for comparison.

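Because `CustAge` and `AgeGroup` carry essentially the same information and age is the sensitive attribute, exclude both from the model. Use `'status'` as the response variable and `'CustID'` as the ID variable. The exclusion list also names `'FairWeights'`, a variable that is added to the training data later in this example; listing a variable that does not exist yet has no effect on `setdiff`.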

```
PredictorVars = setdiff(data_Train.Properties.VariableNames, ...
    {'CustAge','AgeGroup','CustID','FairWeights','status'});
sc1 = creditscorecard(data_Train,'IDVar','CustID', ...
    'PredictorVars',PredictorVars,'ResponseVar','status');
sc1 = autobinning(sc1);
sc1 = fitmodel(sc1,'VariableSelection','fullmodel');
```
```
Generalized linear regression model:
    status ~ [Linear formula with 9 terms in 8 predictors]
    Distribution = Binomial

Estimated Coefficients:
                   Estimate     SE          tStat       pValue
                   ________     ________    ________    __________

    (Intercept)     0.73924     0.077237     9.5711     1.058e-21
    TmAtAddress     1.2577      0.99118      1.2689     0.20448
    ResStatus       1.755       1.295        1.3552     0.17535
    EmpStatus       0.88652     0.32232      2.7504     0.0059516
    CustIncome      0.95991     0.19645      4.8862     1.0281e-06
    TmWBank         1.132       0.3157       3.5856     0.00033637
    OtherCC         0.85227     2.1198       0.40204    0.68765
    AMBalance       1.0773      0.31969      3.3698     0.00075232
    UtilRate       -0.19784     0.59565     -0.33214    0.73978

840 observations, 831 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 66.5, p-value = 2.44e-11
```
`pointsinfo1 = displaypoints(sc1)`
```
pointsinfo1=38×3 table
    Predictors         Bin                  Points
    _______________    _________________    _________

    {'TmAtAddress'}    {'[-Inf,9)'     }    -0.17538
    {'TmAtAddress'}    {'[9,16)'       }     0.05434
    {'TmAtAddress'}    {'[16,23)'      }     0.096897
    {'TmAtAddress'}    {'[23,Inf]'     }     0.13984
    {'TmAtAddress'}    {'<missing>'    }     NaN
    {'ResStatus'  }    {'Tenant'       }    -0.017688
    {'ResStatus'  }    {'Home Owner'   }     0.11681
    {'ResStatus'  }    {'Other'        }     0.29011
    {'ResStatus'  }    {'<missing>'    }     NaN
    {'EmpStatus'  }    {'Unknown'      }    -0.097582
    {'EmpStatus'  }    {'Employed'     }     0.33162
    {'EmpStatus'  }    {'<missing>'    }     NaN
    {'CustIncome' }    {'[-Inf,30000)' }    -0.61962
    {'CustIncome' }    {'[30000,36000)'}    -0.10695
    {'CustIncome' }    {'[36000,40000)'}     0.0010845
    {'CustIncome' }    {'[40000,42000)'}     0.065532
      ⋮
```
`pd1 = probdefault(sc1,data_Test);`

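Set the `threshold` value that controls the allocation of "goods" and "bads." A customer whose predicted probability of default is greater than the threshold is labeled as a "bad" (`1`); all other customers are labeled as "goods" (`0`).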

```
threshold = 0.35;
predictions1 = double(pd1>threshold);
```
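The following minimal sketch shows how the thresholding step behaves on a few made-up probabilities of default. The names `pdToy`, `thresholdToy`, `predToy`, `candidates`, and `badRate` are hypothetical and the numbers are not taken from the data set; the sketch only illustrates the strict inequality and how the share of predicted "bads" changes with the threshold.

```matlab
% Hypothetical probabilities of default for five customers.
pdToy = [0.10; 0.35; 0.36; 0.50; 0.90];
thresholdToy = 0.35;

% Strict inequality: a PD exactly equal to the threshold is labeled 0 ("good").
predToy = double(pdToy > thresholdToy)    % [0; 0; 1; 1; 1]

% Fraction of customers labeled "bad" at a few candidate thresholds.
% Implicit expansion compares the 5-by-1 PDs with the 1-by-3 thresholds.
candidates = [0.25 0.35 0.50];
badRate = mean(pdToy > candidates, 1)     % [0.8 0.6 0.2]
```

A lower threshold labels more customers as "bads," so the choice of threshold affects the model-level fairness metrics as well as accuracy.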

Create a `fairnessMetrics` object to compute fairness metrics at the model level and then generate a metrics report using `report`.

```
modelMetrics1 = fairnessMetrics(data_Test, 'status', ...
    'Predictions', predictions1, 'SensitiveAttributeNames','AgeGroup');
mmReport1 = report(modelMetrics1)
```
```
mmReport1=4×6 table
    SensitiveAttributeNames    Groups            StatisticalParityDifference    DisparateImpact    EqualOpportunityDifference    AverageAbsoluteOddsDifference
    _______________________    ______________    ___________________________    _______________    __________________________    _____________________________

    AgeGroup                   Age < 30           0.54312                       2.6945              0.47391                      0.5362
    AgeGroup                   30 <= Age < 45     0.19922                       1.6216              0.35645                      0.22138
    AgeGroup                   45 <= Age < 60     0                             1                   0                            0
    AgeGroup                   Age >= 60         -0.15385                       0.52               -0.18323                      0.16375
```

Measure the accuracy of the model using `validatemodel`.

`validatemodel(sc1)`
```
ans=4×2 table
    Measure                     Value
    ________________________    _______

    {'Accuracy Ratio'      }    0.33751
    {'Area under ROC curve'}    0.66876
    {'KS statistic'        }    0.26418
    {'KS score'            }    1.0403
```
```
figure
tiledlayout(2,1)
nexttile
plot(modelMetrics1,'spd')
nexttile
plot(modelMetrics1,'di')
```

Reweight Data at Predictor and Model Level

Use `fairnessWeights` to reweight the training data to remove bias for the sensitive attribute `'AgeGroup'`.

```
fairWeights = fairnessWeights(data_Train, 'AgeGroup', 'status');
data_Train.FairWeights = fairWeights;
head(data_Train)
```
```
    CustID  CustAge  AgeGroup        TmAtAddress  ResStatus   EmpStatus  CustIncome  TmWBank  OtherCC  AMBalance  UtilRate  status  FairWeights
    ______  _______  ______________  ___________  __________  _________  __________  _______  _______  _________  ________  ______  ___________

    1       53       45 <= Age < 60  62           Tenant      Unknown    50000       55       Yes      1055.9     0.22      0       0.95879
    2       61       Age >= 60       22           Home Owner  Employed   52000       25       Yes      1161.6     0.24      0       0.75407
    3       47       45 <= Age < 60  30           Tenant      Employed   37000       61       No       877.23     0.29      0       0.95879
    4       50       45 <= Age < 60  75           Home Owner  Employed   53000       20       Yes      157.37     0.08      0       0.95879
    7       34       30 <= Age < 45  32           Home Owner  Unknown    32000       26       Yes      717.82     0.02      1       0.82759
    8       50       45 <= Age < 60  57           Other       Employed   51000       33       No       3041.2     0.13      0       0.95879
    9       50       45 <= Age < 60  10           Tenant      Unknown    52000       25       Yes      115.56     0.02      1       1.0992
    10      49       45 <= Age < 60  30           Home Owner  Unknown    53000       23       Yes      718.5      0.17      1       1.0992
```
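Observations that share the same age group and the same `status` value receive the same weight in the table above. The following sketch reproduces that behavior by hand on a toy data set, using the standard reweighing rule (group probability times label probability, divided by their joint probability). Treat that rule as an assumption about what `fairnessWeights` computes, not a statement of its implementation; the names `grp`, `y`, `w`, `rateA`, and `rateB` and the toy data are hypothetical.

```matlab
% Toy data (hypothetical): a sensitive attribute and a binary label.
grp = categorical(["A";"A";"A";"A";"B";"B";"B";"B";"B";"B"]);
y   = [1;0;0;0;1;1;1;0;0;0];

w = zeros(numel(y),1);
groups = categories(grp);
labels = unique(y);
for i = 1:numel(groups)
    for j = 1:numel(labels)
        idx = (grp == groups{i}) & (y == labels(j));
        pG  = mean(grp == groups{i});   % P(group)
        pY  = mean(y == labels(j));     % P(label)
        pGY = mean(idx);                % P(group and label)
        w(idx) = pG*pY/pGY;             % assumed reweighing factor
    end
end

% Weighted rate of the positive class (y == 1) in each group.
isA = grp == "A";
rateA = sum(w(isA & y == 1)) / sum(w(isA));
rateB = sum(w(~isA & y == 1)) / sum(w(~isA));

spdToy = rateA - rateB    % approximately 0
diToy  = rateA / rateB    % approximately 1
```

In this toy data the unweighted positive rates are 0.25 for group A and 0.5 for group B. After weighting, both rates equal the overall positive rate of 0.4, so the SPD is zero up to round-off and the DI is 1.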

Use `fairnessMetrics` to compute fairness metrics for the training data after reweighting and use `report` to generate a fairness metrics report.

```
trainingDataMetrics_AfterReweighting = fairnessMetrics(data_Train, 'status', ...
    'SensitiveAttributeNames','AgeGroup','Weights','FairWeights');
tdmrReport = report(trainingDataMetrics_AfterReweighting)
```
```
tdmrReport=4×4 table
    SensitiveAttributeNames    Groups            StatisticalParityDifference    DisparateImpact
    _______________________    ______________    ___________________________    _______________

    AgeGroup                   Age < 30          -2.9976e-15                    1
    AgeGroup                   30 <= Age < 45    -5.5511e-16                    1
    AgeGroup                   45 <= Age < 60     0                             1
    AgeGroup                   Age >= 60         -2.9421e-15                    1
```

By applying the reweighting algorithm to the `AgeGroup` predictor, you completely remove the disparate impact for `AgeGroup` in the training data: the SPD values are zero up to floating-point round-off and every DI value is `1`. You can then use this debiased data to fit a model that produces predictions with an overall reduced disparate impact at the model level.

Use `creditscorecard` to fit a new credit scoring model with the new fair weights and compute model-level bias metrics.

```
sc2 = creditscorecard(data_Train,'IDVar','CustID', ...
    'PredictorVars',PredictorVars,'WeightsVar','FairWeights','ResponseVar','status');
sc2 = autobinning(sc2);
sc2 = fitmodel(sc2,'VariableSelection','fullmodel');
```
```
Generalized linear regression model:
    status ~ [Linear formula with 9 terms in 8 predictors]
    Distribution = Binomial

Estimated Coefficients:
                   Estimate     SE          tStat       pValue
                   ________     ________    ________    __________

    (Intercept)     0.74055     0.076222     9.7158     2.5817e-22
    TmAtAddress     1.3416      0.9108       1.473      0.14075
    ResStatus       2.0467      1.7669       1.1584     0.24672
    EmpStatus       0.91879     0.32197      2.8536     0.0043222
    CustIncome      0.91038     0.33216      2.7407     0.00613
    TmWBank         1.1067      0.30826      3.5901     0.0003305
    OtherCC         0.42264     3.5078       0.12049    0.9041
    AMBalance       1.1347      0.3447       3.2919     0.00099504
    UtilRate       -0.39861     0.77284     -0.51577    0.60601

840 observations, 831 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 46.6, p-value = 1.85e-07
```
`pointsinfo2 = displaypoints(sc2)`
```
pointsinfo2=34×3 table
    Predictors         Bin                  Points
    _______________    _________________    ________

    {'TmAtAddress'}    {'[-Inf,9)'     }    -0.21328
    {'TmAtAddress'}    {'[9,23)'       }     0.07168
    {'TmAtAddress'}    {'[23,Inf]'     }     0.14763
    {'TmAtAddress'}    {'<missing>'    }     NaN
    {'ResStatus'  }    {'Tenant'       }     0.016048
    {'ResStatus'  }    {'Home Owner'   }     0.091092
    {'ResStatus'  }    {'Other'        }     0.28326
    {'ResStatus'  }    {'<missing>'    }     NaN
    {'EmpStatus'  }    {'Unknown'      }    -0.10352
    {'EmpStatus'  }    {'Employed'     }     0.33653
    {'EmpStatus'  }    {'<missing>'    }     NaN
    {'CustIncome' }    {'[-Inf,30000)' }    -0.37618
    {'CustIncome' }    {'[30000,40000)'}     0.047483
    {'CustIncome' }    {'[40000,42000)'}     0.10244
    {'CustIncome' }    {'[42000,47000)'}     0.14652
    {'CustIncome' }    {'[47000,Inf]'  }     0.40015
      ⋮
```
```
pd2 = probdefault(sc2,data_Test);
predictions2 = double(pd2>threshold);
```

Use `fairnessMetrics` to compute fairness metrics at the model level and `report` to generate a fairness metrics report.

```
modelMetrics2 = fairnessMetrics(data_Test, 'status', ...
    'Predictions', predictions2, 'SensitiveAttributeNames','AgeGroup');
mmReport2 = report(modelMetrics2)
```
```
mmReport2=4×6 table
    SensitiveAttributeNames    Groups            StatisticalParityDifference    DisparateImpact    EqualOpportunityDifference    AverageAbsoluteOddsDifference
    _______________________    ______________    ___________________________    _______________    __________________________    _____________________________

    AgeGroup                   Age < 30           0.39394                       2.1818              0.37391                      0.39377
    AgeGroup                   30 <= Age < 45     0.094298                      1.2829              0.22947                      0.11509
    AgeGroup                   45 <= Age < 60     0                             1                   0                            0
    AgeGroup                   Age >= 60         -0.13333                       0.6                -0.18323                      0.1511
```

Measure the accuracy of the model using `validatemodel`.

`validatemodel(sc2)`
```
ans=4×2 table
    Measure                     Value
    ________________________    _______

    {'Accuracy Ratio'      }    0.27735
    {'Area under ROC curve'}    0.63868
    {'KS statistic'        }    0.22702
    {'KS score'            }    0.90741
```
```
figure
tiledlayout(2,1)
nexttile
plot(modelMetrics2,'spd')
nexttile
plot(modelMetrics2,'di')
```

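Reweighting removed the bias measured by SPD and DI from the training data. When you use the reweighted data to fit a model, the overall bias in the model is reduced compared to a model trained on the original, biased data. For example, the disparate impact for the `Age < 30` group drops from 2.6945 to 2.1818, although it does not reach `1`. This reduction in bias comes with a drop in model accuracy: the area under the ROC curve falls from 0.66876 to 0.63868. You can choose to make this tradeoff to improve fairness.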
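To see the tradeoff in one place, the following sketch puts the model-level metrics from the two reports side by side. The numbers are copied from the `mmReport1`, `mmReport2`, and `validatemodel` outputs shown in this example; the variable names (`comparison`, `allImproved`, `aurocDrop`, and so on) are hypothetical.

```matlab
% Values copied from mmReport1, mmReport2, and the validatemodel outputs.
groups    = ["Age < 30"; "30 <= Age < 45"; "45 <= Age < 60"; "Age >= 60"];
spdBefore = [0.54312; 0.19922; 0; -0.15385];
spdAfter  = [0.39394; 0.094298; 0; -0.13333];
diBefore  = [2.6945; 1.6216; 1; 0.52];
diAfter   = [2.1818; 1.2829; 1; 0.6];

comparison = table(groups, spdBefore, spdAfter, diBefore, diAfter)

% Distance of each metric from its ideal value (0 for SPD, 1 for DI).
spdImproved = abs(spdAfter) <= abs(spdBefore);
diImproved  = abs(diAfter - 1) <= abs(diBefore - 1);
allImproved = all(spdImproved & diImproved)   % true

% Accuracy cost, measured by the area under the ROC curve.
aurocBefore = 0.66876;
aurocAfter  = 0.63868;
aurocDrop   = aurocBefore - aurocAfter
```

In a live session, you could read the same columns directly from `mmReport1` and `mmReport2` instead of typing the values.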
