Bias Tutorial
What is Bias?
Bias is a frequently discussed topic in societal issues, but it is also very important in science and therefore in machine learning. To narrow down the definition of bias for machine learning, let’s start with a textbook societal example: five people with identical qualifications are waiting in a room for a job interview. At this moment, an unbiased interview would assign each candidate an equal probability of being hired; that is, each has a 1/5 chance. Any other distribution of probabilities indicates bias. For example, suppose the interviewer is more likely to hire a member of a societal majority, or has to obey a policy that alters the chances: any increase or reduction of a candidate’s chances that is unintended (that is, in this example, not based solely on qualification) is defined as bias towards or against that candidate, respectively. Brainome therefore defines bias as undue change of uncertainty: change of uncertainty caused by factors not intended or known to be part of the experiment.
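This definition can be made concrete with a little information theory: uncertainty can be measured as Shannon entropy, and bias shows up as an unexplained drop (or rise) in that entropy. Below is a minimal sketch of the interview example; the helper name entropy_bits is our own illustration, not part of any Brainome API:

```python
import math

def entropy_bits(probs):
    """Shannon entropy of a discrete distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Unbiased interview: 5 equally qualified candidates, 1/5 chance each.
unbiased = [0.2] * 5

# Biased interview: one candidate is favoured for reasons unrelated
# to qualification, shifting probability mass towards them.
biased = [0.40, 0.15, 0.15, 0.15, 0.15]

# The favouritism lowers the uncertainty of the outcome; this undue
# change of uncertainty is what Brainome calls bias.
undue_change = entropy_bits(unbiased) - entropy_bits(biased)
print(f"uncertainty dropped by {undue_change:.2f} bits")
```

The uniform case carries the maximum possible uncertainty (log2(5) ≈ 2.32 bits); any departure from it, intended or not, reduces that number.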
Brainome’s Bias Measurements
In the literature, class imbalances, an uneven train/validation split, or even threshold parameters are often called biases. Brainome’s biasmeter does not measure these obvious imbalances; it measures the bias that may be implicit in the model trained on the data. Class imbalances in the data are shown in the pre-training measurements, and classification imbalances are visible in the confusion matrices at the end of Brainome’s output. The bias contribution of each attribute is shown in the importance ranking. The following output illustrates these measurements:
btc titanic_train.csv -e 5 -split 90
Brainome Table Compiler v1.004-165-prod
Copyright (c) 2019-2021 Brainome, Inc. All Rights Reserved.
Command:
btc titanic_train.csv -e 5 -split 90
Cleaning…done.
Splitting into training and validation…done.
Pre-training measurements…done.
Pre-training Measurements
Data:
Input: titanic_train.csv
Target Column: Survived
Number of instances: 800
Number of attributes: 11 out of 11
Number of classes: 2
Class Balance:
died: 61.50%
survived: 38.50%
Learnability:
Best guess accuracy: 61.50%
Data Sufficiency: Maybe enough data to generalize. [yellow]
Capacity Progression: at [ 5%, 10%, 20%, 40%, 80%, 100% ]
Ideal Machine Learner: 6, 7, 8, 8, 9, 9
Expected Generalization:
Decision Tree: 1.99 bits/bit
Neural Network: 6.52 bits/bit
Random Forest: 10.13 bits/bit
Expected Accuracy:      Training    Validation
     Decision Tree:      100.00%        51.62%
    Neural Network:         ----          ----
     Random Forest:      100.00%        80.25%
Recommendations:
Warning: Data has high information density. Using effort 5 and larger ( -e 5 ) can improve results.
We recommend using Random Forest -f RF.
If predictor accuracy is insufficient, try using the option -rank to automatically select the important attributes.
Defaulting to RF model. Model can be forced with -f parameter.
Building classifier…done.
Training…done.
Compiling predictor…done.
Validating predictor…done.
Predictor: a.py
Classifier Type: Random Forest
System Type: Binary classifier
Training / Validation Split: 90% : 10%
Accuracy:
Best-guess accuracy: 61.50%
Training accuracy: 100.00% (719/719 correct)
Validation Accuracy: 85.18% (69/81 correct)
Combined Model Accuracy: 98.50% (788/800 correct)
Model Capacity (MEC): 13 bits
Generalization Ratio: 53.18 bits/bit
Percent of Data Memorized: 3.82%
Resilience to Noise: -1.74 dB
Training Confusion Matrix:
      Actual | Predicted
             |  died  survived
    -------- | ---------------
        died |   442         0
    survived |     0       277
Validation Confusion Matrix:
      Actual | Predicted
             |  died  survived
    -------- | ---------------
        died |    48         2
    survived |    10        21
Training Accuracy by Class:
  Survived |   TP   FP   TN   FN      TPR      TNR      PPV      NPV       F1       TS
  -------- | ---- ---- ---- ---- -------- -------- -------- -------- -------- --------
      died |  442    0  277    0  100.00%  100.00%  100.00%  100.00%  100.00%  100.00%
  survived |  277    0  442    0  100.00%  100.00%  100.00%  100.00%  100.00%  100.00%
Validation Accuracy by Class:
  Survived |   TP   FP   TN   FN      TPR      TNR      PPV      NPV       F1       TS
  -------- | ---- ---- ---- ---- -------- -------- -------- -------- -------- --------
      died |   48   10   21    2   96.00%   67.74%   82.76%   91.30%   88.89%   80.00%
  survived |   21    2   48   10   67.74%   96.00%   91.30%   82.76%   77.78%   63.64%
Attribute Ranking:
Feature | Relative Importance
Sex : 0.4086
Cabin_Class : 0.2060
Cabin_Number : 0.0640
Age : 0.0489
Fare : 0.0468
Parent_Children : 0.0464
Sibling_Spouse : 0.0440
PassengerId : 0.0386
Ticket_Number : 0.0381
Name : 0.0328
Port_of_Embarkation : 0.0258
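The per-class statistics in the output above follow the standard confusion-matrix definitions, so they can be recomputed from the raw counts. Here is a small sketch using the validation counts for the class died (TP=48, FP=10, TN=21, FN=2); the function class_stats is our own illustration, not part of Brainome’s output:

```python
def class_stats(tp, fp, tn, fn):
    """Standard per-class metrics from confusion-matrix counts."""
    return {
        "TPR": tp / (tp + fn),          # sensitivity / recall
        "TNR": tn / (tn + fp),          # specificity
        "PPV": tp / (tp + fp),          # precision
        "NPV": tn / (tn + fn),          # negative predictive value
        "F1":  2 * tp / (2 * tp + fp + fn),
        "TS":  tp / (tp + fp + fn),     # threat score (Jaccard index)
    }

# Validation counts for class "died": 48 deaths predicted correctly,
# 2 missed, 10 survivors wrongly predicted dead, 21 survivors correct.
stats = class_stats(tp=48, fp=10, tn=21, fn=2)
for name, value in stats.items():
    print(f"{name}: {value:.2%}")
```

Running this reproduces the died row of the validation table (96.00% TPR, 82.76% PPV, 88.89% F1, and so on).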
To get at the implicit biases induced by the data, Brainome needs to be invoked with the parameter -biasmeter. Brainome then synthesizes new samples from the data and, taking the average generalization into account, measures how uniformly distributed random samples would be classified by the generated model. With this method, if the model were bias-free, uniformly random input would produce uniformly random output. No model is ever completely bias-free, so the bias towards a class is expressed as a percentage. See below:
btc titanic_train.csv -e 5 -split 90 -biasmeter
Brainome Table Compiler v1.004-165-prod
Copyright (c) 2019-2021 Brainome, Inc. All Rights Reserved.
Command:
btc titanic_train.csv -e 5 -split 90 -biasmeter
. . .
Attribute Ranking:
Feature | Relative Importance
Sex : 0.4492
Cabin_Class : 0.1751
Cabin_Number : 0.0946
Parent_Children : 0.0551
Sibling_Spouse : 0.0463
Age : 0.0447
Ticket_Number : 0.0338
Fare : 0.0324
PassengerId : 0.0323
Name : 0.0256
Port_of_Embarkation : 0.0109
Measuring bias…done.
Model bias: 1.18% towards class died away from class survived
The bias measured above indicates a pretty well-balanced model.
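Conceptually, the measurement works by feeding uniformly random inputs to the trained model and recording how far the predicted class distribution deviates from uniform. The sketch below illustrates only that idea with a hypothetical toy model; it is not Brainome’s actual implementation, and the names measure_bias, predict, and sample_uniform are our own:

```python
import random

def measure_bias(predict, sample_uniform, classes, n=100_000, seed=0):
    """Classify uniformly random inputs and return each class's
    deviation from the uniform share 1/len(classes)."""
    rng = random.Random(seed)
    counts = {c: 0 for c in classes}
    for _ in range(n):
        counts[predict(sample_uniform(rng))] += 1
    expected = 1 / len(classes)
    return {c: counts[c] / n - expected for c in classes}

# Hypothetical stand-in model: labels inputs "died" slightly more
# often than chance, i.e. it carries a small bias towards "died".
bias = measure_bias(
    predict=lambda x: "died" if x < 0.512 else "survived",
    sample_uniform=lambda rng: rng.random(),
    classes=["died", "survived"],
)
print({c: f"{d:+.2%}" for c, d in bias.items()})
```

A bias-free model would return deviations near zero for every class; a positive deviation for one class means bias towards it and, in the two-class case, away from the other.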
Note that Brainome’s bias meter is an approximation and cannot be taken as absolute truth, since measuring bias is inherently difficult. One should be as suspicious of models that appear bias-free as of models that show a large amount of bias. Model bias can be reduced by making sure the classes are balanced, the sample size is high, and validation and training accuracy are about equal. However, if there is inherent bias in the training data, that bias will be part of the model. If the bias is unwanted, the training data, or the process generating it, needs to be corrected.
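One concrete way to act on the class-balance advice is to randomly oversample the minority class before training. The following is a sketch under our own assumptions; the helper oversample_minority is hypothetical and not a Brainome feature:

```python
import random

def oversample_minority(rows, label_of, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until
    every class has as many rows as the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        balanced.extend(rng.choice(members)
                        for _ in range(target - len(members)))
    return balanced

# Toy data mirroring the 61.50% / 38.50% class balance reported above.
rows = [("row", "died")] * 615 + [("row", "survived")] * 385
balanced = oversample_minority(rows, label_of=lambda r: r[1])
```

Oversampling equalizes the class shares without discarding data, but because it duplicates rows it can encourage memorization; undersampling the majority class or collecting more minority samples are alternatives with the opposite trade-off.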