Machine Learning is exciting and full of possibilities. The promise of business insights and prediction is appealing and addicting. We all want the crystal ball that tells us which stock to purchase or what will be the next iPhone.
Burton Malkiel, a Princeton professor, claimed in his book A Random Walk Down Wall Street that “a blindfolded monkey throwing darts at a newspaper’s financial pages could select a portfolio that would do just as well as one carefully selected by experts.” The question at hand is: are our current solutions any better than monkeys randomly throwing darts to pick a model?
This is where the word ‘TRUST’ comes into play. Can we trust a model that worked in a data scientist’s lab to be applied in the real world? The reality is that the solutions we use today are just a combination of guesses bundled in a strategy we hope is the right one.
Where is our trust in ML solutions today?
Trust in people:
We trust people based on their experience and achievements. A common mistake is to assume that someone who solved one particular problem will be able to solve any problem thrown at them. We can draw a parallel with the standard financial performance disclaimer: ‘Past performance is not a guarantee of future performance’.
Trust in standard approaches:
The same models are used repeatedly to solve problems that appear to fit in the same man-made bucket. This leads to overtrusting a solution, which can result in overfitting and in not addressing the problem at hand.
Some major Chinese cities are using ML to identify jaywalking. Using smart cameras and a facial recognition model, they aim to detect jaywalkers. Once a jaywalker is detected, their name and picture appear on a public screen stating that they illegally crossed the street.
The city of Ningbo recently experienced an embarrassing moment when its system recognized a successful and well-known businesswoman from a picture on the side of a bus and publicly shamed her for jaywalking.
Benchmarks create solutions based on anecdotal results, which cannot be translated to other problems unless there is further evidence. Benchmarks are overrated with regard to their generalizability. It is like assuming that a good SAT score means you are smart and will be successful in college.
When a Machine Learning problem is presented, the fixation on a single score gets the race going. We pick the winner. This leads to anecdotal evidence, and the big picture is rarely used to evaluate whether the winning approach actually represents a general solution. The “No Free Lunch” theorem shows mathematically that no single model works best for every problem. Yet, most often, the winning approach is then duplicated by others and becomes a standard approach for similar problems, and the person who created it becomes a trusted person in the community.
Our current trust practice for Machine Learning is why Gartner predicts that through 2022, 85% of AI projects will fail. A model is just a simplified representation of reality. The simplifications are based on assumptions which may hold in some situations but not in others. Since every problem and every data set is unique, one improvement to our current practice would be to quantify how well a model is able to generalize, in addition to how well it did on a specific task.
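To make that last idea concrete, here is a minimal sketch of what reporting generalization next to a single task score could look like. Everything in it is an illustrative assumption rather than an established method: the synthetic one-dimensional data, the toy threshold model, the chosen shifts, and the names make_dataset, fit_threshold and generalization_report are all hypothetical.

```python
import random
import statistics


def make_dataset(n, shift, rng):
    """Two 1-D Gaussian classes; `shift` moves both classes to mimic new conditions."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = (2.0 if label else -2.0) + shift
        data.append((rng.gauss(center, 1.0), label))
    return data


def fit_threshold(train):
    """Toy model: a decision threshold halfway between the two class means."""
    mean0 = statistics.mean(x for x, y in train if y == 0)
    mean1 = statistics.mean(x for x, y in train if y == 1)
    return (mean0 + mean1) / 2.0


def accuracy(threshold, data):
    """Fraction of points whose side of the threshold matches their label."""
    hits = sum((x > threshold) == bool(y) for x, y in data)
    return hits / len(data)


def generalization_report(seed=0):
    """Score the model on its own task and on several shifted variants of it."""
    rng = random.Random(seed)
    threshold = fit_threshold(make_dataset(500, 0.0, rng))

    # The usual single number: held-out data drawn like the training data.
    task_score = accuracy(threshold, make_dataset(500, 0.0, rng))

    # Assumed stand-ins for "other problems": the same task under shifted inputs.
    shifts = [0.5, 1.0, 1.5, 2.0]
    shifted_scores = [accuracy(threshold, make_dataset(500, s, rng)) for s in shifts]

    mean_shifted = statistics.mean(shifted_scores)
    return {
        "task_score": task_score,
        "mean_shifted_score": mean_shifted,
        "worst_shifted_score": min(shifted_scores),
        "generalization_gap": task_score - mean_shifted,
    }


if __name__ == "__main__":
    for name, value in generalization_report().items():
        print(f"{name}: {value:.3f}")
```

The point is not the toy model but the shape of the report: the task score alone would crown a winner, while the gap and the worst-case score hint at how much of that win survives once conditions change.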
So, should we trust what is currently offered to us?