Artificial intelligence (AI) has been adopted by many industries as an important method for answering business questions. This powerful decision-making technique is extremely valuable to the healthcare industry. Responsible AI is a crucial, and easily overlooked, aspect of using machine learning in an organization.
Artificial intelligence has grown rapidly from a college research project into a method adopted throughout many industries to answer key business questions. In particular, machine learning models make predictions by learning complex and latent patterns in our data, and those predictions help us with key decisions and automate many processes. However, as many have said, with great power comes great responsibility. This is why responsible AI is such a crucial aspect of using machine learning in an organization. Before we continue, let's take a couple of steps back to understand why responsible AI matters.
How it works
The idea of machine learning is not new: its statistical foundations, such as least-squares regression, date back to the 1800s. It is nonetheless extremely powerful. In traditional programming, the programmer writes rules for the computer to follow, such as "if x = y, then do z." With machine learning, we simply give the computer the data and the answers and say, “Hey, can you figure out the patterns or rules that produce the answer?”
For example, if I said the input is 2 and the output is 4, the computer might conclude that the formula is y = 2*x. As long as we have data and we think there is some relationship in the data that predicts our outcome, we can have machine learning apply a multitude of algorithms to find this relationship and make predictions for us. This can take the form of using medical records to predict chronic illness or expected length of hospital stay, or of identifying pneumonia in chest x-rays. This is so powerful because machine learning lets us come up with algorithms for situations that would otherwise have been too complex to program by hand.
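To make the "data and answers in, rule out" idea concrete, here is a minimal sketch in plain Python. It assumes the simplest possible learner, an ordinary least-squares line fit; the function name `fit_line` and the four example pairs are made up for illustration. Note that a single pair such as (2, 4) would not be enough on its own, since y = x + 2 fits it just as well as y = 2*x, which is one reason more examples matter.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of the inputs, and how inputs and outputs vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: inputs and the "answers" we already know.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]

slope, intercept = fit_line(xs, ys)

# Use the learned rule on an input the model has not seen.
prediction = slope * 5 + intercept
print(f"learned rule: y = {slope:.1f} * x + {intercept:.1f}")
print(f"prediction for x = 5: {prediction:.1f}")
```

Nobody wrote the rule "multiply by two" into the program; it was recovered from the examples. Real models do the same thing with many more inputs and far less tidy data.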
Even though machine learning is powerful, it is key to remember that the algorithms can only learn from the data provided, and that data needs to be formatted in a way the model can understand. In this way, it is like trying to teach a child. Just as children are like sponges and learn both good and bad behaviors, machine learning models take in both the good and the bad. To be effective, we must ensure the data we feed in is good, clean data. Additionally, when we give a machine learning model data to fit, it operates under the assumption that this data is the whole world: every case it will ever see.
Where responsible AI comes in
When we are building these models, it is key to make sure the data is representative of the population and that all the various possible cases are present. While it is important to have a machine learning model that performs well on the data it was trained on, it is much more important to understand how the model behaves on data it has never seen before. Here, it is key to have data scientists and machine learning engineers check the models against a testing data set, or use methods such as cross-validation, to test how well the model generalizes its results to the population. It is essential to make sure your model is not overfitted to the data it was trained on but can generalize to your population. This is important not only for the usability of the model but also for responsible AI: it helps confirm that all types of groups were included in training and that the model predicts accurately for each of them rather than producing erroneous results for some groups. Reviewing and testing your model's performance beyond the training data pays huge dividends in the form of a more robust model whose predictions can be trusted.
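The gap between training performance and performance on unseen data can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not a production recipe: the data is synthetic, and `memorize`, `fit_line`, and `cross_validate` are hypothetical helpers written for this example. The "memorizer" simply stores every training answer, so it looks perfect on data it has seen, while k-fold cross-validation shows how it does on held-out cases.

```python
import random


def fit_line(xs, ys):
    """Least-squares line fit; returns a function that predicts y from x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def memorize(xs, ys):
    """Overfitted 'model': looks up seen inputs, guesses 0.0 for anything new."""
    table = dict(zip(xs, ys))
    return lambda x: table.get(x, 0.0)


def mean_squared_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cross_validate(train_fn, xs, ys, k=5, seed=0):
    """Average held-out error over k folds; each point is held out exactly once."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train_x = [xs[i] for i in indices if i not in held_out]
        train_y = [ys[i] for i in indices if i not in held_out]
        model = train_fn(train_x, train_y)
        scores.append(
            mean_squared_error(model, [xs[i] for i in fold], [ys[i] for i in fold])
        )
    return sum(scores) / len(scores)


# Synthetic data (an assumption for this sketch): y is roughly 2 * x plus noise.
rng = random.Random(42)
xs = list(range(1, 21))
ys = [2 * x + rng.gauss(0, 1) for x in xs]

memorizer_train_error = mean_squared_error(memorize(xs, ys), xs, ys)
memorizer_cv_error = cross_validate(memorize, xs, ys)
line_cv_error = cross_validate(fit_line, xs, ys)

print("memorizer, error on training data:", memorizer_train_error)
print("memorizer, cross-validated error: ", round(memorizer_cv_error, 2))
print("line fit,  cross-validated error: ", round(line_cv_error, 2))
```

The memorizer scores a perfect zero on its own training data and falls apart on held-out folds, which is exactly the pattern that a training-only evaluation would hide. In practice, teams usually rely on an established library for splitting and cross-validation rather than hand-rolled helpers like these.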
Biased models
To extend the analogy between a small child and machine learning, think of yourself as a kid sitting in class who really wants to get an A on an exam to take home to your parents. You just found out the test consists only of true/false questions and 90% of the answers are true ... wouldn't you just pick true for every question so you’re guaranteed to get an A? Our machine learning models will do the same thing! This is called bias. Many times, when there are more results in one group than the other, a machine learning model will be prone to predict the group with the higher frequency because it will be right more often and therefore appear to perform better. Unfortunately, this often causes a model to produce biased results and to under- or over-represent a group.
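The "always answer true" strategy is easy to reproduce. The sketch below uses made-up labels (90 "healthy" records and 10 "heart_failure" records, chosen only to mirror the 90% figure above) and a hypothetical `recall` helper. It shows why overall accuracy alone can hide a model that never identifies the rare group.

```python
from collections import Counter

# Hypothetical, imbalanced labels: 90% of records belong to one group.
labels = ["healthy"] * 90 + ["heart_failure"] * 10

# A "model" that always predicts the most frequent label it has seen.
majority_label = Counter(labels).most_common(1)[0][0]
predictions = [majority_label] * len(labels)

accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def recall(predictions, labels, target):
    """Share of the true `target` cases that the model actually found."""
    hits = sum(1 for p, y in zip(predictions, labels) if y == target and p == target)
    total = sum(1 for y in labels if y == target)
    return hits / total


rare_recall = recall(predictions, labels, "heart_failure")

print("accuracy:", accuracy)
print("recall on heart_failure:", rare_recall)
```

The model earns its "A" on accuracy while missing every single rare case, so per-group metrics such as recall are worth reviewing alongside the headline number.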
Often, data will be naturally biased, especially when it reports on infrequent occurrences such as heart failure and rare diseases. Here again, having data scientists and data engineers is key to addressing these situations. Using techniques such as oversampling or undersampling, we can adjust for the imbalance in the data and help make sure the result of the model is not biased. A key element of responsible AI is prioritizing the careful review of models for biased cases to ensure that the model is functioning as planned and the results are accurate.
Machine learning models are extremely powerful tools that can drive key business decisions and handle large amounts of automation for our businesses. It is also important to remember that, wherever we start in the process of building them, these models are akin to completely innocent children who are counting on us to carefully explain our world and give accurate examples that represent it.
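As a rough illustration of the two techniques named above, here is a minimal sketch of random oversampling and random undersampling in plain Python. The helper names, the record IDs, and the 90/10 split are assumptions for this example; dedicated libraries offer more refined variants.

```python
import random
from collections import Counter


def group_by_label(rows, labels):
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups


def oversample(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    groups = group_by_label(rows, labels)
    target = max(len(members) for members in groups.values())
    out_rows, out_labels = [], []
    for label, members in groups.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def undersample(rows, labels, seed=0):
    """Randomly drop rows of larger classes until all classes match the smallest."""
    rng = random.Random(seed)
    groups = group_by_label(rows, labels)
    target = min(len(members) for members in groups.values())
    out_rows, out_labels = [], []
    for label, members in groups.items():
        out_rows.extend(rng.sample(members, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Hypothetical record IDs with a 90/10 class imbalance.
rows = list(range(100))
labels = ["healthy"] * 90 + ["heart_failure"] * 10

over_rows, over_labels = oversample(rows, labels)
under_rows, under_labels = undersample(rows, labels)

print("original:    ", Counter(labels))
print("oversampled: ", Counter(over_labels))
print("undersampled:", Counter(under_labels))
```

Resampling is normally applied to the training split only, so that the test data still reflects how often each group occurs in the real population. Oversampling keeps every record but repeats rare ones; undersampling discards data from the larger group, which can be costly when data is scarce.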
Checklist for practicing responsible AI
✓ Carefully review your data
✓ Check for key issues such as bias
✓ Ensure that your data is large enough to reflect the population
✓ Validate that your model can accurately generalize its predictions to your population
✓ Keep the human in the loop
At CGI, we understand that machine learning models should not just help our healthcare clients answer key business questions and drive growth but should also make a positive impact on our communities. Responsible AI is paramount in ensuring every possible group is accurately represented by our models with consistent results. Visit our site to review our healthcare analytics and intelligent automation capabilities. To learn how we demystify and deliver responsible AI to the clients we serve, visit us here.