Race, Data and the Complexities of Health Equity

Dec 5, 2022

Notable Learnings from Recent Research

Key Takeaways

A growing number of organizations are pointing out racial biases in medical algorithms and calling for fairness. While identifying bias tends to be straightforward, creating fair algorithms is much more complicated.

Many approaches to fairness in AI call for blindness: leaving race and ethnicity out of datasets, or declining to infer them where status was unreported. In practice, however, researchers at RAND have demonstrated that algorithms that infer racial or ethnic status can actually be fairer than those adopting a blind approach.

Value-based pay-for-performance schemes can be a challenge for providers who treat more ethnically diverse populations. Better methods, including the model proposed by RAND, can be fairer to minority patient populations and the providers who treat them.

In the pursuit of fairness in algorithms, what constitutes fairness is far more complicated than is frequently acknowledged. Tradeoffs are often necessary, and they need to be made transparent to, and appreciated by, all stakeholders.

We often lack the data and mathematical solutions to create perfectly fair algorithms, and tradeoffs among performance, fairness, and other values will often have to be made. Rather than rely on a veneer of technological neutrality, developers need to be transparent about those tradeoffs. Likewise, stakeholders need to understand the complexities of pursuing fairness in an increasingly algorithmic society.

Introduction

In October 2022, the ACLU issued a white paper warning about the dangers of medical racism as the use of AI in healthcare increases. The white paper highlights the lack of transparency and the shortcomings of current FDA mandates, as well as a dearth of regulatory bandwidth to provide adequate oversight of algorithms that can cause harm if they contain bias. As primary examples of algorithms harming patients, the ACLU cites the well-known case of Optum’s algorithm, which was found to be biased against Black patients, and a home healthcare algorithm used in Arkansas in 2016 that made extreme cuts in funding and access for Black patients.

However, one of the examples they cite is the rather perplexing study out of MIT in which AI deployed on medical images was able to predict patients’ race. The study showed that deep learning algorithms could predict self-reported race on a large set of medical images that had been blinded by excluding race labels. The researchers have been unable to explain why the model can predict race with such accuracy, which raises a number of questions about bias in AI for radiology. Or does it?

How the Exclusion of Race and Ethnicity in Data Can Make Bias Worse

To further complicate the picture, a recent study from RAND published in Health Affairs cautions that the precaution of removing race and ethnicity data from datasets used in AI may only make matters worse. Blindness to race in datasets, they argue, is a weaker approach to ridding models of bias than explicit knowledge of race and ethnicity. The authors assert that tools that can estimate racial and ethnic information about patients from datasets lacking self-identified information could improve health care algorithms and assist clinicians in reducing bias in their practice.

The study makes the case that knowledge of race and ethnicity is more powerful for identifying algorithmic bias. Their argument places the emphasis on the biased decision-making of the humans who use the algorithms: often, the algorithm simply learns to detect disparities already present in the data on which it is trained. To monitor bias, they argue, we need to deploy more health disparity tools to measure health inequities.

Bayesian Improved Surname Geocoding (BISG) is one such tool, developed by RAND, Carnegie Mellon, Kaiser Permanente, and the Centers for Medicare and Medicaid Services (CMS) to impute race and ethnicity and improve algorithmic performance on diverse datasets. The methodology has been used in the following use cases:

  • Estimating the racial and ethnic composition of a patient population
  • Comparing racial and ethnic differences in health care quality and outcomes
  • Coordinating community-level outreach and interventions
  • Comparing effectiveness of interventions

The accuracy of the method is sufficient to make it superior to approaches that do not impute race and ethnicity. Furthermore, the algorithm can be customized.

Figure 1: BISG algorithm (Source: Rand Corporation)
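
To make the mechanics concrete, here is a minimal sketch of the Bayesian update at the heart of BISG: a prior over race and ethnicity derived from the patient’s surname is combined with the geographic distribution of each group to form a posterior. All of the probability tables, category labels, and numbers below are invented for illustration and are not RAND’s actual implementation.

```python
# Minimal illustration of the Bayesian update behind BISG
# (Bayesian Improved Surname Geocoding). The probability tables below
# are hypothetical; the real method uses Census surname lists and
# block-group population data.

# P(race | surname): prior from a surname table (hypothetical values)
p_race_given_surname = {
    "White": 0.05, "Black": 0.02, "Hispanic": 0.90, "Asian": 0.02, "Other": 0.01
}

# P(geography | race): share of each group's national population living
# in the patient's census block group (hypothetical values)
p_geo_given_race = {
    "White": 0.00010, "Black": 0.00002, "Hispanic": 0.00050,
    "Asian": 0.00004, "Other": 0.00008
}

def bisg_posterior(prior, likelihood):
    """Combine the surname prior and geographic likelihood via Bayes' rule."""
    unnormalized = {r: prior[r] * likelihood[r] for r in prior}
    total = sum(unnormalized.values())
    return {r: v / total for r, v in unnormalized.items()}

posterior = bisg_posterior(p_race_given_surname, p_geo_given_race)
for race, prob in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"P({race} | surname, block group) = {prob:.3f}")
```

The imputed probabilities are typically used in aggregate, for example to estimate the racial and ethnic composition of a patient panel, rather than to label any individual patient.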

Impact on Value-Based Care Schemes

The methods and debates about health data, race, and algorithms are salient to areas such as pay-for-performance schemes and value-based care analytics as well. Data analytics and AI/ML vendors will need to pay attention to developments in this space, and the quest for health equity will not be easy.

On one hand, there is the call for fairness through unawareness (i.e., the blind approach). This approach runs into challenges when evaluating pay-for-performance schemes that reward clinicians for providing higher-quality care. The RAND researchers observe a pattern in which providers who perform poorly on the quality measures in these schemes are often those who care for poorer patients. In these cases, the researchers propose using the BISG algorithm to understand the demographics of the population and to check that payments are not correlated with race and ethnicity. Some uncertainty about race, and therefore about bias, will always be present without self-reported status; even self-reported categories do not always reflect an individual’s actual status when official categories fail to match lived realities or hybrid identities.
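
As a rough illustration of that check, the sketch below aggregates hypothetical BISG posteriors to the provider level and correlates each provider’s estimated share of Black patients with its payment adjustment. The column names, values, and workflow are assumptions for illustration, not the RAND methodology itself.

```python
# Sketch: does the pay-for-performance adjustment track the estimated
# racial/ethnic composition of each provider's panel? Data are hypothetical.
import pandas as pd

panel = pd.DataFrame({
    "provider_id": [1, 1, 2, 2, 3, 3],
    "p_black":     [0.9, 0.8, 0.1, 0.2, 0.5, 0.4],       # BISG posterior per patient
    "payment_adj": [0.95, 0.95, 1.05, 1.05, 1.00, 1.00],  # provider-level multiplier
})

# Average imputed probability per provider = estimated share of Black patients
by_provider = panel.groupby("provider_id").agg(
    est_share_black=("p_black", "mean"),
    payment_adj=("payment_adj", "first"),
)

# A strong negative correlation would suggest the scheme penalizes
# providers who serve more Black patients.
print(by_provider["est_share_black"].corr(by_provider["payment_adj"]))
```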

Irineo Cabreros, one of the authors of the RAND study above, has written on just how hard equity is to achieve with algorithms. From black-box algorithms lacking transparency to the challenges of analyzing bias in binary predictions, evaluating whether an algorithm behaves equitably is often difficult, since the data needed for a rigorous evaluation can be hard to come by. One evaluation method is to compare error rates across different populations. Do the algorithm’s predictions carry the same meaning for different populations? For example, if a negative prediction means a 90 percent chance that a white patient is cancer-free but only a 50 percent chance for a Black patient, the algorithm is unfair.
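
That kind of check can be sketched in a few lines: for each group, compute how often a negative prediction is actually correct (the negative predictive value). The data and group labels below are made up solely to illustrate the comparison.

```python
# Sketch: compare negative predictive value (NPV) across groups.
# y_true = 1 means the patient has the disease; y_pred = 1 means the
# model flags the patient as positive. Data are hypothetical.
import numpy as np

def npv_by_group(y_true, y_pred, group):
    """Among negative predictions, how often is the patient truly negative, per group?"""
    results = {}
    for g in np.unique(group):
        mask = (group == g) & (y_pred == 0)
        results[g] = float(np.mean(y_true[mask] == 0)) if mask.any() else float("nan")
    return results

y_true = np.array([0, 0, 0, 1, 0, 1, 1, 1, 0, 1])
y_pred = np.array([0, 0, 0, 0, 0, 0, 0, 1, 0, 1])
group  = np.array(["White"] * 5 + ["Black"] * 5)

# If NPV is high for one group and low for another, a "negative" result
# does not carry the same meaning for both populations.
print(npv_by_group(y_true, y_pred, group))
```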

Cabreros notes that we often face trade-offs between performance and equity, and this is where the contention rests. Whose definition of fair counts? In the mathematics of fairness, not all parties will be satisfied. Different fairness goals may contradict one another, and decisions will still need to be made. Different quality measures can be interrelated and occasionally interdependent. In some cases, if clinicians treat race as a biological category (as algorithms sometimes erroneously do) rather than the social construct it is in most cases, race-adjusted measures can come back to haunt them.[1] All algorithms have some form of bias, so how do we decide what tradeoffs to make?
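
A small worked example, with invented prevalence numbers, shows why fairness definitions can contradict one another once base rates differ across groups.

```python
# Hypothetical numbers: disease prevalence is 30% in group A and 10% in
# group B. Suppose a screening model flags exactly the truly ill patients
# in each group (equal error rates: no false positives or false negatives).
prev_a, prev_b = 0.30, 0.10

# Positive-prediction rates then equal the prevalences...
flag_rate_a, flag_rate_b = prev_a, prev_b
# ...so demographic parity (equal flag rates across groups) is violated.
print(f"Flag rate A = {flag_rate_a:.0%}, flag rate B = {flag_rate_b:.0%}")

# Conversely, forcing equal flag rates (say 20% in both groups) means
# either missing ill patients in group A or flagging healthy patients
# in group B, so error rates diverge instead.
target = 0.20
missed_in_a   = prev_a - target   # ill patients left unflagged in A
overflagged_b = target - prev_b   # healthy patients flagged in B
print(f"Missed in A: {missed_in_a:.0%} of group A; "
      f"over-flagged in B: {overflagged_b:.0%} of group B")
```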

Questions of race and ethnicity inevitably raise the issue of social determinants of health (SDoH); access to housing, transportation, healthy food, and insurance, as well as crime rates, all become relevant. Even beyond the algorithmic issues, SDoH interventions, and the quality measures for them, are still far from a rigorous, scientific methodology. The Department of Health and Human Services is still in the early days of evaluating the ROI of various SDoH methodologies that could inform evidence-based approaches.[2] As analytics vendors develop AI-based tools for more comprehensive SDoH analytics, these bias issues will loom large. The veneer of technological neutrality is not an advisable route to follow. Transparency and better tools are needed, as well as social interventions, to help developers and the healthcare field understand the tradeoffs more robustly.

Conclusion

A number of technical manuals for addressing bias have been published in recent years, including Obermeyer et al.’s “Algorithmic Bias Playbook,” Chen’s “Ethical Machine Learning in Healthcare,” and Suresh et al.’s “A Framework for Understanding Sources of Harm Throughout the Machine Learning Lifecycle.” These are extremely valuable resources for auditing and rooting out bias in datasets and models.

Given the discussion above, it is clear that domains such as algorithm explainability will need to incorporate information on any fairness trade-offs made during development. We often rely on the unstated values of model developers, and the choices they make, to resolve what are really broader societal and policy questions. Case studies of the tradeoffs made under different scenarios will be important for developing the ethical reasoning that technological tools alone cannot provide. It is unclear whether standards and certification processes can address what might be considered the “wicked problems” of healthcare, where no clear-cut right answer is available. Compensatory policies and remedies should be deployed to address any shortcomings of trade-offs for affected parties. These are the types of issues that algorithmic review boards will need to address in the coming years.


[1] See Vyas et al., NEJM. https://www.nejm.org/doi/full/10.1056/NEJMms2004740

[2] See Crook, H., et al. How Are Payment Reforms Addressing Social Determinants of Health? Policy Implications and Next Steps. Milbank Quarterly, Issue Brief, February 2021. https://www.milbank.org/publications/how-are-payment-reforms-addressing-social-determinants-of-health-policy-implications-and-next-steps/

