Sunday, June 22, 2014

Are Discriminatory Systems Discriminatory? If So, Then What?

Scoring/selection systems based on big data analytics have a powerful allure—their simplicity gives the illusion of precision and reliability. But predictive algorithms can be anything but accurate and fair. They can narrow people’s life opportunities in arbitrary and discriminatory ways. As Oscar Gandy states, "already burdened segments of the population become further victimized through the strategic use of sophisticated algorithms in support of the identification, classification, segmentation, and targeting of individuals as members of analytically constructed groups."

These systems are described as discriminatory because discrimination is what they are designed to do. Their value to users is based on their ability to sort things into categories and classes on the basis of similarities and differences that seem to matter for the decisions users feel compelled to make. All of these assessments act as aids to discrimination: guiding a choice between or among competing options.

In many cases the decisions made by the users determine the provision, denial, enhancement, or restriction of the opportunities that individuals and consumers face both inside and outside of formal markets. The statistical discrimination enabled by sophisticated analytics compounds the disadvantages imposed by the structural constraints we readily associate with race, class, gender, disability, and cultural identity, further narrowing the opportunity sets people encounter over the course of their lives. Please see Do We Regulate Algorithms, or Do Algorithms Regulate Us? below.


Seizing Opportunities, Preserving Values

The recent White House report, “Big Data: Seizing Opportunities, Preserving Values," found that, "while big data can be used for great social good, it can also be used in ways that perpetrate social harms or render outcomes that have inequitable impacts, even when discrimination is not intended." The fact sheet accompanying the White House report warns:
As more decisions about our commercial and personal lives are determined by algorithms and automated processes, we must pay careful attention that big data does not systematically disadvantage certain groups, whether inadvertently or intentionally. We must prevent new modes of discrimination that some uses of big data may enable, particularly with regard to longstanding civil rights protections in housing, employment, and credit.
Some of the most profound challenges revealed by the White House Report concern how big data analytics may lead to disparate inequitable treatment, particularly of disadvantaged groups, or create such an opaque decision-making environment that individual autonomy is lost in an impenetrable set of algorithms.


Big data analytic systems, like those used by employers in making hiring decisions, have become sources of material risk, both to job applicants and employers. The systems create the perception of stability through probabilistic reasoning and the experience of accuracy, reliability, and comprehensiveness through automation and presentation. But in so doing, the systems draw attention away from uncertainty and partiality.

Moreover, they shroud opacity—and the challenges for oversight that opacity presents—in the guise of legitimacy, providing the allure of shortcuts and safe harbors. Programming and mathematical idiom (e.g., correlations) can shield layers of embedded assumptions from the higher-level decisionmakers at an employer who are charged with meaningful oversight, and can mask important concerns with a veneer of transparency. This problem is compounded in the case of regulators outside the firm, who frequently lack the resources or vantage point to peer inside buried decision processes.

Recognizing these problems, the White House Report states that "[t]he federal government must pay attention to the potential for big data technologies to facilitate discrimination inconsistent with the country’s laws and values" and contains the following recommendation:
The federal government’s lead civil rights and consumer protection agencies, including the Department of Justice, the Federal Trade Commission, the Consumer Financial Protection Bureau, and the Equal Employment Opportunity Commission, should expand their technical expertise to be able to identify practices and outcomes facilitated by big data analytics that have a discriminatory impact on protected classes, and develop a plan for investigating and resolving violations of law in such cases. 
Due Process for Automated Decisions?

Danielle Keats Citron and Frank Pasquale III have argued that scoring/selection systems should be subject to licensing and to audit requirements when they enter critical settings like employment, insurance, and health care. The idea is that with a technology as sensitive as scoring/selection, fair, accurate, and replicable use of data is critical.

Licensing can serve as a way of assuring that public values inform this technology. Such licensing could be conducted by private entities that are themselves licensed by the relevant government agency (e.g., the EEOC or the FTC). This “licensing at one remove” has proven useful in the context of health information technology.

Keats Citron and Pasquale also argue that the federal government’s lead civil rights and consumer protection agencies should be given access to hiring systems, credit-scoring systems and other systems that have the potential to unlawfully harm citizens and consumers. Access could be more or less episodic depending on the extent of unfairness exhibited by the scoring system. Biannual audits would make sense for most scoring systems; more frequent monitoring would be necessary for those which had engaged in troubling conduct. We should be particularly focused on scoring systems which rank and rate individuals who can do little or nothing to protect themselves. Expert technologists could test scoring systems for bias, arbitrariness, and unfair mischaracterizations. To do so, they would need to view not only the datasets mined by scoring systems but also the source code and programmers’ notes describing the variables, correlations, and inferences embedded in the scoring systems’ algorithms.
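To make this concrete, here is a minimal sketch, in Python, of the kind of adverse-impact check an expert technologist might run once given access to a scoring system’s outcomes. The data, the function names, and the four-fifths threshold used for flagging are illustrative assumptions, not a prescribed audit methodology.

```python
# Minimal sketch: an adverse-impact ("four-fifths rule") check an auditor might
# run over a hiring screen's pass/fail outcomes, grouped by a protected
# attribute. The sample data and the 0.8 flagging threshold are illustrative.
from collections import defaultdict

def selection_rates(outcomes):
    """outcomes: iterable of (group, selected) pairs, where selected is True/False."""
    selected = defaultdict(int)
    totals = defaultdict(int)
    for group, was_selected in outcomes:
        totals[group] += 1
        selected[group] += int(was_selected)
    return {g: selected[g] / totals[g] for g in totals}

def adverse_impact_ratios(outcomes):
    """Ratio of each group's selection rate to the most-selected group's rate."""
    rates = selection_rates(outcomes)
    best = max(rates.values())
    return {g: rate / best for g, rate in rates.items()}

# Hypothetical audit sample: (group label, passed the automated screen?)
audit_sample = ([("A", True)] * 60 + [("A", False)] * 40 +
                [("B", True)] * 35 + [("B", False)] * 65)

for group, ratio in sorted(adverse_impact_ratios(audit_sample).items()):
    flag = "REVIEW" if ratio < 0.8 else "ok"  # four-fifths guideline as a trigger
    print(f"group {group}: impact ratio {ratio:.2f} [{flag}]")
```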

For the review to be meaningful in an era of great technological change, the technical experts must be able to meaningfully assess systems whose predictions change pursuant to artificial intelligence (AI) logic. They should detect patterns and correlations tied to classifications that are already suspect under American law, such as race, nationality, sexual orientation, and gender. Scoring systems should be run through testing suites that run expected and unexpected hypothetical scenarios designed by policy experts. Testing reflects the norm of proper software development, and would help detect both programmers’ bias and bias emerging from the AI system’s evolution.
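As an illustration only, such a testing suite might look something like the following sketch. The score_applicant stand-in, the applicant profiles, and the expectations encoded in the tests are hypothetical; in a real audit the tests would be pointed at the vendor’s actual scoring interface.

```python
# Hypothetical sketch of a scenario-based testing suite for a scoring system.
# score_applicant is a stand-in for the system under audit; the profiles and
# expectations are illustrative scenarios of the kind policy experts might design.
import unittest

def score_applicant(profile):
    # Stand-in for the audited model; a real suite would call the vendor's system.
    return 50 + 2 * profile["years_experience"] + 5 * profile["certifications"]

class ScoringScenarioTests(unittest.TestCase):
    def test_protected_attribute_does_not_move_score(self):
        """Flipping only a protected attribute should leave the score unchanged."""
        base = {"years_experience": 6, "certifications": 2, "gender": "F"}
        counterfactual = dict(base, gender="M")
        self.assertEqual(score_applicant(base), score_applicant(counterfactual))

    def test_unexpected_inputs_are_handled_sanely(self):
        """An 'unexpected' scenario: extreme values should not produce runaway scores."""
        extreme = {"years_experience": 80, "certifications": 0, "gender": "F"}
        self.assertLess(score_applicant(extreme), 300)

if __name__ == "__main__":
    unittest.main()
```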

A potentially more difficult question concerns whether scoring/selection systems’ source code, algorithmic predictions, and modeling should be transparent to affected individuals and ultimately the public at large. There are legitimate arguments for some level of big data secrecy, including concerns connected to intellectual property, but these concerns are more than outweighed by the threats to human dignity posed by pervasive, secret, and automated scoring systems.

At the very least, individuals should have a meaningful form of notice and a chance to challenge predictive scores that harm their ability to obtain credit, jobs, housing, and other important opportunities. Even if scorers (e.g., testing companies) successfully press to maintain the confidentiality of their proprietary code and algorithms vis-à-vis the public at large, it is still possible for independent third parties to review them.

One possibility is that in any individual adjudication, the technical aspects of the system could be covered by a protective order requiring their confidentiality. Another possibility is to limit disclosure of the scoring system to trusted neutral experts. Those experts could be entrusted to assess the inferences and correlations contained in the audit trails. They could assess whether scores are based on illegitimate characteristics such as disability, race, nationality, or gender, or on mischaracterizations. This possibility would protect both scorers’ intellectual property and individuals’ interests.
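By way of illustration, the sketch below shows one simple check a neutral expert might run over such an audit trail: measuring how strongly final scores track a protected attribute the system is not supposed to consider. The field layout, the sample rows, and the review threshold are assumptions made for the example; a real review would examine many attributes and their proxies.

```python
# Illustrative sketch: does the final score correlate with membership in a
# protected group recorded in the audit trail? The rows and the 0.3 review
# threshold are invented for the example.
from statistics import mean

def point_biserial(scores, flags):
    """Correlation between a numeric score and a binary protected-group flag."""
    n = len(scores)
    overall_mean = mean(scores)
    std = (sum((s - overall_mean) ** 2 for s in scores) / n) ** 0.5
    in_group = [s for s, f in zip(scores, flags) if f]
    out_group = [s for s, f in zip(scores, flags) if not f]
    p = len(in_group) / n
    return (mean(in_group) - mean(out_group)) / std * (p * (1 - p)) ** 0.5

# Hypothetical audit-trail rows: (final_score, member_of_protected_group)
trail = [(82, False), (79, False), (85, False), (64, True),
         (61, True), (88, False), (59, True), (73, False)]
scores = [score for score, _ in trail]
flags = [flag for _, flag in trail]

r = point_biserial(scores, flags)
note = "  -> flag for human review" if abs(r) > 0.3 else ""
print(f"score/protected-group correlation: {r:.2f}{note}")
```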

Do We Regulate Algorithms, or Do Algorithms Regulate Us?

Can an algorithm be agnostic? Algorithms may be rule-based mechanisms that fulfill requests, but they are also governing agents that choose between competing, and sometimes conflicting, data objects.

The potential and pitfalls of an increasingly algorithmic world raise the question of whether legal and policy changes are needed to regulate our changing environment. Should we regulate, or further regulate, algorithms in certain contexts? What would such regulation look like? Is it even possible? What ill effects might regulation itself cause? Given the ubiquity of algorithms, do they, in a sense, regulate us? We regulate markets, and market behavior, out of concern for equity as well as for efficiency. The fact that the impacts of design flaws are inequitably distributed is at least one basis for justifying regulatory intervention.

The regulatory challenge is to find ways to internalize the many external costs generated by the rapidly expanding use of analytics. That is, we must find ways to force the providers and users of discriminatory technologies to pay the full social costs of their use. Requirements to warn, or otherwise inform, users and their customers about the risks associated with these systems should not absolve system producers of their own responsibility for reducing or mitigating the harms. This is part of imposing economic burdens, or using incentives, as tools to shape behavior most efficiently and effectively.

