Monday, June 23, 2014

The Dark Side of Big Data

This post contains excerpts from: Big Data: A Revolution That Will Transform How We Work, Live, and Think by Kenneth Cukier and Viktor Mayer-Schonberger, "Critiquing Big Data: Politics, Ethics, Epistemology" by Kate Crawford, Kate Miltner, and Mary L. Gray, and "It's Not Privacy and It's Not Fair" by Cynthia Dwork and Deirdre K. Mulligan.

* * * * * * *

With big data promising valuable insights to those who analyze it, all signs seem to point to a further surge in others’ gathering, storing, and reusing our personal data. The size and scale of data collections will increase their already exponential growth as storage costs continue to plummet and analytic tools become ever more powerful. If the Internet Age threatened privacy, does big data endanger it even more? Is that the dark side of big data?

Yes, and it is not the only one. Here, too, the essential point about big data is that a change of scale leads to a change of state. This transformation not only makes protecting privacy much harder, but also presents an entirely new menace: penalties based on propensities. That is the possibility of using big-data predictions about people to judge and punish them even before they have acted. Doing this negates ideas of fairness, justice, and free will.

Penalizing Based On Propensities

The clearest example of penalizing based on propensities is predictive policing. Predictive policing tries to harness the power of information, geospatial technologies and evidence-based intervention models to reduce crime and improve public safety. This two-pronged approach — applying advanced analytics to various data sets, in conjunction with intervention models — can move law enforcement from reacting to crimes into the realm of predicting what and where something is likely to happen and deploying resources accordingly.

Predictive policing raises suspicions that it either legitimizes racial profiling or at the very least gives the police a much wider latitude of probable cause with which to challenge citizens or force a consent to a search. On top of this, some wonder if it goes beyond criminalizing actions to criminalizing the simple fact of being in the wrong place at the wrong time.

Take marijuana arrests as an example. We know that black people and Latinos are arrested, prosecuted and convicted for marijuana offenses at rates astronomically higher than their white counterparts, even if we adjust for income and geography. We also know that whites smoke marijuana at about the same rate as blacks and Latinos.

Therefore we know that marijuana laws are not applied equally across the board: Blacks and Latinos are disproportionately targeted for associated arrests, while whites are arrested at much lower rates for smoking or selling small amounts of marijuana.

If historical arrest data shows that the majority of arrests for marijuana crimes in a city are made in a predominately black area, instead of in a predominately white area, predictive policing algorithms working off of this problematic data will recommend that officers deploy resources to the predominately black area -- even if there is other information to show that people in the white area violate marijuana laws at about the same rate as their black counterparts.

If an algorithm is only fed unjust arrest data (as compared to conviction data), it will simply repeat the injustice by advising the police to send yet more officers to patrol the black area. In that way, predictive policing creates a feedback loop of injustice.
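
The loop can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the function name, the two areas, the starting arrest counts, and the rule "allocate officers in proportion to past arrests" are all hypothetical, and real predictive policing software is not claimed to work this way. Both areas are given the same true offense rate, yet the skew in the historical data never corrects itself.

```python
def simulate_feedback(initial_arrests, offense_rate, officers=100, rounds=5):
    """Toy model: patrols follow past arrests, and new arrests follow patrols.

    initial_arrests: hypothetical historical arrest counts per area.
    offense_rate: hypothetical arrests produced per officer in each area.
    Returns each area's share of cumulative arrests after every round.
    """
    arrests = dict(initial_arrests)
    history = []
    for _ in range(rounds):
        total = sum(arrests.values())
        # Officers are allocated using past arrests only; the true
        # offense rates never enter the allocation decision.
        patrols = {area: officers * count / total
                   for area, count in arrests.items()}
        for area in arrests:
            arrests[area] += patrols[area] * offense_rate[area]
        new_total = sum(arrests.values())
        history.append({area: arrests[area] / new_total for area in arrests})
    return history


# Identical offense rates, but an 80/20 split in the historical arrest data.
shares = simulate_feedback(
    {"area_a": 80.0, "area_b": 20.0},
    {"area_a": 0.1, "area_b": 0.1},
)
for round_number, share in enumerate(shares, start=1):
    print(round_number, round(share["area_a"], 3), round(share["area_b"], 3))
```

In this sketch, the area that starts with 80 percent of the arrests keeps receiving 80 percent of the officers in every round, even though the two areas offend at the same rate. Nothing in the data the model sees can reveal that the original split was unjust.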

Dictatorship of Data

In addition to privacy and propensity, there is a third danger. We risk falling victim to a dictatorship of data, whereby we fetishize the information, the output of our analyses, and end up misusing it. Handled responsibly, big data is a useful tool of rational decision-making. Wielded unwisely, it can become an instrument of the powerful, who may turn it into a source of repression, either by simply frustrating customers and employees or, worse, by harming citizens.

Big data is predicated on correlations among various data elements. Correlations let us analyze a phenomenon not by shedding light on its inner workings but by identifying a useful proxy for it. Of course, even strong correlations are never perfect. It is quite possible that two things may behave similarly just by coincidence. We may simply be “fooled by randomness,” to borrow a phrase from the empiricist Nassim Nicholas Taleb. With correlations, there is no certainty, only probability.

As boyd and Crawford state, "Too often, Big Data enables the practice of apophenia: seeing patterns where none actually exist, simply because enormous quantities of data can offer connections that radiate in all directions. In one notable example, David Leinweber demonstrated that data mining techniques could show a strong but spurious correlation between the changes in the S&P 500 stock index and butter production in Bangladesh."
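
A small experiment shows how easily such patterns appear. The sketch below is an illustration only, not Leinweber's method: it uses made-up random numbers rather than any real market or production data, and the function names and parameters are hypothetical. It compares one random "target" series against many unrelated random series and reports the strongest correlation it finds.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def best_spurious_match(n_candidates=1000, n_points=10, seed=0):
    """Strongest absolute correlation between a random target series and
    n_candidates independent random series of the same length."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best = 0.0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        best = max(best, abs(pearson(target, candidate)))
    return best


print(best_spurious_match())
```

Every series here is pure noise, so any correlation is a coincidence. With only ten data points and a thousand candidates to choose from, at least one candidate will almost always look strongly related to the target. The more series we search, the more impressive the best coincidence becomes.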

While many companies and government agencies foster an illusion that classification is (or should be) an area of absolute algorithmic rule—that decisions are neutral, organic, and even automatically rendered without human intervention—reality is a far messier mix of technical and human curating. Both the datasets and the algorithms reflect choices, among others, about data, connections, inferences, interpretation, and thresholds for inclusion that advance a specific purpose.

Like maps that represent the physical environment in varied ways to serve different needs—mountaineering, sightseeing, or shopping—classification systems are neither neutral nor objective, but are biased toward their purposes. They reflect the explicit and implicit values of their designers.

A similar principle should apply outside government, when businesses make highly significant decisions about us – to hire or fire, offer a mortgage, or deny a credit card. When they base these decisions mostly on big-data predictions, we recommend that certain safeguards be in place.

First is openness: making available the data and algorithm underlying the prediction that affects an individual. With big-data analysis, however, traceability will become much harder. The basis of an algorithm’s predictions may often be far too intricate for most people to understand.

Second is certification: having the algorithm certified for certain sensitive uses by an expert third party as sound and valid. Third is disprovability: specifying concrete ways that people can disprove a prediction about themselves. (This is analogous to the tradition in science of disclosing any factors that might undermine the findings of a study.)

Most important, a guarantee on human agency guards against the threat of a dictatorship of data, in which we endow the data with more meaning and importance than it deserves.

Preexisting Bias

If the system is complex, and most are, biases can remain hidden in the code, difficult to pinpoint or explicate, and not necessarily disclosed to users or their clients. Three categories of bias in computer systems have been developed: preexisting, technical, and emergent. Preexisting bias has its roots in social institutions, practices, and attitudes. Technical bias arises from technical constraints or considerations. Emergent bias arises in a context of use.

Freedom from bias should be counted among the select set of criteria—including reliability, accuracy, and efficiency—according to which the quality of systems in use in society should be judged. We use the term bias to refer to computer systems that systematically and unfairly discriminate against certain individuals or groups of individuals in favor of others. A system discriminates unfairly if it denies an opportunity or a good or if it assigns an undesirable outcome to an individual or group of individuals on grounds that are unreasonable or inappropriate.

Moreover, because big-data analysis is based on theories, we can’t escape them. They shape both our methods and our results. It begins with how we select the data. Our decisions may be driven by convenience: Is the data readily available? Or by economics: Can the data be captured cheaply? Our choices are influenced by theories. What we choose influences what we find, as the digital-technology researchers Crawford and boyd have stated. Similarly, when we analyze the data, we choose tools that rest on theories. And as we interpret the results we again apply theories. The age of big data clearly is not without theories – they are present throughout, with all that this entails.

Ameliorating the Risks

In these scenarios, we can see the risk that big-data predictions, and the algorithms and datasets behind them, will become black boxes that offer us no accountability, traceability, or confidence. To prevent this, big data will require monitoring and transparency, which in turn will require new types of expertise and institutions. These new players will provide support in areas where society needs to scrutinize big-data predictions and enable people who feel wronged by them to seek redress. Please see Are Discriminatory Systems Discriminatory? If So, Then What?

Big data will require a new group of people to take on this role. Perhaps they will be called “algorithmists.” They could take two forms - independent entities to monitor firms from outside, and employees or departments to monitor them from within - just as companies have in-house accountants as well as outside auditors who review their finances.

We envision external algorithmists acting as impartial auditors to review the accuracy or validity of big-data predictions whenever the government requires it, such as under court order or regulation. They also can take on big-data companies as clients, performing audits for firms that want expert support. And they may certify the soundness of big-data applications like anti-fraud techniques or stock-trading systems. Finally, external algorithmists are prepared to consult with government agencies on how best to use big data in the public sector.

Moreover, people who believe they’ve been harmed by big-data predictions —a patient rejected for surgery, an inmate denied parole, a loan applicant denied a mortgage — can look to algorithmists much as they already look to lawyers for help in understanding and appealing those decisions.

Sunday, June 22, 2014

Decision-by-Algorithm: No Silver Bullet

This post contains excerpts from "The Scored Society: Due Process for Automated Predictions," a 2014 law review article authored by Danielle Keats Citron and Frank A. Pasquale III. 

* * * * * * *

Big Data is increasingly mined to rank and rate individuals. Predictive algorithms assess individuals as good credit risks, desirable employees, reliable tenants, and valuable customers. People’s crucial life opportunities are on the line, including their ability to obtain loans, work, housing, and insurance.

The scoring trend is often touted as good news. Advocates applaud the removal of human beings and their flaws from the assessment process. Automated systems are claimed to rate individuals all in the same way, thus averting discrimination. But this account is misleading. Human beings program predictive algorithms. Their biases and values are embedded into the software’s instructions, known as the source code, and predictive algorithms.  Please see What Gets Lost? Risks of Translating Psychological Models and Legal Requirements to Computer Code.

Credit scoring has been lauded as shifting decision-makers’ attention from troubling stereotypes to bias-free assessments of would-be borrowers’ actual records of handling credit. The notion is that the more objective data at a lender’s disposal, the less likely a decision will be based on protected characteristics like race or gender. But far from eliminating existing discriminatory practices, credit-scoring algorithms are instead granting them an imprimatur, systematizing them in hidden ways.

A credit card company uses behavioral-scoring algorithms to rate consumers as a worse credit risk because they used their cards to pay for marriage counseling, therapy, or tire-repair services. Online evaluation systems score interviewees, with a color-coded rating of red signaling a “poor candidate,” yellow as middling, and green as “hire away.”

Beyond biases embedded into code, some automated correlations and inferences may appear objective, but may in reality reflect bias. Algorithms may place a low score on occupations like migratory work or low paying service jobs. This correlation may have no discriminatory intent, but if a majority of those workers are racial minorities, such variables can unfairly impact consumers’ loan application decisions.

Credit scores are only as free from bias as the software and data behind them. Software engineers construct the datasets mined by scoring systems; they define the parameters of data-mining analyses; they create the clusters, links, and decision trees applied. They generate the predictive models applied. The biases and values of system developers and software programmers are embedded into each and every step of development.

Just as concerns about scoring systems are heightened, their human element is diminishing. Although software engineers initially identify the correlations and inferences programmed into algorithms, Big Data promises to eliminate the human “middleman” at some point in the process.

According to a January 9, 2014 article, IBM says cognitive computing systems like Watson are capable of understanding the subtleties, idiosyncrasies, idioms and nuance of human language by mimicking how humans reason and process information.

Whereas traditional computing systems are programmed to calculate rapidly and perform deterministic tasks, IBM says cognitive systems analyze information and draw insights from the analysis using probabilistic analytics. And they effectively continuously reprogram themselves based on what they learn from their interactions with data.

Said IBM CEO Ginni Rometty, "In 2011, we introduced a new era [of computing] to you. It is cognitive. It was a new species, if I could call it that. It is taught, not programmed. It gets smarter over time. It makes better judgments over time." "It is not a super search engine," she adds. "It can find a needle in a haystack, but it also understands the haystack."

This "new species" of computing has its challenges. According to "IBM Struggles to Turn Watson Computer Into Big Business," a recent Wall Street Journal article:
Watson is having more trouble solving real-life problems than "Jeopardy" questions, according to a review of internal IBM documents and interviews with Watson's first customers. 
For example, Watson's basic learning process requires IBM engineers to master the technicalities of a customer's business—and translate those requirements into usable software. The process has been arduous.
Klaus-Peter Adlassnig is a computer scientist at the Medical University of Vienna and the editor-in-chief of the journal Artificial Intelligence in Medicine. The problem with Watson, as he sees it, is that it’s essentially a really good search engine that can answer questions posed in natural language. Over time, Watson does learn from its mistakes, but Adlassnig suspects that the sort of knowledge Watson acquires from medical texts and case studies is “very flat and very broad.” In a clinical setting, the computer would make for a very thorough but cripplingly literal-minded doctor—not necessarily the most valuable addition to a medical staff.

As Hector J. Levesque, a professor at the University of Toronto and a founding member of the American Association of Artificial Intelligence, wrote:

 "As a field, I believe that we tend to suffer from what might be called serial silver bulletism, defined as follows:
the tendency to believe in a silver bullet for AI, coupled with the belief that previous beliefs about silver bullets were hopelessly naïve.
We see this in the fads and fashions of AI research over the years: first, automated theorem proving is going to solve it all; then, the methods appear too weak, and we favour expert systems; then the programs are not situated enough, and we move to behaviour-based robotics; then we come to believe that learning from big data is the answer; and on it goes."

Similarly, employment assessment companies have marketed the benefits of "science, precision and data" over the past fifteen years under the guise of neural networks, artificial intelligence, big data and deep learning, yet what has changed? Employee engagement levels have hardly budged and employee turnover remains a continuing and expensive challenge for employers. Please see Gut Check: How Intelligent is Artificial Intelligence?

Are Discriminatory Systems Discriminatory? If So, Then What?

Scoring/selection systems based on big data analytics have a powerful allure—their simplicity gives the illusion of precision and reliability. But predictive algorithms can be anything but accurate and fair. They can narrow people’s life opportunities in arbitrary and discriminatory ways. As Oscar Gandy states, "already burdened segments of the population become further victimized through the strategic use of sophisticated algorithms in support of the identification, classification, segmentation, and targeting of individuals as members of analytically constructed groups."

These systems are described as discriminatory because discrimination is what they are designed to do. Their value to users is based on their ability to sort things into categories and classes that take advantage of similarities and differences that seem to matter for the decisions users feel compelled to make. All of these assessments act as aids to discrimination - guiding a choice between or among competing options.

In many cases the decisions made by the users determine the provision, denial, enhancement, or restriction of the opportunities that individuals and consumers face both inside and outside of formal markets. The statistical discrimination enabled by sophisticated analytics compounds the disadvantages imposed by the structural constraints we readily associate with race, class, gender, disability, and cultural identity, constraints that shape the opportunity sets people encounter during their lives. Please see Do We Regulate Algorithms, or Do Algorithms Regulate Us?

Seizing Opportunities, Preserving Values

The recent White House report, “Big Data: Seizing Opportunities, Preserving Values," found that, "while big data can be used for great social good, it can also be used in ways that perpetrate social harms or render outcomes that have inequitable impacts, even when discrimination is not intended." The fact sheet accompanying the White House report warns:
As more decisions about our commercial and personal lives are determined by algorithms and automated processes, we must pay careful attention that big data does not systematically disadvantage certain groups, whether inadvertently or intentionally. We must prevent new modes of discrimination that some uses of big data may enable, particularly with regard to longstanding civil rights protections in housing, employment, and credit.
Some of the most profound challenges revealed by the White House Report concern how big data analytics may lead to disparate inequitable treatment, particularly of disadvantaged groups, or create such an opaque decision-making environment that individual autonomy is lost in an impenetrable set of algorithms.

Big data analytic systems, like those used by employers in making hiring decisions, have become sources of material risk, both to job applicants and employers. The systems create the perception of stability through probabilistic reasoning and the experience of accuracy, reliability, and comprehensiveness through automation and presentation. But in so doing, the systems draw attention away from uncertainty and partiality.

Moreover, they shroud opacity (and the challenges for oversight that opacity presents) in the guise of legitimacy, providing the allure of shortcuts and safe harbors. Programming and mathematical idiom (e.g., correlations) can shield layers of embedded assumptions from higher level decisionmakers at an employer who are charged with meaningful oversight and can mask important concerns with a veneer of transparency. This problem is compounded in the case of regulators outside the firm, who frequently lack the resources or vantage to peer inside buried decision processes.

Recognizing these problems, the White House Report states that "[t]he federal government must pay attention to the potential for big data technologies to facilitate discrimination inconsistent with the country’s laws and values" and contains the following recommendation:
The federal government’s lead civil rights and consumer protection agencies, including the Department of Justice, the Federal Trade Commission, the Consumer Financial Protection Bureau, and the Equal Employment Opportunity Commission, should expand their technical expertise to be able to identify practices and outcomes facilitated by big data analytics that have a discriminatory impact on protected classes, and develop a plan for investigating and resolving violations of law in such cases. 

Due Process for Automated Decisions?

Danielle Keats Citron and Frank Pasquale III have argued that scoring/selection systems should be subject to licensing and to audit requirements when they enter critical settings like employment, insurance, and health care. The idea is that with a technology as sensitive as scoring/selection, fair, accurate, and replicable use of data is critical.

Licensing can serve as a way of assuring that public values inform this technology. Such licensing could be completed by private entities that are themselves licensed by the relevant government agency (e.g., EEOC, FTC). This “licensing at one remove” has proven useful in the context of health information technology.

Keats Citron and Pasquale also argue that the federal government’s lead civil rights and consumer protection agencies should be given access to hiring systems, credit-scoring systems and other systems that have the potential to unlawfully harm citizens and consumers. Access could be more or less episodic depending on the extent of unfairness exhibited by the scoring system. Biannual audits would make sense for most scoring systems; more frequent monitoring would be necessary for those which had engaged in troubling conduct. We should be particularly focused on scoring systems which rank and rate individuals who can do little or nothing to protect themselves. Expert technologists could test scoring systems for bias, arbitrariness, and unfair mischaracterizations. To do so, they would need to view not only the datasets mined by scoring systems but also the source code and programmers’ notes describing the variables, correlations, and inferences embedded in the scoring systems’ algorithms.

For the review to be meaningful in an era of great technological change, the technical experts must be able to meaningfully assess systems whose predictions change pursuant to artificial intelligence (AI) logic. They should detect patterns and correlations tied to classifications that are already suspect under American law, such as race, nationality, sexual orientation, and gender. Scoring systems should be run through testing suites that run expected and unexpected hypothetical scenarios designed by policy experts. Testing reflects the norm of proper software development, and would help detect both programmers’ bias and bias emerging from the AI system’s evolution.
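
One check such a testing suite might contain can be sketched as follows. The excerpt does not specify any particular test; this example swaps in the "four-fifths" rule of thumb from U.S. employment selection guidelines, under which a group whose selection rate falls below 80 percent of the highest group's rate is flagged for closer review. The function names, the group labels, and the sample outcomes are hypothetical, and a flag is a prompt for investigation, not proof of unlawful discrimination.

```python
from collections import defaultdict


def selection_rates(outcomes):
    """outcomes: iterable of (group, selected) pairs -> {group: selection rate}."""
    totals = defaultdict(int)
    selected = defaultdict(int)
    for group, was_selected in outcomes:
        totals[group] += 1
        selected[group] += int(was_selected)
    return {group: selected[group] / totals[group] for group in totals}


def impact_ratios(rates):
    """Each group's selection rate relative to the highest group's rate."""
    best = max(rates.values())
    if best == 0:
        # Nobody was selected in any group, so no group is favored.
        return {group: 1.0 for group in rates}
    return {group: rate / best for group, rate in rates.items()}


def flag_groups(outcomes, threshold=0.8):
    """Groups whose impact ratio falls below the threshold (assumed 0.8)."""
    ratios = impact_ratios(selection_rates(outcomes))
    return sorted(group for group, ratio in ratios.items() if ratio < threshold)


# Hypothetical scoring results: 60 of 100 selected in group A, 30 of 100 in group B.
sample = (
    [("A", True)] * 60 + [("A", False)] * 40
    + [("B", True)] * 30 + [("B", False)] * 70
)
print(selection_rates(sample))
print(flag_groups(sample))
```

Running the same check on fresh outputs at regular intervals is one way an auditor could notice bias that emerges as a learning system changes over time, rather than bias present only at launch.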

A potentially more difficult question concerns whether scoring/selection systems’ source code, algorithmic predictions, and modeling should be transparent to affected individuals and ultimately the public at large. There are legitimate arguments for some level of big data secrecy, including concerns connected to intellectual property, but these concerns are more than outweighed by the threats to human dignity posed by pervasive, secret, and automated scoring systems.

At the very least, individuals should have a meaningful form of notice and a chance to challenge predictive scores that harm their ability to obtain credit, jobs, housing, and other important opportunities. Even if scorers (e.g., testing companies) successfully press to maintain the confidentiality of their proprietary code and algorithms vis-a-vis the public at large, it is still possible for independent third parties to review it.

One possibility is that in any individual adjudication, the technical aspects of the system could be covered by a protective order requiring their confidentiality. Another possibility is to limit disclosure of the scoring system to trusted neutral experts. Those experts could be entrusted to assess the inferences and correlations contained in the audit trails. They could assess if scores are based on illegitimate characteristics such as disability, race, nationality, or gender or on mischaracterizations. This possibility would both protect scorers’ intellectual property and individuals’ interests.

Do We Regulate Algorithms, or Do Algorithms Regulate Us?

Can an algorithm be agnostic? Algorithms may be rule-based mechanisms that fulfill requests, but they are also governing agents that are choosing between competing, and sometimes conflicting, data objects.

The potential and pitfalls of an increasingly algorithmic world raise the question of whether legal and policy changes are needed to regulate our changing environment. Should we regulate, or further regulate, algorithms in certain contexts? What would such regulation look like? Is it even possible? What ill effects might regulation itself cause? Given the ubiquity of algorithms, do they, in a sense, regulate us? We regulate markets, and market behavior, out of concerns for equity, as well as out of concern for efficiency. The fact that the impacts of design flaws are inequitably distributed is at least one basis for justifying regulatory intervention.

The regulatory challenge is to find ways to internalize the many external costs generated by the rapidly expanding use of analytics. That is, to find ways to force the providers and users of discriminatory technologies to pay the full social costs of their use. Requirements to warn, or otherwise inform users and their customers about the risks associated with the use of these systems should not absolve system producers of their own responsibility for reducing or mitigating the harms. This is part of imposing economic burdens or using incentives as tools to shape behavior most efficiently and effectively. Please see Do We Regulate Algorithms, Or Do Algorithms Regulate Us?

Tuesday, June 17, 2014

Recession and Unemployment Cause Spike in Suicides

A new study has found the financial crisis between 2008 and 2010 was responsible for thousands of deaths across Europe and North America. The researchers found that suicide rates spiked sharply between 2008 and 2010 as millions lost their jobs and homes, and levels of debt surged. The spike in suicide rates reversed a downward trend in Europe and Canada in the years before the crash.

Experts from the University of Oxford and the London School of Hygiene and Tropical Medicine said the increase was four times higher among men than women. They analysed data from the World Health Organisation about suicides in 24 EU countries, the U.S. and Canada. Between 2007, when the economic crisis began, and 2009, suicide rates rose in Europe by 6.5 per cent, they found. The rates remained elevated until 2011.

This corresponds to 7,950 more suicides than would have been expected across the 24 EU countries during this time period. Before the recession, suicide rates had been falling in Europe.

Deaths by suicide were also falling in Canada, but there was a marked increase when the recession took hold in 2008, leading to 240 more suicides.

The number of people taking their own life was already increasing in the US, but the rate "accelerated" with the economic crisis, leading to 4,750 additional deaths.

Overall there were at least 10,000 additional suicides as a result of the crisis, the authors say. But they added that this is a conservative estimate. "If we had an upper range, you could say ... 20,000," said David Stuckler, a sociology professor at the University of Oxford and senior author of the report.

Some countries coped with the economic shock better than others, with suicide rates in Sweden and Austria remaining steady. Stuckler said both countries had a good record of helping people find new jobs, a key factor in keeping suicide rates in check along with good mental health care.

Employment Challenges for Persons with Serious Mental Illness

This post consists of excerpts from testimony by Dr. Gary Bond, Professor of Psychiatry, Dartmouth Psychiatric Research Center, to the EEOC at a March 15, 2011 hearing on employment of persons with disabilities. Please click on this link for the complete written version of the testimony, together with all references.

There are many benefits of employment: work enhances skills such as communication, socialization, academics, physical health, and community skills; it factors into how one is perceived by society; it promotes economic well-being; it leads to greater opportunity for upward mobility; and it contributes to greater self-esteem. Yet only 15 percent of those with a mental disability are in the labor market.

* * * * * * *

People with serious mental illness are a very heterogeneous group that has included Nobel Prize winners, American Presidents, artists, and other famous persons, as well as many who live in poverty and isolation. You cannot judge a person by their diagnosis. While the public has many negative stereotypes about this group, the take-home message from this testimony is that the research strongly demonstrates that full recovery from mental illness is possible. Working is a crucial element in this recovery process.

What is Serious Mental Illness?

This testimony concerns the population of people with serious mental illness, defined by three criteria:

(a) Diagnosis: a psychiatric diagnosis of schizophrenia, bipolar disorder, or other major psychiatric disorder;
(b) Disability: significant role impairment, in areas such as independent living, interpersonal functioning, and employment;
(c) Duration: extended involvement with the mental health system (such as admission to psychiatric hospitals, supervised group homes, and mental health case management services).

This is a large segment of the disability population. For example, over one-third of people on the Social Security disability rolls have a serious mental illness.

Employment of People with Serious Mental Illness

Employment rates for people with serious mental illness are very low. Surveys have found that only 10% - 15% of people with serious mental illness receiving community treatment are competitively employed. Rates are even lower, typically less than 5%, in follow-up surveys of people discharged from psychiatric hospitals. National and international surveys of community samples, which include respondents with less serious disorders, have reported employment rates of 20% - 25% for people with schizophrenia and related disorders.

The employment rate for people with serious mental illness is less than half the 33% rate for other disability groups. Both rates are, of course, much lower than for the general population. Even during the height of the current recession, the national employment rate for adults in the general population was 72%, according to U.S. Bureau of Labor Statistics.

Moreover, people with serious mental illness who are working are often underemployed. Nearly twice as many workers with mental illness earn at or near minimum wage as workers without disabilities. Non-standard jobs (such as temporary employment, independent contracting, and part-time employment) are common among workers with serious mental illness. Such jobs pay lower wages with fewer benefits. Among those employed, people with serious mental illness are overrepresented in unskilled occupations, such as in the service industries and as laborers.

Most People with Serious Mental Illness Want to Work
Despite these dismal employment statistics, most people with severe mental illness want to work. Studies indicate that approximately 2 out of every 3 people with mental illness are interested in competitive employment. Moreover, these rates may understate the interest in working in this population, because many mental health professionals discourage clients from pursuing employment goals.
Barriers to Getting and Keeping Jobs
Why, then, is there a wide disparity between employment rates and desire to work? The reason is not that people with serious mental illness cannot work. People with serious mental illness are capable of working if they are matched to appropriate jobs and receive appropriate supports. But attitudinal, service, and system barriers are challenges to their employment. According to a national survey, persons with serious mental illness reported the primary barriers to employment to be:
  • stigma and discrimination (45%); 
  • fear of losing benefits (40%);
  • inadequate treatment of disability (28%);
  • lack of vocational services (23%).

Regarding attitudinal barriers, psychiatric disability is the most stigmatizing of all disabilities. One national survey reported that only 19% of those polled were “very comfortable with people with mental illness,” compared to triple that rate for people with a physical disability (National Organization on Disability, 1991). People with serious mental illness experience discrimination and negative attitudes constantly in everyday life. Employers are less likely to hire someone whom they believe has a mental illness.
One major attitudinal barrier is the perception that people with mental illness are violent. News stories involving atrocities committed by people with mental illness reinforce this perception. Based on extensive epidemiological research over the past two decades, we have a much better understanding of the risk factors for violence in the psychiatric population. Violence is exceedingly rare among people with mental illness, and the rare instances that do occur are associated with other factors, such as active substance use or refusing to take medications. Being employed significantly reduces the possibility of violence even further. In sum, a very low proportion of people with mental illness have a history of violence and overall people with mental illness are no more likely to behave violently than people without mental illness.
Another barrier is the fear of losing Social Security disability benefits (MacDonald-Wilson, Rogers, Ellison, & Lyass, 2003). Many of these apprehensions are based on lack of information and misconceptions. In one survey, 85% of Social Security disability beneficiaries incorrectly believed that Medicaid benefits would be terminated if they went to work (MacDonald-Wilson, Rogers, Ellison et al., 2003). Fortunately, when provided accurate information about the impact of employment, beneficiaries substantially increase their employment earnings (Tremblay, Smith, Xie, & Drake, 2006).
Once employed, people with serious mental illness often need ongoing support and accommodations to succeed. But employers are far less willing to accommodate people with psychiatric disabilities than those with physical conditions. Workers with mental health conditions are half as likely to receive accommodations as those with other disabilities. This is true even though most accommodations for psychiatric disabilities cost very little or nothing, in contrast to technological and architectural changes required for other disabilities. According to one employer survey, the kinds of functional limitations they most commonly observe in workers with psychiatric disabilities are cognitive (e.g., following instructions, concentrating) and social (e.g., interacting, reading social cues), and to a lesser extent emotional (e.g., managing symptoms, tolerating stress) and physical (e.g., stamina).
The onset of serious mental illness often occurs in early adulthood, interfering with the completion of education. Over 30% of people with severe mental illness have not completed high school. Low educational attainment contributes to the underemployment of people with serious mental illness. The median annual income of the U.S. population without a high school diploma or equivalent is less than $20,000. Median income increases 33% with completion of high school and more than triples with the completion of a bachelor’s degree.
The dismal rate of employment among people with serious mental illness is a formidable challenge. Nonetheless, we have compelling reasons to be optimistic that people with serious mental health problems can work and that working helps them to recover from mental illness. The major barriers to employment are not immutable clinical or cognitive characteristics but rather attitudes and lack of access to support services and accommodations.
Even in the absence of professional support, employers can play a pivotal role promoting the employment of people with serious mental illness by approaching applicants and workers as individuals and applying sound employment practices. Work accommodations typically involve pragmatic and inexpensive modifications. Employment is a win-win for people with serious mental illness, for employers, and for society at large.

Saturday, June 14, 2014

Algorithms: Deeply Human Choices Behind Cold Mechanisms

This post consists of excerpts and a graphic from Rethinking Personal Data: A New Lens for Strengthening Trust, a document published by the World Economic Forum and prepared in collaboration with A.T. Kearney. The document addresses the key trust challenges facing the personal data economy, and offers a set of near-term and long-term insights for addressing these issues.

* * * * * * *

Complex and opaque, algorithms generate the predictions, recommendations and inferences for decision-making in a data-driven society. While easily dismissed as abstract empirical processes, algorithms are deeply human. They reflect the intentions and values of the individuals and institutions which design and deploy them. The ability for algorithms to augment existing power asymmetries gives rise to debates on their influence over data-driven policy-making.

These debates are complex and value-laden, and they give rise to some fundamental societal choices. Questions of individual autonomy, the sovereignty of individuals, digital human rights, equitable value distribution and free will are all a part of these conversations. There are no easy answers. Through this long-term lens on the impact of proactive computing, the focal point for discussion begins to shift away from personal data, per se, to computer-based profiles of individuals and groups of individuals. These profiles, fueled by fine-grained behavioral and sensor data, make it possible to monitor, predict and instrument social phenomena at the micro and macro levels.

The world of “smart” environments, where cars, eyeglasses and just about everything else coalesce into the Internet of Things, creates a sea change in how data will be processed. Rather than being based on “interactive” human-machine computing, smart environments rely upon “proactive computing”. By design, these proactive environments are one step ahead of individuals. Connected cars need to anticipate accidents before they happen. Evacuating flood prone areas needs to occur before major storms hit.

The emphasis on proactive computing will change the role of human intervention from a governance perspective. Lacking a full understanding of how complex systems work, the ability of humans to understand, make decisions and adapt can be too slow, incomplete and unreliable. In this brave new world, building trust from the “principles up” will be essential and require new forms of governance that are open, inclusive, self-healing and generative.

From a community and societal perspective, as civil “regulation-by-algorithm” begins to scale, incumbent interests and power asymmetries will play an increasing role in establishing who gets access to an array of commercial and governmental services. As such, there is a need to ensure that the algorithms driving proactive and anticipatory decisions will be lawful, fair and can be explained intelligibly. Meaningful responses must be given “when individuals are singled out to receive differentiated treatment by an automated recommendation system”.

One emerging set of concerns is the institutional ability “to discover and exploit the limits of an individual’s ability to pursue their own self-interest.” Given that a majority of consumer interactions in the future will be mediated via devices and commercially oriented communications platforms, data-centric institutions will have the means and incentives to trigger “predictable irrationality” from individuals.

With a vast trail of “digital breadcrumbs” accessible for companies to mine and tailor highly personalized experiences, a growing set of concerns is arising on how individuals could be profiled and targeted at moments of key vulnerability (decision fatigue, information overload, etc.) and limit their ability to act with agency and in their own self-interest. With the lives of individuals becoming increasingly mediated by algorithms, a richer understanding is needed for how people adapt their behaviors to empower themselves and gain more control over the manner of how profiles and algorithms shape their lives in areas such as credit scores, retail experiences, differential pricing, reputational currencies, insurance rates, etc.

One of the most strategic insights on strengthening trust is the concept of exploring ways to share the intended consequences of data usage with individuals. For example, the 2012 Draft European Data Protection Act (section 20) calls for “the obligation for data controllers to provide information about the envisaged effects of such processing on the data subject”.

To address this emerging set of concerns, establishing a cross-disciplinary community of forward-looking experts, complexity scientists, biologists, policy-makers and business leaders with an appreciation of the long-term societal impact was identified as a priority. This group would proactively help design and test systems that balanced the commercial, legal, civil and technological incentives shaping outcomes at the individual and social level. They would need to develop some form of legal protection to limit liabilities and provide a safe space to explore complex issues in a real-world setting. One attribute of this safe space would be for it to be governed by an institutional review board where ethics and the interests of individuals could have a meaningful and relevant voice (similar to how they are used by the biomedical and behavioural science sectors). Institutions concerned about legal uncertainties, regulatory action or civil lawsuits could have a richer means for assessing ethical concerns using these approaches.

Friday, June 13, 2014

Exacerbating Long-Term Unemployment: Big Data and Employment Assessments

A recent Brookings Institution paper states that the “diverse and varied set of characteristics [of the long-term unemployed] implies that a broad array of policies will be needed to substantially lower the long-term unemployment rate and stem labor force withdrawal, as concentrating on any single occupation, industry, demographic group or region is unlikely to have a substantial impact reducing long-term unemployment by itself." Please see On the Margins of the Labor Market.

There is, however, a common employment factor that can be linked to numerous occupations, industries, demographic groups and regions -- online job application processes that require individuals (i) to provide "location-based information" (i.e., distance from job site, commute time, household relocation) and (ii) to complete personality assessments. The screening elements in these processes exclude or penalize persons with lower socioeconomic status - disproportionately Blacks, Hispanics, persons with mental illness, and the less well-educated. These are the same groups (e.g., persons with mental illness) that the recent Brookings Institution paper found to comprise a disproportionate percentage of the long-term unemployed.

Jobs that were once filled on the basis of work history and interviews are left to personality tests, data analysis and algorithms. The new hiring tools are part of a broader effort to gather and analyze employee data. Use of online assessments has grown exponentially over the past 10-15 years, with assessment companies like Kronos now having a database of information on hundreds of millions of job applicants and employees. To provide a sense of scale, one major big box retailer processes more than nine million job applications a year.

Personality tests are “growing like wildfire,” said Josh Bersin, president and CEO of Bersin & Associates, an Oakland, Calif., research firm. Bersin estimated that this kind of pre-hire testing has been growing by as much as 20 percent annually in the past few years. Industries that are flooded with resumes such as retail, food service and hospitality are among the ones that use such tests most often, he said.

Employment Redlining: Location-Based Discrimination

Kenexa, an assessment company purchased by IBM in December 2012 for $1.3 billion, will test tens of millions of applicants this year for thousands of clients. Kenexa believes that a lengthy commute raises the risk of attrition in call-center and fast-food jobs. It asks applicants for call-center and fast-food jobs to describe their commute by picking options ranging from "less than 10 minutes" to "more than 45 minutes." The longer the commute, the lower their recommendation score for these jobs, said Jeff Weekley, who oversees the assessments. Applicants also can be asked how long they have been at their current address and how many times they have moved. People who move more frequently "have a higher likelihood of leaving," Mr. Weekley said.

Painting with the broad brush of distance from job site, commute time and moving frequency results in otherwise well-qualified applicants being excluded, applicants who might have ended up being among the longest tenured of employees. The Kenexa  findings are generalized correlations; the insights say nothing about any particular applicant. Please see From What Distance is Discrimination Acceptable.

Are there any groups of people who might live farther from the work site and may move more frequently than others? Yes, lower-income persons, disproportionately women, black, Hispanic and the mentally ill. They can't afford to live where the jobs are and move more frequently because of an inability to afford housing or the loss of employment.

Spatial Mismatch and its Institutionalization

An NBER study published in April 2014, "Job Displacement and the Duration of Joblessness: The Role of Spatial Mismatch," finds that better job accessibility significantly decreases the duration of joblessness among lower-paid displaced workers. Blacks, females, and older workers are more sensitive to job accessibility than other subpopulations.

The so-called “spatial mismatch hypothesis,” which originally grew out of research on the effects of segregated housing markets, has been debated among economists and social scientists since the 1960s. But while there’s general agreement that “job accessibility” has some impact on unemployment duration, researchers have disagreed about how important it is and for which groups of workers.

Although the study was limited to the 2000-05 period, its conclusion — that “a worker with locally inferior access to jobs is likely to have worse labor market outcomes” — could help explain the current situation. What we know for sure is that as of March 2014, more than a third (35.7%) of all unemployed Americans had been out of work for more than 26 weeks, according to the BLS. Blacks and Asians are most likely to experience extended joblessness: Last month, 44% of unemployed blacks and about as many unemployed Asians had been out of work longer than 26 weeks, versus a third of unemployed whites and 32% of unemployed Hispanics. Please see Long-Term Unemployment and its Costs.

With the "location-based" scoring "insights" provided by companies like Kenexa, spatial mismatch has been institutionalized over the past 5-10 years. If a job applicant has a long commute, whether due to the lack of effective mass transit where the applicant lives or to the lack of access to personal transportation, that applicant may never be interviewed, let alone offered a job.

Mental Illness and Socioeconomic Status

One of the most consistently replicated findings in the social sciences has been the negative relationship of socioeconomic status (SES) with mental illness: The lower the SES of an individual is, the higher is his or her risk of mental illness.

As an example, for the period from 2005-2010, the Centers for Disease Control found that among adults 20–44 and 45–64 years of age, depression was five times as high for those below poverty, about three times as high for those with family income at 100%–199% of poverty, and 60% higher for those with income at 200%–399% of poverty compared with those at 400% or more of the poverty level.

According to a 2001 study, lower income Americans had a higher prevalence of 1 or more psychiatric disorders (51% vs 28%): mood disorders (33% vs 16%), anxiety disorders (36% vs 11%), and eating disorders (10% vs 7%). Consequently, pre-employment assessments using these location-based "insights" screen out persons with mental illness.

Mental Illness and Disability

The prevalence of mental disorders in the U.S. population remained unchanged between 1990 and 2003.  In that same interval, the rate of treatment of mental illness substantially increased—which in turn should have contributed to improved work-readiness among individuals coping with mental illness. The combination of the prevalence of mental disorders remaining unchanged and substantially increased rates of treatment should have resulted in a decline in the percentage of persons receiving SSDI awards who are diagnosed with mental illness. That has not been the case. Please see Costing Taxpayers Billions of Dollars Each Year.

People with psychiatric impairments constitute the largest and most rapidly growing subgroup of Social Security disability beneficiaries. In 2011, 47.5 percent of persons receiving SSI and 31.0 percent of persons receiving SSDI had a mental disorder. These percentages keep growing, in part because beneficiaries with psychiatric impairments are generally younger than other beneficiaries when they become ill and therefore remain on the Social Security rolls much longer.

Some analysts contend that rising disability awards for mental illness reflect a “broken” system that provides benefits to those who should not receive them; others point out that income support makes it easier for persons with mental illness to live in the community. These conflicting conclusions reflect an ongoing debate over whether increasing awards for mental illness represent a policy success because they reach needy individuals or failure because the increased awards reflect moral hazard.

The income support programs may be working as designed, but those programs did not anticipate the impact of the widespread use of pre-employment assessments and the resulting material increase in the absolute number and percentage of unemployed persons with mental disabilities seeking SSDI and SSI benefits as a consequence of the use of potentially  illegal assessments.

* * * * *

Persistently high long-term unemployment has significant implications for families, government budgets, and the country’s overall economic and social health. The high rate of long-term unemployment has had a direct impact on the federal budget by prompting the extension of normal unemployment benefits, ratcheting up spending on other government safety-net programs (including, indirectly, SSDI, SSI and Medicare) and by reducing taxable wages. Martin Feldstein, in a recent article in the Wall Street Journal, draws on the Brookings Institution paper to suggest that those who have been out of work for six months or more do not affect wage inflation and that, since the unemployment rate among those out of work for less than six months was only 4.1%, wage inflation may soon begin to rise more rapidly.

The growing and widespread use of employment assessments and applicant data collection processes over the past ten years has likely had an impact on the growth of the long-term unemployed in the U.S. labor market.  Persons with lower socioeconomic status, disproportionately Black, Hispanic, persons with mental illness, and the less well-educated, risk becoming a permanent underclass of the unemployed and underemployed.

Some of the most profound challenges revealed by the recent White House Report "Big Data: Seizing Opportunities, Preserving Values" concern how big data analytics may lead to disparate inequitable treatment, particularly of disadvantaged groups, or create such an opaque decision-making environment that individual autonomy is lost in an impenetrable set of algorithms. Please see White House: Big Data's Role in Employment Discrimination.

Workforce assessment systems, designed in part to mitigate risks for employers, have become sources of material risk, both to job applicants and employers. The systems create the perception of stability through probabilistic reasoning and the experience of accuracy, reliability, and comprehensiveness through automation and presentation. But in so doing, technology systems draw  attention away from uncertainty and partiality. Moreover, they shroud opacity—and the challenges for oversight that opacity presents—in the guise of legitimacy, providing the allure of shortcuts and safe harbors for actors both challenged by resource constraints and desperate for acceptable means to demonstrate compliance with legal mandates and market expectations.

Long Term Unemployment and its Costs

In the first quarter of 2012 (the 3-month period from January to March), approximately 29.5 percent of the nearly 13.3 million Americans who were unemployed had been jobless for a year or more, according to data released by the U.S. Department of Labor’s Bureau of Labor Statistics (BLS). That percentage translates into 3.9 million workers, slightly more than the population of Oregon. The total unemployment rate (short-term and long-term unemployed) is currently at 6.3%, but an unusually large one-third of those who are counted as unemployed have been out of work for more than six months.

According to Pew analysis of Current Population Survey (CPS) data from the BLS, the percentage of jobless workers who had been unemployed for a year or more reached a peak of 31.8 percent in the third quarter of 2011. Despite modest improvement in the first quarter of 2012, the rate of long-term unemployment among the jobless remained stubbornly high. In fact, it was more than triple the 9.5 percent rate that it was in the first quarter of 2008, the first quarter of the Great Recession (see following table).
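The two headline figures above follow directly from the quoted percentages. As a quick arithmetic check (a sketch in Python that uses only the numbers cited in the text; the variable names are illustrative):

```python
# Figures quoted in the text (BLS data and Pew analysis); no new data.
unemployed_millions = 13.3    # total unemployed, first quarter of 2012
share_year_or_more = 0.295    # share jobless for a year or more, Q1 2012
share_q1_2008 = 0.095         # the same share in the first quarter of 2008

# Number of workers jobless for a year or more, in millions.
long_term_millions = unemployed_millions * share_year_or_more

# How many times higher the Q1 2012 share is than the Q1 2008 share.
ratio_to_2008 = share_year_or_more / share_q1_2008

print(f"Long-term unemployed: {long_term_millions:.1f} million")
print(f"Q1 2012 share relative to Q1 2008: {ratio_to_2008:.1f}x")
```

The first line prints 3.9 million, matching the figure in the text, and the second prints a ratio of about 3.1, which is consistent with "more than triple."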

As discussed in A Year or More: The High Cost of Long-Term Unemployment, a report released by the Pew Fiscal Analysis Initiative in April 2010, persistently high long-term unemployment has significant implications for families, government budgets, and the country’s overall economic and social health.

A high rate of long-term unemployment has had a direct impact on the federal budget by prompting the extension of normal unemployment benefits, ratcheting up spending on other government safety-net programs (i.e., SSDI and SSI) and by reducing taxable wages. Please see Costing Taxpayers Billions of Dollars.

Long-term unemployment also affects the federal budget on the other side of the fiscal ledger by reducing income tax revenue and the amount of money flowing into the unemployment insurance pool. Unemployment benefits are taxable, but people on the unemployment rolls are receiving only a fraction of the income they would be getting if they were working. As a result, they are paying only a fraction of the taxes.

On the Margins of the Labor Market

In “Are the Long-Term Unemployed on the Margins of the Labor Market?” Alan B. Krueger, Judd Cramer, and David Cho of Princeton University find that even after finding another job, reemployment does not fully reset the clock for the long-term unemployed, who are frequently jobless again soon after they gain reemployment. Even in good times, the long-term unemployed are on the margins of the labor market, with diminished job prospects and high labor force withdrawal rates, and as a result they exert little pressure on wage growth or inflation.

Justin Wolfers, a Senior Fellow, Economic Studies at Brookings Institution, the publisher of the paper, discusses the findings of the research paper in the video below:

Transitioning from Unemployment to Employment

The graphic below displays annual averages of monthly transition rates from unemployment to employment each year since 1994 for five duration-of-unemployment categories:
  1. Unemployed less than 5 weeks
  2. Unemployed for 5 to 14 weeks
  3. Unemployed for 15 to 26 weeks
  4. Unemployed for 27 to 52 weeks
  5. Unemployed for 52 weeks and over

Probability of Transitioning from Unemployment to Employment by Duration of Unemployment
A few patterns are clear. First, the job-finding rate is lower for those with a longer duration of unemployment, with the long-term unemployed finding jobs at less than half the rate of those very short-term unemployed. Second, the cyclicality of job finding is clear in these data, with all rates declining during the recession of the early 2000s, and declining more dramatically during the Great Recession. Third, job finding rates for all groups remain well below their pre-Great Recession averages. Fourth, the job finding rate has risen for each group in the last four years, although it has barely increased for those unemployed longer than a year.
In 2013, just under 10 percent of those who had been unemployed for more than one year transitioned into employment in the average month. This rate, though higher than in many European countries, might overstate how well the long-term unemployed are faring due to measurement error and the fact that the long-term unemployed are more likely to take low-paying, part-time jobs and temporary jobs.

On the Margins

The total unemployment rate (short-term and long-term unemployed) is currently at 6.3%, but an unusually large one-third of those who are counted as unemployed have been out of work for more than six months. The research paper indicates that the longer workers are unemployed the less they become tied to the job market, either because, on the supply side, they grow discouraged and search for a job less intensively or because, on the demand side, employers discriminate against the long-term unemployed, based on the (rational or irrational) expectation that there is a productivity-related reason that accounts for their long jobless spell.

The demand-side and supply-side effects of long-term unemployment can be viewed as complementary and reinforcing of each other as opposed to competing explanations, as statistical discrimination against the long-term unemployed could lead to discouragement, and skill erosion that accompanies long-term unemployment could induce employers to discriminate against the long-term unemployed.

As shown in the graphic below, even after finding another job, reemployment does not fully reset the clock for the long-term unemployed, who are frequently jobless again soon after they gain reemployment: only 11 percent of those who were long-term unemployed in a given month returned to steady, full-time employment a year later.

Comparing the Unemployed and Employed

If the long-term unemployed are compared to the short-term unemployed, a larger proportion of the long-term unemployed are over age 50.
African Americans and Hispanics are also overrepresented among the ranks of the unemployed compared with the employed. For example, African Americans comprise 22 percent of the long-term unemployed, compared with just 10 percent of the employed population.
If the unemployed as a whole are compared to the employed, notably larger shares of the unemployed are less well educated. For example, although about one third of employed workers have earned a bachelor’s degree, less than 20 percent of the unemployed have done so. By contrast, nearly 20 percent of the unemployed lack a high school diploma, which is twice the rate for the employed.
The long-term unemployed have problems finding work wherever they are, even in states with the lowest unemployment – Hawaii, Iowa, Kansas, Minnesota, Montana, Nebraska, New Hampshire, North Dakota, Oklahoma, South Dakota, Utah, Vermont, Virginia, and Wyoming – where the average unemployment rate was 4.4 percent (compared to 7.0 percent elsewhere). Even in those 14 states, long-term unemployment grew dramatically during the recession, reaching 4.5 times its historical average, suggesting that long-term unemployment will be a lingering problem even if the unemployment rate returns to normal.

Overall, there is little evidence to suggest that the long-term unemployed fare substantially better in the states with the lowest unemployment rates, consistent with the idea that the long-term unemployed are on the margins of the labor force, even where the economy is stronger.

Unlucky Subset 

The portrait of the long-term unemployed in the U.S. that emerges here suggests that, to a considerable extent, they are an unlucky subset of the unemployed. Their diverse and varied set of characteristics implies that a broad array of policies will be needed to substantially lower the long-term unemployment rate and stem labor force withdrawal, as concentrating on any single occupation, industry, demographic group or region is unlikely to have a substantial impact reducing long-term unemployment by itself. Understanding the labor market and personal hurdles faced by the long-term unemployed should be a priority for future research in order to craft solutions to reduce long-term unemployment.

To the authors of the paper, the most important policy challenges involve designing effective interventions to prevent the long-term unemployed from receding into the margins of the labor market or withdrawing from the labor force altogether, and supporting those who have left the labor force to engage in productive activities. Overcoming the obstacles that prevent many of the long-term unemployed from finding gainful employment, even in good times, will require a concerted effort.

Friday, June 6, 2014

Primer on Big Data and Hiring: Chapter 7

This is the seventh chapter of a primer on big data and hiring. The structure of the primer is based on the following graphic created by Evolv, a company that provides "workforce optimization" services. Evolv was selected not because it is sui generis; rather, it is emblematic of numerous companies, from start-ups to well-established companies that market "workforce science" services to employers.

The Evolv graphic below is intended to illustrate the process of workforce science.

Chapter 7: Optimize
Closed-Loop Optimization Constantly Analyzes and Refines Insights

According to Evolv, "closed-loop optimization is the process of using Big Data analytics to determine the outcomes of the assessments and other data collected, and then using the knowledge gained to make ever more effective assessments." Click on this link for an Evolv video that describes the closed-loop optimization process.

The challenge in using a closed-loop optimization process for hiring and employment decisions is that those decisions do not fit within a closed loop. Take for example the Evolv insight that living in close proximity to the job site is correlated with reduced attrition and better performance. Over time, the closed-loop optimization process for that insight means that a growing percentage of the workforce lives in close proximity to the job site. Excellent. Less attrition and better performance across job sites.

That closed loop, however, does not account for factors like the element of time and the relative immobility of persons and companies. Businesses tend to be clustered; they are not evenly spread throughout the geography. If all businesses in a particular area focus on hiring applicants in close proximity, costs will increase (greater demand for the same number of applicants), employee turnover will increase (since the number of geographically-proximate employees changes slowly) and profitability will decrease (higher wage costs combined with greater turnover).

When two variables, A and B, are found to be correlated, there are several possibilities:
  • A causes B
  • B causes A
  • A causes B at the same time as B causes A (a self-reinforcing system)
  • Some third factor causes both A and B
  • The correlation is simple coincidence

It is wrong to assume any one of these possibilities without further evidence. Evolv, however, assumes that A (proximity to job site) causes B (reduced attrition and better performance). On that assumption, employers should hire applicants who live closer to the job site.

The correlation could also show that B (reduced attrition and better performance) is caused by C (proximity of the job site to applicants' homes). Rather than a hiring insight, the correlation might function better as a job site location insight. Given the relative immobility of persons and companies, locating a job site (call center, etc.) close to communities with high numbers of lower-income persons could lead to a more sustainable competitive advantage.
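The "third factor" possibility in the list above can be illustrated with a small simulation. This is only a sketch under invented assumptions: the hidden factor `c`, the noise levels and the sample size are all hypothetical and have nothing to do with Evolv's actual data. A is given no effect on B at all, yet the two come out clearly correlated until the hidden factor is held fixed.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=5000, seed=0):
    """Draw (a, b, c) rows where a hidden factor c drives both a and b.

    Hypothetical setup: a never influences b and b never influences a.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        c = 1 if rng.random() < 0.5 else 0   # hidden third factor
        a = c + rng.gauss(0, 0.5)            # A depends only on c
        b = c + rng.gauss(0, 0.5)            # B depends only on c
        rows.append((a, b, c))
    return rows


rows = simulate()

# Correlation between A and B across everyone.
overall = pearson([a for a, _, _ in rows], [b for _, b, _ in rows])

# Correlation between A and B once the hidden factor is held fixed.
within = {
    level: pearson(
        [a for a, _, c in rows if c == level],
        [b for _, b, c in rows if c == level],
    )
    for level in (0, 1)
}

print(f"A vs. B, all rows: {overall:.2f}")
for level, r in within.items():
    print(f"A vs. B, hidden factor = {level}: {r:.2f}")
```

Acting on the overall figure as if A caused B would change nothing in this toy world, because only the hidden factor matters.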

As David Brooks wrote, "Data struggles with context. Human decisions are not discrete events. They are embedded in sequences and contexts. ... Data analysis is pretty bad at narrative and emergent thinking, and it cannot match the explanatory suppleness of even a mediocre novel."

Executives and managers frequently hear about some new software billed as the “next big thing.” They call the software provider and say, “We heard you have a great tool and we’d like a demonstration.” The software is certainly seductive with its bells and whistles, but its effectiveness and usefulness depend upon the validity of the information going in and how the people actually work with it over time. Having a tool is great, but remember that a fool with a tool is still a fool (and sometimes a dangerous fool).

Primer on Big Data and Hiring: Chapter 6

This is the sixth chapter of a primer on big data and hiring. The structure of the primer is based on the following graphic created by Evolv, a company that provides "workforce optimization" services. Evolv was selected not because it is sui generis; rather, it is emblematic of numerous companies, from start-ups to well-established companies that market "workforce science" services to employers.

The Evolv graphic below is intended to illustrate the process of workforce science.

Chapter 6: Evaluate and Act
Analyzed Data Reveals Insights That Drive Workforce Performance and Retention
The Impact of Insights Is Quantified and Used to Inform Decision-making

As noted in a prior post, prejudice does not rise from malice or hostile animus alone. It may result as well from insensitivity caused by simple want of careful, rational reflection.

For example, take two insights from Evolv:

  1. Living in close proximity to the job site and having access to reliable transportation are correlated with reduced attrition and better performance; and
  2. Referred employees have 10% longer tenure than non-referred employees and demonstrate approximately equal performance.

An employer confronted with these two insights might well determine that (i) applicants living beyond a certain distance from the job site (e.g., a retail store) should be excluded from employment consideration and (ii) preference in hiring should be extended to applicants referred by existing employees.

Painting with the broad brush of distance from job site will result in well-qualified applicants being excluded, applicants who might have ended up being among the longest tenured of employees. Remember that the Evolv insight is a generalized correlation (i.e., persons living closer to the job site tend to have longer tenure than persons living farther from the job site). The insight says nothing about any particular applicant.
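The gap between a group-level correlation and any single applicant can be made concrete with a toy simulation. Every number here is made up for illustration (the 30-mile range, the 15-mile cutoff, the tenure formula and its noise); none of it comes from Evolv. The sketch assumes tenure falls slightly with distance on average but varies widely from person to person.

```python
import random
from statistics import mean, median

CUTOFF_MILES = 15  # hypothetical screening rule: reject anyone farther away


def simulate_applicants(n=5000, seed=1):
    """Return (distance_miles, tenure_months) pairs under invented assumptions."""
    rng = random.Random(seed)
    applicants = []
    for _ in range(n):
        distance = rng.uniform(0, 30)
        # Weak average effect of distance, large individual variation.
        tenure = max(0.0, 18 - 0.2 * distance + rng.gauss(0, 6))
        applicants.append((distance, tenure))
    return applicants


applicants = simulate_applicants()
near = [t for d, t in applicants if d <= CUTOFF_MILES]
far = [t for d, t in applicants if d > CUTOFF_MILES]

# The generalized correlation holds: the near group stays longer on average.
print(f"mean tenure, near: {mean(near):.1f} months")
print(f"mean tenure, far:  {mean(far):.1f} months")

# Yet many excluded applicants would have outlasted the typical near hire.
near_median = median(near)
share_far_above = sum(t > near_median for t in far) / len(far)
print(f"excluded applicants beating the near-group median: {share_far_above:.0%}")
```

In this toy setup the distance rule is "right" on average and still turns away a sizable share of applicants who would have stayed longer than the typical person it admits.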

As a consequence, employers will pass over qualified applicants solely because they live (or don't live) in certain areas. Not only does the employer do a disservice to itself and the applicant, it also increases the risk of employment litigation, with its consequent costs. How?

A recent New York Times article, "In Climbing Income Ladder, Location Matters," reads, in part:

Her nearly four-hour round-trip [job commute] stems largely from the economic geography of Atlanta, which is one of America’s most affluent metropolitan areas yet also one of the most physically divided by income. The low-income neighborhoods here often stretch for miles, with rows of houses and low-slung apartments, interrupted by the occasional strip mall, and lacking much in the way of good-paying jobs.

The dearth of good-paying jobs in low-income neighborhoods means that residents of those neighborhoods have a longer commute. The 2010 Census showed that poverty rates are much higher for blacks and Hispanics than for whites. Consequently, hiring decisions predicated on distance discriminate, intentionally or not, against certain races.

Similarly, an employer extending a hiring preference to referrals from existing employees may further exacerbate the discriminatory impact of its hiring process. Those referrals tend to be persons from the same neighborhoods and socioeconomic backgrounds as existing employees, meaning that workforce diversity, broadly considered, will decline.

With the huge amounts of "bad" data generated and stored daily, the failure to understand how to put that data to practical, beneficial business use will increasingly lead to shaky insights and faulty decision-making, with significant costs to applicants, employees, employers and society.