Challenging Assumptions on Big Data

This is part of a Penton Technology special series on big data.

As you read through Penton Technology’s special report on big data and the opportunity it offers you as a service provider, you will learn how big data can be transformative if used properly. Marketers are not the only ones who can take advantage of the insights big data provides; governments, healthcare systems and others in the public arena can use big data to better serve their constituents, for example by providing adequate healthcare and housing.

As technologists, we often talk about big data as a technical challenge: How will we support the growth of data? Will our clouds scale with the data deluge? How will we secure all of this data? It can be easy to forget the other side of the conversation, which deals with the ethics of big data.

Discrimination and Big Data

The White House has been exploring this side of big data through the Obama Administration’s Big Data Working Group, in hopes of creating dialogue around some of these issues. The group has released a series of reports on big data; the most recent, released last month, examines the assumption that big data techniques are unbiased.

“The algorithmic systems that turn data into information are not infallible—they rely on the imperfect inputs, logic, probability, and people who design them. Predictors of success can become barriers to entry; careful marketing can be rooted in stereotype. Without deliberate care, these innovations can easily hardwire discrimination, reinforce bias, and mask opportunity,” according to a blog post.

In some cases, these biases could shut underserved and low-income families out of credit and employment opportunities. The Federal Trade Commission said that to maximize the benefits and limit the harms of big data, companies should consider how representative their data set is, whether their data model accounts for biases, how accurate their predictions based on big data are, and whether their reliance on big data raises concerns around fairness.
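As a rough illustration of the first of those questions, how representative a data set is, the sketch below compares each group's share of a data set with its share of a reference population. It is a minimal example under stated assumptions, not a method taken from the FTC or the White House report: the function name `representation_gaps`, the `group` field, and all of the figures are hypothetical.

```python
from collections import Counter


def representation_gaps(records, reference_shares, key="group"):
    """Return (share in data set - share in reference population) per group.

    A positive gap means the group is over-represented in the data set;
    a negative gap means it is under-represented.
    """
    counts = Counter(record[key] for record in records)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("records must not be empty")
    return {
        group: counts.get(group, 0) / total - share
        for group, share in reference_shares.items()
    }


# Hypothetical data: group B is 40% of the population but 20% of the records.
records = [{"group": "A"}] * 80 + [{"group": "B"}] * 20
reference = {"A": 0.6, "B": 0.4}

gaps = representation_gaps(records, reference)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

A large negative gap for a group suggests that predictions about that group rest on comparatively little data, which is one reason to ask about representativeness before trusting a model's output.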

“Ideally, data systems will contribute to removing inappropriate human bias where it has previously existed. We must pay ongoing and careful attention to ensure that the use of big data does not contribute to systematically disadvantaging certain groups,” the report said. “To avoid exacerbating biases by encoding them into technological systems, we need to develop a principle of ‘equal opportunity by design’—designing data systems that promote fairness and safeguard against discrimination from the first step of the engineering process and continuing throughout their lifespan.”

The White House report goes through several case studies and recommends that “public and private sectors continue to have collaborative conversations about how to achieve the most out of big data technologies while deliberately applying these tools to avoid—and when appropriate, address—discrimination.” The report offers 5 recommendations:

  1. Support research into mitigating algorithmic discrimination, building systems that support fairness and accountability, and developing strong data ethics frameworks.
  2. Encourage market participants to design the best algorithmic systems, including transparency and accountability mechanisms such as the ability for subjects to correct inaccurate data and appeal algorithmic-based decisions.
  3. Promote academic research and industry development of algorithmic auditing and external testing of big data systems to ensure that people are being treated fairly.
  4. Broaden participation in computer science and data science, including opportunities to improve basic fluencies and capabilities of all Americans.
  5. Consider the roles of the government and private sector in setting the rules of the road for how data is used.
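The algorithmic auditing mentioned in the third recommendation can take many forms. One simple check, sketched here, compares how often an automated system approves people from different groups. This is only an illustrative example: the report does not prescribe this metric, and the function names, the group labels, and the decision data are all assumptions made for the sketch.

```python
from collections import defaultdict


def selection_rates(decisions):
    """Map each group to its approval rate.

    `decisions` is an iterable of (group, approved) pairs,
    where `approved` is True or False.
    """
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        approved[group] += int(ok)
    return {group: approved[group] / totals[group] for group in totals}


def rate_ratio(rates):
    """Return the lowest approval rate divided by the highest."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was approved, so no group is favored
    return min(rates.values()) / highest


# Hypothetical audit log: 100 decisions for each of two groups.
decisions = (
    [("A", True)] * 60 + [("A", False)] * 40
    + [("B", True)] * 30 + [("B", False)] * 70
)

rates = selection_rates(decisions)
ratio = rate_ratio(rates)
print(rates, ratio)
```

A ratio well below 1 does not prove discrimination on its own, but it flags a disparity that a human reviewer or an external auditor would want to examine.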

Big Data Algorithms Can Learn Discrimination

Algorithms can learn bias from the discriminatory behavior that surrounds them, according to a blog post by The Ford Foundation.

“The origin of the prejudice is not necessarily embedded in the algorithm itself: Rather, it is in the models used to process massive amounts of available data and the adaptive nature of the algorithm. As an adaptive algorithm is used, it can learn societal biases it observes.”

To address this, policymakers must learn more about big data, and the algorithms that underpin “critical systems like public education and criminal justice” must be transparent. Regulation around the use of personal data also needs to be updated, according to the post.

Data as a Liability

A report by our sister site Windows IT Pro looks at the amount of responsibility companies take on when they store big data. It cites a Quartz report, which states that data companies will not own data; “they will just manage the flows of data between those that have data and those that need it.”

The conversation around discrimination and big data challenges the idea that big data will replace human decision-making. Big data algorithms can assess, sort, and analyze all sorts of data, but unless a human ensures that the data is being used properly, these systems risk reinforcing the very biases described above.

Source: TheWHIR