Organizations are increasingly focusing on diversity, equity and inclusion (DE&I) in their hiring practices and corporate culture, not just because it’s the right thing to do, but because failing to do so can hurt the business.
With software at the heart of every business, and organizations deriving more value and insight from the data that software collects, undiversified datasets and software can result in products and services that appeal only to a specific group of people and underserve others, or worse, harm them. The reality is that developers and data scientists encode their beliefs, convictions and biases – most often unconsciously – into their data and the software they design.
We have already seen real-life negative impacts when data science and software development are left unchecked without considering DE&I. For example, in Amazon’s early attempt to design a computer program to guide its hiring decisions, the company used resumes submitted over the previous decade as training data. Because most of these resumes came from men, the program learned that male applicants were preferable to women. Although Amazon spotted this trend early on and never used the program to assess candidates, the example shows how relying on biased data can reinforce inequalities.
Ultimately, these problems are not due to malevolent intent, but rather to being “blind” or ignorant to viewpoints and potential outcomes that groups of people experience differently. The best way to mitigate and avoid the problem is to have a team with diverse representation across professional backgrounds, genders, races, ethnicities and more. A diverse team can examine every step of building and managing data pipelines (collection, cleansing, etc.) and the software delivery process while considering all possible outcomes.
While we see developments and improvements in the growing diversity of roles in data science and software, there is still much to do. A 2020 study in AI suggests that while data science is a relatively new field that will take time to respond to diversity initiatives, some of the efforts that have increased diversity in other technology areas may prove successful here as well. Over the past few years, many diverse coding conferences and events have been developed, with rapidly growing attendance rates.
One of the first places to start is to commit to hiring diverse candidates and fostering an inclusive work culture that retains and ensures the continued development of diverse teams. Likewise, managers must ensure they create an inclusive and open culture that gives voice to underrepresented talent.
From there, the assurance of your organization’s data integrity and software delivery can begin to take shape.
How to ensure the integrity of your data and its results
As we know, the ramifications of biased data can impact society as a whole, so it’s important to have the right set of data and apply it correctly. Programmatically, software teams follow a life cycle: collecting data, cleaning and classifying it, then writing code that uses that data and testing to achieve results that meet business and customer needs. Having a diverse set of people working at each stage of the life cycle will help organizations avoid some of the previously mentioned pitfalls.
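The life cycle above can be sketched in a few lines. This is a minimal, illustrative Python sketch, not any particular team’s pipeline: the function names, the toy records, and the junk-record rule are all assumptions made for the example.

```python
# A minimal sketch of the life cycle: collect data, clean and classify it,
# then use it and check the result. All names and data here are illustrative.

def collect():
    # In practice this would pull from logs, surveys, databases, etc.
    return [
        {"text": "great service", "label": "positive"},
        {"text": "  Slow response  ", "label": "negative"},
        {"text": "", "label": "positive"},  # junk record to be cleaned out
    ]

def clean(records):
    # Normalize text and drop empty records rather than silently keeping them.
    return [
        {"text": r["text"].strip().lower(), "label": r["label"]}
        for r in records
        if r["text"].strip()
    ]

def classify_counts(records):
    # A stand-in for the "use the data" step: tally the labels.
    counts = {}
    for r in records:
        counts[r["label"]] = counts.get(r["label"], 0) + 1
    return counts

data = clean(collect())
counts = classify_counts(data)
```

The point of the sketch is that each stage is a separate, reviewable step: a diverse team can ask at `collect` whether the sources cover all populations, at `clean` whether the drop rules discard anyone’s signal, and at `classify_counts` whether the output serves everyone fairly.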
Spending time defining what a “good” set of data looks like – one that will provide fair results – is essential to ensuring the integrity of your data. Specifically, when looking at a set of data, teams need to ask themselves whether the result could be harmful or whether there is something to be learned from it. They should ask questions such as: what does good look like, where could there be biases, which populations could it harm? If the data doesn’t represent the population, you can expect skewed or unfair results from that dataset. Throughout the data collection process, be sure to capture all perspectives, avoid throwing away critical information, and collect data with a clear notion of what “good” results look like.
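One concrete way to ask “does the data represent the population?” is to compare each group’s share of the dataset against a known or estimated population share. The sketch below assumes you have such baseline proportions; the group names, counts, shares, and tolerance are all hypothetical, chosen to echo the resume example above.

```python
# A minimal representation check, assuming you know (or can estimate) the
# population proportions the dataset should reflect. Groups, counts, and the
# 10% tolerance below are illustrative assumptions, not a standard.

def representation_gaps(dataset_counts, population_shares, tolerance=0.10):
    """Return groups whose share of the dataset deviates from the
    population share by more than `tolerance` (absolute difference)."""
    total = sum(dataset_counts.values())
    gaps = {}
    for group, expected in population_shares.items():
        observed = dataset_counts.get(group, 0) / total
        if abs(observed - expected) > tolerance:
            gaps[group] = {"observed": round(observed, 3), "expected": expected}
    return gaps

# Hypothetical example: resumes by gender in a training set vs. the
# applicant population they are meant to represent.
dataset_counts = {"men": 900, "women": 100}
population_shares = {"men": 0.55, "women": 0.45}

gaps = representation_gaps(dataset_counts, population_shares)
# Both groups are flagged: women are badly underrepresented in the
# dataset (10% vs. 45%), men correspondingly overrepresented.
```

A check like this is cheap to run at the collection and cleansing stages, and flagging a gap early is exactly the kind of course correction the iterative process described below makes possible.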
The iterative nature of software development also gives teams the ability to continually course-correct when they detect problems – places where the data may be “tainted” by personal biases – and adjust accordingly.
Addressing unconscious bias at every stage of the product life cycle, from strategy to product definition, requirements, user experience, engineering and product marketing, will ensure that organizations deliver software that meets more needs. Likewise, diverse teams working on fairer, more inclusive datasets and software can drive innovation that creates competitive advantage, improves customer experience and service quality, and leads to better business outcomes.