Ethics in AI is a broad, deep, and tough subject. It’s also, arguably, one of the most important subjects for analysts, data scientists, and organizations overall to deliberately and determinedly tackle as a standard part of how they do work. On this episode, Renée Cummings, Professor of Practice in Data Science and Data Activist in Residence at the University of Virginia (among many other roles), joined us for a discussion of the topic. Her knowledge of the subject is as deep as her passion for it, and both border on the limitless, so it was an incredibly informative chat!
Photo by Philippe Oursel on Unsplash
[music]
0:00:06.6 Announcer: Welcome to the Analytics Power Hour. Analytics topics covered conversationally and sometimes with explicit language. Here are your hosts, Moe, Michael, and Tim.
0:00:22.5 Michael Helbling: Hi, everyone. It’s the Analytics Power Hour. This is episode 206. You may have heard the phrase, “the data is the data,” a kind of nod to the idea that in data we can find some sense of objectivity and truth. But practice in the field for any significant length of time and you learn that data and its uses are inextricably linked to the intent, purposes, and inclinations of the people involved. On the one hand, there are people who intentionally misuse data to achieve their own ends. And then there are also unforeseen problems or honest mistakes. But to the people impacted by either end of that, the result makes no difference. And sure, bad data might make for a less than desirable customer experience online, but as data and AI become ubiquitous in society, their use impacts marginalised populations all the more. Let me introduce my two co-hosts for a conversation on this topic. Hey, Moe. You’re the marketing analytics leader at Canva. Do you run into this problem?
0:01:29.2 Moe Kiss: I am so excited about this topic. I can’t even contain myself. [chuckle]
0:01:35.5 MH: I think we all are. And Tim Wilson, Senior Director of Analytics at Search Discovery. You saw our guest speak at a conference, and that’s how we found out about her. So I know you’re excited about this.
0:01:48.3 Tim Wilson: I’m excited and I’m sure that I am running into this problem and unaware of it in many of those cases. So I’m glad we’re gonna just solve it in the next 45 minutes. [laughter]
0:01:56.0 MH: Yeah, that’s funny. Yeah, I’m Michael Helbling. I don’t know about solving, but we did want to bring on a guest, someone I’ve alluded to, who identified this challenge much earlier and has shaped her career around it. Renée Cummings is an AI Ethicist, Data Activist, Criminologist, Criminal Psychologist, Therapeutic Jurisprudence Specialist, Urban Technologist, and Consultant. She is a Professor of Practice in Data Science and Data Activist in Residence at the University of Virginia. She is the founder and CEO of Urban AI and a leader in Women in AI Ethics. And today she is our guest. Welcome to the show, Renée.
0:02:33.8 Renée Cummings: Thank you so much. And thank you for having me. It’s certainly an honour.
0:02:37.5 MH: We feel like the honour is on our side. [chuckle] So we’re so excited to have you and to start this conversation. I think for our listeners and even for us, I’d love to understand more how in your career you saw a need to move in this direction and what started you on this path into AI ethics.
0:02:56.8 RC: And to have the coolest job title ever of data activist as well. [laughter]
0:03:00.0 MH: Thank you. [laughter]
0:03:02.7 RC: Thank you so much. It’s a question that I get all the time, and I say, as a criminologist and criminal psychologist, I worked in data for a very long time. And two things made me invite myself to this data science party. One was the risk assessment tools that were being used in the criminal justice system that were creating these zombie predictions and overestimating the risks of black and brown defendants. And the other was looking at the trauma that’s related to data. So it was the risk assessment tools and something that I call data trauma that really brought me into the space.
0:03:45.1 RC: And that’s what I did. I invited myself to this data science and AI party, and I really started to speak publicly on ways in which we need to rethink data science and AI within the context of justice and equity and diversity and inclusion and how we need to use a trauma-informed approach to data to build data that is sustainable, to build AI that is sustainable and resilient as well as AI that is responsible and trustworthy. And if it is that we really wanted to design or develop or deploy or adopt an AI that is mature, then it has to have the kind of life experience and the kind of intuition and the kind of understanding and the kind of sensitivity that is so prerequisite to what it means to be human.
0:04:40.4 TW: So I feel like… I feel very small, trying to track down improving conversion rates for clients. [laughter] Can you talk a little about what data trauma is? Trauma is a word that has pretty heavy connotations. How do you define that, or explain what data trauma is?
0:05:01.9 RC: Well, the easiest way to think about it and to look at it would be the history of data. And when we think of the history of data, particularly in the United States, we understand that there’ve been many times that data has created an extraordinary amount of trauma for particular communities. When we think of healthcare data and how we’ve gathered that data over the years and some of the missteps that have been made with those data sets. And when we think of criminal justice data and we think of our criminal justice system, which is a good system, but it’s not a perfect system.
0:05:35.3 RC: And between good and perfect, the criminal justice system has made an extraordinary number of missteps, again, in particular in black and brown communities. And when we speak of data in certain communities, it comes with an extraordinary amount of trauma. When we think about credit scores and the data attached to credit scores, and we think about something like generational equity and how data has been gathered in ways that undermine its credibility. When we think about digital redlining, and we think about the many ways we have used data to create trauma, intentionally or unintentionally, in certain communities. Then we understand that when the word data is spoken in certain communities, the connotation is not always a positive one. It can bring an extraordinary amount of stress for particular communities.
0:06:33.2 RC: So the history of the data set, the memory in much of our data sets, the fact that the data continues to remember the trauma and carries that history, sometimes of pain, forward. It means that we have got to think critically about data science and our data sets and how we use those data sets to create a future. Because what we continue to see is that we are using data-driven technologies in ways that are brilliant and transformative, ways in which we can really enrich the human experience. But then we’re also using data in ways that undermine the human experience, that undermine our ability to create generational equity, that undermine access and opportunities for particular groups. So when I speak about data trauma, I speak about that history that we are yet to reconcile in our data sets.
0:07:29.4 MK: So I have to give you a full disclosure that I have a master’s in criminology and I used to work in the youth justice system in my former career and now have transitioned into data. So actually being exposed to your work has been phenomenal ’cause I finally see…
0:07:41.3 RC: Oh, that’s fantastic. [chuckle]
0:07:42.0 MK: How my two worlds come together. But I think the thing that I’m trying to wrap my head around, because now I work in a very different data space, is that when we talk to machine learning engineers and data scientists, everyone thinks this problem starts at the model building. And I’m curious, because as we talk about data trauma, and I know you’ve made some comments before about the collection of data, which I thought was really fascinating, it sounds like this inequity in what we’re building starts much earlier.
0:08:20.6 RC: It definitely does. And recently I was reading an article in the Harvard Data Science Review, and it was very interesting. And I think you’re going to like this because you have a background in criminal justice. I’m not quoting verbatim here, but it speaks to the idea that if you interrogate the data long and hard enough, you will get a confession. And that [chuckle] really is what data science is about. If you interrogate that data, it will eventually confess to anything you want it to confess to. And that’s the challenge that we have. So it’s more than the model that we’re building. It’s the model of the society that we have built that really creates those inequities, that really undermines the ways in which we have been doing data science and the ways in which we are building AI and building data-driven technologies. And that is why, even with the best intentions, we have to understand the history and the politics attached to data, the fact that data has a culture of its own, and that there are some things in that culture we really need to drill deep into to understand the best ways in which we can use our data. So if we want to build an equitable society and we want to build a sustainable society, we’ve got to be very reflective and intentional when it comes to bringing an ethical approach to the ways in which we are doing big tech.
0:09:56.0 MK: So one thing that just… I feel like I’m flashing back to the start of my career in criminology. The one thing that I found really challenging was getting people to care, because obviously the criminal justice system is a tricky space, right? Because people think of people in that system as different to them. And I feel like it’s the same problem when we start to talk about the fairness of our models or the outcomes of what we’re building in AI and machine learning. How do we get people to care about this? Because a big part of it is that basically everyone in this process has to understand the ramifications of the work they’re doing in order to help fix it. But it seems like there are people that are very passionate and interested, and there are some people that are kind of like, “Oh, the model will figure it out.” And getting them bought in, I don’t know how. [chuckle]
0:10:48.9 RC: Well, I think to get people to care, you’ve got to reach people at a place where they want to care. And I think for me, one of the things that I always say is that we may be doing different things at the companies we’re at. So some people may be building products, some people may be creating processes, some people may be reinventing systems, some people may be using data science and AI for entertainment and fun. But when we think of it as a collective, what we are all doing with data is building a future. And that’s what we’re doing. When all of these things come together, they come together to create our future, what our future society is going to look like.
0:11:35.7 RC: So to get people to care, they have got to understand that they need to have a stake in the future. They need to have a stake in their future. They need to have a stake in the future of their generations. And they also need to understand that something like data, which you do not see, works together with machine learning and deep learning and unsupervised learning processes, and the many processes that are now being harnessed into service, to create an algorithm. And that algorithm has the ability to make a life, and sometimes death, decision about you in real time, and most times without you knowing. And when people think about that, then people immediately start to care. Even the data scientists, who have realised that they are building a future society and they have an extraordinary social responsibility when it comes to the work of data science.
0:12:44.6 TW: Well, so on that. When we say inequity, and even as you’re articulating it, it’s basically taking the inequities and the biases in the historical data; now we are in the present, looking towards the future. It feels like very quickly you wind up in a discussion around whether we agree that there is inequity. In the current political or cultural environment, globally and in the US certainly, do you wind up in a space where, when you say there is inequity, there are cultural segments of society that say, “I don’t agree with that. I don’t believe that there was a historical problem or a current problem or a future problem”? How quickly do the data challenges bleed over into those cultural and societal rifts or challenges? Those are groups that I think in many cases don’t really want to look at the data anyway. Does that make sense?
0:14:01.5 RC: I have never met someone who has not believed that there is inequity in society. I think people appreciate that. Whether or not they actually care depends on how much that inequity affects their own lifestyle. And that is where people decide if they care or if they don’t care. But we know inequity exists, because on any given day anyone can step outside, look to their left and look to their right, and see someone who has more, who is doing more, and maybe doing more with even less than they have. And people have those questions. The challenge is this: we have got to make data science care. We have got to make data scientists and technologists care. And we have got to bring that understanding. And I think more and more we are seeing that understanding. At the University of Virginia School of Data Science, where I am, we are committed to values of representation, voice, and visibility when it comes to the ways in which we are bringing an interdisciplinary approach to data science and, of course, new and emerging technology. Because we understand the value of that; we understand that only an interdisciplinary approach can really help us craft the future technology, or future-proof society, that is required.
0:15:25.8 RC: And data scientists more and more are realizing that they need to bring that ethical perspective. They need to understand social justice and racial justice, and healthcare equity if they are working in healthcare data. Because they are seeing it more and more. Privacy issues, they are seeing it. The lack, sometimes, of autonomy and agency when it comes to decision making. Things like accountability and transparency, explainability, interoperability. These things are critical to the ways in which we are doing data science. And if you are coming from a social sciences background and you are bringing a rights-based approach, or if you are coming from a business background and you are bringing a risk-based approach to data science, no matter the approach, rights or risk, the actual approach needs to be a combination of rights and risk. Even if you are thinking only bottom line and business model, you will appreciate that if you do not bring a justice-oriented, trauma-informed approach that deals with equity, diversity and inclusion, then you are bringing an extraordinary amount of risk to the work that you are doing.
0:16:40.3 RC: And that risk is going to meet you somewhere across that data science workflow or the life cycle of what you are doing. You are going to meet that risk. And when you meet that risk, that is where it becomes very unfortunate. Because it can be reputational damage for the organization. It could be financial damage for the organization. You can find yourself in the middle of a media exposé. You can be dealing with regulators, a criminal investigation. So I think we understand that we have got to do data science right and we have got to do data science good. I think we understand those things. So I think people understand that inequity is real. Whether or not they care when they are designing, developing, deploying and adopting is something else. And that is why we need to encourage the idea that ethics and innovation can live in the same house comfortably.
0:17:34.7 MK: Do you… I suppose, one of the things, and this is probably a bad character flaw, is like I try and go straight to solution mode of like, “Okay, so how do we solve this?” That’s the bit that I just instinctually jump to. There is a lot of discussion in the data science community about the need to have diversity in teams. And that seems to be one of the things that everyone is like, “Oh, we would just solve this if we have diverse teams and we make sure we hire diversely.” I don’t guess that there is a silver bullet to this, but is that one of the things that we need to be really thoughtful about or in your view, is that just, I do not know, like a cliché response to the problem.
0:18:19.8 RC: Well, no, diversity in teams is critical to the ways in which we design, and think about design, and develop and deploy. But more than that, we need diversity in perspectives, and we need diversity in the understanding that the only thing, to me, that can really stretch the imagination of data science and AI and technology is the interdisciplinary imagination. Because it is the interdisciplinary imagination that brings the diversity in perspective, the diversity in intelligences, the diversity in ideas, and the diversity in cultures that make up our team. So what we want is that diverse intellectual confrontation when we are designing, which brings those extraordinary and exciting perspectives. If we are to do technology in a way that is truly innovative, if we design using an ethical state of mind, if we design using empathy and that trauma-informed approach or a justice-oriented approach, then we are really bringing an interdisciplinary imagination, and what we deploy or adopt is going to be something that is truly innovative because it is something that is truly inclusive.
0:19:40.7 TW: And so, if we’ve got a team, and I think this is actually encouraging, if I’m hearing part of what you’re saying right, it’s that the data scientists and the analysts are actually empowered to raise their awareness and then raise their hand and bring up the issues and say, are we considering this, what are the ethical ramifications? They may get smacked down, like, “Don’t want to think about it, that’s too hard,” but they keep raising their hands, and eventually that awareness seems like it can spread through an organization. If specific ethical AI considerations are being weighed, does that mean hard questions emerge, and then the organization is trying to answer those hard questions, and ultimately it gets to surfacing what the real values of the organization are? I guess maybe this is another angle on the other question. It starts with one data scientist, but then next thing you know you’re sitting with executives and asking really hard questions.
0:20:57.3 RC: So I think what you want to do as an organization is really to build an ethical organizational culture that allows designers and technologists and data scientists to feel comfortable to discuss those uncomfortable questions. What we also need to think about when it comes to data science and it comes to technology and it comes to the ways in which we are designing, if we don’t ask the hard questions, then we’re not getting the best out of our data sets. The other thing that you want to understand, and let’s think about this example. You have a big tech company, a billion-dollar company that a few years ago faces an extraordinary crisis that brings an extraordinary amount of embarrassment.
0:21:50.8 RC: When they create an algorithm to screen resumes and CVs [chuckle] and that algorithm leaves women out. Now, you’re a big tech company, you are supposed to have the most brilliant minds as part of your organization. You are supposed to be the most innovative company in the world and that’s how you get caught? With an algorithm that forgot that women are also educated? It’s simple things like that that really impact the reputation of many of our big tech companies and many companies that are working in the tech space, because they did not ask simple questions. And sometimes those simple questions produce some very, very difficult lessons. So we need to bring that ethical approach, build an organizational culture that is ethical, create the room and space to have the discussion, and appreciate that, for us to build mature, responsible, trustworthy AI, trust has got to begin way before we start to build those models.
0:23:09.6 MK: But so what do you attribute that kind of mistake to? Because… So I’m not sure if you’ve ever read Invisible Women: Data Bias in a World Designed for Men by Caroline Criado-Perez. I’m obsessed with her. It’s about… “Oh my God, this is the best.” [laughter] So Renée is currently pulling it out; it’s the second book on the stack behind her. I am very obsessed with that book. [laughter] And she does talk a bit about the teams that are creating this technology. And the truth of the matter is, it was probably a bunch of white dudes sitting in a room who didn’t think about the fact that… And it’s not about attributing the mistake to someone or a team of people, but it’s like, what do you put in place to protect against that? I’m just thinking really practically: do we need a checklist? Do we need an ethics committee? What are the things that we can do as an organization to make sure that’s not us one day?
0:24:12.0 RC: Well, I think the simplest thing, and what we’re seeing many institutions and organizations and agencies doing, is requesting that our data scientists in particular have that ethical training. So ethical training and ethical certification for data scientists is becoming almost mandatory; although it’s not written down anywhere, it’s becoming a mandatory approach. So at UVA, I teach big data ethics, and I spend a lot of time with our students going through all of those scenarios and understanding that you’ve got to bring that ethical approach. Sometimes it’s the little thing that you did not think about that really undermines the big things that an organization would like to do.
0:24:53.8 RC: So if you understand diversity, if you understand equity and inclusion and trauma, then a justice-oriented approach, one that looks at due process and speaks to duty of care, provides a really sharp sense of the due diligence that is required when you are building any kind of model, when you are using data. At least for me, the thing that I always say is that you have got to be diligent. So due diligence is critical. You have got to deploy an approach that uplifts duty of care. You have got to pay attention to due process, because data science is about decision making.
0:25:38.5 RC: That’s what we’re doing with data. We are using data to make decisions and we want to make the best decisions with our data. And the best decisions are the most accurate decisions. And if you are looking for decision-making accuracy, it means that you have to bring eternal vigilance that speaks to an understanding of ethics when it comes to any kind of innovative work. And if you are doing that, then you are going to get the kinds of results that you want. So I always say, when it comes to the ethical imagination, when it comes to the interdisciplinary imagination, if we merge these two, then you are going to really stretch your own imagination as a data scientist, a technologist, or just anyone working in that space.
0:26:24.6 TW: What’s your sense of… That class has to blow the minds of your students, I assume. Do you have a sense of the data scientists being trained in US universities or global universities? What percentage of them are specifically including curriculum in this area? Is that… Is it taking off? Is it growing? Is it still pretty rare? That’s got to be a fascinating class that is arming those students with some tools that are pretty unique, which I wish they weren’t, I guess.
0:27:12.2 RC: Definitely. I can tell you, in my class, it’s about taking philosophy to practice. And it really is about bringing that eternal vigilance to the work that you’re doing, bringing the knowledge. And yes, data science ethics and AI ethics are now becoming critical to any curriculum at any higher education institution, because we’re understanding the social, the psychological, and the cultural impacts of data and the ways in which we need to think about it. I’m also, at UVA, the co-director of the Public Interest Technology University Network, PIT-UN, and many universities, I think over 54, including all the Ivy League as well, are in it, because public interest technology is so critical now to the tech space: building technology in the interest of the public.
0:28:05.6 RC: And if we are to do that, we have got to get that community buy-in. We have got to get communities involved in building the technologies. And if they can’t build it, at least give them the kind of understanding that is required. So as a data activist in residence, one of the tools that I am building at the moment with some very dynamic data scientists is a digital force index. And that digital force index produces a digital force score. And that digital force score is really a score that looks at the amount of technology, the amount of surveillance technology that is being deployed against you in any particular space. So we have collected an extraordinary amount of data around the technologies that are being used by law enforcement, by protective services, by national security, by private security agencies.
0:28:55.2 RC: And we have collated and compiled, and we have come up with a score. And this score tells you, if you are in a particular space, how much surveillance is being deployed: high, medium, or low. But it also is a way to get communities involved in the conversation around surveillance tech, and to get police agencies to understand that not every problem or challenge in a community requires a technological solution. Some things still require a community-driven approach, or an approach that is built on empathy and cultural sensitivity, appropriateness, and relativity, to really deal with communities. And it’s also a way for us to see the kinds of budgets that are being expended on things like surveillance tech, and for us to understand that techno-determinism and techno-solutionism, a technological solution for everything, is not always required.
0:29:50.3 RC: But the main thing for me is Futures Literacy and bringing communities into the conversation, because I am very passionate, as you may have realised, [laughter] about data and AI, right? And I always say, I was not there for the invention of the printing press, but I am here for AI and I am here for [chuckle] data-driven technologies. And I am going to have some kind of involvement, whether as myself or through the work that I do with my students, in helping communities and people understand that this is how we’re communicating now, through technology and in particular through algorithms, and that you need to be literate in what happens in the future. It’s just really important that we bring that kind of literacy to our society, because so much is happening in the data science and AI space. And data is something that you don’t often see. It’s something that you can’t touch, but you’re creating so much of it, and it’s being monetised. And one of the things you don’t want is for that to be weaponised against you, your family, your community or future generations.
0:31:01.8 MH: All right, I’ve got so many questions, but I’ll try to pare it down. [chuckle] So one thing I wanted to go back to: you mentioned, as we were talking about how to build teams, or diverse teams, also building diversity of perspective. There’s another phrase that I’ve heard used, and I’d love for you to juxtapose the two for our listeners a little bit and give a sense of how they contrast: the concept of diversity of perspective and the concept of diversity of thought. I’ve heard that used in corporate settings, like, “we want a diversity of thought,” versus diversity of perspective. I would love to hear you compare and contrast them a little bit for our listeners, if that’s okay?
0:31:43.2 RC: Sure. And I will say it’s pretty much the same thing. It’s about thinking. It’s about thought. It’s about perspective. It’s about ways of seeing. So it’s just another way to say the same thing. So diversity of thought is really diversity of perspective. What is new is diversity of intelligences, because we’ve got to understand that there are many intelligences that we need to use from emotional intelligences to neurodiversity, which are also critical because for me, technology is about collective intelligence, the best of the human mind and the best of the technological mind coming together and working together to build the best society that we can build. And I always say we may never get a perfect society, but we can certainly get a better one than we are in today. [chuckle]
0:32:31.9 MH: Yeah, I love that. No, I appreciate that. I felt like sometimes I’ve heard diversity of thought being used as a way of saying we don’t want to build diverse teams. It’s like, well, people will just think differently. That’s enough diversity.
0:32:44.7 RC: Oh, Yeah, yeah, yeah. I get what you mean.
0:32:47.1 MH: So, I don’t know if that’s how that was being used, but it like didn’t feel like…
0:32:49.3 TW: I’m pretty, pretty sure that was a middle-aged white dude who coined that. [laughter]
0:32:53.8 MH: I don’t know.
0:32:55.4 TW: Coined that one. He probably had a podcast too.
0:32:58.7 MH: Well, what middle-aged white guy doesn’t? [laughter] Anyways.
0:33:02.2 RC: So you felt they were using diversity of thought as another way of thinking of, “We don’t need diverse people. We just need diverse thought.”
0:33:10.0 MH: Yeah. And I feel like in a certain sense, that idea of perspective was maybe a deeper way to say it than the idea of thought. And that’s why I had the question. The second thing I would love to talk about a little bit more is perspective, specifically the perspective of trauma. I think that’s hard for people to really ingest sometimes. And I think it’s difficult because trauma affects different people in different ways. And certainly when we’re talking about these inequities across society and how they create trauma, not everyone experiences it.
0:33:51.5 MH: And so it’s sometimes difficult, I think, for people to understand what that means for an individual and then have the appropriate… I don’t know if I’m sure empathy is the right word, but at least a coherent way of saying, “Aha, I see it. I understand it. I’m connecting with you on that.” And I’m not going to be like, what’s wrong with you when you talk about trauma. It’s like, okay, that’s a thing and I have a place or a category for it. I know that’s not coming out very cleanly, but I’d love to sort of get… How do you coach people or your students on getting a better sense of that or deepening their sense of how to understand that?
0:34:36.6 RC: I think everyone has experienced trauma. If you have experienced grief, you’ve experienced trauma. If you’ve experienced illness or any kind of sickness, you’ve experienced trauma. If you’ve experienced identity theft, you’ve experienced trauma. If someone has entered your life and created a particular kind of discomfort or dis-ease, trust me. So people understand that trauma is real to our existence, and it often can be intergenerational, because many of the challenges that we have in families, the dysfunctions, are traumatic, and they are continuous psychological experiences, lived experiences of trauma. That’s what we have. When we think about how we relate with family members, or how we don’t relate with family members, the kinds of dysfunctions, the things that make us who we are, the things that make us proud, the things that make us unhappy, the childhood experiences we’ve had, the early childhood experiences we’ve had, the things that people may have said to us, the things that people may have not said to us.
0:35:46.0 RC: So I think people understand what trauma is and what psychological pain and psychological hurt and psychological harm feel like. But for certain communities, it’s just a little more than for the average person. And in many communities, what we are seeing is polyvictimization, because you have some communities that are the victims of, let’s say, something like digital redlining. So those communities have been denied access and opportunities and the kinds of financial support that are required to build generational wealth and to have generational equity. So when we are introducing these historical data sets, when we are collecting data in a very cavalier manner, or when we are not bringing an ethical approach to the kinds of decisions we are using data to make, what we are creating is a situation where we are creating more and more intergenerational trauma.
0:36:47.4 RC: And we’ve got to be very concerned about that. So we use something like historical data sets that are creating these zombie predictions, or overestimating the risk of black and brown men, and poor white men and women as well, because they are also in that category. So when we speak about vulnerable communities, there are many poor white people who are in those communities as well. And they feel the brunt of those decisions. So when a risk assessment tool works for someone, it may not be working for someone else. Or when a man of colour is arrested because of facial recognition technology, and the police come to his house, and they arrest him in front of his children and his wife, you can understand what a traumatic experience that is for the children. And you can understand that this man now has to go to a police station or a precinct, and he spends 24 hours in there. Now he has to call his job, or his wife has to call his job and say he can’t come to work because he’s been arrested. Now he has an arrest on his record. That creates an extraordinary amount of trauma because of the data.
0:38:00.9 MH: Mm-hmm, yeah, that’s good.
0:38:02.2 RC: Because of the data that said that this man is the one who perpetrated this crime when that man was not there. He was actually in the mall. And then what did we hear? Well, facial recognition has a challenge with black and brown men, women, or anyone who identifies as a woman. And we often hear these unintentional consequences of technology. But I always say to the people and the groups impacted, it never feels unintentional.
0:38:32.8 MK: Can you… You mentioned a term there that I just wanted to clarify. Digital redlining. What do you mean by that?
0:38:43.9 RC: Go ahead.
0:38:45.1 TW: I’ll give [laughter] my version and then you fix it.
0:38:48.8 MH: And then we’ll get the eloquent one. [laughter] Yeah.
0:38:50.2 TW: In the United States, Moe, throughout history, as people applied for mortgages, the banks and financial institutions would redline, which basically meant they would outline certain areas, certain communities, and say, “Let’s not give good rates or lend in these areas,” basically targeting specific minority groups and people of colour. So they were drawing a map and saying, if you live in this zip code or this area, we’re basically going to treat you like a second-class citizen. We’re not going to give you a loan, or we might not give you as good a rate. And I believe it was proven that they did this. And it is now, of course, illegal to do that. But that doesn’t mean the legacy of it isn’t still felt. Because the number one way to build generational wealth in the United States is through the ownership of real estate. So people who bought a home post Second World War and passed that home on, and bought another home or whatever, they’re now into a second and third generation of building equity in housing that other people were denied, even though those people might have had the exact same creditworthiness or credit score or whatever the case may be. So that’s my understanding of it. I probably missed some pretty salient parts, but hopefully that was…
0:40:09.4 RC: No, I think that was a very good explanation. I will not touch it. [laughter] That is exactly what it is. And yeah, we still feel the impact of that. Now we’re using algorithms to do that kind of redlining. And what is happening is that the algorithm makes that decision, and most times people don’t even know a decision was made, that they were denied. So the algorithm undermines agency and autonomy. And that’s a challenge. And if it is that we want to use data to really deliver decision-making accuracy, because that’s what we want, we want to make the best decisions for our companies, for public policy, for good governance, using data, and we’re using data sets that are compromised, data sets that are bringing that history or that memory of trauma, and we’re re-sharing those data sets and using them to make all these decisions, then all we’re doing with new technology is replicating old biases and old patterns and old stereotypes and an old way of thinking. And that’s what we don’t want to do.
0:41:14.1 MK: So what protections do we put in place? ‘Cause it sounds like that practice was people coming together and actively deciding we’re going to make decisions about location. Whereas in the situation we’re in now, people would just be like, “I’m going to use location data ’cause it’s a great variable. And I’m gonna throw it in my model.” And like out comes the answer of yes or no, if someone gets a home loan. So how do we build a protection into that? Is it like having to be really thoughtful about the variables we include, or is it even a step before that?
0:41:46.6 RC: It’s a step before that, and it’s called governance. How do we govern data? How do we govern new and emerging technologies? How do we govern algorithmic decision-making systems? It begins with a kind of governance. It begins with understanding the power of data, understanding the powerful infrastructure that data has created, understanding the kinds of power plays, understanding that algorithms are about much more than innovation and experimentation. They are also about exploitation and extraction, and they really can disempower as well.
0:42:27.2 RC: So it’s really about the governance structures. It’s about structure. It’s about legislation. And it’s really about bringing that kind of understanding. In the United States, most recently, we had the AI Bill of Rights released by the White House, which brings that kind of rights-based approach as well as a risk-based approach. It is not law. It’s not legislation. It’s just about building awareness, but it is a start. Of course, we know we have the EU’s GDPR. We also have the EU AI Act, which is being discussed at the moment. Again, these things can be very, very powerful, but they also need to be global, for us to understand that when we think about data, we’ve got to think about it in a global context, and we’ve got to understand the power of data. And I think that is something that we are yet to truly understand: how powerful data is. It can do brilliant things, transformative things, but it can also do very harmful things. And for us to get the best of it, we’ve got to have a very intimate understanding of how it works.
0:43:42.0 TW: So I love that you brought up the Blueprint for an AI Bill of Rights, ’cause that’s, I think, reasonably hot off the presses. I wanted to ask you about it, ’cause it’s got the five pillars: safe and effective systems; algorithmic discrimination protections; data privacy; notice and explanation; and human alternatives, consideration, and fallback. I did not rattle those off from memory; clearly, those were in my notes, ’cause I was curious. Those look great to me, and I don’t know if you’ve dug into those. Are those solid? Is that a good framework to work from? Are there blind spots in it that you already see? What is your take on…
0:44:28.2 RC: It’s an excellent framework. It’s a great place to begin, but it’s only a beginning. What we need is the backup legislation that actually has teeth, because we can’t always leave it to industry. So, excellent framework, fantastic start, but we’ve got to put more in there to ensure that we really get the kinds of results we want. I am grateful, I am thankful for it, and I’ve been part of many discussions, before its release, about how we build this framework. But we need so much more. And we need to stop legislating in a way that creates these loopholes. We need to understand that technology is critical to our future. Technology is critical to the ways in which we communicate, to the ways in which we educate, to the ways in which we build, and the ways in which we can advance society and enhance and enrich progress and development.
0:45:34.1 RC: We need technology, but we need to do technology right, and to do that, we’ve got to bring that ethical understanding. We’ve got to bring that interdisciplinary imagination. We have got to understand that due process and duty of care are prerequisites, and those prerequisites must always be in place, because we’ve got to bring ethical due diligence and ethical vigilance to the ways in which we are doing data and doing technology. We have got to, as they say, keep on our toes, because tech happens so quickly, and it moves so quickly, and the law is always behind technology. So, for us to do things right, we’ve really got to bring a collective approach and a collective understanding that we’re not all just building different things. Collectively, we are trying to build a better future.
0:46:34.1 MK: Okay, I’ve got a random question that I have to ask; you know when you hear something and it just pings in your mind? And I’m worried we’re going to run out of time and I won’t be able to ask it. In one of your lectures, you talked about the fact that there’s a belief that we can de-bias data sets. Can you explain a little bit more about that? ’Cause I feel like… I can see evidence of that belief, that we think we can.
0:47:01.4 RC: So the belief is that we can probably build an algorithm to help us de-bias our data sets. So we can bring a technological approach to a technological problem to help us come to a technological solution. And that is a major challenge. What we need to do is start with our thinking, that understanding of the history, bringing that ethical approach. And yes, we may be able to get technical support from an algorithm. But one of the things that we must never do, or feel comfortable doing, is believing that an algorithm is going to help us solve the problem with an algorithm, when an algorithm is the thing that got us to this problem, right? So one of the things I always say as well is that we never want to reach a place where we’ve got to deploy an algorithm, or design an algorithm, to teach us what it means to be human again. So let’s bring our collective intelligence to the space. [laughter]
0:48:00.8 MH: I like that [chuckle]
0:48:00.9 RC: Let’s bring our emotional intelligence to the space. Let us build big things, but let us build ethical things that ensure that what we lift up and what we celebrate always is our humanity and our differences. And always understand that voice, visibility, representation, technology is about people.
0:48:24.9 MH: I’m laughing because I just want to say something to Tim right now. It’s all about culture, Tim. It keeps coming back. So true. So true. All right. Well, we do have to start to wrap up. This has been an amazing conversation. Where can people go to learn more? I think that’s probably the last question I want to ask you. Besides enrolling at the University of Virginia and taking your class, how can a data scientist or a machine learning engineer or a data engineer or an analytics engineer take a next step? What’s a good resource? Obviously we have no time to discuss it exhaustively, but give us one good idea to take away and work on.
0:49:07.4 RC: Well, the best idea is the UVA School of Data Science, because that is where I am, and that is where you can always interact with me. You can check the website, click on the School of Data Science; my email is there, my number is there, and we can get involved in those kinds of conversations. I’m also on every social media platform. And of course, I speak internationally, so that’s another way to engage with me. And the best way would be to invite me to have a conversation with your data scientists and your technologists, and, of course, your C-suites, because if we are to get ethics right, we’ve got to get it right from Main Street to the C-suite and from the C-suite back to Main Street, because it’s all about us. Because what we’re building, as I keep saying…
0:49:51.0 MK: You’re gonna have like hundreds of data people [chuckle] reach out to you after this, including me being like, [laughter] “Please come chat to us.” [laughter]
0:49:57.7 RC: It’s conversation. Ethics needs to be a continuous conversation because the one thing we know about AI is that we really don’t know about AI. So we need to have that conversation flowing.
0:50:12.6 MH: All right. Yeah, we do have to wrap up. But thank you so much for the generosity of your time and experience. This has been such an amazing conversation. One thing we love to do is go around the horn and share something that might be of interest to our audience, called our last call. So Professor Cummings, you’re our guest. Do you have a last call you’d like to share?
0:50:33.5 RC: I think my last call would be to just keep an open mind and understand that this technology is a beautiful thing and for it to be beautiful, it has got to include all of us because that is the beauty of society and the beauty of humanity is the beauty that we bring to the space.
0:50:52.7 MH: I love it. That’s great. All right. Moe, what about you? What’s your last call?
0:50:56.0 MK: It’s really hard to follow that up. I kind of feel mine’s like…
[vocalization]
0:51:00.1 TW: Yeah. I was like, yes, Moe’s going next, so.
0:51:01.7 MK: Yeah. Yeah. So I’ve actually been spending a lot of time thinking about data strategy at the moment and the next three years and where the industry is going. And I stumbled across this HBR IdeaCast called To Build Strategy, Start with the Future. And funnily enough, it actually brought me back to my youth justice days because a lot of it is doing that thing of like, “Okay, well if you’ve got this goal that you want to get to in three years, how do you work back from that?” But it was a really nice podcast about strategy. If anyone else is doing anything on data strategy, I would love you to send me, tweet me some links or things that you’re reading that have been useful ’cause it’s a space I’m really interested in at the moment.
0:51:46.2 MH: Outstanding. Okay, Tim, what about you? What’s your last call?
0:51:49.3 TW: Well, I am not going to be able to shift as smoothly there, but I’ve got… The timing has just worked out. This is very tactical for the digital analysts out there who are dealing with this whole Google Analytics 4 transition that’s coming. There is an absolutely amazing resource that Jason Packer has put together on alternatives to Google Analytics. And I know that sounds like somebody just did a little checklist; it is a very robust two-part eBook that is available at gaalternatives.guide. He is selling it, and knowing what he invested in it, he is barely going to break even. He is very passionate about this, and he is a very humble individual who has put a lot of work into something that is both funny and crazy informative. So my plea would be that if you are anywhere dealing with the, “Wow, Google Analytics 4 is causing me heartburn, what else is out there?” question, it’s a super comprehensive, really, really well done resource. The second thing is I would beg you to not just get a copy and then share it around to everyone, ’cause he is not a massive consultancy where finding a back door to get the PDF is cool. That’s not cool. So check it out. I have read it, I have purchased it, and it’s a great document. So that’s an unsolicited plug for that resource. What about you, Michael? Are you going to wind up with something that’s…
0:53:32.7 MH: Well, it’s tangentially related. I recently discovered that our good friend of the podcast, Sergio Maldonado, CEO over at Privacy Cloud, has a podcast. And I was like, well, that’s pretty cool. And I listened to a couple of episodes and it’s awesome. So I would recommend you check it out if you…
0:53:50.3 TW: Michael, the first rule of podcasting is you don’t promote other people’s podcasts.
0:53:54.0 MH: Whatever. We share everybody’s podcasts. That’s how it all works. Anyways, it’s called Masters of Privacy and it’s very interesting. It covers a lot of topics. He’s from Spain, so a lot of it comes from the European perspective, but he’s interviewed people we’ve talked to on the show, like Corey Underwood and some others as well. So I recommend it. Sergio’s voice is extremely soothing and wonderful. [laughter] He’s an awesome person and he’s fun to listen to. So there you go.
0:54:29.2 MH: All right. Well, no show would be complete without a mention of you, the audience. We would love to hear from you. What are your thoughts? How are you engaging with this in your companies? The best ways to reach us are the Measure Slack group, Twitter, or our LinkedIn page. We would love to hear your thoughts and comments. And of course, no show would even be possible without the extensive efforts of our awesome producer, Josh Crowhurst, so we thank you very much for that. And once again, Professor Cummings, I just want to say thank you again for taking the time. One of the quotes I took down from this conversation was, “Ethics and innovation can live in the same house comfortably.” I just love that, and that’s the nugget I’m taking away from this conversation, amongst the many notes I’ve taken. Thank you so much.
0:55:22.3 RC: And thank you to all of you. And of course, to your listeners, thank you for the privilege of your time. Thank you so much.
0:55:27.5 MH: Yeah. And we’ll include on the show page some links to your resources and places where people can reach out and contact you, so if you’d like to find out more, get in touch with Professor Cummings about her work, or talk with her more about this, you’ll find those links on our website.
0:55:44.1 MH: All right. Well, I know that this is a challenging topic, but one that is vital and important that our audience cares about quite a lot. And I think I speak for both of my co-hosts, Moe and Tim, when I say, no matter where you’re at in this journey, remember, keep analysing.
0:56:04.8 Announcer: Thanks for listening. Let’s keep the conversation going with your comments, suggestions, and questions on Twitter at @AnalyticsHour, on the web at analyticshour.io, our LinkedIn group, and the Measure Chat Slack group. Music for the podcast by Josh Crowhurst.
0:56:22.8 Charles Barkley: So smart guys want to fit in. So they made up a term called Analytics. Analytics don’t work.
0:56:29.5 Tom Hammerschmidt: Analytics. Oh, my God. What the fuck does that even mean?
0:56:38.4 TW: Nice. I’m not doing it.
0:56:38.5 MH: You’re not doing it. He’s not… [laughter]
0:56:41.9 TW: Absolutely not. I’m not going to do it.
0:56:42.0 MK: Tim, you’d better do it. So, Renée, he does this weird thing at the end where he like sings this jingle and he does it at the end of every episode. But now he’s like so enamoured with you that he’s freaking out about it.
0:56:55.6 RC: And now I’m super excited to hear it.
0:57:00.1 MK: Pressure, pressure, pressure.
0:57:01.6 TW: Oh, good Lord. Rock flag and ethical AI.
0:57:06.5 MH: There you go.
[laughter]