#223: Explainability in AI with Dr. Janet Bastiman

To trust something, you need to understand it. And, to understand something, someone often has to explain it. When it comes to AI, explainability can be a real challenge (definitionally, a “black box” is unexplainable)! With AI getting new levels of press and prominence thanks to the explosion of generative AI platforms, the need for explainability continues to grow. But, it’s just as important in more conventional situations. Dr. Janet Bastiman, the Chief Data Scientist at Napier, joined Moe and Tim to, well, explain the topic!

Articles, Books, and Netflix Originals Mentioned in the Show

Photo created using the AI Image Generator in Canva

Episode Transcript

0:00:06.3 Announcer: Welcome to the Analytics Power Hour, analytics topics covered conversationally and sometimes with explicit language. Here are your hosts, Moe, Michael and Tim.

0:00:22.8 Tim Wilson: Hi everyone. Welcome to the Analytics Power Hour. This is episode number 223. I’m Tim Wilson, filling in for Michael Helbling, who unfortunately was unable to make it for this show. He’s okay. Everything’s okay, don’t worry, but we’re not going to explain further. And explainability is actually the topic of today’s show. I’m joined as always by Moe Kiss, marketing data lead at Canva. Moe, how are you doing? And more importantly, how would you explain the quality of that little segue I just did?

0:00:54.5 Moe Kiss: Oh, I’m gonna give it a solid, like eight out of 10, but it would help if you explained a bit further.

0:01:00.9 TW: Oh, great. So explainability, not of all things but specifically of AI things is today’s topic. AI is often seen as a black box. Even data scientists sometimes seem like they start their explanations of neural networks and then get to a point where they wave their hands, mumble something about back propagation, and then quietly back out of the room. And that could be problematic when it comes to building trust and confidence in AI powered processes. So to dig in further on the subject, we, as we often do, needed an expert. Dr. Janet Bastiman is the chief data scientist at Napier AI, which is a company that develops AI driven anti-money laundering or AML solutions to help organizations curb financial crime. Dr. Bastiman has a master’s degree in biochemistry from the University of Oxford. And in addition to her role at Napier AI, she was recently appointed to the synthetic data expert group for the UK’s Financial Conduct Authority and is the chair of the data science and AI section of the Royal Statistical Society. Like we said, we needed an expert and we found one. Dr. Bastiman, welcome to the show.

0:02:10.1 Dr. Janet Bastiman: Thank you very much. Pleasure to be here.

0:02:12.5 TW: All right. So I know we are gonna get into explainability of AI, but there is no way that everyone who heard that didn’t perk up when they heard this anti-money laundering. What is that? So to scratch my curiosity itch and I’m sure those of others, could you give a little background just on kind of what that, what that space is and how it works?

0:02:38.3 DB: Absolutely. So criminals do what they do because they want money. And if you can follow the money you can usually identify the criminals and find the crime. So one of the things that they do is they try to clean or launder the money so you can’t trace it back as to the crime that it was attached to or who it’s gone to. And they do this by obfuscating and hiding that money through lots of different accounts. And there’s all sorts of typologies where there’s like money muling and splintering of funds so that it’s really difficult for the authorities to work out where the money’s come from and who it’s going to. Now worldwide, all of the financial institutions have regulations where they’ve got to try and detect and stop what are criminal activities. They need solutions where they can try and identify all of this money laundering that’s going on. And that’s where our software comes in, because we help them track down all of these different types of money laundering. And it can be very difficult because there’s lots and lots of different ways they can do it. The criminals are very astute and they try everything to stop being detected.

0:03:49.1 MK: Just because this is an area that I find really interesting. And then I promise we’ll get to the explainability portion. Is this an industry that has really adopted AI quite well? Because like you said, you’re looking for patterns in huge volumes of data, very disparate, different countries, different regulations, banking. Like, is it an industry that’s adopting, I guess, AI quickly, or is it a bit slower to move?

0:04:14.7 DB: Yes and no. And it depends on the regulations of the regions you’re in. So there are some regions worldwide that are very pro AI and are happy for its use. There are others that are taking a more cautious approach. There’s still very much a need for traditional rules-based analysis because, and we’ll come onto that a bit later, that has the explainability side a lot more easily than the AI. So there’s definitely combinations. What we are seeing is a great interest in it because, you know, the criminals know the rules that are out there, they know what people are trying to detect and it’s very straightforward for them to try and find ways around those rules. So you need that second level of defense.

0:04:57.8 TW: It sounds a lot like kind of the bot detection in the digital marketing space. I did see there’s this fascinating documentary on Netflix about money laundering called Ozark, which I learned a lot about. No, it’s okay. It’s not a documentary. Did you watch Ozark? Is Ozark something that floats around as like, that is the most ridiculous, that they would never get away with it? It’s always a little, it’s not even clear how, that’s my other random, I’m trying to… Michael I’m sure would be chiming in with the pop reference, but…

0:05:32.7 DB: Well the trouble is that you don’t want to put something out that shows everyone how to do it exactly. So you’ve got to hint at the ways in which these things are done and how they’re detected or not detected. But there’s, I mean there’s some great real life stories out there. I mean, one of the books I read in the past year is called Very Bad People and it talks about illegal logging and blood diamonds and how the money flows and how you can, from a detecting criminality point of view, really look at what’s going on. So there’s plenty of real things out there. I mean we’ve had some great documentaries here in the UK, The Gold on the BBC about a big gold heist that happened in the early ’80s and how back then there wasn’t the legislation and the checks, and how the criminals potentially might have turned that gold into cash that they could then spend. So if you are interested, there’s plenty of things out there that are kind of in the half documentary, half fiction area that should give you a good sense of what’s going on.

0:06:30.0 TW: For anyone who’s wondering, I know that Ozark is completely fiction. I was making a… No one laughed. So that’s pretty much how I go through life.

0:06:38.4 MK: I’m smirking.

0:06:39.2 TW: Sorry, go ahead Moe.

0:06:39.4 MK: I’m smirking. Does that count?

[laughter]

0:06:42.8 MK: I’m not sure that’s laughing. So Janet, I’d love to chat a little bit like, you’ve written some posts previously on explainability for AI in anti-money laundering and it seems like you do have kind of an interest in this topic. Did you become interested in explainability because of the need for that in anti-money laundering or was it kind of across the board that you see that it’s needed? Like did the chicken or the egg come first, essentially?

0:07:10.7 DB: For me it was, it was needed across the board. It’s something that from my PhD onwards, it was that understanding of what’s going on. And that’s always what’s driven me. Whether it was sort of as a kid tinkering with the computers in the ’80s through to, you know, the work I do now, I need to understand at a low level what’s happening, why it’s happening and is it correct? So it’s always been a thing for me. And I think there was that frustration when AI exploded into industry maybe a decade or so ago now, when it really started becoming mainstream, that it was just an answer that was on a plate and everyone just trusted and believed it. And I found that very uncomfortable. And you have that combination of, well, we’ve tested it so it’s fine, but just because something’s tested and passed these tests doesn’t necessarily mean that it won’t go wrong. I mean, you know, every time we look at anything in the financial space, you always get that previous performance is not a good indicator of what necessarily might happen in the future. And we have to take the same approach with technology and specifically with AI. And there’s been an awful lot of bad things that have happened completely unintentionally because people tested things really well, but they didn’t necessarily test correctly or understand the results of the tests in order to prevent errors once things went live.

0:08:32.7 MK: So for you though, it sounds like it’s about, I guess, unravelling the black box or like lifting the lid, but I assume that it has to happen at many different levels, right? Because like even within my team, the level of discussion we have about a particular model or a solution is very, very different to say, how we might then explain it to like our CMO or our CEO. And it’s like that in and of itself then becomes, I guess, like a gradient of explainability, because you need different options for different people.

0:09:06.2 DB: That’s exactly right. And I think this has been one of the problems when historically explainability in AI has been discussed. It started off as being almost a tool just for the data scientist to check their own homework and make sure they’re doing things correctly. And I still see definitions of XAI as a tool for data scientists rather than something for the end users of a solution. And you’re absolutely right, you need those different levels, because the explanation I would want and my team would want is very different from, let’s say, a compliance officer in a bank who’s looking at a result and wants to understand, you know, is this something that I need to escalate or is it something that I can safely discount? And you’ve got everything in between. So it’s really important that not only do you have those explanations, but they’re at the right level for the person that they’re targeted to.

0:09:58.4 TW: Does that, I think of a challenge of where when you talk to the, I guess the more casual, the, I guess casual user is not the right word, the ones that are not gonna get into the super depth of the explanation. It seems like there’s a challenge that when you, again, not the right word, but when you dumb it down, when you simplify it to try to explain here’s what it’s doing. And I mean neural networks are an easy one, everyone’s tried to explain neural networks, you know, to the casual user, to the marketer. And is there a risk that when you try to simplify it to the point where it’s accurate but everything’s explained simply, then the person thinks they have a depth of understanding that they don’t actually have, and then you’ve kind of like, you’re kind of letting them run with scissors and over-interpret what they can and can’t do with the AI?

0:11:02.7 DB: That’s actually a really good point. There is that risk. And I think there, the boundaries of what you’re explaining need to be clearly defined. For an end user, you’re going to put in an explanation that’s very straightforward as to why a result is the way it is, not all of the extra detail that might indicate the different paths it could have gone down and those probabilities. And there’s an awful lot of testing and feedback that’s involved to get that level right. Whereas if somebody’s trained to be able to analyze data correctly, then they can accept a lot more information. The nature of how you explain something, though, is not necessarily dumbing down. I think this is one of the problems that we have with how people think about explanations: that as soon as you take out the maths and the complexity, it’s meaningless. And that’s not true. Depending on the techniques you use, you can give very clear, clean explanations at the right level that someone can understand and then ask questions about, either directly of the data or to a data scientist or to someone else to help further their knowledge, rather than just going off and thinking they know everything. So it’s all about that conversation and making sure that you’ve correctly communicated what you need to communicate to those people.

0:12:25.2 MK: So I’m sure in your situation like, well, generally skills like this are kind of honed over time, right? Like you have lots of trial and error and you’re like, oh that explanation didn’t work with this person, this time it stuck. How do you work with like junior people to help develop this skill and get them to understand it? Because I imagine like you kind of can’t just say like, oh wait, it’ll come with time. Like they’re often the ones who are actually doing the work and for me it’s really important that the people who do the work are also able to explain it.

0:12:53.5 DB: And for me, I actually find that some of the junior people are usually the best, because they’re the ones that have come to this more recently. So they’re coming to it sometimes from a non-mathematical background, and as part of that they’ve had to hone their understanding of what’s happening and why it’s happening, and potentially speak to a friend group or to family that, you know, aren’t at the same mathematical level and explain it to them in a way that they can really get across what they’re doing and why they’re doing it. So they can then talk to end users on a much more lateral level. Whereas, you know, someone that’s only ever had to talk and explain things in a, you know, a highly technical peer group, they can struggle with that ’cause they’ve just not had that practice and experience.

0:13:42.4 DB: And you’re right, it is practice and experience but the way in which you get that is very, very different. And it’s not necessarily you need to wait until you’re senior to do it. It’s one of the things that I check for in the interview process as much as I can is I get somebody that’s outside of my team, maybe someone from product or marketing or you know, somebody non-technical and I get an interviewee to bring in a paper or an article about AI they want to talk about and talk to this non-technical person about it to make sure that they can really communicate in a way that’s useful. ‘Cause the last thing we want to do is bring someone in that we can’t put in front of clients or can’t explain things to a broader audience because then they’ll struggle to build the solutions in the way that we need them.

0:14:27.8 TW: That’s amazing.

0:14:31.5 MK: I’m just sitting here with my mouth wide open because I’ve like, I think I’ve tried a couple of things. So like I often will be like, okay, can you explain this concept? Like, explain this concept as if you were explaining it to a group of five-year-olds, right? And that’s like one thing that I’ve tried. Another one we do is like explain it as if you were like, talk us through the results, but I want you to pretend that we’re the product, the relevant product managers, but this sounds way better. I’m like, how do I change our interview process?

0:15:02.8 DB: Nothing beats having someone in the room that’s actually that person ’cause it’s really hard for you to forget the things you know.

0:15:08.4 MK: Totally.

0:15:08.5 DB: So you will forgive someone for overcomplicating an explanation because you know what they mean. Whereas having someone who doesn’t know what they mean, who will then ask the questions and seeing how that interaction occurs, you know, do they get irritated? Do they get frustrated that the other person’s not getting them, really gets you to the bottom of, you know, whether they can explain things at different levels. So yeah, it’s fun.

0:15:34.3 TW: This is, I mean, as you were just giving that whole explanation, it’s the curse of knowledge being a real challenge, like talking to your peers. And I’m now like running through some of my go-to people who have explained ideas and concepts in the space to me. And it is like, you kind of nailed it, it’s the patience and the, I’m gonna… In their mind, I mean, I don’t know whether they’re conscious of it or not, they’re repeating themselves. They’re basically coming at what’s happening from a different angle. And I have a very clear memory that, well, I’ll take to my grave. There’s a gentleman, Matt Gershoff, a lot of people who listen to this know him, where he actually was sitting with cocktail straws at a bar and explaining multi-armed bandits to me. And I’m like, that was probably his third or fourth attempt at trying to get me to understand how this thing actually worked and how it didn’t. And I kept saying things that were wrong and his response was, well no, that’s not it, but let me talk about it this way. So it was the give and take, as opposed to I figured out how to explain it, here’s the explanation, now let me go forth, that doesn’t work.

0:16:45.9 MK: But Tim, I’m gonna catch you there, right? Because the difference is you were also an engaged audience who wanted to understand it. And I think sometimes the problem with the space is that the audience that we actually do need to understand it, for example, our legal team, our managers or executive team, they don’t necessarily have a desire to understand it because for whatever reason they’re not excited or motivated to get it. Janet, what are your thoughts on that is like I mean, do we just keep trying or?

0:17:15.3 DB: I think we have to keep trying. It’s so pervasive in our everyday lives that, and I’ve said in previous conversations, it’s something that school children now need to be aware of. They need to understand and they need to start building those philosophical and discernment skills for the sorts of things that they’re gonna be surrounded with as they become adults. Now, you know, just touching on the whole explosion of generative AI: we can’t trust what we read. We can’t trust video, we can’t trust photographs anymore. It’s just gone. So unless you are physically there, you have to then make a discernment of, do I trust this source? And that’s something we need to teach. But further than that, the people that are currently in business, in governments, making decisions and legislation, they need to understand this and they need to understand it at a better level than, it’s scary, it’s gonna get rid of all jobs, it’s gonna take over the world. They need to know what it can do and what it can’t do, rather than all the hype. So we need to keep trying and we need to get people talking. And I’m really pleased to see that worldwide, governments seem to be engaging and getting experts on board and talking about these things so they can really get to the right levels. But businesses need to do the same as well.

0:18:30.0 TW: So you said generative AI, and I think, I mean, the large language models specifically within that… And I went and I read a few papers. And again, to Moe’s point, trying to understand, what is your thinking? That’s like AI hitting the masses. And I’ll hear people say, “Oh, but if you… ChatGPT will give you wrong information.” And on the one hand, in some circles, it seems like it’s cropping up: well, you need to kind of understand what an LLM is doing in order to understand why. What are your thoughts on… I mean, as the AI stuff becomes massive, consumer-facing, upfront, people directly interacting with it, and they just kind of want magic. What is your thinking on the level of what they need to understand and how to help them understand it?

0:19:29.0 DB: I mean, that’s a great question. I mean, everyone wants magic, and it was Asimov who said, “Any sufficiently advanced technology is indistinguishable from magic.” And I mean, that is where we’re heading. If you think back 20, 30 years ago, something being able to write novels, compose music, create artworks automatically of the quality we’re seeing would’ve just been unthinkable. But we are there now. And if you look at the incremental changes, and this is incremental, and I think that’s the thing to remember. We’ve all got used to predictive text on our phones, and when we’ve been composing emails, the next things have been suggested. So that concept of having help to write things has been there a while. This is just taking it a big step further. So now we can say, rather than, let me finish this sentence, we want paragraphs, we want pages, we want poems and songs. So it is an increment, but going back to what you said about the getting it wrong, that’s the thing that we need to remember.

0:20:35.6 DB: Predictive text quite often doesn’t suggest the right word for you. Email auto-complete doesn’t always suggest the correct end of the sentence. It’s the same thing with the large language models. They will suggest things that feel right, but won’t necessarily be right. So we need to take them as suggestions that we then edit. And as long as we keep it in that mind, and anything where it’s giving us facts and figures, we check our sources. We look at things online. Most people don’t just assume that the first bit of information they find online is the truth. You’ll check where it’s come from. Has somebody made that up? Is there a reference behind it? And if that’s the case, then yes, you can take what it said and use it, and you might want to add bits to it.

0:21:24.4 DB: And it’s gonna be huge as a productivity tool, but we need to remember that it’s absorbing things that have been fed to it, whether that’s other people’s suggestions, whether that’s web pages, whether that’s open source literature, things like that, just like all these other technologies that we’re really familiar with. So as long as we look at it with a, “Is this actually what I want to say?” And, “Have I checked these facts?” Then it will be an amazing time saver for all of us. And I think that’s how we need to consider it. It’s not a replacement for work, it’s something that will help us work smarter.

0:22:02.7 TW: You have a gift. You have a gift for this. This is like, literally, I don’t know how many… I’m gonna go use that. I need to listen to that and just play it for people. Like, “Oh, it’s just like auto complete.” Sorry, go ahead Moe.

0:22:14.6 MK: I was just gonna say like, this feels harder though, and I don’t know why it feels harder. Is it that what’s sitting behind it is more complex? Obviously, technological change has happened very rapidly, particularly over the course of my lifetime. I don’t know why this feels quite… I don’t know if polarizing is the right word, but it’s… There seems to be fear mongering, and when you say it, it sounds so rational and explainable. [laughter] And so I’m just like, “Am I missing something? Do we not have enough people that care about having these conversations?”

0:22:53.8 DB: That might be true. I think, historically, there’s always been a fear of change as you get older. You’ve mentioned a huge amount of change over your own lifetime, but as you do get older, you start looking at new things as technology that is scary and doesn’t make sense. Gosh, I’m gonna drop in a lot more pop quotes here. I can’t remember if it’s Douglas Adams or Terry Pratchett. I think it was Douglas Adams that said sort of, “Anything that was around when you were born is just the way of the world. And then anything that’s invented before you’re 30 is exciting and new. And then anything that’s invented after the age of 30 is this scary stuff that shouldn’t exist.” [laughter] I probably completely misquoted, but it’s along those lines. And I think that’s fair, because you look at things and you think, “Well, I’ve established myself and this is how I’ve been trained and this is how I did my degree, and this is how I’ve done my work, and now I’ve gotta learn this new thing.”

0:23:49.0 DB: Whereas all of the people in their teens and 20s are just absorbing this, and you see all the posts about how they can use it and how it’s making them better developers or better data scientists or better creatives. And I think we just need to consider the same thing. I mean, you go back to the industrial revolution and people throwing their shoes into the looms. It’s that change of technology and it is scary. And I think the difference now is we are in an age where that fear can be communicated quickly, and it can be communicated in a way that’s deliberately designed to perpetuate the fear. Whereas again, pre-internet, you wouldn’t have that. You’d have the newspaper articles, you might have the encyclopedia at the end of the year with the footnote in it, but…

0:24:32.2 MK: Yeah.

0:24:32.3 DB: You wouldn’t have this immediate spread of information and disinformation and there’s no blocker to that. So anybody with a thought can post something out. And of course the media are clamoring for our attention. So the headlines, very clickbaity. They’ve all got the Terminator imagery and the AI’s taking over the world, and it’s missing that rational element of, “This is what it is. This is what it can do. This is what it can’t do, use it. It’s out there.” And this is how you need to change how you work in order to absorb these new technologies. Much like we’ve done with computers rather than typewriters and mobile phones rather than pay phones. It’s just a change. And we will adapt because we are very good at that as humans, but we just need to make sure that we’re calm and rational about how we approach it.

[music]

0:25:26.2 Announcer: All right, it’s time to step away from the show for a quick word about Piwik PRO. Tim, tell us about it.

0:25:31.9 TW: Piwik PRO is a digital analytics platform that is easy to implement, easy to use, and reminiscent of Google’s universal analytics in a lot of ways.

0:25:39.1 TW: And I love that it’s got basic data views for less technical users and more advanced features built in like segmentation, custom reporting, and calculated metrics for power users.

0:25:49.3 TW: And Piwik PRO has both a free plan and a paid enterprise plan that adds scale and some additional features.

0:25:54.9 TW: That’s right. So head over to piwik.pro to check them out for yourself. Get started with their free plan. That’s piwik.pro, and now back to the show.

[music]

0:26:06.2 TW: One of the articles, and we’ll include the link to the ones you’ve written on explainability in AI, kind of talked about sort of two different reasons for explainability. ‘Cause we could go down the LLM and the generative AI world, but there’s a lot of, call it more traditional or conventional AI, pre all of that hype. And it talked about there being sort of two different reasons for explainability, system usability and regulatory compliance. Can you kind of talk a little bit about the rationale, the distinction behind kind of what those are and where they fit? Kind of the why of explainability?

0:26:47.1 DB: Absolutely. This goes back to something that Moe said a bit earlier about the different people that need an explanation. So when you’re looking at regulatory compliance, generally what the regulators want to see is that you have solid systems that are detecting what you’re expecting them to detect, that they’re doing what you want them to be doing, and that there are no errors. So the explainability side there is very much around the, is it doing everything correctly, and being able to prove and test that? When it comes to system usability, these are the people that are making decisions on the data. So they, I’m not saying they don’t care, but their primary focus is not on how it’s been tested or what algorithms are being used or what test data is being used. Their focus is on, “Why have I got this result? Is this something that I could be confident in or is it something that I need to investigate further? And if so, on what?” So it’s a completely different audience that you are trying to give that explainable information to. So you need to give them different information.

0:27:57.5 MK: So on this, I guess, regulatory compliance piece, ’cause I feel that is cropping up a lot in our world as well, even within marketing data, to do with privacy regulations and that whole space. I guess I’m just trying to understand what the trade-off is between the confidence in the output versus the need to understand how you got to the output, particularly in that space, and whether there are, I guess, times that you’re willing to have less understandability but it potentially means that you have a better output. I’m just… It’s in my head and I can’t reconcile it.

0:28:40.0 DB: There’s a couple of things on that, but first I just wanna address that fallacy that explainability means a lower-quality output.

0:28:46.1 MK: Mmh.

0:28:47.3 DB: This traces back to a call for proposals back in, I think it was the early 2000s. It was a DARPA study, and they showed a graph showing that as explainability increases, accuracy decreases. And this was a graph that’s unattributed in the original paper. And it drives me up the wall ’cause I hate unattributed data. [laughter] But basically, it was about at the same time as there was a switch from those natively explainable, sort of if-this-then-that algorithmic approaches to neural networks, and no attempt to do any explanation had started at that point. And at the time, the neural networks were getting better results than the standard algorithms. So it was just a graph showing that. And as a result, there’s been this known piece of information that if you want accuracy, you can’t have explanations, and it’s just quoted, and everything leads back to this DARPA study when you follow the traces. But there’s no evidence for it. And there’s been… In one of the papers that you quoted that I did, there’s some evidence from a lot of labs that you can add interpretable layers and outputs even in complex neural networks for no loss of accuracy.
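To see that point in practice, here is a minimal sketch in Python, assuming scikit-learn and a made-up dataset and model rather than anything Napier-specific: a post-hoc explanation layer (permutation importance) is attached to an already-trained model, and because nothing is retrained, the model’s accuracy is untouched.

```python
# Minimal sketch only: illustrative data and model, not Napier's approach.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
accuracy_before = model.score(X_test, y_test)

# Permutation importance only queries the fitted model; nothing is retrained,
# so adding this explanation step cannot change the model's predictions.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for feature_index, importance in enumerate(result.importances_mean):
    print(f"feature {feature_index}: mean importance {importance:.3f}")

assert model.score(X_test, y_test) == accuracy_before  # accuracy unchanged
```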

0:29:57.7 MK: Well, I’m really glad you did call me on that [chuckle] ’cause that is a really good clarification.

0:30:02.6 DB: Yeah, it’s one of those things that… It’s like a lot of these stats. There’s the one that I love that says 90% of all information on the internet was created in the last two years. That was first said by IBM in ’99 maybe. And even they just said, we think that. But I just see this on slides all the time and it’s never attributed and it’s not even true anymore. So yeah.

0:30:26.5 MK: I do think sometimes though that the data scientist or machine learning engineers potentially exacerbate that. I don’t know if that’s a fair thing? But it’s like, I think sometimes they’re like, “We wanna do some of the sexier work. We might get better results if we do this fancy algorithm versus just doing the old, if then.” So I don’t know if it’s also like our actions sometimes egg that on.

0:30:52.5 DB: There’s definitely that, that you get more funding for the more exciting new stuff and you can get businesses to invest in your department if you say that you’re doing the exciting new stuff. I’m sure there are businesses all over the world coming up with, “How can you use ChatGPT and Midjourney and all these other things in our business, even if it’s not necessarily the right solution?” And one of the things I’m a big believer in is look at the problem. You can’t bash nails in with a power drill. So you’ve gotta look at, “What is the right tool for the job?” And sometimes if you need a straight-forward, simple, very cleanly explainable solution, then some of the more traditional techniques are the way to go. If you’ve got a complex problem, then look for the more complex techniques.

0:31:43.3 TW: Have you seen cases where something simple and very… an if-then or a linear regression or something that’s pretty straightforward and, relatively speaking, easy to explain, compared to something sexier, more robust, harder to explain that does perform better, but only a little bit better? Does that trade-off ever come up? That, “You know what? This other much harder to explain, feels-much-more-like-a-black-box option is a little bit better, so it’s objectively better, but it’s not worth it because it’s got so much more complexity and it’s so much harder to explain that we’re gonna go with the simpler one?”

0:32:32.8 DB: Yeah. All the time. Like I say, it depends on the problem and what are the needs of that problem and the needs of that end user. So setting aside the fact that those complex things can be explainable, it is much harder work to make them so; particularly if you’ve already got something that’s fully built and running, adding that explainability can be difficult. But yeah. There’s many, many situations where a simple statistical technique will give an answer faster with smaller amounts of data, fewer resources, and get you to the same point with all of the extra information that you might need, compared to something that might take 12 months of research and much bigger data sets for a fraction of a percent improvement. And in the end, it comes down to a combination of, “What are you trying to find out, and what’s the need for explainability?”

0:33:21.6 DB: And it’s something you mentioned in an earlier question, Moe, that I didn’t get round to answering ’cause we diverted onto the complex-models-aren’t-explainable point. But there are some decisions that you don’t need explainability for and there are some that do. So if you’re thinking about simple things like what you chose to have for breakfast or lunch today, I don’t need you to explain everything that was going on in your mind as to how hungry you were or what you might have had on previous days or what was available, it doesn’t matter. Whereas, if we’re talking about something that’s actually gonna make a material difference to my life, such as, “Are you gonna authorize me to have a bank loan or a mortgage or something like that?” Then rather than just that answer, I want to know, well, why have you made that decision? Particularly if I disagree with it.

0:34:08.0 DB: And as humans, if we are presented with information we disagree with, we are far more likely to challenge it than if it’s information we agree with. So it’s in those situations where things have not gone in the way we want that we might want that information. Or if, let’s say, back to the regulatory side, if we’re doing something that’s going to have a material impact on someone else’s life, we are gonna want that information. So it’s not every applicable use of AI. I mean, you go back to the generative stuff, and if you get a pretty picture come out of it… Or a video come out, you don’t necessarily want to know why it made that stylistic choice beyond, did it match the prompts you put in? And so understanding the needs of the user and where they are and what they might want is critical when we are looking at the explainability piece.

0:35:00.6 MK: Generally speaking, though, is it fair to say that anything that involves any kind of compliance, you just always need to have better explainability. Is that a fair statement to say?

0:35:10.8 DB: I 100% agree with that. As soon as… ‘Cause that falls under that critical use of AI. You’re doing things that are legal where it’s gonna be audited. So you’ve got to have not only just a standard explanation, but all of your paperwork needs to be in order all the way down to the person that’s making that end decision and back up the chain again. I mean, it’s a no-brainer for me. You’ve gotta have it.

0:35:36.0 TW: Does the regulatory compliance, does that ever head down a path of discriminatory practices? If you’ve got an AML system and it’s doing something where it seems like it’s flagging a certain ethnic group more commonly, maybe more accurately. I mean, does the regulatory compliance head down the path that when there’s explainability, like, “Wait a minute, why are you… ” And I’m thinking not in the financial crimes, but more of what’s happened in credit ratings and other areas where it’s actually at… Speaking broadly, I think it’s more the training data is flawed. Does that open up as well, where the explainability kind of leads you down the path of saying, “Well, wait a minute, we may need to adjust the way we’re doing this because we’ve gotta account for flawed training data going in.”

0:36:33.6 DB: Absolutely. I mean, this is where the art of data science really comes into its own, because those models should never make it to production, through testing and through really understanding what they’re doing. We should know before something goes live whether there’s been a problem, whether it’s got that inherent bias, any sort of gender bias or bias around marginalized groups, or even just a lack of appropriate data. So I mean, particularly in the financial space, you may have a small proportion of individuals who are, maybe they work a lot of cash-in-hand jobs. So their profile of lots of cash deposits and cash withdrawals might overlap with that of somebody that’s doing one of the money laundering typologies. So understanding that someone is vulnerable and marginalized compared to potentially criminal is really important, and you don’t always have the full data set. So being able to be really clear about why the flags come up and what else someone might need to investigate is really important. But you’ve got to know that that’s a problem upfront. You’ve gotta be able to see that in the data. And a good data scientist, when they’re building the models and testing them, will be able to highlight that.
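As an illustration of the kind of upfront data check described here, a small sketch, assuming pandas and using hypothetical column names and made-up rows: before any model is built, it summarizes how many records each group contributes and how often each group is flagged, so obvious imbalance or disparate flag rates are visible early.

```python
# Hypothetical column names and made-up rows, purely for illustration.
import pandas as pd

df = pd.DataFrame({
    "customer_group": ["A", "A", "A", "B", "B", "C", "C", "C", "C", "C"],
    "flagged":        [0,   1,   0,   1,   1,   0,   0,   0,   1,   0],
})

# Row counts and flag rates per group: a group with very few rows, or a flag
# rate far above the rest, is a prompt to investigate or augment the data
# before any model is trained on it.
summary = df.groupby("customer_group")["flagged"].agg(count="size", flag_rate="mean")
print(summary)
```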

0:37:47.4 MK: You’ve mentioned testing quite a few times now, so I’m keen to kind of understand a little bit of your thoughts on that in terms of… Yeah. I feel like it’s one of those things where, when we look at all the examples of models that have gone awry or resulted in disproportionate treatment of one group over another or whatever, it’s never been intentional, that I’m aware of. It’s generally always ’cause someone’s made a mistake or not been aware, whatever the case may be, right? But how do you develop a good testing program and have that awareness as a data scientist to not make those mistakes?

0:38:21.9 DB: That’s a great question ’cause a lot of companies are failing to do this really well. Generally, even well-tested systems can have problems, because during testing they give the right answer but for the wrong reasons. These are what we call Type III errors. Your testing looks good, but because you don’t understand why the results were as they were, you may then green-light that model and it goes out into the world. And if you’re not continuing to monitor it and continuously sample, then it will just get worse and worse and you won’t notice these things. But going back to the beginning, generally you start with analytics of the data. So if you’ve got a good, diverse team, so people have got different backgrounds, whether that’s educational or socioeconomic, and just look at the data in different ways, they’ll be able to spot the problems before you start building a model and be able to say, actually we need to augment this with some synthetic data just so that it’s not too lopsided. There’s been an awful lot of papers that show that if you’ve got imbalanced data, it is impossible to get a fair model out at the end of it, because you either end up with a much too high proportion of false negatives or false positives.

0:39:38.6 DB: And when you’re building the model, you have to decide which one you optimize for, depending on what the purpose of the model is. Is it better to have too many false positives and the end user has to then sort through them all? Or is it better to miss too many things you should be catching? And that’s a really difficult question. So you’ve got to do that upfront and analyze it and make sure that you’ve got that data. ‘Cause it may be that you don’t have the right data or the right information to build a fair model and that’s the point at which you need to then start looking at some other techniques. But even when you do, you’ve got to make sure that you have enough realistic test data. And it’s one of the things when we’re looking at the balance between synthetic data or artificial data or digital twinning or whatever terminology you use and real data, you’ve got to make sure that you’ve got a combination of both.

0:40:33.4 DB: Because the synthetic data is based on how things are at a point in time. And particularly when you’re looking at behaviors, my space in the financial world, how people have behaved financially over the past five years, as you can probably imagine, has changed considerably. So, look, if we had digital data that was created five years ago, it’s not necessarily representative of how things have happened in the past 12 months, 24 months, things like that. So we need to look at how things are changing, is it representative? And make sure that we are balancing that data constantly before it even gets to production. And then post-production, constantly sample and check that new data against our production models and our in-training models, and make sure that we are not dropping in accuracy, that we’ve still got the same acceptable levels of bias and false positives and false negatives and things like that, and that our explainability still holds. So you shouldn’t think of testing and explainability as just the start of the process. It has to be continuous through production and while the model’s live, because otherwise things just go wrong quickly, as I’m sure your audience will have seen in some of the scare stories.
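To make the false positive versus false negative trade-off from this part of the discussion concrete, here is a minimal sketch, assuming scikit-learn and an entirely synthetic, heavily imbalanced dataset (not real transaction data): moving the decision threshold trades alerts the reviewer has to clear (false positives) against genuine cases that get missed (false negatives).

```python
# Synthetic, imbalanced data only; illustrative of the trade-off, not an AML model.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Roughly 2% positives, mimicking how rare the cases of interest are.
X, y = make_classification(n_samples=20000, n_features=12, weights=[0.98], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]

for threshold in (0.5, 0.2, 0.05):
    predictions = (scores >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_test, predictions).ravel()
    print(f"threshold={threshold:.2f}  false positives={fp}  false negatives={fn}")

# Lowering the threshold catches more genuine cases (fewer false negatives)
# at the cost of more alerts to review (more false positives).
```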

0:41:51.8 TW: Do you think an understanding and a good level of intuition about the distinction between Type I and Type II errors, false positives and false negatives… I feel like I’ve found myself having to, you said it, you have to pick one that you’re optimizing towards, or if you’re setting a threshold, you gotta figure out if you move this, you’re gonna make one of those go up and the other one go down. Is that an explainability thing or an education thing? ’Cause I think of explainability as, like, here’s how it’s doing things under the hood. Understanding the trade-offs, I guess. Yeah. Is that part of explainability?

0:42:34.1 DB: I think that starts with education. I think it’s something that’s very easy to overcomplicate ’cause it’s very mathy and it’s all based on the null hypothesis and things like that. But when I’m doing internal trainings, I try and keep it at a very simple one dimension with pictures of cats and dogs and if you say it’s like, I want to detect pictures of cats and you show the computer a cat and it says yes it’s a cat, then there’s your true positive. And if you show it the cat and it says it’s not a cat, then you’ve got a false negative and you can explain things very simply like that. And then once they’ve got that understanding of the false negatives, false positives, true negatives, true positives, you can then take a step back and say, well, if we think as to what might it be that we are detecting and what we’re not, so is it some, are all the pictures that we showed it of a cat on a white background and if we showed it a picture of a cat on a blue background, would it be able to detect it?

0:43:29.6 DB: And you can start very simply getting people to understand not only what these statistical errors are, which is the education piece, but then also that explainability understanding of what might help them understand if it gets it wrong. So that’s really important. And I think quite often when we try and do these explanations, we are looking at very complex data, and if you jump straight to complex data, it’s very difficult for us as humans to really get our heads around it, because we are not used to thinking in multiple dimensions. So you have to start simply, and then once you’ve got that concept of true and false for a very simple ideal, you can then start expanding that out to, okay, rather than a single cat or not, let’s look at different breeds of cat, and then animals. And then you are expanding those dimensions in a way that feels comfortable, and it’s a very straightforward educational piece that everyone can get.

0:44:28.8 DB: I think probably children from about the age of three onwards, when they can tell the difference between different four-legged animals coherently, can get it. It’s nice and simple, and then that background piece becomes the explainability: what are we looking at? Are we looking at ears, whiskers, nose shapes, things like that? Once they’ve got that visually, it then feels comfortable when you look at financial data or words or anything else. So that’s the level of basic understanding we need everybody to have, so that they can go into this eyes open and get what sort of errors there are.
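Here is a tiny sketch of the cat-detector framing above, with made-up labels and predictions, that tallies the four confusion-matrix cells in the same plain language:

```python
# Made-up labels and predictions, purely to name the four cells.
actual    = ["cat", "cat", "not cat", "not cat", "cat", "not cat"]
predicted = ["cat", "not cat", "not cat", "cat", "cat", "not cat"]

cells = {"true positive": 0, "false negative": 0, "false positive": 0, "true negative": 0}
for truth, guess in zip(actual, predicted):
    if truth == "cat" and guess == "cat":
        cells["true positive"] += 1    # shown a cat, said "cat"
    elif truth == "cat":
        cells["false negative"] += 1   # shown a cat, said "not a cat"
    elif guess == "cat":
        cells["false positive"] += 1   # shown something else, said "cat"
    else:
        cells["true negative"] += 1    # shown something else, said "not a cat"

print(cells)
```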

0:45:04.6 MK: So it sounds like constant education is kind of the like foundation here in order to be successful with explainability.

0:45:13.4 DB: Yeah, absolutely. And I don’t think that’s just sort of an AI explainability thing. I think as a species, we could do better with that. Just making sure that people are comfortable. Because technology is not gonna stop changing and not gonna stop developing. So if we can get into the habit of making sure that we give out this information in a way that’s straightforward, that’s easy, that everyone can get regardless of educational level, then we can get to a point where we are not scared by these new things and they just become things that can help us, help us live longer, help us do our jobs faster, and whatever it turns out to be.

0:45:51.5 TW: I feel like, like Annie Duke has built an entire post-gambling career on, like, probabilistic thinking, being comfortable with uncertainty. I mean, I feel like that’s the constant battle, is that there’s this idea that the data gives the truth, the AI model gives the truth, and maybe that’s gonna be a benefit of ChatGPT, is people acknowledging that, oh, it’s going to give me some bad information. We were in a period, even in your world, I assume there are organizations saying, why did you miss this financial crime? Or why did you flag that as a financial crime when it was totally legit? And it comes down to, well, let me draw you a confusion matrix. You probably don’t wanna start there, you’re much better. Like, well let’s talk about identifying cats. We’re doing the same thing. Do you want to call some dogs cats or do you want to let some cats not get labeled as cats? Like that’s the game. It seems like the question for our times in the data analytics world is, it’s not getting you to some singular hard-line threshold of a truth. Right?

0:47:05.7 DB: I think that’s a really good point. And, again, that calls back to what are we using these things for? And that comfort with getting it wrong, because the data is only as good as its provenance and the amount you’ve collected. We’ve all seen those sort of bogus scientific studies where, like, three people anecdotally said something happened and therefore suddenly that’s a fact. So understanding the rigor behind how these things have been built is important. But then how comfortable are you with it getting it wrong? And weather forecasting is a great one for that, because if you… 95% accuracy sounds really high. It’s the sort of thing that if we had on a test, everyone would be happy with. But if you look at models that are 95% accurate, that still means they get it wrong one time in 20. And if we’re talking about the weather forecast and you’re saying, will it rain or not? The impact of that one in 20 is quite low. You either have to buy an umbrella or you get wet, or maybe you carry an umbrella you didn’t need to. So it’s quite low impact, nobody cares. But when it comes to more critical situations, like for a blind person telling them whether it’s safe to cross a busy road, that one in 20 suddenly becomes much higher impact.

0:48:22.4 DB: And there may be a percentage that’s acceptable: is 99% high enough? Is one in 10,000 high enough? But we need to be comfortable with what that percentage is and we need to understand that percentage. Because for most decisions in our life, it won’t matter. And even in the financial crime space, because of the checks and balances and the extra information, you can then look at those results and you know that somebody is reviewing them. There’s a human in the loop, and they’re taking the data you provide and then making that decision. And as long as the best decision is made with the data you have available, then that’s the point we need to get to. And I think as individuals, we can sometimes be quite bad when we’re holding people to account for making wrong decisions in hindsight. It’s like, did they make the best decision with the information they had at the time, or did they make the wrong decision with the best information available at the time? And that’s kind of where we need to draw the distinction, because if someone’s made the best available decision based on the evidence, then we learn from it and say, what extra data do we need? And we move on. But if they haven’t, that’s when we need to look at, well, what can we do to prevent that from happening again?
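As a quick worked example of the accuracy figures in this exchange (the decision counts are illustrative only):

```python
# Illustrative numbers only: how many wrong calls different accuracy levels
# imply over a given number of decisions.
for accuracy, decisions in [(0.95, 20), (0.95, 10_000), (0.99, 10_000), (0.9999, 10_000)]:
    expected_errors = (1 - accuracy) * decisions
    print(f"{accuracy:.2%} accurate over {decisions:,} decisions "
          f"is roughly {expected_errors:.0f} wrong calls")

# 95% accurate means about one wrong call in every 20; whether that matters
# depends entirely on the impact of each wrong call.
```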

0:49:39.3 TW: All right. Well, I kind of wanna lob, like, 27 different things at you to tell me how you’d explain them simply, but I have to explain that we’re running outta time. And the way that time works is, if we have a clock.

0:49:54.7 MK: Oh, Jesus.

0:49:55.2 TW: And it moves in a circle. No, that’s terrible.

0:49:57.4 MK: Yeah. Wow.

0:50:00.0 TW: But this has been great. Before we go, we always like to do a last call where we each share something of interest, a thought, a post, an article, a book, a movie, a podcast that we think might be of interest to the audience, whether it’s on topic or not. Janet, you are our guest. So we will, do you have a last call for us?

0:50:24.2 DB: I think I have. So firstly, I said I’m gonna be cheeky and ask for two things.

0:50:28.9 MK: Yes.

0:50:29.0 DB: So the first thing is…

0:50:31.5 TW: Totally accepted.

0:50:32.9 DB: Is the pause on AI development, because I think it’s just there as a soundbite. I mean, it’s there to raise awareness, and for that it’s amazing, ’cause it’s got people talking about AI and what it could do. And I think it’s really important that we start talking about these things and about what it can do and what it can’t do, but do it in a way that’s slightly less hyped up and a bit less Terminator, and a little bit more, this is a technology that is gonna change our lives. We’re gonna have difficulty believing the things we read and hear. So that’s something that I think people should read, understand, but definitely come to their own conclusions about. Because personally I don’t think pausing is the way to go. Not everyone’s gonna do it. Some countries are gonna storm ahead. The pace of development is huge, so it’s gonna happen, but take these letters as a sign that you need to understand this better. The other thing I just wanted to touch on, I did mention it earlier, just from a financial crime point of view. There’s a book by the amazing Patrick Alley called Very Bad People and it’s the story of fighting financial corruption worldwide. And it will change your view on…

0:51:56.2 DB: How you purchase things, how you look at the world around you and how you look at different countries and governments, it’s a really eye-opening book and I’d recommend everyone read that.

0:52:07.4 MK: That sounds really interesting.

0:52:11.5 TW: That’s gonna be ordered shortly. Awesome. Moe, what about you?

0:52:16.6 MK: Mine has zero relevance to anything at all to do with data or explainability, but it is something that I wanna share, because it actually got me thinking quite a bit. So Jeffrey Kluger is a senior writer at Time Magazine, and he’s written a whole bunch of books on the relationship between siblings, and I watched a TED Talk that he did the other day on The Sibling Bond. And anyone who talks to me for like half a second knows about my sister, who, Janet, conveniently also works in the data space. So lots of the listeners know her. I think she’s like the most amazing human ever, and she’s really shaped who I am as a person and even my career. But he talks about what an important relationship it is and the fact that your parents leave you too early and your partner and children come along too late, but your siblings are really there for you for the whole ride. And it’s just such a lovely reminder about how you choose to spend your time and the relationships that you prioritize. Anyway, I just really enjoyed it, so I’m going to recommend people give that a watch.

0:53:23.7 TW: Hi, Michelle.

0:53:24.7 DB: Hi, Michelle.

[laughter]

0:53:27.5 MK: And Tim, what about you?

0:53:30.5 TW: Shocker, I’m gonna go with a podcast episode from a little while back from Planet Money. I’m sure they have been a last call before, but they did an episode called How Randomized Trials and the Town of Busia, Kenya changed Economics. And what’s kind of fascinating about it, it’s basically, so there’s this town in Kenya that sort of became a center for RCT design and implementation expertise. It started off that they did an RCT there, and then it just wound up with this feedback loop where the intricacies of doing a well-designed RCT wound up drawing more researchers there. So it became a knowledge hub for RCT design. They did, early on, kind of diss econometrics pretty hard, like unnecessarily harshly, I think. They actually talk about some stepped wedge design stuff without labeling it as such. But I was just kind of nerding out saying, “This is all stuff that I would have had no clue about five years ago. And I think I understand what they’re talking about,” but the idea that a town wound up becoming a center for an analytics-oriented expertise, just because that’s where they were doing some studies, was kind of fascinating. So it was a little meta. So with that, this has been really fun and, man, I really enjoyed this discussion.

0:55:04.4 MK: And me too.

0:55:04.6 TW: And to have like four or five things spinning around in my head that I will be directly applying and referring back to from this point forward. So, thank you Janet so much for taking the time to come on and chat with us.

0:55:18.5 DB: You’re welcome. It’s been an absolute pleasure.

0:55:21.3 TW: So as always, if any of our listeners, if you would like to explain anything to us, we’re easily findable on LinkedIn, on Twitter @AnalyticsHour, on the Measure Slack, if you’re part of that. Janet, where can folks find you? Does your Twitter handle have an explanation because I can’t pronounce it.

0:55:40.9 DB: It does. So yeah, I’m available on all of the major socials. I think I am the only Janet Bastiman in the world, so I should be fairly easy to find. My Twitter handle is a non-vowelled version of Isabelle, and it comes from when I was at uni in Oxford and we had to come up with characters for a role playing game that didn’t have any vowels in. So I liked it, and it’s become my online handle. So if you see an Yssybyl on games, it’s usually, but not always me.

0:56:08.8 TW: So the assignment, listener, is if you search for Janet Bastiman on Twitter, you’ll find it, but it’s got three Ys, and so I will leave it at that. It’s quite a handle. I had figured there had to be a story for that. Awesome. And no show would be complete without thanking our producer, the amazing Josh Crowhurst. He will stitch things together and make us sound good. In maybe two or three years, he’ll be able to insert Michael into the episode just through the magic of generative AI, but for now, we’ll have to wait for the next episode for Michael to be back. So Josh, thanks for all the work that you do. And with that, no matter who you’re trying to explain your model to, or whether you’re trying to have somebody explain a model to you, keep analyzing.

0:57:03.9 Announcer: Thanks for listening. Let’s keep the conversation going with your comments, suggestions, and questions on Twitter at @AnalyticsHour, on the web at analyticshour.io, our LinkedIn group and the Measure Chat Slack group. Music for the podcast by Josh Crowhurst.

0:57:21.9 Charles Barkley: So smart guys want to fit in, so they have made up a term called analytics. Analytics don’t work.

0:57:28.8 Kamala Harris: I love Venn diagrams. It’s just something about those three circles and the analysis about where there is the intersection, right?

0:57:38.4 TW: Rock flag and Type III errors.
