#137: Data Science + Words: An NLP Meet Cute for Analysts with Dr. Joe Sutherland

Did you know that there were monks in the 1400s doing text-based sentiment analysis? Can you name the 2016 movie that starred Amy Adams as a linguist? Have you ever lain awake at night wondering if stopword removal is ever problematic? Is the best therapist you ever had named ELIZA? The common theme across all of these questions is the broad and deep topic of natural language processing (NLP), a topic we’ve been wanting to form and exchange words regarding for quite some time. Dr. Joe Sutherland, the Head of Data Science at Search Discovery, joined the discussion and converted many of his thoughts on the subject into semantic constructs that, ultimately, were digitized into audio files for your auditory consumption.

Tools, Techniques and Concepts Mentioned in the Show

* This actually became something of a fascinating tangent several months after the show launched. As Tim noted early in the episode, he’d never heard of “neuro-linguistic programming,” but it popped up as he was doing some show prep. We had a brief chat with our guest about it, and then moved on. As tends to be our default, in the show notes here, we just linked to the Wikipedia page without giving it much thought. We know that Wikipedia should not be considered an authority (anyone with school-aged kids knows they’re generally not allowed to use it as a direct source in school papers)…and, yet, so many of us default to it. Ultimately, we got contacted by the iNLP Center with a request that we remove that link (they took issue with, among other things, the Wikipedia entry referring to neuro-linguistic programming as “pseudoscience”). And they provided us with an alternate link that they felt better represents the topic. This is a pretty standard SEO backlink strategy (we get these sorts of requests every so often), and we’re aware of that. But, this was a pretty detailed note, and they included both a link to their own explanation of what neuro-linguistic programming is, and a link to their takedown of the Wikipedia page. THAT post opens with a disclosure notice that they sell NLP training. As a podcast, this was definitely a tangential aspect of the show, and we don’t have the expertise to weigh in on the matter at all. But, we applaud anyone who has the tenacity to take on Wikipedia for SEO, so we’ve removed the link we originally posted and added the link to iNLP.

Episode Transcript


0:00:04 Announcer: Welcome to the Digital Analytics Power Hour. Tim, Michael, Moe, and the occasional guest discussing digital analytics issues of the day. Find them on Facebook at Facebook.com/AnalyticsHour, and their website, AnalyticsHour.io. And now, the Digital Analytics Power Hour.


0:00:28 Michael Helbling: Hi, everyone. Welcome to the Digital Analytics Power Hour. This is Episode 137. You know, when the show opens, I usually tell a story or some clever anecdote that ties loosely to the topic of the show, and certainly, as a listener, I’m sure at times you hear this and struggle with sentence-boundary disambiguation. And while I’m talking, you’re probably engaging in terminology extraction to deduce the topic of the show. But whether you prefer lexical or distributional semantics, I am sure by now you know, we’re gonna talk about natural language processing. Hey, Tim, did you catch all those cool things I worked in?

0:01:08 Tim Wilson: I applaud your preparation. Nicely done.

0:01:14 MH: Oh, Wikipedia is my friend. And Moe, how are you doing today? And you’re excited to talk about natural language processing?

0:01:22 Moe Kiss: I am. I have a thousand questions and can’t wait to have someone explain all those terms.

0:01:27 MH: Well, excellent. Well, I know that NLP sounds like something that might be coming along to take our jobs, and if so, contact the NLRB ASAP, but we needed a guest, someone to give some named entity recognition to this discussion. Joe Sutherland heads up data science at Search Discovery, and yes, he is hiring. He is a Research Fellow on Public Policy and Big Data at Johns Hopkins University, he has his Ph.D. in political economics from Columbia University, and he has flown on Air Force One. But today he is our guest. Welcome to the show, Joe.

0:02:06 Joe Sutherland: Hey, pleasure to be here. Thank you very much, Michael. And also, a pleasure to meet you, Moe. I don’t believe we’ve met before.

0:02:13 MK: Nice to meet you too.

0:02:14 JS: It’s a great pleasure. And Tim…

0:02:16 TW: That’s why you’re saying it’s a pleasure, just give it time.

0:02:19 JS: Oh…

0:02:19 TW: We’ll be able to check in again in about 45 minutes.

0:02:23 MK: Ugh.

0:02:23 MH: I am certainly glad we finally got you on the show, Joe. It’s something that I’ve been angling for, for about a year, and Tim too, because we think you do a great job explaining some of these topics. So we’re delighted to have you. And maybe we can start with Tim’s first question, which is, does NLP stand for natural language processing or neuro-linguistic programming or both, and if so, what’s the difference?


0:02:47 JS: It is absolutely… Well, Tim, do you wanna add any color to that question?

0:02:54 TW: No, I was just… I thought I was locked in on NLP being NLP and then started finding all sorts of other TLAs that were apparently in the same realm.

0:03:04 JS: Okay, that makes a lot more sense, ’cause I thought you were catfishing me there for a second.

0:03:08 TW: No no no, I thought it was natural language… Literally, I was fumbling along somewhere, and I’m like, “What the f… ” Where did this come from? I thought, “Oh, crap, if we’re gonna go off the rails, I have the acronym wrong, this is not a good sign.”

0:03:23 JS: Well, natural language processing is, I believe, the best-practice term for NLP. Or I guess you could say NLP is interchangeable with natural language processing, more so, at least, in my field, than it is with neuro-linguistic programming. But I think that what you’re getting at is, what’s the difference between the two? And look, I am by no means an expert on neuro-linguistic programming, but from what I can tell, it is more an idea that you can change the way that you behave and interact and experience the world by changing the way that you think and speak about it. And so that’s sort of the best idea that I could get with neuro-linguistic programming. Again, not an expert, so if anybody roasts me down the road for that…

0:04:07 TW: But it is something totally different, right? I just… It is totally, totally different world. Okay.

0:04:14 JS: Yeah.

0:04:15 MH: That’s not the same.

0:04:16 JS: Well, it’s not, barely. Look, topically it is a different world, right? But theoretically, there are overlaps. Natural language processing is not necessarily a science that’s just about computers, right? It’s not just about computer science. It integrates real-world conceptions of what content and thought mean, and it tries to tie them together with all the computers that we have on the other hand and make something special out of it. And so I think actually when you say neuro-linguistic programming, one of the first thoughts that popped out in my mind was this idea called the Sapir-Whorf hypothesis. Have you ever heard of the Sapir-Whorf hypothesis?

0:04:58 MH: No.

0:04:58 MK: Moe’s shaking her head.

0:04:58 MH: Moe, just so you know, this happens a lot in discussions with Joe, where he’ll be like, “Oh, by the way, are you familiar with… “

0:05:05 MK: There will be a lot of head-shaking going on.

0:05:07 JS: Did you see the movie Arrival? That’s, I think, potentially more…

0:05:11 MH: Yeah.

0:05:11 JS: Okay. So definitely pop culture reference. In this movie, they bring in a specialist and she’s trying to use computers and NLP to understand this alien language. But then as part of understanding this alien language, she starts being able to experience time in a circular way. And so the Sapir-Whorf hypothesis, I think, inspired this movie. It’s the basic idea that the way that we think and speak and interact with the world actually changes the way that we understand it. And in this movie, it’s the perfect example of the Sapir-Whorf hypothesis being accomplished, which is she changes her language into this sort of temporal language, all of a sudden she can manipulate time with her mind. Again, probably butchering the description of the movie, but the point here is that linguistics and NLP are not separable. In fact, they’re quite the same. That natural language processing is an interesting subset of both computer science and many other fields because it is the combination of true real-world theory concept behavior and the way that we think about measuring the world through computers.

0:06:00 MH: So maybe, because we’re on the… This is the little cultural-historical piece, and I remember we were having a discussion and I said, “Hey, my understanding is that this stuff really came about in the 1950s and the processing power wasn’t just there,” and then you went off on a riff and you were like, “Hah, 1950s, you silly young whipper-snapper.” Is it worth going through the quick… Because it is… Because you just said, it’s the science… There’s the theory, and then there’s the processing power, and those haven’t necessarily moved in sync, right? So it goes back a lot longer than the 1950s.

0:06:00 JS: So, okay, so… History of NLP, huh? Yes. It does go back a very, very long way. If you wanna go really, really… I think the deepest cut to NLP if I could think of it, would probably have to do with just… Just think about text and writing. That’s probably the deepest you can go on NLP. Anybody know the genizah papers, you ever heard of the genizah papers before?

0:07:14 MH: No. Again, head shaking.


0:07:15 JS: These were medieval records of correspondence that were preserved for millennia by edict of Jewish law. The Jewish law said you could not destroy these papers, and these records were preserved for thousands of years. And through these records are essentially the only reason that we know about the development of the Silk Road, the precedence of international trade, the ancient origins of economics and commerce. And I guess you could go a little bit further back. I know that there were some tablets in Babylon and we get hieroglyphics, etcetera. But writing is incredible because it is a way that has preserved human thought in some sort of conveyable way for… Since the time of our birth as the human civilization. And that’s what’s so cool about writing, is you can actually put these ideas outside of your own body and you can contract them over time. And now, what we’re thinking about, that’s interesting. Writing is interesting. You’re all like, “Okay, let’s get to NLP.”


0:08:11 JS: But it’s important to say because, as long as we have had content out in the world through text and images and whatever it might be… Or even, for example, somebody might have put on a medieval play. Content has to be interpreted and analyzed by some other source. And through doing that, we can transmit information from point A to point B. So point A produces the content, content gets put out in the real world. Point B has to actually interpret and consume that content. So that’s what… We’ve usually done that with just words and understanding, and it’s almost something we take for granted. But NLP is interesting because it’s like teaching a computer to love. It’s like, what is it about love that we can actually put words and rules to help something else actually experience it? And it’s really hard to do that. And so we’ve tried to come up with great theories of how we can do that, etcetera.

0:09:02 JS: So that’s interesting from a general perspective. If you go a little bit deeper and think about quantitative natural language processing or even just sort of quantitative text analysis, first time we had quantitative text analysis was in the 1400s. A lot of bands of monks were actually employed to quantify… They would basically look at writings from districts around the Catholic world, and they would look at these districts and they would say, “Oh, looks like in this period of time, we had one, two, three positive pro-Church writings and we had five negative Church writings. And if we add all these together, looks like we’re at a negative three in this district. So better send some missionaries.” Very… It seems almost like, “What? They were doing math in the 1400s?” Absolutely. Yes, they were. And it actually…

0:09:48 TW: Math?

0:09:48 MH: They were doing sentiment analysis. We can’t even do that on Twitter. It sounds like Net Promoter Score, actually.
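The monks' tally described above amounts to summing hand-assigned labels per district. A minimal sketch of that bookkeeping follows; the district names, the counts, and the +1/-1 coding are hypothetical and chosen only for illustration. The labels still come from a human reader, which is where the bias Moe raises next comes in.

```python
from collections import defaultdict

# Hypothetical hand-labelled writings as (district, label) pairs.
# +1 = pro-Church, -1 = negative; each label stands in for a reader's judgement.
writings = (
    [("district_a", +1)] * 2
    + [("district_a", -1)] * 5
    + [("district_b", +1)] * 4
    + [("district_b", -1)] * 1
)

def net_sentiment(labelled):
    """Sum the +1/-1 labels for each district."""
    totals = defaultdict(int)
    for district, label in labelled:
        totals[district] += label
    return dict(totals)

scores = net_sentiment(writings)
# Districts with a negative net score would be the ones flagged for attention.
flagged = [district for district, score in scores.items() if score < 0]
print(scores, flagged)
```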

0:09:55 MK: Well, well, you guys knew that I was gonna get really excited about this. But when you hear that, it is a type of sentiment analysis which is also potentially quite flawed, because it involves a lot of human bias. And I suppose that’s one of the things that would have evolved in this practice, is… You can’t get rid of your bias, but you can try and reduce it, I presume. And still, when you’re writing a model to do what the monks did in the 1400s, there’s gonna be some bias by the person writing the model, but presumably, that’s one thing that’s improved in the field? She says with a question mark.

0:10:28 JS: Question mark. Yes and no. So at the end of the day, we still have what is called researcher-induced bias, which is the way that we set the rules of the game, the way that we set the table still has our own bias in it. And while we’ve gotten better… I don’t know if you’ve ever heard of the jury theorem, but it’s just this idea that groups of people can make better decisions than any individual person can. And even if you get 100 completely non-expert people, they all make a guess, it’s gonna be a lot better than one expert, on average.

0:10:58 MH: See, James Surowiecki and the Wisdom of the Crowds. Now, we gotta take it to the simplified version, but that’s the same thing, is that… Wisdom of the crowds is the same thing as jury theory?

0:11:10 JS: Yeah. Same exact idea. And the cool thing about what we get today is not necessarily that computers are unbiased, because we set the table for the computer. We write the algorithm, we pre-process the data, we put the data into our model of choice, etcetera, etcetera, etcetera. There are so many things that we do that obviously put tons of bias in. But the difference today is that it’s easily replicable and the set of rules can be reviewed by huge committees of people. You just think about open-source projects. This is a perfect example of the jury theorem at work. We’re gonna come to a better answer because we have lots of people critiquing the rules that we’re implementing or the algorithms that we’re using, etcetera, and that’s where really the reduction of bias comes from. There is a nice, simplified codebase or even though some people call it a codebook, from which you can do these NLP analyses. And that’s why the bias is reduced.
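The jury theorem mentioned here (usually credited to Condorcet) can be checked with a short calculation. The sketch below assumes voters decide independently and that each one is right with the same probability `p`; the function name and the example values are illustrative. Under those assumptions the majority improves with group size only when `p` is above one half, so the "non-expert" voters still need to be somewhat better than a coin flip.

```python
from math import comb

def majority_correct(n, p):
    """Probability that a strict majority of n independent voters is right,
    when each voter is right with probability p. Use odd n to avoid ties."""
    k_min = n // 2 + 1  # smallest number of correct votes that forms a majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

# Compare a lone voter with progressively larger juries at the same skill level.
for n in (1, 11, 101):
    print(n, round(majority_correct(n, 0.6), 3))
```

Running the same loop with `p` below 0.5 shows the opposite effect: larger groups become more reliably wrong.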

0:12:00 MH: So what’s the… I don’t know, we could probably march down the history a little bit farther. But at the same time… So the monks in the 1400s, using some code presumably or and their set of rules for them to count. So that’s text analysis. That feels like a cousin to simple bag-of-words type… Where does the written text analysis, the leap from that to natural language, the spoken audio as well as even to natural language generation, is that one continuum, or is it the same ideas, the same principles, trying to understand written text versus the mechanisms for converting speech to text or… How does… That all falls under NLP?

0:12:50 JS: There’s a broad net of NLP, and we’ll get to the taxonomy in just a second. To answer your first question, which is where does it make that leap, really in the 1900s is when we first see the major first leap happen. The 1900s was the renaissance for many different reasons. You got a booming world economy, you have… You have Fisher. Let’s not forget about Fisher.

0:12:51 MH: Fisher.

0:12:51 JS: He was a revolutionary… Himself. We have all these great intellectuals coming around, and then you have World War II right as we’re getting these… Electricity is widely available, computers are becoming more, at least available to those who have the resources to use them. And if you think about the code-breakers during World War II… I don’t know if you saw the movie on Alan Turing. That machine actually, was essentially, was doing a form of natural language processing or more generally pattern recognition on the codes to try to get something interesting out. And so, the revolution starts in the ’30s. You get people like Harold Lasswell, he’s another guy who was looking at these communications that were being transferred by enemies and trying to statistically predict, “Where are they gonna attack us next,” based on these content analyses of these transcripts.

0:14:00 JS: But it’s really, it’s 1964, in my opinion, that we really see the revolution in true NLP, ’cause a few statisticians got together, Mosteller and Wallace. And in 1964, they published this widely cited statistical analysis of text. And the purpose is actually to determine who was the author of the previously un-attributed Federalist Papers. The Federalist Papers, nobody knew who, or there were arguments for who it was who wrote all these… And actually, it turned out they used the what’s called stylometry, which is looking at statistical patterns and certain types of words that are used. And they determined conclusively with statistical evidence that it was actually Hamilton.


0:14:37 MK: Wow.

0:14:39 JS: And that’s why they say that Hamilton… And this was a big deal at the time ’cause it took computers, you had to tabulate all this stuff. It took eons of or, I guess, reams of cash to throw at research assistants, etcetera. And this is what really popularized natural language processing. So ’64, I would say, would really be the revolution there.
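The stylometry idea, comparing how often authors use certain common words, can be sketched in a few lines. Everything below is a toy under stated assumptions: the marker words, the sample sentences, and the nearest-profile comparison are all made up for illustration, and the published 1964 analysis was far more careful than this.

```python
import re
from collections import Counter

# Illustrative function words; an assumption, not the published word list.
MARKERS = ("upon", "while", "whilst", "on")

def rates_per_1000(text, markers=MARKERS):
    """Occurrences of each marker word per 1,000 words of text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1
    return {m: 1000 * counts[m] / total for m in markers}

def distance(a, b):
    """Sum of absolute differences between two rate profiles."""
    return sum(abs(a[m] - b[m]) for m in a)

# Hypothetical writing samples standing in for texts of known authorship.
known = {
    "author_a": "upon reflection the plan depends upon the states and upon the people",
    "author_b": "whilst the plan is debated the states wait on the people whilst they decide",
}
disputed = "the union rests upon the people and upon their consent"

profile = rates_per_1000(disputed)
closest = min(known, key=lambda name: distance(rates_per_1000(known[name]), profile))
print(closest)
```

With real documents the samples would need to be thousands of words long before rates like these mean anything.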

0:15:00 MH: So many ways to go from here.


0:15:02 TW: I just keep thinking as you’re talking about this, and today we have chatbots. And I’m…


0:15:07 TW: What would that… Am I right? If I recall that that… ‘Cause there was a podcast. I listened to a podcast episode about the chatbot. It was a MIT professor who had wound up having very much second thoughts about it because he had his admin… He was finding that people, it was a stupid rules-based, I say stupid rules-based. I couldn’t do it today. But he wrote one that’s and just, and found how willingly and quickly human beings would engage even when they knew what they were talking to was a robot. But again, that was done through chat. It was dealing with the text as opposed to needing to translate speech. But even that… And it was finding different kind of ways to say, “If I don’t understand it, I’ll just do a, tell me more, or something like that.” And was the early iteration of robotic therapists. Am I butchering this horribly?

0:16:05 JS: No, not at all. And this is, you’re talking about Eliza? Is that what you’re talking about?

0:16:09 TW: That’s the one, Eliza. See, there you go. Yeah.

0:16:12 JS: That’s right. Eliza was… It was cool, ’cause I think that to test it, he literally just put it as a computer terminal in like a storefront somewhere and people on the sidewalk could come by and type random things into this robot and would end up sitting there for a long time, ’cause they thought they were having a conversation with this thing. ‘Cause it’s interesting. When you form the habit of being able to discuss something with somebody, if you can have a computer replicate that habit, it becomes second nature to have a discussion. It doesn’t necessarily mean it’s substantive. He was using simple rules to play back things that you said like a… Exactly like you said, as a therapist. Like, “You mentioned you were sad. Tell me more about that.” Things like that. And I think number one, perhaps it is a criticism of the therapists at the time.


0:16:58 JS: Perhaps that was the reason. But number two, it really shows how, even through this mirroring type approach, we can have conversations with computers, and it is very interesting. You start to think about the Turing Test, and I know Moe was very interested in the Turing Test as well. So I’ll let her ask a question about that, perhaps.
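The mirroring approach described for Eliza, a handful of patterns plus a "tell me more" fallback, can be sketched as follows. The rules and wording here are hypothetical and are not the original program's script; they only show how little machinery the trick needs.

```python
import re

# Pronoun swaps so the reply points back at the speaker (illustrative subset).
REFLECT = {"i": "you", "my": "your", "me": "you", "am": "are"}

# (pattern, reply template) pairs, tried in order. All hypothetical.
RULES = [
    (re.compile(r"\bi am (.+)", re.I), "You mentioned you are {0}. Tell me more about that."),
    (re.compile(r"\bi feel (.+)", re.I), "Why do you feel {0}?"),
    (re.compile(r"\bmy (.+)", re.I), "Tell me more about your {0}."),
]
FALLBACK = "Tell me more."

def reflect(fragment):
    """Swap first-person words for second-person ones."""
    return " ".join(REFLECT.get(word.lower(), word) for word in fragment.split())

def respond(utterance):
    """Return the first matching rule's reply, or the fallback."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    return FALLBACK

print(respond("I am sad."))
print(respond("The weather is nice"))
```

Nothing in the program models meaning; it only rearranges the user's own words, which is why the conversation can feel natural without being substantive.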


0:17:20 MK: Before we get to that, I suppose the thing that I still struggle to understand, is what is actually included under the definition of NLP and what is not included? Because it sounds like it could almost be anything to do with text and interpretability? But I don’t know if that’s accurate or not.

0:17:41 JS: Yeah. I think in general, it has anything to do… It’s a broad topic. It’s a very broad topic. And you can get your Ph.D. in any number of little teeny sub-fields. And you could have somebody who says, “No, we need to approach translation using deep learning.” That could be one little teeny section. Somebody else says, “No, we need to approach machine translation using geometric approaches,” which is the lexical distances and angles and n-dimensional hyperspace and how those patterns are reflected in other… There are so many different ways to think about…

0:18:12 MK: I know a story. When you say translation, do you mean human-to-computer translation, or you mean actual language like Chinese to Japanese translation?

0:18:21 JS: Yes. Actually language to language translation. Those are just small little teeny subsets. So a simple definition is just if it involves language and computers, that’s natural language processing, generally spoken. And within there, you have to divvy it out. There’s the approaches that are trying to convert maybe one data form, one unstructured data form to another. So that first approach might be, think about text to speech, speech to text. That’s a different pattern recognition problem than perhaps something that’s a little bit more relying on symbolic logic. And so this used to be some of the old translation approaches would go to… We actually could code rules, how languages… Did you ever take Latin? You ever take Latin in high school?

0:19:04 MK: No. Historians are not good at languages, especially ones that aren’t spoken much anymore. But…

0:19:10 JS: Well, I agree with you. I think whoever took Latin, I’m sorry for you. But… There’s that scene from the Monty Python movie “Life of Brian,” where he’s trying to write something on the walls like, “Romans go away,” but he doesn’t conjugate the verbs appropriately. So one of the Roman gladiators comes over and twists his ear and puts it against the wall. He’s like, “Did you actually mean to say this?” Well, “How do you conjugate this thing?” And then he’s like, “I’m sorry, I’m sorry.” Kinda like… That’s just to say yeah, some people thought you could translate using rules. There was a very different field from a statistical analysis of text for a long time, but as deep learning has become more important, and you start to see this convergence between these other fields and the statistical practice of trying to interpret language as it is written. And so we are seeing a little bit of a convergence. But yeah, it’s a very broad field. Anything involving language and computers, I think would be probably the best definition for NLP.

0:20:04 MH: Does that then… I mean, take a smart speaker, you have the human being speaking to it. And so part of the NLP there is translating that audio into symbolic logic behind the scenes into text. There’s another almost entire branch that’s saying now that we’ve got that translation done, the interpretability turning that into a query to return results. And then there’s another branch that is, I’ve got to translate my response into a normal-sounding language for natural language generation, is that… Those individually seem like monumental tasks but though they all fall into the umbrella of NLP.

0:20:50 JS: Unfortunately, yes, that definition includes them as well.

0:20:55 MK: That’s a great example Tim of how many different pieces of the puzzle it takes to execute on something like Siri or Alexa.

0:21:04 TW: Well what had me thinking about because we’ve got the major smart speakers competing that it’s kind of like what’s your lowest, your weakest? Apple in my experience, Siri doesn’t actually have the great results return. They don’t… It doesn’t have that middle piece really nailed down. It doesn’t have the database to actually answer the question, even if it’s got a good understanding on the in or the out. I mean, it’s just crazy that Google and Amazon and Apple have all in an incredibly short time, come up with pretty fully-formed solutions to do substantially the same thing and it’s just… It’s a little mind-boggling to me that they all figured out that the technology, the smarts and how to wire all that up and operationalize it at scale all seemed to happen in a ridiculously short period of time and do a reasonable job. And so you’re gonna say, you have your own, you’ve done your own. There’s the Doctor Joe smart speaker that you’re like, “No, no, no, it’s just a little TensorFlow. Plug that bad boy in.”

0:22:10 JS: My smart speaker answers questions such as, “Don’t forget to eat lunch,” or other sorts of things like that. I have very idiosyncratic questions I ask him, “Did I eat lunch today?” “You know, yes.” But, you’re right, and there are other fields that are developing such as information ontology discovery. This is the idea that how do you actually figure how the information goes together? What are these ontologies that we might hold within our brains and how do we discover those from the data around the world as opposed to coding them from our brain into a computer as a sort of codex of rules that allow you to go through the same logic that you would as if you were a human, right? And your question is, what allows all these things to happen? Nothing more than we have the computing power and we have the broadly available data to be able to come up with better methodologies for analytics. And that’s gonna be in my opinion, where these things go in the future, right? It’s…

0:23:10 MH: When you say broadly available data, is that because there’s data to train? There’s enough data available, that is structured or semi-structured that we can train. What do you mean by that we have the data available being a key?

0:23:22 JS: Machine-readable already available and accessible via like the cloud at scale ’cause there’s a bunch of data just like I gave this example with the Genizah Papers and all these other data sources out in the world. There’s a ton of data out there. The real question is, where is the machine-readable data and is it organized in a way that will easily allow an algorithm to access it? And that’s a big piece of it. It’s obviously a very big piece of it, but now let’s just say, okay, now we actually have that ability. We still need the analytical methods to be able to use that data to develop inferential abilities. We could be able to say something about the world and that’s I think gonna be where we’re going in the next few years, which is new methodologies and new understandings of representations of language and knowledge networks and etcetera, etcetera, that allow us to better make inferences about the world.

0:24:13 MK: And is there something that has actually changed in the methodologies over, I suppose, particularly the last sort of five to 10 years that have allowed companies to accelerate their NLP processes so quickly? Are people discovering new methodologies in addition to the cloud making the data more accessible in addition to having the right data in the right format ’cause yeah, as a methodologist?

0:24:38 JS: Well, again, it’s a mixed bag. So one great example, everybody remembers when Geoff Hinton won ImageNet right back in 2014.

0:24:48 MH: I think everybody might be a little…

0:24:50 TW: Oh yeah, yeah.

0:24:51 MH: I know the anecdote, but I think everybody might be a…

0:24:53 TW: Of course. I still have the poster up from… Yeah.


0:24:58 JS: So it was a big deal. It was a very big deal because he had succeeded… He and his team had succeeded in building a singular self-contained approach to be able to make inferences about the images that it was seeing that had incredible accuracy. And that methodology, I mean, it roots back to the ’50s, back to the ’50s. That same methodology was invented back in the ’50s. And of course, there’s other some people say, “Okay, backpropagation, important discovery, important methodology.” And again it was, but it was really sort of a modification of the maximum likelihood type methods that we had dating back to even Fisher. And so, it’s repackaging these things in a way that is number one, acceptable by the scientific community and exciting to the scientific community ’cause it’s those little bursts that get people really excited and make them wanna go and actually research a route make something that’s gonna be applicable, engineer something that’s useful, but it’s also about the computing power at the same time. That new methodologies, in my opinion, are gonna switch from right now there’s a big push for interpretable machine learning. Right now, neural nets deep learning. I mean, a great example of this is Google Translate. So Google Translate, few years ago swapped over its rules-based language engine to a fully deep learning-powered.

0:26:17 JS: And the old code base, was like, I think it was either hundreds of thousands or millions of lines of code and to maintain that is a complete disaster. When we swapped over to the neural network-based code base for being able to do translations, the quality of the translations changed significantly overnight, it was much more higher quality, but the code base shrank to something like, I think it’s on the order of thousands of lines now as opposed to the massive scale that used to be. Now the problem is we’re not really sure why it works. That’s the main problem and so when you think about new methodologies it’s gonna be starting to tie our understanding from other concepts like Econometrics, how do we actually make inferences about the world, how we better represent social behavior, etcetera, linking those concepts to the actual super powerful models that are very, very flexible and work for a variety of different types of things. So, yes and no, is the answer again.


0:27:10 TW: But how… So as you’re talking about the scientific community and basically researchers and academics who are getting excited or massive organizations that have… Google has a huge organization internally, is there at the same time, the challenge of the lay business person who thinks it’s just magic, you have this stuff available can’t we just do X? Why don’t we just… They dream up something that sure there are examples of it in the market but they have no idea of what is actually really needed to make it happen. And this maybe goes to your day job, like are you running into that where there are expectations that the magical Dr. Joe is gonna come in, and wave a wand and now our competitors’ websites, will be completely scraped and interpreted and assessed or, I don’t know. Yeah, it’s just a few thousand lines of code so real easy, sounds like.

0:28:14 MK: Can I just add to that? And I don’t know if you’re saying the same thing with your clients, but there does seem to be this perception that NLP is easy like customer wrote something, said something, it’s easy to understand what that means and to use it as a meaningful dataset which I suppose adds to Tim’s question overall, but…

0:28:35 TW: [0:28:35] __ Word is that I’ll get back to problem formulation that do people chase NLP without actually articulating the problem they’re trying… It’s like chasing AI or chasing machine learning, they’re focused on the method, not the question.

0:28:47 MK: Yes. Yeah.

0:28:48 TW: There you go. Wind up. Gave you plenty of time to wind up and swing at this one.

0:28:53 JS: Okay, so yes, there is a sort of… And this is the press. You think about fake news. The press is definitely somewhat culpable for this picture of artificial intelligence. I think the headline that I most think of is, I think it was even like MIT Technology Review, it was, “AI has become so smart it is inventing itself.” That’s the headline that I always go to in my mind because it sort of suggests that like, “Oh wait a second…” It’s like this technology is just so general-purpose, it actually is great enough for itself. Now, just take us out of the loop. You go, you form your next opinion. And it couldn’t be further from the truth. Everything that we do in NLP is highly conditional on the problem we’re trying to solve, and when we engineer a solution to something, we have to understand what that thing we’re trying to optimize for is. And I get this question all the time in the text analytics workshops that I teach, and one of them is, “Why are you teaching us all this stuff? Isn’t there just an API I can throw this into and have it work?”

0:29:55 MK: Ugh.

0:29:56 JS: And I have to say, I’m like, “Well, sure, if you care about the same things that the people that coded the API care about.” Because think about stop words. Great example: stop words. A common step in NLP-type analyses is to remove stop words. Okay, let’s just think really low level, not even thinking about any sort of “AI.”

0:30:14 MH: Stop words being a, and, of, the… If typically the words that… And ideally don’t have much meaning and will clutter up stuff, right?

0:30:24 JS: Exactly, yes. These commonly used, not really meaningful “words.” Oftentimes, we remove those and we conduct our analysis after removing them. Now there’s a significant mass of research that shows that when you’re researching things such as politeness, which is very, very important in sort of like the natural conversational artificial… You don’t want Alexa to cuss you out. There’s like… It’s a very important topic. If you’re trying to mimic conversation, you wanna leave the stop words in, because the stop words are actually one of the most indicative things of pleasantness and politeness. You can’t just give anybody a standard, one-size-fits-all API for this stuff, because you always need to understand what you’re trying to optimize for. And the conception that you can just throw something into a computer and the computer will solve it… Every time I hear that, my eyes roll into the back of my head and my hair lights on fire, because I’m just thinking about, “Now I’m gonna have to go and talk this client all the way back down,” and the expectations have to be reset.
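Joe's point about stop words can be illustrated with a minimal sketch. The stop word list, the `tokenize` helper, and the two example sentences below are all assumptions made up for illustration, not anything from the show. The sketch shows a polite request and a blunt one becoming indistinguishable once stop words are removed.

```python
import re

# Illustrative stop word list (an assumption; real lists are much longer
# and differ from library to library).
STOP_WORDS = {"a", "an", "and", "of", "the", "to", "me", "you", "could", "please"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop every token that appears in the stop word list."""
    return [t for t in tokens if t not in stop_words]

polite = "Could you please send me the report?"
blunt = "Send the report."

polite_tokens = remove_stop_words(tokenize(polite))
blunt_tokens = remove_stop_words(tokenize(blunt))

# Both sentences collapse to the same tokens, so the politeness signal is gone.
print(polite_tokens, blunt_tokens)
```

If the goal is topic detection, that collapse is probably harmless. If the goal is measuring politeness, the words that were thrown away were the signal, which is why the choice depends on what is being optimized for.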

0:31:28 MH: A standard response that you feed into Amazon Polly that then it calls them and then reads it in a delightful British accent. That’s the…


0:31:38 MK: So sorry, that view that you throw the data… Can throw the data at an API, do you find that’s coming from the engineers or the senior management, stakeholder type people? Because one of the things that I’m noticing as a client-side is that more and more engineers are delving into this space because they’re building a lot of these systems that then feed into our actual product. Yeah, I’m curious where that or is it coming from all of them? ‘Cause that’s tough.

0:32:09 JS: It’s not necessarily engineers or management. I think we get some of both in every category. I’ve seen management who just wanna get it done and they don’t really wanna understand what’s going on. “Nobody cares about the methodology. Just stick it in, let’s see where it goes.” And I get that. There’s also others, especially in our profession, digital analytics, a lot of people who understand that correlation and causation are very different things that need to be interpreted differently, and so our analyses need to reflect that. And they set that expectation for their junior staff. I think that’s a really important approach. Similarly, you get engineers… There’s some engineers who come at it from a really highly theoretically motivated background, who then sort of went into computer science and now they’re doing the engineering. I think that those people are usually more at an advantage because, as long as they can learn SDLC and how to work on a team that’s actually engineering some sort of broader spec or project, I think that what you bring with you is a lot more interpretability, ability to describe what’s going on to the actual client, and the ability to identify when things are going very, very wrong in your model. And the disservice, I think, for the engineers that just wanna hit an API and get done with it, is that things go very wrong and you miss it. And so, I think it’s, I guess, not too clear-cut, right? It’s a little bit of both in every single section.

0:33:25 TW: What it seems like… You talked about bias earlier, and that seems like the sort of thing that requires actually having depth of knowledge and thought and experience in trying to explain. And we’ve done bias and ethics podcast episodes before, where that whole… It’s kind of a deep topic. We think that we know right or wrong, but the idea of like, “Well, your training set is skewed this way,” that’s definitely the sort of thing that your lay executive is not going to think of, and even the engineer says, “Oh, look, I got this training dataset from this open data source,” while completely oblivious to the non-representation of the dataset, or something like that.

0:34:06 MK: Wasn’t this the example with like Gmail, where when it was doing auto-complete, it was auto-completing that if you’d said, like, “boss” or whatever, to “him” or “he”? And I mean, that’s a huge consequence. Can someone please tell me if that’s actually a thing, or if I just pulled that out of my memory and it’s not accurate? But those are the consequences of the models that you build, is that it can be biased towards… Well, like you need to recalibrate it so it doesn’t have the biases that we as humans have, I guess.

0:34:36 JS: I’m not sure of that one but I know there was one that Amazon had to pull an HR AI that looked through resumes because basically, it was like, if you played lacrosse and your name was Chad, we’re definitely gonna hire you.


0:34:49 JS: Which, correlation is not causation. If you’re a Chad-based lacrosse player out there, it’s not… You’re not a terrible person, just Amazon’s doing it wrong.

0:34:58 TW: Well, yeah, we see issues in every single application of machine learning, which is, if you’re using unsupervised machine learning, there are gonna be patterns in the data that are reflective of biases, and even worse, you might not know how to label those biases, and so they’re gonna be reflected in the model that you ultimately use. I mean, from a supervised perspective, the information that’s going into the model usually contains other sorts of bias that have been put into the data, because we’re humans and because we have lots of biases as part of our everyday function. And so one of the things I recommend for people, especially as they’re trying to avoid bias, because it is really important and there are beginning to be very significant legal and regulatory repercussions for not addressing it, is to have an internal review board at your organization. And if there is a model that you build, and you believe that there is a high likelihood that there is bias as part of that model, have that committee of people, who should be diverse in their representation, et cetera… Have that committee of people review the use case, and review the model, and review its results before actually approving it, to make sure that what you’re putting out into the world is as we would want it to be, and not necessarily as it is.

0:36:16 MH: That’s back to… This is like ring so… It’s back to putting people process time and hard work into something that is not mechanical and automatable like I feel like that is the history of analytics and that sort of thing that winds up being critical, and it’s unappealing ’cause there will be 27 vendors out there saying, “Hey, instead of your AI internal review board, buy our tool, it’ll automate the internal review board.”

0:36:47 JS: You use AI to set the AI ethics on the AI that you’re putting out there.

0:36:51 MK: Oh, wow.

0:36:53 TW: Yeah.

0:36:54 MK: It’s funny though, I had this really weird problem last week, where I’m dealing with like the UTM parameters from hell, with a whole bunch of like encoding issues with spaces and underscores and pluses and percent-2B, anyway. And me and this other analyst, we’re kind of like doing our heads in about how we fix it, because I kept saying, like, “Let’s just change everything to underscores.” And she said, “Well, what if someone genuinely put a plus in there?” And we got in this whole huge discussion and couldn’t figure out a solution that we were both comfortable with. And my boss actually has a Ph.D. in NLP as well. And now that we’re having this discussion, I’m like, I wonder if this is what influenced his suggestion. So his suggestion to us was like, “Well, just take them all out.” And we’re like, “What do you mean?” He’s like, “Take out everything that isn’t a letter so it will just be one giant string, and then you should be able to match it that way. So all your underscores and pluses get removed from the UTM parameters.” And I was like, “That is a really weird suggestion, but I think it’s gonna work. So sure.” And as we’re discussing this, I feel like maybe his studies have influenced, like, how he’s tackled that problem.

0:38:05 JS: Absolutely. I mean, that is a common step in any sort of pre-processing procedure you would conduct on text before converting it to something like a distant reading type apparatus, such as bag of words, or an embedding, or something like that. And, I think, in particular, you’re talking about fields, sort of like the headers on columns, right? And trying to match those up in cases where you’re using these smaller sets of labels on certain columns, right? There’s an idea. Have you heard of fingerprinting before, or key collision matching, types of things like that? So those ideas are like, well, if we can just make them look the same, they probably are the same, and we just treat those other symbols as noise, we’re gonna get rid of them. And so, when we take the noise out, we get a really nice match.
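The "take out everything that isn't a letter" suggestion and the fingerprinting idea Joe mentions can be sketched roughly as follows. This is only an illustration under assumptions: the campaign values are invented, the function names are hypothetical, and decoding percent-escapes first with the standard library's `urllib.parse.unquote_plus` is an added step so that an encoded plus (`%2B`) does not leave a stray letter behind.

```python
import re
from collections import defaultdict
from urllib.parse import unquote_plus

def fingerprint(value):
    """Decode URL escapes, lowercase, then strip everything that is not a letter."""
    decoded = unquote_plus(value)
    return re.sub(r"[^a-z]", "", decoded.lower())

def group_by_fingerprint(values):
    """Group raw values whose fingerprints collide on the same key."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return dict(groups)

# Hypothetical utm_campaign values with inconsistent separators and encoding.
campaigns = [
    "spring_sale",
    "spring+sale",
    "spring%20sale",
    "Spring Sale",
    "spring%2Bsale",
    "summer_sale",
]

groups = group_by_fingerprint(campaigns)
for key, members in groups.items():
    print(key, members)
```

The trade-off is that the fingerprint deliberately destroys information: because digits are stripped too, values such as `sale2019` and `sale2020` would collide under this scheme, so it only suits cases where the removed characters really are noise.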

0:38:50 MH: But technically you’re actually removing some information in order to, that’s like stemming and dropping all that out. Now you’ve done something where they can be merged together, but you’ve lost the uniqueness, even if they were glitches from unique like that. That seems like that’s another one of those deliberate steps of saying we can do this matching if we dumb things down a bit, but being just very cognizant of… And that just seems like another one of those things that a data scientist is cognizant of… I’m doing this, I’m doing it deliberately, I’m actually modifying my dataset, but the risk versus reward of doing that makes sense?

0:39:30 JS: Yeah. This is all part of lexical analysis. It’s how do we try to represent information and features that are coming out of text, something that we might pump into our model. Yeah, you have to collapse over certain pieces of information to be able to get model parameters out of your model. And it’s an important thing to do because I’d hate to just… Have you ever heard of the concept of interchangeability or exchange-ability?

0:39:52 TW: I feel like this is a trick question, ’cause that seemed like a word I know.

0:39:57 MK: I’m like, “Continue, and maybe.”

0:40:00 MH: Maybe…

0:40:01 TW: Like Eli Whitney, the cotton gin? Are we talking about…


0:40:03 JS: [0:40:03] __ about the world, it should be just as good about saying something about any individual within the unit of analysis for your model. So if you’re looking at documents, each row on that spreadsheet, the unit of analysis might be the document, it might be sort of a person or an individual or an account you’re trying to personalize to, etcetera. Whenever you create a model, the parameters in your model should be able to explain those people just as well as if you had pulled out their ID and just pulled their row. And what you’re trying to do is come up with a model that is very good at explaining them without actually having to know who they are. And that’s why you end up eliminating information that you don’t think will help you say anything about them. That’s what you’re trying to do and you have to do that for… There’s a variety of statistical problems that emerge if you don’t do things like that. But that’s what you’re trying to do, and you decide which information to destroy as the researcher, and that’s where a lot of bias comes from, absolutely.

0:41:06 MH: So Joe, would you say that a good rule of thumb, if you’re gonna do this kind of work, is to, just based on some of the things we’ve been discussing, if you’re going down into a model and doing some of these analyses, don’t try to reuse that same model for another problem, because of some of the challenges you have with all these different things?

0:41:28 JS: Yes, yes, now we’re getting into a little bit of my dissertation, right? Here’s a great example. You wouldn’t compare, for example… I guess we’re all marketers, right, so you wouldn’t take a model from one company that is trained on their customers in, let’s say, healthcare and use that to try to predict something about a bunch of customers in another circumstance where you’re trying to predict something with respect to like e-commerce, right? I don’t think that there’s a huge use case for a model that predicts, for example, like the probability that you need a liver transplant and trying to predict the probability that you are going to buy this product or convert on this page. You don’t wanna just interchange models left and right because they were modeled, right? And this is one of the big issues, I think, which is out there in both NLP and machine learning. The models are only as good as the data that you’re putting into them and the approaches that you’ve selected to try to make inferences about the world. Conflating the two just because it’s all machine learning is a huge issue. So I think that you really do focus on an important topic.

0:42:30 MH: No, I appreciate the clarity on that, ’cause that was what I was drawing from that was, hey, you’re gonna make all these choices to strip this out or add that in or do the analysis this way, but then if you try to take that same exact thing and then shift 90 degrees and go after a different problem, you open yourself up to some wild thing that you never expected, so that makes a ton of sense, actually.

0:42:52 JS: To make it even clearer, a great example is, you can run, for example, a model that tries to predict your political ideology from Twitter posts, right? And if you’re looking at… Let’s say you see a really misspelled Twitter post, it’s definitely a Democrat, it’s definitely a liberal. And that’s what your model’s telling you. And you go and you run that same model on a kindergartener and their spelling test, and you’re like, “This kindergartener is definitely a Democrat.”


0:43:24 JS: You’re not mapping to the same underlying question, right? And our models have this problem which is, they don’t care. Machines will do with complete impunity the dumbest things in the world because you told them to do that, and that’s an important consideration to take into account.

0:43:41 MK: Okay, can I take things to a introductory level, perhaps? I feel like there’s a common problem that lots of analysts always struggle with. And I’ve been there myself, and I know what I did was probably not the best way to tackle it, but I just did what I did ’cause life pressure, etcetera. Basically, you have a dataset which is open text responses from customers. What are the first few things you would do with that? Because I feel like there are so many different ways you can go, and I’m interested to hear how you would start to tackle that problem.

0:44:18 JS: So especially on my team, we have a pretty well-defined process that we go through, and it starts with the basics. So yeah, fine, we may have the dataset, but before we even get to that, what are we trying to say? What is the population that we’re trying to make inferences about, and how does the sampling procedure that generated this dataset that I’m looking at map back to that population? That is, even at a basic level, a question that we need to answer, ’cause that’s gonna affect the type of method you wanna apply, perhaps some of the pre-processing you wanna do, some of the EDA, exploratory data analysis, that you wanna do. And so you need to be able to answer that question first.

0:44:54 JS: What is the population we’re trying to say something about, and how did that population generate this dataset, and can I create a model that approximates that process? That’s the first question. Now, when it comes to text, it gets it a little bit more interesting, right? Because you have to start… Text is very big, it’s very sparse, you run into a lot of different issues with text that you wouldn’t run into with another type of exploratory data analysis problem, which is… The biggest problem with text is, it is a sparse form of data when structured. And a sparse form of data just means that you have a lot of zeros. If you think about it as an Excel spreadsheet, you have a lot of zeros in the cell values because not every observation has every single feature. And that’s one of the big issues with text…

0:45:38 MH: ‘Cause usually your columns are words or pairs of words or triplets of words, that you may have a million columns and they’re almost all zero, but each row, if each row is a sentence, that’s what you mean by sparse?

0:45:52 JS: Exactly, exactly. Not every document, or sentence, in this case, shares the same types of words. And so when you have sparse data, you need to be able to represent it in a sparse form. That’s, from an engineering perspective, one of the first things that we need to consider. The big problem being, if you try to blow that out onto your computer, you just crashed your computer. And so… That’s the first thing you need to consider is, “How do I manage this data from a sparse perspective?” Now once you have that down pat, and we just use sparse representations, point-in-space type representations of data, then we can move on to something a little bit more interesting, which is, just from a frequentist’s perspective, what words are being used frequently in this dataset? That’s a basic question we might wish to answer. And, if the first word that comes up is “Gblkgweh,” well, then you need to perhaps do a little bit more cleaning on your dataset. I… A lot of speech-to-text datasets actually have very, very poorly translated or transliterated words, and that can create huge, huge issues for models because it introduces a lot of noise, and you basically get a less accurate model.
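As a rough sketch of the sparse representation and the "which words are frequent" check described here, the snippet below stores only the non-zero word counts for each document rather than a full documents-by-vocabulary grid. The three example responses and the `tokenize` helper are assumptions made up for illustration; a real project would more likely use a sparse matrix library.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

# Hypothetical open-text customer responses.
docs = [
    "The delivery was late",
    "Great product, great price",
    "Late delivery, poor support",
]

# Sparse form: one Counter per document, holding only the words that occur.
sparse_rows = [Counter(tokenize(doc)) for doc in docs]

# The dense "spreadsheet" would need one cell per (document, vocabulary word).
vocabulary = sorted(set().union(*sparse_rows))
dense_cells = len(docs) * len(vocabulary)
stored_cells = sum(len(row) for row in sparse_rows)

# Corpus-wide word frequencies, for a first look at what is in the data.
corpus_counts = sum(sparse_rows, Counter())
top_words = corpus_counts.most_common(3)

print(f"dense cells: {dense_cells}, stored non-zero cells: {stored_cells}")
print(top_words)
```

Even on three short sentences, most of the dense grid would be zeros. With millions of word and phrase columns the gap becomes the difference between fitting in memory and not. If the top of the frequency list is gibberish, that is the cue for more cleaning.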

0:46:58 JS: You can also start to think about running what’s called a chi-squared analysis or what’s called a keyness analysis; key-ness, K-E-Y. And those basic analyses are… You now tie the text that you observe, simple bag-of-words approaches… You tie the text that you observe back to a variable that you care about that’s been binarised. So let’s say… You’re either a woman or you’re not a woman, right? One if you’re a woman, zero if you’re not a woman. Are there differences in the types of words that are used amongst that one group, or woman group, as opposed to the not-woman group, and what does that mean? Does this look right? How can we try to verify and validate these data? Now, given that you’ve gone through that, you can start to tell a story even with these basic techniques before you get into anything such as some more intense cleaning, some more intense perhaps dimension reduction exercises. Because when we get into regression using text, the fundamental issue, especially with linear types of regression…
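One common way to run the chi-squared keyness analysis described above is to build, for each word, a 2x2 table (this word versus all other words, group one versus group zero) and compute the Pearson chi-squared statistic on it. The sketch below assumes that formulation; the example texts, group labels, and function names are hypothetical, and the counts are far too small for the statistic to be trusted in practice.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def chi_squared(a, b, c, d):
    """Pearson chi-squared for the 2x2 table [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denominator

def keyness(group_one_docs, group_zero_docs):
    """Rank words by how unevenly they are spread across the two groups."""
    one = Counter(t for doc in group_one_docs for t in tokenize(doc))
    zero = Counter(t for doc in group_zero_docs for t in tokenize(doc))
    one_total, zero_total = sum(one.values()), sum(zero.values())
    scores = {}
    for word in set(one) | set(zero):
        a, c = one[word], zero[word]
        scores[word] = chi_squared(a, one_total - a, c, zero_total - c)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical responses, split on a binarised variable (1 vs. 0).
group_one = ["love the colour love the fit", "love it"]
group_zero = ["hate the colour", "the fit is poor"]

ranked = keyness(group_one, group_zero)
print(ranked[:3])
```

The statistic only says that a word is unevenly distributed, not in which direction, so the raw counts for the top words still need to be read before telling a story with them.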

0:48:00 JS: Linear regression suffers from a huge issue called over-saturation, which is if I have more columns in my dataset than I have observations, I cannot run the inversion process on the matrix that I need to actually get those coefficients out to make predictions. And so, over-saturation is a giant giant issue, and, that is to say, another way of phrasing the problem is I need at least one fewer columns than the number of observations that I have. And, in text data I could have 12 million columns from a dataset of only 10 or 20,000 people. And… And so we need to… We need to account for that by reducing the dimensionality of those columns. Now there’s two ways to do that. One is through feature selection processes. Are you familiar with the lasso regression?

0:48:46 MK: No, I’ve used the Boruta feature selection before, but that was a very different problem.

0:48:52 JS: Okay. So there’s lots of different ways of how to select which features are potentially related to the outcome of interest, and, those features might change based on what your outcome is. If you’re trying to map to gender or predict gender, you’re gonna get different features from the dataset than if you’re trying to predict something like probability of conversion, or, for example, like an affinity to a certain topic, or, even more generally something like the probability that they’re gonna donate to this cause or something like that. And so, your different feature selection mechanisms will give you different features. That’s one way to do it, which is basically to say these columns are exactly what we want, we’re gonna reduce the feature.

0:49:29 JS: Another way to do it is using unsupervised learning techniques such as latent semantic analysis, latent Dirichlet allocation is another good one, essentially clustering techniques that are gonna give you a smaller lower-dimensional number of columns, maybe something like 30 that represent clusters of words that you see that appear together, and can maintain the same sort of informational basis as if you were gonna use that entire matrix, and then you can run a regression on something like that, more classification. You classify or something like that. And so, I skipped a lot of steps in there which the bigger point here was the lexical analysis step and, we can definitely go into more detail if you want if it’s useful for you.
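The over-saturation problem that motivates both routes to dimension reduction can be checked on a toy matrix. This is a minimal sketch under assumptions: the numbers are invented, there is no intercept column, and exact fractions are used instead of a numerical library so the result is not blurred by rounding. Ordinary least squares needs to invert X-transpose-times-X, and that matrix cannot have full rank when there are more columns than observations.

```python
from fractions import Fraction

def transpose(matrix):
    """Swap rows and columns."""
    return [list(row) for row in zip(*matrix)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    b_columns = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_columns] for row in a]

def rank(matrix):
    """Rank via Gaussian elimination with exact fractions."""
    m = [[Fraction(value) for value in row] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols):
        pivot = next((i for i in range(pivot_row, n_rows) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]
        for i in range(pivot_row + 1, n_rows):
            factor = m[i][col] / m[pivot_row][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return pivot_row

def gram_is_invertible(x):
    """True when X^T X has full rank, so OLS coefficients are uniquely defined."""
    gram = matmul(transpose(x), x)
    return rank(gram) == len(gram)

# 2 observations, 4 word-count columns: more columns than rows.
wide = [[1, 0, 2, 1],
        [0, 1, 1, 3]]

# 3 observations, 2 columns, e.g. after feature selection or clustering.
tall = [[1, 0],
        [0, 1],
        [2, 1]]

print(gram_is_invertible(wide), gram_is_invertible(tall))
```

Once an intercept is added it takes up a column of its own, which is consistent with the "at least one fewer columns than observations" phrasing in the discussion.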

0:50:07 MH: But… I’m sorry, I missed… I missed… What was the specific API that we just point the data to?


0:50:15 TW: Yeah, so just plug in what in a Google Cloud Platform?

0:50:20 JS: My team is the API, all you see is the data that go in and the data that go out.

0:50:25 TW: That’s right. The Sutherland API. Alexa, please read my statement on complexity of analysis.

0:50:34 MK: Oh my God, that was… I’m mind blown. That was really terrific. Thank you for…

0:50:39 TW: Yes, and we are now time blown Moe, so we have got to start to wrap up. And, it is awesome, so great to finally have you on Joe. This is awesome and I really appreciate it. One thing we’d like to do is go around the horn and talk about anything going on that we think might be of interest to our listeners. Moe, do you wanna kick us off?

0:51:01 MK: Oh, I would love to. I have a weird one, this episode, and, people keep tweeting me being like, “You have weird but good last calls,” so I’m gonna keep doing weird stuff. So, whenever I start managing a new team, I go through this exercise of coming up with what our team values are. And, I think it’s really important because it kinda sets the ground rules or the things that we want to have in place so that when we hit a speed bump, we kinda know, “This is what we stand for, so let’s revert back to what we stand for in this situation that’s difficult.” And I was catching up with one of my mentors last week after a really fascinating leadership conversation with Helbs at Super Week, and I…

0:51:42 MK: Basically her advice was you should be doing the same thing for yourself, and what you wanna stand for as a people leader. So, as… We all know that I love Brené Brown. I’m reading, “Dare to Lead,” again, and this time taking more notes and actually doing the exercises properly. But one of the things that I’m trying to go through is an exercise of what my leadership values are, what I wanna stand for as a people leader and what’s important to me, which has been really eye-opening. And I guess the thing is that I thought it was good to do as a team, but I never really thought about doing it for myself. So, whether you’re in a team or leading a team, I really recommend that you give it a crack, and, I’d love you to reach out and hear what you figure out is your values.

0:52:12 MH: Oh my gosh Moe… So impressed by that, and I’m thinking to myself, “We need to do that for the podcast.”

0:52:34 TW: Oh good lord.

0:52:35 MK: That is a grand idea. No, I love that.

0:52:38 MH: I don’t know.

0:52:39 MK: Yes, Tim.

0:52:40 TW: Brené Brown, University of Texas alumni and commencement speaker this spring at the University of Texas. Hook ’em horns.

0:52:46 MH: Alright Tim, what about you? What’s your last call?

0:52:49 TW: So mine is a competition that I would be in no way equipped to enter: the RoboTHOR Challenge 2020. It’s indirectly through the Allen Institute, which Paul Allen had founded. But the idea behind the competition, as I understand it, is that they’re trying to train a model based on a simulated environment, yet there’s kind of a challenge of how do you actually build and design things using simulations and then have those work in the real world. So this competition is basically rooms or spaces set up in a simulated environment, and then the competitors have to build kind of their AI solutions that will navigate this space and accomplish certain tasks. And they can only work within the simulated environment, and then the actual competition goes to the real-world environment and actually then tries to compete in that. And I may have kind of butchered it ’cause I read this like a week or so ago, but you basically do training in the simulated environment, then training in the simulation but with a real robot. And then the final challenge is a real robot that your stuff’s been loaded into, in a real environment, to see which one can actually achieve it. But it’s apparently trying to solve some big challenge around trying to work stuff out in simulated worlds instead of real-world environments.

0:54:12 MH: So it’s kind of like a Tough Mudder for robots.

0:54:14 TW: Yeah, there you go. That’s exactly what I was thinking, now that you said.

0:54:20 MH: Alright. That’s very interesting. I don’t know if we need one of those for the podcast or not. Joe, what last call do you have?

0:54:33 JS: Mine I’m gonna be a little more mundane, right, ’cause I like bad jokes and I’m not that interesting. So I think that the way I… Upcoming, I teach a lot of these conferences, I teach a lot of text workshops at conferences. And the most recently released one is gonna be talking about artificial intelligence for executives. We do a short two-and-a-half-hour workshop. What we do during that workshop is we show you what artificial intelligence is, we talk about what it isn’t, we talk about how to think about use case generation and then finally we build those into an executable playbook where you can start to message around your organization and start to build a coalition to support your artificial intelligence and data science initiatives. I think it’s a great talk. We talk about everything, the trends in the industry, what’s going on at Davos, all the way down to this is how you can think about capturing Alpha and automation using your data science prowess and finally build that into maybe a quick win project that leads to the next project that leads to something that actually leads to transformational change for your organization. And so that’s what we do during the workshop. If you have any questions about when they’re coming up, I can’t say at the moment, but go to searchdiscovery.com and check out the upcoming events schedule for any one of those.

0:55:53 MH: Very cool. I’d like to attend although I’m not sure I’m the target audience.

0:55:57 MK: I really want to attend.

0:56:00 MH: Yeah.

0:56:00 MK: Got to figure out a way to make that happen.

0:56:01 MH: Well, I went ahead and fed all of our last calls into machine learning to come up with mine and this was the result. No, I’m just kidding. So actually something I’ve been wanting to talk about as a last call for a little while is Dataset Search on Google came out of beta. This has been a while back, but we haven’t recorded a show where I could talk about it, I don’t think. Anyways, it’s really cool. And I’ve been using it a ton for finding academic research and datasets on topics. So go check it out. As you look for things to apply natural language processing to, Dataset Search may be your friend. Okay, I’m sure you’ve been listening and thinking, “Hey, is this whole podcast machine learning generated?

0:56:53 MH: And if so, wow, what a great job by everybody.” Yes, it was, and thank you to all of our computer, robot, AI overlords for generating such a great show. One of the cool things about that is we have input mechanisms that you can reach out to us on, like the virtual space, the Measure Slack and our LinkedIn page, and of course the greatest machine learning experiment of all time, Twitter. No, I’m just kidding. Alright, so I’ll quit with those jokes while I’m ahead. Oh, not really. Also, give a shoutout to our very real and not virtual producer, Josh Crowhurst, as he’s always really helpful to us in getting the show out to you, and we would love to hear from you. Alright, Joe, thank you so much for coming on the show. Been a pleasure. Honestly, too long in coming. So finally, thank you for doing it, and I think there may be a show number two in the future down the road there. So hopefully you don’t stay as busy as you have been over the last year.

0:57:54 JS: I appreciate it very much, thank you so much for having me on. I mean you guys are the real stars of the show as you know.

0:58:02 MH: That’s Tim Wilson, the quintessential analyst guy. So yeah. Technically that might be me trying to so… I might be doing some neuro-linguistic pro… I’m doing the other NLP, Tim…

[overlapping conversation]

0:58:16 TW: I thought you were… I thought you were quitting while you were not as far behind.

0:58:20 MH: Well, that’s different, not natural language processing. It’s neuro-linguistic programming. Okay, I am sure that I speak for my two co-hosts, Moe Kiss and Tim Wilson, who are fine examples of the best that there is in our industry for sure, when I tell you as a listener, also a fine example of some of the best in the industry, that no matter what your dataset and your model are showing you, keep analyzing.


0:58:50 Announcer: Thanks for listening and don’t forget to join the conversation on Facebook, Twitter or Measure Slack group. We welcome your comments and questions, visit us on the web at analyticshour.io, facebook.com/analyticshour or @analyticshour on Twitter.

0:59:10 Charles Barkley: So smart guys want to fit in, so they’ve made up a term called analytics. Analytics don’t work.

0:59:18 Tom Hammerschmidt: Analytics. Oh my god, what the fuck does that even mean?

0:59:26 MH: But everything looks pretty good.

0:59:28 MK: It sounds great.

0:59:29 MH: Well, thank you. I’ve been working, I warmed up, I gargled with some gin and honey.

0:59:35 TW: Gin and honey, I just stop with usually just the gin personally.


0:59:45 MH: Yeah, I hate to cut this short. I know you like it. I just saw the tweet that STI has acquihired another Joe today. So we’re gonna go ahead and drop now and jump on the phone with Tim instead, I’m kidding.

0:59:58 JS: Time to get down, time to get down.

1:00:01 TW: Old news. This is old news. We got another acquisition. Okay, anyways. I just want you to say Claude Shannon so that Matt Gershoff can let us know that like, “Oh there’s instant credibility.”

1:00:12 MK: So Tim, I have a funny anecdote to share with you. Remember Jamie and I, we’re sitting… We’re in Hawaii and we’re sitting at like this nice restaurant. I have my drink bottle with me and he’s like, “Whoa.” And I was like, “What is wrong with you?” He’s like, “Shit, I didn’t really think you’d legitimately have Tim’s face on your drink bottle.” And I was like, “Oh yes, the quintessential analyst.” And now my sister is just beating me up since I don’t have any stickers for her. So…

1:00:36 TW: Oh, I’m taking care of it.

1:00:38 MK: Please do so that she loves me again.

1:00:41 MH: Yeah.


1:00:44 MH: Mr. Bill might be… Maybe appearing again in the out-takes this time not wrapped in a towel.

1:00:50 MK: Well, there was a discussion at dinner last night about how they could all break in and start doing some kind of Hawaiian themed song. Joe, my family has a habit of all walking in when I’m recording and given we’re on a family holiday.

1:01:04 TW: Sometimes, fully clothed?

1:01:05 MK: Sometimes not.


1:01:06 TW: Rock, flag, and “I’m sorry, I don’t understand the question. Could you try asking it a different way?” How was that?

1:01:19 MH: That was good.

1:01:19 MK: That was great.


