Imagine a world where business users simply fire up their analytics AI tool, ask for some insights, and get a clear and accurate response in return. That’s the dream, isn’t it? Is it just around the corner, or is it years away? Or is that vision embarrassingly misguided at its core? The very real humans who responded to our listener survey wanted to know where and how AI would be fitting into the analyst’s toolkit, and, frankly, so do we! Maybe they (and you!) can fire up ol’ Claude and ask it to analyze this episode with Juliana Jackson from the Standard Deviation podcast and Beyond the Mean Substack to find out!
00:00:05.73 [Announcer]: Welcome to the Analytics Power Hour. Analytics topics covered conversationally and sometimes with explicit language.
00:00:14.49 [Michael Helbling]: Hi, everybody. Welcome to the Analytics Power Hour. This is episode 278. You know, these days, every device in your home, for some reason, wants to connect to the Wi-Fi. I mean, does my coffee machine want to write poetry or something, or give me sports scores? I guess sometimes the capabilities and features available might not fit the function of the thing. And maybe this is an intro to a podcast about where AI might fit into data analysis. So recognizing that AI is most likely here to stay, let’s, I guess, dive into it. First, let me introduce my co-host, Val Kroll. How’s it going?
00:00:51.74 [Val Kroll]: You always do that to me. You give me the most intro.
00:00:54.80 [Michael Helbling]: I was so close. No, I do it to everybody, in fairness. I've loved doing the "how are you going" ever since I learned it many years ago.
00:01:02.57 [Tim Wilson]: Well, I’m good.
00:01:03.23 [Michael Helbling]: But I try to only use it on Australians. OK, well, good. I need to get you a more Chicago hello. How’s you guys? That’s so New York.
00:01:14.39 [Tim Wilson]: We’ll work on that.
00:01:15.51 [Michael Helbling]: We’ll work on it. We’ll work on it. We’ll work on it. We’ll workshop it. All right. And Tim Wilson, big old Howdy partner.
00:01:22.80 [Tim Wilson]: How y’all doing?
00:01:24.34 [Michael Helbling]: There you go. Get back to your Sour Lake, Texas roots. There you go. And I’m Michael Helbling. And today, I’m really excited about our guest. Juliana Jackson is an associate director of data and digital experience at Monks. She’s also the co-host of the Standard Deviation podcast. And we all love her newsletter, Beyond the Mean, on Substack. And today, she is our guest. Welcome to the show, Juliana.
00:01:48.14 [Juliana Jackson]: Thank you. And I should answer with this very Eastern European accent so I can fit in with the stuff you guys were doing. You see, I can actually speak.
00:01:57.16 [Michael Helbling]: You’re Eastern European? What?
00:02:01.61 [Juliana Jackson]: I was trying to do the Eastern European accent. You see, I can speak like this too if I want to. That’s how I actually sound.
00:02:10.13 [Michael Helbling]: I like it. Well, this is actually sort of a long time in the making, because I think all of us have been reading a lot of your recent newsletter posts and passing them around to each other and talking about them. And we always find that when that happens, we’re sort of like, okay, we need to get this person on the show with us so we can have a conversation. So we’re really excited to have you today. And I think our first and most burning question is, how did you get Simo Ahava to do a podcast with you? No, just kidding.
00:02:43.16 [Juliana Jackson]: Listen. Listen. So at every MeasureCamp I went to, maybe a year or two ago, everybody was telling me, it’s so nice that Simo lets you do his podcast. And I’m thinking about Alban specifically, thanks Alban, if you’re listening to this. So what had happened was, the podcast was actually done by myself in the beginning, and I actually had Tim as a guest, and we had one of the most esoteric, dope episodes ever. We were talking about segmentation and horoscopes and causation and correlation, and it was so cool. I’ve always been a big fan of Tim, and I still can’t believe he talks to me sometimes. And he knows that, because he told me to stop, which I’m clearly not doing. But anyway. So I was working at CXL back then, so they were sponsoring the podcast. It’s pretty expensive to run a podcast, as you guys know. And then there was the Spotify Wrapped, and for some reason people were actually listening to me on my own, doing it the way I was doing it before, which was more, again, esoteric, more whimsical and shit. Oh, you’re going to lose your PG rating, by the way, after this.
00:03:52.45 [Michael Helbling]: No, no. We’re explicit. So just keep it.
00:03:55.06 [Juliana Jackson]: Oh, that’s good. OK, great.
00:03:56.70 [Michael Helbling]: Bombs away.
00:03:57.72 [Juliana Jackson]: We love it. Jim Sterne says it’s a sign of intelligence. So whatever Jim Sterne says, I will follow. Yeah, anyway, so I posted the results from the Spotify Wrapped, and it actually did pretty good for somebody that is, you know, I don’t know, the irrelevant person that I still am. And then I was speaking at the DHO Analytics Summit with Tim Coppens, and Simo was speaking too; he spoke before me. So while I was getting ready to go and speak at the event, Simo messages me on Slack. He was like, hey, you know, I’ve been thinking about it, and I think I would love to do the podcast together with you, if you’d have me. I’m like, what? I’m like, oh shit, and I message my husband. I was about to go live. My husband was in the next room. I’m like, oh shit, Simo Ahava wants to do the podcast with me. He was like, who the fuck is that? I was like, don’t worry. Don’t worry about it. And I was freaking out. And I said, let’s talk about it after I finish this, because I was genuinely about to go live. I cannot tell you what the hell I was talking about at that event, because I was like, oh my god. Oh my god, Simo wants to do the podcast with me. Why? Is this a joke? It was all I could think about. Yeah, I couldn’t think about nothing afterwards. I don’t even know what I was talking about. So anyway, I finished the talk. I feel bad for you too if you’re listening to this. I’m sorry. I really enjoyed the event, though. It was very nice. Thank you for having me. It’s a sign that I was never asked again. But yeah, so I messaged Simo afterwards and I’m like, fuck yes, you know, obviously. And yeah, we just ended up doing the podcast together, and he’s been sponsoring it since. And we didn’t know at that time how this would work, and we still don’t, if you listen to the podcast. It’s just, I think, while we’re very different as people, we’re also pretty similar in terms of banter. So we kind of just play off each other, and he has obviously his topics about, you know, GTM and server-side and data privacy and, you know, all the stuff that I have no idea about. And I’m more into my commercial analytics, more data storytelling and AI and mobile apps. So it’s kind of a good fit. And we’ve been doing it for, I think it’s now close to three years. And for some reason, he still wants to do it. For some reason, people still listen to it. And for some reason, I have an episode that I should have edited a month ago, but sorry, Tim Coppens. But that’s a very long-winded answer. No, it’s great. I love the origin story. Yeah, so I did not convince him. I’m not Alfred, OK? I’m tired of being told that I’m Alfred.
00:06:51.64 [Michael Helbling]: So recently, I think the genesis for this topic came about from something on your newsletter where you were talking about whether large language models can do data analysis. You broke it down into a number of different ways, and I want to dive into that a little more deeply, because I think there were some really important distinctions you made about, okay, here’s LLMs, here’s data analysis, think holistically about that. So first off, what brought the topic to you? Was it just everything that’s going on in the industry, or were there specific things happening that drove you to start thinking about it?
00:07:40.01 [Tim Wilson]: Were you being triggered by specific interactions?
00:07:43.25 [Juliana Jackson]: Better way to put it. I mean, it’s a mix of things. It’s funny because the last message I sent to Tim yesterday is that I got triggered by somebody on LinkedIn, so I’m writing something about agents. No, I mean, it’s a mix of things. I think, first of all, I’m a consultant, and part of my role is to make sure that I test all these AI systems and products and services, so that once I go in front of a client with a solution, I am able to speak confidently about it and make sure that I am a trusted advisor and not just trying to shove different products and services down people’s throats.
00:08:21.64 [Tim Wilson]: Oh, actually, that means you’re not doing it right.
00:08:27.33 [Juliana Jackson]: Yeah, I guess. What kind of consultant are you? A pretty shitty one, I guess. I don’t know. But yeah, so I’ve been working with language models since 2021, so no, 2022, when I joined Monks. And I was always happy about using, so I’m not saying large language models, I’m saying language models, very big difference. And I was very excited because I was always struggling to analyze unstructured data. And obviously, they’re very, very good at that, because you give them this unstructured data and they transform it into numbers, so you can finally use your brain and come up with something. And I’ve done this for a lot of clients and on different types of data sources. But I keep on seeing a lot of people on LinkedIn dumping a lot of data into ChatGPT or into Claude or whatever they’re using. And they’re kind of minimalizing, is that the word? Yes, minimalizing the work an analyst would do, and how much experience it actually takes to be able to look at numbers and draw conclusions. So I guess I was just a bit triggered. And I did what any other triggered person would do, just go and read science literature.
00:09:39.62 [Michael Helbling]: That’s right.
00:09:42.08 [Juliana Jackson]: I showed them. And I was talking to Jason Packer about it. I talk a lot with Jason, actually. Poor guy. And I went through all the science papers from this year where people were basically trying to do data analysis, tabular analysis, using LLMs. And the only way I noticed that anybody can get close to it is if they use a mixture of Python code and LLMs. And obviously, I knew the answer to that before, because I tried it myself. I actually dumped a fake data set, a CSV, into ChatGPT, and I asked it what the data set showed, to do an exploratory analysis. The exploratory analysis was actually pretty spot on. But then when I started to go granular, I noticed that the results are not there. So then I just started to research what people are saying about this. And then I got very angry, because I think a big problem we have right now in the industry is the way we look at innovation. Innovation used to mean that you were somebody looking outside of conventional wisdom to solve a problem. You had a problem that you wanted to solve, so you were seeking ways of going about it outside of the conventional wisdom. Now what we’re doing is we have solutions that are chasing problems that don’t exist. It’s my perception. So why do we need a new way of doing data analysis when we have very proven statistical methods and calculations? Why the hell was I struggling to pass that stats course and listen to what’s-his-name, Georgi Georgiev, going through all that stuff? Why did I have to do all that effort to figure out what the fuck my CoV is? And that’s solved. It’s a problem solved. I don’t need an LLM to tell me how to do this. It’s the same as vibe coding. I can understand vibe coding to a degree, but it’s kind of the same concept. Why are we creating new ways for problems that are already solved? Because it’s just very infuriating to me, the conflation that happens in this industry, and it’s people that should know better. It’s people that are digital analysts. And I’m sorry if you’re listening to this. I don’t care. You can be mad at me. That’s fine. But we should know better. And you have people with platforms on social media, platforms that they should use for good. Instead, they’re just propagating these doomsday theories and propaganda and whatever the hell you want to call it. And it’s very infuriating. So I did my research. Obviously a lot of the stuff there is subjective. I could be wrong. I always say that: don’t take my stuff as gospel. I’m just somebody that learns and then unlearns and then downloads it onto people. But yeah, that’s kind of how it started. And in general, everything that I write in my newsletter is basically just some sort of frustration that I have, and I end up writing about it.
00:12:46.35 [Michael Helbling]: Hey listeners, ever feel like your data pipelines are messier than your inbox after vacation? Say goodbye to chaotic integrations and hello to Fivetran. Fivetran delivers automated, reliable data pipelines that are as easy as hitting subscribe on your favorite podcast. With hundreds of pre-built connectors, Fivetran streams your data seamlessly into your favorite warehouse, keeping everything accurate, fresh, and ready to use. No more manual maintenance or midnight data drama. Just clean, reliable data when you need it. So don’t spend another minute wrestling with your data. Visit fivetran.com slash aph today for the latest news and events and discover how Fivetran can streamline your data life. That is F-I-V-E-T-R-A-N dot com slash aph for the latest news and information about Fivetran.
00:13:46.04 [Tim Wilson]: Okay, let’s get back to the show. Well, you went pretty deep in that one post, which we will definitely link to. There are some pretty funny moments in it, like when you say, people will tell me they dumped some CSVs into ChatGPT and the results were shit, or the results were terrible.
00:14:05.19 [Tim Wilson]: And you’re like, yeah, no shit.
00:14:09.48 [Tim Wilson]: Oh yeah. Yeah. But, but I think you like, I felt like I’d sort of seen it and people were kind of saying it, but you called out very, very specifically that we all kind of know that LLMs in an overly simplified way are doing this probabilistic kind of next word. It’s inherently probabilistic, yet when we’re doing stuff with data, there’s probabilistic when we’re predicting things, but when we run, if I run a linear regression on a set of data today and use R, and you run a linear regression on the same set of data using Python, We’re going to get the same basic model. There’s not trying to be like, oh, and kind of mix it up so that it’s natural.
00:15:02.02 [Juliana Jackson]: Yes, it’s the same results, but it also can be a bit different. I mean, it’s different on how you would do it in Python versus R.
00:15:10.85 [Tim Wilson]: The mechanics differ, but the result is deterministic: if I ran it in R today and ran it again in R, I’d get the same thing. If I put a prompt into ChatGPT and immediately put the exact same prompt in again, I’m going to get a different response. And so that was one slice where I was like, oh yeah, we still have this fallacy in the business that the numbers and the data are objective, and there’s a failure to think about uncertainty. But now somehow we want to take LLMs, which have uncertainty built into them, and it’s like, well, that’s a different and unnecessary aspect of uncertainty for this task, which is, you know, crunching numbers. So I don’t know, that was just one of the parts that I thought was very useful to point out to people. Even a couple of episodes ago, when we had Colin Zima on and we were having that discussion about natural language queries, I think that’s similarly trying to solve the wrong problem. There’s this idea out there that what we need to do is just have natural language business questions get converted to models or queries, and that seems like it’s focusing on the wrong part of the process. We need really good questions, which I think LLMs can actually really help with. I don’t know, that’s not a question. I just kind of went off on my own little tangent.
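For anyone who wants to see Tim’s point in code, here is a minimal sketch (with made-up data) of why a classic regression is deterministic in a way an LLM’s sampled output is not: the same data and the same method produce identical coefficients every time you run it.

```python
import numpy as np

# Made-up example data: ad spend vs. revenue.
rng = np.random.default_rng(42)
spend = rng.uniform(100, 1000, size=200)
revenue = 3.2 * spend + rng.normal(0, 50, size=200)

# Fit the same ordinary least squares line twice.
fit_1 = np.polyfit(spend, revenue, deg=1)   # [slope, intercept]
fit_2 = np.polyfit(spend, revenue, deg=1)

print(fit_1)
print(np.array_equal(fit_1, fit_2))  # True: same data + same method = same model
```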
00:16:52.15 [Juliana Jackson]: No, but listen, you’re right. I have some thoughts if I can share them. That’s why you’re here.
00:16:57.27 [Tim Wilson]: Okay.
00:16:58.53 [Juliana Jackson]: I don’t know. I feel like it’s so hard to be a guest because I’m used to being a host. I don’t know how to guest. I’m like, yeah, exactly, we should ask this bitch. Wait, what was I just thinking about?
00:17:10.78 [Michael Helbling]: Hey. I’m the facilitator of this one, okay, Juliana? So like, you just rest easy.
00:17:17.07 [Juliana Jackson]: Okay, so to what you said, Tim. I will speak for myself. I’m a very lazy person, so I’m always going to look for the easy way out. I think most of us in this industry are inherently lazy. So we’re always going to look for a way to fix something faster, to optimize the way we work, because that’s how we’re wired as analysts or analytics people. I’m speaking for myself, but I suspect some of you guys would probably relate. Anyway, I would love to exist in a world where I can just have thoughts and put them into a model and just get my results and move on, and then I will play Diablo for the rest of the day. That would be so good for me. I would love that. But the actual core of the problem is a bit deeper. So we have… Do you guys remember Silicon Valley? Yes. Right? You know the series. Big fan.
00:18:10.94 [Tim Wilson]: Yeah, yeah, yeah.
00:18:12.78 [Juliana Jackson]: Great. So you know Son of Anton, right?
00:18:15.21 [Tim Wilson]: Oh, yeah. That’s the AI who winds up the lead, yeah.
00:18:19.34 [Juliana Jackson]: Exactly. So, Anton was the server solution, and then we had Son of Anton, which was the AI in the show that, I don’t know, basically ordered 30 pounds of hamburger meat. Anyway. I just watched that clip last week. It’s hilarious. But it’s so relevant to where we are right now. So the actual problem of where we are right now is the VC bullshit that’s happening, because venture capital is basically pouring hundreds of millions of dollars into these companies. And that pressure starts at the top and comes down-market to vendors in the industry. It becomes this thing of, oh my god, we need to put AI on this product. You see these people trying to add AI to anything they’re doing. For instance, I don’t need an LLM in my CRO tool to think about the hypothesis. Why? Just let me do my A/B testing. Let me do my A/B testing in peace. I’ll come up with my hypothesis around my test. I really don’t need an LLM to tell me how to create a variation. I don’t. But this is an example. Or GA4 insights, come on, conversational analytics. It’s so triggering even now. I always think about that meme with Kelly Rowland typing in the Excel sheet. That’s not exactly how it works, but that’s conversational analytics. You’re writing text into an Excel sheet. That’s it, period. So the problem comes from the top of the market. It comes from VCs and so on. And the rest of us are dealing with people that have these stories made up and are trying to convince other people; it’s classic business stuff. Now, the problem is it creates this pressure on analysts and marketers or technical marketers and so on to keep up with the stuff. You need to keep up with the times. You need to adapt. You need to make sure that you’re using it in your workflows. For sure, I am truly excited about using artificial intelligence. I have been for a long time. To some degree, all of us in analytics have been using artificial intelligence: with programmatic ads, with A/B testing, with predictive customer lifetime value. This is where the problem is, and I wrote about this extensively: we think that AI is just large language models, and that’s where all our problems come from. Because if you have that conflation in your head, you’re going to basically ignore everything else that happened until 2022, when GPT-3.5 launched, and you’re going from that. So I understand. People are trying to use these models to do better analysis or to be faster, and I want to think that people have the best intentions at heart when they’re trying to do these things. But there’s a huge pressure that comes from up-market. There’s a huge pressure that comes from vendors. And there’s a huge pressure that stakeholders will inherently put on people working in different teams or in agencies to innovate at all costs or to do shit at all costs. And it sucks. And it’s kind of depressing me. And I find myself quite often, even the other day, oh, this is sad, feeling vulnerable and affected about how much I need to do and learn as an analyst to be able to feel that I’m still valuable to the business, or still valuable to my clients. And I feel like it’s such a disruptive phase in the digital space for all of us. I don’t know if any of that makes sense, but I’m pretty vulnerable right now. That’s how I feel about this.
Yes, to what you said, yes, we should use statistical methods, because that’s how it works. But yet here we are, dumping private personal customer data into GPT and asking it for customer signals.
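As an illustration of the “already solved” path Juliana is pointing at, here is a minimal sketch of evaluating an A/B test with a plain two-proportion z-test, no LLM in the loop. The conversion counts and traffic numbers are made up for the example.

```python
from statsmodels.stats.proportion import proportions_ztest

# Made-up results: control vs. variant.
conversions = [480, 530]       # converters in A and B
visitors = [10_000, 10_000]    # traffic in A and B

z_stat, p_value = proportions_ztest(count=conversions, nobs=visitors)
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")
```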
00:22:20.37 [Tim Wilson]: Well, and there was a point, and I am perpetually four to seven years behind actually understanding distinctions — even between data and analytics, I think I was probably 10 years behind understanding that distinction. But a lot of the time, I think that’s because there’s inconsistency in the way people use the terms. And I hit a point five or six years ago where someone showed me the Venn diagrams that said, you know, machine learning is all of these statistical methods and programming and models, and AI is when you actually enable that machine learning to take an action. And I don’t know where we are now. As you’re saying, when people hear AI now, they think ChatGPT, or they think an LLM. And partly, I think people are now treating agentic AI as, well, it’s a cool word to say, so I’m going to have these agents — it’s getting used all over the place. I mean, it’s fair to say LLM equals ChatGPT or Claude or Perplexity. But AI is getting equated with, in some cases, just magic cool stuff, as you were just talking about, where we just have to slap the label on it for all sorts of technically unrelated reasons. I think what you were getting at is there’s a more dangerous narrowing of AI equals LLM equals chat interface to anything that I want, and that’s an overly and unfairly narrow definition of AI. Is that fair? That’s better than whatever the fuck I just said.
00:24:16.53 [Juliana Jackson]: So thank you for that. But no, that makes sense. That’s 100% how it’s perceived and it’s very dangerous. Yeah.
00:24:24.71 [Val Kroll]: I think one of the threads, also back to the Colin Zima episode about BI, is he was talking about balancing self-service versus the control that the analysts have within the tools. And I’m curious about your thoughts on this, Juliana: that democratization, making it accessible to the business users, the idea that it’s just about getting the right permissions and the data clean enough so that they can query to get everything they need. Do you think that part of this is a response to past failed attempts at putting all the data in the hands of the business users to get insights to rain from the ceiling? Is it specifically about failures of self-service that make people think, oh, the chat interface, that’s going to be the thing that cracks this nut? Or do you think it’s not as related?
00:25:13.19 [Juliana Jackson]: That’s a really good question. I have two answers to it. I think, in general, the business will have an easier time dealing with an abstraction than with statistical methods and calculating, I don’t know, the p-values of the world, and having to deal with SQL and BigQuery and so on. So from a business perspective, it’s easier to deal with an abstraction, which is the chat interface. It’s so much easier. And I saw clients light up, for instance, when we built a SQL query writer that ran against their reviews. You basically wrote, I want to see what the sentiment for the reviews was in this period, and the chatbot gave you the SQL code, so you could just take it and run the SQL. So I do understand the need for abstracting a lot of the stuff that we do, because, truthfully, businesses don’t give a shit about the complexities of our job. They do not. They only care about: what do I need to do, how much is it going to cost me, what’s the impact, and what are the next steps? So I do think, yes, from a business perspective, it does make sense.
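To make the shape of that concrete, here is a rough sketch of a “SQL query writer” of the kind Juliana describes. The table schema, model name, and exact workflow below are assumptions for illustration, not her actual build — but the key point matches what she said: the LLM drafts SQL from a plain-language question, and a human reviews and runs it; the model never does the number-crunching itself.

```python
from openai import OpenAI

client = OpenAI()  # assumes an API key is configured; any chat-capable model would work

# Hypothetical reviews table for the example.
SCHEMA = """
Table `project.dataset.reviews`:
  review_id STRING, review_text STRING, rating INT64,
  sentiment_score FLOAT64, created_at TIMESTAMP
"""

def draft_sql(question: str) -> str:
    """Ask the model to draft BigQuery SQL; a human reviews it before running it."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption; swap in whatever model you use
        messages=[
            {"role": "system",
             "content": "You write BigQuery Standard SQL only, "
                        f"using this schema and nothing else:\n{SCHEMA}"},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(draft_sql("What was the average sentiment for reviews between 2024-01-01 and 2024-03-31?"))
```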
00:26:23.27 [Tim Wilson]: I think there’s something important in that. Analysts have lived in the world where somebody says, why, when I talk to this analyst, do I just ask a simple question — can you give me revenue for last month, can you pull leads for me — and they have a bunch of other questions for me? And it’s like, well, I need to know the time frame, I need to know the region, I need to know, et cetera, et cetera. Every aspect of data complexity represents some underlying complexity of the business. So I do feel like that desire is often misguided. A good analyst can say, based on my experience of the person, their role, the nature of the question, my knowledge of the business, my knowledge of the data, I can safely make a bunch of assumptions about how I’m going to write this SQL. And there are aspects of that I want to hide from the business, but it’s not complicated because of gratuitous, meaningless, unnecessary complication. It’s complicated because the business is complicated. That’s the misguided part of this desire: I want to ask a simple question, and I don’t want any probing beyond that. I’ve had this line for years when people come by and say, hey, I’ve got a quick question: the length of your question has proven to have no correlation with the complexity or the effort required to answer it. And there’s a piece of that where now people are looking at LLMs, and because LLMs are sycophants, you know, I can ask a super simple question and it doesn’t come back with questions. If I don’t tell it to probe for more, it is going to run off, you know, willy-nilly.
00:28:18.87 [Michael Helbling]: So wait a second, Tim, is that why all of your questions on the show are so long? It’s like a reaction to the quick question? This is a light bulb moment for me. It’s all finally coming together. Okay, sorry, Juliana.
00:28:37.26 [Juliana Jackson]: No, I think Tim is right, as always. I would sound like… Don’t do that. I would sound like… That’s a great thought. You should write a book about this.
00:28:50.86 [Michael Helbling]: Wow, you’re so smart and intelligent.
00:28:53.41 [Juliana Jackson]: No, but you are right. This points to what I think is the second biggest problem — the first, you know, being what’s coming from enterprise and VCs. The second one is that we tend to anthropomorphize everything that we work with, because, of course, we’re human. You see that there are so many people that use LLMs as a therapist, and it’s not a funny thing. It’s not a funny thing. These people go through some serious shit, and they feel like they have nobody to talk to, so they end up anthropomorphizing them. Those are more extreme examples, but if we take it to the analytics world, it’s very nice, I think, to be able to talk to somebody that gets it and is able to give you instant answers, to talk to you at the same level as you are without you feeling that you’re an idiot. Because I do feel like an idiot most of the time. That’s fine, but it’s nice to be able to talk to these models. And, you know, I brainstorm a lot on different topics. Again, one of my biggest hobbies is to read science papers, so I put my science papers into NotebookLM and I try to parse them and mine them.
00:30:10.36 [Tim Wilson]: For more evidence of how lazy you are, as you were just saying, you’re like, I’m so lazy. So I’m just going to go read another scientific paper about deep analytical models.
00:30:21.15 [Juliana Jackson]: Listen, I’m putting them in NotebookLM. I have ADHD, Tim Wilson. I have ADHD. So for me, it’s very hard to keep my attention. You can see right now in this podcast, I’m killing myself to stay on point, which doesn’t work. So, anthropomorphizing, right? As analysts, using these things, we tend to get very attached to them. I felt it on my own skin. Many times. Like when I discovered small language models and I discovered Hugging Face a few years ago, I lost it. It’s probably how people felt when they discovered Stack Overflow. I was like, oh my God, can you actually do this? There was this emotion wheel, Plutchik — I hope I’m pronouncing it right, if not, I’m sorry. But it was this emotion wheel. So to be able to analyze emotion in text, you need to have data that is annotated by a lot of people. You have to have a huge data set of annotated data to be able to give it to a model, to fine-tune it, and then process some content. So it’s like, oh my god, this is already here? I can just copy this shit into my Colab and just feed it something? And I can use Claude to write the code? That’s amazing. So I get it. But one thing that happened for me, and I hope people will take away something positive out of this, is that it made me realize what my knowledge gaps are in terms of understanding how stats work, or machine learning, or data analysis. So what it did for me, it sent me back to school. It sent me back to read, it sent me back to study. So one thing I can say is, when I write, I never write from the perspective of somebody that knows better, because I don’t. I’m learning shit and I like to give it back to the community for free. And when I discovered knowledge graphs, I did the same. But I got attached. I use them as part of my routine. So what happens in the analytics space, and back to your question, Val, is that we tend to use these tools and we kind of become attached to them. That’s kind of what it is. And it happened in the ’60s with the ELIZA chatbot, if you remember. People started developing feelings and stuff for a fucking chatbot. It didn’t fail all this time. We are the problem.
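For a sense of the workflow Juliana is describing, here is a rough sketch: instead of hand-annotating an emotion data set yourself, you pull a model somebody already fine-tuned on annotated data from the Hugging Face Hub and run it over your text. The specific checkpoint below is just one commonly used emotion classifier, and the review snippets are made up — treat the whole thing as an illustration rather than her exact setup.

```python
from transformers import pipeline

# Load a publicly available emotion classifier from the Hugging Face Hub.
classifier = pipeline(
    "text-classification",
    model="j-hartmann/emotion-english-distilroberta-base",
)

reviews = [
    "The checkout flow was so confusing I almost gave up.",
    "Delivery was two days early, absolutely thrilled!",
]

# One top emotion label per review.
for review, result in zip(reviews, classifier(reviews)):
    print(f"{result['label']:>9} ({result['score']:.2f})  {review}")
```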
00:32:48.80 [Val Kroll]: So quick aside, I have to share, and we can cut this out later, that anthropomorphizing — this is one of the words I have trouble saying — is one of the reasons why I had to live in a single in our dorm in college, because my roommates didn’t understand why I wanted the DVDs to be organized alphabetically, because it made them happy. So I did a lot of living on my own. So I get the attachment part of it. So back to the chat interface making it accessible. Perhaps this is some nod to self-service, and in reaction to some of the things you were saying to Tim, it’s only a half step away to be like, oh yeah, we’ll just put a chat interface on it. Put a bird on it, for my Portlandia fans. If the existing relationship between the business and the data and analytics teams is just, there’s a certain amount of aggregation or filtering that needs to be applied to the data to answer my question, for me to know the truth or the answer to my question — it’s so reductive. It’s like, oh, if this is already how we operate, I ask you a question, you go pull a number or two for comparison, and boom, now I know what to do, now that answers all my questions. It’s so easy to sell. Like you were talking about, Juliana, for the VCs to come in and say, what if we made that even faster, easier? You can do a thousand a day versus just the time it takes the analyst to pull it. So I think it’s building on existing broken models and relationships of working, which is why I think it’s so easy to prey upon these audiences with the promise that this is going to solve everyone’s problems.
00:34:35.97 [Juliana Jackson]: 100%. And it’s actually infuriating to me. I lost so much respect for so many people in the last year, just because, I’m sorry, I’m just being honest, you guys knew what you got yourselves into when you invited me here. No, I’m serious.
00:34:55.35 [Michael Helbling]: No, I, you know, I understand, never meet your heroes.
00:34:59.90 [Juliana Jackson]: Yes, I like that.
00:35:02.55 [Michael Helbling]: No, no. I’m expressing a thought that comes from my own journey as well.
00:35:07.82 [Juliana Jackson]: No, it’s real.
00:35:08.80 [Michael Helbling]: You look up to people and then as time goes on, you’re like, I can’t look up to them anymore.
00:35:13.67 [Juliana Jackson]: Or you invite them into your podcast because you think they’re, you know, they’re cool, but then their bitches have I lost respect for a lot of people because we are not doing enough education as we used to do. I wrote about this recently. There used to be a time when vendors were doing so much educational content on their blogs, they were doing courses and webinars, and there was so much educational content for the community. I’m thinking about the Christa sadans of the world. Like remember when she was running the community for GA? So much, so much stuff, it got us all together. There will never be times like that. Anyone in CRO, the only people that do content that is useful are just the guys at Conductrix. I’m sorry. I’m saying this with all the responsibility. I was going to do like a macro shot. I’m begging him to write. Like, I’m just saying, please just write something. Just go on the blog. Help me figure out something in life. So there’s very few people that still do content that inspires and help the community. And what’s happening right now is what you said, Val, we’re praying on the lack of education in terms of AI. I’m not saying you guys are uneducated. This is so bad. Delete this. There’s a lot of vendors that are riding the hype and instead of educating the market and being there for them, they’re just kind of like taking advantage of the blurry lines that exist. And they’re just trying to basically sell shit that doesn’t make sense. And I had this note on my phone, which I tried to post on LinkedIn. but I didn’t want to get blocked, but I’m going to read it to you guys right now. Where is it Jesus? Oh, I sent it to my husband and he was like, my husband, shouts to my husband.
00:36:53.88 [Tim Wilson]: He was like, he’s still caught up on like, who is Simo? Yeah, he’s still researching that. Yeah.
00:36:58.15 [Juliana Jackson]: Because I wanted — okay, I said: automation can scale outcomes, but not competence. So AI cannot fix broken value props for vendors. If users didn’t come before AI, they’re not going to come. Yeah. So this is not changing anything. So I have this right here, from my husband — I said, just sending it to myself, don’t worry about it. I just had this random thought and I wanted to post it, but yeah, I’ll shut up now.
00:37:24.86 [Tim Wilson]: So that’s the other part that concerns me: as you’re saying, it’s trying to oversimplify to the point that I don’t need a base of knowledge. Going from a base of knowledge and using an AI tool — I think about how my son’s a software engineer, and he’s like, yeah, I use code assistants, I use Cursor all the time, because he deeply knows how to write code. And so it’s a very, very good supplement. If somebody who’s reasonably fluent in SQL and understands the company’s business and data structures is using it, that’s one use. But somehow we conflate that with, oh, the natural extension is that some random person can come into a random company, ask a random question, and whatever comes back is something they can safely run with — which is kind of terrifying. So there is a spectrum, but somewhere there’s a line that needs to be drawn that says, no, you still need to learn — learn in the broadest sense, learn the tools, learn the data, learn the business, learn how to think, learn critical thinking. And maybe that’s just me extending your rant: AI is getting dropped in as though it’s the shortcut, because AI can do this cool thing, so let’s just extrapolate and say, oh, it can do everything else.
00:38:54.01 [Val Kroll]: Michael, let’s put you on the spot. No, that’s good. Remember when we were talking before about how, if you’re at zero, getting to one is impressive, but if you’re at one, getting to two is different? Do you remember when you were talking about that? Those nuggets of wisdom. I feel like that applies here.
00:39:10.26 [Michael Helbling]: LLMs are prediction engines, so they take everything they know and they try to predict what would be the best answer. But there are good and bad answers being incorporated into that training set, so they’re usually giving you some average that gets you further than you would get yourself. I don’t know how to write a lot of code, so it can write code for me, and it does awesome. And it’s so cool. I go from zero ability to write code to half a developer. That’s incredible. Now, what I’m not is an excellent developer with AI. Excellent developers are thinking about all kinds of other things: about architecture, about speed, about latency, about all these other things that matter to how that code is going to work and interoperate. I’m not thinking about any of that. And the AI is not thinking about any of that. We’re just klutzing along. So when you take that and you bring it to an analytics context, there’s a bunch of things an excellent analyst is doing. And some of those operations or tasks are things where, yeah, wow, it’s amazing, an AI can go compare some data sets and show you a little graph and all those things. That’s pretty decent. But it’s kind of like that really junior analyst who comes in with this amazing insight that time on site goes up when the page is confusing, or something. It’s like, okay. Thank you. I’m happy for your excitement. We’re not going to present that to the client, because it’s dumb. But then you go through that process. And so actually, this is the last thing I want to jump into, so thank you for teeing me up, Val. One of the directions I see AI going is multimodal AI. So when we talk about how you dump your data into an LLM and it tries to analyze it, it’s not good. What do you think, Juliana, about multimodal coming online, where we start leveraging different models for different parts of that process? Is there a brighter future for analysis of data when we can leverage this kind of thinking for this and that process for that, and so on and so forth? Because that’s a step out past where we are today.
00:41:48.67 [Juliana Jackson]: This is a great question. I have so much I want to say, so I’ll answer your question first. What are language models good for? Language. Newsflash, you heard it here first. I think we should use LLMs sparingly: for discovery, for proof of concept, to try new things, to test-drive something. I will never choose to use an LLM to do any type of unstructured data analysis at scale. I will always go for a small language model. And the reason I would do that is because you have more control over how that small language model is going to perform over time. I genuinely hate using LLMs for this, and I know that sometimes you’re forced to, because Google put it in Vertex AI and they allow you to prompt inside Vertex AI, and then you can create magic and do topic classification and all these analysis things. It’s great, but the F1 and precision and recall and accuracy calculations afterwards are hard. And you have to manually go yes, no, yes, no — oh my God, look, we have 80% after six weeks. And I would always choose a small language model, and I speak from experience here. I was working with Krasimir and it took us some time to tame it. We used DistilBERT. I love DistilBERT for text. It takes time. You need to build a vocabulary. You need to do some topic mapping. You need to do some things next to it. And I’m terrible at code — mediocre at best; with Claude, I’m good and it looks pretty. I cannot code, but what I can do is bring the critical thinking and the commercial experience, so I can challenge the person that writes the code and the person that shows me the data, and I can work with it. So I would say, if you want to analyze video, images, GIFs, and other media formats, don’t use an LLM to do it at scale, because it’s not going to work. It’s very hard to pipeline an LLM. Because let’s say you build a dashboard with an LLM, and you do some sort of task like topic classification or analysis of some specific creatives, and you create a Looker Studio or Power BI dashboard. It’s good for a one-off, but try to pipeline that, because data changes over time. The data changes format — what’s the word for it — the shape, the context of the data changes. How are you going to automate that? How are you going to build a pipeline in BigQuery that’s going to go and analyze that data over time? With a small language model, you can. I built a pipeline with Krasimir in 2023 and it never crashed, because the model you’re working with you can control, you can fine-tune, you can change, you can feed it new data, you can do few-shot, you can do one-shot, where you give it different ways of looking at the data so it enriches the vocabulary it uses. So it’s a bit different. I don’t know why I care so much, because I’m not saving lives, for fuck’s sake — I even asked Jason one time, why do I care so much? But I genuinely do care about what we do as an industry and as a community, and what we tell each other and what we advise each other. And I think, yes, use LLMs to prove a concept, to test. I use them all the time. Deep research, for example — I found ways to automate deep research and do something before a pitch, before an RFP. They’re so good for doing a market analysis. So good. For instance, I’m writing something about agents, and I asked, what are all the valuations and how did they change in the last few years?
I’m not going to do that manually. I’m lazy. I’m going to use deep research for it, and then I’m going to manually read it and do a visualization in my Colab and show people, hey, this is what it is. And then I get people hating on me on LinkedIn because the pyramid is not to their taste. Yeah, yeah, I know. I know who you are. Multimodal is already there. I’ve tested it on YouTube videos. The way it works with an LLM is that it’s not the actual video you analyze, it’s the metadata of the video: you analyze the title, the tags, the description, and so on, and you infer from those what is actually going on. Now, there are models that can actually analyze the video itself, but that’s very expensive and way more complicated, so an easy fix is to use the metadata. This is how, for instance, I do similarity analysis between search intent, the search query, and the content of the website. I create a function that looks at the page title, excerpt, description, the tags, and everything, and then I look at the search query and try to map to what extent the search query is reflected in the content of the page. So I think, again, if you want to go multimodal, you should, and that is already there. You just have to choose the right model for the right job, and an LLM is definitely just good for language, not for other types of data sources.
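As a concrete, simplified illustration of the metadata-based similarity analysis Juliana describes, here is a minimal sketch using TF-IDF cosine similarity between a search query and a page’s title, description, and tags. The page fields and queries are made up, and her actual function may well use embeddings or a different scoring approach.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical page metadata instead of the full rendered page.
page = {
    "title": "How to choose running shoes for flat feet",
    "description": "A buyer's guide to stability and motion-control running shoes.",
    "tags": "running shoes, flat feet, overpronation, stability",
}
page_text = " ".join(page.values())

queries = ["best running shoes flat feet", "trail running backpack"]

# Vectorize the page and the queries in one shared TF-IDF space.
vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform([page_text] + queries)

# Row 0 is the page; the remaining rows are the queries.
scores = cosine_similarity(matrix[0], matrix[1:]).flatten()
for query, score in zip(queries, scores):
    print(f"{score:.2f}  {query}")
```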
00:47:09.31 [Michael Helbling]: No, that’s awesome. Great. Okay. We do have to start to wrap up. We could keep going for a long time, but we’ve got to start to wrap up.
00:47:21.12 [Juliana Jackson]: I feel like I gave you nothing. No, this is.
00:47:23.75 [Michael Helbling]: I’ll tell you later.
00:47:27.91 [Juliana Jackson]: I gave you nothing. I’m a horrible guest. Welcome to the standard deviation.
00:47:34.89 [Tim Wilson]: Okay, after the show, Michael’s gonna give his stern lecture about all this negativity. No, it’s not gonna be stern.
00:47:41.03 [Michael Helbling]: It’s gonna be heartfelt. Okay, let’s break back in. All right, we do have to start to wrap up, but one thing we love to do is go around the horn, share a last call, something that might be of interest to our listeners. Juliana, you’re our guest. Do you have a last call you’d like to share?
00:47:57.19 [Juliana Jackson]: I mean, I could always plug my own shit, but I actually saw a video today. Okay, so it’s called This Is What a Digital Coup Looks Like, and it’s a TED talk by Carole Cadwalladr. She’s the person from, I think, the Guardian or the Observer, who wrote about the whole Facebook data privacy stuff that happened a few years ago, and she did a TED talk recently. My homie Sean David sent it to me, and I watched it. And basically, she’s talking about how all of this that’s happening right now in the market with AI is a coup for data, and data is basically the crack cocaine of Silicon Valley. Everything that’s happening right now is to make people complacent and lazy, to use these models and use this technology in a way that kind of cancels their ability to think, which is very fitting for where we’re at right now. And I’m going to share the link with you. I think it’s a great talk. And she talks about how all our IP and all the stuff we create is being taken by these models and used as training data. The message she sends is that the future is already here and it’s happening. This is not something that’s going to happen in a few years; it’s already starting. And I can definitely see it, and it’s kind of a scary thing to think about, but I want to end on a high note, so you can go read my article about the EU AI Act.
00:49:32.90 [Michael Helbling]: Actually, what’s funny is I did read that article, and it was really good.
00:49:37.70 [Juliana Jackson]: Are you still alive after reading?
00:49:39.77 [Michael Helbling]: I mean, there were some parts where I was like, oh, but that’s not you. That’s the depth of the whole thing.
00:49:47.29 [Juliana Jackson]: Shouts to Aurelie, shouts to Aurelie Paul, to Sivan, to Robert Bach, to Fabrizio for reading. And those are the privacy experts. I was like, please, guys, read this to me. I don’t say stupid.
00:49:59.17 [Michael Helbling]: No, but it’s really good. And it’s also kind of interesting to sort of get an understanding of sort of like, okay, yeah, there’s some really important distinctions that should be made. So anyway, say thank you. That’s a great last call. All right, Val, what about you? What’s your last call?
00:50:14.09 [Val Kroll]: Um, so I’m gonna pull a Tim. I have a two for today. Um, so the first one is I have, and I know Julie has too. I think, um, she got me hooked on the allergies podcast with Ali word. There isn’t a topic that she talks about that I’m not interested in, but there was one recently on saluted analogy. Does anyone know saluted analogy? What’s it called? Salute analogy.
00:50:36.80 [Michael Helbling]: And is that?
00:50:37.70 [Val Kroll]: No. Okay. So it’s the opposite of pathology, which is the study of disease or sickness. Salute genealogy is a study of health, which is super interesting.
00:50:46.98 [Michael Helbling]: Yeah, salutary.
00:50:47.98 [Val Kroll]: Exactly. That’s the to health. Exactly. So it breaks it down and it’s like, it’s all salad. No, there are actually five categories like movement, nature, art, service, and belonging. And it’s very related to social prescribing, which NPR has been doing some topics on, which we’ll also put the link to that. But it’s all about things you can do to help improve your health, not a replacement for medications or other means to get healthy. But anyways, that was really interesting. Then my second one is, the last time I’m going to do it, my plug for Measure Camp Chicago, it’s about three weeks away. I have the honor of being one of the volunteers on the planning committee, and let me just tell you, we are supported by 10 gracious sponsors to make the attendee experience amazing. We’re feeding you breakfast. We’re making sure that you are well caffeinated. There’s going to be a third party show at the end, the first ever live performance of the third party show with Jess Silverbauer. And then we’re headed to a dart club, and we’re going to rent out the whole first floor so people can play darts and hang out and free food. So anyways, amazing event. Please register. We’d love to see you there.
00:52:12.46 [Juliana Jackson]: You guys and you always do it better than us.
00:52:14.84 [Michael Helbling]: I don’t know. I’ve never been to your measure camp. Me neither. I really like to go.
00:52:20.31 [Val Kroll]: I was like, there’s got to be some play about like hitting your target. Like if no one makes jokes about this,
00:52:25.53 [Michael Helbling]: Well, there’s a difference between accuracy and precision, Val. The classic.
00:52:32.61 [Tim Wilson]: All right. Last call time, Tim. Well, I will go with just one. And it is a YouTube video that came out a couple of months ago. I think I picked it up from flowing data, but it’s called The Plea, the incredible story of smallpox and the first vaccine. And it’s really just a data visualization in video. It’s got like just some clever ways of sort of illustrating numbers. It’s got kind of this Plinko board that sort of comes down. So it’s like 25 minutes long. It is actually pretty interesting around kind of the history of smallpox and the first vaccine. But it’s also just from a data visualization, data storytelling, like, I mean, really, really finely crafted and well done. So it’s just kind of a interesting watch and kind of impressive. What about you, Michael? What’s your last call?
00:53:34.90 [Michael Helbling]: I’m so glad you asked. So, a recent paper I came across, you’ll notice that I reference Christopher Berry’s newsletter all the time. So, shout out Christopher. I get a lot of my last calls from him. But recently they did a study where they put together a MacGyver set of tasks. And so immediately I was enthralled because growing up MacGyver was my favorite show. And yes, I am Gen X. Don’t worry about it. But it’s basically the thing was there was practical challenges that force the large language model of thing creatively, like substituting things or using things not for their specific context. So basically checking to see whether or not And AI could creatively solve a problem using sort of innovative or out-of-the-box ways of thinking about things. And humans are still much better at doing that than LLMs. So just wanted to throw that out there. Anyway, it’s an interesting read and kind of neat to see what people are trying to do. But I think it’s also cool to watch to see, like, OK, yeah, can we help LLMs get more creative with solving problems or thinking outside the box on things? Because then maybe there’s potentially, yeah.
00:54:56.43 [Juliana Jackson]: Remember AI overviews when they were telling you to make pizza with rocks?
00:55:00.48 [Tim Wilson]: Yeah.
00:55:01.10 [Juliana Jackson]: That’s right.
00:55:04.50 [Tim Wilson]: Well, and to clarify for the Ginex listeners, of course, you’re referring to the OG Richard Dean Anderson MacGyver. But did you ever watch the review? What, MacGruber? I don’t know, five years ago.
00:55:17.04 [Michael Helbling]: Oh, no, I did not see that. But no, I mean the old school MacGyver, the real MacGyver. Yeah, like me TV, like paperclip and a chewing gum wrapper and like solving real world problems.
00:55:31.43 [Tim Wilson]: Yes. And I will now age myself that when I said they rebooted it about five years ago, apparently it was in 2016 that they rebooted it.
00:55:41.63 [Michael Helbling]: No, never saw it. All right, Julianne, thank you so much for coming on the show. It’s fun to talk to you, and I think the reason why is because you’re like us. You are smart, passionate, a little bit too self-deprecating. And that’s a nice mix. And so it felt really nice. It was awesome. So thank you so much for coming on the show. I appreciate your research, your insights, the things you can share with the community. I love how you’re sharing your knowledge as you’re learning and pulling these things through in your work. And it comes out like, yeah. So thank you so much. It’s been a pleasure having you.
00:56:25.66 [Juliana Jackson]: Thank you for having me.
00:56:27.10 [Michael Helbling]: Yeah, no, it’s our pleasure. And as you’ve been listening, you’ve probably been thinking, oh, man, I want to learn more about this topic, or I want to interact with these folks. You can reach out to us. We’d love to hear from you. And you can do that on the Measure Slack chat group or on a LinkedIn. Or you can email us at contact at analyticshour.io. And you can also find Juliana Jackson in a lot of those same places. And also, she has a newsletter, which is exceptional. called Beyond the Mean, I think. Yep, okay. Make sure I didn’t get that right.
00:57:01.30 [Juliana Jackson]: Beyond the meme, that’s how I should have named it.
00:57:03.88 [Michael Helbling]: Beyond the meme? Oh, well. It would have been more fun. There’s nothing wrong with having two newsletters. And of course, the standard deviation podcast that she co-hosts along with Simo Hava, who asked to join her podcast, let’s get the records straight. And so anyways, there’s lots of ways to reach out and interact with her as well. So. But please don’t give me feedback. No. No feedback. Positive.
00:57:30.49 [Juliana Jackson]: positive feedback. I was about to ask an LLM to do a bunch of data analysis and your insights stop me from making that terrible mistake.
00:57:49.09 [Michael Helbling]: So there you go. Only that. That’s what we want to hear. All right. Well, anyways, please reach out to us. We’re delighted to hear from you and those kinds of things. And of course, as you’re listening, we’d love to give. Have you rate and review the show on the platforms that you listen to it on? That helps us out quite a bit. All right.
00:58:07.39 [Juliana Jackson]: No way.
00:58:09.23 [Michael Helbling]: Algorithms. We like the feedback, Juliana. We’d like it. We won’t tell you, we won’t tell you, it’s okay. No. I’m saying about the show in aggregate. Okay.
00:58:26.26 [Juliana Jackson]: I don’t feel from here now.
00:58:28.48 [Michael Helbling]: Yeah, it’s fine. But there’s one other thing I think is very important to say, and I think both of my co-hosts agreed with me, Tim and Val, that no matter how you’re trying to use LLMs, the thing you should never stop doing is analyze it.
00:58:44.32 [Announcer]: Thanks for listening. Let’s keep the conversation going with your comments, suggestions, and questions on Twitter at @analyticshour on the web at analyticshour.io, our LinkedIn group, and the Measure Chat Slack group. Music for the podcast by Josh Crowhurst. Those smart guys wanted to fit in, so they made up a term called analytics. Analytics don’t work. Do the analytics say go for it, no matter who’s going for it? So if you and I were on the field, the analytics say go for it. It’s the stupidest, laziest, lamest thing I’ve ever heard for reasoning in competition. Because we’re doing the YouTube shorts.
00:59:24.62 [Michael Helbling]: I just love that Tim was like, oh, it’s like, what? I know, but which LLM? Like seriously, just whatever. And Tony, listen. Out of work, hell. He’s our editor.
00:59:43.73 [Tim Wilson]: He’s had some comments about Michael’s audio yet.
00:59:48.24 [Michael Helbling]: Michael’s butchered about it. No, that’s not even what I was referencing. That’s not even what I was referencing. I was like, because we’re five and a half minutes into recording and we haven’t started yet. Oh, you haven’t started yet? So Tony’s like. This is God Takes material, hopefully. I know. Trust me, I am so on board. Yeah, yeah, I love it and I’m amazed at everything that’s happening with those and I will try it out. I will. I’m here to learn.
01:00:14.49 [Tim Wilson]: You’re here to try.
01:00:15.78 [Michael Helbling]: As much as it is.
01:00:16.64 [Tim Wilson]: He’s got another couple of years before he’s like, ah, fuck you. It’s just him whining.
01:00:22.04 [Michael Helbling]: I’ll just be like, how did you convince Simo to do a podcast with you? I think that’s probably the thing on everybody’s minds.
01:00:27.69 [Juliana Jackson]: Actually, it was the other way around.
01:00:30.15 [Michael Helbling]: Really? That’s fascinating. That’ll be my first question.
01:00:35.30 [Michael Helbling]: OK.
01:00:35.96 [Michael Helbling]: Perfect.
01:00:42.09 [Val Kroll]: Rock flag in AI is more than LLMs.