From running a controlled experiment to running a linear regression. From eyeballing a line chart to calculating the correlation of first differences. From performing a cluster analysis because that’s what the business partner asked for to gently probing for details on the underlying business question before agreeing to an approach. There are countless analytical methodologies available to the analyst, but which one is best for any given situation? Simon Jackson from Hypergrowth Data joined Moe, Julie, and Tim on the latest episode to try to get some clarity on the topic. We haven’t figured out which methodology to use to analyze whether we succeeded, so you’ll just have to listen and judge for yourself.
Photo by Gunnar Bengtsson on Unsplash
0:00:00.0 Tim Wilson: All right, before we start the show, we have an announcement and a request. Val, would you want to explain what that is to me and to our listeners?
0:00:08.2 Val Kroll: I would love to, Tim. So we wanna hear from you, dear listeners. So we’re conducting a survey and logistically we’ve made things as easy as possible for you.
0:00:17.7 TW: Survey. I mean, I kind of like just pretending I knew what listeners were thinking, but sure, fine. We can do it your way. And I’m assuming that because it’s you that there’s probably a short URL and there’s probably a nice little Google form that’s mobile friendly that listeners can go to to actually give us this information.
0:00:35.8 VK: Ah Tim, you know me so well. You bet. And so starting today, the survey will be live at bitly/APH-survey.
0:00:46.9 TW: So putting my listener cap on. So we want something from them. We’re not going to actually give them anything in return. Are we?
0:00:54.9 VK: We sure are. At the end of the survey, there’ll be an opportunity for you to give us your address so we can get a laptop sticker off in the mail to you. And also you can choose to enter a raffle for some sweet sweet power hour swag.
0:01:06.8 TW: I don’t know who approved this expense, but fine. So this means, I mean, I’ll take it ’cause we can now kind of get rid of those old branded mouse pads and Blackberry covers.
[laughter]
0:01:19.2 VK: No, Tim, we’ll leave those to you to still use as your stocking stuffers. [laughter] This is actually gonna be way better: think comfy, cozy hoodies just in time for summer.
0:01:29.3 TW: Hoodies for summer. Good planning. I’ll take mine and just like rip the sleeves off and kind of rock those analytics guns through the summer.
[laughter]
0:01:36.0 VK: Your finance pro look.
[laughter]
0:01:39.1 TW: Yeah, that’s right. I’ve been, I think I’m ready. [laughter] So how long, I mean they’ve gotta do this sometime before summer, so how long do listeners actually have to participate?
0:01:47.1 VK: Yeah. So the survey opens today, May 28th and this is 2024. Just in case you’re living in our future and listening back to some old episodes. And we’re gonna keep it open through the month of June.
0:01:57.5 TW: Okay. So if I was paying attention, listeners should visit bitly/APH-survey before June 30th in the year 2024 to share their opinions and maybe actually get some pretty cool swag.
0:02:11.4 VK: You got it.
0:02:14.6 TW: Sounds good. Now on to the show.
[music]
0:02:22.1 Announcer: Welcome to the Analytics Power Hour, analytics topics covered conversationally and sometimes with explicit language.
0:02:30.0 TW: Hi everyone, welcome to the Analytics Power Hour. This is episode number 246. 2, 4, 6. 2, 4, 6, 8. Who do we appreciate? The analyst. The analyst. And there’s no better way to make an episode that has a cheer-worthy episode number than to deliver the perfect show. Obviously to do that just meant we needed to do the analytical work to figure out what that would be. So what kind of analysis should we use? A little primary research, a survey of our listeners, or secondary research, reading up on a bunch of podcast studies to craft the perfect show? Or would it be a job for multiple linear regression? Would there be some way to run a controlled experiment to get the answer? Ah, we couldn’t figure it out. So instead we just threw up our hands and decided to do a show about this somewhat regular analytical conundrum: which analytical methodology is most appropriate for the job? Joining me for this episode is Julie Hoyer from Further. Julie, have you ever had to choose between, I don’t know, just quickly eyeballing a time series chart or instead doing a first difference calculation?
0:03:37.0 Julie Hoyer: Yeah, more than once.
[laughter]
0:03:40.7 TW: Awesome. And we’re also joined by Moe Kiss from Canva. So what about you Moe? Have you ever had to choose between matched market media test and a media mix model?
0:03:51.6 Moe Kiss: I feel like I still have like PTSD from those terms, so [laughter] definitely the case.
0:03:58.1 TW: Just do them both. You’ve got, you got Canva money, do them all.
0:04:00.6 MK: Oh dear.
0:04:01.3 TW: I wish. And I’m Tim Wilson from Facts and Feelings. All right, clearly we’ve all had to grapple with this challenge in the past, so we thought it’d be great to bring on a guest to chat about it and see if we can put some clarity around the topic. Simon Jackson is the founder and principal consultant at Hypergrowth Data. Or I’m sorry, you’re in Sydney. So it’s Hypergrowth Data. [laughter] He’s held past roles in data product management and data science at Canva, Meta and Booking.com. He holds a PhD in psychology from the University of Sydney and did a postdoc at Australia’s Defense Science and Technology Group, which frankly sounds like it could be a topic for an entirely different episode and maybe even a more interesting one. But no, I don’t think so. [laughter] Welcome to the show, Simon.
0:04:44.5 Simon Jackson: Thank you Tim. Thanks everyone. Thanks Julie and Moe, big fan of the show, so I’m very excited to be here.
0:04:50.3 TW: It’s gonna be interesting. We’re gonna have a 50-50 split on the American versus Australian accents, taking over.
0:04:58.0 MK: I like it.
0:04:58.7 TW: So maybe to kick things off, Simon, I think of Booking.com, Meta, Canva, I think, well those are all companies that have a shit load of data, but they also obviously are very different companies. You had a range of different remits when you worked at each of them. Like is it even fair to say, without just having you talk for an hour straight, [laughter] can you summarize the biggest sort of similarities and differences when it came to what the methodological choices were that you needed to make in those different roles in kind of the last X years of your career?
0:05:33.9 SJ: Yes. So anything sub hour is okay. Is that, that.
[laughter]
0:05:38.9 TW: Yeah. Yeah 59 minutes is fine.
0:05:41.6 SJ: Perfect, great. So similarities and differences. The thing that pops into my head on similarities: the reality is there’s always constraints. Like, I come from a statistics background, you learn [0:05:56.4] ____ for this particular problem, and then you get to the real world and none of your data fits that perfectly. You don’t have what you need. So you are familiar with this feeling. You always have to make some imperfect choice or trade-off no matter what method you’re picking, which I think is super common. And another thing is you’ve always got people to convince or influence or help change their behavior in some way. And with the methods you pick, you have to think about the end. It’s one thing to be like me and have a stats background and talk about some sort of technical method. But when you show that to, say, a product manager who’s never done any stats training, that falls totally flat, right?
0:06:37.7 SJ: So I think the things that come to mind are you’re always constrained and you’ve always got to deal with people. But then on the differences front, you mentioned a few of my roles: I’ve been a data product manager, I was in academia, I’ve done surveys, I’ve done the deep stats stuff, done machine learning. It’s like the problem and decision context are constantly changing. So it’s, oh, I have [0:06:58.1] ____ I analyze something about my customers. Cool, I’ve got this massive warehouse, like you said, shit loads of data. I can play around with that forever. Or I’m a product manager and there’s a new thing I want to build, so I’ve gotta go out and interview with qual methods. So definitely the problem and decision context, and I would love to chat about that situation with you all. I would love to hear from you three as well. The other thing is me: my interests changed, my motivations changed, my experience, my knowledge of methods have changed. And unfortunately, I’m sure the more you get into data, the more you learn there’s so much more to learn. [laughter] It’s a nutsy one, right? It’s like it’s never ending. So anyway, I’ll stop there. I won’t hit 59 minutes, but I’ve probably used enough time.
[laughter]
0:07:42.4 TW: So maybe keying off of that last part, the fact that we as analysts change and learn more things. And I did sort of slide into the opening a reference to first differences ’cause Julie and I happened to learn about that at the same time, which was like the simplest concept. And I was 20 plus years into my analytics career, never heard of it. So where does that fit? Like the size of your toolbox hopefully is steadily growing over time. But have you had cases where you’re like, oh, here’s an entirely new technique that you either weren’t aware of or had not really explored, were vaguely aware of it, and then you found it, or you had a reason to dive in and you’re like, oh my God, this probably would’ve been more appropriate on that thing I did four years ago? Or the flip side: now that’s your hammer and you find yourself walking around and seeing all the world as a nail.
0:08:36.9 SJ: Oh, isn’t that the way, the hammer and nail? Like you learn something new and you’re so excited and you’re like, oh my God, I just learned, I don’t know. Like at one point I’m like, everything must be solved by neural networks, and that fell flat very fast. [laughter] But on the flip side, yeah, I know, it’s painful, right? When you learn something, you are like, oh my gosh, that would’ve just [laughter] How did I not know this? I would’ve loved to have known. That happens all the time. And for me, it happened I think more when I became a manager, because suddenly I was not doing the work necessarily. It was like data experts and data scientists. And I’d be like, how are you gonna solve this problem? And they came from a totally different background to me, like physics or economics or something, and they’d throw out this method. I’d be like, whoa, teach me about that. Like, leaning in, I’m like, holy moly, if I had known that earlier in my career, I would’ve spent my life so much better. [laughter] So yeah, I feel that. [laughter] I know it’s gonna be embarrassing, so anyone listening who respects me, don’t judge me. But what did you say? First differences?
0:09:41.3 TW: Yeah. Is that new?
0:09:44.3 SJ: Yep. Never heard that. Never heard that phrase. I mean. Yeah.
0:09:50.9 TW: Well maybe now I feel…
0:09:52.4 SJ: You now have the perfect example.
0:09:52.9 TW: Yeah.
[laughter]
0:09:53.4 JH: Makes me feel better.
[laughter]
0:09:54.0 SJ: Great.
0:09:55.4 MK: Okay. Full disclosure, I don’t know either.
0:09:57.4 SJ: Oh good hurray.
0:10:00.5 MK: I just was like, everybody else knows and I was like, I really need to Google this after the episode and now it’s becoming a real learning.
0:10:08.6 TW: Well now I feel like we just have to go down the little side path. Julie, you want to give the quick explanation.
0:10:15.4 JH: Tim correct me if I forget some of the situation, but pretty much we were looking at correlation of two variables that we were interested in. And there was like a loose relationship that was detected. And me and Tim were talking to Joe, who we were working with at the time, Dr. Joe. And he was pretty much like, oh yeah, but that’s probably just like spurious correlation. Like you really need to use the first difference. And so me and Tim were like okay, like what is that? And he said, oh, pretty much.
0:10:44.1 TW: Well these were time series variables, right? So this, that was kind of a key. Yeah.
0:10:47.6 JH: Thank you. Time series, and he pretty much said, oh, you look at the change between your two points and then you look at the correlation between the change in your variables, which was like, oh duh, that seems pretty obvious [laughter] Once he said it that way. And then once we did that though, it was pretty crazy. There was absolutely no correlation at all. They were not changing similarly at all.
0:11:11.3 MK: That is really obvious.
0:11:13.2 JH: So, it was pretty powerful. Easy and powerful right?
0:11:13.2 TW: It’s like you just eyeball it and you’re like, well they’re both, I mean, in this case they were both going up over time, but they weren’t from period to period. They weren’t consistently, when one moved up, the other one wasn’t moving up. Proportionally the other one may move down. So it was just, they both were trending upward. So it’s a, it’s not a definitive check, but it’s like, yeah, I guess I ran around, I’ve probably now talked about that like half a dozen times at like, I’m like, let me show you this cool thing.
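The first-differences check Julie and Tim describe can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis from the show: the two series are simulated (independent noise around two different upward trends), and the helper names `pearson` and `first_differences` are made up for this example.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (spread_x * spread_y)


def first_differences(series):
    """Period-over-period change: value at t minus value at t-1."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Simulated data (an assumption for illustration): two series that both
# trend upward over time but whose period-to-period noise is unrelated.
rng = random.Random(42)
n_periods = 200
x = [0.5 * t + rng.gauss(0, 3) for t in range(n_periods)]
y = [0.3 * t + rng.gauss(0, 3) for t in range(n_periods)]

corr_levels = pearson(x, y)
corr_diffs = pearson(first_differences(x), first_differences(y))

print(f"correlation of raw levels:        {corr_levels:.2f}")
print(f"correlation of first differences: {corr_diffs:.2f}")
```

The raw levels correlate strongly only because both series share a trend; once each series is replaced by its period-to-period change, that shared trend drops out. As Tim notes, this is a quick sanity check rather than a definitive test.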
[music]
0:11:45.2 Announcer: It’s time to step away from the show for a quick word about Piwik PRO. Tim, tell us about it.
0:11:50.1 TW: Well, Piwik PRO has really exploded in popularity and keeps adding new functionality.
0:11:54.9 Announcer: They sure have. They’ve got an easy to use interface, a full set of features with capabilities like custom reports, enhanced e-commerce tracking and a customer data platform.
0:12:07.9 TW: We love running Piwik PRO’s free plan on the podcast website, but they also have a paid plan that adds scale and some additional features.
0:12:14.9 Announcer: Yeah, head over to piwik.pro and check them out for yourself. You can get started with their free plan. That’s piwik.pro. And now let’s get back to the show.
0:12:26.0 MK: But do you know, the funny thing about that example, and I was about to ask Simon about this when we started to delve into the first differences discussion, I do sometimes worry that like as you, maybe not as you get more tools in your toolbox, but I do think there is a tendency to sometimes prefer the more exciting methods. And it’s almost like the more experience you get, the more you realize that you should go back to basics. And that’s a really good example of like where a more simple methodology is actually stronger than something very, very fancy or sophisticated you could build. But I think sometimes it’s difficult to get the data science folks to be bought into that. And it all kind of then comes back to like explainability, right?
0:13:13.4 SJ: I couldn’t agree more with that.
0:13:14.7 MK: Sometimes using a more simple method means that it’s more explainable and therefore you can get your stakeholders on board.
0:13:21.2 SJ: Like I have an example to share. Just riffing with you.
0:13:22.8 MK: That wasn’t really a question.
[laughter]
0:13:24.1 TW: Do you agree with me? Is that the question?
0:13:26.5 SJ: An example though, just riffing off that, is when I was working at Booking.com, we had a very, like, big strategic experiment. So there was a lot riding on the results of this thing, but there was a lot of noise in it. So we’re like, all right, we wanna use some variance reduction techniques to help boost our signal here. And I can’t remember the background of the data scientist who was leading the project from the technical point of view, but they’re like, our data’s very skewed: Poisson regression. I was like, okay, [laughter] sure, if you say so. Anyway, they did the analysis and everything was like [0:14:05.0] ____ It’s like the results were out of control. And so we started presenting them and someone somewhere, and I guess that’s a theme we’re having already, it’s good to talk to other data experts, [laughter] someone somewhere is like, did you check the false positive rate of that approach on your data?
0:14:23.4 JH: Then your heart drops to your stomach.
0:14:25.3 SJ: Are you sure that’s not a false positive? And we went and simulated this technique on the dataset. Yeah, the false positive rate was like 70% or something. Like, it’s more than a coin flip that you will get a result that looks good. And we’re like, let’s just dial it right back down to something simple. Ditch this whole Poisson thing. And the results were fine in the end, but we did all this assumption checking, everything. Oh, so painful. So yeah, like, a good slap in the face for me and the team, but definitely complex often does not work out in the real world.
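The kind of check Simon describes, simulating a technique on your own data to see how often it cries wolf, can be sketched as an A/A simulation: repeatedly split the same data into two arms at random (so there is no true effect) and count how often a test reports significance. Everything below is an assumption for illustration: the skewed data is simulated from a lognormal distribution, and `poisson_style_pvalue` is a hypothetical stand-in for any method that assumes the variance equals the mean. It is not the analysis that was run at Booking.com.

```python
import random
from statistics import NormalDist, fmean, variance


def welch_z_pvalue(a, b):
    """Two-sided p-value using each arm's observed sample variance."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = (fmean(a) - fmean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def poisson_style_pvalue(a, b):
    """Two-sided p-value that assumes variance == mean (Poisson-like).

    Hypothetical stand-in for a method whose variance assumption does
    not match heavily skewed, overdispersed data.
    """
    se = (fmean(a) / len(a) + fmean(b) / len(b)) ** 0.5
    z = (fmean(a) - fmean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(pvalue_fn, data, n_sims=500, alpha=0.05, seed=0):
    """Share of random A/A splits of `data` that look 'significant'."""
    rng = random.Random(seed)
    values = list(data)
    half = len(values) // 2
    hits = 0
    for _ in range(n_sims):
        rng.shuffle(values)  # random split: no true difference between arms
        if pvalue_fn(values[:half], values[half:]) < alpha:
            hits += 1
    return hits / n_sims


# Simulated, heavily skewed per-user metric (an assumption for illustration).
data_rng = random.Random(1)
data = [data_rng.lognormvariate(2.0, 1.2) for _ in range(1000)]

fpr_welch = false_positive_rate(welch_z_pvalue, data)
fpr_poisson = false_positive_rate(poisson_style_pvalue, data)

print(f"false positive rate, observed-variance test: {fpr_welch:.1%}")
print(f"false positive rate, variance==mean test:    {fpr_poisson:.1%}")
```

With a nominal alpha of 5%, a well-calibrated method should flag roughly 5% of these no-effect splits. A method whose variance assumption is badly off for the data can flag far more, which is the symptom Simon's team found.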
[laughter]
0:15:01.7 SJ: That’s not a question either Moe, but I thought I’d riff off it.
0:15:04.5 TW: That’s getting to a point that I think drives some of that complexity. Sometimes the data, no matter what you do, it’s only gonna get you so far. But I think just as a society, as a business society, there’s this belief in, oh, basically it comes down to: the data is maybe sparse or crap or incredibly noisy. Well, can’t you just layer on a more sophisticated methodology? Which, I mean, the first time I worked with an actual data scientist, I was shocked at some of the things that could be done. There are cases where you can do that, but there are times where it’s like, no, no, all you have is flour and eggs. Like no amount of fancy KitchenAid mixer is gonna turn that into an edible cake, right? I mean, it’s like you just don’t have the ingredients. And so the best thing to do is say, look, I can mix it up and throw it in the oven, but this is the best cake you’re gonna get. Are you starving and you’ll eat it? [laughter] Or is it, no, you’re really catering a wedding, you need to go out and collect some more data. That was a highly spontaneous analogy that held together.
0:16:12.4 MK: Wow.
0:16:14.2 TW: Better than the cake would have.
[laughter]
0:16:16.3 JH: That’s right. But so to that point, I actually do have a real question, guys. [laughter]
0:16:23.7 TW: Somebody has to.
0:16:26.4 JH: When you are [laughter] when you are facing a new business question, a new situation, and you’re trying to determine what method to tackle it, what dimensions are you kind of weighing for your options, right? You think I could do X, Y, or Z to tackle this type of problem? Are you looking at how many assumptions you have to make? Are you looking at complexity of the actual method, like what it takes to run it? Are you looking at, I don’t know, anything else that you would have to like balance the certainty, things like that?
0:17:01.8 SJ: Yeah, great question. I think, I’m not sure if this is part of your question, Julie, but a lot of it is around what’s the problem? Like, what’s the decision that we’re trying to make with this data, or the outcome we’re trying to achieve? I’ve had cases where a leader might have asked, oh, we need to report on the impact of our business unit. And then you see data people run around going, holy crap, we’ve gotta figure out this perfect method. Did anyone ask what the margin of error is that they care about? Oh, they can deal with plus or minus 30%. It’s like, cool, let’s just pull some descriptive numbers. That’s fine. So a lot of it I think is around figuring out what’s that margin of error that people are comfortable with and what is the decision they need to make. Tim, I feel like you’re rearing up. You wanna jump in, you wanna add.
[laughter]
0:17:48.8 TW: No, no, no.
0:17:50.4 SJ: Well, just some memories like surface there.
[laughter]
0:17:53.7 TW: Well, I’m just thinking of the, like, when you ask what level of uncertainty, they’re like, oh, I want no uncertainty, give me the best response.
0:18:04.2 SJ: Yeah. Yeah.
0:18:04.3 MK: Do you really like, okay, I, this is very controversial and I feel like Tim’s probably gonna be like start shaking and shuddering and like need to pull out the soapbox, I…
0:18:15.7 TW: We’re gonna start like a little jar, a little coin jar that every time you predict that I am going to lose my shit. Like you just have to pay into it [laughter] and then you get to take half of it out. If I actually do lose my shit.
0:18:27.0 MK: Okay. But I get to spend it on wine. Right.
0:18:29.5 TW: Sure.
0:18:29.7 MK: I feel like that’s the trade off. Okay.
0:18:31.4 SJ: Good trade.
0:18:35.4 MK: I feel like Tim, over many, many years. The reason I say that I feel like you might have a reaction is because you have largely argued that we need to explain uncertainty to our stakeholders so that they don’t need to know all the complexity, but you need to start to get them familiar with that concept. And I guess the reflection that I was just gonna have is that I feel like the level of uncertainty that you approach a piece of analysis with is actually more of a call made by the data scientist or data analyst. You have to kind of interpret what level of uncertainty the business is comfortable with and then you need to make more of a judgement call. But like I said, I don’t know if you’re gonna…
0:19:21.1 TW: Well, but I think the trick is then actually conveying that level. I mean when you can quantify the uncertainty, including, I mean I’m a big fan of error bars when reporting test results, or confidence bands, like you’re making that call. So I think you’re right. I think where we get in trouble is when you report the average and then people wanna just run and treat the average as the truth as opposed to representing the uncertainty that you’ve got baked into it. But I don’t know that there are methodologies where you don’t get, I mean I guess even if you’re doing just the linear regression, you can still quantify the uncertainty based on just that R squared, but you then have to figure out a way to represent that to them, right?
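Tim's point about reporting error bars rather than a bare average can be sketched as a mean with a confidence interval. This is a minimal illustration that assumes a normal approximation (reasonable only for fairly large samples); the metric values are simulated and the function name `mean_with_interval` is made up for this example.

```python
import random
from statistics import NormalDist, fmean, stdev


def mean_with_interval(values, confidence=0.95):
    """Return (mean, lower, upper) using a normal-approximation interval."""
    n = len(values)
    center = fmean(values)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * stdev(values) / n ** 0.5
    return center, center - margin, center + margin


# Simulated per-user metric for one test arm (an assumption for illustration).
rng = random.Random(3)
control = [rng.gauss(10.0, 4.0) for _ in range(400)]

center, low, high = mean_with_interval(control)
print(f"control mean: {center:.2f} (95% interval {low:.2f} to {high:.2f})")
```

Reporting the interval alongside the mean makes the baked-in uncertainty visible, so the average is less likely to be treated as the truth.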
0:20:06.2 JH: Yeah. Like, can you make a decision based off of this? So I ran into a situation recently where the result was statistically significant, but when you actually quantified it with some of the other parameters, it was just a linear regression. Some of the other parameters, it wasn’t actually an impactful result. And so we had to tell them, “Sure, it’s statistically significant. There is a relationship here, but this is actually not a great data point for you to be using to predict what you care about like later in other projects with other methodologies.” And it was really interesting to have to try to communicate that and thread the needle of the certainty compared to being impactful. What decision should you actually make? It was interesting. They took it really well.
0:20:48.3 MK: What was the response? Yeah, what was the response?
0:20:49.4 JH: They actually, she took it really, really well. She totally understood. She was someone too who has worked with data, I think a long time in her career. And so when we were bringing up those two aspects to balance, she was pretty comfortable with that idea. To Tim’s point of this certainty thing, she understood that idea enough that when we gave our recommendation, we could be transparent of this was the result, but our recommendation is actually not to take action based on it. And it went well.
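The situation Julie describes, a regression result that is statistically significant yet not impactful, can be reproduced with a small sketch. The data here is simulated with a deliberately tiny true slope and a large sample (assumptions for illustration, not Julie's data), and `simple_ols` is a hypothetical helper. The p-value uses a normal approximation to the t distribution, which is reasonable only because the sample is large.

```python
import random
from statistics import NormalDist, fmean


def simple_ols(x, y):
    """One-predictor least squares: returns (slope, slope_se, r_squared, p_value)."""
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    ss_residual = syy - slope * sxy
    r_squared = 1 - ss_residual / syy
    slope_se = (ss_residual / (n - 2) / sxx) ** 0.5
    # Normal approximation to the t distribution (large-sample assumption).
    p_value = 2 * (1 - NormalDist().cdf(abs(slope / slope_se)))
    return slope, slope_se, r_squared, p_value


# Simulated data: a real but very weak relationship and a large sample.
rng = random.Random(7)
n = 20000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.05 * a + rng.gauss(0, 1) for a in x]

slope, slope_se, r_squared, p_value = simple_ols(x, y)
print(f"slope: {slope:.3f} (standard error {slope_se:.3f})")
print(f"p-value: {p_value:.2g}")
print(f"R squared: {r_squared:.4f}")
```

With enough data, even a very weak relationship clears the significance bar, while R squared shows that the predictor explains almost none of the variation in the outcome. That gap is why the recommendation can be "real relationship, but don't act on it."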
0:21:18.5 TW: Wow. Show off.
0:21:21.6 JH: I know. You know what, Tim? Give me one win. All right. I got a lot of other ones that don’t go so well.
0:21:27.7 SJ: I was gonna loop it way back.
0:21:28.7 TW: Loop away.
0:21:28.8 SJ: To your first question, Julie. Loop away. Yeah. Like, what else to consider? And it’s the person you’re with. And I think that takes time to build that relationship and understand, like, if I show you a regression or discuss what incremental means in the context of statistics, are you going to understand this? Are you going to react in a way that is reasonable and logical? Or just be blank faced, be like, cool, we’re done here. Just tell me the answer. That’s a really hard part you need to suss out.
0:22:06.7 MK: So Simon.
0:22:07.8 SJ: Yeah.
0:22:08.9 MK: Just having worked with you, I feel like you did have some tricky stakeholders where you didn’t always, at the start especially, have a really good understanding of their level of knowledge and experience and that sort of thing. ‘Cause you were often working with very different parts of the business. It wasn’t like you had one main stakeholder. How did you tease those things out and learn those things in the early stages of a relationship and try and figure out what level of knowledge that person has?
0:22:36.0 SJ: Oh, I really, really like this topic. Thanks for asking, Moe. I never get to tell anyone this stuff. I usually, when I start talking to someone, come in with, all right, I’m gonna assume level zero and I’m going to explain things like I’m explaining to maybe a high school student or something somewhere around that. And then I’ll start to layer in some complexity. Like, let’s say I think a regression is suitable. I go, “Oh, I might consider a regression,” and then I’ll pull back and go, “Sorry, I’m not sure what your experience with data is. Does that mean something to you?” I do that a lot. This sort of give and take where I layer on complexity and I pull back and check with them, and that’s usually how I suss it out. I try to do it more quickly when I start a relationship and they’ll very quickly be like, “I’ve got no idea what that means,” or like, “Nope.” Or they’re like, “Yeah, yeah, I’ve done that before,” and then I build up. That’s generally the flow of my conversation when I start with someone. I’m curious if you have tactics as well, but that’s my go-to.
0:23:39.9 JH: I love that. I wanna use it. I will be stealing it.
0:23:41.8 SJ: Please.
0:23:43.7 JH: It’s so simple, but it’s like, yeah, that’s a nice approachable way to kind of gauge it.
0:23:47.7 TW: Well, I mean I feel like I’ve had the opposite. I’m thinking of ones who come in with, they’ve got a hammer, although I mean they have a blowtorch or something, and they may come and say, “We need to do a cluster analysis.” And you’re like, “Okay, well…” and they’re like, “We should do a cluster analysis to figure this out.” And it’s like, well wait a minute, but a cluster analysis is an unsupervised technique, and what you’re asking for is… is that right? You don’t have a dependent variable with a cluster analysis. And so then you’re like, “Huh.” They come in with a methodology. Or I feel like in marketing analytics, the whole world of multi-touch attribution, it’s been a few years and I feel like there’s a greater awareness, and Moe, you were kind of on the forefront in my universe of people who were like, there’s matched market tests, there’s mix modelling, there’s multi-touch attribution. These are different things. But the marketers who come in and say, “You got to do whatever the fancy word is for…”
0:24:48.6 MK: Can I interject?
0:24:49.5 TW: Please.
0:24:50.8 MK: The marketers that come in…
0:24:53.1 TW: This sentence will go on for another 47 minutes and never end. So please.
0:24:57.8 MK: The marketers who come in and go, can you just cut the data this exact way and look at this and drill by this and give me this exact metric and this? And it is often half a page, it almost always relies on attribution data as well and is like, if you just give me these things, then we know the answer and every time I have to sit back in my seat and just be like, “Hmm interesting.” And the reality is I still feel like I don’t have a great answer for that. I obviously try really hard to have a conversation with that person about the methodologies we use and all that sort of stuff. But when a stakeholder comes in with a very clear idea of exactly the steps to solve the problem or to get the answer, I get terrified.
0:25:42.9 SJ: That happens a lot.
0:25:43.8 MK: Low key, terrified. So Simon what, I wanna know what you would do in that exact situation because we always use the podcast to solve my personal dilemmas.
0:25:53.8 SJ: Great. Happy if I can, ’cause I think I’m like you. I don’t have a perfect answer here, but I can share what I do try to do. The first thing is usually agreement, because you want that. The natural reaction, I think, is to get defensive, especially if you’re a data expert and you need to guide this, and usually building trust through agreement is a good way to do it. I’m like, “Cool, great, love that,” something like that. “Just to help me do this the right way for you, tell me a little bit more,” and I’ll lead them into questions. I think another assumption I once made earlier in my career is that they had no idea what they were thinking about and I knew better. I’ve learned actually, there are definitely times that people have had better answers than me, so I do wanna suss out.
0:26:38.9 SJ: I’m like, is this the right thing? Maybe. But as soon as I get a whiff of, that’s a crazy assumption, not a good idea, which happens a lot, let’s be honest, then I’ll start to maybe, I don’t know if this is right or not, but I like to flex some of my expert muscles and be like, no, you are the patient in the doctor’s room. I’m the doctor here. So like, “Oh, now I understand the problem. We could do it the way you suggested, but there are other things that I know could solve this problem,” and I just throw crazy words at them. This is the opposite of the thing I did before. I throw complexity at them and they’re like, “Whoa, I’m in a world I don’t understand anymore. Help me.”
0:27:13.4 SJ: Then it’s like that. And I’m like, “Okay, cool. So if there’s all these other options, maybe let’s just check if there’s something faster or simpler.” Try to appease what they value. Like, “Here’s something that will give you the answer faster, with more certainty,” or blah, blah, blah, and then try to lead them that way. I can’t promise this is gonna work all the time, but that’s usually the flow I would go through.
0:27:36.9 TW: I’ve never thought of a doctor patient relationship. It’s kind of intriguing ’cause we’ve all had experiences with doctors who do come in with the “I know everything” and they don’t listen to the patient ’cause they assume the patient is, they hear one keyword and they just key in on it. The great doctors are the ones that say, “My instincts are telling me it’s totally X, but I have the experience to take a breath and assume that maybe I haven’t, I don’t fully understand what they’re doing.” ‘Cause I feel like as an analyst, I’m likely to jump to I know what you’re really asking and I know what you really need. You’re asking me for this. That doesn’t align with what I need, so how do I navigate out? As opposed to you kind of set it up there with a degree of humility, one, it kind of puts you in a position with the conversation of saying, “Okay, what you’re asking for and what I understand, I’m having trouble connecting some dots.”
0:28:33.9 TW: I wanna make sure I deliver what you really need, but just help me understand. So that keeps it from being a talking down to, “that’s a stupid request.” I do still think, Moe, on the multi-touch attribution, especially when they’re like, “But we know that the algorithmic model is gonna give me the right answer.” It’s like, “Well, the answer it’s gonna give you is right, but it’s not the question that you’re really asking.” How can I help probe in a way that I get to your real question? Which I mean, getting to the… Leading them to incrementality through a series of questions is a super delicate back and forth and discussion.
0:29:15.8 MK: I was like, tell me how to do that, Tim. Please, please tell me how to do that. But it’s funny, I was chatting to a girlfriend the other day who’s a marketer, and we had this big thing going on and everyone is piling on their ideas in this Slack channel of like, “Have we thought about this? Have we tried that? What about if we do this?” And she was so frustrated and I had this real epiphany where I was like, “Oh, everyone does that in that space because there’s kind of not this expert, there is actually expert knowledge when it comes to how to do marketing, but it feels like a more creative space. So people are very happy to pile on their ideas.” And I kind of take for granted that in data most of the time, most of your stakeholders would be like, “You have special knowledge and understanding of methodologies that I don’t have and I am going to lean on you for advice about which one to do.” And yes, we do have the odd stakeholder that comes with a very specific ask about a particular methodology, but for the most part, people don’t pile on being like, have you thought about this? Try this, blah blah blah, because we work in this space that they don’t have as deep of an understanding of, and it just gave me this real profound respect for the job that some marketers have to do in some situations that we grapple with less, and maybe we’re less good at handling that stuff because we don’t have to do it as often.
0:30:37.5 SJ: You reminded me of a question I ask a lot, Moe, with people if they come with that really specific question about a method. Sometimes I’ll just straight up ask, be like, what is it about this method that makes you think it’s worth doing? What’s attracting you? And sometimes you tease out totally other stuff. It’s like, “Oh, I got told by this person at that other company that that’s what they do and that’s what we should be doing.” Oh, okay. I find that’s really cool. Just why do you want this?
0:31:07.9 TW: It has me thinking that some of the techniques, and it’s been brought up on the show before, that Julie’s analysis template planning, the importance of having assumptions, figuring out what the assumptions are heading into any and every methodology has constraints and criteria and what it does and doesn’t do. And that’s kind of a lens to understand to say, “Well, these assumptions have to hold. Are we good with these assumptions?” Like, oh, I’m gonna do some classification technique. Okay, well that means that we assume this and we assume this and we assume that. And that’s another way to back into the discussion. I think it’s a way to back into uncertainty too, ’cause it’s like saying you’re not gonna have perfect information. So that means there’s always gonna be assumptions. I mean the big assumption is this is gonna be good enough to inform my decision, but then what are the assumptions below that? ‘Cause if you’re wanting to use this methodology, then we’re making these assumptions. Do you my business partner agree that these are assumptions we can validly be making? And if they say no, we can’t make that, it’s like, oh, great, then we need to talk a little bit more before we head down whatever path we’re gonna go down.
0:32:20.9 MK: Can I totally take us off track.
0:32:24.0 TW: Please. Of course.
0:32:25.9 MK: We talked about our toolbox and that we all have a toolbox and generally it gets a bit bigger as we go further in our career. Can we touch on what we mean by different methodologies and what are the things that are in your toolbox and also what are the things that you preference? ‘Cause we all do have preferences or methods that we probably lean towards more. Simon, do you wanna kick us off?
0:32:51.5 SJ: Sure. Terrifying at all to try and answer that. What do we mean by different methodologies? That one, I think I said earlier, my background sort of inferentially go down even further. It’s really in correlation based techniques like linear regression, structural equation modelling and these sort of things. But there’s a whole world of other techniques that I was just good at and then I got into machine learning for a while. I’m like, oh, okay. The supervised versus unsupervised is a whole other way to slice things. And so I’m not sure how to easily define or describe all the methods. I might call out one split, which I like, which I think is, maybe it’s my experimental background, but I think it’s underappreciated, is like the differences between how you get data like how you control the world and [0:33:50.8] ____ data. So experimentation is a method of controlling the world, but then you can use a G-Test, you can use a G-Test, you can just look at the difference of fancy stuff.
0:33:55.4 TW: It’s been a moving target for me, so I’m cheating a little bit in that this is actually in the book that’s coming out January 2025. The latest way that we’re organising it, it’s literally three levels: you have anecdotal evidence, you have descriptive statistics, and then you have scientific evidence. So anecdotal could be everything from I personally had an experience and it seemed like this is the case. It could be I casually asked some people. I think even a focus group would be anecdotal. Descriptive gets into when you’re starting to apply a level of statistics to it. So Dr. Joe is even saying linear regression is still descriptive. And currently we’re kind of saying scientific is when you do a controlled experiment. So that’s the gradations of different methodologies and you could kind of bucket anything into it. But now that’s like hot off the presses. It’s not even on the presses yet. So I mean, boy Simon, if you’ve got a hot take on that one, I’d love it. ‘Cause I think it’s challenging for people like Simon, frankly and Dr. Joe Sutherland that have been doing this in so many ways. It’s hard. I feel like you guys struggle to say, Here’s a simple organising framework ’cause there’s just so much in your brain and you just kind of are able to access it and head down the right path. It’s a really tricky question to answer, but hot take feel free to…
0:35:32.9 SJ: So my brain is generous, but I’ll take it. Tim, thank you. I love that. It reminds me of the hierarchy of evidence. It’s really like that pyramid structure. So I like that mapping. I had not thought about it like that. I usually think about methods of manipulating something. So I have from one side of my brain are methods of manipulating the data that I’m gathering. Am I controlling something about the environment through an experiment? Am I gathering data through a survey versus the machine recording an event? So like I have one half of my brain is, What’s the form of the data methods I have to gather data and control things? And the other part of my brain is like, Okay, I have data. What are the methods that I have to analyze and manipulate that data to get to some outcome or answer? So that’s how I split it in my brain. Yeah, it’s really hard. It would be great to have something simpler to explain.
0:36:31.0 JH: Can I ask.
0:36:32.2 SJ: Please.
0:36:33.0 JH: So it’s interesting too because would it be fair to say that we may actually all be talking to more in the inferential statistics or causal inference space? Because I actually was just talking with some of my colleagues about this topic and one of my colleagues actually categorized it as he was saying, and it reminds me what you just said, Simon, about you were thinking about how you control and gather the data. And he was saying, “What do you want from the outcome of the data? And he was saying he splits it in, are you trying to predict individual outcomes or are you trying to infer something?” So he was even splitting it as inferential and predictive in big kind of buckets that way, and which was totally like, oh, I didn’t think of that.
0:37:18.0 JH: And then I went down this whole rabbit hole. So then Tim, when you brought up your hierarchy, would that be more applicable to just the inferential side? Would that apply to predictive? Like it’s kind of funny, you can categorize and bucket these things in so many ways.
0:37:35.1 TW: That’s a good question. And I don’t…
[laughter]
0:37:35.9 TW: I don’t know. I mean I…
[laughter]
0:37:37.1 TW: Well ’cause I even go back to… It’s like the ugliest little diagram, like the scikit-learn algorithm cheat sheet that sort of walks you through, what do you have and how much data do you have and what are you trying to do and do you wind up in a, is it a classification problem or a clustering problem or a dimensionality reduction problem or a regression problem. And it’s just kind of a decision tree. But I think to both of your points, that’s still, that’s actually just organizing within one bucket of the grand universe of methodologies. And it sort of starts with what is the data that you have? Like it doesn’t have anything in that bucket of, to do this I can go and collect data and if I know what outcome I want, it can influence which data I would need to collect. Yeah, nice question Moe.
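The decision-tree idea Tim describes can be sketched in a few lines. This is a loose paraphrase of the kind of questions such a cheat sheet asks, not a reproduction of the scikit-learn chart: the function name `suggest_problem_type`, the goal labels, and the 50-sample floor are all assumptions made for illustration.

```python
def suggest_problem_type(n_samples, goal, has_labels=False, min_samples=50):
    """Map 'what data do you have, what are you trying to do' to a problem family.

    goal is one of the hypothetical labels "category", "quantity", "explore".
    min_samples is an illustrative floor, not an authoritative threshold.
    """
    if n_samples < min_samples:
        return "collect more data"
    if goal == "category":
        # Known categories to learn from means classification; otherwise clustering.
        return "classification" if has_labels else "clustering"
    if goal == "quantity":
        return "regression"
    if goal == "explore":
        return "dimensionality reduction"
    raise ValueError(f"unknown goal: {goal!r}")


print(suggest_problem_type(10_000, "category", has_labels=True))
print(suggest_problem_type(10_000, "category", has_labels=False))
print(suggest_problem_type(20, "quantity"))
```

As Tim notes, a tree like this starts from data that already exists; it has no branch for going out and collecting data or running an experiment.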
[chuckle]
0:38:29.2 TW: Just like life, the universe and everything, go. Anybody have an answer?
0:38:33.9 MK: Yeah, pretty much.
[laughter]
0:38:37.0 SJ: I was thinking, as Julie and Tim were speaking, like we’re very, I think we’re better at classifying problems with these methods. Like the… Julie, what you mentioned your friend talking about, it reminded me, oh I cannot remember the names. It’s a very canonical paper that describes the two worlds of algorithms, statistics, sort of algorithmic prediction. And that was sort of foundation for the discussion between inference and machine learning. It’s a great paper, but I’m sorry I can’t remember the name or who wrote it.
[chuckle]
0:39:04.6 SJ: But those are different ’cause regression… Like I learned regression as an actual technique. I learned regression in the context of you have P values on every coefficient and you add and remove independent variables to understand, can this explain additional variance as a hypothesis. And then I went to the world of machine learning and they were using regression, but there were no P values, there were no confidence intervals, you know, train and test sets and accuracy measures. And so the method is solving these. I’m very good at classifying the problems. But can we bucket the methods? I don’t know. I find it really hard to do that.
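Simon's contrast can be made concrete with one dataset and one fitted line, read two ways. The sketch below is illustrative only: the data are synthetic, the helper names are hypothetical, and the p-value uses a normal approximation to the t distribution, which is only reasonable for large samples.

```python
import math
import random
from statistics import NormalDist, fmean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (intercept, slope)."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def slope_inference(xs, ys):
    """Inferential lens: slope, approximate 95% interval, approximate p-value."""
    n = len(xs)
    intercept, slope = fit_line(xs, ys)
    x_bar = fmean(xs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    se = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    z = slope / se
    # Normal approximation to the t distribution (assumption: n is large).
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return slope, (slope - 1.96 * se, slope + 1.96 * se), p_value


def holdout_rmse(xs, ys, test_fraction=0.25, seed=0):
    """Predictive lens: fit on a training split, score on held-out rows."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(idx) * test_fraction)
    test, train = idx[:n_test], idx[n_test:]
    intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
    errors = [ys[i] - (intercept + slope * xs[i]) for i in test]
    return math.sqrt(fmean(e * e for e in errors))


# Synthetic data, made up for illustration: y = 2 + 0.5*x + noise.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2 + 0.5 * x + rng.gauss(0, 1) for x in xs]

slope, ci, p = slope_inference(xs, ys)
rmse = holdout_rmse(xs, ys)
print(f"slope={slope:.3f}, 95% interval=({ci[0]:.3f}, {ci[1]:.3f}), p={p:.4g}")
print(f"held-out RMSE={rmse:.3f}")
```

The first function answers "is there a relationship, and how sure are we?"; the second answers "how far off are predictions on data the model has not seen?" Same regression, different question.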
[laughter]
0:39:44.8 JH: So maybe it’s, you have to define your problem up front and classify that.
0:39:49.5 MK: Oh, shocking.
[laughter]
0:39:50.5 S?: Shocking.
0:39:51.2 JH: And then based on your assumptions and available data and your stakeholder, you know a lot of variables under there. Then you decide on your methodology. I mean is that what where we’re landing?
0:40:00.0 SJ: I…
0:40:00.1 MK: Kind of.
[laughter]
0:40:00.5 TW: I mean there’s the, yeah, I mean there’s the other way to look at it and this is gonna be another, I am writing the book with him so I’m gonna have Dr. Joe on my brain all the time. But his…
[laughter]
0:40:11.8 TW: His like, I mean this even goes to talking to business stakeholders. Like what’s your dependent variable? Do you have a dependent variable? If you have a dependent variable, then what are your independent variables that you’re considering and what is your unit of analysis? Because you know, if your unit of analysis winds up being a person, well that start to make, that may get you to sort of predictive at a personal level. If your unit of analysis is, time or a channel or something that may be more that you’re just trying to get just a little more inferential maybe. I don’t know. I mean I’d be using these words, right. I’m a fraud. That’s it. I’m coming out of this episode realizing that I have no business being an analyst at all. I know nothing. Thanks.
0:40:56.1 MK: I don’t think that’s something the quintessential analyst would ever say.
0:40:57.6 TW: Oh geez. Michael’s not even on the show.
[laughter]
0:41:04.3 MK: Yes.
0:41:04.3 SJ: Michael.
0:41:07.8 TW: That’s $5 in the swear jar. Well on that note, I actually am now in my role as the poor man’s Michael Helbling. I’m realizing that we are actually at a good on the time front.
0:41:21.4 JH: Wait.
0:41:22.3 TW: We need to start wrapping up.
0:41:23.0 JH: Wait, wait, wait.
0:41:25.5 TW: Which means there’s always…
0:41:25.6 JH: Can I please ask one more.
0:41:25.7 TW: Either Moe or Julie has one more question.
0:41:28.8 JH: Julie.
[laughter]
0:41:30.5 JH: I love that you were like channeling the inner me. This is amazing.
0:41:34.6 MK: Those inspired me to always ask for at least one more question.
[laughter]
0:41:35.0 JH: Yes. No because okay. You have gotten to have a lot of great experiences working with really smart people and you’ve gained so much knowledge yourself. And if you were to give advice to newer analysts, younger analysts, people newer in the space, what would your advice be on how to actually go learn about some new methodologies or any pitfalls around this crazy scary world of methodologies ’cause you always feel like a fraud ’cause you realize there’s more to know and… You know what I mean? Like you’re never gonna know it all. So advice for a young analyst.
0:42:10.1 SJ: Yeah.
0:42:12.0 JH: Listening to this or having to choose methodologies in their job.
0:42:15.1 SJ: That’s a great one. I think the one piece of advice I always give the people I’ve mentored or coached is do it in the context of something practical. Like I spend a lot of time, this is super nerdy, but like I read like statistics textbooks. And while that might have increased my like knowledge in certain domains, really putting these things into practice at your job or on a project is where you’re going to learn the most about the strengths and limitations of those techniques. So always try and focus your learning setting. That’s like the first thing I think of. The second thing, and we’ve talked about it a lot so far, is even in this discussion, we’ve learned from each other go and get involved in communities and they don’t have to be more advanced than you. They don’t have to be more experienced. It’s usually just they have a different background. They see the same techniques from a different perspective. They know different techniques even. Yeah, like I said, us here with all our experience teaching each other. So find a community if you don’t have it at work, there’s heaps online. Talk to people about different methods as you try and apply those methods in a practical setting. I think they’re probably the two things I would start with.
0:43:32.1 TW: I feel like there’s a third one, which was because you said when you were coaching or mentoring people that there is having somebody who actually does have the more experience. Like it’s really hard to be, I’ve talked to people who’ve said this, it’s hard to be the lone data scientist at a company as a junior data scientist and grow effectively. One, it’s terrifying, but two, you just don’t have somebody to say, here’s another tool for your toolbox, here’s another tool for that toolbox. I feel like that was less the case for when I was coming up as an analyst. It was a little easier to be a, maybe there’s just less breadth of things to know. But because you can like do it to your point, like you can take a wrong turn. You can do something that’s profoundly wrong and that any checklist you’re following, it’s gonna take somebody who’s seen it. It’s like, whoa…
[laughter]
0:44:23.0 TW: You know. You can’t do that. I know why you think you can do that. Here’s why you can’t do that. Now you have that knowledge and you’ll never do that again.
0:44:31.2 SJ: I like it. Yeah.
0:44:34.7 TW: So I feel like there was a third one, you kind of opened it with it, you just didn’t realize you were opening with it when you said that it was analysts you were coaching or mentoring.
0:44:40.1 SJ: Yeah. So find an experienced mentor, bounce off peers and put your knowledge to practice.
0:44:45.8 TW: Nailed it.
0:44:47.4 JH: Amazing. Thank you.
0:44:50.9 SJ: Thank you.
0:44:53.6 TW: All right, so we’ve gotta head to wrap, but before we wrap up, we always like to do our last calls. We go around the horn, have everyone share their last call, something that they found interesting or think might be of interest to our audience. And Simon, you’re our guest. Would you like to kick us off with your last call?
0:45:10.3 SJ: Cool, yes please. So this week I read an article in the Harvard Business Review. I think it just came out. It’s called The Myths and Realities of Being a Product Manager by, I’m thinking my pronunciation Apoorva Mishra and yes and no. It’s about product managers. But first of all it starts with a very fun jab at social media influencers. So, if you wanna laugh, have a look at that.
[laughter]
0:45:29.9 SJ: But then it goes really deep into the skills and like the reality of the day-to-day skills product managers need. And one of them is data skills. And he talks about, and just for context, he’s a senior PM at Amazon and he says data skills for PMs today are table stakes. As far as saying they need to be able to do SQL.
0:45:55.3 TW: Yep.
0:46:00.1 SJ: And [laughter], I expect many listeners to agree with that, but it got me thinking really deeply, outside of those of us who have data expertise, who teaches these other people these skills? Like a histogram, you know, who teaches anyone what a histogram means. I’m like, I learned that in Stats 101. Who teaches a product manager? I did to figure it out. So I got thinking really deeply about how do we educate other people on the table stakes skills and also what, where’s the lines table stakes for other people and other cross I, PMs should probably know how to interpret graphs and maybe run SQL but probably not regression. Right? So like where’s that line? Anyway, I thought it was a great article. Great if you either wanna be a PM or you work with PMs, especially from a data point of view. That’s me.
0:46:49.8 TW: Interesting. That could be a whole other episode. I have thoughts. I’ll swallow them right now.
[laughter]
0:46:50.7 SJ: Great.
0:46:52.9 TW: And instead Julie, what’s your last call?
0:47:00.7 JH: My last call is actually coming from a conversation I recently had with Mike Gustafson. We were at work in the kitchen chatting and we were talking about, some announcement had come out about ChatGPT, I mean when is something not coming out about that recently? And he was just telling me that he had come across this really great YouTube series on neural networks and I had said, oh I would love to check that out just because I wanted to like learn more about them in general. And it’s by 3Blue1Brown and it’s like five chapters total and it’s pretty like simplified. It brings it back down to how the math actually works, which I love that like spoke to me where I was like okay, I can grasp this [chuckle]. And it was just really helpful, really simple and it wasn’t a crazy amount of time. So if you’re looking for some helpful information out there to get you started down that road of learning, I highly suggest.
0:47:55.9 TW: She’s like, says the master’s degree in Applied Mathematics says it was pretty straightforward. It was easy to follow. So just gonna throw that little caveat in for listeners who think they’re about to dip their toe into a pool that’s gonna be a lot deeper than they think it is.
[laughter]
0:48:12.8 TW: Moe what about you?
0:48:15.3 MK: I just had so much inferiority complex when, or imposter syndrome when you said that. ‘Cause I was like, holy shit, everyone on this show is really fucking smart. And like as you were talking about Histogram Simon, I was like, well I kind of taught myself that. Or if I got really confused I’d ask my sister definitely did not learn about it in stats. So just a reminder to everyone out there. Imposter syndrome is always a thing.
[laughter]
0:48:37.6 JH: Always.
0:48:40.5 MK: Okay. Hopefully by now everyone has worked out. Like I get really obsessed with like one thing for a couple months and I get really into it. And the thing that I keep banging on about at the moment is the Acquired Podcast. I cannot Stop listening. It is the only thing I’m listening to at the moment. I’ve stopped even listening to news because I am like such a big fan. And I did talk about the Acquired Podcast previously.
0:49:03.9 TW: They’re like really long episodes, right?
0:49:05.9 MK: They’re like three to five hours. Yeah.
[laughter]
0:49:10.9 TW: Oh My God.
0:49:11.0 MK: And I think that’s why I enjoy them ’cause they go into so much detail. They do analysis on the companies. And I previously mentioned the LVMH episode, which is hands down one of the best episodes of anything I’ve ever listened to. However, I have had on my reading list for years to read the book Shoe Dog by Phil Knight about Nike. And part of the reason I wanted to read it so much is because I wanted to listen to the Acquired Podcast episode on Nike, which I have just listened to. And it is an absolute banger.
[laughter]
0:49:39.9 MK: Now the reason, look, I do recommend the book. The book is fantastic, standalone, fantastic read. However, the thing that I probably observed the most about the Nike episode that I really enjoyed from The Acquired Podcast was some of the stuff they went into on the marketing side. Because like the book ends kind of very early in the history of Nike, it talks a lot about like the relationship that they had with athletes. You know, the fact that Phil Knight didn’t really believe in marketing, but actually he was doing basically partnership marketing by working with these athletes. The whole deal with Michael Jordan and how that has kind of revolutionized sports marketing, but then it takes you through where they are today from a brand marketing perspective and how like, when you see Nike advertising, it’s quite aspirational and inspirational. And anyway, I just loved it, loved it, loved it, loved it. I love the Nike brand even more. I am, yeah, big fan. And Tim, over to you.
0:50:42.0 TW: So this is gonna be out of character. Mine’s gonna be like short, it’s also a YouTube video. It’s only six or seven minutes I think, maybe 16 minutes, I don’t know. But there’s a guy named Doug Neill who does sketched visuals of books. He does sketchnoting, which is just the sketching for taking your notes. So he did Thinking in Bets, Annie Duke’s seminal work, speaking of uncertainty, but he basically stands there. It’s kind of like when Rand Fishkin used to do his little Friday sketch things. But this guy is an absolute pro and he basically sketches every concept in the book and he turns and talks to the screen and then he draws a little tree and talks about potential outcomes. And I kinda headed down a path like I did not, I had not heard of sketchnoting, but he sort of makes, he’s done other books and done other things and he teaches people how to do sketchnoting. So…
0:51:37.1 JH: Ooh, that sounds so cool.
0:51:39.3 TW: All right, so Simon, thanks again so much for coming on the show. That was a great discussion. No show would be complete if we did not thank our producer Josh Crowhurst. Actually Josh will not have edited this episode because Josh is getting himself hitched.
0:51:55.2 MK: Woo woo.
0:51:58.8 TW: So congrats to Josh and Gina. We’re soldiering forward. If you’ve noticed any glitches in the quality of this final recording, it was not Josh’s fault. The music was still all Josh.
0:52:09.1 MK: Please send all feedback directly to Josh.
[laughter]
0:52:15.1 TW: All Feedback directly to Josh. So we love to hear from you. If you have struggled with analytical methodologies or have obvious tips that we missed, reach out to us on the Measure Slack, on LinkedIn, on X if you must. This is coming out right before Marketing Analytics Summit. So if you’re gonna be at that conference, then please stop by and say hi. Don’t be like some people who, somebody emailed me today and said I saw you across the room at Adobe Summit and I didn’t come over and say hi. I’m like, why? ‘Cause, and he’s like, well, you seem like you might be a jackass, but no, that wasn’t his response. But please, we’d love to see you. Hopefully we’ll be out and about at other events over the coming months and years. So with that, no matter what methodology you’re picking between, whether it’s running a controlled experiment or whether you’re doing a Mini Batch K-means test, or whether you’re doing an SGD Regressor or doing something else that I’ve never actually heard of and know nothing about, you should always use that data and keep analyzing it.
0:53:21.1 Announcer: Thanks for listening. Let’s keep the conversation going with your comments, suggestions, and questions on Twitter at @AnalyticsHour, on the web at analyticshour.io, our LinkedIn group and the Measure Chat Slack group. Music for the podcast by Josh Crowhurst.
0:53:37.2 TW: So smart guys wanted to fit in, so they made up a term called Analytics. Analytics don’t work.
0:53:48.8 JH: I love Venn diagrams. It’s just something about those three circles and the analysis about where there is the intersection. Right?
0:53:57.7 MK: Simon, one of the reasons that I empowered you under the bus for this episode…
[laughter]
0:54:05.4 SJ: Never heard that.
0:54:09.2 TW: Julie, did you see I had you as the rock flagger?
0:54:11.6 JH: Yeah.
[laughter]
0:54:13.0 JH: Feels like my name’s been shown up there quite a bit.
0:54:19.1 MK: I agree with Tim though. They who start the show prep doc wield the power. True.
0:54:26.2 TW: So just to Michael Jordan or Magic Johnson?
0:54:29.3 MK: Michael Jordan.
0:54:35.2 TW: Okay. So the… You watched the air, did you watch Air?
0:54:36.3 MK: Yeah, that was Jordan. Yeah.
0:54:38.2 TW: That was Jordan.
0:54:39.2 JH: Yeah.
0:54:40.6 TW: That was Magic Johnson.
0:54:43.7 JH: No.
[laughter]
0:54:44.3 JH: That’s Michael Jordan.
0:54:47.2 MK: Yes, I’m right. I am so right. It is Michael Jordan.
0:54:47.6 TW: Okay.
0:54:48.6 MK: That never happens.
0:54:49.2 TW: Oh, you know what I’m… No, I’m mixing it up with, Yeah, you’re right. I’m mixing it up with what? Yeah, that’s a…
0:54:54.8 MK: Okay. Tim’s.
0:54:55.7 TW: Yeah.
0:55:00.1 MK: Tim, you’re sleeping.
0:55:00.7 TW: We’re all in a, I’m totally…
0:55:00.8 MK: I’m shocked.
0:55:01.0 TW: Mixing it up with that other…
0:55:05.0 JH: That other one about Michael Jordan.
0:55:07.9 TW: Yeah. There we go.
[laughter]
0:55:09.1 TW: No, yeah, the other… No.
0:55:11.2 JH: Anyway, rock flag and how do you even book at them?