How does one build a strong culture of experimentation at an organization (and what does that even mean)? One way is to spend a few years working at a company that already has such a culture… and then jump ship to another organization that is well on its way! That’s (sort of) what our guest, Lukas Vermeer, did when he left booking.com to go to Vista. With Val Kroll guest-co-hosting, we dug into the challenges — organizational, educational, and mindset-al (?) — when it comes to having an organization successfully and appropriately integrate experimentation into their operational ways.
Photo by Lukas Vermeer
0:00:05.8 Announcer: Welcome to the Analytics Power Hour. Analytics topics covered conversationally and sometimes with explicit language, here are your hosts, Moe, Michael and Tim.
0:00:22.8 Michael Helbling: I assumed that we’re just doing that whole rolling intro thing where we just launch into the podcast with the intro.
0:00:28.0 Tim Wilson: Oh, yeah. Are we recording it? Sure, it’s the Marc Maron move.
0:00:33.1 MH: Here we are. It’s pretty cool. I’m here with Tim Wilson, Valerie Kroll stepping in as a guest cohost, and I’m honored to have our guest, Lukas Vermeer, who I’m calling one of the Dutch masters of experimentation. [laughter] We’re delighted to have you. Thanks for coming on the show, Lukas.
0:00:56.6 Lukas Vermeer: Thanks, love to be here. Did you know I actually live in the lovely city of Delft, the same city that the famous Dutch master Vermeer actually lived in?
0:01:06.7 MH: Well, I assumed you’re a direct descendant, right? Vermeer and…
0:01:09.9 LV: No, no, no, no, no. It’s a different branch.
0:01:13.9 MH: Okay. Well, but to give our listeners a little bit of context, so Lukas Vermeer is the Director of Experimentation at Vista. And prior to that, he also led experimentation at booking.com. He’s pretty much spoken in almost every country in the world, I think all over the world about these topics around testing, experimentation, optimization, and I’m just delighted that you’re our guest today, it’s been pretty cool to set up episode 216 of the Analytics Power Hour.
0:01:40.7 MH: I think one of the things I think, Valerie, you brought up that I think is actually really important, is there’s definitely a misconception around what culture is when we talk about experimentation or a culture of experimentation, and I think this is true. People more broadly use the term data culture, and I think there’s a big problem with that word too but let’s keep it focused to experimentation, which is a group of things that happen under data, if you will. So maybe that’s the first thing to jump into is, what is it and what is it not?
0:02:13.1 LV: Oh, boy. Starting with the easy questions.
0:02:17.1 MH: I had a whole intro written where I had us doing some deep breathing exercises to like…
0:02:23.7 LV: Some yoga.
0:02:24.8 TW: That’s question one, and then we’re just gonna talk about sample ratio mismatch, we’ll call it a show. We’ll cover the two extremes of culture to one very narrow…
0:02:31.7 MH: When I first read the description. I thought It said culture of meditation and not culture of experimentation, so that was the reason why.
0:02:40.6 LV: I’m on board with that.
0:02:41.3 Val Kroll: I wanted to prod you in this way, Michael, just because I feel like so many people come to that conversation with like, Oh yeah, I know exactly what this is. And a lot of times when I meet with clients who we’re kicking off an engagement and they’re like, oh yeah, our culture is really healthy, we have a monthly lunch and learn, and we throw people a t-shirt if they get the most test ideas submitted in a month, that’s our culture. Check that box. Moving right along, right? And to me, that’s not what a culture of experimentation is, and so I was like, “Oh, I’d love to ask Lukas, how people have potentially misunderstood or misrepresented in some ways what this means and what the definition is that’s kind of near and dear to his heart.” [chuckle]
0:03:21.7 LV: This is such a difficult question because I used the term before, does a fish know it’s wet? Because I was part of the booking.com organization for such a long time that I was essentially brainwashed into thinking that that was normal. And I’m using that term lightly. I don’t mean actual brainwash, you are part of an organization for such a long time, you start to think that the way that things are done there is just normal, and that everyone else must clearly be behaving this way also. That makes it really difficult to say what does and what does not make an effective experimentation culture. The fact that I’ve only been exposed to one, maybe two with Vista now, companies that are running experiments at scale, who am I to say anything about experimentation culture? Honestly, my exposure has been limited to n equals two. In fact, it’s one of the reasons I left. One of the reasons I left is I wanted to know what are the things I’m going to miss, what are the things that I won’t have when I go somewhere else, ’cause that will hopefully help me understand better. What are the things that made it tick?
0:04:27.6 TW: But you were out… I certainly became aware of you when you were at booking.com, and did you run into that when you would be out speaking? Is there a moment where you realized like, “Oh, I’m talking about these challenges or opportunities and techniques, and my audience is not living, they’re in a completely different spot.” Did you have times you were talking to people and saying, “Oh, wait a minute, every company is not like this?”
0:05:00.3 LV: Yeah, yeah, again, that’s why I left. So clearly the realization was there, that doesn’t mean you actually understand what the difference is, what’s missing. And I think that is… It’s difficult to see from the inside. What is exactly different? There is this book called Empowered by Marty Cagan. It’s not about experimentation, it’s about product teams, empowered product teams, and it has this… He was on a podcast recently where he said there’s this gap between organizations that have empowered product teams and organizations that have what he calls feature teams, so teams that are just churning out features for the business. And there is this gap which is so wide that people on either side of the gap cannot imagine what it’s like to be on the other side, and I think for people who are embedded in these high velocity experimentation cultures, these high experimentation-minded companies, it’s really difficult to think about what it is like if you’re in an organization that isn’t like that. Yeah. I wonder how many people can really speak to that.
0:06:05.0 TW: So what is that… I guess, what is the high powered… And maybe back to Val’s a little bit, does high-powered mean every thought or idea underneath it has a… Is this something that could or should or would make for a good experiment? And it’s not an afterthought? What does that mean? If it’s a product and it’s an empowered product team, and we’re talking about experimentation, what does… Because with the throwing out a t-shirt maybe produced from Vistaprint, right? Does Vista do t-shirts?
0:06:46.4 LV: Obviously, yes. That’s where you would print it, yes.
0:06:48.5 TW: That’s where they’re throwing them out, but that confusing just the velocity of experiments or the volume of experiments with the embedded thought around ideas and uncertainty and reducing uncertainty in an experiment is one tool to do that, is that the framing between the… Or a way to think about the difference.
0:07:17.0 LV: I think you used the word uncertainty, I think that’s an important concept here, or explicit uncertainty or a humbleness, as they like to say at Booking. The willingness to accept and even proclaim that you don’t actually know whether it’s going to work.
0:07:33.7 LV: And I think then from that stems the desire to then experiment. The experiment is sort of the output of this process where the leaders and teams say, “Well, we don’t actually know whether this is going to work or how this is going to work, and so we’re going to try different things and we’re gonna do those things based on ideas that we have about what might work, and we call those ideas hypotheses. And then we’re gonna test them to see which ones work,” and I think that is when experimentation comes at the end of this process. And that’s why we were talking earlier about Val’s comment at the conference that asking people to proclaim what they think is going to win… At the time, I said, “This is a stupid idea, the only thing that you’re gonna learn is that you’re bad at guessing,” but actually in hindsight, Val was right. If you want to teach people that they should be humble and that they often don’t know the right answer, then asking them to out loud and ideally publicly, guess and proclaim, I think B is going to be the winner, and then they’re gonna be wrong sometime. And the lesson there isn’t necessarily whether they’re a good guesser but the lesson they need to learn is that they are essentially guessing and that they need to be humble and understand that they won’t always have the answers, but that’s actually…
0:08:53.1 LV: That might be a good example of me not understanding what the outside world is like. At the time, I said, “This is a stupid idea, you shouldn’t do it,” now two years down the line I go actually, maybe that is a genius idea, Val.
0:09:05.3 TW: This is one where it’s unfair, I got to go and watch that, you probably don’t remember, ’cause you followed it up with saying, “Sure, put what you think your challenge is gonna be, but frame it as a hypothesis,” ’cause I think that’s where the two of you were… You were saying the same thing and articulating in a very, very similar way of getting people to say not, well, let’s try button color, let’s just try different button colors is bad. I think the thing with Google that feels like the unhealthy, just test test test test test test test. You said, “No, no, no.” You think that a high contrast button color would work better because it would be more visible or because we had user research, and I think that that goes back to… I think trying to get the culture being, no, no, you have to have ideas and those ideas have to be rationalized somehow with data, with just experience, with your gut feeling, I don’t care what. If you’re gonna pick, I think that’s gonna win, it feels like…
0:10:07.5 TW: That’s bad. You’re asking people to flip a coin and just knee-jerk one, they’re picking one or the other, because they see something that they think would make one a winner. Getting that captured even if they’re wrong, well, now you’ve got a richer something to pull from, ’cause you’ve got people saying, “I need to think about it,” and unless they take it, it’s like, well, my rationalization and my ideas are dead wrong, I might as well just test test test test test test, which would be unhealthy maybe.
0:10:37.3 LV: Yeah, it reminds me of something that we were talking about earlier. This isn’t in the show notes, but we can still talk about it, which is that Vista is fully remote first. So we’re not remote only, but we are remote first, there’s literally no one else in this city that works for the same company. I have no office to go to, I work from home. And one of the consequences of that is that I have to do a lot of more of my work in writing and a lot more of my work asynchronously in writing, so not chatting, but actually writing documents, and that’s hard.
0:11:11.8 LV: It’s mentally more demanding to write things down than it is to have a chat over Zoom, and it made me think about why is it harder to write things down? I think one of the reasons is that writing things down requires being much more clear about what it is you actually want to say. You review your words and you think, oh no, that’s not actually exactly what I wanted to say. And so it crystallizes your thinking and forces you to be more clear.
0:11:39.1 MH: Why do you think we have a podcast instead of a blog? Yeah.
0:11:47.3 LV: Well, actually, I say we should be writing all of this down.
0:11:51.6 MH: There is a transcript, there’s a transcript.
0:11:55.0 LV: That is not the same thing.
0:11:56.5 MH: No.
0:11:58.6 TW: As evidenced by reading the transcript, it’s horrifying.
0:12:03.4 LV: But then the thing is that writing a hypothesis and specifically calling out the metrics and the direction that you expect them to move is yet another step beyond that, right? Speaking is relatively easy, writing things down requires even more clear thinking. But then when you want to specify, and I think what will happen is this number will go up by about that much, that requires being even clearer about what you’re thinking and what your expectations are. I think this is one of the things that people struggle with.
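As an illustration of what writing a hypothesis down with a metric, a direction, and a rough size can look like, here is a minimal Python sketch. It is not a description of the tooling at Booking or Vista. The `Hypothesis` class, its field names, and the 2% figure are all hypothetical, and the button example borrows the high-contrast button idea mentioned earlier in the conversation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """A written-down hypothesis (hypothetical structure, for illustration only)."""
    change: str                      # what we plan to change
    rationale: str                   # why we believe it will help
    metric: str                      # the metric we expect to move
    direction: str                   # "up" or "down"
    expected_relative_change: float  # rough size, e.g. 0.02 for about 2%

    def __post_init__(self):
        # Force the author to commit to a direction and a positive size.
        if self.direction not in ("up", "down"):
            raise ValueError("direction must be 'up' or 'down'")
        if self.expected_relative_change <= 0:
            raise ValueError("expected_relative_change must be positive")

    def statement(self) -> str:
        """Render the hypothesis as one explicit sentence."""
        return (
            f"If we {self.change}, then {self.metric} will go {self.direction} "
            f"by about {self.expected_relative_change:.1%}, because {self.rationale}."
        )

    def direction_matches(self, observed_relative_change: float) -> bool:
        """Check only whether the observed movement has the predicted sign."""
        if self.direction == "up":
            return observed_relative_change > 0
        return observed_relative_change < 0


# Made-up example values; none of these numbers come from the episode.
button = Hypothesis(
    change="use a high contrast button color",
    rationale="the button will be more visible",
    metric="click-through rate",
    direction="up",
    expected_relative_change=0.02,
)
print(button.statement())
```

The point of the validation in `__post_init__` is the same one made above: a hypothesis that cannot name its metric, direction, and approximate size has not been thought through yet.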
0:12:36.0 VK: So I have to say, as someone who’s been in the audience of several presentations of yours being on the outside of like, things don’t work for me like it does at Booking, why can’t we be more like Lukas’s organization? I always thought that one of the ingredients of the recipe for success is that you were co-located there, and I know that one of the things you called out in your paper is that a lot of the infrastructure and the process and the methods that were built, were built with the experimentation in mind to support that from the very get-go. So I’m curious if the remote feels like an element to you now that you’re at Vista as one of the things that contributed to getting to where you were at at Booking.
0:13:15.2 LV: That is a very interesting question.
0:13:17.1 LV: I have no idea. You made me think, now damn it. I’m not sure whether the remote, so being physically collocated was the key thing, ’cause to be clear, the booking.com started running experiments when they had about five developers in total, and they just never stopped. Just grew from that. And so when we said the infrastructure was built with the culture in mind, we were talking about an organization that from almost from the beginning has been using experimentation as a way to figure things out and as a result, the way this usually goes in start-ups and scale-ups is that the people who end up in leadership are the people who were there at the beginning, you grow with the company. And so as a result, I think the thing that helped Booking a lot wasn’t necessarily that people were co-located, but that the C-Suite deeply understood how to run experiments. I have done peer reviews of experiments together with the Chief Product Officer, and he could click through the experimentation tool, knew how to look at the metrics, he knew how to read a hypothesis, he knew how to point out problems, and so the fact that the leadership understood the importance of experimentation and also understood how to interpret it, I think was critical.
0:14:38.4 LV: I’m not sure whether being co-located, had any…
0:14:41.4 VK: That’s helpful. I remember you told a story as part of your presentation one time about someone who was riding their bike on the way to work and had an idea about a test and launched it before lunch, and I was like, “How does that happen?” ‘Cause I remember being at an organization where I was in essentially a satellite office, and I didn’t have an opportunity to go to the home office until maybe two years into my time and being able to sit across the table and look at some of the people who were maybe adding a little resistance to my process and slowing me down. And sitting across from them, looking them in the eye and saying, “This is not a bad thing we’re doing. You helping me on this front, this is gonna raise all ships. Whatever that saying is.” And so that was really powerful. So I was curious if you thought that that was one of the ingredients, but I love that you were able to have those deep dives with senior leadership creating that space for experimentation, I just, I can only dream of what that would feel like, that sounds pretty powerful.
0:15:40.6 LV: I think that particular example was a copywriter who… The infrastructure that Booking had built, at some point, there was a convergence, and this was thanks to one of the copywriters who later became a PM, and she saw a convergence between the content management system that was being used to change copy on the website and the experimentation platform. And she said, “Copywriters can change copy using this content management system on the go and they can change the word in the website, and developers can run experiments using this experimentation platform that you’ve built, why can’t we change copy and make it part of an experiment?” And so together with the one or two developers, they used the SDK from the experimentation platform integrated into the content management system, and that is what allowed this particular thing to happen, where a copywriter has an idea on the bike, gets to the office, goes into the content management system, changes a bunch of words and clicks Start. That is the technical foundation that allows it to happen. There’s also the organizational foundation where no one told her not to do this. In fact, leadership encouraged it and said, “Actually, if you have a good idea, then please do test it,” and so they’re…
0:16:51.8 LV: That’s the organizational friction that you’re speaking towards.
0:16:55.4 TW: But that seems to get back to where she had an idea. I think that to me… And I think through some of the examples, I worked with a client years ago that said, “We gotta do testing, we gotta do testing, we gotta do testing, we need to pick our testing platform.” I’m like, “Great, we can do that.” I’m not entirely sure you’re at a point where I’m hearing the ideas to make changes, so I kind of said, “Sure, we can go figure out Optimizely versus Target or whatever, but maybe we should just as a thought experiment, kinda go through what ideas we have to test,” and we spent three months without getting a single idea captured. I was like, why? And yet they still kept coming back to the infrastructure. And I guess, can you talk a little bit about the infrastructure? It feels like that’s one of those things we were talking earlier about somebody hearing something and maybe running with it in an unhealthy way and thinking, oh, I have to get my MiaProva or Effective Experiments and Adobe Target.
0:18:00.9 TW: I gotta get all that in. I gotta build the infrastructure, and then I will have a great program because I’ll be able to quickly run a test as opposed to the flip side, the cultural infrastructure of tests need to start with a theory that you’re trying to validate with a clear objective, where does that fit? And that seems like that’s maybe harder for you to put your thumb on from your booking.com experience, it seemed like it was there, or?
0:18:29.7 LV: And even before that, we’re talking about, even if someone has ideas, are they allowed to make those changes without having 20 meetings with stakeholders to decide whether this is a good thing to try? So there’s the organizational infrastructure that comes even before we’re talking about ideas. I think that there’s probably some back and forth there where if people know that they will never be allowed to try something, they will not come up with ideas, right? You have to give them the freedom to try things first before they will actually come up with things to try, I think.
0:19:06.2 VK: And one other thing about this, and rarely the… Let’s take one further step back person in a conversation, but for a copywriter to have the power to change the content and to say, actually I wanna run this as a test, goes back to something else you said earlier, Lukas, which is humility to say, I might not know what the optimal title of this piece of content should be. I’m gonna go ahead and put this out and let my users kind of decide with their clicks or whatever. But I think that that element, that goes back to culture for me is because you’re working with people who are curious and know that maybe we should lean on the data to help pave the best path forward. Right?
0:19:46.7 TW: But to go one step further back, there are lots of organizations where the copywriter would’ve been taught that they just need to go to the analytics team and the analytics team will tell them what the best copy is by analyzing the data. To me, that’s the battle of go analyze my data and tell me what I need to change. And there’s not even an awareness of experimentation. So I’ve been in plenty of those where they’re saying, and I’m like, no, no, no, you the copywriter needs to have ideas. And they’re like, no, no, no, the data needs to tell me what the answer is. And that’s a challenge.
0:20:17.8 LV: Yeah. They have to go back to the data analyst to come up with the answers and then have to go to the business to ask for permission. I think this is one of the things that I learned when I left Booking, is that when people at Booking talk about the analyst, it is not a different team. It is someone on their own team that has the skills that they need, ’cause they’re empowered product teams, right? They’re cross-functional. And this copywriter wouldn’t go to a different team. They go to a teammate that is on the same team that they can ask questions. So that’s one. And the other is that when people talk about the business, that is not a different department. The business is the website that is running, that is helping them make money, right? So when people at Booking talk about the business, they don’t mean a different department that is gatekeeping decisions. It is the machine that is running. It’s a different way of viewing things, I think.
0:21:08.6 VK: So this is my… I’m so excited to ask this question. I’m curious if you look at an organization and they have a successful program, you feel like I’m sniffing out some healthy culture here, does that always mean that experimentation is democratized? That it kinda lives everywhere, or that everyone is empowered? I’m wondering how closely tied those two things are for you?
0:21:33.2 LV: Why these, geez, you’re asking good questions. So if I think about the things we talked about before, right? So in the sense that experimentation for me is tightly coupled with the organizational structure and are people allowed to try things and are they humble enough to admit that they don’t have the answer? I’m trying to think of, if experimentation where that mindset isn’t widely dispersed, but it’s localized in a small part of the organization. Even if they’re very good at testing things, there’s gonna be a large share of the company that’s still making decisions HiPPO-based. And then I wouldn’t call that a successful “culture of experimentation” because part of it is that it has to be permeating all decisions. I talked about this in another podcast that I won’t name, earlier. I think the challenge isn’t necessarily to make a few very good decisions. The challenge is to make lots and lots of just a little bit better decisions and a lot of the decisions are going to be made…
0:22:38.9 TW: I’ll name it, that was the First Principles podcast. People are welcome to go listen to one episode and then come back to the…
0:22:46.8 VK: No, that quote was gold. I think when we talked about that, I told him, I was like, “I want that on t-shirts, I want that on mugs. I want that printed everywhere. It should be my background.” Oh, that was such gold.
0:22:56.4 LV: Well, I happen to know a good website where you can print things on lots of things.
0:23:00.1 VK: Oh, perfect. We’ll drop that in the show notes. [laughter]
0:23:04.3 LV: Excellent.
0:23:05.5 TW: So Booking and Vista are both ones that I think is fair to say are dominantly online businesses. Do you think about, or even when you were leaving Booking to go to Vista, did you talk to organizations where the website was, played a… Or the digital aspect played a narrower range in what was going on and then… ‘Cause I feel like when I worked with a philanthropic foundation or a pharma company or a healthcare system, there’s a lot more happening offline. So it wasn’t conversion rate optimization on the website. Have you thought about, or did you think about kind of one of those types of opportunities and where kind of having a healthy culture of experimentation fits when it’s maybe a little, I don’t wanna say less messy. I do not want to be the one saying, you have it easy with these simple little pure play online things. [laughter] But I do think I’ve run into it a lot where…
0:24:06.3 LV: Dude, cut me some slack. I’m already largely out of my comfort zone. I didn’t wanna make too big of a jump. But seriously though, one of the things that actually compelled me to Vista is that they actually have a pretty large manufacturing part of the business, ’cause the things they’re printing, they are making in their own factories. So there’s a lot of physical experimentation opportunities going on. So yeah, that was one of the things that appealed to me. But I did want to… So I was looking for a place where I could explore this idea of how do you grow a community of experimentation practice in an organization that isn’t Booking, leaning on the strengths that I had. Because I could have also said, well let’s try something completely different. Let me be, I don’t know, an airline pilot, but you do… We wanna explore the space of opportunity and try to find new things that you can learn without completely flipping the tables and trying something entirely new. This seemed far enough.
0:25:06.3 TW: Well, but I guess add… Don’t reveal anything. It seems like there would be, at Vista there would be ideas, if we could promise a faster turnaround or if we could promise there would be stuff to execute the experiment would require ultimately an operational manufacturing or production or fulfillment change, is that… Are you into that? Is it broadening out to that where the experiment is not necessarily optimizing a production line, but saying, if we’re gonna test this on in the digital experience, it’s gotta be backed up with production changes.
0:25:40.5 LV: Yeah, or even just even more basic, if we make changes to production, can we test those on what is the impact on customers when we make changes to production process? Even if we’re not displaying those things on the website, we could still change things about how we produce things. And same for shipping, right? So shipping is a good example of how much do we charge for shipping early? How does that impact the production process? So there’s lots of opportunities for experimenting in that space.
0:26:08.5 MH: What boxes to use, what packaging, what impacts returns. Yeah, there’s lots and lots.
0:26:15.6 LV: It’s an interesting space also, because Val was talking about friction and pushback and production manufacturing is a very operational heavy optimization focused area, right? People aren’t really thinking about, oh, how do we reduce uncertainty and what things do we not know? They’re really thinking about how do we optimize this process as much as possible, but from a different light.
0:26:38.3 VK: So I have a question in a slightly different direction. So Lukas, because you have computer science in your background and I believe you identify as a data scientist. When you were going through that training…
0:26:51.4 LV: Wait, is that identifying… I’m confused.
0:26:52.6 VK: Well, I don’t know because your titles don’t have data science in it, but I think that that’s what your trade… Yeah, titles and roles for things, right?
0:27:00.7 LV: I think I was a data scientist for a full two years. So yeah, fair enough.
0:27:04.5 VK: I think you can wear that badge. I think we’ll allow it. But when you’re going through all that training, did you imagine, Lukas, in the future, today focusing and spending so much time thinking about process and people and culture and operationalizing all this and what percentage of time…
0:27:20.0 LV: Oh, absolutely not.
0:27:24.9 LV: No, not in the slightest.
0:27:26.4 VK: Yeah. You spent more time I’m assuming in that than you imagined you would?
0:27:30.6 LV: Oh yes, 1000%. Yeah. So I studied computational intelligence. So a lot of what we now call AI, right? So a lot of what I learned was around the… So either you get an expert to explain to you how they make decisions and then you put that in the computer, easy, right? Or you ask the business to tell you their objective function and then you ask the computer to optimize for the objective function, easy, right? And so I spent I think the first five years of my career as a business intelligence consultant, trying to help companies first measure how they were doing, quickly realizing that lots of companies don’t know what their objective function is. They have no idea what they’re trying to optimize or how they would measure it. And then by chance I rolled into a project where the product they were using was a next best offer or next best action tooling that was using some sort of fancy Bayesian bandit algorithm under the hood.
0:28:28.8 LV: And I started working with a lot of clients trying to figure out how to define their objective function so we could figure out what to actually sell to customers. And I would get into conversations with someone from marketing, someone from sales and someone from the call center where the actions were going to be taken. And they would say things like, “Okay, so we want… ” I would ask them, what do you want this thing to optimize for? And sales would say, “Well, we want you to optimize for profit, clearly.” And marketing would say, “No, no, we want you to optimize for selling iPhones.” And then call center would say, “No, actually we want you to minimize call handling time.” And I would sit there going like, “Dude, guys, these are all opposites, right?” The best way to minimize call handling time is to hang up the phone and not sell anything. iPhones are actually a low-margin product. So if you wanna make a profit, then we don’t sell any iPhones. And they would just look at me and go like, “We just spent $2 million on an expensive piece of software that was gonna use machine learning to optimize our business. Now you’re telling us that you can’t do it.” And so I got really frustrated with these businesses that just for some reason or other haven’t figured out, what is that single thing that they want to optimize for? What is it they’re trying to achieve?
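The conflict described here can be made concrete with a toy Python sketch. Everything in it is an assumption made for illustration: the three candidate actions, the made-up numbers attached to them, and the three objective functions are hypothetical stand-ins for the sales, marketing, and call center goals in the story, not data from any real system.

```python
# Hypothetical outcomes per call for three candidate actions (made-up numbers).
actions = {
    "hang up immediately": {"profit": 0.0, "iphones_sold": 0, "handling_minutes": 0.1},
    "offer an iPhone": {"profit": 5.0, "iphones_sold": 1, "handling_minutes": 6.0},
    "offer a high-margin plan": {"profit": 40.0, "iphones_sold": 0, "handling_minutes": 8.0},
}

# Each department's objective, written as "higher is better".
objectives = {
    "sales": lambda outcome: outcome["profit"],
    "marketing": lambda outcome: outcome["iphones_sold"],
    "call center": lambda outcome: -outcome["handling_minutes"],  # minimize time
}


def best_action(objective):
    """Return the name of the action that maximizes the given objective."""
    return max(actions, key=lambda name: objective(actions[name]))


choices = {dept: best_action(fn) for dept, fn in objectives.items()}
for dept, action in choices.items():
    print(f"{dept}: {action}")
```

With these assumed numbers, each department's objective selects a different action, and the call center objective selects hanging up. No optimizer can resolve that disagreement until someone decides which single objective, or which explicit trade-off between them, the system is supposed to serve.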
0:29:43.1 TW: Would you respond to that? If you look at your older wiser self now, if you dropped yourself right back into that situation, I’m thinking of a former coworker who I was talking to earlier this week, who exactly that, he was like, the product team, they rolled out a redesigned thing and they wanna optimize for three things, call one complete bullshit, but two of them that could easily be at odds with each other and neither one of them is performing better. And I’d said, “Well, could we walk through a decision matrix maybe and say upfront plan out, what would you have done if this one goes up and this one goes down?” Ideally you just go to one. But I guess what would you do now, ’cause that is, I think, tons of organizations run into that and where they’re not really clear on what they’re trying to do. They wanna make everything better and they say they’re very clear. We just wanna make everything better. [laughter] What is wrong with you? Why are you looking at me that way?
0:30:40.1 MH: We like to measure everything and make everything better.
0:30:43.8 TW: It’s very simple. We just want to make everything better. Or really though, we’ll say we just want to drive revenue.
0:30:50.7 LV: Lemme make one thing clear. I am definitely older. I’m not sure about the wiser, and I think that… I’m not sure I would say anything to my younger self, this to me was a very important part of my learning, right? The being part of these conversations, seeing the disconnect. That slowly starting to understand that these companies weren’t struggling for technical reasons. The problem wasn’t the tooling, the problem was the perverse incentives and the fact that these people were part of the same company, but at the same time they were actually trying to optimize for different things. I think that was an important part of my growth as an individual. And I wouldn’t really go back and whisper in my ear and go like, “You can skip these steps. There’s a faster path.”
0:31:35.7 TW: Did you then, or do you now see it as the role of you in that role to actually help? You just did a very crisp… Guys, you see the lunacy in this, right? These are a… And I think, there are times where analytics teams or optimization teams say the organization is bonkers. This is so obviously working at cross purposes. Does it fall? Who can carry the torch of saying, let’s explain why we can’t get these. I feel like there are the examples, the don’t answer the phone. That example circulates in experimentation and analytics circles all the time, but not necessarily in the business. And if you’re talking about the culture of a business and who does that education? Who tries to say, “I don’t want to be a jackass, but I do want to kind of gently point out the logical inconsistencies of what you’re asking for.” And is that a key part of trying to help shift to a more productive culture of experimentation, I guess to use our show title?
0:32:43.4 LV: There’s a challenge here though, right? So, ’cause I think it’s, to me at least it’s relatively easy to find these things that are at conflict, that is a whole world apart from actually being able to solve it, right? Because these conflicts are in my mind arising from the fact that different parts of the organization are incentivized to achieve different things and that is, in some sense, necessary for an organization to scale. In your previous example you said, we just want things to be better. Well, if you tell everyone just go ahead and make things better, then there is no conflict in the objectives now, it’s just that it’s super unclear what everyone is supposed to achieve. And I think the challenge is that when leaders become clearer in what they expect parts of the organization to achieve, the more clear they become, the larger the odds that some conflict is going to arise, right?
0:33:37.2 LV: At some point you’re gonna tell one department to do X and the other department to do Y, and because they are different things, they will conflict at some level. And this is why I said defining metrics is thinking but more difficult just like writing is thinking, but more difficult. And one of the reasons it’s more difficult is because you become so crisp and so clear that the conflict naturally arises. And then some idiot like me is gonna walk in the room and say, “Hey, did you guys realize that you have conflicting objectives?” And they go, “Yeah, yeah, we knew that. How do we fix it?” And I think that is a far more interesting challenge. How do you design an organization, both in terms of reporting lines, which departments exist, who’s in which department, what objectives do those different departments have? How do you design an organization so that you minimize the number of conflicts and the number of conflicting objectives that teams have?
0:34:32.9 TW: That’s simple. I love the idea of the writing it down.
0:34:36.2 LV: See what I did there? I just, Val was asking all of these difficult questions. I just turned the tables on you, like how would you answer that, Tim?
0:34:42.8 TW: That’s right. Well, I do think you did just kind of, you did call out the… What’s wrong is that it’s not like, well you’ve gotta pick one. It’s like, well, no, the call center does want to see fewest calls and then they want to handle them as effectively and efficiently as possible. You wanna have the highest customer satisfaction, well quadruple your staff, so there is, from managing an organization, you want conflict. You can’t just say, “I’m gonna do just one thing.” Unless again, you say, “Well we just wanna maximize profit.” Okay, great. But there’s a lot of different paths to do that. And the path that you want to pick is your strategy where you are balancing different factors, which means yes, there may be these things may appear to be in conflict.
0:35:34.7 TW: Don’t try to make one change and have it improve things that are both in conflict. You may say, “I wanna make this change.” I have to put the thought in to say, “I think this is going to sell more iPhones without negatively impacting overall profit by X amount.” To me it starts to go into the guardrail versus primary optimization metric. But all of that is like, that’s like critical thinking that the business needs to do. I think I live so much… I’ve lived so much in the space where it’s like the critical thinking part is kind of the ball that gets thrown up in the air and everyone scatters, which I realize sounds a little elitist and high-minded. I think anybody can step in and say, “Well, let’s try to understand how we might catch that ball and get people to react to that.” But if we just think that the machine learning tool, let’s just do something Bayesian and then that’ll solve it. Yeah, but Bayes is dead, Bayes isn’t catching any balls, that’s gonna go to the business.
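Tim's distinction between a primary optimization metric and a guardrail metric can be written down as a decision rule agreed before the test runs. The sketch below is one possible reading, not a method described in the episode: the names `MetricResult`, `decide`, and `max_drop` are hypothetical, and it deliberately ignores statistical uncertainty, which a real analysis would have to account for.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class MetricResult:
    name: str
    relative_change: float  # e.g. 0.03 means +3% versus control


def decide(primary: MetricResult, guardrails: list[MetricResult], max_drop: float) -> str:
    """Ship only if the primary metric improved and no guardrail fell by more than max_drop.

    Simplification (assumption): observed changes are treated as exact,
    with no confidence intervals or significance testing.
    """
    breached = [g.name for g in guardrails if g.relative_change < -max_drop]
    if breached:
        return "do not ship: guardrail breached (" + ", ".join(breached) + ")"
    if primary.relative_change > 0:
        return "ship"
    return "do not ship: primary metric did not improve"


# "Sell more iPhones without hurting overall profit by more than 2%."
small_profit_dip = decide(
    MetricResult("iphones_sold", 0.05),
    [MetricResult("overall_profit", -0.01)],
    max_drop=0.02,
)
large_profit_dip = decide(
    MetricResult("iphones_sold", 0.05),
    [MetricResult("overall_profit", -0.04)],
    max_drop=0.02,
)
print(small_profit_dip)
print(large_profit_dip)
```

Writing the rule down in advance forces the "X amount" in Tim's sentence to become an explicit number, which is the critical-thinking step the tooling cannot do for the business.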
0:36:43.0 LV: Oh boy Tim, did you just start another war?
0:36:45.3 MH: That’s what he does.
0:36:46.5 LV: Bayes is not dead. Bayesian thinking is alive.
0:36:49.4 TW: No, no, no. I was saying Bayes himself. I was like, Thomas Bayes is dead.
0:36:54.5 MH: No, no, this podcast is firmly Bayesian. Don’t worry.
0:36:58.2 TW: Yeah. I am definitely not taking a Bayesian frequentist stand. I am not equipped to and absolutely will not. What’s your prior? Sorry.
0:37:12.6 MH: I think one thing… I’ve been taking some notes as we’ve been discussing this. And so in a certain sense, we had you on the podcast, Lukas, and we wanted to ask you all these questions about building a culture of experimentation and I think you demonstrated a couple of things that I want to share back to you and then get your reaction. Because it’s almost like what we just had was more of a reflection on experimentation or culture of experimentation. But one of the things that… Some of the things I heard you say was there needs to be, and I’m gonna use a term you didn’t use, but epistemic humility, which is basically operating under conditions of uncertainty, right? And so that was one of the things you brought up, you also brought up bringing clarity about expectations and that was in the concept you’re bringing up about how you’re now writing a lot of the things and the difference that’s creating.
0:38:04.5 MH: The other thing was that leadership understood and considered critical the role of experimentation. That was something that stood out. There was organizational encouragement to try things. There were people allowed to try. So the story of Anna the Copywriter, and there were cross-functional teams who were close by to the people who were actually making the whole thing work. There were also, you mentioned something about making lots and lots of slightly better decisions, not necessarily throwing one big rock in the ocean, but incrementality I guess. And then I just have a note that says, be smart. I don’t know what that was for, but I think that’s probably helpful.
0:38:48.3 VK: Can’t hurt.
0:38:49.4 MH: And, let’s see there’s one more. Yeah, okay, so we were just talking for a good amount of time of thinking through what is the most important or most critical thing we should be optimising for. And so as I was listening to you describe all those things, what started to emerge was a bullet list of, oh yeah, these are all markers of what probably a culture of experimentation should probably look like. So congratulations, you totally did it.
0:39:14.6 LV: Oh, thank you.
0:39:15.6 MH: Yeah.
0:39:19.0 LV: I think the thing I would add there is that you mentioned that leadership understands experimentation, but critically also their role in experimentation, how they play a part in enabling this. And not necessarily coming up with hypotheses and pushing for particular ideas. But the team’s earlier point when… Part of the role of leadership, I think, is in this high velocity experimentation setting, giving teams a target to aim for, then monitoring their progress against that target. And when they start to go off the rails, to adjust and not adjust by saying, “No, that’s wrong, you shouldn’t have run that experiment, that experiment is bad, you should stop doing that.” But adjusting the target and adjusting the objective. So in this setting where the call center is optimising for the wrong metrics, someone at the top should step in and say, “Actually, these metrics that we have set, they are causing perverse incentives, and so we should adjust the metrics so that the teams are actually optimising for the right thing.”
0:40:14.5 MH: Leadership’s role is to help resolve the inherent conflict that arises from the pursuit of these optimisations.
0:40:22.6 LV: And at the same time acknowledge that no metric is going to be perfect, right? These conflicts will always exist. These teams when you send them on a mission with the objective, they are always gonna go slightly awry, and that is unavoidable. At least if you want to give them the freedom to operate and you don’t wanna be like a, one person makes all the decisions in the company. There are gonna be inefficiencies in the system, and part of the role of leadership is to figure out these metrics that we’re currently using, are they sending us off a cliff? And if they are, then maybe we should be using different metrics, right? And the challenge isn’t to find the perfect metrics, but to find the metrics that will help the teams execute in the short to medium term against the long-term objectives that the company has.
0:41:06.0 LV: And so one example that I’m reminded of is, and I think it’s Working Backwards the book about Amazon, where at some point Jeff Bezos decided that mp3s were the future and CDs were gonna die. They set an objective for teams to sell more mp3s and every single experiment they ran, they lost money. The revenue went down in all of those experiments, and every time Jeff would say, “Yes, I know. Short term, we’re going to lose money on this thing ’cause mp3s are cheaper than CDs, but long term, this is the future.” And so they kept a team on these objectives, even though from a conflict point of view, if you look at the entire organisation and say, “Hey, this entire company wants to make money,” then this is clearly at odds with the objective of the overall company, but still strategically, in that case, leadership is using a metric, is using an objective as a way to let a team operate against a strategic objective.
0:42:00.0 TW: Is there… On the leadership front, one more question.
0:42:02.8 LV: Oh, yeah.
0:42:04.4 TW: One, just one more.
0:42:06.3 LV: Okay, fine.
0:42:06.4 TW: ‘Cause the other I thought, there’s a… I’m thinking of the rounded versus square corners on a button, like the other case, where is there a leadership role, or is it a challenge, or is it a problem when teams wind up clicking along but they’re doing tiny little tests, they’re not taking bigger swings, and that’s partly because they can say, “Oh, I’m clear that if I improve this metric which we’ve plugged into the larger context, so let me do a little small things as opposed to,” and the organisation starts to lose a big idea of a mp3s versus CDs or a bigger idea, is that… One I guess, is that part of a healthy culture, that there is a range of big experiments and small and medium? And two, if it is, is that a leadership responsibility as well to say, “You’re giving me a… Great, you ran 50 experiments and they all seem tiny?”
0:43:10.4 LV: Yeah, the answer is yes and yes. I think definitely there needs to be a balance between big swings and optimisation. I think it’s a fallacy to say that you need only one or the other, you need both, especially for larger companies. And I do think it’s a role of leadership to make clear to different individuals or teams in your organisation, which one they are supposed to do. Is a team supposed to do optimisation or is a team supposed to take big swings? And I’m reminded of, we had a team way back when I joined Booking that was doing this within the team, because the way they were rewarded was for measured impact in experiments. And the PM realized that it’s like a fat-tailed function, where in a quarter if he spent two months doing micro-optimisations, he could hit the target from last month. And then the last month, he would go all out and try the wildest things, and the team loved it. ‘Cause it’ll be two months of tweaking the rounding of the buttons and making micro-optimisations to things that they knew they would work just to hit the budget, and then the last month would be like, anything goes.
0:44:21.7 LV: I was like, “This is genius,” PM figured this out. And then as the company grew, I think leadership also realised that they can do this at the organisational level as well, and so at some point they created an entire department that was just new stuff. And they said, “You don’t get to do any micro-optimisations. Any feature that you work on, you need to double your user base in a year or it’s gone, right?” And so that clearly sets the boundaries for those teams to say well, “Yeah, you don’t get to do any micro-optimisations.” By the time that the product is mature enough that it can actually, will benefit from micro-optimisations, that’s when you throw it over the fence to the rest of the organisation, and they know how to do micro-optimisation, but that’s something that leadership can really drive, yeah.
0:45:03.2 VK: So I’m stepping in for Moe, I feel like it’s only fair that I get to ask.
0:45:07.7 LV: Oh, boy.
0:45:08.7 VK: Sneak in one more last question. I think it should be quick though. So Lukas, not everyone will have the ability to interview with senior leadership when they’re looking for their next role within experimentation or CRO. So do you have any advice on questions that you might ask to kind of get a sense for what it would feel like boots on the ground in that role and how leadership may or may not adopt this or embrace it? What could you do? What are some questions or things that you could pose to get a sense of that when you’re in the interview phase?
0:45:39.5 LV: But you’re asking from the point of view of someone who wants to interview at a company to become part of the company?
0:45:44.5 VK: Yes.
0:45:45.4 LV: Interesting. I guess it would be something along the lines of, what bets is your leadership making or what was the last bet that they made that failed and how did they… How is this communicated? Is it something that was shoved under the rug and we’re like, “Nah, let’s forget about project Phoenix.” Or was it something that was celebrated? Was it, well, we tried this thing, it totally collapsed and fell flat, but at least we tried. And I think that maybe would be a way to get a sense of how open is leadership in this company to sharing their own vulnerabilities and the fact that they don’t know. This is a tough question again.
0:46:24.0 VK: No, that’s good. No, I think that that’s telling ’cause it’s very much turning around. Like, tell me about a time when senior leadership… [laughter] No, that’s great. I love that. Thank you.
0:46:32.6 LV: Can you give me an example of when… Yeah, exactly. Or you could ask them like, tell me of a time that you disagreed with one of your superiors and how did they respond? I think that would be a similar question.
0:46:47.0 MH: ‘Cause I intend to disagree a lot here if I come on board. No. All right.
0:46:51.9 LV: Really? I haven’t seen any disagreement here. We all seem to be very much in the… On the other hand, you had a good list. You had a good list. But it’s not very smart to put it in Tim’s… I don’t know but that was your words.
0:47:05.5 TW: That was Michael. [laughter]
0:47:06.9 LV: Yeah. It wasn’t very actionable. I have to ask, if people listen to this podcast and they’re like, “I wanna change my organization to be more experimentation driven,” can they take that list and actually make changes? ‘Cause I am still not convinced that that is even possible. This is one of the reasons, again, that I switch companies is I wanna know, can it be done?
0:47:28.5 MH: That’s episode number two with Lukas Vermeer coming up sometime soon. [laughter] But actually you bring up… So this is a whole other topic Lukas, that I would be dying to talk about with you because I think the exact same thing about data culture, but we certainly have not the room to cover it today. All right. We do have to wrap up. What an amazing conversation. Thank you Lukas so much. This is…
0:47:54.9 LV: Thank you.
0:47:55.5 MH: Incredible and delightful and I’m super stoked we completely changed how we even ran the podcast today, so I’m just delighted beyond measure. But one thing we’re gonna keep the same is we wanna share our last call, something that we’ve recently come across that might be of interest to our listeners and it could be on any topic. But Lukas, you’re our guest. Do you have a last call you’d like to share? Or if you wanna think about it, I could go to somebody else.
0:48:22.6 LV: Oh, boy. I completely forgot about this one.
0:48:24.1 MH: It’s okay.
0:48:24.9 LV: I forgot you were gonna ask me this.
0:48:26.2 MH: I can go back. I can go to Val or Tim first if you want a second.
0:48:29.5 TW: Yeah. This is what we normally cover before we start rolling.
0:48:32.4 MH: Yeah. Yeah. These are the things like, okay. So optimizing our process. We made a big swing today.
0:48:38.9 TW: This is why I’ve been stressed through the whole thing. We didn’t go through the checklist.
0:48:43.8 MH: Somehow, it magically worked out but…
0:48:45.3 TW: Good job, Michael.
0:48:46.4 MH: All right. Tim, why don’t we start with you so you can show us all how it’s done.
0:48:50.2 TW: ‘Cause you’ll know it’ll be long winded.
0:48:51.6 MH: Yeah. What about you, Tim? What’s your last call?
0:48:55.0 TW: So I am going to plug the… As I do every year, I’ll plug the Data Connect Conference in Columbus, Ohio, July 20th and 21st. All dudes are welcome, but none of them will be on the stage. So it is a conference, 100% female and underrepresented speakers, and it’s a great conference. So check out dataconnectconf.com. Columbus is beautiful in the middle of July. And then my deeper read, which you don’t have to travel for, is the legendary Stephen Wolfram wrote a very, very long piece on what is ChatGPT doing and why does it work? I cannot pretend to fully understand it. I cannot pretend to have completed reading it yet, but I’m kind of taking it in chunks and it’s pretty interesting. So if you’re kind of curious about how large language models actually kind of work. He uses GPT-2 kind of more as his example. But I think as people run around and say, “Oh, ChatGPT, it sounds human, but it may not be accurate.” I think it provides a lot of, if you wanna be the one who has a little bit of a deeper understanding of LLMs, it’s a very, very well written article on the subject.
0:50:08.2 MH: Outstanding. All right. What about you, Val? What’s your last call?
0:50:13.8 VK: My last call is a little bit of a twofold because I was worried that Lukas might take one of them. So I came up with a little bit of a part two of backup, but Kevin Anderson, who works with Lukas at Vista, has the Experimental Mind Substack, which is always chock full of great content. At the bottom of each of those, there’s always some recently posted job on experimentationjobs.com. So if you didn’t know that exists, definitely check that out. But there was the headline post from this week was, How to Apply the Scientific Method to Startups Without Being a Zealot by Ben Yoskovitz. And it’s an absolute gem, and it just pulls so many great parts together about starting with assumptions and those are your hypotheses and applying some design thinking framework for organizing and prioritizing the ideas. And he’s very much talking about applying this to a startup, but I could absolutely see this being like an intrapreneur kind of initiative or effort, and especially if you get to work with a product team. So, really loved it. He had some great assets linked off of that, so definitely check that one out.
0:51:12.8 MH: Awesome. Okay, Lukas, how about you? What’s your last call or multiple now that you see it’s possible.
0:51:19.9 LV: Okay, now I can do more than that, that I didn’t know that was allowed. So ChatGPT actually reminded me of I think it was Scott Alexander, I can’t find it now, but there was this great post on alignment problems, which is how do we get AIs to do what we want, which is closely aligned to the discussion that we were having earlier, like, how do you know that the AI is actually doing what we want? And that is going to be very much a problem when the AIs get smarter than us. And so we wanna make sure that we solve this problem before they get smarter than us. And so a bunch of researchers tried to get an AI to finish prompts for fiction books that were always non-violent. And so they would get to the character to say, and he pulled out his gun and then…
0:52:04.9 LV: And then the ChatGPT has to finish it in some non-violent way, and spoiler alert, it’s actually very difficult to train an AI to always not use violence in any event, but it’s a good model for how can we figure out whether an AI has learned the right thing or not, so that’s a… I think it was Astral Codex that post. You can add it to the show notes.
0:52:27.9 LV: The other thing I was actually thinking, let me just throw in a random something, working from home is awesome ’cause I get to tinker more, and so recently I’ve been playing around with making my own t-shirts, which is fascinating, creating something from fabric. But also with my aquarium, and so I bought this nice device, which is called a Seneye, S-E-N-E-Y-E. And you plug it in your aquarium, and it gives you real-time temperature, pH and NH4 readings, which has taught me a bunch about what happens when I feed the fish, what happens when I throw in carbon dioxide that I have in a bottle here. And so just the idea of getting immediate feedback on a living system like this, to help you understand that what you’re doing has impact on it, has once again driven home to me the importance of getting feedback, ’cause I’ve been tinkering with this for a year, and I had no idea that sunlight affects the levels of pH. I learned.
0:53:30.1 TW: It’s also cool to have an aquarium in the background of your very cool office set up, so it’s like if you see the fish moving around you’re like, “Either that person has a really cool fancy virtual background with movement in it or that’s their real background.”
0:53:46.3 LV: I actually do want a much more big aquarium so that I can do wall to wall on my screen. That’d be even cooler.
0:53:54.2 VK: That’s a very legit background. You got plants, aquarium, records, books. You really checked all the boxes.
0:54:00.6 LV: I’m a legit YouTuber, man.
0:54:04.3 VK: It’s great.
0:54:06.5 LV: I need more LED lights, I think.
0:54:08.3 MH: Oh yeah, a couple more LEDs or wall-based lights.
0:54:12.5 LV: Also better camera.
0:54:15.9 MH: No one on the podcast knows what you’re talking about.
0:54:17.1 TW: Michael.
0:54:18.7 MH: All right.
0:54:20.6 TW: Michael, what’s your last call?
0:54:22.2 MH: Well, thanks for asking, Tim. So recently, a couple of people that I enjoy reading their material who are working in the media mix modeling space, which is Michael Kaminski and Mike Taylor came out with an ebook about operationalizing media mix modeling for modern brands and I have started reading it and I really have been enjoying it, it’s just a great breakdown of the topic, and I just think in this world as we are trying to measure media and being thoughtful about it in better ways, seeing media mix modeling re-emerge as something that’s accessible and useful in marketing again, is a good thing. So, you can check it out.
0:55:00.2 TW: I’ll also say that Michael Kaminski is very good about not saying, “This is the one and only approach,” he tends to be like, “Look. These are the different pillars,” he has his four pillars, which is always refreshing to not have somebody say, “This is my hammer, and here’s why all the world is a nail.” ‘Cause Michael Kaminski will talk about randomized controlled trials as well, he won’t say that it’s MMM or die.
0:55:26.5 MH: All right, well, I’m sure as you’ve been listening, you’ve been thinking, wow, I’ve got some thoughts on this topic, and we’d love to hear from you, and it’s really easy to talk to us. You can reach us at our Twitter account or on the Measure Slack group or on our LinkedIn page, and we would… We’d love to hear from you either on this topic or any other that you’d like for us to cover on the show. Val, thank you again for stepping in and being our guest co-host on this episode, really appreciate you and your voice.
0:55:54.7 VK: Yeah, happy to be here. This is fun.
0:55:56.4 MH: And also no show would be complete without a big thank you to Josh Crowhurst, our producer who makes all these things possible, including all the little edits and cuts that we’re gonna do to make this show do the thing it’s gonna do, which I think is awesome by the way. It’s always pretty awesome. And once again, Lukas, thank you for taking the time to come on the show. It’s been awesome to get to know you a little better and to hear you talk about and reflect on this topic.
0:56:24.7 LV: Thanks for having me. And thanks for letting me ramble.
0:56:28.5 MH: No, it’s the best part of a podcast is rambling is totally allowed. Tim and I’ve been doing it for seven years now, so yeah. [laughter]
0:56:38.0 LV: If only we were doing this as a blog post then they would be more coherent.
0:56:41.4 MH: Yeah, but much more difficult to pull off.
0:56:47.3 LV: What metrics are you even optimizing for with this blog post?
0:56:50.7 MH: Or this podcast even. All right, well, I know I speak for both of my co-hosts today when I say, no matter what your culture of experimentation, remember, keep analysing.
0:57:04.4 Announcer: Thanks for listening. Let’s keep the conversation going with your comments, suggestions and questions on Twitter at @analyticshour, on the web, at analyticshour.io. Our LinkedIn group and the Measure chat Slack group. Music for the podcast by Josh Crowhurst.
0:57:22.7 Charles Barkley: So smart guys want to fit in, so they made up a term called analytics. Analytics don’t work.
0:57:29.3 Kamala Harris: I love Venn diagrams, it’s just something about those three circles and the analysis about where there is the intersection, right?
0:57:39.7 TW: I’ve had that happen a few times where somebody was like, “Well, what Tim would say about this is blah, blah, blah,” and I’m like, “Oh my… No, no, you did not understand what I said, and now you’ve interpreted as something completely different and yeah.”
0:57:56.4 LV: Also, it’s an argument from authority, so it’s completely devoid of support.
0:58:01.2 MH: I just always change what I say to different audiences so no one can really pin me down. That’s what I do.
0:58:06.9 LV: Smart.
0:58:07.7 MH: Who have you talked to last?
0:58:09.1 VK: Keep ’em guessing.
0:58:10.1 LV: How’s that working out for you?
0:58:11.8 MH: Well, it’s a web of deceit that’s gonna catch up to me some day, but…
0:58:15.4 LV: And does your wife listen to the podcast?
0:58:19.4 TW: Good thing you’re not recording stuff that you’re saying.
0:58:23.8 MH: Lukas, this podcast is a direct result of a proposal by another friend of ours who said we should all write a blog together, and we were like, “Oh, blogging is so difficult, why don’t we just record ourselves talking and then do that. It’s so much easier.” It turns out it’s about the same amount of work.
0:58:46.7 TW: Let’s not put the we in that, that was… That would be Michael said.
0:58:48.1 MH: Okay, I said. I said let’s just record ourselves talking about it ’cause I didn’t wanna have to try to write blog posts.
0:58:55.9 TW: I said it doesn’t matter. This will be a flash in the pan. It won’t go anywhere. Whatever. Sure.
0:59:01.5 MH: Now look at us.
0:59:03.1 VK: We got you now.
0:59:03.8 MH: We’ve got Val Kroll on the podcast.
0:59:07.0 TW: That’s right.
0:59:07.6 VK: I have to say though, that the outro, the keep analyzing to hear that without the music in the background, kind of a weird experience. [laughter] Kind of a weird experience ’cause I expected the voice to be like, “Thanks so much for listening. Let’s keep the conversation going.”
0:59:25.9 LV: Can this AI save teenage spy Alex Rider from a terrible fate? “No,” cried the villain, “you will never take me alive.” He raised his gun, fired and then… It was all a dream.
0:59:40.2 VK: Oh, it was all a dream. [laughter]
0:59:43.0 LV: Good job, AI. Well done.
0:59:45.6 MH: That’s a little bit too convenient.
0:59:47.7 TW: That’s the biggest hack, the biggest writerly hack ever. And then they woke. I can’t resolve this.
0:59:53.5 MH: That’s right.
0:59:54.2 LV: Luckily, the gun was out of bullets.
0:59:56.1 TW: ChatGPT is a hack.
1:00:01.9 TW: Rock flag and fat posteriors.