#221: Causal Inference Revisited (...DAGnabbit!) with DJ Rich

What causes us to keep returning to the topic of causal inference on this show? DAG if we know! Whether or not you’re familiar with directed acyclic graphs (or… DAGs) in the context of causal inference, this episode is likely for you! DJ Rich, a data scientist at Lyft, joined us to discuss causality—why it matters, why it’s tricky, and what happens when you tackle causally modelling the complexity of a large-scale, two-sided market!

Links to Items Mentioned in the Show


Episode Transcript

0:00:05.8 Announcer: Welcome to the Analytics Power Hour, analytics topics covered conversationally and sometimes with explicit language. Here are your hosts, Moe, Michael, and Tim.

0:00:22.6 Michael Helbling: Hey, everybody. Welcome. It’s the Analytics Power Hour, and this is episode 221. I’ve noticed any time we’re gonna talk about maybe like a more technical topic, I’m always tempted to make some kind of like little joke about it, but not this time. This time we’re all business. This episode, we’re diving into causal inference again, because if we aren’t talking about culture… Did you hear that thunder crash? That was a sign.

0:00:48.7 Tim Wilson: Yes.

0:00:48.8 Moe Kiss: Oh yeah.

0:00:49.0 MH: That was awesome.

0:00:49.7 TW: Yeah.

[laughter]

0:00:49.7 MH: Nice. Don’t go on.

[overlapping conversation]

0:00:53.8 MH: Hear the organ music.

0:00:54.7 TW: It’s like you have one of the boys standing with those sheets of tin off to the side rattling the…

0:01:00.8 MH: That’s just the thunderstorm from Atlanta, Georgia, commenting on this episode. All right. No, so we are gonna talk about it ’cause if we’re not talking about culture on this podcast, we’re probably talking about causal inference. Tim, I know you’re excited about this topic, but how are you doing?

0:01:18.4 TW: Good. But I don’t think it’s technical. I mean, it could be technical, but it’s conceptual.

0:01:22.5 MH: Awesome.

0:01:22.6 MK: It’s technical.

0:01:23.3 TW: I think that was the technical. It’s technical.

0:01:25.2 MH: It’s conceptually technical. So anyways, Moe, welcome.

0:01:30.3 MK: Hi.

0:01:30.5 MH: How are you going?

0:01:32.0 MK: I’m going all right. A bit slow.

0:01:34.1 MH: Alright, mate.

0:01:34.8 MK: But I’m generally pretty good.

0:01:37.5 MH: Awesome. Well, and I’m Michael, and we needed a guest. We wanted someone who could bring some deeper insight to this topic, and with a big hello and a shout-out to the entire data science team at Lyft. Hello. Shout-out to you, all of you. I’m excited to welcome DJ Rich. DJ started his career in data science building models in the financial industry, and over time has migrated to Lyft, where he is a data scientist. He also has a really excellent YouTube channel called Mutual Information, which I encourage you to check out, where he covers data science topics. Welcome to the show, DJ.

0:02:10.1 DJ Rich: Hello. Thanks for having me.

0:02:11.7 MH: And the thunder is just gonna keep rolling on my side, but once we get going, I can mute my mic.

0:02:17.3 DR: I think it’s an omen for the podcast.

0:02:18.7 MH: Anyways, it’s awesome to have you. What’s crazy is I looked at the weather an hour ago and there was nothing on the weather, and now it’s a thunderstorm. Anyways.

0:02:29.2 MK: Helb, I genuinely believed you did that for dramatic effect. Like I did not realize there was a storm in the background.

0:02:37.1 MH: It’s a real… No, all of nature is excited about this topic.

0:02:42.0 MK: Apparently.

0:02:43.1 TW: Yeah, it is causing disturbances in the atmosphere.

0:02:45.6 MH: That’s right.

0:02:46.1 TW: That feels about right.

0:02:48.3 MH: That’s causal inference for you. All right. See, I said I wasn’t gonna make jokes about it, and then here we go. All right, so why don’t we… Maybe just to get us kicked off here, DJ, maybe just give us a little bit of your background, some of the things you’ve done in your career, and then we’ll kind of jump into some of the topical stuff and get into some of the discussion, but I think that might be a good kickoff point and help people get to know you a little better as well.

0:03:13.6 DR: Sure. So I have a master’s degree in financial engineering from Berkeley. I finished that in 2014. Financial engineering is… To put it overly simplistically, it’s basically statistics plus traditional finance in a sense. You spend a lot of time basically figuring out how to put together derivatives and then pricing those derivatives. So kind of originated on sell-side investment firms trying to figure out fancy packages of options that they could sell off to people. But then it’s evolved a lot. Finance and hedge funds have gotten a lot bigger, so they do everything under the sun.

0:03:47.0 DR: So data science is a natural outgrowth of it. So I finished around 2014, worked in the hedge fund industry for three years, part of a multi-strat hedge fund that is now defunct, but I had a good ride while I was there, and I was basically doing like loan analysis, basically trying to cherry-pick loans to come up with a healthy fixed income portfolio that you could leverage up, at the same time playing with other equity strategies that were part of the company. I did that for a few years, then briefly was in the lending industry. And then in 2019, I moved over to Lyft and I’ve been there ever since basically doing causal inference and causal forecasting, trying to help them manage that really complicated market that I’ve come to love and hate. So yeah.

0:04:31.6 TW: So you made a… I think we were chatting a little bit before kind of in some of the notes that you did make sort of a… When it comes to causality, you sort of drew a distinction between the… There was sort of a profound shift because with the financial markets, the levers you can pull, you’re trying to understand all these things you can’t control at Lyft. Not that you don’t have macro forces and pandemics and other things operating on it as well, but you have a system where you actually are pulling the levers. So I wonder, is it… Was that kind of a conscious shift or I mean, were you making that move? You’re like, “Oh, I’m going from trying to predict to basically try to affect.”

0:05:15.7 DR: I had a… In the graduate program, they had classes on causality, but it wasn’t front and center. When you’re in the financial industry, everything is about prediction, and that’s because it’s often fair to assume a very simple causal structure, which is that the decisions you make have no impact on the things that you observe, kind of. Like you can basically take stock returns as having nothing to do with you. And that’s a really easy causal environment to deal with because you don’t have to disentangle your actions from the things that happened.

0:05:48.0 DR: And so the analogy that I’ve heard people use is finance is kind of like astronomy. That’s another environment where it’s still horribly complex and extremely difficult, but it’s extremely safe to imagine that the universe that you’re observing in the sky has nothing to do with what you’re doing on the earth. And so there’s kind of a luxury scientifically there. So what it means is if you could find like time lagged correlations in finance, those things are as good as money. So if you can find anything that you can observe right now that will correlate to the future, then you can structure bets around that. But that itself is also extremely hard. And because everyone is trying to do it simultaneously, you get these really small correlations. Like a 0.03 correlation is a huge deal. And so that’s why you’re just in this constant cloud of data. So yeah, causality isn’t really a problem, but massive noise is the problem, so yeah.

0:06:45.2 MK: ‘Cause I was about to ask, what makes it so complex, and is it just those macro factors and the amount of noise, like you said?

0:06:55.2 DR: Yeah. The amount of noise is one thing. Another thing is the non-stationarity of the market. So the analogy that I like to think of is imagine every stock return every day is a sample from a multivariate normal distribution. And if everything were easy, that would be the same multivariate normal distribution that you’d be sampling from. And then you just have to estimate the mean and covariance from those samples. That is about as easy as a data science task comes. But it becomes a lot harder if every single sample comes from an entirely different multivariate normal with a different set of parameters. And let’s say those parameters come from no other simple distribution; that’s just an impossible task. You may just be completely out of luck there. So the reality is somewhere between those two extremes, where in one sense you’re estimating from some true normal distribution, and in another sense you’re just sampling from a new distribution every single time. That’s basically what’s known as non-stationarity, or I think in machine learning they talk more about out-of-distribution… No, oh dear, now I’m blanking on the technical term.
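To make DJ’s analogy concrete, here’s a minimal sketch (invented numbers, nothing from the episode) contrasting returns sampled from one fixed multivariate normal with returns whose mean drifts a little every day; the plain sample mean is a fine estimator in the first world and quietly misleading in the second:

```python
import numpy as np

rng = np.random.default_rng(0)
n_days, n_stocks = 250, 3

# Stationary world: every day's returns come from the SAME distribution.
true_mean = np.array([0.001, 0.0005, 0.0])
true_cov = np.diag([0.02, 0.015, 0.01]) ** 2
stationary = rng.multivariate_normal(true_mean, true_cov, size=n_days)

# Non-stationary world: the mean drifts a little every day,
# so each sample comes from a slightly different distribution.
drifting_means = true_mean + np.cumsum(
    rng.normal(0, 0.0005, size=(n_days, n_stocks)), axis=0
)
nonstationary = np.array(
    [rng.multivariate_normal(m, true_cov) for m in drifting_means]
)

# The sample mean is a good estimate in the first case, not the second.
print("stationary estimate:    ", stationary.mean(axis=0).round(4))
print("non-stationary estimate:", nonstationary.mean(axis=0).round(4))
print("where the mean ended up: ", drifting_means[-1].round(4))
```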

[laughter]

0:08:01.4 TW: Well, I just know that outside of causality, non-stationarity in time series data is like my second favorite topic of late, and just all the quirks of when you have non-stationary data. I feel like sometimes we dabble in statistics and all of a sudden we assume some static normal distribution. And it’s like, wait, but you’re looking at data over time. And for those of us who come from a digital analytics background, all of the data is time series data, and often that means it is non-stationary.

0:08:37.8 MK: Okay. Before we get stuck into the technical stuff, what I really wanna understand though, what made you move to Lyft? Like was it that… Was it about the problem space or ride-sharing as a concept? Or was it like the ability to really delve into the technical? Because it does seem like a very… Obviously a lot of the skills are transferable, but it does seem like a very different problem space to where you were working before.

0:09:07.4 DR: Yeah. I just thought that ultimately that was where the more interesting and applied science was. I think the thing that turned me off from the financial industry is that the data is constantly fighting you. It doesn’t wanna be figured out because for every time someone else figures it out, that works against you. So if you’re… You don’t need to just have an insight in it. You have to have the first insight into it to… So it’s just… It’s this constant moving target and this is well-known. It’s also why a lot of hedge funds underperform and it’s why there’s really good advice to just put your money in an index fund and that’s pretty much just as good quite often. And so I just thought that was a really unromantic and uncool way to spend your time. Also, I just kind of morally got turned off from the industry.

0:09:53.1 DR: We really were… It is a little bit of a cliche, but we really were just like dealing with very rich people and trying to make them more money. And from a distance, it looked cool ’cause it looked like a fun game, but once you got in there, you’re like, “Okay, this is actually pretty unfulfilling.” So you… I wanted to be part of products that were actually changing how people do things and had an impact on people’s lives and all sorts of stuff. So I just moved over into tech because it seemed more appealing. And I also needed to take kind of a… They don’t just allow you to go from deep within the hedge fund space to anywhere within tech. So I had to find a middle ground. And Lyft is… It’s very econometrics based. It’s a lot… It’s very causal forecasting. It’s very causal inference based and they live and die by their scientists. So they’re more traditional in the types of science that they’re doing and it’s very important for their business that they do it well. So that looked exactly like the makeup that was most attractive and I knew that if I was in that environment trying to help them grow the business, you’d have to learn really good science and that did turn out to be true. So it’s just kind of appealing.

0:10:57.8 TW: What was kind of the ask? When you came in, what was the here’s your task? I always kind of wonder, what’s the data scientist they…

0:11:07.8 MK: A problem to solve.

0:11:08.9 DR: So that actually reminds me of one of my decision variables for choosing between these places, is I was interviewing at a lot of tech companies and they all had different interview processes, and Lyft’s interview process was the most challenging and interesting. They wanted you to know so much statistics and machine learning and software engineering. So they had you just do this pretty aggressive take-home assignment for… They tell you to spend four hours on it and pretty much everyone spends 24 hours on it. The whiteboarding challenges were respectable. I felt pushed, which was good. And then their interviews for stats and machine learning were just all the textbook information. Like you had to have read a lot to have survived some of those interviews, at least when I did it in 2014.

0:11:54.4 DR: The interviews have changed just since then, since the competitive landscape has changed. So I remember being like, this is where it’s at. Also looking at some of the people who were at the company, they had… I mean, there are serious scientists all over the place, but at least my crawling of their LinkedIn revealed that I think I thought they had some serious scientists over there, so I just wanted to be part of that boat. Okay. But the second question about what type of problem are you actually solving when you first show up? That varies obviously, but I’d say a typical case is we’re trying to figure out how to make good decisions. So there’s a thousand different decisions that we need to make. For example, how old should we allow the cars to be for our drivers to use? So when people are applying to be a driver for Lyft, they say, “My car is from 2002 or 2013.”

0:12:41.1 DR: And that is something that when Lyft first came around, someone just pulled a number out of the air and said, “Okay, no more than five years old ’cause we wanna be a luxury product.” And then they realized that that’s too restrictive and then no one really explores that. Then someone shows up and decides to explore that really in depth. And then you find that just that one dimension has a huge impact on the business. Like you get an entirely new group of people who are now available to give rides, and now you need to make that decision well, so someone will do an analysis to figure that out, and you’ll also discover that no one ran an experiment on how to do this accurately. So you kind of just need to be clever with forming your best estimate, and so good causal inference people are useful on that front.

0:13:24.0 TW: So you’ve got the, call it, the good causal inference people. And it sounds like culturally the company… I mean, pushing that example, which I’m assuming it’s relatively fictitious and we’re not gonna do anything proprietary, do you run into the challenge where somebody says, “Hey, we just picked a number. I have a hypothesis, or I have an idea that this may matter and we need the sweet spot. Hey, data scientist, work your magic and give me the optimal car age.” Like do you run into where expectations are, “We’re Lyft. We have all of this data. Therefore you should be able to give me a specific and precise answer.” I suspect the reality is like, “Well, the age of the car you’re… ” There’s not… There’s still gonna be uncertainty. It’s not like there’s a perfect number and it may vary by where and model and make. So do you still run into that challenge that there are people saying, “I have a… I just wanna know what age of the car. Do your magic.” And you have to actually do some education and wind up giving a less perfect answer than they would like to have in an omnipotent world?

0:14:44.0 DR: Yeah, so I think Lyft is… And again, a lot of tech companies are like this, but Lyft is a very science-first company. It’s not that hard to communicate a pretty technical idea across the whole company, which is useful because some insights and some ways to manage the business are pretty technical. When it comes to something like, yeah, just as an example, picking the age of a car, I think maybe talking about how that would actually look might reveal some details here. Some person would just guess that it is useful to do that. Someone would literally just find out like, how old are these cars? Maybe by having a conversation with a driver, they’ll be like, “Oh, I’ve never seen a car that’s from before the year 2013. That’s a little weird.” And so they’ll just look as to whether we have data on that. They’ll find that we do, and they may see a hard cutoff in the data. And this is under a culture which allows data scientists to just think on their own and investigate things.

0:15:39.3 DR: And that culture’s useful there. And then they go on and they have to make a recommendation. They have to say, these cars are way too young. They could be older and we’d be fine. And maybe they’d show a graph that shows something that we care about trending up until 2013 and then getting cut off. And they’d say, if we just let this curve continue, we might do a lot better. And so they would make that argument. A bunch of people would get convinced by that. And then a natural follow-up would be everyone would be skeptical of the insight and then suggest that, “Okay, you could do this, but let’s start with an experiment. So let’s roll it out in this small region and let’s try loosening it and let’s just see what happens there.” And that’s not an uncommon path for a data scientist to take to have an impact. It could be kind of slow. So a data scientist might be better off if they were more aggressive in their recommendations and could just be more convincing and then have a stronger rollout and whatnot. But I’d say that’s like a typical individualist approach to data science at Lyft.

0:16:39.5 TW: So now I’ve got another question putting… I mean, car age being one, ’cause you talk about experiments. I mean, experiments are kind of the gold standard of causality. Gold, silver, whatever, that’s… They’re better than non-experiments. But experiments, you can’t experiment on everything. How do you… In a case where your experiment could be basically driver eligibility versus not, if you needed to run an experiment for X period of time and you’re allowing drivers in, does that mean you’re like, “Well, we’re just gonna let them stay in as part of the experiment.” We have the data, but we can’t really knock the drivers back out if the experiment fails, and maybe that’s we don’t wanna go down that path. It’s getting too much into the operations of Lyft. It’s just… I’m thinking of like experiments where you’re just experimenting with the experience. A customer experience thing is one thing. If it’s an experiment of eligibility or something, do you wind up in a murkier… Is that a reason that you say maybe we don’t wanna run that experiment?

0:17:49.4 DR: No, that’s a great point. Again, that’s a symptom of this just being something that I made up.

0:17:54.5 TW: Yeah, yeah. We’re kinda revealing.

0:17:57.5 DR: So I would imagine… Yeah, I’m not actually running experimental design review right now. So in a lot of cases, you would run against that guardrail, where someone would be like, “We can’t just be knocking around this constraint on our drivers, so we’re not gonna experiment with that.” So there are things that we’re comfortable wiggling and then things that we know we can’t. I don’t know where car age would fall. I would imagine you may not run it like a typical experiment where in the same region you’re thrashing around the required age of the car, but at a whole-region level, that might be fine. We’d just say, “Okay, in California, we’re gonna allow the cars to be as old as 2008.” I couldn’t imagine not running into that problem. And that’s an experiment of sorts. You can’t make as strong causal statements when running something like that. But you could sort of see your trajectory if you were to roll it out to the rest of the country. And that’s really all you care about ultimately. So yeah.

0:18:53.0 TW: So I’m gonna work this into the directed acyclic graphs, ’cause I really wanna use that in the title of the show. But so you have a two-part series where you talk about causality at a broader level. And even with that one example, there’s this causal link, or there is a causal link presumed by someone, between the age of the car and revenue or total rides or something. And you talk about directed acyclic graphs or DAGs, which I think some listeners will be like, “I totally know what those are,” and many listeners will be kind of like, “I’ve been… What the heck is that?” But that’s kind of at the core of this. There is a thinking there that you’re looking for a cause and effect relationship. Can you talk about that a little bit?

0:19:47.0 DR: Sure. Yeah, yeah. So DAGs. Okay, so DAGs show up a lot. They show up in graph theory, they show up in causal inference, and they show up in computer science. They’re all over the place. I think for DAGs, when it comes to causal inference, we’re talking about ways to constrain models. So let me back up before I get entirely into that. So at the end of the day, what you really want is a causal model. So what that means is it’s gonna be something which maps from the decisions you make to the outcomes you observe. So the decisions you make are gonna be the things that you can control, and the outcomes you observe are the things that you care about. Typically, this all goes through things that we understand as well, but the things that you might care about are like total rides or bookings or something downstream of all of your decisions.

0:20:35.2 DR: And a causal model won’t just tell you what you’re likely to get for those observables according to the status quo operations. It’s also supposed to tell you what would happen if you did a variety of other decisions. And that’s what makes it causal. So a causal model is strictly like a bigger thing than just a forecasting model, which will just tell you what the future will look like if things continue as they have been. And so you have… In doing that, in having that large model in front of you, it becomes extremely hard to estimate, because it’s so big in a sense. It’s hard to put into words, but let’s say you have just N variables, and a causal model is really gonna have to talk about how they all relate to each other. And so you have to make this space of potential models that you could be dealing with much smaller so that you can estimate it. And it turns out that talking in terms of DAGs is a great way to constrain the models that you’d like to actually work with.

0:21:36.3 DR: And they’re useful in a couple ways. The first way is when someone writes down a DAG, they are declaring, I think that this set of variables causes this other set of variables, which causes this other set of variables. And when they do that, they’ve really constrained the set of models that they’re going to estimate. So in one case, it’s good for really collapsing the space of models that you’re trying to estimate. But another way is that it’s a great communication device. I can just take my DAG, and I can show someone else, and I say, I think the business works this way. I think that pricing impacts this, which impacts this, which impacts… And then someone else can say, “No, I don’t really think so. I think it goes the other way.” So it’s another good vessel for you to collect consensus. And so people can take issue at a small level, looking at individual edges and saying, no, these edges are wrong, or this whole thing is wrong. The criticism of the DAG is kind of always there, but the DAG is really useful for, again, collecting consensus.
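To make the “writing down a DAG” step concrete, here’s a minimal sketch of what a declared DAG can look like in code; the node names and edges are a made-up toy marketplace, not Lyft’s actual model:

```python
import networkx as nx

# A toy marketplace DAG: each edge is a declared assumption
# of the form "this causes that".
edges = [
    ("price", "conversion"),
    ("conversion", "rides"),
    ("driver_incentives", "active_drivers"),
    ("active_drivers", "wait_time"),
    ("wait_time", "conversion"),
    ("rides", "bookings"),
]

dag = nx.DiGraph(edges)

# The 'A' in DAG: the declared assumptions must not form a loop.
assert nx.is_directed_acyclic_graph(dag), "These assumptions contain a cycle"

# Handy for the communication side: what do we claim drives conversion?
print("Claimed causes of conversion:", list(dag.predecessors("conversion")))
```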

0:22:35.5 MK: Can I just ask one clarifying point? So I think the consensus side is really interesting and getting people in the business essentially to agree to an assumption, but the bit that I kind of struggle with is why is it useful to collapse down those assumptions to kind of have that one causal relationship or as it grows then many? But what’s the advantage of collapsing the models and restricting it?

0:23:03.4 DR: Yeah, so at the end of the day, you need a single function which maps from your decisions to your outcomes. And that is, in the space of all possible ways things could be, a very specific thing. So it is specific both in the functional form of the functions that are relating all the variables and in the specific parameters that you get when you fit those models. So there are two levels to move from the full world of potential models to a specific model that you’d like to actually fit and train. So let me give you an example. At Lyft, and this is not proprietary, but at Lyft and everywhere, people are estimating demand surfaces, so you care about, if you change prices, how much are people gonna become less likely to use your product. So you’re constantly estimating these surfaces. And what you’re immediately likely to do is you’re gonna wanna get those into the model, because they’re kinda like your gold standard for causality.

0:24:05.7 DR: And most people are gonna reject any model that doesn’t have those in there. So if I want to get that in, I’m going to say that prices cause your likelihood to use our product, and that’s a statement about causality, and I can encode that as an edge that goes from price to let’s say conversion probabilities. So that’s a line, and this is a statement about a part of a DAG. If someone else came back to me and they said, “Actually, your use of the ride causes the price,” that would be a different DAG. And I would say, I don’t like that because I can’t use my experimental results there. So the DAG gets informed by the experiments that you already have and basically people’s intuition for the business.
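For a concrete (and entirely made-up) picture of what that price-causes-conversion edge might look like once it’s estimated from experiments, here’s a toy demand curve; the logistic form, the reference price, and the elasticity value are illustrative assumptions, not anything from Lyft:

```python
import numpy as np

def conversion_probability(price, baseline=0.6, reference_price=12.0, elasticity=1.5):
    """Toy demand curve: higher price -> lower chance the rider converts.

    `elasticity` is the kind of quantity a pricing experiment would pin down;
    the functional form and every number here are invented for illustration.
    """
    # Log-price logistic curve centered on a reference price.
    logit = np.log(baseline / (1 - baseline)) - elasticity * np.log(price / reference_price)
    return 1 / (1 + np.exp(-logit))

for p in (8, 12, 16, 20):
    print(f"price ${p:>2}: conversion ~ {conversion_probability(p):.2f}")
```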

0:24:48.4 TW: So there are those two fundamental things that the D is for directed and that it’s nodes and lines and those arrows going one direction. And the second thing, and I think this was actually in your post, I was like, oh, is that there’s no loops. Like you can’t… You don’t kind of, oh, the price affects the ride and the ride affects the price, can’t… That’s like just a rule, right, that you can’t put it in a DAG, which again seems like handy from a communication and alignment. Or you’re gonna say, well, except there’s a PDAG or something and you’re about to…

0:25:25.3 DR: Yeah. But the PDAGs will show up. They’ll haunt you. But, yeah, so the DAG rules out certain models that you’d like to be able to use. And the unfortunate thing is the reality of the market that you’re managing does have those feedback loops. And those feedback loops are just really, really hard to estimate. But the reason you can’t have these loops in the model, at least according to the naive constructions, there are kind of ways to jam in these circular relations, but let’s forget that for now. The reason you can’t do that is because you can’t do a forward pass on a model like that. If you have a circle of nodes, you don’t know where to start. The theory doesn’t really know what to do with that. So you have to rule those out so that you can have this cascade of causal effects that go from the things that you can control to the things that you can observe and care about. And if it has to go in a circle at some point, it’s gonna get very confused.
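The “no loops” rule is exactly what makes a forward pass well defined: with no cycles you can put the nodes in a topological order and compute each one after its causes. A minimal sketch, reusing a slice of the toy DAG from above with made-up node functions (nothing here is Lyft’s actual model):

```python
import networkx as nx

edges = [("price", "conversion"), ("conversion", "rides"), ("rides", "bookings")]
dag = nx.DiGraph(edges)

# A topological order only exists because there are no cycles;
# with a price -> rides -> price loop there would be nowhere to start.
order = list(nx.topological_sort(dag))
print(order)  # ['price', 'conversion', 'rides', 'bookings']

# Forward pass: set the decision node, then fill in each downstream node
# from its parents. The node functions are stand-ins.
values = {"price": 14.0}
node_fns = {
    "conversion": lambda v: max(0.0, 1.2 - 0.05 * v["price"]),
    "rides": lambda v: 1_000_000 * v["conversion"],
    "bookings": lambda v: v["rides"] * v["price"],
}
for node in order:
    if node not in values:
        values[node] = node_fns[node](values)
print(values)
```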

0:26:18.5 MK: And so then, have I got it right? Once you have this DAG in place, you then set up experiments and whatnot, or use existing data, to try and verify it?

0:26:27.4 DR: Yeah. So what you do, really, or at least in our case, is you have a DAG, and the DAG on its own just gives you a bunch of this-causes-that relationships. But it’s still up to the modellers to determine exactly the functional form of those relationships. Are you gonna use a linear model, are you gonna use a Bernstein basis, are you gonna use a neural net, are you gonna use something crazy? And then, again, you have to figure out the specific parameters within that context. And so the experiments will just refine you on the parameters, once you’ve picked a DAG and a functional form. But what they won’t do necessarily is tell you that your DAG is bad.
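To ground the “pick a functional form, then let the experiments pin down the parameters” idea, here’s a hedged sketch: pretend a pricing experiment produced conversion rates at a few test prices, choose a linear form for that single price-to-conversion edge, and fit its two parameters. The data, the linear choice, and all the numbers are invented for illustration:

```python
import numpy as np

# Pretend experiment results: price points tested vs. observed conversion.
prices = np.array([8.0, 10.0, 12.0, 14.0, 16.0])
observed_conversion = np.array([0.71, 0.66, 0.60, 0.55, 0.49])

# Modeller's choice of functional form for the price -> conversion edge:
# a simple linear model, conversion ~ intercept + slope * price.
slope, intercept = np.polyfit(prices, observed_conversion, deg=1)
print(f"fitted edge: conversion ~ {intercept:.3f} + ({slope:.4f}) * price")

# The experiment refines these two numbers; it says nothing about whether
# "price causes conversion" was the right arrow to draw in the first place.
predicted = intercept + slope * prices
print("residuals:", np.round(observed_conversion - predicted, 3))
```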

0:27:11.0 DR: So back to the other example that I used earlier. We have experiments that tell us about demand surfaces. And then that makes someone want to recommend that pricing causes conversion. So they have that DAG. If someone else comes back and says, “I actually think conversion causes pricing,” I would look at them weird, but I would also say that none of my experiments on elasticity or the demand surfaces can refute your claim that conversion causes pricing. So it doesn’t make a statement about what the DAG should be. So the DAG really needs to be understood as just assumptions. Now, there are ways to validate DAGs and to figure out if they’re working. The first test is, are you predicting the future well? If you’re not, that’s an indication that maybe your DAG is wrong. But the true scientific ways of estimating DAGs are just really intractable, especially if you’re trying to model something as big and crazy as the Lyft business.

0:28:08.3 DR: So doing a search over DAG space is just prohibitively expensive and too complicated. And I think people have tried it. They’ve run sort of these assumption-less searches over DAGs, and it comes back with just a crazy web that makes no sense to you. So we’re not willing to look through all those DAGs. And then when you actually gain a consensus of everyone’s DAG, it looks very reasonable. I mean, it looks very complicated, but people start reasoning in terms of it. They’re like, “Okay, we have a demand side, we have this thing and that.” And if you see the DAG, it’s not that controversial, especially once you get familiar with it.

0:28:50.2 TW: But that assumption, the assumption seems so… I feel like I run into that a lot, that it’s like, I don’t wanna make assumptions. I want the data to tell me the truth.

0:29:05.3 MK: Or even just the squabbling that would happen about the assumptions. That’s the big…

0:29:09.0 TW: Well, but I wanna have the squabbling, I wanna have the squabbling about the assumptions ’cause to me, that drives…

0:29:12.3 MK: Yeah, true.

0:29:13.4 TW: Outside of DAGs, isn’t that where all exercises in causality… Don’t they need to start with somebody’s idea or belief or an assumption about causation, and then you can go and try to validate whether there’s causation? Or is that… That is probably me poorly articulating why I get so excited about the topic, ’cause I feel like there are a ton of analysts and definitely a lot of people in the business who are like, “No, no, no, no. The data is there to tell me.” Start with the data. And as you just described this, I don’t know what… This sounds horrible, this thing that’s gonna try to just magically mine the data and spit out a DAG.

0:30:02.7 MK: But you would still be kind of starting with the data, right? Because you would have prior experiments and results of like, we changed the price here and this increased that. When you form those assumptions, you would be relying on some data that you have, presumably.

0:30:17.7 TW: I think that can inform the assumption. But I think there’s… You’re rolling out a new product, you’re rolling out a new… You’ve always been the age of the car has never been allowed to be more than three years old. You have no data. The best you can do is plot a line and say, well, if that continued on… I think you can have…

0:30:35.2 MK: But that is data.

0:30:36.3 MH: Yeah, you’ve seen it. You can say you’ve seen some data, but I also… That’s the… Well, then if somebody says, “Well, wait a minute. We haven’t run enough… We don’t have enough data to inform our assumptions, so I guess we need to just gather data.” I would rather start with the pretend you know nothing, and then any data that you have that might help inform your assumption or your idea is just a gift as opposed to what I run into is the exact opposite a lot more, which is don’t ask me to make… Come up with any ideas. The data needs to tell me all. I don’t… I’m articulating it poorly.

0:31:16.5 DR: So this idea of you like the data to tell you everything, in some environments, you can do that. I think in these more classical machine learning environments with computer vision where you have great signal to noise ratios and a ridiculous amount of data, you can really allow the data to figure out every parameter that you want, and you don’t need to worry about things like identification because you just care about predictive performance. But causality is a much harder thing. It’s a much bigger ask to get a causal model than it is virtually anything else. It kind of reminds me of the separation between frequentist statistics and Bayesian statistics. Bayesian statistics makes you show up with your prior, and that makes people really mad.

0:31:58.4 TW: Duck.

0:32:00.9 DR: Yeah, exactly. They get upset over that. But it acknowledges that for any given problem, a good answer is going to involve information that didn’t come with the problem. And so I think if you try to ask like, I honestly think that the true causal mechanism isn’t in the data that we’ve collected at Lyft or at least at the level that you’re looking at. Maybe there’s something crazy if you look only at like event-level data and you do very clever things with extremely well-timed data where you can look at like… Maybe there’s something really remarkable there. But if you look at the aggregated data that we get where everything is at the weekly region level, and it’s just like very aggregated up and you measure the business on a bunch of different dimensions, I don’t think causality is in there. I think you really have to know about things that are only in people’s brains.

0:32:50.9 DR: For example, in that aggregated data there is no knowledge of the algorithms that you’re using to generate pricing or estimate ETAs. It’s not really aware of how Uber’s operating with respect to your business. Those are all things that exist only in people’s minds, and a good model would have that information. So you have to get the things in your brain into the model, and assumptions are like your vessel for doing that. But it is… Yeah, everyone has that intuition that you want the data to just tell you everything, but in certain environments, I just don’t think it’s possible.
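DJ’s comparison to Bayesian statistics can be made concrete with a tiny sketch: a prior on, say, a price elasticity is the “information that didn’t come with the problem,” and it gets blended with a noisy experimental estimate. All of the numbers below are invented:

```python
import numpy as np

# Prior belief about elasticity, from domain knowledge / past markets.
prior_mean, prior_var = -1.0, 0.25**2

# Noisy estimate from a small experiment in one region.
data_mean, data_var = -1.6, 0.50**2

# Conjugate normal-normal update: the posterior is a precision-weighted average.
post_var = 1 / (1 / prior_var + 1 / data_var)
post_mean = post_var * (prior_mean / prior_var + data_mean / data_var)

# The prior pulls the noisy -1.6 back toward -1.0; that's the "show up with
# your prior" step that frequentists object to and Bayesians insist on.
print(f"posterior elasticity ~ {post_mean:.2f} +/- {np.sqrt(post_var):.2f}")
```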

0:33:24.3 MK: So okay, I’m dying to understand how this works practically in terms of firstly, like the thing I love about this whole DAG concept and whatnot, it’s quite explainable, right? But I also imagine it gets… And you can have kind of like a data scientist own it, right? But then there’s gonna get to a level of complexity where potentially, and this is something I’m interested to understand how you guys are doing it at Lyft, whether different data scientists are owning different DAGs to then combine into your giant model, but also whether then you get to a point of complexity where like there are so many assumptions and DAGs and that benefit of explainability is lost.

0:34:08.8 DR: I’m glad you asked that. That is the thing I’m dealing with on such a regular basis now which is we’ve built this giant model and it’s had… I don’t know how many people, but I think it’s something like… I don’t know, probably 10 people have inserted their assumptions into the system, maybe a little bit less or a little bit more. And you have… Now we have this big model of the business, and then a new person shows up and they see all these assumptions of all these strangers to them, and they immediately take issue with the whole thing. They’re like, “I’ll just make a tiny little DAG that does everything that you want.” And they’re missing the history of why each assumption was added. And so we have to go through this period of training new people. It’s like the first part of training is to get them to accept the complexity of the existing model.

[laughter]

0:34:53.7 DR: And we tell them comforting facts like, “There have been other people who took the whole thing and then tried to reduce it and make it smaller, and that helped a little bit. But look at this problem that we had, and you can’t get rid of this thing because of that.” And then every now and then maybe as a therapy you say, “Go build your own little section and see if you can beat it and don’t use any complexity. See how far you get.” And so there’s always that tension because you look at it. We have this PDF that shows what this DAG looks like and it’s big. It’s like not fun. I used to trace through the whole DAG two years ago and I would no longer do that.

[laughter]

0:35:24.8 TW: Do you… I mean, if you’ve got a mix of assumptions and validated edges, like for every edge, or for some circled-off subset of edges and nodes, whether it’s with a model, whether it’s with an experiment, whether it’s some combination, is there any capture of “this edge is an assumption”? Logically, it makes sense. We don’t actually have any evidence. We haven’t put any effort into trying to actually validate it. It just seems logical and we haven’t quantified it. We just know there’s a relationship; we haven’t modeled that relationship. Like, do you capture that? How known are aspects of it?

0:36:16.4 DR: If you would look at the existing DAG, you wouldn’t go to the edges and say, “I don’t think this edge should be here.” If anything, you’d say, “I don’t think there are enough edges here. I bet the real complexity of the business is just bigger,” and that happens more frequently. Someone will name some recurrent effect that they think is definitely there, and they can’t see how the model captures it. The models that do exist are ones that we can estimate and that we’re reasonably sure of. So for example, you’ll have like demand-is-caused-by-price edges. Those ones are uncontroversial. Or you’ll say that market congestion variables, things that measure how long it takes a ride to show up or how infrequently someone is in the back seat of a certain car, all things that sort of reflect market balance, you might expect that that is caused by your overall supply. No one is gonna take issue with that. They’re gonna think that those things are related. And so those edges are pretty safe.

0:37:09.9 DR: The next criticism is, are you estimating those effects well? That is a more reasonable criticism. But with enough time, you get that model feedback. You basically say, “Hey, I changed this thing and this other thing didn’t change how I thought it would.” That’s a problem. We’ve had enough cycles of that where we’ve corrected them and then we put something else in to kinda patch it up. And that ultimately gives you an overall path that you start getting pretty confident in. So that’s one section. Another section is some edges are just totally uncontroversial because they represent really simple operations like summations. So this can just fall out of how you define variables. So if you have accounting relationships in your DAG, then saying that a bunch of dollars sum up to overall dollars is not gonna be a controversial thing. And that will show up as an edge. Yeah. So I would say that a lot of the edges are defensible. But I built a lot of the DAG, so a lot of them are also my assumptions. And other people come in and change it, and I don’t like their edges as much. But it’s very much like these edges are really your assumptions. So they’re all open for criticism.
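The “accounting edge” case is literally just a deterministic sum in the model, which is why nobody argues about it; a trivial sketch with made-up regions and numbers:

```python
# An accounting edge: regional dollars simply sum to total dollars.
# There is nothing to estimate, so nobody takes issue with this arrow.
def total_bookings(regional_bookings: dict[str, float]) -> float:
    return sum(regional_bookings.values())

print(total_bookings({"CA": 1_200_000.0, "NY": 950_000.0, "TX": 600_000.0}))
```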

0:38:16.0 MK: One, I guess, complication that comes to mind, is there anything about Lyft being a two-sided marketplace that makes this approach more difficult than it would in, say, like an e-commerce store or something like that?

0:38:28.9 DR: Looking at… One of the reasons I signed up for doing this is that I thought Lyft was in the best position to ever… I almost thought like we could be one of the first people to have a really well-functioning DAG of the overall… Of an overall market because you observe everything from the top-down and the causal structure on a general level is not controversial. There’s a demand side, there’s a supply side, there’s a market congestion, there’s financials. It’s separable into some way that you’re reasonably comfortable with your assumptions. Other environments where you’d be tempted to make DAGs, you run into the issue of just things being very opaque. So if you were… Let’s say you were Amazon and you were working with a lot of merchants and say a lot of those merchants were pretty large themselves and they had internal DAGs that you really should be using, but you couldn’t represent those as well, then Amazon trying to model all this, it would have large basically invisibility walls into what those merchants are doing. And it’s like… So you can kind of get… Your observations can wall you off from parts of the DAG that you should be modeling that you can’t. Whereas Lyft, the pieces that we’re dealing with are all pretty atomic. We’re dealing with riders, we’re dealing with drivers. Their behavior is more simple than let’s say a very large merchant that has… That’s doing something more complex.

0:39:52.1 TW: But would it work? And this is taking e-commerce and they’re doing… And take the marketing that they’re doing. So they’re saying, “We’re running display ads and we’re running paid search.” And there is an implicit causal relationship that we run these ads, those ads get exposed. There are impressions that cause clicks, that cause conversions, that cause revenue. Part of me thinks I don’t see a visual representation of this is how we think this lever that we pull trickles down through multiple steps before it becomes business value. That’s not a DAG for the whole system. That’s saying that’s DAG for a channel or for multiple channels. And how do we actually see them? Like, back to your point of saying it gets people on the same page of this is how we assume that it’s working. That to me makes me think, well, then it goes to people saying, okay, can we quantify what that relationship is? It’s a problem that media mix modeling and attribution and all these goobers are trying to… I mean, very smart people are trying to solve legitimately. It’s just kind of coming at it from a what we’re really looking for is a causal relationship between stuff that we do and an outcome that we care about.

0:41:18.7 DR: Yeah. So in terms of the example that you’re talking about, display ads and paid search and that cascade of a user clicking through to ultimately purchasing a product, that is a smaller thing to model, for which you could probably cast a causal relationship that wouldn’t be too controversial. But what we’re trying to do at Lyft is model the whole business. And so if you were trying to sign up for modeling the whole business of something bigger like Amazon or something else, those things can get a lot harder. And so in general, if you’re gonna recommend when to use DAGs, it’s on small problems. If anything, that would probably be my number one piece of advice: DAGs work really great, sort of, the smaller they are. And so you can personally inject a DAG that is less likely to be wrong. And when you grow things, when you make ’em really big, then you just have a lot more surface area to get wrong. And so you have to be a lot more defensive and it just takes a lot more time for you to really build up a DAG that you’re confident in.
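Tim’s marketing example actually fits that “keep the DAG small” advice; here’s what that single-channel cascade might look like written down as its own little DAG (the nodes and edges are just the hypothetical ones from the conversation, not a recommended model):

```python
import networkx as nx

channel_dag = nx.DiGraph([
    ("display_spend", "impressions"),
    ("paid_search_spend", "impressions"),
    ("impressions", "clicks"),
    ("clicks", "conversions"),
    ("conversions", "revenue"),
])

assert nx.is_directed_acyclic_graph(channel_dag)

# Small surface area: a handful of edges you can defend (or argue about)
# one at a time, rather than a whole-business model.
print(list(nx.topological_sort(channel_dag)))
```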

0:42:19.1 TW: I like it. Are you convinced, Moe?

0:42:22.5 MK: Yes.

0:42:23.0 MH: Alright. Well, DAGnabbit. We do have to start to wrap up. [laughter] So this has been awesome though. It’s very interesting ’cause I think hearing you describe what you’re doing at Lyft is just helping me think through a bunch of stuff that I’ve done, and I’m more on the e-commerce side in my career. And so thinking about systems and how we’ve tried to optimize them, and thinking through these edge cases, it’s like, oh yeah, we were almost… We almost had a DAG. We didn’t know what it was called. We called it something else, but we were designing something very similar. We just didn’t have the knowledge. It was like, “Wow. So this is sort of cool, putting two and two together. Well, it’s one on my side, three on your side.” But the knowledge is very, very useful.

0:43:10.1 TW: But you’re sold, right? Like if you took it…

0:43:12.3 MH: Yeah, yeah. It was very cool.

0:43:14.8 TW: That one more step, it formalizes it and then all of a sudden…

0:43:16.1 MH: Yeah, no, I can see… You see your way to applicability so much more. Anyways, DJ, thank you so much for walking us through some of this stuff and sharing some of your experiences. This has been awesome.

0:43:25.8 DR: Thank you.

0:43:29.0 MH: Alright. Well, one thing we like to do is we go around the horn and share our last call, something we think might be of interest to our listeners or something we just find interesting. And well, DJ, you’re our guest. Do you have a last call you’d like to share?

0:43:40.5 DR: Sure. So if you’re interested in causal inference, which I suspect people would be for listening to this episode in particular, there’s a guy who…

0:43:47.5 TW: Well, if they’re still listening. If this topic…

[overlapping conversation]

0:43:49.2 MH: They are.

0:43:52.4 TW: If this topic caused them to end early, then not for them. Sorry, go ahead.

0:43:58.0 DR: No, no, sure. But for those who are interested in causal inference, one guy I’d like to give a shout-out to is Matteo Courthoud. He writes a lot on Medium about causal inference. I really like his writing style. And a lot of people at Lyft have found his stuff quite useful. His last name is spelled C-O-U-R-T-H-O-U-D. And if you google him, you’ll find he has several really intuitive, easy posts on causal inference, and a lot of the tools that he demonstrates how to use are widely used. And so if you want a short way to start with causal inference, that might be a good place to start.

0:44:37.0 MH: I like it.

0:44:37.3 TW: Awesome.

0:44:38.1 MH: Thank you.

0:44:39.1 TW: More reading. Does he say counterfactual in his stuff? I think we’ve gotten through this whole episode without…

0:44:43.2 MH: Using that.

0:44:44.3 MK: Oh, yeah, you have.

0:44:47.0 TW: Not anymore. I ruined it.

0:44:48.9 MH: Not anymore. I agree. Yeah. Thank you, Tim, for making sure that happened. Alright. Well, in that case, Tim, do you wanna share your last call?

0:44:56.6 TW: Sure. I’m gonna sneak in one little observation ’cause it’s now just in the back of my brain: this whole idea that assumptions are a dirty word in our world. And I realize there’s the… Assumptions are like assholes. Everyone has one and most of them are full of shit, or they all stink, or whatever. And then there’s the problem with assuming is it makes an ass out of you and me. And I don’t know why this hasn’t clicked with me, but my favorite definition of a hypothesis is that it’s a tentative assumption made in order to draw out and test its logical, empirical consequences. I think I’m just…

0:45:29.7 MK: Rolls off the tongue, Tim.

0:45:32.7 TW: I’ve said it a few times in a few forums, but I think just a little light bulb went on. I think that’s a problem. Like we… Assumptions have a dirty, a bad connotation, and they’re actually kind of at the core of things. So that was not my last call, but just sneaking it in ’cause I’m unfiltered. My last call is a long-ago guest on the show, Todd Belcher, has been playing around. He is a Major League Baseball fan, so this is gonna be two episodes in a row with an MLB-related last call. But basically with ChatGPT-4 and WellSaid, he… That is at mlbresultscast.com as well as on many of your favorite podcast platforms. He now has a daily podcast that’s being generated. It is audio, it runs through the box scores from the previous day and then actually goes into a couple of games using the box score.

0:46:32.1 TW: It goes through the scores of all the games in the major leagues, then it goes through and breaks them down, basically just feeding in from the box scores, and then it has a couple of different AI-generated voices. And it is pretty wild. He’s been kinda tuning it and tinkering with it. It throws in like a random fact from the day. It is crazy. You can tell it’s AI, but it doesn’t sound unnatural at all. And it’s got little jokes and it hands off from one commentator to the other one to kinda run through it. And it’s kind of a pretty intriguing little experiment. As of this recording, I think it’s still taking him a little less than an hour a day to produce it. But where he’s heading is the idea that you could actually get your personalized news or scores in a podcast format, auto-generated. It’s worth checking out. They’re like 10-minute episodes, mlbresultscast.com.

0:47:34.8 MH: Interesting. Alright, alright. Moe, what about you? What’s your last call?

0:47:40.4 MK: So we have had this person on the show previously, Aaron Dignan, and I know that I’ve mentioned the podcast previously that he does with Rodney Evans, Brave New Work, which I’m totally obsessed with. I’ve just listened to a bunch of episodes. But there was one particular episode that I wanna call out, and it’s because it’s got me spiraling into this whole, like, oh, I need to do this. But basically the podcast episode is called, “Help Me Help You, What If Your Co-workers Came With Instructions?” It’s one of the Brave New Work episodes, but they actually talk about the idea of having a user manual, and particularly how important that is for managers. So basically it’s inspired by an article in The New York Times by Adam Bryant. There’s a whole bunch of other people that I’ve now read up on in Forbes and The Wall Street Journal. There’s a whole bunch of people who’ve written on this topic of user manuals, and I’ve started to write my own. And the whole idea behind writing a user manual is that it gives your co-workers shortcuts to how you work, so that therefore you guys can become high-performing more quickly. And yeah, it’s a topic that I’m kind of delving into and testing out to see if it works. So if anyone’s interested, I can report back and let you know how I go.

0:48:57.9 MH: So you actually are writing a user manual for yourself for people who are trying to use you?

[laughter]

0:49:03.7 MK: Well, so people… So it might be… There might be things like, for example, one of the things that I’m really good at is getting the ball rolling on stuff and bringing people together. But I’m not always a good finisher. So if you’re in my team and you know that, you might know that either to get that last 5% done, you either kind of need to bug me a bit or you might be like, “I’m a really good finisher. I’m gonna pick this up and take the initiative to do the last 5% because Moe’s got the ball rolling.” So it’s like how do you give shortcuts on how to work together so that therefore you work better together? Like the other one is like, “I don’t like to come to meetings or discussions with fully fledged ideas. I like to really spar and discuss things and come to conclusions or like things as we work through them.” And that doesn’t work for everyone. So it’s like, “Okay. Well, if that’s not your style, then how do we have some kind of relationship where you get to know the problem ahead of time, so you get to be thoughtful and feel prepared and have your views worked out?” Whereas for me, I’ll need to use that meeting to come to my opinion about a particular topic.

0:50:12.8 MH: Sounds like we could use this for the show.

[laughter]

0:50:14.5 MK: Do you know, I was gonna suggest that we all write a user manual and then I thought there might be some prickles on the back of the neck.

0:50:23.2 TW: But it does sound like you need to… It does… There’s a degree of self-awareness required. Somebody could write a pretty… ha. I mean, if I wrote mine, I’d be like, I’m easy-going, you know what? I don’t have high expectations of others.

0:50:40.2 MK: That’s exactly what Aaron and Rodney talk about on their podcast, is that the exercise of doing it… You might not even end up sharing it, but the exercise of doing it forces you to become or to be self-aware because you have to think about how you work and your ways of communicating and that sort of stuff. And whether you share it or not, that in and of itself is a useful process, especially for managers, which I think is really phenomenal. But the other topic that they also go into is feedback. And when people write user manuals, everyone writes, “I like direct feedback.” And the truth of the matter is most of the people actually really don’t. Direct feedback is incredibly hard. It requires a level of thoughtfulness about your ways of working that I think is really beneficial.

0:51:28.0 TW: Wow. Interesting.

0:51:29.2 MH: Alright. Thank you.

0:51:30.9 TW: So Michael, what about… What’s your last call?

0:51:35.2 MH: Well, like you, Tim, mine’s a little bit of a departure this time. There is a guy I ran across who’s a musician named KS Rhoads, and he’s done a series of little songs called “Your Kid’s Favorite Tunes to Your Dad’s Favorite Bands,” which is basically nursery rhymes and kids songs, but done in the style of bands from the ’90s. It’s phenomenal. So if you go on Instagram and just look up KS Rhoads, you’ll see his whole series. But it’s not data-related at all. It’s just something I totally geeked out over in the last week or two, and I think everyone should see them.

0:52:18.6 TW: You’re just making fun of me ’cause I won’t recognize the bands from the ’90s, even though I was supposedly…

0:52:22.7 MH: No, you wouldn’t recognize any of them. Well, actually there’s one you probably would. He does the “Itsy Bitsy Spider” to the theme of Hamilton, and it’s crazy good. Just crazy, crazy good. Anyways, sort of a weird little departure, but it’s a lot of fun.

0:52:38.2 TW: That is weird.

0:52:39.0 MH: Alright. I know you’ve been listening, and you probably have questions, and maybe there’s things that you’d like to ask. Well, we’d love to hear from you. And the easiest way to do that is to reach out to us on the Measure Slack group or on Twitter, or on LinkedIn, and we’d love to hear from you. And DJ, I know you have a YouTube channel, which I mentioned before, which is called Mutual Information. And so people can check out your videos there; I’ve been watching those as we prepared for the show. And I love the way you present information on those. Even though there are a lot of Greek characters and things that scare me, you make it actually quite accessible, so that’s good. And are there other ways that you engage on social media? Are there other places people could follow your content or contact you that are easy?

0:53:23.1 DR: Yeah, the only other way is my Twitter. I’m occasionally on there. So it’s Duane J Rich, just @DuaneJRich.

0:53:33.3 MH: @DuaneJRich?

0:53:34.2 DR: Yeah, @DuaneJRich.

0:53:36.2 TW: It’s just @DuaneJRich. I forgot how Twitter does their thing.

0:53:38.0 MH: Yeah, yeah. Well, who knows? Twitter may not even be around anymore…

0:53:42.5 TW: By the time this recording goes out.

0:53:45.0 MH: They’re not really going into…

[laughter]

0:53:48.1 MH: Anyways, so, all right. Well, thank you once again, DJ, for coming on. It’s been great to hear from you, and thanks to the Lyft team for reaching out and suggesting the topic. We really appreciate it, and no show would be complete without a huge thank you to our producer, Josh, and I mean huge for all you do behind the scenes to make this show possible. We appreciate everything you do.

0:54:11.8 TW: Specifically this show.

0:54:13.4 MH: It’s this show, and all the other ones, but this one specifically. Anyways, and I know I speak for both of my co-hosts, no matter the size of your DAG, I know that Moe and Tim would agree with me in saying, just remember, keep analyzing.

0:54:33.3 Announcer: Thanks for listening. Let’s keep the conversation going with your comments, suggestions, and questions on Twitter at @AnalyticsHour, on the web @analyticshour.io, our LinkedIn group, and the Measure Chat Slack group. Music for the podcast by Josh Crowhurst.

0:54:51.1 Charles Barkley: So smart guys want to fit in, so they made up a term called analytics. Analytics don’t work.

0:54:58.2 Kamala Harris: I Love Venn diagrams. It’s just something about those three circles and the analysis about where there is the intersection, right?

0:55:08.1 DR: So I have a master’s degree in financial engineering from Berkeley in 2014.

0:55:15.5 MH: Still think this thunder is an omen. [laughter] Maybe I shouldn’t talk about my past.

0:55:21.6 TW: Well, we’ll add the organ music after, and it’ll be great. Yeah.

[music]

0:55:32.1 MH: Moe, what about you? What’s your last call?

0:55:38.4 MK: Josh is gonna fucking kill me. I don’t know if I’ve said mine before. I have a list, and I mark off if I’ve said them. And I was like…

0:55:45.0 MH: Josh, Moe did… She did use the timeout gesture before she said that. [laughter] Let me add voice to her video. So it’s okay. It was just a timer.

0:55:58.4 TW: I was totally… That slipped by me too, Moe. So I was just like, yeah, timeout, of course.

[laughter]

0:56:06.7 TW: I think that the preparedness of the guest causes the…

0:56:11.3 MH: Oh are they gonna put that in our DAG?

0:56:15.7 MK: [0:56:15.9] ____ the fumbles of the co-host.

0:56:21.5 TW: Rock, flag, and DAGnabbit.
