#066: The Democratization of the Data

It’s another one of those ongoing lobby bar topics: how much of the data should be made available to whom and in what form? Should all of an organization’s data be completely and freely available to everyone in the company, or is that a recipe for messy data being misinterpreted and misused? That’s the topic tackled on this show, courtesy of a recommendation from Pawel Kapuscinski. As it happens, it’s also Independence Day in the U.S., a fact with which the guys had a little fun.

References Made During the Show


Episode Transcript



00:04 Announcer: Welcome to the Digital Analytics Power Hour. Tim, Michael and the occasional guest discussing digital analytics issues of the day. Find them on Facebook at facebook.com/analyticshour and their website analyticshour.io. And now, the Digital Analytics Power Hour.


00:28 Michael Helbling: Hi everyone, welcome to the Digital Analytics Power Hour. This is episode 66. Everywhere you turn, you see the red, white and blue of charts and graphs ringing in the freedom of the data informed business, and I’m proud to be an American… Okay. I got carried away there, but this episode, we tackle the important and always timely issue of data democracy. This was a topic proposed to us by Pawel Kapuscinski, our great friend in the Measure Slack and we toast your good fortune, sir. Now let’s get into it. Tim Wilson, my co-host, welcome Tim.

01:09 Tim Wilson: Hi Michael.

01:10 MH: He will be supporting the side of data dictatorship, and I, Michael Helbling will be supporting the side of data dictatorship also. No, just kidding. In DAA, data reports on you. Okay, okay, all jokes aside. Let’s get this thing underway. In honor of the American Independence Day, let’s talk data democracy. So Tim, how would you describe how most people perceive data democracy, just to kinda level set for people?

01:41 TW: Well, part of me thinks it’s another buzzword, which our industry has no shortage of, but I think it generally gets perceived as this ideal of all users, all stakeholders, all possible stakeholders in the company have access to all of the data and all of the data with limits on sensitivity, privacy and that sort of thing. But basically, the only thing they need analysts for is really to do stuff that’s beyond their analytical capabilities, but basically have the access to the same data that the analysts have access to.

02:22 MH: Yeah, and the rise of data scientists, they can create data products that give access to data to everyone. So what’s wrong with that?


02:36 TW: Well, you’ve got data lakes.

02:40 MH: Data lakes and streams.

02:41 TW: Yeah. You’re moving into the world of unstructured data, throw the stuff in the data lakes and give everybody a data fishing pole and then watch the organization grind to a halt.

02:51 MH: Is there such a thing as data waterfalls? Because I’ve heard you should not go chasing those.

02:58 TW: You want a data stream. Oh, the analogies in this… This episode is getting pretty rough.

03:05 MH: Yeah, alright. So okay, that makes sense and it’s the way I’ve heard it too is basically this idea that sort of, “Hey, we create a structure around the presentation and visualization of data such that it’s in everyone’s hands. It’s open to all, or at least most, and people can use it to do analysis on whatever they’re doing with their part of the business.” And then so, and in an idealized state, it seems like that’s a really great idea, but it does I think have this tendency to break down, or it could.

03:42 TW: Yeah. And I think there’s… I like to think of things as being a spectrum where if that’s on one end, you say data democracy, I say data anarchy, or maybe I stole that from an article you’d found. But on the other extreme, I think there’s… Some of that is a reaction to organizations or processes that are set up where a business user says, “I just wanna know X. I just wanna know how much traffic came to this page and I have to go ask an analyst for that. And I asked the analyst how much traffic came to my page and their answer is, ‘Well, tell me what you’re gonna do with that information.’ And I have to go through this whole justification just to get one simple stinking number. And why can’t I get that data myself?” And so I think, I’ve certainly seen that, organizations that are on that end, where it’s data held hostage. I don’t know what the… Is that data fascism where you get none of the data that hasn’t been qualified, pulled, extracted, validated, but you get it two months later than you actually… It would have been useful.

04:51 MH: Yeah, and that is obviously the other thing, ’cause it’s interesting ’cause I’ve felt that feeling before, too. Not being able to do the analysis I need to do of a business because I didn’t have the right data or access to the data in a way that made sense. And so I feel that’s the flip side or the negative, which is, “Give me the data I need and I can show you what’s happening.” And so there’s a balancing act definitely to be struck. But also I’ve seen people take data freely given and torture it into wildly inappropriate “insights”… And that’s the story of Groupon. No.


05:40 MH: Sorry.

05:41 TW: But I think actually that way, it’s the torture the data enough, it’ll say whatever you want. To me, that’s kind of an insidious perspective or view because I’ll hear that said and that has this implication that people have such an agenda that they get the data and then that they’re actually sophisticated enough to understand that the data’s clean and then they sort of sort and filter it until they find this one little sliver that tells the story they want. I don’t actually think that’s what happens nearly as often as… What happens is, “Oh, go in and get the data yourself.” Or every single time, 100% of the time when there’s been data that I haven’t had and I know somebody does have it and I asked them for it, 100% of the time it turns out that data is more complicated and messier than I thought it was gonna be. And I think that’s what happens more, is then you get the data and you don’t understand it and then you try to use it and it comes back wrong, possibly. But it wasn’t ’cause you tortured it because you had an agenda. It was just, you read the headings and assumed that revenue meant revenue and it didn’t.

06:54 MH: Yeah, and I believe it was you recently that I learned that there was Hanlon’s razor, which I love to say a lot and I think that’s, “Never attribute to malice that which is adequately explained by stupidity.” And a lot of times that’s also happening, is there’s a lack of being informed and so a misuse of the data, even to the point of, in the early days of digital analytics, running into being like, “We’ve got so many hits and it’s great.” [chuckle] And it’s like, “Okay, well, that’s not a thing but… ” And then get people past it.

07:37 TW: Yeah, although I guess, and that Hanlon’s razor, that was my very first MeasureCamp experience. MeasureCamp Cincinnati picked that little gem up. That was a fascinating session. But even that it’s like attributing something to stupidity and maybe I’ve just hit some magical age where I just want to assume the best of everybody and I think I’ll be more right than wrong than assigning… What’s the line between stupidity and naivete or stupidity and shitty data that was taken at face value? Like, “You’re stupid.” It’s like, “No, the data’s terrible.”

08:17 MH: Well, and that’s the combining factor but then you can also layer in the politics of any given organizational situation and in some case actors who are motivated by specific agendas then capitalizing on both the quality of the data as well as the accessibility of the data to promote agendas that don’t actually help make the optimal decision.

08:44 TW: Yeah, well and I guess even that I, and maybe this goes back to the challenge of data democracy, is that you have three systems that all are basically tracking orders in some fashion, web orders. One is your system of record, one is your web analytics platform, one is your ad tracking platform or some other place. And you have experts in the data and if all of that data is available to anyone in the organization, then probably the first time they’re gonna say, “Where should you go get this data?” And the person they’re gonna ask is going to tell them to go to the system that that person is most familiar with. So that’s this other, “A man with one watch always knows what time it is, a man with two is never quite sure.” And so if everybody has access to pull the same metrics, and nobody is the one that says, “This is right.” Even if one person is.

09:41 TW: If it’s like, “Oh, the web analytics team, what they say is gold.” And if your data doesn’t agree with it the presumption is, is that your data is wrong, well what does that set up? Well, that sets up, one, it still casts doubt on the experts because somebody says, “Well, I pulled the data and got a totally different number.” Two, it now does have a cost because the analytics team now needs to go and assess what the other group did. Why it was wrong, explain it to ’em. Now there’s a benefit of that because they’re educating that user in the data, which hopefully they’ll retain. Hopefully they’re not stupid. But it does set up an enormous amount of churn, and a lot of lack of confidence in the data that is very likely misplaced lack of confidence.

10:32 MH: Yeah, I guess I can see it breaking down based on the organization itself. I’m going back to this concept of, “Okay, if we’re gonna have data democracy does that mean we need some sort of data citizenship test? Or data literacy that is required?”

11:00 TW: Yeah, or a poll tax. [laughter]

11:04 MH: Hey. No way, we’re not going to have a bajillion political jokes during this… [laughter] This is not gonna happen. So that concept is interesting to me because it denotes this idea of, “Okay, as you start maybe you can’t just jump feet first into data democracy, but you can go toward data democracy.” So at first you’re a data republic… No. [laughter] Or more likely initially some sort of data oligarchy. But then you’re bringing people up to be able to become literate in that data so that they can participate more and more with the use of it. I don’t know exactly how to say that. At the same time, reading a company’s financials, if you just step out of digital for a second, you can come to a lot of conclusions by looking at the financials of a company that may or may not be correct.

12:07 MH: That’s why people do financial analysis and why analysts get paid a bunch of money by Wall Street or whatever. I don’t really know. But you can read through those and be like, “Oh, wow. Snap loss. Two point whatever million or billion dollars in the first quarter of the year.” And there’s more to that story and you gotta know how to dig in and find the answers to it.

12:29 TW: So what do you think? As I was thinking about this topic, I used the same company that I’ve talked about many times that I was at for a while. And it was back in the days. We had Webtrends, but put the web analytics data aside, it was the world of PowerPlay Cubes, Cognos PowerPlay Cubes. And there was a lot of thought and planning that went into what was the cube that everyone in the sales team was gonna have access to?

13:00 TW: And it was a lot of data and we would have releases where we would add additional data to it, but the nice thing about a cube was that it was almost impossible. It was possible, but you had to work kinda hard to actually combine things in a way where it would return data and that data would be flawed. Now you could do it, but a part of the control is who had access to that was all these sales people and it was the data they cared about. They drove what it was. It was a rich data set. It wasn’t by any mean just static reports.

13:34 TW: So they had a vested interest in getting the data literacy in that area. Those guys weren’t pulling web analytics data, they weren’t pulling email marketing data, they were pulling the data that was important to them. And I guess that was a form of… Maybe that’s state’s rights. That was data democracy within the limited area. And that made a lot of sense, but when data literacy or data quality or data cleanliness or data availability or data visualization, every one of those seems to be something that gets talked about as though it’s like, “Do the project. Do the data literacy project. It’ll take three to six months and now we’ll have data literacy.”

14:18 TW: Whereas instead, you can work for the next 10 years and you will still have a lot of people who are embarrassingly data illiterate in your organization. So none of those can be wished away as like, “Oh, just do the project. Just make the data clean and then we’re all set.” And that just runs through almost every article and topic that I read on the subject. It certainly gets wished away by the technology vendors who are saying, “Hey, we can pull all this data into one place. All you gotta do is… ” And I’m like, all you gotta do is, is a super high cost and there is an opportunity cost. If that’s what you’re gonna have to invest your resources in doing and invest all your users, they’re gonna have to dedicate their time to actually learn this stuff.

15:06 MH: Right.

15:06 TW: And it’s wished away as being not as expensive as it is.

15:10 MH: And in most scenarios I’ve observed some people select in, they self select in. They’re interested in it, they’re attuned to it, they learn about it, they’re interested. And other people are just like, “Yeah, yeah. Throw it on the pile over there.” So it’s also, unless people feel like it’s a necessity, they choose or don’t choose to interact with it even given the opportunity.

15:36 TW: I love that. I sort of feel like, “Yeah, don’t make it everybody.” The people who are actually drawn to it and keep pestering you, give them that access and then coach them along.

15:46 MH: That’s fine. It’s just really hard to systematize something like that. It’s hard to say, “Our organization at large is data informed or data secured, data happy or whatever.”


16:03 MH: I don’t know. I’m making up some new buzzwords. Feel free to play around with that as you’re listening and try ’em out in the business context and anything that sticks just attribute to Digital Analytics Power Hour. Thank you. 2017.


16:16 TW: But isn’t that fair? I actually like that… I’m thinking of a client that has been frustratingly on the other end where they won’t give anyone. They won’t give certainly not external agencies. They will not give them access to the data, they won’t… Even people internally. And they’ll point to… For three years I’ve worked with them and every time it comes up where I’m like, “Come on, guys. More than three people should actually have access to this.” And their response is always like, “Oh, but there was this one time where this one person… ” And they name her and what she did. And I’d met her. She’s been gone from the organization for two years, mind you. But that is still the example of what she did which was something that was well intended, a little misguided and caused a little bit of consternation in the organization.

17:08 TW: And I was looking at it saying, “Well, that’s… Absolutely let’s give her, let’s make sure she understands the churn that was caused in that situation.” She wasn’t in the analytics group. She was in more of a marcomm role, but she was interested and she wanted to dig around with it. So why not? The people who are drawn to it, give them the access. Now you’d hope that they wind up being more successful in the organization and therefore that gets noticed by other people who then actually get an interest and then are drawn to it.

17:42 TW: But, yeah I agree with you. Maybe this is what you were saying, the people who are like, “I want the data.” And then you give them the data and then they’re like, “Well, this is complicated.” Or, “I don’t care.” Or they never look at it or they never log in. And I’m like, “Okay, fine. Then we’ll keep doing stuff the other way.” But if you’re somebody who wants to dig into it. Hell, that’s how I got people on the analytics team when I was in-house at a company. It was these random people in the customer service organization or in the product organization who were drawn to it and they were drawn to it enough, they got access, they were good at it, next thing you know they were analysts. Which doesn’t have to be the path they go, but they were super knowledgeable of the business analysts.

18:24 MH: Yeah. So do you or don’t you support stronger background checks for selling data?


18:33 MH: Or are you content to let the private sale of data happen at data shows all across the [18:38] ____?


18:43 TW: This land that we live in.

18:46 MH: Anyways, sorry that…

18:47 TW: The right to protect myself with the data.

18:50 MH: That’s right, well but actually so there’s a point, not to that. That was more of a joke.


18:55 MH: There was a point I wanted to make which was one thing that I think that can be an inhibition today to data democracy that we really haven’t discussed on this. This lack, potential lack of checks and balances on data. When the CMO comes over to the analyst’s desk and says, “Find me metrics that make this campaign look good. I’ve got a meeting in 2 hours.” That analyst needs a structure to be able to say, “No, we will not ban data migration,” or whatever it is. Immigration, data immigration? No.


19:34 MH: So yeah, just ripped from the headlines. But do you know what I mean? Like that’s a real scenario that happens all the time of there being purposeful misuses of the data.

19:46 TW: Really? Like I just…

19:49 MH: I think that happens. Okay, I know it’s happened to me.

19:54 TW: Well…

19:57 MH: And, you know agencies are tempted to do it if they… I’m not calling anybody out.

20:00 TW: I was going to say in-house? In-house I think it happens less because there’s less of a…

20:06 MH: It could, it could. But, there are scenarios I’ve been in in-house situations where that scenario played out. The names have been changed to protect the innocent but that’s…

20:18 TW: But I don’t…

20:22 MH: I’m just saying it’s a risk.

20:23 TW: It’s a risk but it’s like… It’s a risk that I’m gonna walk into a door frame and give myself a concussion because I’m clumsy. It doesn’t mean I don’t leave… Well, actually I don’t get out very much. I mean, there’s a weighing the risk versus reward. Or put it this way, for people who are totally anti… We are going to keep the data hostage, I want to go in and say, “You know what? Let’s slowly start opening it up and then the first time somebody gets burned, then let’s have that discussion and see if we could mitigate it.” ‘Cause that’s gonna be three years from now and it’s gonna be a minor little thing where somebody gets slapped on the wrist and it’s fixed.

21:04 MH: Yeah, again, I don’t know.

21:06 TW: So I will argue against data hostage or data fascism or whatever, data authoritarianism pretty hard.

21:12 MH: Yeah, well I’m against that too, but I think that’s the thing is there’s got to be structure for there to be interplay. And a lot of times that just falls on the analyst to try to be nimble enough to work their way through that situation without…

21:31 TW: But working through what situation? That who’s getting access to the data or the specific scenario where somebody is [21:35] ____?

21:36 MH: No, no… Being asked to do analysis that is fundamentally skewed towards a specific point of view.

21:42 TW: What does that have to do with data democratization?

21:45 MH: Well, it has to do…

21:46 TW: I know you’re running the show.

21:48 MH: Well, I don’t know exactly…


21:50 MH: It’s just an issue that I’ve had and I think it fits sort of in this category.

21:55 TW: I mean, I guess that’s getting to the risk of the misuse of data in various ways.

22:00 MH: Yeah. Well and certainly this topic is probably dancing around data governance or is data governance we just haven’t said those words yet.


22:13 MH: But, the point is, is that there’s a lot of ways for data to be misused in your organization and I feel like once people get what can be done with data, there is increased incentives to get in there and mess with those kinds of things. To use it as a weapon. And that’s where I feel like that is very unhealthy, it’s very negative for an organization and its culture around data, and that is something to fight against, if it’s possible.

22:44 TW: I totally don’t think… Would you say that’s the number one risk of making data more…

22:48 MH: No, I don’t. Again go back to Hanlon’s Law which is, first you just have to get people up out of not knowing anything about it. There just needs to be a modicum of data literacy or ability to understand the data and what matters. And I feel like we’re speaking at such a high level, it’s kind of hard to… Digital analytics data. If you’ve worked with it for any period of time whatsoever you’ve found out two things. One is, it has limited utility, which is when you need to realize that you’ve got to translate it back over to the business so that it could be understood in a business context. And a lot of times you see analysts just playing on the digital side and doing all this cool stuff with just digital data and that’s cool, up to a point, except there’s no connectors. No connectors to how the business operates. And it’s that kind of bridge that is, I think, fundamental if you want to pursue data democracy.

23:53 TW: Well, but, I actually think the bigger… I was thinking if you take just digital analytics data, if you just give somebody access to Google Analytics or Adobe Analytics with a reasonably sized site, with a reasonably involved implementation, there’s a risk that everybody who dives into it, who now feels the pressure of, “I have to have data when I present on this.” The bigger risk is that… Or a large risk, is that they will spend hours. That the data literacy side that isn’t understanding the data, but actually how to approach the data in a disciplined way, that you don’t spend a week, and all you’ve done is fuck around with the gender and demographic stuff in Google Analytics, because you try to click on every report.

24:42 TW: Is that why in, say, analysis workspace in Adobe Analytics you can literally lock down specific views? So that people… You can create… So this is literally sort of… It’s not anti-data democracy, in fact I feel like in a certain way it was kind of smart. Again, let’s get Ben Gaines on here to explain what he was thinking, but I think it was to serve data democracy by giving people a slice of the data that you know they can understand and work with effectively. To stall out that exact scenario you just described. Just giving somebody a login to Adobe Analytics or Google Analytics is a sure fire recipe for them ending up in the real time reports for 45 minutes watching traffic roll in. [chuckle]

25:32 MH: Which by the way is great fun, and I enjoy doing it, it’s just not gonna like make you be insightful about your business.

25:42 TW: Yeah, give them better visualization tools where they can change the color palette of some sort of scatter plot but…

25:49 MH: And to be fair, I don’t know to what extent, or I don’t have as much exposure to Data Studio, so I don’t know, but I presume they have a similar structure as analysis workspace or you can create views in Data Studio that can contain specific slices of the data for the audience that you’re serving. And I feel like that is a good thing for data democracy, it’s not… But that’s not really democracy, or at least it’s not full access to the data.

26:19 TW: Right, well that’s back to my PowerPlay Cube example, right? And I actually, I used to say… This was something I used to think that Google Analytics had as a really, really strong strength over Adobe, and there was a very good reason why they had it as a strength, which was you literally could come at data, the same question, three totally different ways in Google Analytics, and it would net out within one or two, whatever your counts were. Now, that was because they had not introduced the concepts of hit versus session versus user scope custom dimensions, whereas Adobe has had eVars that are messier, but more powerful, for a long time. And so it used to be that, sure, give somebody access to Google Analytics, it will be impossible. They didn’t have custom variables. All you had was events and page views, and segments, and you weren’t going to be able to get yourself in trouble. And then they had to scale, it got more complicated.

27:17 TW: But I’m with you. I look back to the PowerPlay world, or my experience with PowerPlay years and years ago, was you could define that limited system that still had an infinite number of ways it could be sliced. It wasn’t defined down to, “Oh, I’ve given you a really fancy Excel spreadsheet and you can choose drop downs and pivot table slicers.” It was much, much richer than that. And I think that’s Power BI, Power Pivot, I think that’s sort of the world they’re going into, but you’re still… What that does mean is you’re drawing a circle around a set of blessed data with the linkages defined, which means, inherently, that users will be asking questions that require data that exists but is not part of that data set, that they don’t have access to, and they need to go to an analyst to get that answered.

28:12 TW: Which I think is great if they’ve got a lot to play around with that they’re gonna be less likely to get themself in trouble, and it’s gonna get them to ask a smarter question, and have a better intuitive understanding of the data, then that’s great. That’s perfectly where an analyst can help them, and have an informed decision. There’s already a base level of data literacy from, “Do you understand what the data means?” Or, “Can I quickly explain to you the distinction between a booking and a billing?” Or whatever it is. And now, let’s move on to a more sophisticated level better use of the analyst time. I think that’s great, and I think that’s hard to systematize, because it’s gonna be varying levels of interest and aptitude from those stakeholders.

28:58 MH: Well, and I think it’s a progression that has to happen in an organization to get there.

29:03 TW: Yes, but I guess the organization does have to be open to that so that’s…

29:07 MH: Yeah there’s gotta be some buy-in on a very senior level, I think, for this to happen. And honestly, I think good analytics anywhere has to have that.

29:16 TW: Well, but I think that’s the buy-in… A lot of times I feel like the buy-in reads some of the trade press, where you get these nauseating things, where it’s decreed from on high, “Oh, you’re all gonna do this.” And it’s like, “No, all’s not the right words.”

29:36 MH: Well no, and I… Yeah, I mean support not stupid decrees.

29:42 TW: Okay. [laughter]

29:44 MH: Just to be clear.

29:46 TW: Yeah, so it’s messy, right? I guess, maybe that’s the way to look at it, is that having, stopping and reflecting, as you said, where it’s kind of at this abstract discussion level, and it makes it really, really hard, whereas I feel like I can think through current and past clients and experiences where, with a little bit of deliberate thought, it is like, “Look this is the path forward.” If we say that the data wants to be free, the data should be out there, the data should be out there as far as it can be, in as many people’s hands as it can be safely used without being a massive time suck and resource drain on the company. The question is, as an organization, what does that mean? What group is it? What people? Which data sets? Who’s managing it? What else needs to happen? What’s the road map, to let it continue to get out there more broadly?

30:40 MH: Yeah. Well, and then there is the… I don’t know about layering but it’s the first you give people exposure to the data so they can consume it, ask questions of it and those kinds of things. Where do you draw the line or where are the gradations in that from it going from data analysis, insight generation and so on? At that point of generating insights that we’re recommending to the organization, that requires the most amount of knowledge, would you agree with that? To do correctly? To leverage the data appropriately?

31:22 TW: Yeah. I guess what I’m pausing on is that what I really want is, I want the business partner, that no matter how limited the data is, they’re going to be asking questions that they actually have an action or a decision in mind. They don’t look at it and say, “Oh, that’s not sliced by device category, we should do that because it’s not sliced by device category and I know that data is available.”

31:53 MH: Yeah, yeah. I’m not saying, “Hey, let’s… ” Yeah, I’m anti solutions looking for a problem.

32:01 TW: No, no. I guess what I’m thinking is that if you start with the most… I’m thinking through some of the cases where what I’ve done, somebody has access to Adobe Analytics but, frankly no business user is gonna get into analysis workspace or Data Studio and a blank slate and be like, “Oh, this is awesome, this is totally clear to me [chuckle] what I need to, what I should be doing.” If I take cases where I say, “Let me build you something that has some interactivity and is kind of rich, and you have to understand how we’re trying to map your PPC investments and your product taxonomy there to the actual purchase, and that that’s messy, and let’s put that on a grid. And let’s figure out the way to do that, where you can play with it a little bit.”

32:53 TW: And there’s no way in hell that very smart, engaged, thoughtful, provocative idea business user is gonna be able to build that even though that data’s there. But by building that for him, he can ask questions where I say, “Oh you can actually get at these other questions yourself.” And can iterate back and forth. So it comes back to me to this relationship of building trust and then figuring out, what can I give to specific people with increased access and increased power where they don’t feel like I’m trying to dump my job on them and they feel pretty confident with what they’re doing, that they’re looking in the right place for the right things. And it’s really hard for me to see that saying, “This is the formula for this group. Everyone in this group has X and it’s gonna work starting tomorrow.” That’s a struggle.

33:46 MH: I just had a wacky idea, which is maybe there are literally a discrete amount of those, and that we as an industry just need to figure them out and create the tools that make that exploration possible across whatever those are.

34:03 TW: What the scenarios? What the people?

34:06 MH: Yeah.

34:06 TW: The people scenarios? Or the…

34:07 MH: No, no, no. The data. In other words… So for instance, alright… Yeah, I don’t know. That’s really a wild thought. It’s too big.


34:19 TW: Well, I will say, I’ve got this fuzzy, wacky idea on the R front is that, because that’s an open source world, I do think… And maybe this goes to what the DAA has done with analysis recipes, analytics recipe… Whatever they’re called, that if we could have an organized and structured way of saying, “These are all the things we might wanna do with data. This is good for time series data that has four possible dimensions.” That you could wind up with this infinite cookbook of things you could do with it. The trick is getting the business question mapped to which one of those scenarios in it and which data can I plug in to actually use it.

35:06 MH: Right. Well, and that’s the idea and also different data looks differently. We have different levels of it, different segments. There’s a lot of different ways that can break apart. But it is at… Yeah, that’s a whole other show.

35:26 TW: Yeah.


35:27 MH: Give us five years, we’ll come back to you on that. So yeah, I think this concept of data democracy is one… I honestly came into this episode, Tim, against it notionally. And now I’ve kinda changed my tune a little bit.


35:46 MH: Hopefully not to agreeing with you. Hopefully we didn’t get there. ‘Cause I know our audience really wants for us to be in disagreement but…

35:55 TW: I think we somehow disagree. We started in roughly the same spot, managed to disagree ourselves to a point where we’re…

36:02 MH: Well, but maybe data democracy is possible on a path and that’s the next thing I’m interested in exploring as I move forward from here. Unfortunately we just don’t have time to do it on the show.

36:17 TW: But I like that, data democracy, it’s a journey, not a destination. And said with a goofy voice, but maybe that’s what it is, is that, that is a…

36:26 MH: Are you calling my voice goofy? Is that what you’re…

36:29 TW: No, that was me just saying it. It’s a…

36:30 MH: Data democracy, it’s a journey not a destination. But yeah.


36:37 TW: That is the wacky use of it. Well done.


36:40 MH: Sorry.

36:42 TW: That was good.

36:44 MH: This is good, though. I think we’re not as locked down as I thought, but I think we’re… That was a good conversation.

36:52 TW: I’ll take it.

36:54 MH: Well, why don’t we see if we can keep this sort of semi-agreement train rolling and do a last call.


37:01 TW: You wanna start?

37:03 MH: Sure. I am reading a book which is the memoir of Phil Knight, the founder of Nike called Shoe Dog and I actually, I came across it. Apparently it’s a pretty popular book I would imagine, but I came across it from a tweet from a guy by the name of Ben Thompson who runs an awesome blog called Stratechery and a great podcast called Exponent. So dual plugs for those things, but he was saying, basically he tweeted something like, “This book’s writing is amazing. I can’t believe how great it is.” And it just made me be like, “Okay, I need to pick this book up and start reading it.” And sure enough, it’s engrossing and it’s just the story of Nike, how it came to be. And as a young company, all of the struggles they went through, just really, really fascinating. So great book if you like to read memoirs or biographical books, I highly recommend Shoe Dog by Phil Knight.

38:04 TW: That sounds awesome. I’ve heard… I’ve definitely heard of Phil Knight, but I think I was vaguely… How long… That came out?

38:10 MH: Yeah, it’s been out for a while I think.

38:11 TW: It’s been out for a good while, yeah.

38:12 MH: Yeah. I had sort of seen it and then I was like, “Oh, that would be interesting.” But I saw that tweet and it just made me like, “Okay, I’ve gotta go get this book and read it next.”

38:24 TW: Are you finding any analyst relevant nuggets particularly or when you read it as an analyst, is there talk of…

38:33 MH: I will say I don’t know, but certainly they learned by experiencing things. They were trying stuff, but the early years of that company were such a fight for survival. He’s telling you how it really feels in a start up and it is not glamorous. It sounds awful. So yeah, but I feel like that’s the actual experience in a real start up, and that’s something today’s media and perception of startups don’t really give us visibility into that often, but it’s really compelling. And it’s like you feel so close to death that you feel very alive and that’s like, “Okay, that scares the crap out of me.” But wow! What an amazing story to have lived through those things. So anyways, I just…

39:23 TW: That’s a good one.

39:25 MH: I’m heartily enjoying it and big props to Ben Thompson for convincing me to give it a read.

39:32 TW: Do you feel like you’re… So there’s the podcast recommendation is season one of the StartUp podcast, which is on season four at this point. But since Gimlet Media has become fairly successful, not Nike successful, but it’s only been around for a couple of years. Their season one, where they were basically recording a podcast as they were literally trying to figure out what the company was and starting it up, was a fascinating one to listen to, which is not my last call. So I will throw in…

40:02 MH: We got the special podcast last call category for Tim now. So every episode…

40:08 TW: [chuckle] It’s like, “What podcast…”

40:08 MH: We go around the board and do a last call and we do a podcast last call for Tim.

40:15 TW: You just said it gave you the sense of being inside a startup. That was a great season.

40:20 MH: No, I just love that almost for any topic, you can be like, “Hey, I just heard a great podcast.” You are truly the quintessential podcast listener. That’s what I’ve heard people say about you.

40:30 TW: It’s because if you had a commute as long as mine, then you too would find a lot of podcasts.

40:37 MH: That’s what I don’t get. Where do you find the time?

40:41 TW: Where do you find the time? Travel.

40:42 MH: At least I have Atlanta traffic. That’s what I’ve got.

40:49 TW: What do you do at the dinner table? You actually talk to your kids? I just put the ear buds in.

40:54 MH: Who eats dinner at the dinner table? What’s that about?


40:57 TW: Hello 1957. So I got to go to the first ever MeasureCamp in the United States in Cincinnati a month and a half ago, and it was a transformative experience and I had high expectations. So a little chaotic, but also probably had one of the most fun sessions I ever had. So given that and we’re just two weeks out from MeasureCamp San Francisco. So I think MeasureCamps are going to take off in a few places in the US and it would be really cool to say you were at the first San Francisco one. So I believe that’s sanfrancisco.measurecamp.org. They might already be sold out. If they are, I think I heard through the grapevine that you can reach out to him and say, “But I’ll volunteer.” And you can wind up… You just have to do a little work, but you can actually still get in. So if it’s… And if it’s already sold out and there’s just no way in hell you’re gonna get in, then my apologies and move faster next time.

42:00 TW: That’s not my last call. My last call is actually a short little video which has a much longer paper behind it. Michael, are you familiar with Anscombe’s Quartet?

42:08 MH: They are a string or a brass quartet?


42:15 TW: So, back years and years ago, a guy was trying to make the point that you can’t just look at summary statistics for data, and so he took and came up with a data set where there were four data sets, they all had virtually identical linear regression, Pearson coefficients and means and medians, but then when you actually graph them, they are four very different things. So, it’s cool. Data visualization people like to use that as a way to say, “You need to visualize the data.”
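The point Tim describes can be checked directly. Below is a minimal sketch, using the published Anscombe's quartet values and only the Python standard library, that computes a few summary statistics for each of the four data sets. The function name `summarize` and the choice of statistics (means, Pearson correlation, least-squares line) are illustrative assumptions, not something taken from the episode.

```python
from math import sqrt

# Anscombe's quartet (Anscombe, 1973). Sets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return means, Pearson r, and the least-squares line for one data set."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "r": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


for name, (xs, ys) in QUARTET.items():
    stats = summarize(xs, ys)
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

Rounded to two decimal places, the four rows come out the same, yet a scatter plot of each set looks completely different (a noisy line, a curve, a line with one outlier, and a vertical stack with one outlier), which is the case for plotting before trusting the summary.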

42:46 MH: That’s why you always test for kurtosis and skew and that kind of stuff?

42:50 TW: Yeah. Or actually, my first stats professor ever said, “Always visualize the data. Always.” And I don’t even think he necessarily used it, but these guys at Autodesk Research have put out this paper and it’s got a little video that goes with it called “Same Stats, Different Graphs”, where they took this one data set, I think it was actually one where a guy had drawn a dinosaur with it to make a lesser point, but a valid point as well. And they said, “We’re gonna take the stats for that data set, and we are gonna draw 12 completely different visualizations, 12 totally different data sets.” They basically built this tool where they could say, “Oh, give me a data set that looks like a star that has these properties.” And it would build it. And so this video is just kind of fascinating to watch ’cause it shows these handful of stats running… And they don’t stay exactly the same. They’re the same to three decimal points, and then you can see the next four decimal points where it’s…

43:45 MH: Different.

43:46 TW: Spinning.

43:46 MH: Oh, okay.

43:47 TW: But it’s a case for… And actually, at MeasureCamp Cincinnati, there was a guy making some of the similar points, saying, “This is why you visualize stuff.” And I use the example of saying, if I look at, “Hey, which channel was the biggest channel in the last quarter?” Well, that may be horribly misleading if on a three-day period in the last quarter something blew up in social. So, hey, I don’t wanna look at just the totals. I wanna trend it to make sure that it’s really representative. So, it’s a cool… It’s less than a two-minute video, and it’s just kind of like, “Oh. Wow, that’s interesting.”
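The "biggest channel last quarter" example can be sketched in a few lines. Everything here is hypothetical: the channel names, the daily session counts, and the helper `channel_summary` are made up purely to illustrate how a quarterly total can hide a three-day spike.

```python
from statistics import median

# Hypothetical daily sessions for one 90-day quarter (illustrative numbers only).
DAYS = 90
daily = {
    "email": [1000] * DAYS,                          # steady every day
    "social": [400] * 87 + [20000, 25000, 18000],    # quiet, then a 3-day spike
}


def channel_summary(series):
    """Compare the quarterly total with figures that reveal its shape."""
    total = sum(series)
    top3 = sum(sorted(series, reverse=True)[:3])
    return {
        "total": total,
        "median_day": median(series),
        "top3_share": top3 / total,  # share of the total from the 3 biggest days
    }


for channel, series in daily.items():
    print(channel, channel_summary(series))
```

In this made-up data, social wins on the quarterly total, but its typical day is well below email's and most of its total comes from three days. Trending the daily series (or plotting it) makes that visible where the total alone does not.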

44:22 MH: Nice. Yeah, that’s definitely worth checking out, just because that is a challenge that is in the data sometimes, and you don’t necessarily know.

44:31 TW: Yep.

44:32 MH: Alright. Well, if you’ve been listening and you’ve been completely blown away by our insightful analysis [chuckle] of data democracy [laughter], just remember this: You’re either for us or you’re against us. Wait, no. [chuckle] I’m kidding. We would love to hear from you, though. And again, a big thank you and a shout out to Pawel for recommending this topic. I think it’s a really good one.

44:58 TW: Even if we didn’t do it justice. [laughter]

45:00 MH: Yeah. Well, but we made some great jokes throughout, if I do say so myself.

45:06 TW: Okay. [laughter]

45:08 MH: No. [chuckle] But, no, and I think this is one that could definitely bear more discussion, so we’d love to hear from you, whether that’s on our Facebook page or on Twitter, or on the Measure Slack, of which we are massive fans. So, get out there, give us your best data democracy jokes. If that’s… Whatever those are, and we’ll vote on the best ones or something. [laughter] Alright. See, ’cause it’s democracy. Voting, see?

45:40 TW: I get it. I get it. It’s all that humor. If you didn’t find this episode really funny, you just weren’t listening close enough ’cause it was subtle democracy humor.

45:49 MH: Well, it’s like the old Red Green Show goes. If they don’t find you handsome, at least find you handy. [chuckle] And so, if they don’t find you good at analytics, they should at least find you humorous.

46:02 TW: There you go. [laughter]

46:06 MH: Anyways. For my co-host, Tim Wilson, some have said he’s the quintessential analyst, and myself, keep trying to make data free. Oh, wait. Data wants to be free? No. Just keep analyzing.


46:24 Announcer: Thanks for listening, and don’t forget to join the conversation on Facebook, Twitter, or Measure Slack group. We welcome your comments and questions. Visit us on the web at analyticshour.io, facebook.com/analyticshour, or @analyticshour on Twitter.

46:44 TW: Smart guys want to fit in, so they made up a term called analytics. Analytics don’t work.

46:53 S4: Oh, alright then. Give me a good two by two grid.

46:56 MH: Yeah, exactly. See?

47:00 TW: Pom, pom, po-pom, pam, pa-pam, pa-pam. Oh. That was Hail to the Chief, wasn’t it?

47:04 MH: Hail to the Chief. At this point, by the time this show airs, Stat may not even be in business anymore.


47:19 TW: Wait, you said there were two things?

47:21 MH: Okay, I’m not good at numbers.


47:26 MH: It’s just how I feel. No, I don’t remember where I was going with that. Sorry. But I feel like that’s actually how… I imagine that’s coming through.

47:37 TW: Yeah, that’s coming through.

47:38 MH: Oh, my God.

47:39 TW: What is that going on?

47:41 MH: My kids in the next room playing with the doorstop.


48:24 TW: Rock, flag and eagle.
