Under the Radar

238: A/B Testing


00:00:00   Welcome to Under the Radar, a show about independent iOS app development.

00:00:04   I'm Marco Arment. And I'm David Smith. Under the Radar is never longer than 30 minutes,

00:00:08   so let's get started. Now, should we try two different intros here

00:00:12   to see which one is more effective? Sure, I can go first and then you can go.

00:00:16   Yeah, right. I can say welcome to Under the Radar, a show about

00:00:20   independent iOS app development. I'm David Smith. And I'm Marco Arment. Oh,

00:00:24   this show's never longer than 15 minutes, so let's get started. 30 minutes?

00:00:28   Our first test is off to a

00:00:32   roaring success. I don't have any practice doing that part of it. It's hard.

00:00:36   I did yours fine. Yeah, you nailed mine perfectly. But yes, so we are

00:00:40   going to talk, so everyone knows how much Marco and I love

00:00:44   automated testing, and specifically we're going to talk about a particular kind of that

00:00:48   A/B testing, which is honestly the only kind of automated testing I've ever actually

00:00:52   used. Is this really even, like could this even be the same thing as testing?

00:00:56   Should we even be using the same word? Because, you know, when most people say automated testing, they're

00:01:00   talking about a code level thing. Oh, sure, I know, I know.

00:01:04   But this is the kind of testing that I can get behind, and it's automated

00:01:08   because I built a system to do it, and it happens automatically. I would maybe

00:01:12   call this more like marketing science. Sure. But the

00:01:16   fancy word that everyone uses for it is A/B testing.

00:01:20   And I think it's something that I looked down on for a while.

00:01:24   I don't know, maybe looked down is the wrong word for it, but I feel like it was always this thing where you get these

00:01:28   articles that were passed around every now and then, where it's like, ooh, Google

00:01:32   tested 46,000 shades of blue for their link in their home screen,

00:01:36   and they're written as if, you know, the app, or all their

00:01:40   design decisions are designed by computers, and it's lost, you know,

00:01:44   a sense of soul or artistry, and you kind of pass it around in that way.

00:01:48   And I feel like sometimes it's easy for me to think of it in those terms,

00:01:52   but recently, this is something that I've started using,

00:01:56   and it's a tool that, it's like, surprise, surprise, people are using it because it's useful, and

00:02:00   because it can answer questions that are very difficult to answer otherwise,

00:02:04   and lets you in some ways be more creative,

00:02:08   I find, which is something I want to talk about later, but I think it's really

00:02:12   important as a developer with limited resources

00:02:16   to consider this as one of the tools that are available to us, that

00:02:20   it's something that we can use to make our apps better, to make our apps

00:02:24   better in ways that are actually meaningfully and measurably better,

00:02:28   not just notionally better, but before I dive

00:02:32   too much into that, I think we're just going to start with kind of a general overview

00:02:36   of what A/B testing is, and the mechanics of kind of how it works. Does that make sense?

00:02:40   Yeah, I actually would love to hear about this from you, because

00:02:44   this is exactly the kind of thing that I wouldn't do

00:02:48   until you did it and told me, and kind of convinced me to do it, because

00:02:52   you are so much more pragmatic and experimental than I am.

00:02:56   When I hear something like A/B testing, which I don't think

00:03:00   I've ever actually really done, like, I've done

00:03:04   like, you know, notional things like, oh, I'm going to change the wording of this thing in this

00:03:08   build, and then see what happens, but then I'm not actually testing at the same time, so

00:03:12   I'm like, you know, I'm not controlling the variables, so it's not really A/B testing.

00:03:16   You know, where you are very good at pushing the boundaries

00:03:20   of what independent developers think we should be doing and not doing,

00:03:24   you are more open-minded to say, like, you know what, this thing that, like,

00:03:28   other people do, that indies think is not for us, or not appropriate

00:03:32   for us, or not something we need to do, you are more willing to try it,

00:03:36   and then report back to us, and kind of convince us in

00:03:40   more evidence-based and less

00:03:44   emotional or less assumption-based reasoning than we would use, like, whether we should

00:03:48   actually be looking at this or not. Thanks, that's encouraging to hear, but

00:03:52   I'll just do my overview.

00:03:56   So A/B testing is a method by which

00:04:00   you can evaluate the relative performance of two things

00:04:04   in a way that, at the end of the

00:04:08   test, you have reasonable confidence that one is better or

00:04:12   worse than the other, or they're the same, I suppose.

00:04:16   It ends up with this sort of comparison function between two things.

00:04:20   And in software, what we're typically doing is

00:04:24   we're going to segment our user base, our audience, whatever it is, into

00:04:28   two groups. For the purposes of this, I'm going to do two groups. You can do A/B testing

00:04:32   with more than two groups, it's just the math and

00:04:36   implementation of that gets very complicated, but it's like, in my version of this, I've done

00:04:40   it with multiple variants and you just kind of have to account for that. But for simplicity,

00:04:44   say you take your audience and you split it into

00:04:48   two groups, and you're going to want to try and make those groups as

00:04:52   similar as possible so that you're not creating

00:04:56   some kind of bias in your system. So say, for example, you wouldn't want

00:05:00   to do it so that midnight to noon

00:05:04   UTC is one group, and noon to midnight is

00:05:08   the other, where you're splitting it based on time of day.

00:05:12   That would be a bad way to segment your group because now you're creating these other variables

00:05:16   that might impact how people are responding to whatever it is you're showing them.

00:05:20   You want to segment them essentially randomly. And so for my A/B testing,

00:05:24   I'm doing this completely randomly. It's just

00:05:28   using Swift's random functions to adjust

00:05:32   for this, and so far it seems it's been working great. And I think it's close enough

00:05:36   for my purposes that I'm not trying to do something more fancy, but you just want to have

00:05:40   two equally representative groups. And then you're going to show

00:05:44   each of those two groups something different.

00:05:48   In my case, I started doing this for trying to work on improvements to the paywall

00:05:52   in Widgetsmith. That's something that I'm wanting to work

00:05:56   towards making it better. I want to make it more, you know, sort of actually have a higher

00:06:00   conversion rate and, you know, increase the number of people who are

00:06:04   starting subscriptions. That was my ultimate goal, is, you know, how can I change my

00:06:08   paywall so that that happens? And so the first thing I did is I took my original

00:06:12   paywall and then I created a new one, and it was slightly different. And it was different in, like,

00:06:16   I changed the buttons on the bottom of my paywall. It was the first A/B test I did.

00:06:20   And for one of my groups, I showed them the

00:06:24   one, and for the other I showed them the other. And then I instrumented my app

00:06:28   so that I can tell essentially how many times did you show A,

00:06:32   how many times did you show B, and then what was the relative conversion

00:06:36   rate of A versus B. And you, you know, I put this

00:06:40   in my app, I run it, and what I start to get back is, you know, as they're

00:06:44   reporting back their data, I'm getting the number of times the paywall was shown and the

00:06:48   number of times that, you know, someone started

00:06:52   a subscription for each of these two groups. And in some ways you would think,

00:06:56   well, that's all you need. But

00:07:00   this is where you start to get beyond my expertise but into an area that I think I can describe

00:07:04   but not understand. And this is the, sort of the concept of

00:07:08   statistical significance and whether

00:07:12   the differences you're seeing are actually meaningful and you should interpret them

00:07:16   as being true, or if they are potentially more likely to be

00:07:20   a result of chance. And the way that I think about this is

00:07:24   say you were trying to see if a coin was fair. And so you, you know,

00:07:28   you flip a coin, and in theory, you know, half the time it should come

00:07:32   up heads and half the time it should come up tails. And so you flip a coin

00:07:36   once and it's heads. So right now you have 100% heads

00:07:40   and 0% tails. And so if you stopped your test there, you'd be like, wow,

00:07:44   heads is way more likely than tails. But you obviously

00:07:48   don't have enough data to really draw that conclusion yet. So you flip it again and it's also heads.

00:07:52   So now you have two heads and zero tails. And it's like, wow, this, you know, this paywall is

00:07:56   performing amazingly. You know, A is so much better. But obviously the more you flip

00:08:00   the coin, over time, if it's a fair coin, you'd end up with

00:08:04   50/50. And there's just this sort of statistical noise

00:08:08   that you can end up with, where you can have streaks, you know,

00:08:12   maybe in my case I could have streaks where people are just being more

00:08:16   generous, or more excited, or whatever it is. And you have

00:08:20   these things that are kind of happening, but it

00:08:24   isn't actually representative of the fundamental difference.

00:08:28   And so there are these big formulas that you can just sort of plug your data into, and

00:08:32   you know, essentially, the more trials you have,

00:08:36   the easier it is to say that there's a significant difference between one thing

00:08:40   or the other. And obviously, if you have big differences between them, which

00:08:44   in a couple of my cases I actually did, where, you know, it's like one of them has

00:08:48   this performance and the other one has, like, twice the performance.

00:08:52   You actually don't need quite as many trials to say that that one is actually better

00:08:56   but if you're in any of these cases where one is 5% better or 10% better

00:09:00   you actually need a relatively high number of trials before

00:09:04   you can be confident about this. And so the way I've been doing this is, you know, there's lots

00:09:08   of like websites or calculator things that you can just punch your data into

00:09:12   and it will tell you, you know, how likely can you be, how confident can you be

00:09:16   that this difference that you're seeing is actually meaningful and not just

00:09:20   the result of kind of statistical noise going back and forth.
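The "big formulas" behind those calculators are typically a two-proportion z-test. A hedged sketch of what such a calculator computes, with made-up numbers rather than any real Widgetsmith data:

```python
from math import erf, sqrt

def ab_significance(shows_a, conversions_a, shows_b, conversions_b):
    """Two-proportion z-test: two-sided p-value for the hypothesis
    that variants A and B have the same conversion rate."""
    p_a = conversions_a / shows_a
    p_b = conversions_b / shows_b
    # Pooled rate under the null hypothesis of "no difference".
    p = (conversions_a + conversions_b) / (shows_a + shows_b)
    se = sqrt(p * (1 - p) * (1 / shows_a + 1 / shows_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF.
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

# A small (5% relative) lift with 2,000 shows per group: not significant yet.
p_small = ab_significance(2000, 100, 2000, 105)
# Conversion roughly doubling with only 500 shows per group: clearly significant.
p_big = ab_significance(500, 25, 500, 50)
```

The two examples mirror the point being made: a small difference needs many trials before the p-value drops, while a doubling can be significant with far fewer.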

00:09:24   And the end result is that you'll end up with some kind of improvement

00:09:28   and a confidence score. And so, you know, so my old paywall versus my new

00:09:32   paywall, my new paywall performs better, it performs better by this percent

00:09:36   and in my case I could be, like, 99.9% confident

00:09:40   that it was an actual improvement. And then from there you just can

00:09:44   kind of continue to iterate on this and you can continue to say, well what if I came up with

00:09:48   another test to run or another option to try

00:09:52   and you can just keep adding options into this and you continue to see what's the

00:09:56   relative conversion rate, how do they compare against each other

00:10:00   and kind of go from there. With the ultimate goal, obviously, of trying to say

00:10:04   what is the best, you know, the best version of this that

00:10:08   I can make. And as long as you have a good definition of best,

00:10:12   you're in a good place. And for something like a paywall it's relatively easy because

00:10:16   I can just say, you know, what I want

00:10:20   is someone to start a trial. That's my goal and that's pretty straightforward.

00:10:24   If you're doing something where you're trying to like A/B test, you know, colors

00:10:28   in something, you know, which color should I make my,

00:10:32   you know, the text in my app and you're going to be like, well I want to improve

00:10:36   retention. It's like, retention is a very hard thing to measure that takes a very

00:10:40   long time and is impacted by a lot of different factors, so that's probably a hard thing to

00:10:44   measure. But something like this where you have a very clear goal, you can segment

00:10:48   very easily, you know, essentially every time someone taps on a paywall

00:10:52   I have an opportunity to segment them, say which group are you going to be in, show it

00:10:56   to that person, and then I can measure the result.
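The instrumentation described here boils down to two counters per variant: how many times the paywall was shown, and how many of those led to a subscription. A minimal sketch, with hypothetical event names:

```python
from collections import defaultdict

# counts[variant] = [shows, conversions]; the variant names are illustrative.
counts = defaultdict(lambda: [0, 0])

def record_show(variant):
    counts[variant][0] += 1

def record_conversion(variant):
    counts[variant][1] += 1

def conversion_rate(variant):
    shows, conversions = counts[variant]
    return conversions / shows if shows else 0.0

# e.g. variant "B" shown 200 times, with 12 subscriptions started:
for _ in range(200):
    record_show("B")
for _ in range(12):
    record_conversion("B")
# conversion_rate("B") → 0.06
```

In practice these counts would be aggregated server-side per variant, but the comparison is the same: relative conversion rate of A versus B.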

00:11:00   And I think the only other thing that I want to mention is that, you know, something I found that was relatively

00:11:04   important for things like this is, when you're doing that

00:11:08   segmentation, to be slightly sticky with it, you know,

00:11:12   to put a user into one of the groups and have them stay there

00:11:16   in my case, I have them stay in there for at least two or three days

00:11:20   because otherwise, you know, if every time they open the paywall they're seeing a different

00:11:24   one, to some degree that's helpful, but then they're also

00:11:28   gaining the exposure to all of the sort of subsequent ones, and what you're actually measuring

00:11:32   is, well, if I show the user all three of my paywalls, the third time I show it to them, they'll

00:11:36   be successful, and that gets really complicated, and so, have some sense

00:11:40   of stickiness to that, and then in my case, after three days, it resets

00:11:44   and does a fresh check to see, it checks in with the server and says, are we still

00:11:48   running this test, and if they're in a group that is still valid, it will

00:11:52   randomly reassign them again. But anyway, hopefully that made sense,

00:11:56   that's sort of the general process for this, and the end result is that, you know,

00:12:00   in my case, it's like, I've been able to have a more performant paywall because I tried a lot of different

00:12:04   things, things that I wasn't sure about, and I can validate the end result with

00:12:08   actual data to know that, yep, these are better, this is, you know,

00:12:12   more people are starting trials than before, and, you know, the app

00:12:16   is more sustainable as a result.
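The sticky segmentation described above, assign a user once, keep them in that group for a few days, then re-randomize, could be sketched like this. Python for illustration only; the storage keys and three-day window are stand-ins for whatever per-user state (UserDefaults, a server-side record) an app actually persists.

```python
import random
import time

STICKY_SECONDS = 3 * 24 * 60 * 60  # keep a user in a group for ~3 days
VARIANTS = ["A", "B"]

def get_variant(store, now=None):
    """Return the user's sticky variant, re-randomizing after expiry.

    `store` stands in for whatever per-user state you persist;
    the keys here are invented for this sketch.
    """
    now = time.time() if now is None else now
    if "variant" in store and now - store["assigned_at"] < STICKY_SECONDS:
        return store["variant"]  # still inside the sticky window
    # First visit, or window expired: draw a fresh random assignment.
    store["variant"] = random.choice(VARIANTS)
    store["assigned_at"] = now
    return store["variant"]

# Repeated opens within the window see the same variant.
user = {}
first = get_variant(user, now=0)
assert get_variant(user, now=3600) == first
# After the window lapses the user is re-randomized (and may match by chance).
later = get_variant(user, now=STICKY_SECONDS + 1)
```

The expiry step matches the "fresh check" described above: if the test is still running, the user is randomly reassigned rather than locked in forever.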

00:12:18   We are brought to you by Supporter. In past episodes, you've heard us talk about how to handle

00:12:22   customer support in your apps. Well, you can add a native, great-looking support section

00:12:26   to your app in only three minutes with Supporter. Bring your support section

00:12:30   to the next level with step-by-step instruction pages, frequently asked question toggles,

00:12:34   image and video support, release notes, contact options, markdown support,

00:12:38   great accessibility, and more, and it's only 62 kilobytes.

00:12:42   You can use Supporter with local data, or you can load remote JSON files

00:12:46   for easy updating without having to update your whole app, and you can localize your support

00:12:50   section based on the language of your user. You can use the optional count-based analytics

00:12:54   to keep track of which support section gets visited the most, that way you know exactly which

00:12:58   parts of your app are unclear. This, Dave, this sounds a lot like the way you did yours manually.

00:13:03   With your one-time purchase of Supporter, you will receive

00:13:07   the full Swift UI source code, so you can customize every little detail,

00:13:11   clear instruction videos on how to build a great support section, and full documentation

00:13:15   with example JSON. So, this is obviously

00:13:19   right up the alley of a lot of our listeners and possibly the two of our hosts here.

00:13:23   So, head to Supporter.goodsnooze.com

00:13:27   like snooze like sleeping, so Supporter.goodsnooze.com

00:13:31   Use code "under the radar" for 20% off

00:13:35   to get Supporter for just $20. Add a complete support section

00:13:39   to your app in only 3 minutes with Supporter. Once again, that URL

00:13:43   is Supporter.goodsnooze.com. Code "under the radar"

00:13:47   for 20% off to get it for just $20. Our thanks to

00:13:51   Supporter for their support of this show. Thank you so much to Supporter.

00:13:55   So, I'm actually really curious, you know, this world of

00:13:59   A/B testing, so you brought up a few things I would not have thought of, like the stickiness

00:14:03   thing of like, you know, make sure somebody's being shown the same thing on subsequent

00:14:07   attempts within a time interval. That actually makes a lot of sense and that's something that

00:14:11   if I was just doing like a little naive implementation of this, I probably would not have thought of that.

00:14:15   The big, I think, dilemma and the big nuance point here

00:14:19   for a lot of indies is similar between A/B testing and analytics.

00:14:23   You know, we've talked about analytics before and we both perform some

00:14:27   level of very basic first party analytics in our apps.

00:14:31   You know, we don't integrate like big analytics packages, but, you know, we both have kind of like these

00:14:35   custom stuff running on our own servers to just say like, "Oh, you know, this percentage of

00:14:39   the user base uses this feature of the app or whatever."

00:14:43   And I think my big challenge with analytics has been

00:14:47   over time, like figuring out what is the right balance here?

00:14:51   And what analytics can I collect that are actually

00:14:55   actionable? And that to me is always like the key

00:14:59   like gotcha factor is, am I collecting

00:15:03   this data because I'm just curious and I want as much

00:15:07   data as possible, which is not often a great thing, especially in a privacy conscious world?

00:15:11   Or am I collecting this data for some kind of decision

00:15:15   to be made in the future? Like is this actually actionable data?

00:15:19   And I feel like with A/B testing you have a similar issue with

00:15:23   that and you have a couple of new issues. So the similar issue is, are you

00:15:27   testing something that really needs to be tested? Can you

00:15:31   just figure out like which of these text labels is more clear on this button

00:15:35   or do you really need to test it? And then if you test it, are you

00:15:39   actually going to get like a really significant difference between two options

00:15:43   or are you most likely to get like a very small, you know,

00:15:47   not super significant difference between the two? And there's going to be

00:15:51   different areas of your app where you have different choices to make here. And then

00:15:55   obviously things like paywalls or whatever

00:15:59   your business goal for the app is, that seems like the most important

00:16:03   place to put something like this. Things like

00:16:07   if you have a login or account creation process, you want to know like

00:16:11   what kind of tweaks do you have on your first launch experience

00:16:15   that can make more people proceed and create the account or set up the thing

00:16:19   or whatever they need to do. And then obviously when it comes to how you make your money

00:16:23   if there's an in-app purchase or whatever, you obviously want to instrument that

00:16:27   as well as you can to be as optimized as it can be within

00:16:31   reason. And I think that's the key part.

00:16:35   People made fun of the Google 40,000 shades of blue thing because it seemed ridiculous

00:16:39   and it did seem more of like a micro-optimization.

00:16:43   And I think you can definitely get bogged down a lot in

00:16:47   that sort of thing of like, you think you're optimizing the heck out of something

00:16:51   but you're actually spending a huge amount of time, and wasting

00:16:55   privacy and data, to possibly only get something

00:16:59   within a few percentage points of what it was before. And so I think this is one of those areas

00:17:03   where a little goes a long way, just like analytics. You kind of want coarse-grained

00:17:07   analytics to know like, do people really use this feature or do I really need to be doing this thing?

00:17:11   But to get too fine-grained with it, I think is

00:17:15   a possible trap of infinite work

00:17:19   that you might be creating for yourself for very small gains.

00:17:23   Yeah, and I think with that too, one thing that I like about A/B testing

00:17:27   that is sort of related to the tension around

00:17:31   analytics and around whether you're collecting too much, is it useful, is it private, etc.

00:17:35   is, I do kind of like with an A/B test that in some ways it has a defined

00:17:39   duration, that the purpose of this is

00:17:43   to try this versus this, and once you have an answer, once you

00:17:47   have collected enough data that it's just statistically valid, you can

00:17:51   stop the test, stop collecting data, and it goes away

00:17:55   in a way that a lot of analytics and a lot of data collection is intended

00:17:59   to be something that sort of goes on forever. But like in a weird way

00:18:03   the one small sort of nice thing about this kind of

00:18:07   data collection is that it's very time limited, which doesn't mean that you should be any less

00:18:11   cautious with the data, and like I'm very thoughtful in the way I build my system that is like I don't know

00:18:15   anything about these people, all I know is that someone opened a

00:18:19   paywall, someone started a membership, I

00:18:23   don't do any connection to the actual membership itself, I'm not trying to collect

00:18:27   or be creepy in that way, but there is something nice about this that it's

00:18:31   very time limited, which both in terms of my time and my energy, like my

00:18:35   goal with doing this is I want to as quickly as I can

00:18:39   narrow in on, like, what is within a few percentage

00:18:43   points of the most optimal paywall that I can make,

00:18:47   find that, and then move on to other things, move on to

00:18:51   improving the app in other ways, like adding features and doing the things

00:18:55   that I like the main business of my app, rather than getting too

00:18:59   stuck in, like I'm sure at some point I would hit, like I have yet to hit that

00:19:03   point yet and I'm continuing to run experiments and trials on improving my paywall,

00:19:07   but eventually I will hit that point where I try something that's another improvement

00:19:11   or another idea, and at some point it's like this is the best I can do, this is

00:19:15   what it is until I can come up with another whole

00:19:19   concept or something more fundamentally changes, and at that

00:19:23   point it's like I'll move on and I can close this down

00:19:27   and stop collecting this data, and so that is certainly one nice thing

00:19:31   about this, is that it is time limited in a way that a lot of data collection isn't.

00:19:35   I think the other possible major pitfall that you could run into here

00:19:39   is people treat data

00:19:43   like whatever result you get from an A/B test, you're like "Oh, we have data on this now"

00:19:47   and people treat that as gospel, and it's like

00:19:51   this data, you cannot argue with the data, and the data

00:19:55   means if we do this, we will make more money and therefore we are doing better, so therefore

00:19:59   we must do this because the data supports that, and I think you really have to be careful

00:20:03   with what you're measuring,

00:20:07   how you are evaluating the score

00:20:11   of the different options that you're testing, like what exactly are you evaluating

00:20:15   when you're saying that A is better than B or vice versa, and

00:20:19   that doesn't necessarily result in

00:20:23   broad-picture, long-term success,

00:20:27   and it doesn't reflect larger-scale

00:20:31   factors in your app, so for instance, we hear a lot about

00:20:35   "Oh, the company X tested this thing that seems like people

00:20:39   wouldn't like it, but it turns out it performs better and it makes them more money"

00:20:43   and it's like, well yes, maybe, but

00:20:47   if it makes people hate the app more, or if it makes the app seem

00:20:51   overall crappier or more user hostile,

00:20:55   or people just kind of get a little bit annoyed by it,

00:20:59   then that might have long term negative effects on your customers and on

00:21:03   their willingness to use your app or their impression of you, or how often they'll

00:21:07   come back to it, like if people do the thing you want them to do,

00:21:11   but are encountering friction or annoyance while

00:21:15   doing it, then that's very likely to make them

00:21:19   come back less often, and if the time comes that somebody

00:21:23   else comes around offering the same thing that you do, maybe they'd be more willing to switch away from you

00:21:27   because they actually have a kind of ambivalent relationship

00:21:31   towards your app. Whereas,

00:21:35   by the numbers that you are measuring from your testing, it might seem like you are succeeding

00:21:39   by making certain decisions, whereas you actually might be

00:21:43   irritating your customers or just giving them little paper cuts, and so you really

00:21:47   have to be very, very careful how you are judging

00:21:51   how well you're doing here, because what you are measuring with an A/B test is

00:21:55   a very small thing, usually.

00:21:59   It's literally the definition of missing the forest for the trees; it's like you are

00:22:03   looking at one tree, you are missing the entire forest around you, and so

00:22:07   you really have to be very

00:22:11   inquisitive about, like, if I do this thing, if I

00:22:15   increase the success of this thing by this metric,

00:22:19   am I going to possibly cause other negative effects that I'm not measuring?

00:22:23   Because all you're optimizing for in that case is the thing you're measuring.

00:22:27   But that doesn't necessarily mean you're making a better overall app or a better overall business.

00:22:31   Yes, no, and I think that's an excellent point to make.

00:22:35   And the obvious version of that would be

00:22:39   say I started lying in my paywall, and say, like, you sign up

00:22:43   and I'll send you a pony, and then, like,

00:22:47   my conversion rate might go up. Yay! It's like, okay, but I'm not actually sending people

00:22:51   ponies, and so that's going to make people really upset, and ultimately

00:22:55   lead to lots of consequences. And so it is, you have to be very careful

00:22:59   that, with the changes you're making, it's like,

00:23:03   ideally you're choosing between two good choices, and it's like which of these

00:23:07   is the most good, rather than trying to

00:23:11   be putting yourself in a place that, yes, you could be

00:23:15   optimizing yourself into a worse app that is going to perform

00:23:19   worse on more sort of macro measurements, where, it's like,

00:23:23   people are going to be less likely to recommend it to their friends, people are going to be less likely

00:23:27   to continue using it, that they might sign up for a subscription, but they'll only do it

00:23:31   once, and they'll churn away. And really,

00:23:35   you know, it'd be much better overall for your business, almost certainly, to have

00:23:39   a smaller number of people who sign up who keep

00:23:43   their subscription every single month going forward, rather than have this very sort of

00:23:47   churn heavy, lots of people signing up and then immediately canceling, signing up and then

00:23:51   immediately canceling, like, that seems like the version

00:23:55   where they stick around is going to be much more sustainable and better for your business, and so

00:23:59   being careful and thoughtful about this, and I think

00:24:03   that's just, in some ways, it's just another form of design, it's another kind of way

00:24:07   of understanding that, you know, this is a tool that

00:24:11   you can use, but you shouldn't overuse it, or use it in ways

00:24:15   you're not comfortable with. Like, never create an option that you're not

00:24:19   comfortable with, or you're not excited about, or you don't think is good.

00:24:23   Instead, I think in some ways, for me it was,

00:24:27   I realized that I was designing a lot of things with assumptions

00:24:31   or things that were built for me, and people who think like me, and

00:24:35   people who are developers. It's like, I was designing

00:24:39   a paywall screen that made sense to me and looked good. But what I

00:24:43   was sort of increasingly finding is that my paywall screen that I had made initially

00:24:47   was confusing, and what I've been able to do with my refinements is to

00:24:51   make it less confusing, and that's the kind of improvement that's like, that's great,

00:24:55   this is exactly what I want, I want to make it clear, I want to make sure that people who are signing up

00:24:59   sort of know what they're doing, and the way that I saw that is

00:25:03   like, I also measure, you know, the number of people who hit

00:25:07   the start subscription button, but then don't actually

00:25:11   complete the purchase. Like, I keep track of, you know,

00:25:15   whether people follow through: they hit buy, and then Apple's

00:25:19   screen takes over, and it's like, I was able to get my cancellation

00:25:23   rate down, which to me says, like, in terms of the number of people who hit

00:25:27   that button, the number of people hitting the start subscription button, and then actually start a

00:25:31   subscription, went up, and it's like, that's perfect, that means people were not confused,

00:25:35   whereas in my old version, people were hitting that button, and then they're like, wait, what? You're going to start

00:25:39   charging me money, or, you know, the pricing wasn't clear, or whatever it was, for whatever reason, they said,

00:25:43   cancel, that's not what I want, and getting that number down, I think

00:25:47   was like, yep, I'm going in the right direction. But yeah, absolutely, it's important

00:25:51   to make sure, it's like, if you're

00:25:55   choosing between two choices, you should like both choices before you

00:25:59   start down this road, otherwise, you know, you can optimize yourself into a really

00:26:03   bad place. Yeah, like, common sense first, before you

00:26:07   even create the test, and always, and keep common sense in mind, you know, like,

00:26:11   the reason why design by committee is

00:26:15   considered so bad, and tends to produce bad results,

00:26:19   is because design by committee reduces the chances

00:26:23   for an authoritative human decision, for someone to say, wait,

00:26:27   this is not good, even though we're all kind of trying to build

00:26:31   consensus. One person can say, wait, this is not good,

00:26:35   we should just do this. And data is

00:26:39   a way to basically have the ultimate design by committee.

00:26:43   That's not often a good thing. Ideally,

00:26:47   you have an opinionated human with high

00:26:51   standards and decent taste, whose decisions are being

00:26:55   informed by data, and that's a very different thing

00:26:59   than making every decision with as much data as possible. I think

00:27:03   that's a much better balance: you have to

00:27:07   have somebody who's providing the human filter above everything.
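Dave's paywall funnel metric from earlier in the conversation, taps on the start subscription button versus purchases actually completed after Apple's confirmation sheet appears, boils down to a cancellation rate per design variant. This is a rough sketch with made-up variant names and numbers, not code from the show:

```python
# Hypothetical funnel counts per paywall variant: taps on the
# "start subscription" button vs. purchases actually completed
# after Apple's confirmation sheet takes over.
variants = {
    "original": {"taps": 400, "purchases": 220},
    "refined":  {"taps": 400, "purchases": 310},
}

def cancellation_rate(taps: int, purchases: int) -> float:
    """Fraction of users who tapped buy but backed out of the purchase."""
    return (taps - purchases) / taps if taps else 0.0

for name, counts in variants.items():
    rate = cancellation_rate(counts["taps"], counts["purchases"])
    print(f"{name}: {rate:.1%} cancelled after tapping buy")
```

A falling cancellation rate across variants is the signal Dave describes: the same number of taps turning into more completed subscriptions.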

00:27:11   So, a common thing we've been talking about recently in

00:27:15   these circles is the discussion around streaming app

00:27:19   quality that John Siracusa started on his blog, Hypercritical, about

00:27:23   all the different TV streaming apps. One of the things is, the most

00:27:27   common thing people want to do is resume the show they were already watching, but

00:27:31   everyone learned through data that if you move that below the fold

00:27:35   of the launch screen, then you'll make more money, because people will basically

00:27:39   accidentally look at more content on their way to the thing they actually want to do, even

00:27:43   though everyone hates it. All the customers are hating this, but

00:27:47   technically the data says it performed better on certain metrics, and this is

00:27:51   such a great example of there not being enough humans involved in that decision. Data

00:27:55   overwhelmed it to the point where now we have worse

00:27:59   products, people are less happy, they are actively being annoyed

00:28:03   by all these streaming apps. And so, if

00:28:07   they're constantly being annoyed by the streaming apps, and then some other service comes along

00:28:11   that they're less annoyed by, they're going to want to spend more time on that one.

00:28:15   So it's complicated,

00:28:19   but I think the answer is: let data help

00:28:23   inform you, the human with good taste who respects your

00:28:27   customers, to make a better product overall, rather than letting data

00:28:31   optimize all of the humanity and user satisfaction out of your product.

00:28:35   Yeah, and I think of it as a tool

00:28:39   to be more creative, and that's where I'll leave this: the thing that I've

00:28:43   really enjoyed about this is looking at some of these screens in my apps

00:28:47   and saying, okay, how could I design this in a

00:28:51   way that I think is great, I think is awesome, but is different? Coming at it

00:28:55   from a totally different perspective and saying, if I was designing this from

00:28:59   scratch today, what would I do? And trying to come up with different versions that I

00:29:03   think are all good, all humane, all coming from my

00:29:07   design aesthetic and the way I want to structure my business and the kind of business I want to have.

00:29:11   In some ways, there's a freedom in it: rather than having

00:29:15   to design only one thing that everything has to

00:29:19   be, I can design five, each of them different

00:29:23   in different ways, and then I can explore which of those are actually

00:29:27   more understandable, more communicative, getting my point across better

00:29:31   to my customers. And I found that to be incredibly satisfying creatively,

00:29:35   rather than feeling like I have to just magically

00:29:39   design the right thing, come down from my ivory tower, and show it to my customers.

00:29:43   Instead, this is a way to design lots of things and to make that process in a weird

00:29:47   way slightly more collaborative, which has been really enjoyable.
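One practical detail behind "I can design five": each user needs to be bucketed into one variant consistently across launches, or the metrics per variant get muddied. A common approach, sketched here in Python with hypothetical names rather than anything from the show, is to hash a stable user identifier together with the experiment name:

```python
import hashlib

def assign_variant(user_id: str, variants: list, experiment: str) -> str:
    """Deterministically bucket a user into one variant.

    Hashing the user id together with the experiment name keeps the
    assignment stable across app launches, and independent between
    different experiments running on the same users.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]

paywalls = ["paywall_a", "paywall_b", "paywall_c", "paywall_d", "paywall_e"]
print(assign_variant("user-123", paywalls, "paywall-experiment-1"))
```

Because the assignment is pure, the app can recompute it anywhere without storing which variant a user saw.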

00:29:51   Thanks for listening everybody, and we'll talk to you in two weeks. Bye.
