Join us and our friends at PostHog to learn all about building A/B testing infrastructure into your CMS with their killer experimentation functionality.
Speaker 0: And we are live. Alright. Welcome. Welcome. Welcome.
Super excited to kick off this webinar. It's been a long time in the making for me. I have been knee deep in A/B testing over the last couple weeks, so super excited. We are going to be covering how to build A/B testing inside your CMS with PostHog and Directus. As everybody trickles in, if you are in the chat, let us know where you are from.
Hop in the chat. Let us know. Awesome. I'm Bryant Gillespie from Directus. I see a few of you in the chat already know me.
We also have Juraj from PostHog. Juraj, nice to have you.
Speaker 1: Hey, everybody. Nice to meet you. I'm Juraj. I live in Amsterdam, Netherlands, and I've been at PostHog for about a year and a half. For the last year or so, I've been working hard on our A/B testing tool, which I'm hoping to demo for you a bit today.
Speaker 0: Yes. I'm excited because I'm gonna learn a bit on this one as well. Obviously, I've done a lot of the technical implementation, not just for our own use case at Directus, but for this amazing bonus that we've got for everyone at the conclusion of this. But I haven't messed around with all of the metrics and the testing, and all of the config inside PostHog is tremendously powerful. So I'm looking forward to seeing how that is cooked up on your end.
Alright. Let's cover the agenda, and then we'll give a brief overview of both Directus and PostHog, just because I saw some questions in the sign-up from people that weren't familiar with either one. So this is obviously the awkward introduction phase. Hi. Okay.
I see everybody. Canada, Tampa, Florida, Texas. Amazing. Nashville. Next, we're going to kick it over to Juraj, where he'll cover a demo of PostHog: how you set up experiments, how you run A/B testing, what's a feature flag, what's not, and what you should be thinking about as you're testing.
And then we're going to do a live jam sesh of how to connect Directus and PostHog using the starter kit that we've created. We'll open this up for Q&A at the end, and I'll show you guys how to get this amazing bonus with the working source code, fingers crossed, so you don't have to invest the time and headache that I have invested over the last couple weeks.
But with that, Juraj, maybe you want to talk a little bit about PostHog for those who are unfamiliar with the tool.
Speaker 1: Sure. I guess it'll be easier if I share my screen right away. Yeah. Go for it. Let me do that.
Alright. Well, so PostHog started as a product analytics platform, but we've really evolved into kind of an all-in-one solution which allows you to build great products. Besides product analytics, we have today more tools such as session replay, feature flags, surveys, data warehouse, and, of course, experiments, which we'll talk about today. Experiments basically allow you to test variations on your website, test different changes, and see if those changes lead to some kind of improvement in the behavior of your users, which is then, of course, visible in the metrics that you are tracking. The way it works is that you use PostHog's feature called feature flags.
Feature flags basically assign different variations of your website to your users. Usually, by default, you will be testing two variations on your website, normally called control and test. Let's say user A gets the variation control and user B gets the variation test. They will both see something different on your website, and then PostHog will track the behavior of your users on the website by capturing events.
Then we aggregate those events on our side, and we basically calculate results for you, which tell you whether a particular variation is better than some other variation. So like I said, every experiment is backed by a feature flag, but you don't really need to know much about feature flags at all. The feature flag will be created for you when you set up an experiment. All you have to do is basically create the experiment, and I believe Bryant will later show you how to do that via Directus.
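(For reference, here's a minimal sketch of those mechanics using posthog-js. The flag key and event name are illustrative, not from the webinar project.)

```typescript
import posthog from 'posthog-js';

// Initialize the client with your project API key.
posthog.init('<your-project-api-key>', { api_host: 'https://us.i.posthog.com' });

// The experiment's feature flag assigns this visitor to a variant.
posthog.onFeatureFlags(() => {
  const variant = posthog.getFeatureFlag('new-homepage-headline'); // 'control' | 'test'
  if (variant === 'test') {
    // ...render the test variation of the page
  }
});

// Behavior is measured by capturing events; the experiment's metrics count them.
posthog.capture('sign_up_completed');
```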
Speaker 0: We will. And,
Speaker 1: basically, it all kind of happens under the hood. All you need to do then to analyze your experiment is just go to your experiments in PostHog, and you will be able to analyze your results there. So I'm going to open an example experiment to show you what a results analysis might look like. Over here, I have a running experiment open. You can see that it was started about two months ago, which would be a pretty long-running experiment, but that's just because it's some test data.
But, basically, the core of each experiment is the metrics that you are tracking. Right? It's these metrics that tell you what exact changes in behavior your changes in your content or your experience are producing. In this particular experiment, I'm tracking six different metrics, three primary and three secondary. And the difference between primary and secondary is just a way of organizing your metrics.
The primary metrics are the ones that actually inform whether your experiment is successful or not. Secondary metrics are kind of like guardrails, so they're something that you don't want to regress. A secondary metric is maybe not directly tied to your experiment, but it could be anything from, say, session length to some interaction indirectly linked to your experiment, and you still want to track it to make sure you don't get some other part of your product regressing.
Speaker 0: Cool. I like that for sure. That's one of the confusing things for me: what goes in primary and what goes in secondary?
So it sounds like secondary is more of a guard against unwanted side effects. Like, hey, we increased conversion, but time on the page, or session time, or bounce rate rose, versus the actual event that we wanted.
Speaker 1: That's exactly right. Yeah. So it's just a way of organizing things, but other than that, there is really no difference under the hood between a primary metric and a secondary metric. At a very low level, what we do at PostHog is capture events, and a metric is a way of counting those events in a certain way.
So let's have a look at one of these metrics. Let's take a look at the first one. I'm going to click on this edit button, and here is my metric definition. This is a funnel metric which measures the conversion rate between two events. The first event here is called sign up started.
The second event is sign up completed. What this funnel metric measures is the conversion between these two events. You can see that I have 3,000 persons who did the first event, but only 815 persons who triggered the second event. The ratio between these two events is basically your conversion rate, which in this case is twenty-seven percent. And what you actually see here, this is the metric definition form.
This is not your experiment result. This is just kind of a preview, which shows you: is the data actually there in the system? Are people actually sending these events? That kind of tells you, okay, this is a valid metric to experiment on.
Our instrumentation is set up properly, so we can actually go ahead and save that metric. Once I save the metric and the experiment is running, I will start seeing my results once some minimum criteria are met: you need to have a certain number of events ingested, and you need to have events for both the control and test variants. Once all of these are met, you will start seeing results. And the way we present results is kind of the industry-standard way to present results of A/B experiments, which is that we show you this chart, which is called a delta chart.
What you see here is that for each variant, you will see this bar, and this bar is basically a credible interval. Let me actually start like this: each bar shows you the actual difference between that given variant and the control variant. Right?
The black bar in the middle is the delta between the variant and the control variant. So you can see that in the case of the test one variant, we actually have a regression here. The control is at 0%, and the test one variant is at roughly minus 13%. So that's bad. Right?
That's a regression. We have a worse conversion rate for the test one variant compared to the control variant. Now for the test two variant, we actually see an improvement. The delta here is plus 6.92 compared to control, and that's why this bar is in green, because it's actually an improvement. So that's what the black vertical bar tells you.
Now onto the edges of the actual bar. This is a credible interval. What this bar tells you is actually the uncertainty that you have, because in any kind of statistical testing there is some uncertainty. The outer boundaries of these credible intervals tell you what kind of range in the actual results you may expect. In this case, the credible interval goes from minus 3% to plus 70%.
That means that in ninety-five percent of cases, because this is a so-called 95% credible interval, you can expect the true value to lie within this range. So there is still some small probability that there will be a regression for the test two variant, even though there's a high probability that it will be some sort of improvement. The narrower a credible interval is, the higher certainty you have, because it's a tighter range of values where the actual value may lie.
The wider it is, the more uncertainty there is. Oftentimes, as you collect more data and keep refreshing results, you can observe the interval getting narrower and narrower every day as you gather more data and gain more certainty. So that's what these credible interval bars tell you, and it's calculated separately for each of the metrics. At PostHog, we use a so-called Bayesian statistical methodology, and the two main outputs of the methodology are the credible interval itself, which tells you the uncertainty of the result,
and the other main output, which is what we call win probability. In this case, for this variant, there is an almost 83% probability that the test two variant is actually better than control. Then we show you the significance banner over here. At PostHog, the criterion that we use to tell you whether you should roll out a variant or not is that the win probability needs to be higher than 90% for the best variant. And in this case, you can see that it's actually less than 90%.
That's why for this particular metric we declare it not significant: because it's less than 90%, which is what this tooltip also tells you.
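(To make the win probability idea concrete, here's a rough, illustrative sketch of estimating the probability that a test variant beats control on a conversion metric. It uses a normal approximation to Beta posteriors and made-up numbers; it is not PostHog's actual statistics engine.)

```typescript
// Illustrative only: estimate the probability that the test variant's true
// conversion rate beats control, given conversion counts and exposures.
function winProbability(
  controlConversions: number, controlExposures: number,
  testConversions: number, testExposures: number,
  samples = 100_000
): number {
  // Mean and standard deviation of a Beta(1 + conversions, 1 + failures) posterior.
  const posterior = (conv: number, n: number) => {
    const a = 1 + conv;
    const b = 1 + (n - conv);
    const mean = a / (a + b);
    const variance = (a * b) / ((a + b) ** 2 * (a + b + 1));
    return { mean, sd: Math.sqrt(variance) };
  };
  // Box-Muller transform for a standard normal draw.
  const normal = () => {
    const u = Math.random() || Number.EPSILON;
    const v = Math.random();
    return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
  };
  const c = posterior(controlConversions, controlExposures);
  const t = posterior(testConversions, testExposures);
  let wins = 0;
  for (let i = 0; i < samples; i++) {
    if (t.mean + t.sd * normal() > c.mean + c.sd * normal()) wins++;
  }
  return wins / samples;
}

// e.g. winProbability(815, 3000, 880, 3000) comes out around 0.97 for these made-up counts.
```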
Speaker 0: Okay. So now you've just answered my own specific question. This is the biggest question I've had on the experiments that we've run: how do you measure significance? Because we've run several tests, and I'll get into the specific results of one of our tests in just a bit. But on some of the tests, we saw what looked like a positive improvement, and it was marked not significant.
So it's like, can we be confident in that result or not?
Speaker 1: Right. Yeah, that's a good point. We actually keep improving this UI, and we want to make it clearer what all of these numbers mean and when you can expect significance.
And why something is significant, why something isn't significant, just to make all of this decision making clear. So that's definitely a valid point.
Speaker 0: I'd just stick you in the UI explaining it, man. You did that flawlessly. I was
Speaker 1: Sorry, could you say that again? I didn't hear that.
Speaker 0: I was gonna say, just stick a video of you inside the UI, because you did that flawlessly.
Speaker 1: Oh, nice. Yeah, we might actually do that. That's a great idea. Cool.
Am I still sharing? Because I think the... oh, we lost the screen share. Okay, I'll reshare.
Okay. Then, continuing to the second metric, it's exactly the same principle. Each metric is evaluated in exactly the same way. On our back end there are some differences as to how different metric types are evaluated; for example, the second metric is a different metric type.
Right? In the first case we were measuring funnel conversion. For the second metric, we are actually just measuring the raw click count. Of course, there are some statistical differences as to how this should be evaluated on the back end, and we make sure that we do this properly.
But for you as a user, there's really no difference. You basically just look at the movement of these bars relative to the control variant, and you look at the win probability. The banner will then tell you whether a particular metric is significant or not. You can also dive deeper into any particular metric: if you click on details, you will see the actual counts for each variant.
You can see that the test variant is the strongest, so it makes sense that it has the highest count over here. And this is a cumulative chart, so the counts actually stay the same after we stop collecting data for this experiment. Nice.
Yeah. You can also see things like exposures for each variant, their means, the delta, things like that. There's also a small cool feature, which is that you can actually view recordings for any particular variant. The power of the PostHog platform is really that we offer multiple products and they are interlinked, so you can click on a particular variant and see the recordings of those persons that actually saw that variant.
I don't see any recordings here because I'm on my local instance just using test data, so there's nothing from actual users. But on an actual project, you would see recordings of users where you could actually see how they interact with your website.
Speaker 0: Yeah. And having that all in one has been a significant help for us at Directus. For us, it's not Stack Overflow, it's stack overload. The thought of adding six more tools to your tech stack for your website is just a mess.
It's a pain for us; I don't wanna do it. When we integrated PostHog, the analytics was one of the first things we got a lot of value out of, and then we started diving into the A/B testing. As we kinda shift gears, do you have any best practices, Juraj, on what to test? Obviously, you've built this thing.
You've probably worked closely with some clients at PostHog. What are people testing? Do you have any best practices to share?
Speaker 1: Sure. So I would say if you are just starting out with experiments, start with something very small, like small changes to your landing page. That will really allow you to get how the whole thing works, and maybe also circumvent some of the gotchas while you are still starting out.
There are several things that you should be aware of if you are implementing A/B testing. Some of them are maybe beyond the scope of this webinar, but we have a documentation section with troubleshooting, FAQs, and best practices for implementing experiments. To summarize very quickly, one important thing is to make sure that your tracking is set up correctly: in your code, whenever a user performs a given action, you actually capture that action so that PostHog receives the event, because obviously, if we don't receive the correct events, we cannot provide correct analysis. Now, in terms of more actionable A/B testing advice, I would say start with small changes.
Also make sure that you are testing perhaps only one or two changes at once. Because if you change too many things, let's say on your landing page, and then the test shows a significant outcome, you don't really know which one of the changes that you've made is actually leading to the improvement. Whereas if you make small incremental changes, you can tell that more reliably. Another important technical detail is that you should probably use a reverse proxy with your PostHog setup to make sure that ad blockers aren't blocking the capturing of events, which is a common issue that can be easily circumvented this way. We also have proper documentation on this.
In general, for any question you can use the search functionality in our documentation, for example for the reverse proxy. That will explain exactly what you should do to set things up correctly to circumvent ad blockers. Another useful tip is to learn how to actually estimate your sample size properly. One thing I haven't explained yet is that we have this data collection section over here, and what this allows you to do is answer the question of how long you should run your experiment for.
So I'm actually going to show you how this works. If I click on edit over here, I have this slider which says minimum detectable effect. This basically says, well, what kind of change in my metric am I trying to measure? There's a trade-off to be made here, because the way sample sizes and experimentation work is that the larger the change you are trying to measure, the smaller the sample size you need. It may sound kind of counterintuitive.
Speaker 0: It definitely does. Because I've seen this, and I'm like, hey, why do we need fewer people for this?
Speaker 1: Yeah. And right now we are actually completely rebuilding this component to do a better job of explaining all this. But basically, what this means is that if there is a huge change in your metric, you don't really need a sensitive test for it. You don't need a huge sample size, because if there is a huge effect, that effect will already be apparent in a relatively small sample size.
But if you are trying to measure something much smaller, say I move this slider from 10% to 2%, I need a much more sensitive test, which means a much higher sample size. I basically need a much bigger sample size to be able to reliably say this 2% change is not just due to chance; it's actually due to a real change in the underlying behavior. So I definitely need a larger sample size for this.
So one key consideration here is, first of all, what is the sample size you can actually get? If you are a startup and you're just getting your first users, it might actually be difficult to get a sample size of 10,000 persons. In that case, you are basically restricted to much smaller sample sizes.
And
Speaker 0: Right.
Speaker 1: That actually means that your tests would probably have to target just large changes.
Speaker 0: You wanna go big or go home at that point.
Speaker 1: Exactly. Yeah. But as soon as you have a larger user base, you can start going after small incremental changes that perhaps produce smaller effects on your metrics. You can run many such experiments, and even incremental changes of one or two percent really add up to a lot over time.
So, yeah, there's a trade-off to be made here, and it's good to be aware of that.
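(To put rough numbers on that trade-off, here's a back-of-the-envelope sketch using the common 16·p(1−p)/δ² rule of thumb for roughly 80% power at a 5% significance level. This is not PostHog's exact calculator, just an illustration of why a smaller minimum detectable effect needs far more traffic.)

```typescript
// Approximate visitors needed per variant to detect a given relative change
// in a baseline conversion rate (rule-of-thumb, not PostHog's calculator).
function sampleSizePerVariant(baselineRate: number, minDetectableEffect: number): number {
  const delta = baselineRate * minDetectableEffect; // absolute change we want to detect
  return Math.ceil((16 * baselineRate * (1 - baselineRate)) / delta ** 2);
}

// With a 27% baseline conversion rate:
sampleSizePerVariant(0.27, 0.1);  // ~4,300 visitors per variant for a 10% relative change
sampleSizePerVariant(0.27, 0.02); // ~108,000 per variant for a 2% relative change
```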
Speaker 0: Yeah. Well, awesome. Yeah. Thanks for the best practices, man. Thank you.
Yeah, I'm learning just as much on this as the audience, so I appreciate that. We'll jump into our own experience a little bit, and then I'll run through the steps of integrating PostHog and Directus and, again, show you guys the source code. We'll dive into it, but we won't write code; that didn't work out well for me the last time I did one of these live sessions.
But as far as our own experience with PostHog, recently we rolled out a brand new version of our home page. Can we see this? Oh, yeah. There we go. So this was a big change for us.
This one, if I shrink it down, is probably a little better. This new homepage was a big shift for us, and one of the things that we wanted to do was test first to make sure that, number one, the messaging wasn't causing a decrease in conversions, and number two, that this was performing better than our old home page. We've got this interactive carousel component that basically links into what we call our Directus pizza demo, which is just a live working instance of Directus.
So folks can hop in and poke around inside one of the templates. Before we shifted all that traffic, we wanted to make sure that this was actually worthwhile. So here are our own results; I've got a fancy slide up here somewhere. Boom. Yeah.
The conversions were relatively the same, and again, that goes back to Juraj's point about what metrics matter and how we measure significant change. But one of the big results we saw was about a 30% decrease in bounce rate on the site, which is huge. Obviously, that correlates with a larger session time, most likely because people are getting into the demo of Directus, at least that's our hypothesis, and actually poking around, which is what we want. And I know you guys at PostHog are kind of following that same methodology: let's skip all of the fluffy marketing stuff and actually get you into the product so we can dive in and learn.
Alright. So let's actually dive into this integration. I put together a really nice visual for how we've been doing A/B testing with PostHog at Directus, and that is kind of the concept behind this setup. We've done tests at two levels. I call the first one the block level, which is basically testing within the same page.
We wanna test a different headline on the homepage, or we wanna test a different pricing component or a different pricing tier. That would be a block-level test. And then there's what I was calling a page-level test, which is basically testing between different pages. You could call it a split test; you guys are calling it redirect testing inside the documentation, Juraj.
But basically, we take a URL and we want to redirect some percentage of the traffic. Usually, if it's just two variants you're testing, you probably split fifty-fifty. That's the way that we've been doing testing at Directus. And now I'm going to show you how this all comes together.
So, this is PostHog, and we've laid a special PostHog theme on top of a Directus instance just for this webinar. But this is our CMS starter template with a little bit of magic sprinkled into it. So, if we take a look at what I call the checklist... where did that guy go? Supposed to have a production person here, Matt. Not calling you out, but I'm calling you out.
The PostHog checklist, can we see that live? Do I have to stop screen sharing to see that? Yeah. There we go. Alright.
So this is the A/B testing checklist, as far as integrating with PostHog. I'm gonna show you how to create a project in PostHog. We're going to dive into creating a personal API key to power this little automation that we've got. We're gonna walk through the Directus data model. We'll adjust some permissions.
I'll show you the flow that's involved, and we'll talk through this Next.js front end and how that is integrated. Alright. So let's get back to the screen share and we'll do this together. What I'm going to do, and this is a little crazy to do.
We are maxed out as far as projects, so I'm gonna delete this test project. This is sketchy to do on a webinar, but that's what we're gonna do. Alright. So I'm in PostHog. The first thing we've gotta do,
right, is create a project. Just go through this. This is gonna be the A/B testing webinar project. Great.
Alright. So we've got our project. I'm gonna need two things. I need the project ID. So I can find that up here in the URL, or, let me get my fancy mouse pointer going here.
I can find it from these settings as well. So I'm gonna grab the project ID, and, yes, we will send the recording of this. I promise. Alright. So now in the Directus instance, which you are going to get total unrestricted access to at the end of this, we've got some globals set up.
So globals inside Directus are basically what we call a singleton collection. Globals are typically things like social links, favicons, logos, stuff that you're gonna use across your entire site. So we're gonna add our project ID. And then the other thing that I'm gonna add: I don't necessarily need this project API key for the Directus side of things; you'll need it on the Next.js integration. You'll wanna copy it to your clipboard and stick it into your text editor so that you've got it. But what we're gonna do, we need to go into our personal settings. And the reason why is that inside this Directus instance, there's a nice little automation that we'll show. Alright.
So I'm gonna log in, reauthenticate for security, and we're gonna look for a personal API key. I'm just gonna create a new key. This is our A/B testing webinar key. We want a specific project; that's gonna be our A/B testing webinar. Whenever you create keys, whether that's in PostHog or GitHub, please be very specific.
So we're gonna do write access on experiments and feature flags, and I think this should be all we need. Am I correct in that assumption, Juraj?
Speaker 1: Looking good to me. Yeah.
Speaker 0: Okay. Perfect. Alright. So I'm gonna grab this key, go in, and paste that inside this Directus instance.
Amazing. Magic. Right? So let's talk through the changes inside this Directus instance. Again, this is our simple CMS template.
If you go to Directus.io, you go to get started for free, you create a cloud account, and you get logged in. You can get the starting point for this just by clicking CMS, or you can also get it through our template CLI tool. We'll button up all these resources. But this already has what we call the many-to-any relationship; it's basically a dynamic page builder that is set up inside your CMS.
So if I open up my live preview pane, we can see that this page is made up of blocks. Right? And this paradigm lends itself to that block test that I was talking about. So that is kind of the setup here. The extra collections that we've added to this Directus instance, which are very minimal, are just two pieces.
Right? We have added experiments and experiment variants. The reason why we add those inside Directus is that we need to be able to link the content inside the CMS to the PostHog experiment. And this gets back into why we created this: we want to empower our marketing team, our content editors, to run tests without code, without bothering the developer, without it being blocked by the developer.
Right? Well, this is my personal mission: I want developers and marketers to get along well. If you are waiting on information from marketing to set up the actual code for an A/B test, not great. Likewise, if they have to bug you every time they want to test a new variant, that's gonna frustrate you as well.
So, what we do, we've created an experiments collection. Inside that, pretty simple: we've got a name for the experiment, we've got a feature flag key that you'll see we actually need inside PostHog, and we've got a short description.
We've added a type of test: is it a block-level or a page-level test? And then we have our variants. The variants are a relationship to that experiment variants collection, and this is pretty simple as well. We've got a key for the variant.
Each experiment has to have a control variant, as Juraj talked about. And if you're doing a page-level experiment or a redirect test, you need to have a URL. So on the front end, running a test is as simple as this. Right? With those pieces put together, and I'm sure Matt is crossing his fingers behind the scenes right now.
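(For reference, here's a rough sketch of those two collections expressed as TypeScript types. Field names are inferred from the walkthrough, not a schema export, so treat them as assumptions.)

```typescript
// Shape of the "experiments" collection added to the CMS template.
type Experiment = {
  id: string;
  name: string;             // human-readable experiment name
  feature_flag_key: string; // links the item to the PostHog feature flag
  description?: string;
  type: 'block' | 'page';   // block-level vs page-level (redirect) test
  variants: ExperimentVariant[];
};

// Shape of the related "experiment_variants" collection.
type ExperimentVariant = {
  id: string;
  experiment: string;  // relation back to the experiment
  key: string;         // e.g. 'control' or 'new-headline'
  is_control: boolean; // every experiment needs a control variant
  url?: string;        // only used for page-level redirect tests
};
```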
Let's do a block level test. Right? So I'm inside Directus. I want to test a new headline. New headline.
Home page. There he is. I see him in the chat. New headline for homepage. Alright.
I stole this placeholder copy directly from you guys, Juraj. We want to see if this new headline improves conversion. So, hey, this doesn't totally replace PostHog; this is just a slick integration to make the two work together. So we're gonna pick the test type.
This is gonna be our block-level test; we wanna test within the same page. We'll add the control, and we'll add this new headline variant. Great. Okay.
Now with that out of the way, we save. What happens behind the scenes? There is an automation. I love automation, and Directus Flows is a great way to build these automations.
This is what this automation looks like, and I'll walk you through it really quickly. Whenever I go to create an item inside experiments, we run this series of operations. We grab our global settings, so that API key and that project ID, and then we format a payload for PostHog. We create a new experiment inside PostHog using their API. Did we lose audio?
Can you hear me okay, Juraj?
Speaker 1: I can hear you, Bryant. Yes.
Speaker 0: Okay. Okay. Alright. I just wanted to make sure. Hopefully, it'll all be in the recording as well.
But then we've got another little piece of JavaScript here that formats a feature flag payload, and that's helpful for the redirect test that we're doing. Then basically, we stuff all that into PostHog, and at the end of this, we return a payload that gets saved inside Directus. So the effect we get is an experiment that gets created inside PostHog, and we've got an experiment here inside Directus now that we can actually link to a piece of content.
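(The heart of that flow is one API call. Here's a hedged sketch of what it might look like as a script step: it posts to PostHog's experiments API, but treat the exact body fields, the even traffic split, and the helper names as assumptions and check the API docs for your PostHog version.)

```typescript
// Sketch: create a PostHog experiment from a newly saved Directus item.
type DirectusExperiment = {
  name: string;
  feature_flag_key: string;
  description?: string;
  variants: { key: string; url?: string }[];
};

async function createPostHogExperiment(
  experiment: DirectusExperiment,
  projectId: string,
  personalApiKey: string,
  host = 'https://us.posthog.com'
) {
  const response = await fetch(`${host}/api/projects/${projectId}/experiments/`, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${personalApiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      name: experiment.name,
      description: experiment.description ?? '',
      feature_flag_key: experiment.feature_flag_key,
      parameters: {
        // One entry per variant; traffic is split evenly across them here.
        feature_flag_variants: experiment.variants.map((v) => ({
          key: v.key,
          rollout_percentage: Math.floor(100 / experiment.variants.length),
        })),
      },
    }),
  });
  if (!response.ok) {
    throw new Error(`PostHog API error: ${response.status}`);
  }
  return response.json();
}
```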
So if I go into PostHog, we go to experiments. Check installation, skip installation. Skip or no? Skip. I did not remember that part of creating a project.
Alright. So we can see this experiment here inside PostHog. Let's see. There it is. This is all set up.
But now let's go and link this to a piece of content. Right. So we're gonna go back to our home page, and we've got our hero block. So this is the control block. I'm just gonna go down to the bottom.
Basically, we've got a relationship from this block to our experiment, and then we're gonna pick the variant that it belongs to. Except something wasn't quite right. It wouldn't be a demo I was doing if everything worked smoothly. Why isn't my variant showing up? Clear filters.
Experiment. There it is. Alright. You got me. Let's clean this up just a bit. I tried to get fancy, and I don't know what level of fancy I got here.
But, okay. We'll try this again. Now I'm gonna link this to our new homepage headline experiment. We're going to add this to our control, and now we're gonna add another headline. This is the new headline.
Amazing. It's gonna look beautiful. We're gonna link it to that same experiment, except now I'm gonna link this to our new headline variant. So all I'm doing behind the scenes here, nothing fancy, is linking this piece of content to the PostHog experiment.
On the Next.js front end, we're making a call. We get this content, and because we've got PostHog integrated, we get something like this. If I hit refresh, I don't see two hero images or two hero blocks here. I just see one.
Right? And that is because the PostHog SDK that's set up is handling all the magic, and you can probably understand why I don't want to do all that magic myself. Now let's see if I can actually trigger it. You're gonna have to show me a trick to force a visitor into a variant sometime, Juraj.
Speaker 1: Sure. We can do that.
Speaker 0: Let's see. This should be put in. Yeah, it doesn't seem like I can actually trigger the variant here for some reason through this. But this is how it's actually integrated.
This is a block-level test. Now, you know, if I swap
Speaker 1: Can you open your web tools? Maybe we can try to override a flag for this particular page. Oh, we don't have to do that. That's up to you.
Speaker 0: Yeah. Okay. So there we go. Now you can see, if I swap the control, right?
If I make the headline the control, we can see the difference here. And basically, the PostHog integration is pulling that all together. So that is the setup inside Directus. Now, if I wanted to run a page-level, redirect-style test, let's say I wanted to have a new pricing page. Right?
If I go to pricing, we've got pricing to fit every budget here. Maybe I wanna change this. We have new pricing. So we'll just create a new page. Pricing to fit no one's budget, and we'll just raise the prices by quite a bit here.
Amazing. Alright. So now I'm gonna hit save as copy in this template, and now I've got two new pages, or, well, one new page, but that's our page. I'm gonna go in. And now, if I wanna do a redirect-level test, I have the new pricing page.
Pricing page. We'll do a redirect. So for the control here, we're gonna add this URL; that'll just be slash pricing. When you set up the control variant, again, that is the URL that you're testing.
It's an important distinction to make. Next, we will add the new pricing page. Great. That's gonna be new pricing. And again, our Directus Flow automation will bring this home for us and basically create this experiment inside PostHog.
So we'll just hit refresh. We've got our new pricing experiment, and I can click in and see the variants here. And as Juraj showed, there's just a feature flag that backs these. Now, what we're doing on the page-level tests is using the PostHog feature flag payloads to avoid making an extra call to Directus for this information. I think I can get a better view if I click edit here, maybe shrink this back a bit.
You can kinda see what's going on here. We've got an experiment type, it's a page-level test, we've got a control path, so that's our pricing, and then we have a path that we're going to redirect to.
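(As an illustration, the payload attached to such a flag might look something like the sketch below. The field names are assumptions based on the walkthrough, not the exact payload the starter kit stores.)

```typescript
// Hypothetical shape of the feature flag payload for a page-level (redirect) test.
const pricingRedirectPayload = {
  experimentType: 'page',    // block-level vs page-level test
  controlPath: '/pricing',   // URL of the control variant
  variants: {
    'new-pricing': { path: '/new-pricing' }, // where test traffic gets sent
  },
};
```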
And what happens on the front end if we go to localhost:3000? Now, if I try to navigate to pricing, I'm either going to get the control or... I guess we may have to dive into the actual tools here. We'll work that out in a moment. Love giving these demos on the fly. Alright.
What is next on the agenda? We'll just look at that really quickly. Alright. Our feature flag test. Alright.
So we've got the Directus side of things. We've nailed that piece. Now let's take a look at the Next.js side. Right?
We want to walk through how this is actually set up and integrated. So, a couple of important pieces that you need as far as setting this up within Next.js. Let's pull this up. Alright.
Can everybody see this? Okay. Let me try to close the terminal a bit here. I can shrink the size of the... make the font just a little bit bigger. Alright.
So, again, once you download this repo, feel free to browse through it on GitHub; we'll send you all of this. But let's start on the Directus side of things. Inside this Next.js application, there are our fetchers.
We're just using these two. Close. Shrink that. Okay. There we go.
Alright. So these fetchers are basically just communicating using the Directus SDK. The only change that we've made from our standard Next.js template is making sure we grab the experiment data and the experiment variant that we've linked to a page block. So this all comes together in our page builder setup.
And one of the other things that you'll have to do inside Directus: you can fetch that data, but you need to add it to your permissions. So you've got to make sure that your experiment variants and your experiments are enabled under your permissions to make this work inside Directus. As far as best practices go, that's one of my Directus best practices: 99% of my errors are because I didn't set permissions. Right?
But we have to add that experiment variant there. Then, inside what we call the page builder, there's a bit of logic that basically filters the blocks. This template is set up to run Next.js server components, so we don't get a flash of content whenever we enroll someone into a variant. But basically, we're just checking: is this block attached to a variant in an experiment?
If it is, we get the feature flag from the PostHog client, which we'll look at in a moment, and decide whether we should add this block. If the flag assigns the visitor to the control, we add the control variant's block; if it assigns them to a test variant, we add the block matching that variant instead. So, not a huge shift in the logic as far as working with Directus, just simply matching those up.
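(Here's a sketch of that matching idea on the server using posthog-node. The type and helper names are illustrative, not the template's exact code.)

```typescript
import { PostHog } from 'posthog-node';

type PageBlock = {
  id: string;
  experiment?: { feature_flag_key: string } | null;
  experiment_variant?: { key: string; is_control: boolean } | null;
};

// Only render the block whose variant matches the flag value PostHog assigned
// to this visitor; blocks without an experiment always render.
async function filterBlocksForVisitor(
  blocks: PageBlock[],
  posthog: PostHog,
  distinctId: string
): Promise<PageBlock[]> {
  const result: PageBlock[] = [];
  for (const block of blocks) {
    if (!block.experiment || !block.experiment_variant) {
      result.push(block);
      continue;
    }
    const assigned = await posthog.getFeatureFlag(
      block.experiment.feature_flag_key,
      distinctId
    );
    // If the flag can't be evaluated, fall back to the control block.
    if (assigned === undefined || assigned === false) {
      if (block.experiment_variant.is_control) result.push(block);
      continue;
    }
    // Otherwise keep only the block whose variant key matches the assignment.
    if (assigned === block.experiment_variant.key) result.push(block);
  }
  return result;
}
```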
On the PostHog side of it, this is all just standard boilerplate from the PostHog documentation. You need to have a PostHog provider, so we set this up using use client here, because this provider is going to go into a shared layout inside this Next.js application. One of the important bits, especially for server-side rendering, is the bootstrapping.
Basically, we're getting all the feature flags on the server side from PostHog and making sure we pass them in when we initialize the PostHog JS client on the client side. And Juraj, do you have anything to add on that bootstrapping side? I know this was one point where I ran into some issues when I was implementing this.
Speaker 1: No, not really. I would just say that this is really the preferred way to get feature flags to your client, for a couple of reasons. The other alternative is to fetch the feature flags directly from the client, but there is always some delay there, so you may get some usage events being sent without correct feature flag information if you do it that way.
Right? But if you bootstrap your flags, that means you always evaluate the flags on the server, which is actually faster because the PostHog library evaluates the flags there without having to go to the PostHog server. And once the web content is served, it already has the feature flags basically bootstrapped into it. So this is what we always recommend our users do.
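(A minimal sketch of that bootstrapping pattern with posthog-js, assuming the flags were already evaluated on the server. Component and prop names are illustrative, not the template's exact code.)

```typescript
'use client';

import posthog from 'posthog-js';
import { PostHogProvider } from 'posthog-js/react';
import { useEffect, type ReactNode } from 'react';

type BootstrapData = {
  distinctID: string;
  featureFlags: Record<string, string | boolean>;
};

export function AnalyticsProvider({
  bootstrap,
  children,
}: {
  bootstrap: BootstrapData;
  children: ReactNode;
}) {
  useEffect(() => {
    posthog.init(process.env.NEXT_PUBLIC_POSTHOG_KEY!, {
      api_host: process.env.NEXT_PUBLIC_POSTHOG_HOST,
      // Flags evaluated on the server are passed in, so the client already
      // knows which variant this visitor is in without waiting for a request.
      bootstrap,
    });
  }, [bootstrap]);

  return <PostHogProvider client={posthog}>{children}</PostHogProvider>;
}
```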
Speaker 0: Yeah. Makes sense. Now, how you do the bootstrapping depends on your specific application. Here's the way that we chose to do it in this specific one.
So we've got this PostHog provider, and there's a shared layout that I'll show you; this is just how we use this provider. Inside our root layout component, and this is using the Next.js app router setup, we are actually sending this bootstrap data in a header from a Next.js middleware.
So we get that here, we pass it to our provider, and that sends it down through the client. But the middleware is an important piece. Now, you could do this via a server component, and if you're not doing the redirect testing, I found that worked pretty well. But as far as the redirects go, you probably wanna do this in the Next.js middleware; that's just the best way that I've found to do it. So here's what we've done inside the middleware, if we get to the actual function. Right? We get the path name that you're navigating to, and then we're basically getting a distinct ID.
This is just a helper that is somewhere... where is that guy? Distinct ID. Yep. So PostHog stores a cookie, and we will try to get the distinct ID for that visitor, that user, via that cookie.
If we can't find it, we're just gonna create one. Right? Then we will look for some cached data inside a cookie. We've got a bootstrap cookie where we're caching this data. Basically, to enhance performance and make sure that we're not delaying rendering every single time, we've got a flags route set up on the API side, which is somewhere.
PostHog flags. Basically, there's a Node.js PostHog client. We pass that distinct ID to it, we go get all the flags and the payloads, and then we're caching that for sixty seconds.
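(A sketch of what a flags endpoint like that could look like with posthog-node in a Next.js route handler. The route path, env var names, and cache details are assumptions.)

```typescript
import { NextRequest, NextResponse } from 'next/server';
import { PostHog } from 'posthog-node';

const posthog = new PostHog(process.env.NEXT_PUBLIC_POSTHOG_KEY!, {
  host: process.env.NEXT_PUBLIC_POSTHOG_HOST,
});

export async function GET(request: NextRequest) {
  const distinctId = request.nextUrl.searchParams.get('distinct_id');
  if (!distinctId) {
    return NextResponse.json({ error: 'distinct_id is required' }, { status: 400 });
  }
  // Evaluate every flag (and its payload) for this visitor in one call.
  const { featureFlags, featureFlagPayloads } =
    await posthog.getAllFlagsAndPayloads(distinctId);
  return NextResponse.json(
    { distinctId, featureFlags, featureFlagPayloads },
    // The middleware caches this response (the webinar setup uses ~60 seconds).
    { headers: { 'Cache-Control': 'max-age=60' } }
  );
}
```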
So as the user navigates, this middleware gets triggered and we bootstrap that data. We also use this to handle our redirects at the page level. We've got a check-for-redirect function, which basically looks at that flag data. Once we fetch all of those flags from PostHog, we're iterating through them and asking, okay, do any of these redirects that we've set up match this experiment?
If so, then we send them through. There's a little bit more fancy stuff behind the scenes, but I know we're coming up on time with this. I'm trying to think if there are any other important pieces that I wanted to cover before we turn everybody totally loose on this thing. I don't think so. Let's see.
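(Here's a sketch of that redirect check inside Next.js middleware. The payload shape and function name are assumptions based on the walkthrough, and a rewrite is used where the real project may prefer a redirect.)

```typescript
import { NextRequest, NextResponse } from 'next/server';

type FlagPayload = {
  experimentType?: string;
  controlPath?: string;
  variants?: Record<string, { path: string }>;
};

export function checkForRedirect(
  request: NextRequest,
  featureFlags: Record<string, string | boolean>,
  payloads: Record<string, FlagPayload>
): NextResponse | undefined {
  const pathname = request.nextUrl.pathname;
  for (const [flagKey, assigned] of Object.entries(featureFlags)) {
    const payload = payloads[flagKey];
    // Only page-level (redirect) experiments whose control path matches this URL.
    if (payload?.experimentType !== 'page' || payload.controlPath !== pathname) continue;
    // Visitors in the control group stay where they are.
    if (typeof assigned !== 'string' || assigned === 'control') continue;
    const target = payload.variants?.[assigned]?.path;
    if (target) {
      // Rewrite keeps the URL stable; NextResponse.redirect() would change it.
      return NextResponse.rewrite(new URL(target, request.url));
    }
  }
  return undefined;
}
```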
Where's my checklist? Redirect. We've configured the provider. And I saw, where is it, was it Jobchum? Yes.
That was a good spot. I figured out why this is not working, thanks to Jobchum. The public PostHog API key is from my previous project. That's where I told everybody to take this down and make sure you copy it, but I forgot to stick it into my .env.
And, yeah, that's why we were having issues. Always love it on the demos. That's always fun. Alright. That's it.
Let's open it up for Q&A. Juraj, while we're waiting on questions to come up, I just wanted to say thank you for jumping on with us and, you know, at least teaching me how to get more use out of the PostHog side of things.
Speaker 1: Of course. Yeah, it's a pleasure. It's a very nice integration that you built there. And we actually have our own kind of no-code experimentation tool. It's still very much in beta.
Speaker 0: I'd love to get access to that.
Speaker 1: Yeah. Whoever uses just PostHog can actually already try it out. It's not nearly as powerful as what Directus allows right now, as in rearranging blocks and doing all of that; it's basically just for simple style changes. But, yeah, perhaps there is also something for us to learn here.
Speaker 0: Yeah. Hopefully, coming out of this, we'll have a how-to dev blog post to put this together. One of the things I struggled with was finding any examples of linking this to a CMS. So now that we've got one, this is how an integration could work.
Let's take a question here from Steven: do you have a guide on what kind of traffic numbers you need to do effective testing?
Speaker 1: Yeah, maybe I can answer that. Like I said, we have this sample size calculator, which basically tells you exactly that. Now, that calculation is always tied to a particular metric. Right?
So say you are tracking five metrics, but each of those metrics has different usage numbers, as in a different number of persons actually generating that metric. To be really statistically rigorous, you have to take the metric with the smallest traffic and make sure that you actually get enough traffic for that metric. Other than that, it's what I already mentioned: the larger the change you are targeting, the smaller the sample size.
But the smaller the change you are targeting, the more sensitive a test you need, and the larger the sample size you need.
Speaker 0: That makes sense. Alright. One other question I see from Stefan: how would you set up an A/B test for global components across multiple pages, like a header? Would it be a new type of test that needs to be set up first?
It depends. Like any good development-oriented question, the answer is it depends. But let me just pull up Directus real quick, and then I'll give the bonus link in just a moment. Where's this at? Alright.
Basically, inside the Directus instance that we've shown here, on the page block level, we just added a relationship to the variants, and we've got the corresponding logic inside the Next.js application that basically says, hey, PostHog, give me the variants, and then assigns one of them. But you could add the same relationship to other pieces of the website if you wanted to, whether that's your navigation, like your navigation items, within the setup.
This CMS setup has navigation already built into it, so you could potentially link it there. You could also do a hybrid approach inside the code where you hard-code some of these tests, which is how a lot of the examples in the PostHog docs are done, just for simplicity's sake. In our own experience, I kind of look at it through the lens of, hey, is this something we're gonna do often?
Do I wanna test header elements often? If so, it might make sense to enable your content editors to be able to do that. If it's a one-and-done test, you might just add it to the code and move on. So, hopefully, that's helpful. Let me throw up the... well, I'll just post it here in the chat, if I can.
What's going on? The screen share is stuck. Why is this not working? Something's going on. Okay.
I can't post the link here in the chat. Matt, if you're around, post this link in the chat for me. My screen is fouling up as it often does on these demos. We'll definitely send this out in a newsletter after the webinar as well, but there's a repo where you can get all the source code. If you have any questions, feel free to follow up with us.
On the Directus side of things, we are also offering a special little promo. And I can't get this to... yeah. The screen share thing is just spinning for me. So that's where we're at with it.
Juraj, thanks for joining, man. I really enjoyed this. This has been a fun project, and I appreciate your support and your help along the way.
Speaker 1: Likewise.
Speaker 0: Excellent. We'll have a recording out for everyone. And with that, thank you, and good night.
Speaker 1: Thank you, everybody. Good luck. Bye bye.