Predicting the Impact of Product Changes Before You Ship — How We Built Moveo One's Behavioral Engine
Analytics tell you what happened. Moveo One's new behavioral engine tells you what will happen to your users when you ship a change — before a single real user touches it. Here's how we built it.

Vladimir Jeftović
Co-founder & CEO

Almost every product team has had a moment that goes like this:
You're a week out from shipping a redesigned onboarding. Or a new paywall. Or a restructured pricing page. Someone on the call asks the question:
"Are we sure this doesn't break anything for our best users?"
...
The honest answer is no, we're not sure. You ship the change, you watch the dashboard (hey, you might even have spoken to some customers), but mostly you hope. If it breaks something, you'll learn about it two or three weeks later: a churn spike, a conversion drop, or a stubbornly flat metric that someone finally notices. Digression: I remember someone telling me we could use the SCREAM methodology: we ship the thing, and if nothing screams, it's all working well :)
That's the workflow for most product teams today. We think it's backwards. It's how you either lose your best customers or slow down development. And why would you slow down today, with all this productivity unlocked, when you can build features cheaper than ever?
Analytics tell you what happened. That's not enough anymore.
Every analytics tool on the market (PostHog, Mixpanel, Amplitude, Segment…) is built around the same idea: record what users did, so you can analyze it later.
That's useful. It's also strictly historical.
A/B testing adds a forward-looking layer, but it still requires you to ship first, then measure the damage. You can't A/B test a change against users who don't exist yet, and every major product decision is, by definition, a decision about a future user base.
So product/engineering teams fall back to the oldest tool in the kit: intuition.
"We think this will lift activation."
"We think power users will love it."
"We think the new paywall won't hurt the freemium funnel."
Think. That's the verb that shows up when the data hasn't been generated yet. We wanted a better one.
Meet the Simulation Engine behind Moveo One's next release
(We still don't have a name, but we're thinking: Moveo One Quantum, or something like that...)
For the last several months, we've been building a behavioral simulation engine that will power the next major feature in Moveo One: simulating how any product change will affect your user base before you ship it... Or, if you're not shipping anything at the moment, just enjoy understanding your users on a whole new level.
Moveo One already builds predictive models of your users. Every session gets scored 0–100%: probability of conversion, probability of churn, probability of completing onboarding. That model isn't just a number; it encodes the behavioral structure of your entire user base: which actions correlate with which outcomes, which metadata cohorts behave differently, and how intent evolves across a session. Now imagine not one, not two, but a dozen models predicting every step at any moment, giving you the probability of each possible decision... That's why we're thinking of calling it 'quantum'.
The Simulation Engine takes those models and inverts them. Instead of using them to score real users, we use them to generate synthetic ones — agents whose behavior is calibrated against the same scores your real users produce.
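To make the calibration idea concrete, here's a minimal sketch of the agent-allocation step in Python. The bucket boundaries, the distribution numbers, and the `spawn_agents` helper are illustrative assumptions, not the engine's actual code; the point is that synthetic agents are allocated to match the score distribution your real users produce.

```python
import numpy as np

rng = np.random.default_rng(7)

# Share of real sessions observed in each conversion-probability bucket.
# These numbers are illustrative; yours come from your Moveo One model.
REAL_DISTRIBUTION = {"0.0-0.2": 0.45, "0.2-0.5": 0.30,
                     "0.5-0.8": 0.18, "0.8-1.0": 0.07}

def spawn_agents(n_total: int) -> list[dict]:
    """Allocate synthetic agents across buckets in the same proportions
    as the real user base, each with a target score inside its bucket."""
    agents = []
    for bucket, share in REAL_DISTRIBUTION.items():
        lo, hi = map(float, bucket.split("-"))
        for _ in range(round(n_total * share)):
            agents.append({"bucket": bucket,
                           "target_score": float(rng.uniform(lo, hi))})
    return agents

print(len(spawn_agents(1000)))  # ~1000 agents, distributed like real traffic
```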
Then we run those agents through your site/web-app/mobile app. Real browser. Real DOM. Real APK. Real interactions. Thousands of sessions, organized into probability layers that match your actual user distribution.
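The driving loop looks roughly like the sketch below. Playwright stands in here as an illustrative browser driver, and `pick_next_action` is a placeholder for the three decision layers described in the next section.

```python
# Minimal sketch of the session loop. Playwright is illustrative, not
# necessarily the engine's actual driver; pick_next_action is a placeholder.
from playwright.sync_api import sync_playwright

LAYERS = ["0.0-0.2", "0.2-0.5", "0.5-0.8", "0.8-1.0"]  # probability buckets
SESSIONS_PER_LAYER = 250  # sized to mirror your real traffic distribution
MAX_STEPS = 40

def pick_next_action(page, layer):
    """Placeholder: the real decision blends the three layers described below."""
    ...

def run_session(page, layer):
    page.goto("https://staging.example.com")  # any version of the product
    for _ in range(MAX_STEPS):
        action = pick_next_action(page, layer)
        if action is None or action[0] == "leave":
            break
        kind, selector = action
        if kind == "click":
            page.click(selector)
        elif kind == "hover":
            page.hover(selector)
        elif kind == "scroll":
            page.mouse.wheel(0, 600)  # real scrolling in a real DOM

with sync_playwright() as p:
    browser = p.chromium.launch()
    for layer in LAYERS:
        for _ in range(SESSIONS_PER_LAYER):
            run_session(browser.new_page(), layer)
```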
The result: a live, running, breathing model of your user base. And you can point it at any version of your product. At anything.
How it actually works — the technical part
If you're the person on your team who wants to understand the machinery, here's the honest version. Every agent decision blends three signals. None of them alone would be enough.
Layer 1 — Probability-calibrated Markov chains. A user with an 85% conversion probability clicks differently from a user at 12%. They scroll longer on features pages. They skip testimonials. They hover on pricing. These aren't qualitative observations — they're measurable transition probabilities in the Markov sense. The engine exports the probability distributions from your Moveo One model and builds per-bucket transition matrices. A "high-intent" agent doesn't just feel like a high-intent user — they pick their next action according to a distribution we anchored to real behavioral data at that probability range.
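As an illustration (the action names and the `sessions_in_bucket` loader are hypothetical), estimating and sampling a per-bucket transition matrix looks like this:

```python
import numpy as np

rng = np.random.default_rng(0)
ACTIONS = ["scroll_features", "hover_pricing", "toggle_compare",
           "click_cta", "read_testimonials", "leave"]

def transition_matrix(sessions):
    """Estimate P(next action | current action) for one probability bucket.
    `sessions` is a list of action-name sequences observed in that bucket."""
    idx = {a: i for i, a in enumerate(ACTIONS)}
    counts = np.ones((len(ACTIONS), len(ACTIONS)))  # Laplace smoothing
    for seq in sessions:
        for cur, nxt in zip(seq, seq[1:]):
            counts[idx[cur], idx[nxt]] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def sample_next(matrix, current):
    return rng.choice(ACTIONS, p=matrix[ACTIONS.index(current)])

# One matrix per bucket: a "high-intent" agent draws from the 80-90% matrix.
high_intent = transition_matrix(sessions_in_bucket("0.8-0.9"))  # hypothetical loader
print(sample_next(high_intent, "hover_pricing"))
```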
Layer 2 — LLM-driven persona reasoning. Markov chains are great at "what's statistically likely." They're bad at "what would this specific user, with this specific context, on this specific page, actually do right now." So every agent has a persona, derived from the signals your Moveo One model weights as most important for that probability layer: platform, language, session depth, previous-session behavior, how long the user reads, how long they interact, their cognitive load, their command of the language, whatever your data surfaces. That persona is injected into an LLM that reasons about each action choice in natural language: "A free-tier user comparing pricing options on mobile would probably tap the comparison toggle before the CTA; their interaction matrix correlates these features with these specific actions..." and so on. We use small models for most decisions (cheap, fast, cached). Heavier contexts go to bigger models when screenshots are in play.
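A minimal sketch of that decision step follows; the model name, prompt, and OpenAI client are illustrative stand-ins, not our actual stack.

```python
# Illustrative only: model choice, prompt, and the OpenAI client are
# stand-ins, not the engine's actual stack.
from openai import OpenAI

client = OpenAI()

def llm_choose(persona: dict, page_summary: str, candidates: list[str]) -> str:
    prompt = (
        f"You are simulating a user: {persona}.\n"
        f"Current page: {page_summary}\n"
        f"Possible actions: {candidates}\n"
        "Reply with exactly one action from the list."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # small model for routine decisions
        messages=[{"role": "user", "content": prompt}],
    )
    choice = resp.choices[0].message.content.strip()
    return choice if choice in candidates else candidates[0]  # guard bad output

persona = {"tier": "free", "platform": "mobile", "language": "es",
           "session_depth": 3, "reading_speed": "slow"}
print(llm_choose(persona, "pricing page with compare toggle",
                 ["toggle_compare", "click_cta", "leave"]))
```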
Layer 3 — Model-learned feature importance. The model knows which features matter. If your feature-importance export says ux: 0.7, navigation: 0.3 for a probability layer, agents bias their attention toward actions tagged with those feature keys. This is the heuristic layer: the part that lets the simulation respond when a real feature change actually shifts an underlying importance weight.
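A sketch of that heuristic, where the tag map and the default weight for untagged actions are assumptions:

```python
# Sketch of the importance-weighting heuristic. The tag map, the export
# format, and the 0.1 default for untagged actions are assumptions.
FEATURE_IMPORTANCE = {"ux": 0.7, "navigation": 0.3}  # from the model export

ACTION_TAGS = {  # which feature key each candidate action touches
    "hover_pricing": "ux",
    "toggle_compare": "ux",
    "open_menu": "navigation",
    "click_breadcrumb": "navigation",
}

def bias_probs(base_probs: dict[str, float]) -> dict[str, float]:
    """Scale each action's Markov probability by its feature weight,
    then renormalize so the result is still a distribution."""
    weighted = {a: p * FEATURE_IMPORTANCE.get(ACTION_TAGS.get(a, ""), 0.1)
                for a, p in base_probs.items()}
    total = sum(weighted.values())
    return {a: w / total for a, w in weighted.items()}

print(bias_probs({"hover_pricing": 0.4, "open_menu": 0.4, "leave": 0.2}))
```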
Calibration loop. After every session, the engine compares the agent's observed behavior to the expected probability range. If a layer drifts (say, 70–80% agents consistently landing at 55%), the engine writes a layer tuning: a behavioral correction (a Markov weight multiplier plus targeted changes to the LLM instructions) that pulls the bucket back toward its target. The tuning is persistent. Every subsequent run loads it automatically. The engine gets more accurate the more it runs.
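In sketch form, with an illustrative proportional correction rule standing in for the engine's actual tuning logic:

```python
import json
import pathlib

TUNINGS = pathlib.Path("layer_tunings.json")  # hypothetical persistence file

def update_tuning(bucket: str, target_mid: float,
                  observed_scores: list[float], rate: float = 0.5) -> float:
    """Nudge a drifting bucket back toward its target. The proportional
    correction below is an illustrative rule, not the engine's actual one."""
    tunings = json.loads(TUNINGS.read_text()) if TUNINGS.exists() else {}
    observed = sum(observed_scores) / len(observed_scores)
    multiplier = tunings.get(bucket, 1.0) * (1 + rate * (target_mid - observed))
    tunings[bucket] = multiplier              # persisted: every later run loads it
    TUNINGS.write_text(json.dumps(tunings, indent=2))
    return multiplier

# The 0.7-0.8 layer keeps landing around 0.55, so its weight gets pulled up.
print(update_tuning("0.7-0.8", target_mid=0.75,
                    observed_scores=[0.54, 0.57, 0.55]))
```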
What this unlocks — the part worth screenshotting
Here's the workflow that matters.
- Calibrate against production. Run the engine on your live product. It generates a tuning that matches your current user base's observed behavior.
- Point it at staging. Same agents, same tunings, but the target URL or app build is your feature branch, your new paywall, your redesigned checkout.
- Diff the buckets. Compare the bucket distribution before and after the change. The cohort that moves the most is the cohort you need to pay attention to. (A minimal sketch of this diff follows.)
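Here is that diff step in sketch form, assuming you've collected per-session scores from both runs (`production_run_scores` and `staging_run_scores` are hypothetical placeholders):

```python
from collections import Counter

def bucket_distribution(scores, edges=(0.2, 0.5, 0.8)):
    """Histogram of per-session scores into probability buckets."""
    def bucket(s):
        for e in edges:
            if s < e:
                return f"<{e}"
        return f">={edges[-1]}"
    return Counter(bucket(s) for s in scores)

# Assumes both runs produced the same number of sessions.
prod = bucket_distribution(production_run_scores)   # step 1: calibrated baseline
stage = bucket_distribution(staging_run_scores)     # step 2: same agents, new build
total = sum(prod.values())
for b in sorted(prod | stage):
    delta = (stage[b] - prod[b]) / total
    print(f"{b}: {delta:+.1%}")  # step 3: the cohort that moves most matters most
```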
That's the whole thing. And the output is the kind of sentence your product team has never been able to generate before:
"Users in the 6-8 months LTV bucket with mobile + language=es drop to 4–5 months when the new paywall is enabled. Primary cause: three of their top-five actions in the existing funnel (hover pricing, toggle compare, tap features) no longer resolve on the new page structure."
That isn't a guess. It isn't an intuition. It's a measured behavioral delta, generated before a single real user ever touched the change.
Why the cohort-level view matters more than the aggregate
If you've ever shipped an A/B test with "neutral" aggregate results, you already know the pain. A feature moves the average needle by 1.2%, which barely clears statistical significance — so you ship it. Six weeks later, your best customers are quieter, your support queue looks different, and nobody can quite say why.
The answer, almost always, is that the aggregate was masking a mixed result. Your high-intent power users improved, your mid-funnel exploratory users got worse, and the averages cancelled: +10% from the 20% of traffic that's high-intent and −2.5% from the other 80% nets out to exactly zero.
Simulation fixes this by design. Every run breaks down by bucket. Every bucket breaks down by metadata. You don't get a single number — you get the distribution of outcomes across every cohort you care about.
If your top 5% of users by LTV are about to take a 12% hit, the simulation tells you that before the release notes go out.
The quiet thesis behind this whole thing
Product analytics has been stuck at "observe and explain" for fifteen years. Dashboards got better. Query languages got easier. Event tracking got cheaper. But the fundamental loop — ship, wait, measure, react — never actually changed.
We think the next decade of product tooling isn't about making that loop faster. It's about replacing the observation step with prediction, and the measurement step with simulation.
Moveo One is the prediction layer. The Simulation Engine is what extends it forward in time. Together, they're what we mean when we say "analytics that tell you what's about to happen, not just what already did."

What's coming — and how to get early access
The Simulation Engine is still in internal testing. We're running it against production workloads across multiple Moveo One customers right now to benchmark accuracy. Early results are strong enough that we're opening a small beta in the coming weeks.
If you want to simulate your product before you ship it, if you're the kind of team that's tired of shipping into a fog and measuring the wreckage afterwards, we'd love to have you in the beta.
At Moveo One, we build analytics that go beyond observation. We score every session 0–100% in real time, break behavior down by cohort, and now, with the Simulation Engine on the way, we're about to let you run that same model forward in time, on any version of your product, before a single real user touches it. If that's the product you've always wished existed, get in touch and we'll put you on the early-access list.
Literature & further reading
- Puterman, M. L. (1994). Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley.
- Markov decision process (Wikipedia): https://en.wikipedia.org/wiki/Markov_decision_process
- Epstein, J. M., & Axtell, R. (1996). Growing Artificial Societies: Social Science from the Bottom Up. MIT Press. https://direct.mit.edu/books/monograph/2503/Growing-Artificial-SocietiesSocial-Science-from
- Macal, C. M., & North, M. J. (2010). Agent-based modeling and simulation. Journal of Simulation. https://www.researchgate.net/publication/216813135_Agent-based_modeling_and_simulation
- Kohavi, R., & Thomke, S. (2017). Why Most Published A/B Test Results Are Wrong. Harvard Business Review.