Streaming Analytics affects both tracking and analytics!
Updated post from my previous blog.
Streaming Analytics will change the way you should think about both tracking and analytics in digital analytics. How?
Tracking: Since tracking has been focused on decision making, it has mainly captured performance metrics rather than signals that can be used to personalize the user experience. That has to shift with the advent of streaming analytics.
Example: a user’s aggregated interactions can reveal a lot about their preferences. Those preferences are hard to capture from a single event, and by the time batch analytics surfaces them it is too late to act. Imagine a travel company where you can search for travel packages. A search includes data such as the selected destination airport, hotel concept, price, duration, and departure time. A user runs many searches in a session, and the aggregate reveals a lot about the user’s intentions and preferences. Is the user primarily looking for a certain:
destination
hotel concept
price level (price sensitive or not)
specific date or dates
etc.
That opens up opportunities for different kinds of real-time personalization based on signals, which has less to do with tracking outcomes. The same is true for preferred sorting. You can apply the same thinking to an electronics/telco company selling cell phones: is the user looking for a particular brand, price level, set of features, etc.?
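To make the idea of session-level signals concrete, here is a minimal Python sketch of how searches could be aggregated per session. It is only an illustration: the class name, the field names (destination, hotel_concept, max_price) and the 60% threshold are my assumptions, not part of any real tracking schema.

```python
from collections import Counter, defaultdict


class SessionSearchProfile:
    """Running aggregate of the searches seen in one session (hypothetical helper)."""

    def __init__(self):
        self.searches = 0
        # field name -> Counter of the values searched for in that field
        self.counts = defaultdict(Counter)

    def update(self, search):
        """Fold one search event into the running aggregate."""
        self.searches += 1
        for field, value in search.items():
            if value is not None:
                self.counts[field][value] += 1

    def dominant(self, field, min_share=0.6):
        """Return the value used in at least min_share of all searches, else None."""
        if self.searches == 0 or field not in self.counts:
            return None
        value, n = self.counts[field].most_common(1)[0]
        return value if n / self.searches >= min_share else None


# Three searches from the same (made-up) session.
profile = SessionSearchProfile()
for search in [
    {"destination": "PMI", "hotel_concept": "family", "max_price": 1200},
    {"destination": "PMI", "hotel_concept": "adults_only", "max_price": 1500},
    {"destination": "PMI", "hotel_concept": "family", "max_price": 900},
]:
    profile.update(search)

destination_signal = profile.dominant("destination")  # same airport in every search
concept_signal = profile.dominant("hotel_concept")    # "family" in 2 of 3 searches
price_signal = profile.dominant("max_price")          # no repeated price
```

No single search tells you the user is set on one destination, but the aggregate does, and it is available while the session is still live. A field with no dominant value (here the price) is a signal too: the user may be flexible on it.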
Analytics: In batch analytics you let the query run over your data, but in streaming analytics you let your data run over your query. This is fundamentally different from a fast data warehouse with streaming ingest. Streaming (real-time) analytics is really about applications and automation, not decision making (for that you can resort to batch). Hence, you have to enable your applications to act on the analytics in real time, either by push or pull of an “analytics” feed. Also, SQL is still the most widely adopted language for analytics, so you should leverage streaming/real-time SQL.
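A rough way to picture "data running over the query" is a standing query that is registered once and then fed events as they arrive. The Python sketch below mimics a tumbling-window COUNT(*) ... GROUP BY event name and pushes each result to a callback as soon as a window closes. The function name, the event fields (ts, name) and the 60-second window are assumptions for illustration, and it assumes events arrive in timestamp order. A real streaming SQL engine handles late data, state and scaling for you.

```python
from collections import Counter


def run_standing_query(events, window_seconds, on_result):
    """Count events per name in tumbling windows and push each closed window.

    Assumes events arrive ordered by their "ts" (seconds) field.
    """
    window_start = None
    counts = Counter()
    for event in events:
        # Start of the tumbling window this event belongs to.
        bucket = event["ts"] - event["ts"] % window_seconds
        if window_start is None:
            window_start = bucket
        if bucket != window_start:
            # The previous window is complete: push its result downstream.
            on_result(window_start, dict(counts))
            counts = Counter()
            window_start = bucket
        counts[event["name"]] += 1
    if counts:
        # Flush the last window when the (finite, illustrative) stream ends.
        on_result(window_start, dict(counts))


# A made-up stream; in practice this would be an unbounded source.
stream = [
    {"ts": 0, "name": "search"},
    {"ts": 20, "name": "search"},
    {"ts": 45, "name": "add_to_cart"},
    {"ts": 61, "name": "search"},
]

results = []
run_standing_query(stream, 60, lambda start, counts: results.append((start, counts)))
```

Here the callback plays the role of the push side of an “analytics” feed: an application subscribes once and reacts to every result. For pull, the callback could instead write to a store that the application reads from.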
But what does that look like? I’ve added a super simple example using StreamProcessor (will be open sourced soon, sponsored by my employer Mathem), where I run Dataflow SQL on top of a stream to show how you can query your GA4 data in real time and write the results to Firestore to activate data in apps (web or native) in real time. The example could be much more advanced and really leverage the power of SQL, aggregations, etc., rather than writing directly from the browser to Firestore.