BIPI

PLG Instrumentation: Building an Event Taxonomy That Survives Year Two

Growth Systems

Most product analytics implementations rot within 18 months because event names drift, teams add events without governance, and identity resolution breaks at scale. A clean taxonomy on day one saves a 12-month rebuild later.

By Arjun Raghavan, Security & Systems Lead, BIPI · August 19, 2024 · 8 min read

#plg #product-analytics #growth

A PLG company came to us last year with 4,200 distinct event names in Amplitude. Their growth team couldn't trust any cohort analysis because the same user action had been instrumented three different ways across two years and a frontend rewrite. The data wasn't wrong, exactly. It was untrustworthy, which is functionally the same thing.

Rebuilding their event taxonomy took four months and required a full reinstrumentation. The cost was a quarter of growth team productivity and roughly $180K in engineering time. The original taxonomy decision had taken an afternoon two years prior. That's the math on getting this wrong.

Event taxonomy as a contract, not a habit

An event taxonomy is a contract between product, engineering, marketing, and analytics about what gets tracked, how it's named, and what properties travel with it. Without that contract written down and enforced, you get drift within six months.

  • Naming convention: object_action in snake_case, with the verb last (e.g. 'project_created', not 'createdProject')
  • Required properties on every event: user_id, account_id, source, environment
  • Event approval workflow: PRs reviewed by analytics owner before merge
  • Deprecation policy: events get versioned (v2), not silently changed
  • Quarterly taxonomy audit to retire dead events

The naming convention itself is less important than the consistency. We've seen teams ship 'object_action', 'Action Object', and 'doSomethingHappened' all in the same product. Pick one and enforce it ruthlessly.
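One way to make the contract enforceable rather than aspirational is a small check that runs in CI or in the tracking wrapper. The following is a minimal sketch, assuming the snake_case object_action convention and the four required properties listed above. The function name `validate_event` and the regex are illustrative, not part of any vendor SDK, and a regex cannot confirm that the last word is actually a verb, so human review still matters.

```python
import re

# Assumed convention: lowercase snake_case with at least two words.
# This checks the shape only; it cannot tell whether the last word is a verb.
EVENT_NAME = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)+$")

# Required on every event, per the taxonomy contract.
REQUIRED_PROPERTIES = {"user_id", "account_id", "source", "environment"}


def validate_event(name: str, properties: dict) -> list:
    """Return a list of problems; an empty list means the event passes."""
    problems = []
    if not EVENT_NAME.fullmatch(name):
        problems.append(f"name {name!r} is not snake_case object_action")
    missing = sorted(REQUIRED_PROPERTIES - set(properties))
    if missing:
        problems.append("missing required properties: " + ", ".join(missing))
    return problems
```

With this in place, 'project_created' with all four properties passes, while 'createdProject', 'Action Object', and 'doSomethingHappened' are all rejected on the name alone.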

Identity resolution breaks at scale; plan for it

Anonymous-to-identified user stitching is where most PLG analytics quietly fall apart. A user signs up on the marketing site (anonymous_id), creates an account (now has a user_id), invites teammates (each one a separate identify call), and later comes in through an SSO flow (potentially picking up a new anonymous_id). You have to stitch all of this back to one person.

Segment, RudderStack, and Snowplow all handle this differently. Test it explicitly with a synthetic user journey before trusting your funnels.
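A synthetic journey test needs an expected answer to compare against. The sketch below is a reference model only: it treats identity as a union-find graph where any two identifiers seen together belong to one person. It is not how Segment, RudderStack, or Snowplow resolve identity internally, and the class name and every id in it are made up for illustration. The idea is to replay the same journey through your real pipeline and check that it agrees with the model.

```python
class IdentityGraph:
    """Union-find over identifiers: ids linked together resolve to one person."""

    def __init__(self):
        self._parent = {}

    def _find(self, ident):
        self._parent.setdefault(ident, ident)
        while self._parent[ident] != ident:
            # Path halving: point each visited node at its grandparent.
            self._parent[ident] = self._parent[self._parent[ident]]
            ident = self._parent[ident]
        return ident

    def link(self, a, b):
        """Record that two identifiers were seen for the same person."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_person(self, a, b):
        return self._find(a) == self._find(b)


# Synthetic journey; all ids are hypothetical.
graph = IdentityGraph()
graph.link("anon-web-1", "user-42")  # marketing-site signup, then account creation
graph.link("anon-sso-7", "user-42")  # SSO flow issued a fresh anonymous_id
graph.link("anon-web-9", "user-43")  # invited teammate, a different person

assert graph.same_person("anon-web-1", "anon-sso-7")      # one person, two anonymous ids
assert not graph.same_person("anon-web-1", "anon-web-9")  # teammates stay separate
```

If your funnel tool counts the first user above as two people, or merges the teammate into the inviter, the stitching is broken before any dashboard is built on it.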

Server-side vs client-side: stop doing it all client-side

Client-side tracking is convenient and unreliable. Ad blockers, browser privacy modes, and aggressive Safari ITP wipe out 15-30% of events depending on your audience. Critical events (signups, payments, conversions) belong on the server. Marketing-only events (page views, scroll depth) can stay client-side.
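The split can be written down as an explicit allowlist so that a critical event sent from the browser is caught rather than silently trusted. This is a minimal sketch under stated assumptions: the event names are hypothetical examples of signups, payments, and conversions, and `accept_event` is an illustrative guard you would place in your own ingestion layer, not a feature of any particular CDP.

```python
# Hypothetical critical events that must be emitted by the backend.
SERVER_ONLY = frozenset({"signup_completed", "payment_succeeded", "trial_converted"})


def accept_event(name: str, origin: str) -> bool:
    """Accept an event unless a critical one arrives from the client."""
    if origin not in ("client", "server"):
        raise ValueError(f"unknown origin: {origin!r}")
    return not (name in SERVER_ONLY and origin == "client")
```

Marketing-only events such as page views pass from either origin; a 'payment_succeeded' arriving from the client is refused, which surfaces the misplaced instrumentation early.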

Tools, with honest tradeoffs

Segment is mature and expensive once you scale. Snowplow gives you full data ownership and infinite flexibility but costs you an engineer to operate. RudderStack is the open-source-ish middle ground. PostHog has caught up dramatically and bundles product analytics, replays, and feature flags. Amplitude and Mixpanel are analytics layers that need a CDP underneath.

  1. Under 1M events/month and need to ship fast: PostHog
  2. 1-50M events/month, mid-size team: RudderStack + Amplitude
  3. 50M+ events and data team owns warehouse: Snowplow + dbt + your own viz
  4. Enterprise with compliance team that won't move: Segment + Amplitude

The right tool depends entirely on data team maturity. If you don't have a data engineer, Snowplow will hurt you. If you have a strong data team, Segment will feel like training wheels you're paying $200K a year for.

What to instrument first

On day one, you don't need 200 events. You need maybe 25 well-chosen ones that cover the activation funnel, core product loops, and revenue-generating actions. We start every PLG instrumentation with a single map: from anonymous landing to first value to first invitation to first payment to first expansion. Each transition gets one event. Add depth later.

The teams that scale this cleanly are the ones who treat the first 25 events as the spine, version it, and grow from there. The teams that ship 200 events on day one are the ones we end up rebuilding from scratch in year two.
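The spine can live in the repository as versioned data, which makes gaps visible in review. The sketch below encodes the five stages named above and one event per transition. The event names, the `TAXONOMY_VERSION` constant, and the `missing_transitions` helper are all illustrative assumptions; substitute whatever your own product treats as first value, invitation, payment, and expansion.

```python
# Bump whenever the spine changes, so downstream queries can pin a version.
TAXONOMY_VERSION = 1

# The journey from anonymous landing to first expansion.
STAGES = [
    "anonymous_landing",
    "first_value",
    "first_invitation",
    "first_payment",
    "first_expansion",
]

# One event per transition; event names are hypothetical.
SPINE = {
    ("anonymous_landing", "first_value"): "project_created",
    ("first_value", "first_invitation"): "teammate_invited",
    ("first_invitation", "first_payment"): "subscription_started",
    ("first_payment", "first_expansion"): "subscription_upgraded",
}


def missing_transitions(stages, spine):
    """Return consecutive stage pairs that have no event assigned."""
    expected = list(zip(stages, stages[1:]))
    return [pair for pair in expected if pair not in spine]


assert missing_transitions(STAGES, SPINE) == []
```

Five stages give four transitions and therefore four spine events; the rest of the first 25 hang off these as depth is added.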

Governance is the unsexy multiplier

Assign one person as analytics owner. Every new event proposal goes through them. They approve the name, the properties, the implementation location. This sounds like bureaucracy. It's the difference between trustworthy data in year three and an expensive rebuild.
