
Your AI copilot is a researcher, not a developer

The most valuable thing an AI copilot does during a GTM audit isn't writing code. It's looking things up, fast, and in context.

I went into my first AI-assisted GTM audit expecting code generation to be the main value. Write me a trigger. Build me a Custom HTML tag. Generate a data layer push. Any competent LLM can do that stuff, and it works fine.

What I didn't expect was that the moments that mattered most were research moments. Questions I needed answered before I could decide whether to fix something, delete it, or leave it alone. The copilot's ability to look things up quickly and in context turned out to be worth more than any script it generated.

Attribution research and UTM validation

Midway through a container with about 150 tags, I hit a wall. There was a custom attribution system built on Universal Analytics cookies: hundreds of lines of JavaScript parsing __utmz values and injecting them into hidden form fields for Salesforce. The question wasn't whether the code was well-written. It was whether the approach still worked at all.

I asked the copilot: "Can we read GA4's source/medium the way we used to read Universal Analytics?"

Ten-second answer: No. UA stored attribution in a browser cookie (__utmz). GA4 doesn't. It sends data directly to Google's servers at collection time. There's nothing in the browser to read. The entire attribution system was parsing a cookie that stopped being written when UA sunset in July 2023.
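For context, the legacy pattern looked roughly like this. It's a simplified sketch of the approach, not the client's actual code, and it only ever returns anything if Universal Analytics is still writing the cookie:

```javascript
// Simplified sketch of the legacy pattern: parse UA's __utmz cookie
// into source/medium/campaign. GA4 never writes this cookie, so on a
// GA4-only site this always returns null.
function parseUtmz(cookieString) {
  var match = cookieString.match(/__utmz=([^;]+)/);
  if (!match) return null; // cookie absent: nothing to read

  // Value looks like: 123456.1700000000.1.1.utmcsr=google|utmccn=(organic)|utmcmd=organic
  var fields = match[1].split('.').slice(4).join('.'); // drop the numeric prefix
  var out = {};
  fields.split('|').forEach(function (pair) {
    var kv = pair.split('=');
    if (kv.length === 2) out[kv[0]] = kv[1];
  });
  return {
    source: out.utmcsr || null,
    medium: out.utmcmd || null,
    campaign: out.utmccn || null
  };
}
```

On a site that migrated to GA4, the `match` fails on every page load, and everything downstream (the hidden form fields, the Salesforce sync) silently receives nothing.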

That killed a whole line of investigation. I was about to start modernizing the cookie-parsing code. Instead, the question shifted to whether the platform's native capabilities could replace it. Ten seconds of research, an hour of work redirected.

A different kind of question came up later: "What utm_medium values does GA4 actually recognize for channel grouping?"

The copilot pulled Google's channel grouping documentation, extracted the regex patterns GA4 uses to assign channels, and cross-referenced 16 UTM values the client was using. Five mapped correctly. Eleven would show up as "Unassigned" in reports. The marketing team was running campaigns with UTM parameters GA4 couldn't interpret, and nobody had caught it. That research took about three minutes. Doing it manually would have been a 30-minute detour through Google's support pages and regex syntax.
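For a sense of what that cross-referencing looks like, here's a sketch that checks utm_medium values against a simplified subset of GA4's channel rules. The patterns are a partial rendering of Google's documented definitions (real channel assignment also considers source and campaign), so treat this as illustrative rather than authoritative:

```javascript
// Check utm_medium values against a simplified subset of GA4's
// channel-grouping rules. Partial rendering of Google's documented
// definitions, not the complete rule set. Order matters: Display is
// checked before the paid-medium regex so 'cpm' doesn't match '.*cp.*'.
var CHANNEL_RULES = [
  { channel: 'Display', test: function (m) { return ['display', 'banner', 'expandable', 'interstitial', 'cpm'].indexOf(m) !== -1; } },
  { channel: 'Paid', test: function (m) { return /^(.*cp.*|ppc|retargeting|paid.*)$/.test(m); } },
  { channel: 'Organic Social', test: function (m) { return ['social', 'social-network', 'social-media', 'sm', 'social network', 'social media'].indexOf(m) !== -1; } },
  { channel: 'Email', test: function (m) { return /^(email|e-mail|e_mail|e mail)$/.test(m); } },
  { channel: 'Referral', test: function (m) { return m === 'referral'; } },
  { channel: 'Organic Search', test: function (m) { return m === 'organic'; } }
];

function classifyMedium(medium) {
  var m = medium.toLowerCase();
  for (var i = 0; i < CHANNEL_RULES.length; i++) {
    if (CHANNEL_RULES[i].test(m)) return CHANNEL_RULES[i].channel;
  }
  return 'Unassigned'; // GA4 can't map this medium to a channel
}
```

Run the client's 16 values through something like this and the eleven "Unassigned" mediums (a `newsletter`, say, instead of `email`) fall out immediately.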

Consent mode enforcement levels

I was staring at the GTM consent settings UI, looking at two options that seemed nearly identical: "Built-In Consent Checks" and "Additional Consent Checks." Both appeared to control whether a tag fires based on consent state. I genuinely couldn't tell why both existed.

The copilot explained the difference. Built-In means the tag still fires, but in a degraded mode: cookieless pings where Google models the conversions it can't directly measure. Additional means a hard block: no consent, no data, no modeling. And a third state, NOT_NEEDED, means the tag ignores consent entirely.

That distinction shaped the entire approach to fixing consent across the container. Six tags were set to NOT_NEEDED when they should have been respecting consent. But the fix wasn't flipping them all to the strictest setting (which is what the copilot initially suggested). Some tags needed Built-In, where degraded measurement is acceptable. Others needed Additional, where any data without consent is a compliance risk. Getting that right required both the technical explanation and knowing enough about the client's consent model to apply it correctly.
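The decision logic reduces to something like this sketch. The input fields here are simplified labels I've invented for illustration, not GTM API properties:

```javascript
// Sketch of the triage described above: given a tag that currently
// ignores consent, which enforcement level should it get?
// Inputs are illustrative labels, not GTM API fields:
//   ignoresConsent - tag is set to NOT_NEEDED today
//   degradedModeOk - cookieless pings + modeled conversions are acceptable
function recommendEnforcement(tag) {
  if (!tag.ignoresConsent) return 'leave as configured';
  return tag.degradedModeOk
    ? 'Built-In'     // fires cookieless pings; Google models missing conversions
    : 'Additional';  // hard block: no consent, no data, no modeling
}
```

The hard part isn't the branch; it's deciding, per tag, whether degraded measurement is acceptable or a compliance risk, which is where the client's consent model comes in.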

The right profile for this workflow

There's a specific profile where an AI copilot's research value is highest. It's not the GTM specialist with 10 years of experience who has GA4 channel grouping rules memorized. That person gets time savings on documentation and repetitive tasks, but the research value is modest.

It's also not the marketing coordinator who just got GTM admin access and doesn't know enough to ask the right questions. "What's wrong with my container?" is too broad for useful output.

The sweet spot is the person in the middle. They understand what consent mode is but can't recall the enforcement levels. They know UTM parameters but don't have GA4's regex patterns committed to memory. They can evaluate an answer and act on it, but they can't produce it from recall. That's where the copilot transforms a multi-day audit into a single session.

This is the profile of most people who inherit a GTM container. Competent enough to be handed the responsibility. Not specialized enough to know every platform's quirks by heart. And the quirks change. Google updates consent mode behavior. GA4 adds new channel definitions. Platform templates get deprecated and replaced. Keeping up with every change across every platform isn't realistic when GTM management is one part of your job, not all of it.

Better data, better questions

When the copilot had access to a structured dataset (findings categorized, affected tags identified), the research questions got specific. Instead of "tell me about consent mode," the question was "these six advertising tags bypass consent. What enforcement level should each one have based on the tag type and the data it collects?"

Specificity changes the output quality. The copilot can focus on one thing (researching the correct configuration for known tags) rather than parsing the container, identifying problems, and proposing fixes all at once. The structured data handles the first two steps. The copilot handles the research.
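A rough sketch of how structured findings become narrow questions. The data shapes, finding IDs, and tag names are invented for illustration:

```javascript
// Structured audit findings: category plus the concrete tags affected.
// IDs and tag names are made up for this example.
var findings = [
  { id: 'CONSENT-01', category: 'consent', tags: ['AW - Conversion', 'FB - PageView'] },
  { id: 'UTM-03', category: 'attribution', tags: ['HTML - UTM Parser'] }
];

// Turn a finding into a research question that names the exact tags,
// instead of a broad "tell me about consent mode" prompt.
function buildResearchQuestion(finding) {
  if (finding.category === 'consent') {
    return 'These ' + finding.tags.length + ' tags bypass consent (' +
      finding.tags.join(', ') +
      '). What enforcement level should each have, given its tag type and the data it collects?';
  }
  return 'Investigate finding ' + finding.id + ' affecting: ' + finding.tags.join(', ');
}
```

The structured data does the parsing and problem identification; the generated question hands the copilot only the research step.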

I've found this pattern holds across every audit. Without structured data, the copilot's research is broader and shallower. With it, the research goes deep on exactly the questions that matter.

Research speed as the real bottleneck

Looking back, the bottleneck was never "how do I configure this tag." GTM's UI is well-documented and Preview mode gives instant feedback. The bottleneck was always "should I configure this tag, and how specifically?"

Should I delete the UTM parser or modernize it? Depends on whether GA4's native attribution plus HubSpot's UTM handling covers the client's reporting needs. I asked the copilot: "Can HubSpot handle UTM parameter capture natively, or does it need the custom JavaScript that's in this container?" The answer wasn't "probably." It was specific: custom properties need to be created manually, added as hidden form fields, auto-populated from URL parameters, and synced to Salesforce via field mapping. HubSpot doesn't do this out of the box. That specificity changed the recommendation from "just remove the custom code" to "here are three options with different tradeoffs."

That kind of answer would have taken me a solid 20 minutes to compile on my own: searching HubSpot's knowledge base, finding the relevant documentation on custom properties, confirming the Salesforce sync requirements. The copilot had it in seconds because the question was narrow and specific. I wasn't asking "how does HubSpot work." I was asking "does HubSpot do this one thing that would let me delete 640 lines of JavaScript."
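The core of that replacement pattern, reading UTM values from the query string so they can populate hidden form fields, is small. A hedged sketch (parameter handling simplified; HubSpot's own field auto-population has its own rules):

```javascript
// Sketch: extract UTM parameters from a query string so they can be
// written into hidden form fields that map to custom properties.
var UTM_KEYS = ['utm_source', 'utm_medium', 'utm_campaign'];

function extractUtmParams(queryString) {
  var params = {};
  queryString.replace(/^\?/, '').split('&').forEach(function (pair) {
    var kv = pair.split('=');
    if (UTM_KEYS.indexOf(kv[0]) !== -1 && kv[1]) {
      params[kv[0]] = decodeURIComponent(kv[1]);
    }
  });
  return params;
}
```

Twenty lines like this, plus manually created properties and a Salesforce field mapping, versus 640 lines of cookie parsing: that's the tradeoff the research surfaced.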

Should I delete a customer ID encoder that hasn't fired in a year? Maybe. But it pushes to the data layer, and another system might be reading from it. The audit data alone couldn't confirm that. I paused it instead of deleting it.

Should I change the consent configuration on a batch of advertising tags? Technically straightforward, but shifting consent types on 40+ tags changes conversion volume in Google Ads. The copilot helped me understand the reporting implications and draft a communication for the client team before making any changes. The research prevented a technically correct fix from becoming a stakeholder problem.

Every decision that slowed me down was a research problem. The implementation was straightforward once I knew what to do. If you're thinking about bringing AI into your GTM audit workflow, the research speed is the capability worth paying attention to.

Audit your GTM container

TagManifest gives you an instant health score and prioritized fixes.

Scan Your Container