Series: Rethinking
“Open government, Prime Minister. Freedom of information. We should always tell the press, freely and frankly, anything that they could easily find out some other way.”
— Sir Humphrey Appleby, Yes, Prime Minister
Sir Humphrey was talking about political transparency — the art of appearing open while giving away nothing that was not already available. But the line lands differently in a commercial context, and it is the reason this article exists. The data that insurance intermediaries, new MGAs, and specialist underwriting agencies believe is proprietary — the data they are building walls around, the data they cite as the reason to keep everything in-house — is, for the most part, data that their counterparties, their carriers, their reinsurers, and increasingly any well-tooled competitor can find out some other way.
In brief: The instinct to keep data and AI in-house is understandable, but it rests on a claim that does not survive examination. Most of what intermediaries call “our data” is held under obligation, not owned. The slice that is actually yours — derived features, operational telemetry — is less exclusive than the moat argument assumes. And the assets that actually differentiate an intermediary or underwriting agency have nothing to do with the archive. They are relationships, judgment, accountability, and market position. None of those is threatened by a well-governed platform. All of them are threatened by spending your best years defending an archive instead of building what compounds.
Somewhere in the senior leadership of a large insurance broker, freight network, or financial intermediary, a version of the same conversation is happening right now. We have data our competitors do not have. That data is our moat. Any platform that touches that data is a threat. We need to own the infrastructure.
I have had this conversation three times, with senior leaders from three of the world’s largest insurance brokers, in three different pitches. Each time it arrived early and with unusual intensity — not as a considered strategic position but as a reflex. What struck me was the gap between the strength of the claim and the fragility of the underlying title. The louder the ownership argument, the more it seemed to be protecting something other than the client’s interest.
The same conversation happens inside carriers. A chief underwriter or head of digital makes the case for internal AI: twenty years of claims data, pricing decisions, and loss development patterns. That data is ours. Keep it on our own infrastructure. And now I hear it from new MGAs and specialist underwriting agencies too — firms that have been in operation for a handful of years, sometimes less, telling me they have unique knowledge of their market that no platform should touch. The argument is structurally identical every time. The error is structurally identical too.
I want to be precise about what the error is, because the premise is right. Data and AI do create a moat. The question is whose data, and where the moat actually forms.
What the Data Actually Is
The data that flows through an insurance placement is not a single thing. It maps across five distinct layers, and treating them as equivalent is where the defensive strategy goes wrong.
| Layer | What it means | Who holds it |
|---|---|---|
| Legal ownership | Transferable title — permanent | The client owns their data. The carrier owns the policy terms. Neither is the broker. |
| Custody | Possession + duty of care — conditional | The broker holds client data under fiduciary obligation. The duty ends when the relationship ends. |
| Usage rights | Contractual permission — revocable, scoped | The broker uses client data to perform the engagement. Cannot repurpose without authorisation. |
| Control of capture | Who logs it, in what schema | Creates workflow advantage — not data ownership. |
| Derived features | Outcomes, labels, negotiation patterns | The strongest data argument — examined below. |
Most of what intermediaries call “their data” falls into custody and usage rights — held under obligation, not owned. Confidentiality is a covenant. It expires, it transfers, and it has defined limits. Control of capture creates real switching friction, but that friction is relational — it lives in the professional history of the relationship, not in a database you can lock away from a successor.
I hear this from new MGAs particularly. They have been writing a book for two or three years, they have some claims experience, some pricing data, and they believe this gives them a proprietary dataset that justifies building their own infrastructure. I don’t think that is right, and the next section explains why.
The Strongest Version of the Argument — and Why It Fails
Before I take this apart, it deserves to be stated at its strongest. A serious defender of the data moat position would not claim to own raw client data. The actual argument runs like this: we hold a proprietary derived dataset. How underwriters reacted to specific risk structures. Which programme configurations held up under claims. Which negotiation pathways worked in which market conditions. Decades of operational telemetry. AI makes that more valuable, not less. And owning the infrastructure that captures and processes it is how we protect it.
I think there are two things wrong with this, and they are different things.
The first is a moat-type confusion. A flywheel moat compounds automatically — the product improves as more data comes in, that improvement attracts more users, and more users generate more data. Search engines work this way. Insurance brokerage does not. What the intermediary has is an execution moat — built through process efficiency, pricing intelligence, and institutional knowledge. A broker who knows from years of placement history that a particular underwriter systematically underprices a specific risk class in a specific cycle has a genuine advantage. But execution moats don’t compound on their own. They require continuous reinvestment — analytical talent, market engagement, and professional judgment. And they don’t require infrastructure ownership to protect.
The in-house instinct treats the execution moat as though it were a flywheel. It is not.
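The difference is easy to see in a toy model. The sketch below uses invented growth, decay, and reinvestment rates that stand in for no particular market; it simply contrasts a flywheel that compounds on its own with an execution moat whose trajectory depends entirely on the reinvestment term.

```python
# Toy model contrasting a flywheel moat with an execution moat.
# All rates are illustrative assumptions, not calibrated to any market.

def flywheel_value(years, growth=0.15, value=1.0):
    """Value compounds automatically: more data -> better product -> more users -> more data."""
    for _ in range(years):
        value *= 1 + growth  # the loop reinforces itself with no extra spend
    return value

def execution_value(years, decay=0.10, reinvestment=0.12, value=1.0):
    """Value decays unless continuously reinvested in talent, tooling, and engagement."""
    for _ in range(years):
        value *= 1 - decay + reinvestment  # net change depends on reinvestment each year
    return value

print(f"Flywheel after 10 years:                  {flywheel_value(10):.2f}")
print(f"Execution moat, reinvested, after 10 yrs: {execution_value(10):.2f}")
print(f"Execution moat, neglected, after 10 yrs:  {execution_value(10, reinvestment=0.0):.2f}")
```

The neglected execution moat loses roughly two thirds of its value in a decade while the flywheel quadruples. That gap is the whole point: one asset defends itself, the other has to be defended by continuous spend on people and process, not by owning servers.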
The second problem is exclusivity — or the lack of it. This is where Sir Humphrey’s observation bites hardest.
Carriers hold their side of every placement history. Reinsurers hold aggregate loss data across portfolios that dwarf any individual broker’s book — Munich Re’s NatCatSERVICE has tracked more than 28,000 loss events since 1980; Swiss Re’s sigma database covers global P&C premium and loss trends across all major markets. A competitor with decent analytical tooling and strong carrier relationships can approximate most of what sits in your proprietary archive without ever having direct access to it.
And that is only the counterparty side. The “some other way” runs wider than most people in this conversation acknowledge. Regulatory disclosure alone — Solvency II reporting, Lloyd’s syndicate accounts, Companies House filings, AM Best and S&P ratings data — forces into the open loss ratios, reserve development, premium volumes, and combined ratios that were once the exclusive province of the parties to the transaction. A well-tooled analyst can reconstruct a surprisingly detailed picture of a competitor’s book from public sources without picking up the phone. Add the third-party data ecosystem — bureau data, catastrophe model vendors, claims databases, benchmarking services — and the walls around the proprietary archive start to look less like fortifications and more like fences with missing boards.
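Here is a minimal sketch of what that reconstruction looks like in practice, using invented placeholder figures rather than any real filing. The standard underwriting ratios fall straight out of numbers that disclosure regimes already force into the open.

```python
# Sketch: approximating a competitor's underwriting picture from disclosed figures.
# The figures below are invented placeholders; in practice they would come from
# regulatory filings (e.g. Solvency II QRTs, Lloyd's syndicate accounts, rating reports).

disclosed = {
    "gross_written_premium": 412_000_000,   # from annual filings
    "net_earned_premium":    355_000_000,   # from annual filings
    "incurred_losses":       231_000_000,   # from loss disclosures
    "operating_expenses":    117_000_000,   # from expense disclosures
}

# Standard ratios any analyst computes from public numbers alone.
loss_ratio = disclosed["incurred_losses"] / disclosed["net_earned_premium"]
expense_ratio = disclosed["operating_expenses"] / disclosed["net_earned_premium"]
combined_ratio = loss_ratio + expense_ratio  # > 100% means an underwriting loss

print(f"Loss ratio:     {loss_ratio:.1%}")
print(f"Expense ratio:  {expense_ratio:.1%}")
print(f"Combined ratio: {combined_ratio:.1%}")
```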
Then there is the part that makes this argument urgent rather than merely academic. Modern AI changes what “easily find out some other way” means in practice. The analytical capability to synthesise, cross-reference, and infer from publicly available and commercially licensed data has moved faster than most incumbents have noticed. What used to require twenty years of accumulated placement history to know intuitively — which underwriters misprice which risk classes in which cycle conditions — can now be approximated by a model trained on the aggregated experience of many participants across many sources. The window of exclusivity on most intermediary data is shorter than people assume, and it is closing fast.
For the new MGAs and underwriting agencies who believe they have built something unique in three or four years of operation — I understand the instinct, but the market data they are sitting on is a thin slice of what the carriers and reinsurers on the other side of those same transactions already hold in far greater depth. The “unique knowledge” is not in the data. It might be in the people — in the underwriter’s judgment about which risks to take and which to walk away from — but that is a different asset entirely, and it does not live in an archive.
In specialist lines — marine liability, political risk, complex cyber E&O — the data scarcity problem runs even deeper. The datasets are structurally small, not temporarily small. The specialism is defined by the rarity and complexity of the events it covers, and no amount of waiting changes that. The label problem compounds it — a complex casualty claim may not close for seven years, which means your model trains on incomplete data at exactly the point where the prediction matters most. D&O data from before the current wave of ESG litigation prices a different exposure than exists today. Cyber loss patterns from five years ago describe a threat environment that has shifted materially. The lines where proprietary AI is argued most forcefully are the lines where the statistical learning approach is least reliable.
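To make the label problem concrete, here is a minimal sketch with invented claim records. The point is structural: on long-tail lines, the open claims are the newest ones, so the training labels are least complete exactly where the model is asked to say the most.

```python
# Sketch of the label problem on long-tail lines: claims that have not closed
# carry incomplete labels, and the most recent years are the most heavily censored.
# All records below are invented for illustration.

from datetime import date

claims = [
    {"opened": date(2015, 3, 1), "closed": date(2021, 9, 1), "paid": 1_800_000},
    {"opened": date(2019, 6, 1), "closed": None, "paid": 400_000},  # still open: a reserve, not an outcome
    {"opened": date(2022, 1, 1), "closed": None, "paid": 0},        # barely developed at all
]

def is_labelled(claim, as_of=date(2024, 1, 1)):
    """A claim only yields a reliable training label once it has closed."""
    return claim["closed"] is not None and claim["closed"] <= as_of

labelled = [c for c in claims if is_labelled(c)]
censored = [c for c in claims if not is_labelled(c)]

print(f"Usable labels: {len(labelled)} of {len(claims)}")
print(f"Censored (open) claims a naive model would mislearn from: {len(censored)}")
# A model trained on 'paid' for open claims treats partial development as a final
# outcome, systematically underestimating severity on exactly the newest business.
```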
The archive decays. The capability to recondition it does not — but that capability requires people, not infrastructure.
The IP Delusion
There is one more step in the data moat reasoning that usually goes unstated. The company holds the data. The company believes it is proprietary. The implicit conclusion is that the data constitutes intellectual property — something that can be protected, defended, and that competitors are somehow infringing when they approximate the same insights from other sources. I want to be direct about this: that conclusion does not hold up legally or commercially.
If the data is a genuine business secret — something nobody else holds, generated through an original capture methodology that is itself novel — then protect it. That is precisely what trade secret law exists for. The Defend Trade Secrets Act and state equivalents provide real protection, but with a firm condition: the information must actually be secret, and the holder must take reasonable measures to maintain that secrecy. When carriers, reinsurers, and rating agencies hold their side of every transaction, and when regulators require disclosure of loss ratios, reserve development, and premium volumes, the secrecy condition is already compromised. The protection was always fragile. It is becoming more so as AI-assisted extraction closes the gap between what is formally disclosed and what can be derived from it.
Copyright offers nothing here. The Supreme Court settled this in Feist v. Rural Telephone in 1991: raw facts are not copyrightable regardless of the effort spent collecting them. A database of placement histories, claims outcomes, and pricing decisions is a database of facts. No copyright attaches to the underlying data, only to sufficiently creative selection or arrangement — and that is a thin and hard-won protection that does not cover what most organisations mean when they talk about their data.
Patent prosecution, post-Alice, is the most decisive point. The U.S. Supreme Court decision in Alice Corp. v. CLS Bank (2014) and the USPTO guidance that followed have made data-related inventions extremely difficult to patent. The two-part test asks first whether the claim is directed to an abstract idea, and second whether it adds something inventive beyond applying that idea on a computer. Methods of collecting, organising, or analysing transactional data almost always fail the first step. Adding “on a computer” no longer rescues a claim. Most of what insurance intermediaries and underwriting agencies describe as proprietary data methodology — pricing models, risk assessment frameworks, negotiation pattern libraries — would not clear Alice. The IP walls were never legally walls.
The one genuine exception is truly original source data: readings from a proprietary instrument, outputs from a bespoke capture method that physically exists only because this company created the means to generate it. That is rare in intermediary and agency businesses. The data flows through transactions between parties who both have records. Both sides of the ledger exist.
Then there is the decay argument, which matters regardless of the legal position. The things that were actually obscure — buried in paper files, trapped in pre-digital workflows, expensive to reconstruct — are being digitised at an accelerating rate. The window of practical obscurity is not just closing; it is closing faster every year. AI-assisted extraction and inference mean that what used to require years of proprietary accumulation can now be reconstructed in a fraction of the time from sources that are already public or commercially licensed. The case for the data moat was always weaker than its defenders believed. The case today is weaker still.
Protecting what is actually yours — a real trade secret, a truly original dataset, a novel methodology — is sensible and worth the effort. The mistake is thinking that a collection of anecdotal experiences, or pre-existing data originating from external sources, is a differentiating value proposition, and building strategy around defending what others can already find out some other way.
This is especially true for intermediaries — the old business maxim still applies here: “what got you here will not get you there.”
There is a final point here that I want to make carefully, because it is easy to misread. None of this is an argument against using data. Data, used well, is the engine of better decisions — better underwriting, better pricing, better claims outcomes, better client service. The argument is specifically against the belief that holding data — keeping it locked away, refusing to share it with platforms, building walls around the archive — is itself a source of competitive advantage. It is not. The same data, or a close enough approximation of it, is available to anyone willing to look. The advantage was never in the possession. It was always in the capability to act on it — and that capability lives in the people, the judgment, and the analytical frameworks that an organisation builds and maintains. Two organisations can hold equivalent data and produce very different outcomes. The difference is not in the archive. It is in what they do when they open it.
Why Player-Led Infrastructure Fails
If the data moat is an execution moat and the data is less exclusive than the argument assumes, the case for owning the infrastructure collapses further. A proprietary B2B platform built by one participant asks every counterparty to operate on a competitor’s infrastructure. I have watched this play out.
TradeLens is the clearest recent example. Launched in 2018 by Maersk and IBM as a blockchain-based logistics platform, it attracted genuine early participation from ports, freight forwarders, and some carriers. It shut down in 2022. Maersk’s own announcement cited the need for “a fully neutral, industry-wide platform.” Competing shipping lines would not join a platform owned by their largest competitor. The entities that would have created the network effects never showed up, because the player who built the thing was the reason they stayed away.
This is not a technology failure. It is an architectural inevitability — the same one every time. The entities not already clients of the sponsoring player have no incentive to join and every reason not to. The ecosystem is only as large as the sponsor’s existing footprint, which is precisely not what an ecosystem needs to be.
The positive counterpart is SWIFT, founded in 1973 because banks refused to operate on infrastructure owned by a single competitor. The cooperative structure enabled adoption across more than 11,000 institutions in over 200 countries. The network adoption that proprietary infrastructure sacrifices is the infrastructure’s most valuable property.
This connects to a pattern I wrote about in Rethinking AI for Automation: the distinction between Box A — core systems where data processing happens — and Box B — the workaround layer that exists because Box A’s systems don’t interoperate. Proprietary infrastructure creates integration gaps at the network level in exactly the way that incompatible enterprise systems create them inside an organisation. The long tail of integration — the vast number of small connection cases, each too expensive to build individually — gets extended, not shortened, by player-owned platforms.
The Assets That Actually Compound
If the archive is not the primary moat, what is? I think the answer is more valuable and more defensible, but it requires naming precisely — because people who have not clearly articulated their real assets tend to default to defending the ones they can point to.
Relationships. The introduced connection between client and counterparty and the trust that flows from it. A platform can facilitate a transaction between parties who already trust each other. It cannot create the trust. No change of infrastructure transfers a relationship.
Judgment. The accumulated expertise to assess a situation and recommend a course of action. This includes tacit knowledge that has never appeared in any system — it arrived through a conversation, a pattern noticed across dozens of placements, an instinct about which markets are hardening before the data confirms it. A competitor cannot get at it by accessing your archive.
Accountability. The professional stands behind the recommendation. If the placement goes wrong, the professional is answerable — professionally and legally. A platform has terms of service. A licensed professional has professional liability. This is the clearest single justification for the margin, and it cannot be automated or replicated.
Market access and position. Panel relationships, coverholder arrangements, and market standing built through years of performance. No new entrant replicates it regardless of platform or AI.
None of these is in the data. None compounds through accumulation. They compound through use — through transactions, through relationships honoured, through professional judgment exercised and accountable outcomes delivered.
Two cases from deployments I have been directly involved in make the point concrete. In one, a historical cyber placement — a risk that had already been bound — was run through an AI-driven platform. The analysis surfaced issues that had not appeared in the broker’s original process; the insurer could have demanded a materially higher premium. The data existed. The archive existed. What was missing was the analytical framework to interrogate it at the point of decision.
In the second, the platform flagged a risk profile that an experienced underwriter chose to override. Correctly. The flag was technically accurate, but the underwriter’s direct experience with that client gave them relational context the system could not access. That is governance: the system’s output as structured input to professional judgment.
The analyst can be automated. The underwriter who governs the AI, owns the accountability, and maintains the relationships no system can replicate — that professional is not threatened by automation. They are clarified by it.
What to Do
If data and AI create the moat — and they do — the correct question is which entity holds the data that produces the most powerful AI-enabled judgment. Not the individual broker. Not the individual carrier. The entity that sees transactions across the full distribution chain, at scale, across all participants, at the moment the B2B event occurs. A participant who contributes to that network, governs the AI it produces, and applies professional judgment to its outputs has the moat working for them. The one running AI in isolation on a narrow archive has accepted the argument’s premise and acted against their own interest.
This is the constructive counterpart to the misdirection I diagnosed in Rethinking AI for Automation: Solving the Real Problem — the same instinct to invest in the visible asset rather than the structural one, repeated at the strategic level. For what the right architecture looks like in practice — a transaction flowing through systems that understand each other semantically — see Rethinking B2B Transactions.
Treat as commodity: Core plumbing — connectivity, workflow execution, policy administration, and data transport. Solved in principle. Proprietary versions create network friction without producing differentiation.
Differentiate here: AI governance, advisory capability, and the client experience. The analytical frameworks and human oversight that constitute the professional service.
Treat data this way: Permissioned, portable, and continuously curated. Historical data not reconditioned for current market context is record, not intelligence. (A sketch of what a permissioned usage grant might carry follows this list.)
Partner without being captured: Exit rights. Multi-homing. Layer ownership — your analytics and advisory frameworks are yours, portable across any infrastructure.
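To make “permissioned, portable” less abstract, the sketch below models the minimum such a regime implies: a usage grant with an explicit scope, an expiry, and a revocation path. The UsageGrant class and its field names are assumptions for illustration, not any real platform’s schema.

```python
# Sketch: a data-usage grant that makes custody and usage rights explicit.
# Field names are illustrative assumptions, not a real platform schema.

from dataclasses import dataclass, field
from datetime import date

@dataclass
class UsageGrant:
    grantor: str                 # the client: legal owner of the data
    custodian: str               # the intermediary holding it under obligation
    scope: list[str]             # what the custodian may do, nothing more
    expires: date                # rights are time-bound, not permanent
    revoked: bool = False        # the grantor can pull the grant early
    portable_formats: list[str] = field(default_factory=lambda: ["csv", "json"])

    def permits(self, action: str, on: date) -> bool:
        """Usage is allowed only in scope, before expiry, and while unrevoked."""
        return action in self.scope and on <= self.expires and not self.revoked

grant = UsageGrant(
    grantor="ClientCo",
    custodian="BrokerCo",
    scope=["placement_analysis", "renewal_benchmarking"],
    expires=date(2026, 12, 31),
)

print(grant.permits("placement_analysis", date(2025, 6, 1)))   # True: in scope, in term
print(grant.permits("model_training", date(2025, 6, 1)))       # False: never authorised
```

The design choice worth noticing is that everything here mirrors the custody and usage-rights layers of the table earlier in this piece: ownership stays with the grantor, permission is scoped and revocable, and portability is a property of the record itself rather than a favour from the platform.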
The in-house instinct persists because the real assets — judgment, relationships, accountability — are harder to point to than a proprietary archive. I get that. But defending something of limited strategic significance while the real assets go undefended is the more serious risk. Name the assets. Invest in them deliberately. And treat the infrastructure question for what it is: a question about which layer to differentiate on — not a question of survival.
Madhav Sivadas is an enterprise software integration architect with nearly thirty years in process integration, UI automation, and enterprise workflow. He founded Inventys (acquired 2012), holds multiple US patents in software integration, and is the founder and CEO of Telligro, building AI-driven intelligent transaction networks for insurance, logistics, and financial intermediaries.
madhavsivadas.com
