Traffic-driven discovery

You cannot modernize what you cannot see. Most SOAP estates have accreted over two decades, and nobody holds a current, complete inventory. Traffic-driven discovery builds that inventory automatically, from live traffic at your gateway edge — the load balancer or API gateway that your SOAP calls already pass through. If a service is taking real calls, it shows up.

This page explains how discovery works end to end: how the platform learns the inventory, how the master discovery view is computed, how consumer attribution works, and the read-path rule that keeps it fast.

The model in one paragraph

Your gateway edge already sees every SOAP request and response. A connector feeds that traffic into the platform (and, where the gateway exposes a management API, also enumerates the configured services directly). The platform persists each observed transaction, then a background worker rolls the raw rows up into pre-computed aggregation tables. The operator-facing surfaces — the master discovery view, dashboards, and per-service rollups — read from those aggregation tables, never from the raw logs at query time. That is what makes them load in sub-second time even across estates of many hundreds of services.
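The rollup step described above can be pictured with a minimal sketch. It assumes hourly buckets keyed by service and source; the `Transaction` fields, the bucket width, and the source labels are illustrative assumptions, not the platform's actual schema.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Transaction:
    """One observed SOAP call. Field names are hypothetical."""
    service: str
    source: str
    observed_at: datetime


def hour_bucket(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def roll_up(rows):
    """Count transactions per (service, source, hour) bucket."""
    counts = Counter()
    for row in rows:
        counts[(row.service, row.source, hour_bucket(row.observed_at))] += 1
    return counts


# Three raw rows collapse into two aggregate rows.
rows = [
    Transaction("OrderService", "gateway-edge",
                datetime(2024, 5, 1, 9, 5, tzinfo=timezone.utc)),
    Transaction("OrderService", "gateway-edge",
                datetime(2024, 5, 1, 9, 40, tzinfo=timezone.utc)),
    Transaction("OrderService", "runtime-proxy",
                datetime(2024, 5, 1, 10, 2, tzinfo=timezone.utc)),
]
aggregates = roll_up(rows)
```

An operator surface would then read `aggregates` (or its stored equivalent) rather than iterating over `rows`, which is the point of the design: the cost of a page load scales with the number of buckets, not the number of raw transactions.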

How the inventory is learned

Discovery has two complementary inputs, and a connector may provide one or both:

Traffic ingest. The gateway's traffic logs (request line, response status, headers, latency, client IP, and — where the edge can capture it — request and response bodies) are received by the platform. Every distinct front-end/back-end service that appears in traffic becomes a discovered service, ranked by how much real traffic it carries.
Management-API discovery. Where the gateway exposes a management API, the platform also queries it directly to enumerate virtual servers / services and their front-end and back-end endpoints. This gives you a structured inventory (names, addresses, ports) rather than one scraped purely from URLs in traffic, and surfaces configured-but-currently-idle services too.
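As a rough illustration of how the two inputs complement each other, the sketch below merges a set of configured service names (as a management API might enumerate them) with request counts seen in traffic. The function name, the record shape, and the service names are hypothetical; the platform's real merge logic is not documented here.

```python
def merge_inventory(configured, traffic_counts):
    """Combine configured services with traffic counts, busiest first.

    configured: set of service names enumerated from a management API.
    traffic_counts: dict mapping service name to observed request count.
    """
    names = set(configured) | set(traffic_counts)
    merged = [
        {
            "service": name,
            "requests": traffic_counts.get(name, 0),
            "configured": name in configured,
            # Configured but carrying no traffic in the observed period.
            "idle": traffic_counts.get(name, 0) == 0,
        }
        for name in names
    ]
    # Sort by descending traffic, then by name for a stable order.
    return sorted(merged, key=lambda e: (-e["requests"], e["service"]))


inventory = merge_inventory(
    configured={"OrderService", "BillingService", "ArchiveService"},
    traffic_counts={"OrderService": 500, "ShadowService": 80, "BillingService": 20},
)
```

In this example `ArchiveService` appears only because it is configured (and is flagged idle), while `ShadowService` appears only because it carried traffic.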

Discovery spans more than one source. Beyond the gateway-edge feed, the platform's own runtime proxy traffic is also discovered. Source is an explicit dimension, so you can filter to one source or combine across all of them; default metric views sum across sources.
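A small sketch of what "source as an explicit dimension" could look like in practice: the default sums across all sources, and an optional filter narrows to one. The row shape and the source labels (`gateway-edge`, `runtime-proxy`) are assumptions made for illustration.

```python
def requests_by_service(rows, source=None):
    """Sum request counts per service, optionally limited to one source.

    rows: iterable of (service, source, count) aggregate rows.
    source: when None, counts are summed across every source.
    """
    totals = {}
    for service, row_source, count in rows:
        if source is None or row_source == source:
            totals[service] = totals.get(service, 0) + count
    return totals


rows = [
    ("OrderService", "gateway-edge", 120),
    ("OrderService", "runtime-proxy", 30),
    ("BillingService", "gateway-edge", 45),
]
all_sources = requests_by_service(rows)
proxy_only = requests_by_service(rows, source="runtime-proxy")
```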

Once a service is discovered you onboard it with the WSDL wizard or zero-touch autopilot — discovery tells you what to onboard and in what order (busiest first); onboarding turns it into a published service.

The master discovery view

The master discovery view is the headline discovery surface. It lists every discovered service with per-service request counts over three rolling windows:

| Window | Meaning |
|---|---|
| 24 hours | How busy the service is right now |
| 7 days | Recent steady-state volume |
| 1 month | Whether it is a long-lived service or a one-off |

Sorting by any window turns "we think we have a few hundred SOAP services somewhere" into a ranked, evidence-based worklist you can drive down by traffic volume. A busy service that nobody remembered surfaces immediately; a service that has gone silent for a month is visible as such.

The view loads in sub-second time because it reads from a pre-computed aggregation table, not from the raw traffic log. Across a large estate it routinely renders hundreds of services with correct counts without a perceptible wait.
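The three windows can be derived from time-bucketed aggregates without touching raw logs. The following is a minimal sketch under stated assumptions: buckets arrive as `(service, bucket_start, count)` tuples, "1 month" is approximated as 30 days, and the window labels are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Assumed window lengths; "1 month" is treated as 30 days here.
WINDOWS = {
    "24h": timedelta(hours=24),
    "7d": timedelta(days=7),
    "1mo": timedelta(days=30),
}


def window_counts(buckets, now):
    """Sum bucket counts per service for each rolling window ending at now."""
    totals = defaultdict(lambda: dict.fromkeys(WINDOWS, 0))
    for service, bucket_start, count in buckets:
        age = now - bucket_start
        for name, span in WINDOWS.items():
            if timedelta(0) <= age < span:
                totals[service][name] += count
    return dict(totals)


def ranked(totals, window="24h"):
    """Service names ordered busiest first for the chosen window."""
    return sorted(totals, key=lambda s: totals[s][window], reverse=True)


now = datetime(2024, 5, 31, 12, tzinfo=timezone.utc)
buckets = [
    ("OrderService", now - timedelta(hours=2), 120),
    ("OrderService", now - timedelta(days=3), 900),
    ("LegacyBilling", now - timedelta(days=20), 40),
    ("InventoryService", now - timedelta(hours=1), 300),
]
totals = window_counts(buckets, now)
```

Here `LegacyBilling` has a zero 24-hour and 7-day count but a non-zero 1-month count, which is how a service that has recently gone quiet would stand out in the view.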

Consumer attribution

Discovery answers "who is calling what", not just "which services are busy". Each observed transaction carries a client IP, which the platform resolves through the standard forwarded-header chain so the original consumer is attributed rather than the gateway's own hop:

X-Client-IP — the dedicated client-IP header (the value most edges inject explicitly).
X-Forwarded-For — the de-facto proxy chain header; the left-most non-gateway entry wins.
Forwarded — the RFC 7239 standard header, including its for= form.

The platform extracts candidates from these headers (in that order) and picks the first one that is not the gateway's own immediate-hop IP. If no forwarded header is present, or every candidate equals the gateway IP, attribution gracefully falls back to the gateway hop itself — the transaction is still ingested and discovered; only the per-consumer fidelity degrades.
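The resolution order above can be sketched as follows. This is an illustrative implementation, not the platform's code: the function names are hypothetical, the `Forwarded` parsing is deliberately simplified (it handles quoted values, bracketed IPv6, and an optional port, but not every form RFC 7239 allows), and the gateway is assumed to be identified by a single IP string.

```python
def _strip_port(node):
    """Drop an optional port from a Forwarded 'for' value."""
    if node.startswith("["):        # bracketed IPv6, optionally with :port
        return node[1:].split("]", 1)[0]
    if node.count(":") == 1:        # IPv4 with :port
        return node.split(":", 1)[0]
    return node


def _forwarded_for_values(header):
    """Extract the for= values from a Forwarded header, left to right."""
    values = []
    for element in header.split(","):
        for pair in element.split(";"):
            key, _, value = pair.strip().partition("=")
            if key.lower() == "for" and value:
                values.append(_strip_port(value.strip('"')))
    return values


def resolve_consumer_ip(headers, gateway_ip):
    """Pick the first candidate that is not the gateway's own hop."""
    h = {k.lower(): v for k, v in headers.items()}
    candidates = []
    if "x-client-ip" in h:
        candidates.append(h["x-client-ip"].strip())
    if "x-forwarded-for" in h:
        candidates.extend(p.strip() for p in h["x-forwarded-for"].split(","))
    if "forwarded" in h:
        candidates.extend(_forwarded_for_values(h["forwarded"]))
    for candidate in candidates:
        if candidate and candidate != gateway_ip:
            return candidate
    # No usable header: fall back to the gateway hop itself.
    return gateway_ip
```

For example, with a gateway at `10.0.0.1`, the header `X-Forwarded-For: 10.0.0.1, 198.51.100.7` resolves to `198.51.100.7`, and a request with no forwarded headers at all resolves to `10.0.0.1`.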

Why this matters at install time. If your edge does not inject a client-IP header, every request appears to come from the gateway's own source-NAT address, and per-consumer analytics collapse to a single meaningless aggregate. Each connector page documents the one-line header-injection snippet for that vendor so attribution is accurate at the edge. This is a configuration recommendation, not a hard requirement — discovery works either way.

Raw logs vs aggregation tables — the read-path rule

A core architectural rule of the platform: time-bucketed operator surfaces read from pre-computed aggregation tables, never from raw logs at query time — and this holds for all traffic sources, not just one gateway type.

Aggregation tables drive: the master discovery view, dashboards, KPI tiles, timeline charts, per-service rollups, and SLO reports. These are refreshed continuously by a background worker.

Raw logs are used only for:

Ingest-health signals — "how long since the last row arrived from this source?" and zero-row-window detection.
Drill-down on a specific transaction — after you click into a single event by correlation ID, the raw row is fetched to show the full request/response and headers.
Curator workers — the background jobs that derive downstream tables (aggregations and learned examples) from the raw rows, offline.

The practical consequence: operator surfaces stay fast at any estate size, and the freshness of the numbers depends on the aggregation refresh, not on scanning tens of millions of raw rows on every page load.
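The ingest-health signals listed above are one of the few cases where raw-row timestamps are consulted directly. A minimal sketch, assuming arrival timestamps per source are available as a list and that windows are fixed-width; the function names and the one-hour width in the example are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def seconds_since_last_row(arrivals, now):
    """Seconds since the newest raw row arrived, or None if none ever did."""
    if not arrivals:
        return None
    return (now - max(arrivals)).total_seconds()


def zero_row_windows(arrivals, start, end, width):
    """Start times of fixed-width windows in [start, end) containing no rows."""
    empty = []
    cursor = start
    while cursor < end:
        upper = cursor + width
        if not any(cursor <= ts < upper for ts in arrivals):
            empty.append(cursor)
        cursor = upper
    return empty


def utc(hour, minute=0):
    """Helper for the example: a UTC timestamp on one fixed day."""
    return datetime(2024, 5, 1, hour, minute, tzinfo=timezone.utc)


arrivals = [utc(9, 10), utc(9, 50), utc(11, 30)]
staleness = seconds_since_last_row(arrivals, now=utc(12))
gaps = zero_row_windows(arrivals, start=utc(9), end=utc(12), width=timedelta(hours=1))
```

In this example the source last reported 30 minutes before `now`, and the 10:00 window is the only one with no rows.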

Where to go next

Connectors overview — the two sides of a connector (management-API discovery + traffic ingest), the per-vendor table, and the generic-HTTP receiver.
Learned examples — how real captured payloads become curated examples and mapping evidence.
Onboarding a service — turn a discovered service into a published one.
