Bring Your Own AI · Diagnostic Playbook

The diagnostic playbook our engineers wrote.

Built from 27 years of performance-engineering judgment. Human-authored rules drive the analysis; AI writes the report.

What the playbook contains.

Under the WPLoadTester 7 AI Assistant is a curated decision tree. The Assistant uses it to classify a failing or degraded load test, then to drill into the specific failure signature. The tree is small, deterministic, and authored entirely by Web Performance engineers.

  • Symptom-to-cause pattern rules written in plain English
  • A three-level decision tree the AI walks to classify a failing test
  • Concrete diagnostic signatures, such as the pairing of timeouts with HTTP 500 errors at the same load level
  • Navigation hints telling the AI which level to consult next

Each rule in the playbook was authored by a senior performance engineer from accumulated diagnostic experience. The LLM navigates the tree. It does not author the rules.

Three levels, one route from symptom to root cause.

The Assistant starts at the top of the tree and commits to a route before it sees that route's diagnostic content. The commitment is deliberate. It prevents the model from cherry-picking conclusions, and it produces an audit trail of how the diagnosis was reached.

L0 · Entry

Classifies the test by error rate, latency, and throughput. Routes to one of seven symptom domains. A single prompt with no parameters.

L1 · Symptom domain

Seven domains, one per failure family the playbook covers. Each L1 page tells the Assistant what to look at next and which L2 signatures are most likely.

L2 · Specific signature

Concrete diagnostic conclusions for a specific pattern. The first signature, timeouts coexisting with 500 errors, ships in 7.0. The library grows with each 7.x release.
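The commit-then-reveal walk can be sketched in a few lines of Python. This is an illustration only, not the shipped implementation: the page text, the `Walk` class, and the `timeout+http500` key are hypothetical stand-ins for the real playbook content.

```python
from dataclasses import dataclass, field

# Hypothetical playbook pages keyed by (level, name). The real pages ship
# with WPLoadTester 7; this text is placeholder content.
PLAYBOOK = {
    ("L0", ""): "Classify by error rate, latency, and throughput.",
    ("L1", "timeout"): "Check for HTTP 500 errors at the same load level.",
    ("L2", "timeout+http500"): "One cascading failure; look upstream first.",
}


@dataclass
class Walk:
    # Audit trail of every route the walker committed to, in order.
    trail: list = field(default_factory=list)

    def commit(self, level: str, name: str) -> str:
        """Record the chosen route first, then reveal that route's content."""
        self.trail.append((level, name))
        return PLAYBOOK[(level, name)]


walk = Walk()
walk.commit("L0", "")
walk.commit("L1", "timeout")
conclusion = walk.commit("L2", "timeout+http500")
print(walk.trail)
print(conclusion)
```

Because the route is appended to the trail before the page is read, the trail shows what the walker chose, not what it might have preferred after seeing the content.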

The seven L1 symptom domains.

Every load test that fails or degrades does so in one of these seven shapes. The L1 layer is fully populated in WPLoadTester 7.0.

  • timeout: Requests that exceed their configured wait time. Usually points at an upstream service running out of threads, connections, or worker capacity.
  • http500: Application-server errors returned to the client. Usually points at code paths that only misbehave under load: lock contention, exhausted pools, race conditions on shared state.
  • gateway: HTTP 502 from a reverse proxy or API gateway. Usually points at a hung or crashed backend behind the gateway, not the gateway itself.
  • loadbalancer: Uneven distribution of requests across backend servers. Usually points at sticky sessions, misconfigured weights, or one server failing while the load balancer keeps routing to it.
  • unavailable: HTTP 503 signaling overload or maintenance. Usually points at an upstream third-party service rate-limiting against the load test's traffic shape.
  • connection: TCP resets, broken pipes, refused connections. Usually points at a layer below HTTP: firewall, NAT translation, or OS file-descriptor and ephemeral-port limits.
  • performance: No errors, but response times degrade as load rises. Usually points at a database, cache, or downstream service approaching its own capacity ceiling.
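As a sketch, the seven domain names can be held in an enumeration. Three of them are tied to a specific HTTP status code in the list above, so those three can be mapped directly. The `Domain` enum and `domain_for_status` helper are hypothetical names, and the other four domains are identified by evidence other than a status code, which this sketch does not model.

```python
from enum import Enum
from typing import Optional


class Domain(Enum):
    TIMEOUT = "timeout"
    HTTP500 = "http500"
    GATEWAY = "gateway"
    LOADBALANCER = "loadbalancer"
    UNAVAILABLE = "unavailable"
    CONNECTION = "connection"
    PERFORMANCE = "performance"


# Only the status codes named in the domain list are mapped here.
STATUS_TO_DOMAIN = {
    500: Domain.HTTP500,
    502: Domain.GATEWAY,
    503: Domain.UNAVAILABLE,
}


def domain_for_status(status: int) -> Optional[Domain]:
    """Return the domain tied to an HTTP status code, or None if there is none."""
    return STATUS_TO_DOMAIN.get(status)


print(domain_for_status(502))
print(domain_for_status(200))
```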

A worked L2 example: timeouts and 500 errors at the same load level.

The first L2 signature in the library names a pattern any senior performance engineer recognizes on sight. When timeouts and HTTP 500 errors appear together at the same load level, they are usually one cascading failure, not two independent problems.

The upstream service slows under load. Requests from the service that depends on it time out waiting. That dependent service then returns 500 errors because the responses it needed never arrived. Two symptoms, one root cause. The playbook tells the Assistant to look for the pairing at the same load level and investigate it first as a single root cause, rather than as two unrelated findings.
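A minimal sketch of checking for that pairing, assuming per-load-level counts are available. The `LoadLevel` record and the numbers are made up for illustration; they are not WPLoadTester output or its data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadLevel:
    users: int      # concurrent users at this step of the test
    timeouts: int   # requests that timed out at this step
    http_500s: int  # HTTP 500 responses at this step


def paired_levels(levels):
    """Return the user counts at which timeouts and HTTP 500s both occur."""
    return [lv.users for lv in levels if lv.timeouts > 0 and lv.http_500s > 0]


# Illustrative data only.
levels = [
    LoadLevel(users=100, timeouts=0, http_500s=0),
    LoadLevel(users=200, timeouts=0, http_500s=3),
    LoadLevel(users=300, timeouts=12, http_500s=9),
    LoadLevel(users=400, timeouts=40, http_500s=31),
]

paired = paired_levels(levels)
print(paired)  # [300, 400]
```

At 200 users the 500 errors appear alone, so that level is not part of the pairing. From 300 users up both symptoms occur together, which is the pattern to investigate as one cause.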

More L2 signatures will be added in subsequent 7.x releases. The L0 and L1 layers are fully populated today, and the Assistant can fall back to L1 guidance whenever a specific L2 pattern is not yet authored.

Hard math first. AI second.

Both the playbook and the AI sit downstream of a deterministic detection layer. When a test finishes, WPLoadTester runs hard-coded calculations against the data: error rates, latency percentiles, throughput inflections, the specific arithmetic that decides whether a given condition is present. These calculations are unit-tested. They produce yes/no triggers, not interpretations.
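The shape of such a detection layer might look like the sketch below: plain arithmetic in, booleans out. The thresholds (1% error rate, 2000 ms at the 95th percentile) and the function names are assumptions chosen for the example, not the product's actual rules.

```python
from statistics import quantiles


def error_rate(errors: int, total: int) -> float:
    """Fraction of requests that failed; 0.0 when nothing was sent."""
    return errors / total if total else 0.0


def p95(samples_ms):
    """95th-percentile latency. quantiles(n=100) yields 99 cut points."""
    return quantiles(samples_ms, n=100, method="inclusive")[94]


def triggers(errors, total, samples_ms, max_error_rate=0.01, max_p95_ms=2000.0):
    """Yes/no conditions only. Thresholds here are hypothetical defaults."""
    return {
        "error_rate_exceeded": error_rate(errors, total) > max_error_rate,
        "p95_latency_exceeded": p95(samples_ms) > max_p95_ms,
    }


# Illustrative latencies: 100 ms, 200 ms, ... 2000 ms.
samples = list(range(100, 2100, 100))
result = triggers(errors=3, total=200, samples_ms=samples)
print(p95(samples))  # 1905.0
print(result)
```

Each function is small enough to unit-test in isolation, which is the property the detection layer depends on.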

A triggered condition surfaces the matching playbook entry, which carries the human-authored diagnostic context. The AI receives both, and then does two things on top:

  • It verifies. The AI cross-checks the calculated trigger against the playbook's typical signature. If they agree, the report carries that finding with confidence. If they disagree, the AI surfaces the conflict rather than picking a side.
  • It extends. The AI looks for relationships the math and the playbook did not cover: anomalies in adjacent metrics, suspicious correlations between pages, unusual cliff shapes. The playbook is the floor of the analysis, not the ceiling.
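The verify step can be pictured as a set comparison. This sketch assumes a playbook signature is a set of condition names that are expected to fire together; that representation and the `cross_check` function are illustrative, not the Assistant's actual mechanism.

```python
def cross_check(triggered: set, signature: set) -> dict:
    """Compare fired triggers with the conditions a playbook entry expects."""
    missing = signature - triggered
    if not missing:
        # Math and playbook agree: report the finding as confirmed.
        return {"status": "confirmed", "finding": sorted(signature)}
    # They disagree: report both sides instead of choosing one.
    return {
        "status": "conflict",
        "present": sorted(signature & triggered),
        "missing": sorted(missing),
    }


signature = {"timeout", "http500"}
agree = cross_check({"timeout", "http500"}, signature)
disagree = cross_check({"timeout"}, signature)
print(agree)
print(disagree)
```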

The principle holds across the product. Use hard code and math wherever possible, and feed those deterministic outputs into the AI to verify the conclusions and look for other relationships. The AI is never the primary source of truth for something that can be calculated.

What the playbook adds.

Three properties define how the playbook contributes to the analysis, and why it has to accumulate over time rather than be written in one sitting.

Human-authored

Every rule was written by a senior performance engineer from accumulated experience diagnosing real load-test failures. The LLM navigates the tree. It does not write the rules.

Deterministic and auditable

Lookup is exact-string match against the level and symptom names. There is no fuzzy retrieval, no vector index, no embedding. Every routing decision the Assistant makes is logged and can be replayed.
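A sketch of what exact-match lookup with a replayable log implies. The `PAGES` table, its placeholder text, and the log format are assumptions for illustration.

```python
# Hypothetical table of pages keyed by exact (level, symptom) strings.
PAGES = {
    ("L1", "timeout"): "timeout guidance (placeholder text)",
    ("L1", "http500"): "http500 guidance (placeholder text)",
}

routing_log = []


def lookup(level: str, symptom: str):
    """Exact-string match only: no case folding, trimming, or nearest match."""
    page = PAGES.get((level, symptom))
    routing_log.append({"level": level, "symptom": symptom, "hit": page is not None})
    return page


def replay(log):
    """Re-run logged decisions against the same table without logging again."""
    return [PAGES.get((entry["level"], entry["symptom"])) for entry in log]


first = lookup("L1", "timeout")
miss = lookup("L1", "Timeout")  # different capitalization, so no match
print(routing_log)
print(replay(routing_log) == [first, miss])
```

Because a key either matches or it does not, replaying the log against the same table always reproduces the same pages.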

Built by doing the work

The pattern recognition encoded in the playbook came from diagnosing the failures, not from reading about them. That kind of judgment accumulates from years of consulting engagements; the playbook is the artifact of that accumulation.

This layer backs the central claim on the Bring Your Own AI page: the AI sits on top of a deterministic expert system battle-tested against thousands of real test cases. The playbook is one half of that expert system. The ASM correlation rules are the other.

Reachable from any MCP client.

The playbook is exposed to MCP clients as the get_triage_prompt tool on the WPLoadTester 7 MCP server. Each call returns one level of the tree along with navigation hints to the next level. If you drive WPLoadTester from Claude Code, Codex, Cursor, Windsurf, or any other MCP-compatible client, the AI in your terminal walks the same playbook the in-app Assistant uses.
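MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `get_triage_prompt`. The tool name comes from the text above; the argument names `level` and `symptom` are assumptions, so check the MCP server reference for the real schema.

```python
import json


def triage_request(request_id: int, level: str, symptom=None) -> dict:
    """Build a JSON-RPC 2.0 tools/call message for get_triage_prompt."""
    # "level" and "symptom" are hypothetical argument names.
    arguments = {"level": level}
    if symptom is not None:
        arguments["symptom"] = symptom
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "get_triage_prompt", "arguments": arguments},
    }


request = triage_request(1, "L1", "timeout")
print(json.dumps(request, indent=2))
```

In practice an MCP-compatible client builds and sends this message itself; the sketch only shows what one step of the walk looks like on the wire.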

See the MCP server reference

Run the playbook against your own test cases.

The diagnostic playbook ships with every WPLoadTester 7 install. Request the beta and put the AI Assistant, or your favorite agentic CLI over MCP, against your existing test cases.

Comparing tiers? See the Free vs Pro split.

Copyright © 2026 Web Performance, Inc.
