Scoring Engine Spec

Crypto Signal Scoring Framework

Signal scoring converts raw transfer events into prioritized operational outputs using confidence, persistence, and market-context weights.

Version: v1.1
Last updated: 2026-02-17
Scope: Event scoring and priority assignment for production alerts

Confirmation logic

  • Base score is generated only after finality and route-validation checks pass.
  • Confirmation weights include finality confidence, attribution confidence, and route-completeness score.
  • Signals with incomplete route decomposition are capped below critical-priority levels.
  • Score state is recalculated when downstream route updates materially change confidence.

Threshold logic

  • Trigger gates require both size threshold pass and context threshold pass.
  • Context thresholds include venue inventory impact and persistence conditions.
  • Adaptive bands adjust by realized volatility and liquidity depth buckets.
  • Priority escalation requires repeated threshold passes in rolling windows.
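A hedged sketch of the dual-gate trigger, under assumed threshold values: the notional floor, depth-bucket factors, and venue-impact cutoff below are illustrative, not production parameters.

```python
def passes_trigger(size_usd: float, venue_impact: float, persistent: bool,
                   realized_vol: float, depth_bucket: str) -> bool:
    """Both the size gate and the context gate must pass to trigger."""
    base_threshold = 10_000_000.0  # assumed notional floor
    # Adaptive band: thinner depth and higher realized vol lower the bar.
    depth_factor = {"deep": 1.5, "medium": 1.0, "thin": 0.6}[depth_bucket]
    vol_factor = 1.0 / (1.0 + realized_vol)
    size_gate = size_usd >= base_threshold * depth_factor * vol_factor
    # Context gate: venue inventory impact plus a persistence condition.
    context_gate = venue_impact >= 0.02 and persistent
    return size_gate and context_gate
```

Priority escalation would then require this function to return True repeatedly within a rolling window, rather than on a single pass.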

Entity clustering

  • Counterparty quality contributes weighted score adjustments by confidence band.
  • Cluster confidence decay is applied when drift exceeds tolerance windows.
  • Multi-entity corroboration increases confidence multipliers for directionality.
  • Unknown-cluster concentration triggers uncertainty penalties.
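One way to combine these cluster adjustments is sketched below. The decay rate, corroboration bonus, and concentration penalty are assumptions chosen to illustrate the shape of the logic.

```python
import math

def cluster_confidence(base_conf: float, drift: float, tolerance: float,
                       corroborating_entities: int,
                       unknown_concentration: float) -> float:
    conf = base_conf
    # Exponential decay once drift exceeds the tolerance window.
    if drift > tolerance:
        conf *= math.exp(-(drift - tolerance))
    # Multi-entity corroboration raises confidence, capped at 1.0.
    conf = min(1.0, conf * (1.0 + 0.1 * max(0, corroborating_entities - 1)))
    # Unknown-cluster concentration applies an uncertainty penalty.
    if unknown_concentration > 0.5:
        conf *= 0.8
    return conf
```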

Exchange labeling approach

  • Exchange class type affects score interpretation coefficients.
  • Internal exchange routes are scored as low directional impact by default.
  • Cross-exchange settlement flows receive neutral baseline unless persistence confirms pressure.
  • Venue-specific coefficients are versioned and audited.
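A hypothetical versioned coefficient table makes the labeling rules concrete. The class names and coefficient values are assumptions used only to show how versioning and default behavior might be wired together.

```python
VENUE_COEFFICIENTS = {
    "version": "2026.02",  # coefficients are versioned for audit trails
    "classes": {
        "internal_maintenance": 0.1,       # internal routes: low directional impact
        "cross_exchange_settlement": 0.5,  # neutral baseline
        "external_boundary": 1.0,          # confirmed boundary exit or entry
    },
}

def directional_coefficient(route_class: str, persistence_confirmed: bool) -> float:
    """Look up the interpretation coefficient for a labeled route class."""
    coeff = VENUE_COEFFICIENTS["classes"].get(route_class, 0.5)
    # Settlement flows stay neutral unless persistence confirms pressure.
    if route_class == "cross_exchange_settlement" and persistence_confirmed:
        coeff = 1.0
    return coeff
```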

Update frequency

  • Scoring outputs are updated on each qualifying event and route-state change.
  • Coefficient tuning occurs weekly with backtest review.
  • Emergency tuning is enabled during volatility shocks.
  • Full scorecard recalibration runs monthly.

This specification defines how the crypto signal scoring engine transforms route-validated transfer events into ranked operational outputs. The design is explicitly control-first: large notional movement is treated as one variable, not a verdict. In production, whale alert scoring only becomes escalation-eligible when size, route quality, attribution confidence, and persistence all converge. This prevents notional outliers from bypassing uncertainty controls during high-noise sessions.

The framework is built for decision stability under changing market structure. Coefficients can be tuned, but score-state transitions remain replayable and auditable so analysts can explain why a specific event was classified as informational, elevated, high, or critical at any point in time. For methodology-level context on confirmation state transitions and label governance, refer to the production flow methodology and the entity and exchange labeling system.

Core scoring model

  • BaseScore = SizeWeight + RouteWeight + EntityWeight
  • AdjustedScore = BaseScore * RegimeMultiplier * PersistenceMultiplier
  • Priority = f(AdjustedScore, ConfidenceFloor, AlertPolicy)

The model separates evidence quality from directional interpretation. SizeWeight captures transfer scale after normalization, RouteWeight captures decomposition completeness and venue-boundary clarity, and EntityWeight captures attribution confidence and counterparty quality. This separation is essential for large transfer analysis because it blocks the common failure mode where high notional dominates weak route evidence.

AdjustedScore introduces regime and persistence effects only after the base evidence passes minimum quality floors. In practical terms, this means volatility-aware multipliers can amplify already credible events, but they cannot rescue events with unresolved route branches or degraded entity confidence. The objective is to avoid false urgency when transaction paths are ambiguous.

For interpretation, the model is better viewed as a constrained decision function than a raw ranking formula. Priority is not a direct mapping from value size. It is a policy-aware output that gates escalation based on confidence floors and operational risk policy. This is how the system preserves signal quality when market impact signals diverge from headline transfer size.
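The three formulas above can be sketched as follows. The quality floor, tier cut-offs, and confidence floor are assumed values for illustration; the structure, where multipliers only amplify events whose base evidence passes quality floors, mirrors the spec.

```python
def adjusted_score(size_w: float, route_w: float, entity_w: float,
                   regime_mult: float, persistence_mult: float,
                   quality_floor: float = 0.3) -> float:
    base = size_w + route_w + entity_w
    # Multipliers amplify already credible events; they cannot rescue
    # events with weak route or entity evidence.
    if min(route_w, entity_w) < quality_floor:
        return base
    return base * regime_mult * persistence_mult

def priority(score: float, confidence: float,
             confidence_floor: float = 0.6) -> str:
    """Policy-aware mapping from AdjustedScore to a priority tier."""
    # The confidence floor gates escalation regardless of raw score.
    if confidence < confidence_floor:
        return "informational" if score < 1.5 else "elevated"
    if score >= 2.5:
        return "critical"
    if score >= 1.8:
        return "high"
    if score >= 1.2:
        return "elevated"
    return "informational"
```

Note how the representative sequence below falls out of this structure: high notional with an unresolved branch and weak corroboration never clears the critical cut-off.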

A representative sequence:

  1. A high-value transfer passes size thresholds.
  2. Route decomposition identifies one confirmed boundary crossing and one unresolved branch.
  3. Entity confidence remains medium because cluster corroboration is partial.
  4. Persistence is absent in the rolling window.
  5. Score remains below critical despite high notional.

This behavior is intentional: it favors controlled conviction over headline-driven escalation.

Priority tiers

  1. Informational: monitor-only, no escalation.
  2. Elevated: analyst review required.
  3. High: risk workflow escalation.
  4. Critical: immediate execution and risk action path.

Tier boundaries reflect operational response cost, not just score quartiles. Informational is designed for context accumulation and does not trigger intervention paths. Elevated creates a human review checkpoint where attribution uncertainty and route nuance can be resolved before risk actions are taken.

High and Critical are reserved for events where confidence, threshold persistence, and policy gates align. In production, this distinction reduces noisy churn in incident channels and keeps action bandwidth focused on events with durable directional evidence. When tuned correctly, the tiering system treats market impact signals as conditional on context rather than assuming all large transfers are market-moving.

A practical operating pattern is to measure transition quality between tiers, not only absolute alert counts. If Elevated -> High promotion rates spike while replay confidence declines, the issue is typically coefficient imbalance or route-quality drift, not true directional pressure. Tracking tier transition health improves robustness of downstream risk workflows.
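The transition-health pattern described above can be sketched as a small monitoring helper. The promotion-rate and replay-confidence thresholds are assumptions; the point is the joint condition, not the specific numbers.

```python
from collections import Counter

def transition_health(transitions: list[tuple[str, str]],
                      replay_confidences: list[float]) -> dict:
    """Measure Elevated -> High promotion rate against replay confidence."""
    counts = Counter(transitions)
    elevated_total = sum(n for (src, _), n in counts.items() if src == "elevated")
    promotions = counts.get(("elevated", "high"), 0)
    promo_rate = promotions / elevated_total if elevated_total else 0.0
    avg_replay_conf = (sum(replay_confidences) / len(replay_confidences)
                       if replay_confidences else 0.0)
    # Spiking promotions with declining replay confidence points to
    # coefficient imbalance or route-quality drift, not real pressure.
    suspect_drift = promo_rate > 0.5 and avg_replay_conf < 0.6
    return {"promotion_rate": promo_rate,
            "avg_replay_confidence": avg_replay_conf,
            "suspect_drift": suspect_drift}
```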

Whale alert scoring under route uncertainty

Whale alert scoring is often misapplied when systems treat custody reshuffles and external transfers as equivalent. This framework avoids that by requiring route-boundary semantics before directional weight is increased. Internal exchange maintenance routes default to low directional impact, while confirmed boundary exits or entries are evaluated for escalation.

The critical design choice is uncertainty propagation. If route decomposition is incomplete, the uncertainty is carried into score caps rather than hidden in a single aggregate metric. This preserves interpretability and prevents overconfident alerts during fragmented transfer patterns.

Edge cases that commonly distort whale monitoring:

  • Cold-to-hot venue rotations that resemble external inflow bursts.
  • Cross-exchange settlement loops that inflate gross movement but net to neutral pressure.
  • Mixed-ownership aggregator hops where partial attribution can incorrectly imply coordinated intent.

In these cases, the alerting layer should remain conservative until corroboration arrives through repeated directional passes, improved route completeness, or stronger cluster confidence. This is where large transfer analysis benefits from explicit uncertainty penalties: the system can still surface the event for review without overstating directional certainty.

A technical example: a multi-hop transfer appears to leave exchange custody, then partially re-enters via a settlement branch within the same short window. Without boundary-aware logic, this sequence can be misread as two independent external flows. Under this framework, only confirmed net boundary movement receives significant directional weight, and unresolved branches keep priority below critical.
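The boundary-aware netting in this example can be sketched as below; the hop schema is an assumed representation of route-decomposition output.

```python
def net_boundary_flow(hops: list[dict]) -> float:
    """Sum signed boundary crossings: positive = exit, negative = re-entry.

    Only confirmed crossings contribute directional weight; unresolved
    branches are skipped so they cannot inflate the net figure.
    """
    net = 0.0
    for hop in hops:
        if not hop.get("boundary_confirmed", False):
            continue  # unresolved branch: no directional weight
        sign = 1.0 if hop["direction"] == "exit" else -1.0
        net += sign * hop["amount"]
    return net
```

In the scenario above, an apparent exit followed by a confirmed partial re-entry nets to a much smaller boundary movement than the two gross legs suggest, which is exactly what keeps priority below critical.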

Transaction impact scoring and persistence windows

Transaction impact scoring is designed to evaluate whether transfer activity is likely to produce durable market pressure rather than transient notional spikes. Persistence multipliers are therefore applied across rolling windows that are asset- and liquidity-bucket aware. Repeated coherent directionality matters more than isolated size events.

Window design should align with execution reality. Very short windows overreact to bursty routing noise, while very long windows suppress timely escalation during rapid inventory shifts. A balanced configuration typically combines:

  • A short window for immediate pressure detection.
  • An intermediate window for continuity confirmation.
  • A decay function that prevents stale events from anchoring current priority.

This approach improves transaction impact scoring when liquidity regimes shift quickly. For example, if repeated exchange outflows occur during thinning order-book depth, persistence and regime multipliers can raise priority even when individual transfers are below historical headline size. Conversely, repeated internal reshuffles should remain muted because route semantics do not support external pressure interpretation.

Analytically, persistence should be treated as evidence reinforcement, not evidence substitution. Strong repetition cannot compensate for weak attribution. When attribution confidence drifts downward, persistence multipliers should flatten until label quality recovers. This keeps market impact signals tied to route validity instead of purely temporal clustering.
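A minimal sketch of this window design, combining a short detection window, an intermediate continuity window, exponential decay, and the attribution-drift flattening described above. Window lengths, the half-life, and the multiplier scale are all illustrative assumptions.

```python
import math

def persistence_multiplier(event_times: list[float], directions: list[int],
                           now: float, short_win: float = 3600.0,
                           mid_win: float = 6 * 3600.0,
                           half_life: float = 2 * 3600.0,
                           attribution_conf: float = 1.0) -> float:
    """Directions are +1/-1; times are epoch seconds."""
    # Decay-weighted directional sum over the intermediate window:
    # stale events stop anchoring current priority.
    weighted = sum(d * math.exp(-math.log(2) * (now - t) / half_life)
                   for t, d in zip(event_times, directions)
                   if now - t <= mid_win)
    # Short window requires repeated hits before any amplification.
    short_hits = sum(1 for t in event_times if now - t <= short_win)
    mult = 1.0 + 0.1 * abs(weighted) if short_hits >= 2 else 1.0
    # Persistence reinforces evidence, it does not substitute for it:
    # the multiplier flattens when attribution confidence drifts down.
    if attribution_conf < 0.5:
        mult = 1.0
    return mult
```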

Calibration limits, failure modes, and governance

Coefficient tuning should optimize for error asymmetry, not generic score smoothness. In operational environments, false-critical events can consume significant response capacity, while false-medium events can delay reaction under real pressure. Calibration therefore needs explicit guardrails by asset, venue class, and volatility regime.

Common failure modes include threshold overfitting to a single volatility period, stale venue coefficients after custody topology changes, and hidden coupling between persistence multipliers and confidence floors. Each failure mode can create silent priority drift that appears as improved recall while degrading precision.

Recommended governance controls:

  • Weekly backtest review with route-level error attribution.
  • Monthly scorecard recalibration with regime-stratified replay windows.
  • Emergency tuning protocol for volatility shocks, followed by formal post-release validation.

Governance is strongest when parameter updates are versioned and replay evidence is attached to each release. This allows analysts to audit why policy behavior changed and whether the change improved directional accuracy or only shifted alert volume.

Release timing and control governance should remain aligned with the update cadence and change-control policy.

In production, crypto signal scoring performs best when confidence gating, route semantics, persistence logic, and policy thresholds are tuned as an integrated control loop. Treated this way, the framework produces operationally useful prioritization while limiting overreaction to ambiguous high-value flows.

FAQ

Why can a large transfer receive a medium score?

Large size alone is insufficient; low route confidence or neutral context reduces final priority.

How is persistence included in scoring?

Repeated directional signals within rolling windows add persistence multipliers.

Are scoring coefficients static?

No. Coefficients are tuned on a schedule and adjusted for regime changes.

Can unknown entities trigger high-priority alerts?

Yes, but only when other confidence and impact gates are strongly satisfied.
