Expand description
Configuration for the Validator Client Monitor
The Validator Client Monitor tracks client-observed performance metrics for validators in the Sui network. It runs from the perspective of a fullnode and monitors:
- Transaction submission latency
- Effects retrieval latency
- Health check response times
- Success/failure rates
§Tuning Guide
§Monitoring Metrics
The following Prometheus metrics can help tune the configuration:
validator_client_observed_latency- Histogram of operation latencies per validatorvalidator_client_operation_success_total- Counter of successful operationsvalidator_client_operation_failure_total- Counter of failed operationsvalidator_client_observed_score- Current score for each validator (0-1)validator_client_consecutive_failures- Current consecutive failure countvalidator_client_selections_total- How often each validator is selected
§Configuration Parameters
§Health Check Settings
-
health_check_interval: How often to probe validator health- Default: 10s
- Decrease for more responsive failure detection (higher overhead)
- Increase to reduce network traffic
- Monitor
validator_client_operation_success_total{operation="health_check"}to see probe frequency
-
health_check_timeout: Maximum time to wait for health check response- Default: 2s
- Should be less than
health_check_interval - Set based on p99 of
validator_client_observed_latency{operation="health_check"}
§Failure Handling
-
max_consecutive_failures: Failures before temporary exclusion- Default: 5
- Lower values = faster exclusion of problematic validators
- Higher values = more tolerance for transient issues
- Monitor
validator_client_consecutive_failuresto see failure patterns
-
failure_cooldown: How long to exclude failed validators- Default: 30s
- Should be several times the
health_check_interval - Too short = thrashing between exclusion/inclusion
- Too long = reduced validator pool during transient issues
§Score Weights
Scores combine reliability and latency metrics. Adjust weights based on priorities:
-
reliability: Weight for success rate (0-1)- Default: 0.6
- Increase if consistency is critical
- Decrease if latency is more important than occasional failures
-
latency: Weight for latency scores- Default: 0.4
- Increase for latency-sensitive applications
- Individual operation weights can be tuned separately
§Example Configurations
§Low Latency Priority
validator-client-monitor-config:
health-check-interval: 5s
health-check-timeout: 1s
max-consecutive-failures: 3
failure-cooldown: 20s
score-weights:
latency: 0.7
reliability: 0.3
effects-latency-weight: 0.6 # Effects queries are critical§High Reliability Priority
validator-client-monitor-config:
health-check-interval: 15s
max-consecutive-failures: 10 # Very tolerant
failure-cooldown: 60s
score-weights:
latency: 0.2
reliability: 0.8Structs§
- Validator
Client Monitor Config - Configuration for validator client monitoring from the client perspective