Statistics
The stats config option configures statistics collection for federation
endpoints. When enabled, LightHouse captures detailed metrics about
requests to federation endpoints, including timing, client information,
and query parameters.
enabled¶
boolean
false
optional
LH_STATS_ENABLED
The enabled option controls whether statistics collection is active.
config.yaml
stats:
  enabled: true
buffer¶
object optional
The buffer option configures the in-memory ring buffer used for
non-blocking request capture. The buffer holds request data temporarily
before flushing to the database.
config.yaml
stats:
  enabled: true
  buffer:
    size: 10000
    flush_interval: 5s
    flush_threshold: 0.8
size¶
integer
10000
optional
LH_STATS_BUFFER_SIZE
The maximum number of request entries to hold in the ring buffer. If the buffer fills up before it is flushed, the oldest entries are overwritten and lost.
For high-traffic deployments, increase this value to reduce the chance of data loss during traffic spikes.
flush_interval¶
duration
5s
optional
LH_STATS_BUFFER_FLUSH_INTERVAL
How often the buffer is flushed to the database. Shorter intervals reduce the risk of data loss but increase database write frequency.
flush_threshold¶
float (0.0-1.0)
0.8
optional
LH_STATS_BUFFER_FLUSH_THRESHOLD
Triggers an immediate flush when the buffer reaches this fraction of its capacity (0.8 means 80% full). This reduces the risk of data loss during sudden traffic spikes.
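The interaction between size and flush_threshold can be illustrated with a small sketch. This is not LightHouse's implementation; the class name RequestBuffer, its methods, and the overwritten counter are hypothetical, and the timer behind flush_interval is left out.

```python
from collections import deque


class RequestBuffer:
    """Illustrative ring buffer; names here are hypothetical, not LightHouse's."""

    def __init__(self, size=10000, flush_threshold=0.8):
        self.size = size                        # mirrors stats.buffer.size
        self.flush_threshold = flush_threshold  # mirrors stats.buffer.flush_threshold
        # A deque with maxlen silently drops its oldest item when full.
        self.entries = deque(maxlen=size)
        self.overwritten = 0

    def add(self, entry):
        """Store an entry; return True when an immediate flush is due."""
        if len(self.entries) == self.size:
            self.overwritten += 1  # the oldest entry is about to be lost
        self.entries.append(entry)
        return len(self.entries) >= self.flush_threshold * self.size

    def flush(self):
        """Return all buffered entries and empty the buffer."""
        batch = list(self.entries)
        self.entries.clear()
        return batch


buf = RequestBuffer(size=10, flush_threshold=0.8)
signals = [buf.add({"path": "/fetch", "n": n}) for n in range(8)]
print(signals)            # the flush signal is raised once 8 of 10 slots are used
print(len(buf.flush()))   # number of entries handed to the database writer
```

In this model, entries are only lost when add is called on a full buffer, which is why a larger size or a lower flush_threshold leaves more headroom during spikes.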
capture¶
object optional
The capture option controls what data is collected from each request.
config.yaml
stats:
  enabled: true
  capture:
    client_ip: true
    user_agent: true
    query_params: true
    geo_ip:
      enabled: false
      database_path: /path/to/GeoLite2-Country.mmdb
client_ip¶
boolean
true
optional
LH_STATS_CAPTURE_CLIENT_IP
Records the client's IP address. When behind a reverse proxy, ensure
server.forwarded_ip_header is configured correctly.
user_agent¶
boolean
true
optional
LH_STATS_CAPTURE_USER_AGENT
Records the User-Agent header from requests.
query_params¶
boolean
true
optional
LH_STATS_CAPTURE_QUERY_PARAMS
Records URL query parameters as JSON. This is useful for analyzing which entities are being fetched or resolved most frequently.
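As a rough illustration of what "query parameters as JSON" can look like, the sketch below converts a request URL's query string into a JSON object. The exact JSON shape LightHouse stores is not specified here, so the list-per-key layout, the function name query_params_json, and the example URL are all assumptions.

```python
import json
from urllib.parse import parse_qs, urlsplit


def query_params_json(url):
    """Serialize a URL's query parameters as a JSON object (assumed shape)."""
    # parse_qs maps each parameter name to a list of its decoded values.
    params = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return json.dumps(params, sort_keys=True)


# Hypothetical request to a fetch endpoint.
recorded = query_params_json(
    "https://ta.example.org/fetch?sub=https%3A%2F%2Frp.example.org"
)
print(recorded)
```

Grouping stored records by a parameter such as sub is then enough to see which entities are requested most often.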
geo_ip¶
object optional
GeoIP lookup enables country detection from client IP addresses.
enabled¶
boolean
false
optional
LH_STATS_CAPTURE_GEO_IP_ENABLED
Enables GeoIP country lookup.
database_path¶
file path
required if enabled
LH_STATS_CAPTURE_GEO_IP_DATABASE_PATH
Path to a MaxMind GeoLite2-Country or GeoIP2-Country database file (.mmdb).
Obtaining a GeoIP Database
The GeoLite2-Country database is free but requires registration at
MaxMind.
Download the .mmdb file and specify its path here.
retention¶
object optional
The retention option defines how long statistics data is kept.
config.yaml
stats:
  enabled: true
  retention:
    detailed_days: 90
    aggregated_days: 365
detailed_days¶
integer
90
optional
LH_STATS_RETENTION_DETAILED_DAYS
Number of days to keep individual request logs. After this period, detailed logs are deleted but daily aggregates are preserved.
aggregated_days¶
integer
365
optional
LH_STATS_RETENTION_AGGREGATED_DAYS
Number of days to keep daily aggregated statistics. This enables long-term trend analysis with minimal storage requirements.
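To make the two retention windows concrete, the following sketch computes the cutoff dates implied by detailed_days and aggregated_days for a given day. The function name retention_cutoffs is hypothetical, and whether LightHouse treats the cutoff day itself as expired is an assumption not covered by this page.

```python
from datetime import date, timedelta


def retention_cutoffs(today, detailed_days=90, aggregated_days=365):
    """Return the dates before which each kind of data would be expired."""
    return {
        # Individual request logs older than this are deleted.
        "detailed_before": today - timedelta(days=detailed_days),
        # Daily aggregates older than this are deleted.
        "aggregated_before": today - timedelta(days=aggregated_days),
    }


cutoffs = retention_cutoffs(date(2025, 6, 30))
for name, cutoff in cutoffs.items():
    print(name, cutoff.isoformat())
```

With the defaults, a request's detailed log outlives its usefulness after about three months, while its contribution to the daily aggregate remains available for a full year.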
endpoints¶
array of strings
empty (all federation endpoints)
optional
LH_STATS_ENDPOINTS
List of endpoint paths to track. If empty or not specified, all federation endpoints are tracked (excluding the admin API).
For environment variables, use comma-separated values: LH_STATS_ENDPOINTS="/.well-known/openid-federation,/fetch,/resolve"
config.yaml
stats:
  enabled: true
  endpoints:
    - /.well-known/openid-federation
    - /fetch
    - /resolve
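The comma-separated environment variable form and the "empty means all" rule can be sketched as follows. The helper names parse_endpoints and is_tracked are hypothetical, exact path matching is an assumption, and the exclusion of the admin API is not modeled.

```python
def parse_endpoints(value):
    """Split a comma-separated LH_STATS_ENDPOINTS value into a list of paths."""
    return [part.strip() for part in value.split(",") if part.strip()]


def is_tracked(path, endpoints):
    """An empty list means every federation endpoint is tracked."""
    return not endpoints or path in endpoints


endpoints = parse_endpoints("/.well-known/openid-federation,/fetch,/resolve")
print(endpoints)
print(is_tracked("/fetch", endpoints))  # listed explicitly
print(is_tracked("/list", endpoints))   # not listed
print(is_tracked("/list", []))          # empty list: all endpoints tracked
```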
Complete Example¶
config.yaml
stats:
  enabled: true
  buffer:
    size: 10000
    flush_interval: 5s
    flush_threshold: 0.8
  capture:
    client_ip: true
    user_agent: true
    query_params: true
    geo_ip:
      enabled: true
      database_path: /data/GeoLite2-Country.mmdb
  retention:
    detailed_days: 90
    aggregated_days: 365
  endpoints: []  # Track all federation endpoints
Database Considerations¶
Statistics data is stored in two tables:
- federation_request_logs: individual request records (detailed)
- federation_daily_stats: aggregated daily statistics (compact)
Storage Estimates¶
| Traffic Level | Requests/Day | Daily Storage | Yearly Storage (Detailed) | Yearly Storage (Aggregated) |
|---|---|---|---|---|
| Low | 10,000 | ~5 MB | ~1.8 GB | ~50 MB |
| Medium | 100,000 | ~50 MB | ~18 GB | ~500 MB |
| High | 1,000,000 | ~500 MB | ~180 GB | ~5 GB |
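The figures in the table work out to roughly 500 bytes per detailed request record (~5 MB for 10,000 requests). Taking that derived value as an assumption, the sketch below estimates steady-state detailed storage for a given retention window. The function name detailed_storage_mb is hypothetical, and real row sizes depend on the database and on which capture options are enabled.

```python
# Assumption derived from the table above: ~5 MB per 10,000 requests.
BYTES_PER_REQUEST = 500


def detailed_storage_mb(requests_per_day, detailed_days=90):
    """Estimate detailed-log storage in MB once the retention window is full."""
    return requests_per_day * BYTES_PER_REQUEST * detailed_days / 1_000_000


for requests_per_day in (10_000, 100_000, 1_000_000):
    print(requests_per_day, detailed_storage_mb(requests_per_day), "MB")
```

Note that the "Yearly Storage (Detailed)" column corresponds to keeping detailed logs for a full 365 days; with the default detailed_days of 90, the detailed table would level off at about a quarter of that size under this estimate.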
PostgreSQL Recommended
For high-volume deployments (>100,000 requests/day), PostgreSQL is recommended for its superior performance with bulk inserts and analytical queries.