# SLO workload

SLO is a type of test in which an app based on ydb-sdk is tested against failing YDB cluster nodes, tablets, and network (situations that are possible for distributed DBs with hundreds of nodes).

### Usage:

It has 3 commands:

- `create` - creates the table in the database
- `cleanup` - drops the table in the database
- `run` - runs the workload (reads from and writes to the table at a set RPS)

### Run examples with all arguments:

create:
`$APP create grpcs://ydb.cool.example.com:2135 /some/folder -t tableName -min-partitions-count 6 -max-partitions-count 1000 -partition-size 1 -c 1000 -write-timeout 10000`

cleanup:
`$APP cleanup grpcs://ydb.cool.example.com:2135 /some/folder -t tableName`

run:
`$APP run grpcs://ydb.cool.example.com:2135 /some/folder -t tableName -prom-pgw http://prometheus-pushgateway:9091 -report-period 250 -read-rps 1000 -read-timeout 10000 -write-rps 100 -write-timeout 10000 -time 600 -shutdown-time 30`

## Arguments for commands:

### create
`$APP create <endpoint> <db> [options]`

```
Arguments:
  endpoint                  YDB endpoint to connect to
  db                        YDB database to connect to

Options:
  -t -table-name            table name to create
  -min-partitions-count     minimum amount of partitions in table
  -max-partitions-count     maximum amount of partitions in table
  -partition-size           partition size in mb
  -c -initial-data-count    amount of initially created rows
  -write-timeout            write timeout milliseconds
```

### cleanup
`$APP cleanup <endpoint> <db> [options]`

```
Arguments:
  endpoint                  YDB endpoint to connect to
  db                        YDB database to connect to

Options:
  -t -table-name            table name to drop
  -write-timeout            write timeout milliseconds
```

### run
`$APP run <endpoint> <db> [options]`

```
Arguments:
  endpoint                  YDB endpoint to connect to
  db                        YDB database to connect to

Options:
  -t -table-name            table name to use
  -initial-data-count       amount of initially created rows
  -prom-pgw                 prometheus push gateway
  -report-period            prometheus push period in milliseconds
  -read-rps                 read RPS
  -read-timeout             read timeout milliseconds
  -write-rps                write RPS
  -write-timeout            write timeout milliseconds
  -time                     run time in seconds
  -shutdown-time            graceful shutdown time in seconds
```

## Authentication

The workload uses anonymous credentials.

## What's inside

When running the `run` command, the program creates three jobs: `readJob`, `writeJob`, `metricsJob`.

- `readJob` reads rows from the table one by one, using random identifiers generated by `writeJob`
- `writeJob` generates and inserts rows
- `metricsJob` periodically sends metrics to Prometheus

The table has these fields:

- `hash Uint64 Digest::NumericHash(id)`
- `id Uint64`
- `payload_double Double`
- `payload_hash Uint64`
- `payload_str UTF8`
- `payload_timestamp Timestamp`

Primary key: `("hash", "id")`

## Collected metrics

- `oks` - number of OK requests
- `not_oks` - number of not OK requests
- `inflight` - number of requests in flight
- `latency` - summary of latencies in ms
- `attempts` - summary of the number of attempts per request
- `error` - number of errors
- `query_latency` - summary of query latencies in ms

> You must reset metrics to keep them at `0` in Prometheus and Grafana before the jobs start and after they finish

In `php` it looks like this:

```php
// Delete the metrics group pushed for this job and grouping key.
$pushGateway->delete('workload-php', [
    'sdk' => 'php',
    'sdkVersion' => Ydb::VERSION
]);
```

## Look at metrics in grafana

You can get the dashboard used in this test [here](https://github.com/ydb-platform/slo-tests/blob/main/k8s/helms/grafana.yaml#L69); you will need to import the JSON into Grafana.
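
To produce data for the dashboard, the three commands described under Usage can be chained into one script. The following is a minimal sketch, not part of the workload itself: the function name `slo_cycle` and the variables `ENDPOINT`, `DB`, `TABLE` and `PROM_PGW` are hypothetical, the default values are the placeholder values from the run examples above, and only flags listed in this README are used. `APP` defaults to `echo`, so without further setup the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of one full SLO cycle: create the table, run the workload, drop the table.
# All names and default values here are illustrative assumptions; replace them
# with your own. APP must be a single executable path (it is called quoted).
APP="${APP:-echo}"   # echo = dry run, prints the commands instead of running them
ENDPOINT="${ENDPOINT:-grpcs://ydb.cool.example.com:2135}"
DB="${DB:-/some/folder}"
TABLE="${TABLE:-tableName}"
PROM_PGW="${PROM_PGW:-http://prometheus-pushgateway:9091}"

slo_cycle() {
  # Create the table and fill it with initial rows; stop if this fails.
  "$APP" create "$ENDPOINT" "$DB" -t "$TABLE" -c 1000 || return 1

  # Run the workload and remember its exit status.
  "$APP" run "$ENDPOINT" "$DB" -t "$TABLE" -prom-pgw "$PROM_PGW" \
    -read-rps 1000 -write-rps 100 -time 600 -shutdown-time 30
  status=$?

  # Drop the table even if the run failed, then report the run's status.
  "$APP" cleanup "$ENDPOINT" "$DB" -t "$TABLE"
  return "$status"
}

slo_cycle
```

Running cleanup unconditionally after `run` keeps a failed run from leaving the table behind, while the function still returns the status of the `run` step so that a CI job can fail on it.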