# SLO workload

An SLO test checks how an application built on ydb-sdk behaves while YDB cluster nodes, tablets, or network links fail
(situations that can occur in distributed databases with hundreds of nodes).

### Usage:

It has 3 commands:

- `create`  - creates the table in the database
- `cleanup` - drops the table in the database
- `run`     - runs the workload (reads from and writes to the table at the set RPS)

### Run examples with all arguments:

create:

`$APP create grpcs://ydb.cool.example.com:2135 /some/folder -t tableName
-min-partitions-count 6 -max-partitions-count 1000 -partition-size 1 -c 1000
-write-timeout 10000`

cleanup:

`$APP cleanup grpcs://ydb.cool.example.com:2135 /some/folder -t tableName`

run:

`$APP run grpcs://ydb.cool.example.com:2135 /some/folder -t tableName
-prom-pgw http://prometheus-pushgateway:9091 -report-period 250
-read-rps 1000 -read-timeout 10000
-write-rps 100 -write-timeout 10000
-time 600 -shutdown-time 30`
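
The three commands are typically used in sequence: create the table, run the workload, then drop the table. The sketch below only assembles the command lines from the flags shown above; the ordering, the helper name `slo_cycle`, and the `./app` binary path are assumptions for illustration, and nothing is executed.

```python
import shlex


def slo_cycle(app, endpoint, db, table, read_rps, write_rps, time_s, shutdown_s):
    """Return the create, run and cleanup command lines as argv lists.

    Only flags documented in this README are used; the order
    create -> run -> cleanup is an assumption about typical usage.
    """
    common = [endpoint, db, "-t", table]
    return [
        [app, "create", *common],
        [
            app, "run", *common,
            "-read-rps", str(read_rps),
            "-write-rps", str(write_rps),
            "-time", str(time_s),
            "-shutdown-time", str(shutdown_s),
        ],
        [app, "cleanup", *common],
    ]


# Print the commands in a form that can be pasted into a shell.
for argv in slo_cycle(
    "./app",  # hypothetical path to the workload binary
    "grpcs://ydb.cool.example.com:2135",
    "/some/folder",
    "tableName",
    read_rps=1000,
    write_rps=100,
    time_s=600,
    shutdown_s=30,
):
    print(shlex.join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` if the steps should be executed from a script.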

## Arguments for commands:

### create
`$APP create <endpoint> <db> [options]`

```
Arguments:
  endpoint                        YDB endpoint to connect to
  db                              YDB database to connect to

Options:
  -t -table-name         <string> table name to create

  -min-partitions-count  <int>    minimum amount of partitions in table
  -max-partitions-count  <int>    maximum amount of partitions in table
  -partition-size        <int>    partition size in mb

  -c -initial-data-count <int>    amount of initially created rows

  -write-timeout         <int>    write timeout milliseconds
```

### cleanup
`$APP cleanup <endpoint> <db> [options]`

```
Arguments:
  endpoint                        YDB endpoint to connect to
  db                              YDB database to connect to

Options:
  -t -table-name         <string> table name to drop

  -write-timeout         <int>    write timeout milliseconds
```

### run
`$APP run <endpoint> <db> [options]`

```
Arguments:
  endpoint                        YDB endpoint to connect to
  db                              YDB database to connect to

Options:
  -t -table-name         <string> table name to use

  -initial-data-count    <int>    amount of initially created rows

  -prom-pgw              <string> prometheus push gateway
  -report-period         <int>    prometheus push period in milliseconds

  -read-rps              <int>    read RPS
  -read-timeout          <int>    read timeout milliseconds

  -write-rps             <int>    write RPS
  -write-timeout         <int>    write timeout milliseconds

  -time                  <int>    run time in seconds
  -shutdown-time         <int>    graceful shutdown time in seconds
```

## Authentication

The workload uses anonymous credentials.

## What's inside
When the `run` command is executed, the program creates three jobs: `readJob`, `writeJob`, and `metricsJob`.

- `readJob`    reads rows from the table one at a time, picking random identifiers from those generated by `writeJob`
- `writeJob`   generates and inserts rows
- `metricsJob` periodically sends metrics to Prometheus
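
The interplay of the three jobs can be sketched as three paced loops sharing one stop signal. This is a rough illustration, not the workload's actual implementation: `FakeTable`, `paced`, and `run_workload` are hypothetical names, the table is an in-memory stand-in, and the metrics job only records snapshots instead of pushing to Prometheus.

```python
import random
import threading
import time


class FakeTable:
    """In-memory stand-in for the YDB table (hypothetical, not the SDK)."""

    def __init__(self, initial_rows):
        self._lock = threading.Lock()
        self.next_id = initial_rows  # ids 0..next_id-1 are treated as written
        self.reads = 0
        self.writes = 0

    def write_row(self):
        # writeJob: "insert" a row by allocating the next identifier.
        with self._lock:
            self.next_id += 1
            self.writes += 1

    def read_row(self):
        # readJob: pick a random identifier among those already written.
        with self._lock:
            if self.next_id == 0:
                return None  # nothing written yet
            self.reads += 1
            return random.randrange(self.next_id)


def paced(operation, rps, stop):
    """Call operation about rps times per second until stop is set."""
    interval = 1.0 / rps
    next_at = time.monotonic()
    while not stop.is_set():
        operation()
        next_at += interval
        delay = next_at - time.monotonic()
        if delay > 0:
            stop.wait(delay)  # wakes early when stop is set


def run_workload(table, read_rps, write_rps, report_period, seconds):
    """Run the read, write and metrics loops concurrently for `seconds`."""
    stop = threading.Event()
    reports = []

    def report():
        # metricsJob: a real job would push to Prometheus here.
        reports.append((table.reads, table.writes))

    jobs = [
        threading.Thread(target=paced, args=(table.read_row, read_rps, stop)),
        threading.Thread(target=paced, args=(table.write_row, write_rps, stop)),
        threading.Thread(target=paced, args=(report, 1.0 / report_period, stop)),
    ]
    for job in jobs:
        job.start()
    time.sleep(seconds)
    stop.set()
    for job in jobs:
        job.join()
    return reports
```

In this sketch a loop that falls behind its schedule skips the wait until it catches up, so a slow operation produces a short burst rather than a permanently lower rate.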

The table has these fields:
- `hash Uint64 Digest::NumericHash(id)`
- `id Uint64`
- `payload_double Double`
- `payload_hash Uint64`
- `payload_str UTF8`
- `payload_timestamp Timestamp`

Primary key: `("hash", "id")`
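
A client-side row matching this schema could be generated as in the sketch below. The value ranges, the string length, and the names `Row` and `generate_row` are assumptions for illustration; the `hash` column is omitted because it is assumed to be computed in the query with `Digest::NumericHash(id)` rather than on the client.

```python
import random
import string
from dataclasses import dataclass
from datetime import datetime, timezone

UINT64_MAX = 2**64 - 1


@dataclass
class Row:
    """Client-side view of one table row (without the derived `hash` column)."""

    id: int
    payload_double: float
    payload_hash: int
    payload_str: str
    payload_timestamp: datetime


def generate_row(row_id, rng=random):
    """Build a row with random payload values for the given identifier."""
    length = rng.randint(20, 40)  # assumed length range, not from the source
    text = "".join(rng.choices(string.ascii_letters, k=length))
    return Row(
        id=row_id,
        payload_double=rng.random(),
        payload_hash=rng.randint(0, UINT64_MAX),
        payload_str=text,
        payload_timestamp=datetime.now(timezone.utc),
    )
```

Passing a seeded `random.Random` instance as `rng` makes the generated payloads reproducible between runs.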

## Collected metrics
- `oks`           - number of OK requests
- `not_oks`       - number of not OK requests
- `inflight`      - number of requests in flight
- `latency`       - summary of latencies in ms
- `attempts`      - summary of the number of attempts per request
- `error`         - number of errors
- `query_latency` - summary of query latencies in ms
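
How a single request might update these values is sketched below with a plain in-memory recorder. It is an assumption-based illustration, not the workload's code: `RequestMetrics` is a hypothetical name, summaries are kept as raw lists, and `error` and `query_latency` are left out because the text does not say how they relate to a request.

```python
import time
from collections import Counter


class RequestMetrics:
    """In-memory recorder for per-request metrics (hypothetical sketch)."""

    def __init__(self):
        self.counters = Counter()  # holds "oks" and "not_oks"
        self.inflight = 0
        self.latency_ms = []  # raw observations behind the latency summary
        self.attempts = []    # raw observations behind the attempts summary

    def start(self):
        """Mark a request as started and return its start time."""
        self.inflight += 1
        return time.monotonic()

    def finish(self, started_at, attempts, ok=True):
        """Record the outcome, latency and attempt count of a request."""
        self.inflight -= 1
        self.latency_ms.append((time.monotonic() - started_at) * 1000.0)
        self.attempts.append(attempts)
        self.counters["oks" if ok else "not_oks"] += 1
```

A job would call `start()` before issuing a request and `finish()` once the request, including any retries, has completed.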

> Reset the metrics before the jobs start and after they finish so that they read `0` in Prometheus and Grafana.

In PHP, this looks like:
```php
$pushGateway->delete('workload-php', [
    'sdk' => 'php',
    'sdkVersion' => Ydb::VERSION
]);
```
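
Where no Pushgateway client is available, the same reset can be approximated with a plain HTTP `DELETE` on the metric group URL, which the Pushgateway exposes as `/metrics/job/<job>/<label>/<value>/...`. The Python sketch below uses only the standard library; the function names, the job name, and the version string in the usage note are placeholders, and label values containing `/` are not handled.

```python
import urllib.parse
import urllib.request


def pushgateway_group_url(gateway, job, grouping):
    """Build the URL of one metric group: /metrics/job/<job>/<label>/<value>..."""
    parts = ["metrics", "job", job]
    for label, value in grouping.items():
        parts += [label, value]
    # Percent-encode each segment; values containing "/" need the
    # Pushgateway's base64 form, which this sketch does not implement.
    path = "/".join(urllib.parse.quote(part, safe="") for part in parts)
    return gateway.rstrip("/") + "/" + path


def reset_metrics(gateway, job, grouping, timeout=10.0):
    """Delete all metrics of the given group and return the HTTP status."""
    request = urllib.request.Request(
        pushgateway_group_url(gateway, job, grouping), method="DELETE"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

For example, `reset_metrics("http://prometheus-pushgateway:9091", "workload-python", {"sdk": "python", "sdkVersion": "0.0.0"})` would be called once before the jobs start and once after they finish.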

## Look at metrics in Grafana
You can get the dashboard used in this test [here](https://github.com/ydb-platform/slo-tests/blob/main/k8s/helms/grafana.yaml#L69); you will need to import the JSON into Grafana.