* chore(transaction): Simplify PollExecution()
Remove seqlock_ from transaction. This change is possible because:
- We don't re-use shard_data[0] for multi transactions anymore
- We disarm atomically and poll callbacks are stateless
This makes it safe to call PollExecution() unconditionally; it determines on its own whether the caller needs to run or has already expired.
---------
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
Previously, transactions would run out of order only when all shards determined that the key locks were free. With this change, each shard can decide independently to run out of order if its locks are free. COORD_OOO is now deprecated, and the per-shard OUT_OF_ORDER flag is used to indicate it instead.
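To illustrate the per-shard decision described above, here is a minimal sketch. It assumes a simplified lock table that maps each key to a holder count. `ShardLockTable`, `ScheduleInShard` and `kOutOfOrder` are hypothetical names for this example only and are not the actual transaction code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the per-shard OUT_OF_ORDER flag.
constexpr uint16_t kOutOfOrder = 1 << 0;

// Hypothetical, simplified per-shard lock table: key -> number of holders.
struct ShardLockTable {
  std::unordered_map<std::string, int> holders;

  bool IsFree(const std::string& key) const {
    auto it = holders.find(key);
    return it == holders.end() || it->second == 0;
  }
};

// Decides for ONE shard, using only that shard's state, whether the
// transaction may skip the queue. Returns the local mask for this shard.
uint16_t ScheduleInShard(const ShardLockTable& locks, const std::vector<std::string>& keys) {
  assert(!keys.empty());  // a shard is only consulted for keys it owns
  for (const auto& key : keys) {
    if (!locks.IsFree(key)) {
      return 0;  // contended here: this shard keeps queue order
    }
  }
  return kOutOfOrder;  // all locks free here: this shard may run out of order
}
```

Under this sketch, one shard can return `kOutOfOrder` while another returns `0` for the same transaction, which is the difference from the earlier all-shards-agree rule.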
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
Motivation - after we submitted #2429 some smart-ass clients
prevent users from accessing single-node commands like "SELECT".
This PR fixes it by allowing consistent sharding based on hashtags
even with cluster mode disabled.
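A minimal sketch of hashtag-based sharding, assuming the Redis Cluster hashtag convention (the text between the first `{` and the next `}`, when non-empty). `KeyHashTag` and `ShardForKey` are hypothetical names, and `std::hash` merely stands in for whatever hash function the server actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string_view>

// Returns the part of the key that should be hashed: the text between the
// first '{' and the next '}' if that text is non-empty, else the whole key.
std::string_view KeyHashTag(std::string_view key) {
  size_t open = key.find('{');
  if (open == std::string_view::npos) return key;
  size_t close = key.find('}', open + 1);
  if (close == std::string_view::npos || close == open + 1) return key;
  return key.substr(open + 1, close - open - 1);
}

// Hypothetical shard selection: keys with the same hashtag share a shard.
size_t ShardForKey(std::string_view key, size_t num_shards) {
  assert(num_shards > 0);
  return std::hash<std::string_view>{}(KeyHashTag(key)) % num_shards;
}
```

With this rule, `{user1}:name` and `{user1}:age` always land on the same shard, so a client that pins related keys with hashtags gets the same placement it would expect from a cluster.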
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* feat: allow throttling tiered writes
The throttling is controlled by the tiered_storage_throttle_us flag
and can be disabled by passing `--tiered_storage_throttle_us=0`.
This introduces a soft back-pressure during writes.
On my machine `debug POPULATE 10000000 key 1000 RAND` with tiered_storage_throttle_us=0
offloads 12% of all the entries, but with tiered_storage_throttle_us=1 it offloads
almost 100% by prolonging the operation from 0.96s to 1.72s.
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
fix: fix "debug exec" command
It took a mutex lock inside an Await callback, which is prohibited.
In addition, we improved logging across the transaction code.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
chore: simplify transaction multi-locking
Also, add an analysis routine that determines whether a scheduled transaction is contended with other transactions in a
shard thread.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
1. Fix AnalyzeTxQueue to stop crashing for various transaction types.
2. Pass exec command length to slowlog
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
This command shows the current state of transaction queues,
specifically how many armed (ready to run) transactions there are,
how loaded these queues are and how many locks there are in each shard.
In addition, if a tx queue becomes too long, we will output warning logs about
the state of the queue, in order to be able to identify
the bottlenecks post-factum.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
1. How many transactions we processed by type
2. How many transactions we processed by width (number of unique shards).
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* fixes #1936
Eviction Implementation
This patch provides a very simple eviction implementation for the interface mentioned above. In my opinion, the eviction algorithm approximates an LRU policy given that normal buckets always store the most recently accessed data while stash buckets are holding less active data.
The algorithm first selects a small set of segments as eviction targets. Starting from the last slot of the last stash bucket in each of the segments, we walk backward and evict the key-value pair stored in each visited slot. The eviction stops when either a target memory release goal or the max number of evicted key-value pairs is reached. Therefore, we can upper bound the eviction time through the following two parameters that can be set when DF starts. Note that these two parameters can be retrieved and changed by the user through the CONFIG GET and CONFIG SET commands.
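The backward walk with its two stop conditions can be sketched roughly as follows. All types and names here (`Entry`, `Segment`, `EvictFromSegments`, `goal_bytes`, `max_evicted`) are hypothetical simplifications for illustration and do not reflect the real table layout or the real parameter names.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified layout: a segment holds stash buckets,
// a bucket holds slots, and a slot is either empty or holds one entry.
struct Entry { std::string key; size_t bytes; };
using Slot = std::optional<Entry>;
using Bucket = std::vector<Slot>;
struct Segment { std::vector<Bucket> stash_buckets; };

struct EvictionResult { size_t freed_bytes = 0; size_t evicted = 0; };

// Walks each target segment backward, from the last slot of the last stash
// bucket, and stops once either bound (assumed parameters) is reached.
EvictionResult EvictFromSegments(const std::vector<Segment*>& targets,
                                 size_t goal_bytes, size_t max_evicted) {
  EvictionResult res;
  auto done = [&] { return res.freed_bytes >= goal_bytes || res.evicted >= max_evicted; };
  for (Segment* seg : targets) {
    assert(seg != nullptr);
    for (auto bucket = seg->stash_buckets.rbegin(); bucket != seg->stash_buckets.rend(); ++bucket) {
      for (auto slot = bucket->rbegin(); slot != bucket->rend(); ++slot) {
        if (done()) return res;
        if (!slot->has_value()) continue;  // nothing stored in this slot
        res.freed_bytes += (*slot)->bytes;
        ++res.evicted;
        slot->reset();  // evict the key-value pair
      }
    }
  }
  return res;
}
```

Because the loop exits as soon as either bound is hit, the work per call is limited by `max_evicted` evictions plus the empty slots skipped along the way.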
---------
Signed-off-by: Yue Li <61070669+theyueli@users.noreply.github.com>
Up until now we did not have a cached rss metric in the process.
This PR consolidates caching of all values together inside the EngineShard periodic fiber
code. Also, we now expose rss_mem_current, which can be used internally to identify
periods of memory pressure during the process run.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
This will allow some use cases with few busy keys to distribute load
more evenly between threads.
Idea by @dranikpg.
To calculate how many entries are needed in the table I used the
following quick-n-dirty code, to reach <2.5% collision with 100 keys:
```cpp
#include <cstdlib>
#include <iostream>
#include <vector>

using namespace std;

// Throws `balls` balls into `bins` bins uniformly at random.
// Returns true if some bin received two or more balls.
bool Distribute(int balls = 100, int bins = 100) {
  vector<int> counts(bins);
  for (int i = 0; i < balls; ++i) {
    counts[rand() % counts.size()]++;
  }
  for (int count : counts) {
    if (count >= 2) {
      return true;
    }
  }
  return false;
}

int main(int argc, char** argv) {
  int has_2_balls = 0;
  constexpr int kRounds = 1'000'000;
  for (int i = 0; i < kRounds; ++i) {
    has_2_balls += Distribute(100, 100'000);
  }
  cout << has_2_balls << " rounds had 2+ balls in a single bin out of " << kRounds << endl;
}
```
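The simulation estimates a birthday-problem probability, which also has a closed form: the chance that `balls` uniformly placed balls all land in distinct bins is the product of `(1 - i/bins)` for `i` from 0 to `balls - 1`. The sketch below evaluates it directly and can be used to cross-check the simulation for a given table size. `CollisionProbability` is a hypothetical helper and is not part of the change.

```cpp
#include <cassert>
#include <cstdint>

// Exact probability that at least two of `balls` uniformly thrown balls
// share a bin: 1 - prod_{i=0}^{balls-1} (1 - i / bins).
double CollisionProbability(int balls, int64_t bins) {
  assert(balls >= 0 && bins > 0);
  if (balls > bins) return 1.0;  // pigeonhole: a collision is certain
  double no_collision = 1.0;
  for (int i = 0; i < balls; ++i) {
    no_collision *= 1.0 - static_cast<double>(i) / static_cast<double>(bins);
  }
  return 1.0 - no_collision;
}
```

Calling `CollisionProbability(100, bins)` for a few candidate values of `bins` gives the same answer as the simulation without the sampling noise, and it runs in time linear in the number of balls.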
In addition, add more states to tx local_mask to allow easier debugging.
Finally, add check-fail to verify tx invariants in order to prevent
reaching erroneous states that are nearly impossible to analyze.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Make blpop test pass with squashing. The problem was that the txid was not properly reported
back to the testing code.
Also, fixed redundant codepath for defragmentation.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Remove Boost.Fibers mentions and remove fibers_ext mentions.
Done in preparation to switch to helio-native fb2 implementation.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
1. Added a test that was breaking earlier.
2. Made sure that multiple woken brpop transactions would not
snatch items from one another.
3. Fixed watched-queues clean-up logic inside blocking_controller that caused deadlocks.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
1. pytest extensions and fixes - allows running them
with the existing local server by providing its port (--existing <port>).
2. Extend "DEBUG WATCHED" command to provide more information about watched state.
3. Improve debug/vlog output around the code.
This noisy PR is preparation for a BRPOP fix that will follow later.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
fix(server): Fix a bug when an expired transaction stays in watched queue.
Now we remove the transaction from the watched queues in a consistent manner based on the
keys it was assigned to watch.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
1. Support tiered deletion.
2. Add notion of tiered entity in "DEBUG OBJECT" output.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Set mem_defrag_threshold so that defrag will be triggered when memory grows.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>