* opt(server): Execute Lua on the target shard if there is only one
This will save hops by short-circuiting execution of commands.
* Reuse unique shard id from tx
Only switch threads for LOCK_AHEAD
* Signedness
* feat(server): Add `--lock_on_hashtags` mode.
This new mode effectively locks hashtags (i.e. strings within {curly
braces}) instead of the full keys being used.
This can allow scripts to access undeclared keys if they all use a
common hashtag, like for the case of BullMQ.
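The idea of locking on the hashtag rather than the full key can be sketched as follows. This is a minimal illustration, not Dragonfly's implementation: the function name `lock_key` is hypothetical, and the tag rule (text between the first `{` and the next `}`, ignored when empty) is borrowed from the Redis Cluster hashtag convention as an assumption.

```python
def lock_key(key: str, lock_on_hashtags: bool) -> str:
    """Return the string that would be locked for `key` (hypothetical helper)."""
    if not lock_on_hashtags:
        return key
    start = key.find("{")
    if start == -1:
        return key  # no hashtag: fall back to the full key
    end = key.find("}", start + 1)
    # Assumption: an unterminated or empty tag such as "{}" is ignored.
    if end == -1 or end == start + 1:
        return key
    return key[start + 1:end]


# Keys sharing a hashtag collapse to a single lock, so a script that
# declared one of them is already covered when it touches the others.
keys = ["bull:{queue}:wait", "bull:{queue}:active", "plain-key"]
locks = {lock_key(k, lock_on_hashtags=True) for k in keys}
```

Under these assumptions the two `bull:` keys map to the same lock, which is why undeclared keys with a common hashtag become safe to access.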
To make sure this mode is tested, I added a way to specify flags via env
variables, and modified `ci.yml` to run all tests using this mode as well.
While at it, I also added `--cluster_mode=emulated` mode to CI.
* bug(server): multi atomicity fix
The bug: when a multi transaction ran OOO, we removed it from the transaction
queue, causing non-atomic execution.
The fix: when we run a multi transaction, unless it is at the head of the txq,
we remove it from the txq inside unlock multi.
Signed-off-by: adi_holden <adi@dragonflydb.io>
* feat: very minimal code that adds b-tree to the codebase
The motivation to have our own b-tree to replace zskiplist is shown by #1567
Based on the results we should greatly reduce the memory overhead per item when using a modern b-tree.
Currently the functionality supports Insert method only to reduce the review complexity.
The design decisions behind the data structure are described in src/core/detail/btree_internal.h
* chore: rewrote template logic for internal classes
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* feat(server): Short-circuit ExecuteAsync().
* Do not leak (hopefully :)
* Add documentation about coordinator_index_
* Use ServerState::tlocal()->thread_index()
In addition, add more states to tx local_mask to allow easier debugging.
Finally, add check-fail to verify tx invariants in order to prevent
reaching erroneous states that are nearly impossible to analyze.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* fix(server): Initialize ServerFamily with all listeners.
- Add a test for CLIENT LIST which is the visible result of this.
* use std move
* feat: Implement replicas take over
* Basic test
* Address CR comments
* Write a better test. Sadly it fails
* chore: Expose AwaitDispatches for reuse in takeover
* Ensure that no commands can execute during or after a takeover
* CR progress
* Actually disable the expiration
* Improve tests coverage
* Fix the dispatch waiting code
* Improve testing coverage and fix a shutdown snapshot bug
* don't replicate a replica
fix: remove bad check-fail in the transaction code.
Fixes #1421.
The failure reproduces for dragonfly running with a single thread where all the
arguments are grouped within the same ShardData.
Also, we improve verbosity levels inside reply_builder.cc.
For that we extend SinkReplyBuilder to support protocol errors reporting
and we remove ad-hoc code for this from dragonfly_connection.
Required to track errors easily with `--vmodule=reply_builder=1`
Finally, a pytest is added to cover the issue.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
NotifyPending was being called when a blocked transaction expires, which
meant other blocked transactions could be woken up even though another
transaction could be in progress. NotifyPending has no effect on the
blocked transaction.
Signed-off-by: Andy Dunstall <andydunstall@hotmail.co.uk>
* feat: run tx-schedule inline if the dest shard is on the same thread (#908)
The optimization is applied within ScheduleSingleHop call.
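A rough sketch of the inline-scheduling decision, under stated assumptions: the `Shard` class, the `schedule` function and the string return values are hypothetical stand-ins, not the names used in the codebase. It only illustrates the branch "same thread: run now, otherwise: hand off to the shard's queue".

```python
from collections import deque


class Shard:
    """Hypothetical stand-in for a shard pinned to one thread."""

    def __init__(self, thread_index: int):
        self.thread_index = thread_index
        self.queue = deque()  # callbacks waiting to run on the shard's thread


def schedule(shard: Shard, current_thread: int, callback) -> str:
    # Same thread: no hop is needed, so run the callback immediately.
    if shard.thread_index == current_thread:
        callback()
        return "inline"
    # Different thread: enqueue for the owning thread to pick up later.
    shard.queue.append(callback)
    return "queued"
```

The saving comes from skipping the cross-thread dispatch and the wait for its completion when coordinator and destination shard share a thread.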
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* fix(server): Don't inline schedule when in LOADING
* Fix another pre-emption bug with inline scheduling
* Better locking around journal callbacks
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Co-authored-by: Roman Gershman <roman@dragonflydb.io>
Along the way, performs small cleanups in command handling code.
XGROUP HELP is special because it falls outside Dragonfly's command taxonomy design,
where a command name determines where its key is located. All other XGROUP subcommands
expect to see XGROUP <subcmd> <key> and this one obviously does not need any key.
I work around the issue by introducing a dedicated dummy command for this combination.
Fixes #854.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
chore: remove cmd name from the list of arguments
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
Co-authored-by: Roman Gershman <roman@dragonflydb.io>
The deadlock happened during the brpop flow where we access
shard_data.local_data from both coordinator and shard threads.
Originally, shard_data.local_data was not designed for concurrent access,
and I used the ARMED bit to deduplicate callback runs for each shard.
The problem is that within the BRPOP flow,
ExecuteAsync would apply "|= ARMED" and in parallel NotifySuspended would apply
"|= AWAKED" in the shard thread, and the two R/M/W operations would corrupt each other.
Therefore, I have now completely separated the shard-local local_data mask from the is_armed boolean.
Moreover, since we now use atomics for is_armed, I increased the PerShardData size to 64 bytes
to avoid false cache sharing between PerShardData objects.
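The lost update described above can be reproduced deterministically by spelling out one bad interleaving by hand. This is a single-threaded simulation with hypothetical bit values, not the actual transaction code; `separated_state` only mirrors the shape of the fix, with a plain boolean standing in for the atomic.

```python
ARMED = 0x1   # hypothetical bit values, for illustration only
AWAKED = 0x2


def interleaved_rmw(mask: int) -> int:
    """Both 'threads' read the old mask before either one writes it back."""
    coordinator_view = mask            # coordinator thread reads
    shard_view = mask                  # shard thread reads the same old value
    mask = coordinator_view | ARMED    # coordinator writes ARMED
    mask = shard_view | AWAKED         # shard write overwrites it: ARMED is lost
    return mask


def separated_state(mask: int):
    """Shape of the fix: armed state lives outside the shard-local mask."""
    is_armed = True    # stands in for an atomic flag set by the coordinator
    mask |= AWAKED     # the mask is now modified by the shard thread only
    return mask, is_armed
```

With the shared mask, the result of the interleaving carries AWAKED but not ARMED; with the state split, neither update can clobber the other.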
Fixes #945
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
1. Added a test that was breaking earlier.
2. Made sure that multiple awakened brpop transactions would not
snatch items from one another.
3. Fixed watched-queues clean-up logic inside blocking_controller that caused deadlocks.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
1. pytest extensions and fixes - allows running them
with the existing local server by providing its port (--existing <port>).
2. Extend "DEBUG WATCHED" command to provide more information about watched state.
3. Improve debug/vlog output around the code.
This noisy PR is preparation for the BRPOP fix that will follow later.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
feat(server): Basic multi transaction modes
This commit adds the notion of multi transaction modes that allow controlling the execution and
locking behaviour of multi transactions.
In general, there are four modes:
- GLOBAL: all commands run within a global transaction. There is no need for recording locks. Lua scripts can theoretically run with undeclared keys.
- LOCK_AHEAD: the transaction locks all keys ahead, like a regular transaction, and schedules itself.
- LOCK_INCREMENTAL: the transaction determines what shards it has keys in and schedules itself on those shards, but locks only when accessing a new key. This allows other transactions to run OOO alongside a big multi-transaction that accesses a contended key only at its very end.
- NON_ATOMIC: all commands run separately and no atomicity is provided, as in a pipeline.
This commit only adds support for the first 3 modes to EXEC commands.
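To make the difference between the modes concrete, here is a small sketch of when key locks would be taken in each one. The enum mirrors the mode names above; the two helper functions are hypothetical and only encode what the list states, not how the transaction code is structured.

```python
from enum import Enum, auto


class MultiMode(Enum):
    GLOBAL = auto()
    LOCK_AHEAD = auto()
    LOCK_INCREMENTAL = auto()
    NON_ATOMIC = auto()


def keys_locked_at_schedule(mode: MultiMode, declared_keys) -> set:
    """Keys locked up front, when the multi transaction schedules itself."""
    if mode is MultiMode.LOCK_AHEAD:
        return set(declared_keys)
    # GLOBAL records no per-key locks; LOCK_INCREMENTAL defers them;
    # NON_ATOMIC provides no atomicity at the multi level.
    return set()


def keys_locked_on_access(mode: MultiMode, already_locked: set, key: str) -> set:
    """Keys newly locked when a command inside the transaction touches `key`."""
    if mode is MultiMode.LOCK_INCREMENTAL and key not in already_locked:
        return {key}
    return set()
```

For a transaction declaring keys `a` and `b`, LOCK_AHEAD holds both from the start, while LOCK_INCREMENTAL holds nothing until the first command touches `a`, which is what leaves room for other transactions to run OOO in the meantime.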
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
This change makes it possible to track which transactions run out of order.
OOO txs are more performant and exhibit substantially lower latency.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* bug(transaction): local result needs to be reset on InitByArgs. Fixes #752
Signed-off-by: adi_holden <adi@dragonflydb.io>
* add unit test
Signed-off-by: adi_holden <adi@dragonflydb.io>
---------
Signed-off-by: adi_holden <adi@dragonflydb.io>