* bug(server): global command stalls on server load with pipeline mode
Fixes #1797
The bug: a global command cannot be scheduled into the txq while a high load of pipelined commands is running; only after the load finishes does the global transaction get scheduled into the txq. The reason: when we start a global transaction we set the shard lock, and all the other transactions keep entering the txq. They compete with the global tx on the order in which they are inserted into the queue, in order to preserve transaction atomicity. Because the global tx needs to be inserted into all shard queues, its chance of scheduling in a consistent order with all the other transactions is low.
The solution: lock the global transaction inside ScheduleInShard. Locking closer to the scheduling decision reduces the number of transactions already in the queue, so the competition to order correctly now has a much higher chance of succeeding.
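A minimal sketch of the idea, not Dragonfly's actual code (the `Shard` struct, `ScheduleGlobalInShard` and the lock flag are illustrative): taking the shard lock in the same critical section that fixes the txq position shrinks the window in which pipelined transactions can pile in ahead of the global one.

```cpp
#include <cstdint>
#include <deque>
#include <mutex>

struct Shard {
  std::mutex mu;              // guards this shard's transaction queue
  std::deque<uint64_t> txq;   // transaction ids in scheduling order
  bool locked = false;        // hypothetical shard-wide lock flag
};

void ScheduleGlobalInShard(Shard& shard, uint64_t txid) {
  std::lock_guard<std::mutex> lk(shard.mu);
  shard.locked = true;        // lock acquired right next to the decision...
  shard.txq.push_back(txid);  // ...so the queue position is fixed atomically
}
```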
Signed-off-by: adi_holden <adi@dragonflydb.io>
* opt(server): Execute Lua on the target shard, if there is only one
This will save hops by short-circuiting execution of commands.
* Reuse unique shard id from tx
Only switch threads for LOCK_AHEAD
* Signedness
* feat(server): Add `--lock_on_hashtags` mode.
This new mode effectively locks hashtags (i.e. strings within {curly
braces}) instead of the full keys being used.
This can allow scripts to access undeclared keys if they all use a
common hashtag, like for the case of BullMQ.
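A minimal sketch of hashtag extraction under this mode, following the Redis cluster hashtag convention; `LockTag` is a hypothetical helper name, not necessarily the actual function:

```cpp
#include <string_view>

// The lock key becomes the substring between the first '{' and the next
// '}', if that substring is non-empty; otherwise the full key is locked.
std::string_view LockTag(std::string_view key) {
  auto open = key.find('{');
  if (open == std::string_view::npos)
    return key;                       // no '{': lock the full key
  auto close = key.find('}', open + 1);
  if (close == std::string_view::npos || close == open + 1)
    return key;                       // no '}' or empty tag: full key
  return key.substr(open + 1, close - open - 1);
}

// Example: "bull:{queue1}:jobs" and "bull:{queue1}:meta" both lock
// "queue1", so a script may touch either key under the same lock.
```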
To make sure this mode is tested, I added a way to specify flags via env
variables, and modified `ci.yml` to run all tests using this mode as well.
While at it, I also added `--cluster_mode=emulated` mode to CI.
* bug(server): multi atomicity fix
The bug: when a multi transaction ran OOO (out of order), we removed it from the
transaction queue, causing non-atomic execution.
The fix: when we run a multi transaction, unless it is at the head of the txq we
remove it from the txq inside UnlockMulti instead.
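An illustrative sketch of that rule, under assumed names (`TxQueue`, `OnMultiExecuted`): pop immediately only at the head, otherwise defer removal to UnlockMulti so the ordering the queue encodes is preserved.

```cpp
#include <algorithm>
#include <cstdint>
#include <deque>

struct TxQueue {
  std::deque<uint64_t> q;  // transaction ids in scheduling order

  void OnMultiExecuted(uint64_t txid) {
    if (!q.empty() && q.front() == txid)
      q.pop_front();  // head of the queue: safe to remove right away
    // else: leave it queued; UnlockMulti removes it once the multi is done
  }

  void UnlockMulti(uint64_t txid) {
    auto it = std::find(q.begin(), q.end(), txid);
    if (it != q.end())
      q.erase(it);
  }
};
```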
Signed-off-by: adi_holden <adi@dragonflydb.io>
* feat: very minimal code that adds b-tree to the codebase
The motivation for having our own b-tree to replace zskiplist is shown in #1567.
Based on the results, we should greatly reduce the memory overhead per item when using a modern b-tree.
Currently the functionality supports only the Insert method, to reduce the review complexity.
The design decisions behind the data structure are described in src/core/detail/btree_internal.h
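A back-of-the-envelope sketch of the memory argument; both layouts below are illustrative simplifications, not the actual structs:

```cpp
#include <cstdint>

// A zskiplist allocates one node per item, each carrying a backward
// pointer plus ~1.33 levels on average (forward pointer + span per level),
// while a b-tree leaf amortizes one header across dozens of items.
struct SkiplistNode {          // one allocation per item
  void* ele;                   // the member
  double score;
  SkiplistNode* backward;
  struct Level {
    SkiplistNode* forward;
    unsigned long span;
  } level[1];                  // >= 1 levels per node
};

template <int N = 64>
struct BTreeLeaf {             // one allocation per up to N items
  uint16_t num_items;
  uint64_t items[N];           // ~8 bytes of node storage per item;
};                             // the header cost is shared by all N items
```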
* chore: rewrote template logic for internal classes
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* feat(server): Short-circuit ExecuteAsync().
* Do not leak (hopefully :)
* Add documentation about coordinator_index_
* Use ServerState::tlocal()->thread_index()
In addition, add more states to tx local_mask to allow easier debugging.
Finally, add check-fails to verify tx invariants in order to prevent
reaching erroneous states that are nearly impossible to analyze.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* fix(server): Initialize ServerFamily with all listeners.
- Add a test for CLIENT LIST which is the visible result of this.
* Use std::move
* feat: Implement replica takeover
* Basic test
* Address CR comments
* Write a better test. Sadly it fails
* chore: Expose AwaitDispatches for reuse in takeover
* Ensure that no commands can execute during or after a takeover
* CR progress
* Actually disable the expiration
* Improve test coverage
* Fix the dispatch waiting code
* Improve test coverage and fix a shutdown snapshot bug
* don't replicate a replica
fix: remove a bad check-fail in the transaction code.
Fixes #1421.
The failure reproduces for dragonfly running with a single thread, where all the
arguments are grouped within the same ShardData.
Also, we improve verbosity levels inside reply_builder.cc.
For that we extend SinkReplyBuilder to support protocol error reporting,
and we remove the ad-hoc code for this from dragonfly_connection.
This is required to track errors easily with `--vmodule=reply_builder=1`.
Finally, a pytest is added to cover the issue.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
NotifyPending was being called when a blocked transaction expired, which
meant other blocked transactions could be woken up even though another
transaction could still be in progress. NotifyPending has no effect on the
blocked transaction itself.
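A sketch of the corrected expiry path, with hypothetical stand-ins for the blocking_controller machinery:

```cpp
#include <cstdint>
#include <set>

struct BlockingController {
  std::set<uint64_t> watched;  // txids currently blocked on a key

  void RemoveWatched(uint64_t txid) { watched.erase(txid); }
  void NotifyPending() { /* wakes the next blocked txid; elided */ }
};

// A transaction whose blocking deadline passes only unregisters its own
// watch entries; it must not call NotifyPending, since that could wake
// other blocked transactions while a different transaction is in progress.
void OnBlockingTimeout(BlockingController& bc, uint64_t txid) {
  bc.RemoveWatched(txid);  // drop only this transaction's watches
  // Deliberately no bc.NotifyPending() here: waking others is left to
  // actual writers, which know when a notification is safe.
}
```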
Signed-off-by: Andy Dunstall <andydunstall@hotmail.co.uk>
* feat: run tx-schedule inline if the dest shard is on the same thread (#908)
The optimization is applied within ScheduleSingleHop call.
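A minimal sketch of the idea with hypothetical names: run the scheduling callback inline when the destination shard lives on the current thread, and only pay a cross-thread hop otherwise.

```cpp
#include <functional>

using Task = std::function<void()>;

void ScheduleSingleHop(unsigned dest_thread, unsigned my_thread,
                       const Task& schedule_cb,
                       const std::function<void(unsigned, Task)>& post) {
  if (dest_thread == my_thread)
    schedule_cb();                   // same thread: run inline, zero hops
  else
    post(dest_thread, schedule_cb);  // otherwise: dispatch to the shard thread
}
```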
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* fix(server): Don't inline schedule when in LOADING
* Fix another pre-emption bug with inline scheduling
* Better locking around journal callbacks
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Co-authored-by: Roman Gershman <roman@dragonflydb.io>
Along the way, this performs small cleanups in the command handling code.
XGROUP HELP is special because it falls outside Dragonfly's command taxonomy design,
where a command name determines where its key is located. All other XGROUP subcommands
expect XGROUP <subcmd> <key>, and this one obviously does not need any key.
I fix it by working around the issue and introducing a dedicated dummy command for this combination.
Fixes #854.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
chore: remove cmd name from the list of arguments
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
Co-authored-by: Roman Gershman <roman@dragonflydb.io>
The deadlock happened during the BRPOP flow, where we access
shard_data.local_data from both the coordinator and shard threads.
Originally, shard_data.local_data was not designed for concurrent access,
and I used the ARMED bit to deduplicate callback runs for each shard.
The problem is that within the BRPOP flow,
ExecuteAsync would apply "|= ARMED" and in parallel NotifySuspended would apply
"|= AWAKED" in the shard thread, and the two R/M/W operations would corrupt each other.
Therefore, I have now completely separated the shard-local local_data mask from the is_armed boolean.
Moreover, since we now use atomics for is_armed, I increased the PerShardData size to 64 bytes
to avoid false cache-line sharing between PerShardData objects.
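A sketch of the resulting layout, following the commit text (field names illustrative):

```cpp
#include <atomic>
#include <cstdint>

// local_mask stays a plain integer touched only by the shard thread, while
// the armed flag, written by the coordinator and consumed by the shard
// thread, becomes an atomic. alignas(64) pads each PerShardData to a full
// cache line so neighboring array entries do not false-share.
struct alignas(64) PerShardData {
  uint32_t local_mask = 0;            // shard-thread only (AWAKED etc.)
  std::atomic<bool> is_armed{false};  // coordinator arms, shard disarms
};

// Shard thread: consume the armed flag exactly once per hop.
inline bool DisarmIfArmed(PerShardData& sd) {
  return sd.is_armed.exchange(false, std::memory_order_acquire);
}
```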
Fixes #945
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
1. Added a test that was breaking earlier.
2. Made sure that multiple awakened BRPOP transactions would not
snatch items from one another.
3. Fixed watched-queues clean-up logic inside blocking_controller that caused deadlocks.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
1. pytest extensions and fixes - allow running the tests
against an existing local server by providing its port (--existing <port>).
2. Extend "DEBUG WATCHED" command to provide more information about watched state.
3. Improve debug/vlog printings around the code.
This noisy PR is a preparation before BRPOP fix that will follow later.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>