196 commits

Author SHA1 Message Date
Vladislav
663c1f9e1b
fix: fix dchecks (#1681)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-08-09 22:44:44 +03:00
Vladislav
b9e8a2c0da
chore: Connection fixes (#1663)
* chore: Connection safety checks

Signed-off-by: Vladislav <vlad@dragonflydb.io>
2023-08-09 17:57:41 +03:00
Shahar Mike
734401098c
opt(server): Execute lua on target shard, if it's 1 (#1639)
* opt(server): Execute lua on target shard, if it's 1

This will save hops by short-circuiting execution of commands.

* Reuse unique shard id from tx
Only switch threads for LOCK_AHEAD

* Signedness
2023-08-09 14:18:34 +03:00
Vladislav
a0da723628
fix: remove coordinator_index_ from tx & fix short circuit (#1640)
fix: remove coordinator_index_ from tx

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-08-03 22:43:43 +03:00
Shahar Mike
67a4c4e6cb
feat(server): Add --lock_on_hashtags mode. (#1611)
* feat(server): Add `--lock_on_hashtags` mode.

This new mode effectively locks hashtags (i.e. strings within {curly
braces}) instead of the full keys being used.
This can allow scripts to access undeclared keys if they all use a
common hashtag, like for the case of BullMQ.

To make sure this mode is tested, I added a way to specify flags via env
variables, and modified `ci.yml` to run all tests using this mode as well.
While at it, I also added `--cluster_mode=emulated` mode to CI.
2023-08-03 20:13:36 +03:00
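The hashtag locking that 67a4c4e6cb describes can be illustrated with a minimal sketch. It assumes the Redis Cluster hashtag rule (the text between the first `{` and the next `}`, if that text is non-empty). `LockTag` is a hypothetical helper name, and this is not necessarily how Dragonfly implements `--lock_on_hashtags`.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not taken from the Dragonfly source): returns the name
// that would be locked for `key` when hashtag locking is enabled.
// Assumed rule: use the substring between the first '{' and the first '}'
// that follows it; if there is no such pair, or it is empty, use the full key.
inline std::string_view LockTag(std::string_view key) {
  const std::size_t open = key.find('{');
  if (open == std::string_view::npos) return key;

  const std::size_t close = key.find('}', open + 1);
  if (close == std::string_view::npos || close == open + 1) return key;

  return key.substr(open + 1, close - open - 1);
}
```

Under this rule, illustrative keys such as `bull:{q1}:wait` and `bull:{q1}:active` map to the same lock name, which is why a script that touches undeclared keys sharing one hashtag can still be covered by a single lock.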
Vladislav
da17a39410
fix: remove empty hop for non-expiring transactions (#1605)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-08-03 16:26:58 +03:00
Vladislav
844fe57dec
feat: Remove batch locks from non-atomic squashing (#1613)
feat: Remove batch locks from non-atomic squashing

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-08-02 09:16:47 +03:00
Vladislav
3a4b3c97c8
fix: simplify ScheduleInShard (#1610)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-08-01 08:15:42 +03:00
Vladislav
eda941dca6
fix: add Transaction::Conclude (#1606)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-07-31 15:37:29 +03:00
adiholden
366f50230b
bug(server): multi atomicity fix (#1593)
* bug(server): multi atomicity fix

The bug: when a multi transaction ran OOO, we removed it from the transaction
queue, causing non-atomic execution.
The fix: when we run a multi transaction, unless it is the head of the txq, we
remove it from the txq inside unlock multi.

Signed-off-by: adi_holden <adi@dragonflydb.io>
2023-07-31 14:50:33 +03:00
Roman Gershman
723cc623c2
feat: very minimal code that adds b-tree to the codebase (#1596)
* feat: very minimal code that adds b-tree to the codebase

The motivation to have our own b-tree to replace zskiplist is shown by #1567.
Based on the results we should greatly reduce the memory overhead per item when using a modern b-tree.

Currently the functionality supports Insert method only to reduce the review complexity.
The design decisions behind the data structure are described in src/core/detail/btree_internal.h

* chore: rewrote template logic for internal classes

---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-07-31 13:40:27 +03:00
Shahar Mike
fba0800081
opt(server): Short-circuit ExecuteAsync(). (#1601)
* feat(server): Short-circuit ExecuteAsync().

* Do not leak (hopefully :)

* Add documentation about coordinator_index_

* Use ServerState::tlocal()->thread_index()
2023-07-31 12:12:59 +03:00
Abhradeep Chakraborty
da2ad7eceb
feat(stream): add support for xreadgroup command (#1475)
Signed-off-by: Abhradeep Chakraborty <abhradeep@dragonflydb.io>
2023-07-11 08:11:19 +03:00
Rounak Nandanwar
674f06875c
fix: zunion and zunionstore zero numkeys bug (#1522)
Fixes #1442

Signed-off-by: rounaknandanwar <rounak.nandanwar@gmail.com>
2023-07-10 09:50:59 +03:00
Roy Jacobson
0f69d32b11
takeover: Cancel blocking commands (#1514)
* fix: Cancel blocking commands when performing a takeover

* Add some comments

* Make CancelBlocking a method of ConnectionContext

* add a small todo
2023-07-05 17:09:10 +02:00
Roman Gershman
84d09800c3
chore: refresh helio (#1506)
In addition, add more states to tx local_mask to allow easier debugging.
Finally, add check-fail to verify tx invariants in order to prevent
reaching erroneous states that are nearly impossible to analyze.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-07-04 16:51:53 +03:00
Roy Jacobson
4babed54d3
feat: Support atomic replica takeover (#1314)
* fix(server): Initialize ServerFamily with all listeners.

- Add a test for CLIENT LIST which is the visible result of this.

* use std move

* feat: Implement replicas take over

* Basic test

* Address CR comments

* Write a better test. Sadly it fails

* chore: Expose AwaitDispatches for reuse in takeover

* Ensure that no commands can execute during or after a takeover

* CR progress

* Actually disable the expiration

* Improve tests coverage

* Fix the dispatch waiting code

* Improve testing coverage and fix a shutdown snapshot bug

* don't replicate a replica
2023-07-02 16:11:28 +02:00
Vladislav
6f78ae5073
fix: call NotifyPending only from tx queue invocations (#1439)
* fix: call NotifyPending only from tx queue invocations

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-06-21 10:26:22 +03:00
Vladislav
6d4d740d6e
fix: Don't remove non-concluding tx from queue on ooo runs (#1427)
* fix: Don't remove non-concluding tx from queue on ooo runs

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-06-18 21:14:28 +03:00
Roman Gershman
69e6ad799a
fix: remove bad check-fail in the transaction code (#1420)
fix: remove bad check-fail in the transaction code.

Fixes #1421.

The failure reproduces for dragonfly running with a single thread, where all the
arguments are grouped within the same ShardData.

Also, we improve verbosity levels inside reply_builder.cc.
For that we extend SinkReplyBuilder to support protocol errors reporting
and we remove ad-hoc code for this from dragonfly_connection.
Required to track errors easily with `--vmodule=reply_builder=1`

Finally, a pytest is added to cover the issue.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-06-18 07:03:08 +03:00
Andy Dunstall
d6e97fcf6d
fix: remove NotifyPending from UnwatchShardCb (#1402)
NotifyPending was being called when a blocked transaction expires, which
meant other blocked transactions could be woken up even though another
transaction could be in progress. NotifyPending has no effect on the
blocked transaction.

Signed-off-by: Andy Dunstall <andydunstall@hotmail.co.uk>
2023-06-13 08:59:34 +03:00
Kostas Kyrimis
42116fa012
feat(zset family): Implement ZDiff command issue #1311 (#1333)
Signed-off-by: Kostas <kostaskyrim@gmail.com>
2023-06-05 18:26:01 +03:00
Andy Dunstall
1cfeff21a4
feat(streams): Add support for XREAD BLOCK (#1291)
* feat(streams): Add support for XREAD BLOCK

---------

Signed-off-by: Andrew Dunstall <andydunstall@hotmail.co.uk>
2023-05-27 22:47:31 +03:00
Roman Gershman
9658eab036
fix: support XREAD ... STREAMS ... keys derivation (#1250)
2023-05-21 20:51:45 +03:00
Roy Jacobson
7adf3799f0
feature(server): Bring back inline scheduling (#1130)
* feat: run tx-schedule inline if the dest shard is on the same thread (#908)

The optimization is applied within ScheduleSingleHop call.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>

* fix(server): Don't inline schedule when in LOADING

* Fix another pre-emption bug with inline scheduling

* Better locking around journal callbacks

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Co-authored-by: Roman Gershman <roman@dragonflydb.io>
2023-05-21 10:10:38 +03:00
Vladislav
cb82680aca
Remove blpop FindFirst hop after wakeup (#1168)
Remove BLPOP hop after wake
2023-05-03 19:45:06 +03:00
Roman Gershman
418f529b0e
fix: 'xgroup help' should show help message (#1159)
Along the way, performs small cleanups in command handling code.
XGROUP HELP is special because it falls out of Dragonfly command taxonomy design,
where a command name determines where its key is located. All other XGROUP subcommands
expect to see XGROUP <subcmd> <key> and this one obviously does not need any key.
I fix it by working around the issue and introducing a dedicated dummy command for this combination.

Fixes #854.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-04-30 09:53:01 +03:00
adiholden
7f56a435c4
bug(server): replicate scripts in stable state (#1114)
* bug(server): replicate scripts in stable state

---------

Signed-off-by: adi_holden <adi@dragonflydb.io>
2023-04-23 23:46:51 +03:00
Vladislav
b345604226
fix: Remove incremental locking (#1094)
2023-04-15 06:59:19 -07:00
Vladislav
70cf436c05
Lua script async calls (#1070)
Introduces squashing for scripts and a new `redis.acall` command for async commands
2023-04-12 23:37:25 +03:00
Vladislav
282c168d34
fix: Update cntx->cid on multi-tx'es (#1081)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-04-12 23:28:31 +03:00
Vladislav
015ed622c5
fix(server): Optimize StoredCmd (#1053)
Optimize StoredCmd to allow inline storage
2023-04-11 10:14:36 +03:00
Vladislav
a12ddfe108
Remove cmd name from args (#1057)
chore: remove cmd name from the list of arguments

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
Co-authored-by: Roman Gershman <roman@dragonflydb.io>
2023-04-10 14:14:52 +03:00
Vladislav
c29db83b7e
feat(server): Squashed exec (#1025)
Introduces squashed executor that allows squashing single-shard commands within multi transactions
2023-04-08 23:34:33 +03:00
Roman Gershman
8e0080133c
fix: add missing barrier to fix reads in the coordinator fiber (#1009)
Fixes #997.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-03-30 06:55:28 +03:00
Vladislav
623c5a85e3
fix(server): Fix transaction index + shard_data multi re-use (#958)
2023-03-19 12:18:02 +03:00
Roman Gershman
4870d8150c
fix: Fix deadlock in the transaction code.
The deadlock happened during the brpop flow, where we access
shard_data.local_data from both coordinator and shard threads.
Originally, shard_data.local_data was not designed for concurrent access,
and I used the ARMED bit to deduplicate callback runs for each shard.
The problem is that within the BRPOP flow,
ExecuteAsync would apply "|= ARMED" and in parallel NotifySuspended would apply
"|= AWAKED" in the shard thread, and both R/M/W operations would corrupt each other.

Therefore, I have now completely separated the shard-local local_data mask from the is_armed boolean.
Moreover, since we now use atomics for is_armed, I increased PerShardData size to 64 bytes
to avoid false cache sharing between PerShardData objects.

Fixes #945

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-03-17 20:36:11 +02:00
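A rough sketch of the fix described in 4870d8150c, under stated assumptions: the field types, the `Arm`/`ConsumeArmed` helpers, and the 64-byte cache-line size are illustrative and not taken from the Dragonfly source. It shows the two ideas in the message: keeping the shard-local mask out of cross-thread R/M/W operations, and giving each object its own cache line.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical layout modeled on the commit message, not the real struct.
// alignas(64) pads each object to an assumed 64-byte cache line, so that
// neighbouring PerShardData objects never share a line (no false sharing).
struct alignas(64) PerShardData {
  // By convention in this sketch, modified only by the owning shard thread,
  // so a plain (non-atomic) read-modify-write such as `|=` is safe.
  std::uint16_t local_mask = 0;

  // Crosses threads: set by the coordinator, consumed by the shard thread.
  std::atomic_bool is_armed{false};
};

static_assert(sizeof(PerShardData) == 64, "one object per cache line");
static_assert(alignof(PerShardData) == 64, "cache-line aligned");

// Coordinator side: mark the shard callback as pending.
inline void Arm(PerShardData& sd) {
  sd.is_armed.store(true, std::memory_order_release);
}

// Shard side: returns true exactly once per Arm(), which deduplicates
// callback runs without touching local_mask from another thread.
inline bool ConsumeArmed(PerShardData& sd) {
  return sd.is_armed.exchange(false, std::memory_order_acq_rel);
}
```

Because `is_armed` is a separate atomic, a shard thread applying `local_mask |= ...` can no longer race with the coordinator's arming step, which is the corruption the commit message attributes to the single shared mask.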
Roman Gershman
f4081f3979
fix: improve consistency around brpop flow
1. Added a test that was breaking earlier.
2. Made sure that multiple awakened brpop transactions would not
   snatch items from one another.
3. Fixed watched-queues clean-up logic inside blocking_controller that caused deadlocks.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-03-17 11:49:23 +02:00
Roman Gershman
c96f637f73
chore: some pytests and logging improvements
1. pytest extensions and fixes - allows running them
   with the existing local server by providing its port (--existing <port>).
2. Extend "DEBUG WATCHED" command to provide more information about watched state.
3. Improve debug/vlog printings around the code.

This noisy PR is a preparation before BRPOP fix that will follow later.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-03-17 10:52:20 +02:00
Roman Gershman
8cf8115116
chore(server): pass coordinator thread to a transaction object (#905)
This should help with some of the optimizations we may do in the future.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-03-03 14:40:29 +02:00
Vladislav
be4ef01975
fix(server): Reorder ExecuteAsync callback seqlock check (#873)
fix(server): Reorder cb check

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-02-24 21:55:56 +02:00
Vladislav
03e99a5d96
EVAL multi modes + non atomic modes (#818)
- Implement multi modes for eval
- Implement non atomic mode
- Enhance tests
2023-02-20 09:43:31 +03:00
Vladislav
4ef06e759a
Basic multi modes for MULTI/EXEC (#796)
feat(server): Basic multi transaction modes

This commit adds the notion of multi transaction modes that allow controlling the execution and
locking behaviour of multi transactions.
In general, there are four modes:
- GLOBAL: all commands run within a global transaction. There is no need for recording locks. Lua scripts can theoretically run with undeclared keys.
- LOCK_AHEAD: the transaction locks all keys ahead, like a regular transaction, and schedules itself.
- LOCK_INCREMENTAL: the transaction determines what shards it has keys in and schedules itself on those shards, but locks only when accessing a new key. This allows other transactions to run ooo alongside a big multi-transaction that accesses a contended key only at its very end.
- NON_ATOMIC: all commands run separately and no atomicity is provided, as in a pipeline.

This commit only adds support for the first 3 modes to EXEC commands.

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-02-18 20:18:28 +03:00
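The four modes listed in 4ef06e759a could be modeled as follows. The value names come from the commit message; the `MultiMode` type name and the two helper predicates are hypothetical and only restate the descriptions given there.

```cpp
// Sketch only: the enumerator names follow the commit message, everything
// else (type name, helpers) is an assumption made for illustration.
enum class MultiMode {
  GLOBAL,            // all commands run within a global transaction
  LOCK_AHEAD,        // lock all keys up front, like a regular transaction
  LOCK_INCREMENTAL,  // schedule on the involved shards, lock keys on first access
  NON_ATOMIC         // commands run separately, as in a pipeline
};

// True for every mode that the message describes as providing atomicity.
constexpr bool IsAtomic(MultiMode mode) {
  return mode != MultiMode::NON_ATOMIC;
}

// True only for the mode that acquires every key lock before running.
constexpr bool LocksAllKeysAhead(MultiMode mode) {
  return mode == MultiMode::LOCK_AHEAD;
}

static_assert(IsAtomic(MultiMode::LOCK_INCREMENTAL), "incremental is atomic");
static_assert(!LocksAllKeysAhead(MultiMode::GLOBAL), "global records no locks");
```

Note that a later entry in this log (b345604226, "Remove incremental locking") suggests the LOCK_INCREMENTAL mode did not stay in the codebase, so the enum above reflects the state described by this commit only.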
adiholden
50f50c8380
feat(server): write journal record with optional await based on flag… (#791)
* feat(server): write journal record with optional await based on flag issue #788

Signed-off-by: adi_holden <adi@dragonflydb.io>
2023-02-15 09:34:24 +02:00
adiholden
72bad6c5ab
fix(replica) : replica will not sync execution multi shard commands as default (#800)
fix(replica) : replica will not sync execution multi shard commands as default
2023-02-14 16:30:14 +02:00
Vladislav
3e46fd1318
Refactor initialization phase of transaction (#790)
* refactor(server): Split transaction init

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-02-14 08:32:02 +02:00
Vladislav
6e612e7545
feat(server): Async unlock multi (#774)
* feat(server):Async unlock multi

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-02-11 18:57:02 +02:00
Vladislav
07973d40eb
Update transaction and enable OOO for regular transactions (#769)
* refactor(server): Update ScheduleSingleHop

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-02-09 19:36:55 +02:00
Roman Gershman
46b42e571f
chore(server): track ooo transactions via metrics. (#763)
This change allows tracking which transactions run out of order.
OOO txs are more performant and exhibit substantially lower latency.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-02-07 19:40:25 +02:00
adiholden
4a826fdb7b
bug(transaction): local result needs to be reset on InitByArgs Fixes … (#762)
* bug(transaction): local result needs to be reset on InitByArgs Fixes #752

Signed-off-by: adi_holden <adi@dragonflydb.io>

* add unit test

Signed-off-by: adi_holden <adi@dragonflydb.io>

---------

Signed-off-by: adi_holden <adi@dragonflydb.io>
2023-02-06 15:48:12 +02:00