Commit graph

206 commits

Author SHA1 Message Date
Roman Gershman
4201ac416e
chore: remove Command step argument (#2150)
It will be represented via INTERLEAVED_KEYS option.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-11-10 19:10:56 +02:00
Vladislav
39c1827fa7
fix(transaction): Reset reverse index in multi-tx (#2086)
* fix(transaction): Reset reverse index in multi-tx

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-10-30 17:22:35 +03:00
Roman Gershman
bcfd1863c7
fix: reject zset variadic commands with 0 keys (#2022)
Fixes the assertion failure as reported by #1994.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-10-15 14:34:04 +03:00
Vladislav
cb9a45f2a9
fix(server): Don't recompute shard for squashed stub tx (#2017)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-10-15 09:27:58 +03:00
Vladislav
fc0943989e
feat(search): return scores (#1870)
* feat(search): return scores

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-09-25 10:03:17 +03:00
Shahar Mike
2091f53777
opt(lua): Avoid separate hops for lock & unlock on single shard execution (#1900)
2023-09-22 14:09:17 +03:00
Shahar Mike
8cc448f6b4
opt(server): Call reserve() with correct argument (#1914)
2023-09-22 08:47:49 +00:00
adiholden
36ac31427d
bug(server): global command stalls on server load with pipeline mode (#1909)
* bug(server): global command stalls on server load with pipeline mode
fixes #1797
the bug: a global command cannot be scheduled into the txq under a high load of pipelined commands. Only after the load finishes does the global transaction get scheduled into the txq. The reason is that when we start a global transaction we set the shard lock, and all the transactions start to enter the txq. They compete with the global tx on the order in which they are inserted into the queue, to preserve transaction atomicity. Because the global tx needs to be inserted into all shard queues, its chance of being scheduled in order relative to all the other transactions is low.
the solution: lock the global transaction inside the schedule-in-shard step. Locking closer to scheduling decreases the number of transactions in the queue, so the global tx has a higher chance of being ordered correctly.

Signed-off-by: adi_holden <adi@dragonflydb.io>
2023-09-21 13:35:57 +00:00
Vladislav
769f5a19cd
feat: Span-all no-key transactional commands (#1864)
* feat: Span-all no-key transactional commands

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-09-19 11:10:56 +03:00
Shahar Mike
b91435e360
opt(lua): Coordinate single-shard Lua evals in remote thread (#1845)
* opt(lua): !!WIP!! Coordinate single-shard Lua evals in remote thread

This removes the need for an additional (and costly) Fiber.
2023-09-13 11:52:35 +03:00
Vladislav
663c1f9e1b
fix: fix dchecks (#1681)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-08-09 22:44:44 +03:00
Vladislav
b9e8a2c0da
chore: Connection fixes (#1663)
* chore: Connection safety checks

Signed-off-by: Vladislav <vlad@dragonflydb.io>
2023-08-09 17:57:41 +03:00
Shahar Mike
734401098c
opt(server): Execute lua on target shard, if it's 1 (#1639)
* opt(server): Execute lua on target shard, if it's 1

This will save hops by short-circuiting execution of commands.

* Reuse unique shard id from tx
Only switch threads for LOCK_AHEAD

* Signedness
2023-08-09 14:18:34 +03:00
Vladislav
a0da723628
fix: remove coordinator_index_ from tx & fix short circuit (#1640)
fix: remove coordinator_index_ from tx

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-08-03 22:43:43 +03:00
Shahar Mike
67a4c4e6cb
feat(server): Add --lock_on_hashtags mode. (#1611)
* feat(server): Add `--lock_on_hashtags` mode.

This new mode effectively locks hashtags (i.e. strings within {curly
braces}) instead of the full keys being used.
This can allow scripts to access undeclared keys if they all use a
common hashtag, like for the case of BullMQ.

To make sure this mode is tested, I added a way to specify flags via env
variables, and modified `ci.yml` to run all tests using this mode as well.
While at it, I also added `--cluster_mode=emulated` mode to CI.
2023-08-03 20:13:36 +03:00
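The hashtag idea from #1611 above can be illustrated with a minimal C++17 sketch. It assumes the Redis Cluster hashtag convention (the text between the first `{` and the first `}` after it, if non-empty); the commit message does not say whether Dragonfly applies exactly this rule. `LockTag` is a hypothetical helper name, not Dragonfly's actual API.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not Dragonfly's real function): returns the part of
// the key that would be locked in a hashtag-locking mode.
// Assumption: Redis Cluster hashtag rules. If the key has no well-formed,
// non-empty {tag}, the whole key is used.
constexpr std::string_view LockTag(std::string_view key) {
  const std::size_t open = key.find('{');
  if (open == std::string_view::npos) return key;

  const std::size_t close = key.find('}', open + 1);
  if (close == std::string_view::npos || close == open + 1) return key;

  return key.substr(open + 1, close - open - 1);
}

// Two keys sharing a hashtag map to the same lock, so a script that declared
// one of them could touch the other under this scheme.
static_assert(LockTag("bull:{queue}:wait") == "queue");
static_assert(LockTag("bull:{queue}:active") == "queue");
static_assert(LockTag("plain-key") == "plain-key");
```

Keys without a hashtag fall back to being locked by their full name, which matches the default behaviour the commit message contrasts against.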
Vladislav
da17a39410
fix: remove empty hop for non-expiring transactions (#1605)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-08-03 16:26:58 +03:00
Vladislav
844fe57dec
feat: Remove batch locks from non-atomic squashing (#1613)
feat: Remove batch locks from non-atomic squashing

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-08-02 09:16:47 +03:00
Vladislav
3a4b3c97c8
fix: simplify ScheduleInShard (#1610)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-08-01 08:15:42 +03:00
Vladislav
eda941dca6
fix: add Transaction::Conclude (#1606)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-07-31 15:37:29 +03:00
adiholden
366f50230b
bug(server): multi atomicity fix (#1593)
* bug(server): multi atomicity fix

The bug: when a multi transaction ran OOO, we removed it from the transaction
queue, causing non-atomic execution.
The fix: when we run a multi transaction, unless it is the head of the txq, we
remove it from the txq inside unlock multi.

Signed-off-by: adi_holden <adi@dragonflydb.io>
2023-07-31 14:50:33 +03:00
Roman Gershman
723cc623c2
feat: very minimal code that adds b-tree to the codebase (#1596)
* feat: very minimal code that adds b-tree to the codebase

The motivation to have our own b-tree to replace zskiplist is shown by #1567
Based on the results we should greatly reduce the memory overhead per item when using a modern b-tree.

Currently the functionality supports Insert method only to reduce the review complexity.
The design decisions behind the data structure are described in src/core/detail/btree_internal.h

* chore: rewrote template logic for internal classes

---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-07-31 13:40:27 +03:00
Shahar Mike
fba0800081
opt(server): Short-circuit ExecuteAsync(). (#1601)
* feat(server): Short-circuit ExecuteAsync().

* Do not leak (hopefully :)

* Add documentation about coordinator_index_

* Use ServerState::tlocal()->thread_index()
2023-07-31 12:12:59 +03:00
Abhradeep Chakraborty
da2ad7eceb
feat(stream): add support for xreadgroup command (#1475)
Signed-off-by: Abhradeep Chakraborty <abhradeep@dragonflydb.io>
2023-07-11 08:11:19 +03:00
Rounak Nandanwar
674f06875c
fix: zunion and zunionstore zero numkeys bug (#1522)
Fixes #1442

Signed-off-by: rounaknandanwar <rounak.nandanwar@gmail.com>
2023-07-10 09:50:59 +03:00
Roy Jacobson
0f69d32b11
takeover: Cancel blocking commands (#1514)
* fix: Cancel blocking commands when performing a takeover

* Add some comments

* Make CancelBlocking a method of ConnectionContext

* add a small todo
2023-07-05 17:09:10 +02:00
Roman Gershman
84d09800c3
chore: refresh helio (#1506)
In addition, add more states to tx local_mask to allow easier debugging.
Finally, add check-fail to verify tx invariants in order to prevent
reaching erroneous states that are nearly impossible to analyze.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-07-04 16:51:53 +03:00
Roy Jacobson
4babed54d3
feat: Support atomic replica takeover (#1314)
* fix(server): Initialize ServerFamily with all listeners.

- Add a test for CLIENT LIST which is the visible result of this.

* use std move

* feat: Implement replicas take over

* Basic test

* Address CR comments

* Write a better test. Sadly it fails

* chore: Expose AwaitDispatches for reuse in takeover

* Ensure that no commands can execute during or after a takeover

* CR progress

* Actually disable the expiration

* Improve tests coverage

* Fix the dispatch waiting code

* Improve testing coverage and fix a shutdown snapshot bug

* don't replicate a replica
2023-07-02 16:11:28 +02:00
Vladislav
6f78ae5073
fix: call NotifyPending only from tx queue invocations (#1439)
* fix: call NotifyPending only from tx queue invocations

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-06-21 10:26:22 +03:00
Vladislav
6d4d740d6e
fix: Don't remove non-concluding tx from queue on ooo runs (#1427)
* fix: Don't remove non-concluding tx from queue on ooo runs

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-06-18 21:14:28 +03:00
Roman Gershman
69e6ad799a
fix: remove bad check-fail in the transaction code (#1420)
fix: remove bad check-fail in the transaction code.

Fixes #1421.

The failure reproduces for dragonfly running with a single thread where all the
arguments are grouped within the same ShardData.

Also, we improve verbosity levels inside reply_builder.cc.
For that we extend SinkReplyBuilder to support protocol errors reporting
and we remove ad-hoc code for this from dragonfly_connection.
Required to track errors easily with `--vmodule=reply_builder=1`

Finally, a pytest is added to cover the issue.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-06-18 07:03:08 +03:00
Andy Dunstall
d6e97fcf6d
fix: remove NotifyPending from UnwatchShardCb (#1402)
NotifyPending was being called when a blocked transaction expires, which
meant other blocked transactions could be woken up even though another
transaction could be in progress. NotifyPending has no effect on the
blocked transaction.

Signed-off-by: Andy Dunstall <andydunstall@hotmail.co.uk>
2023-06-13 08:59:34 +03:00
Kostas Kyrimis
42116fa012
feat(zset family): Implement ZDiff command issue #1311 (#1333)
Signed-off-by: Kostas <kostaskyrim@gmail.com>
2023-06-05 18:26:01 +03:00
Andy Dunstall
1cfeff21a4
feat(streams): Add support for XREAD BLOCK (#1291)
* feat(streams): Add support for XREAD BLOCK

---------

Signed-off-by: Andrew Dunstall <andydunstall@hotmail.co.uk>
2023-05-27 22:47:31 +03:00
Roman Gershman
9658eab036
fix: support XREAD ... STREAMS ... keys derivation (#1250)
2023-05-21 20:51:45 +03:00
Roy Jacobson
7adf3799f0
feature(server): Bring back inline scheduling (#1130)
* feat: run tx-schedule inline if the dest shard is on the same thread (#908)

The optimization is applied within ScheduleSingleHop call.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>

* fix(server): Don't inline schedule when in LOADING

* Fix another pre-emption bug with inline scheduling

* Better locking around journal callbacks

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Co-authored-by: Roman Gershman <roman@dragonflydb.io>
2023-05-21 10:10:38 +03:00
Vladislav
cb82680aca
Remove blpop FindFirst hop after wakeup (#1168)
Remove BLPOP hop after wake
2023-05-03 19:45:06 +03:00
Roman Gershman
418f529b0e
fix: 'xgroup help' should show help message (#1159)
Along the way, performs small cleanups in command handling code.
XGROUP HELP is special because it falls out of Dragonfly command taxonomy design,
where a command name determines where its key is located. All other XGROUP subcommands
expect to see XGROUP <subcmd> <key> and this one obviously does not need any key.
I fix it by working around the issue and introducing a dedicated dummy command for this combination.

Fixes #854.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-04-30 09:53:01 +03:00
adiholden
7f56a435c4
bug(server): replicate scripts in stable state (#1114)
* bug(server): replicate scripts in stable state

---------

Signed-off-by: adi_holden <adi@dragonflydb.io>
2023-04-23 23:46:51 +03:00
Vladislav
b345604226
fix: Remove incremental locking (#1094)
2023-04-15 06:59:19 -07:00
Vladislav
70cf436c05
Lua script async calls (#1070)
Introduces squashing for scripts and a new `redis.acall` command for async commands
2023-04-12 23:37:25 +03:00
Vladislav
282c168d34
fix: Update cntx->cid on multi-tx'es (#1081)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-04-12 23:28:31 +03:00
Vladislav
015ed622c5
fix(server): Optimize StoredCmd (#1053)
Optimize StoredCmd to allow inline storage
2023-04-11 10:14:36 +03:00
Vladislav
a12ddfe108
Remove cmd name from args (#1057)
chore: remove cmd name from the list of arguments

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
Co-authored-by: Roman Gershman <roman@dragonflydb.io>
2023-04-10 14:14:52 +03:00
Vladislav
c29db83b7e
feat(server): Squashed exec (#1025)
Introduces squashed executor that allows squashing single-shard commands within multi transactions
2023-04-08 23:34:33 +03:00
Roman Gershman
8e0080133c
fix: add missing barrier to fix reads in the coordinator fiber (#1009)
Fixes #997.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-03-30 06:55:28 +03:00
Vladislav
623c5a85e3
fix(server): Fix transaction index + shard_data multi re-use (#958)
2023-03-19 12:18:02 +03:00
Roman Gershman
4870d8150c
fix: Fix deadlock in the transaction code.
The deadlock happened during the brpop flow where we access
shard_data.local_data from both coordinator and shard threads.
Originally, shard_data.local_data was not designed for concurrent access,
and I used ARMED bit to deduplicate callback runs for each shard.
The problem is that within the BRPOP flow,
ExecuteAsync would apply "|= ARMED" and in parallel NotifySuspended would apply
"|= AWAKED" in the shard thread, and the two R/M/W operations would corrupt each other.

Therefore, I have now completely separated the shard-local local_data mask from the is_armed boolean.
Moreover, since we now use atomics for is_armed, I increased PerShardData size to 64 bytes
to avoid false cache sharing between PerShardData objects.

Fixes #945

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-03-17 20:36:11 +02:00
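The fix described in 4870d8150c above (a shard-local mask kept apart from an atomic armed flag, with each per-shard object padded to 64 bytes) can be sketched roughly as follows. This is an illustration only: `PerShardDataSketch`, `kAwaked`, `Arm` and `TryDisarm` are hypothetical names, the field layout is assumed, and 64 bytes is taken as the cache-line size because that is the figure the commit message gives.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical flag value for illustration; the real bit layout is not given
// in the commit message.
constexpr std::uint32_t kAwaked = 1u << 0;

// alignas(64) both aligns the object and rounds its size up to 64 bytes, so
// two adjacent array elements never share a cache line.
struct alignas(64) PerShardDataSketch {
  // Plain mask: only ever read or written by the owning shard thread.
  std::uint32_t local_mask = 0;
  // Written by the coordinator, consumed by the shard thread, hence atomic.
  std::atomic<bool> is_armed{false};
};

static_assert(sizeof(PerShardDataSketch) == 64);
static_assert(alignof(PerShardDataSketch) == 64);

// Coordinator side: mark the shard callback as pending.
inline void Arm(PerShardDataSketch& sd) {
  sd.is_armed.store(true, std::memory_order_release);
}

// Shard side: returns true for exactly one caller per Arm(), which
// deduplicates callback runs without touching local_mask.
inline bool TryDisarm(PerShardDataSketch& sd) {
  return sd.is_armed.exchange(false, std::memory_order_acq_rel);
}
```

Because the coordinator now touches only `is_armed`, a concurrent `local_mask |= kAwaked` on the shard thread can no longer race with a read-modify-write from another thread on the same word.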
Roman Gershman
f4081f3979
fix: improve consistency around brpop flow
1. Added a test that was breaking earlier.
2. Made sure that multiple awakened brpop transactions would not
   snatch items from one another.
3. Fixed watched-queues clean-up logic inside blocking_controller that caused deadlocks.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-03-17 11:49:23 +02:00
Roman Gershman
c96f637f73
chore: some pytests and logging improvements
1. pytest extensions and fixes - allows running them
   with the existing local server by providing its port (--existing <port>).
2. Extend "DEBUG WATCHED" command to provide more information about watched state.
3. Improve debug/vlog printings around the code.

This noisy PR is a preparation for the BRPOP fix that will follow later.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-03-17 10:52:20 +02:00
Roman Gershman
8cf8115116
chore(server): pass coordinator thread to a transaction object (#905)
This should help with some of the optimizations we may do in the future.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-03-03 14:40:29 +02:00