Commit graph

61 commits

Author SHA1 Message Date
Roman Gershman
bf410b6e0b
chore: add ability to track connections stuck at send (#4330)
* chore: add ability to track connections stuck at send

Add send_delay_seconds/send_delay_ms metrics.

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Signed-off-by: Roman Gershman <romange@gmail.com>
Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com>
2024-12-18 08:56:36 +02:00
Roman Gershman
19164badf9
fix: potential OOM when first request sent in small bits (#4325)
Before: if socket data arrived in small bits, then CheckForHttpProto would grow
io_buf_ capacity exponentially with each iteration. For example, test_match_http test
easily causes OOM.

This PR ensures that there is always a buffer available - but it grows linearly with the input size.
Currently, the total input in CheckForHttpProto is limited to 1024.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-12-17 13:26:33 +02:00
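The difference between the two growth policies described in #4325 can be illustrated with a small simulation. This is only a sketch, not the code from `CheckForHttpProto`: the function names, the 256-byte starting capacity, and the `min_free` parameter are assumptions made for the example.

```python
from typing import Iterable


def capacity_after_doubling(initial: int, iterations: int) -> int:
    """Capacity when the buffer is doubled on every read iteration."""
    cap = initial
    for _ in range(iterations):
        cap *= 2
    return cap


def capacity_after_linear(initial: int, chunk_sizes: Iterable[int], min_free: int) -> int:
    """Capacity when the buffer only grows enough to keep min_free bytes free."""
    cap = initial
    used = 0
    for n in chunk_sizes:
        used += n
        if cap - used < min_free:
            # Grow just enough for the next read; capacity tracks the input size.
            cap = used + min_free
    return cap


chunks = [1] * 1000  # the request trickles in one byte at a time
doubling = capacity_after_doubling(256, len(chunks))
linear = capacity_after_linear(256, chunks, min_free=256)
```

With 1000 one-byte reads the doubling policy asks for `256 * 2**1000` bytes, while the linear policy stays at `1256`, which is why small reads could exhaust memory before the fix.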
Roman Gershman
fa8f3f5564
fix: regression in squashing code when determining eval commands (#4116)
The regression was introduced by #3947 and causes crashes in bullmq.
It went unnoticed until now because the Python client sends commands in uppercase.
Fixes #4113

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Co-authored-by: Kostas Kyrimis <kostas@dragonflydb.io>
2024-11-11 19:54:47 +00:00
Vladislav
32a31cf1d8
chore(facade): Fix bad new IO glue (#3940)
* chore(facade): Fix bad new IO glue

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2024-10-18 23:25:56 +03:00
Shahar Mike
b2ebfd05d4
fix: Do not publish to connections without context (#3873)
* fix: Do not publish to connections without context

This is a rare case where a closed connection is kept alive while the
handling fiber yields, therefore leaving `cc_` (the connection context)
pointing to null for other fibers to see.

As far as I can see, this can only happen during server shutdown, but
there could be other cases that I have missed.

The test on its own does _not_ reproduce the crash, however with added
`ThisFiber::SleepFor()`s I could reproduce the crash:

* Right before `DispatchBrief()`
  [here](e3214cb603/src/server/channel_store.cc (L154))
* Right after connection context `reset()`
  [here](2ab480e160/src/facade/dragonfly_connection.cc (L750))

In any case, calling `SendPubMessageAsync()` to a connection where `cc_`
is null is a bug, and we fix that here.

* rewording
2024-10-08 14:45:57 +03:00
Kostas Kyrimis
b19f722011
chore: do not close connections at the end of pytest (#3811)
A common case is that we need to clean up a connection before exiting a test by calling its .close() method; otherwise the connection raises a warning that it was left unclosed. However, remembering to call .close() on every connection at the end of each test is cumbersome. Luckily, fixtures in Python can be marked as async, which allows us to:

* cache all clients created by DflyInstance.client()
* clean them all at the end of the fixture in one go

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-30 09:54:41 +03:00
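The caching idea from #3811 can be sketched without pytest by using an async context manager, which has the same setup/yield/teardown shape as an async fixture. `FakeClient`, `Instance`, and `instance_fixture` are hypothetical stand-ins, not the real `DflyInstance` code.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeClient:
    """Hypothetical stand-in for a real connection object."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


class Instance:
    """Hypothetical stand-in for DflyInstance: remembers every client it hands out."""

    def __init__(self):
        self.clients = []

    def client(self):
        c = FakeClient()
        self.clients.append(c)  # cache the client so it can be closed later
        return c

    async def close_clients(self):
        for c in self.clients:
            await c.close()
        self.clients.clear()


@asynccontextmanager
async def instance_fixture():
    inst = Instance()
    try:
        yield inst  # the test body runs here
    finally:
        await inst.close_clients()  # one cleanup pass for all cached clients


async def demo():
    async with instance_fixture() as inst:
        a, b = inst.client(), inst.client()
        assert not a.closed and not b.closed
    return a.closed and b.closed


result = asyncio.run(demo())
```

The test body never calls `.close()` itself; both clients are closed when the fixture scope ends.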
Kostas Kyrimis
ed11c8d3a4
chore: allow config set notify_keyspace_events (#3790)
Previously, notify_keyspace_events could not be set at runtime via the config set command.

* allow notify_keyspace_events in config set command
* add tests

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-30 09:54:02 +03:00
Roman Gershman
e21ba0b3d9
chore: symbolize stack traces in tests upon crash (#3714)
We disable address space randomization when building the binary
and use addr2line to symbolize the stacktrace if it exists.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-09-16 13:43:16 +03:00
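A rough sketch of the symbolization step from #3714: pull the addresses out of a crash log and build an `addr2line` invocation for them. The stack trace format and the helper names are assumptions; only the `addr2line` options `-e` (executable), `-f` (function names), and `-C` (demangle) are standard binutils flags.

```python
import re

# Assumption: frames appear in the log as hexadecimal addresses such as 0x55d1c0a1b2c3.
HEX_ADDR = re.compile(r"0x[0-9a-fA-F]+")


def extract_addresses(stacktrace: str) -> list:
    """Return every hex address found in the stack trace, in order."""
    return HEX_ADDR.findall(stacktrace)


def addr2line_command(binary: str, addresses) -> list:
    """Build the argv for addr2line; the caller decides how to run it."""
    return ["addr2line", "-e", binary, "-f", "-C", *addresses]


trace = """\
    @     0x55d1c0a1b2c3  (unknown)
    @     0x55d1c0a1b400  (unknown)
"""
addresses = extract_addresses(trace)
command = addr2line_command("./dragonfly", addresses)
```

The resulting list can be passed to `subprocess.run`. Because address space randomization is disabled in the build, the logged addresses should map directly onto the binary.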
Shahar Mike
b10a4a5348
feat(server): Support CLIENT SETINFO (#3673)
Add support for `CLIENT SETINFO <LIB-NAME | LIB-VER>` and also return
that as part of `CLIENT LIST`, like Valkey.

Fixes #3137
2024-09-09 11:03:05 +03:00
Roman Gershman
4b1574b5c8
chore: fix test_parser_memory_stats flakiness (#3354)
* chore: fix test_parser_memory_stats flakiness

1. Added a robust assert_eventually decorator for pytests
2. Improved the assertion condition in TieredStorageTest.BackgroundOffloading
3. Added total_uploaded stats for tiering that tells how many times offloaded values
   were promoted back to RAM.

* chore: skip test_cluster_fuzzymigration
2024-07-22 10:41:26 +00:00
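One way an `assert_eventually` decorator like the one mentioned in #3354 could work is to retry an async check until its assertions pass or the attempts run out. The signature (`times`, `delay`) is an assumption for this sketch and may differ from the helper in the test suite.

```python
import asyncio
import functools


def assert_eventually(times=50, delay=0.01):
    """Retry the wrapped coroutine until it stops raising AssertionError."""

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            for attempt in range(times):
                try:
                    return await func(*args, **kwargs)
                except AssertionError:
                    if attempt == times - 1:
                        raise  # out of attempts: surface the last failure
                    await asyncio.sleep(delay)

        return wrapper

    return decorator


calls = {"n": 0}


@assert_eventually(times=10, delay=0.001)
async def check_counter():
    # Fails on the first two attempts, passes on the third.
    calls["n"] += 1
    assert calls["n"] >= 3


asyncio.run(check_counter())
```

Retrying on `AssertionError` only means that unexpected exceptions still fail the test immediately.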
Roman Gershman
feb9bc266a
chore: pull helio (#3350)
* chore: pull helio

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-07-21 15:26:25 +03:00
Roman Gershman
fb7782bcce
chore: remove redundant metrics from memory stats (#3345)
Leave only connection memory usage in memory stats.
We should think how we can move it also to /metrics.
In addition, added a test verifying that redis parser memory
usage is tracked.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-07-20 06:02:55 -04:00
Kostas Kyrimis
5956275818
chore: replace session wide fixtures with scope (#3251)
* chore: replace session wide fixtures with scope
2024-07-02 10:26:26 +03:00
Vladislav
4cc9834d89
fix(pytest): timed ticker for simpler conditions (#3242)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2024-06-29 15:48:25 +03:00
Vladislav
4357933775
feat(server): expiry notifications (#3154)
Adds basic support for keyspace notifications; only the `Ex` setting (expired-key events) is supported.

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2024-06-24 16:23:40 +03:00
Roman Gershman
007d4854db
chore: Introduce pipeline back-pressure (#3152)
* chore: Introduce pipeline back-pressure

Also, improve synchronization primitives and replace them with
thread-local variations.

Before the change, on my local machine with the dragonfly running with 8 threads,
`memtier_benchmark  -c 10 --threads 8  --command="PING"  --key-maximum 100000000  --hide-histogram --distinct-client-seed --pipeline=20 --test-time=10`

reached 10M qps with 0.327ms p99.9.

After the change, the same command showed 13.8M qps with 0.2ms p99.9
---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-06-10 12:39:41 +03:00
Shahar Mike
229eeeb014
fix: Fix live-lock in connection test (#3135)
fix: Fix livelocking connection test
2024-06-05 21:30:02 +03:00
Roman Gershman
b02521cf51
chore: prevent Dispatch fiber to be launched during migration (#3123)
* chore: prevent Dispatch fiber to be launched during the connection migration
2024-06-04 14:13:48 +00:00
Kostas Kyrimis
3924fcad68
chore: pull helio add test for tls deadlock (#3111)
* pull helio
* add test that covers tls deadlock
2024-06-03 14:13:47 +00:00
Kostas Kyrimis
c2f13993d9
fix(acl): authentication with UDS socket (#2895)
* disable authentication on UDS socket
* add a test so the bug won't happen again
2024-04-12 16:01:12 +03:00
adiholden
b1e688b33f
bug(server): set connection flags block/pause flag on all blocking commands (#2816)
* bug(server): set connection blocking and pause flags on all blocking commands

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-04-09 09:49:33 +03:00
Roman Gershman
604e9c6e97
fix: authorize the http connection to call commands (#2863)
fix: authorize the http connection to call DF commands

The assumption is that basic-auth already covers the authentication part.
And thanks to @sunneydev for finding the bug and providing the tests.
The tests actually uncovered another bug where we may parse partial http requests.
This one is handled by https://github.com/romange/helio/pull/243

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-04-08 13:19:01 +03:00
Vladislav
76729d6e4c
fix(tests): Fix numsub test (#2852)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2024-04-07 09:48:57 +03:00
Roman Gershman
d3b90c8210
fix: correct json response for errors (#2813)
Fixes #2811

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-04-01 22:56:26 +03:00
Roman Gershman
966d7f55ba
chore: preparation for basic http api (#2764)
* chore: preparation for basic http api

The goal is to provide very basic support for simple commands,
fancy stuff like pipelining, blocking commands won't work.

1. Added optional registration for /api handler.
2. Implemented parsing of post body.
3. Added basic formatting routine for the response. It does not cover all the commands but should suffice for
   basic usage.

The API is a POST method and the body of the request should contain command arguments formatted as json array.
For example, `'["set", "foo", "bar", "ex", "100"]'`.
The response is a json object with either `result` field holding the response of the command or
`error` field containing the error message sent by the server.
See `test_http` test in tests/dragonfly/connection_test.py for more details.


* chore: cover iouring with enable_direct_fd

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-03-25 12:12:31 +02:00
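The request and response shapes described in #2764 can be sketched with the standard `json` module. The helper names are hypothetical and no network call is made; actually sending the body would be a POST to the `/api` handler.

```python
import json


def build_command_body(*args) -> str:
    """Format command arguments as the JSON array expected in the POST body."""
    return json.dumps([str(a) for a in args])


def parse_reply(raw: str):
    """Return the `result` field, or raise if the server reported an `error`."""
    reply = json.loads(raw)
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return reply["result"]


body = build_command_body("set", "foo", "bar", "ex", 100)
ok = parse_reply('{"result": "OK"}')
```

Here `body` is the string `["set", "foo", "bar", "ex", "100"]`, matching the example in the commit message, and a reply such as `{"error": "..."}` is turned into an exception.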
Roman Gershman
2d246adbbb
chore: better error reporting when connecting to tls with plain socket (#2740)
* chore: better error reporting when connecting to tls with plain socket

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-03-19 17:20:23 +02:00
adiholden
7e4527098b
fix(server): client pause work while blocking commands run (#2584)
fix #2576
fix #2661

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-02-28 11:07:03 +00:00
Vladislav
5ee61db0f3
feat(connection): Support pipelining with Memcached (#2648)
* feat(connection): Support pipelining with Memcached

Adds support for pipelining to Memcached, enhances Memcached pytests

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2024-02-23 20:18:25 +03:00
Vladislav
a5b9401449
fix: reduce test_pipeline_batching_while_migrating flakiness (#2475)
* fix: reduce test_pipeline_batching_while_migrating flakiness
2024-01-25 17:55:12 +03:00
Vladislav
484b4de216
Fix flush when migrating connection (#2407)
fix: don't miss flush for control messages
2024-01-13 09:57:33 +03:00
s-shiraki
bd3e57d262
feat(server): Implement NUMSUB subcommand (#2282)
* feat(server): Implement NUMSUB subcommand

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix: test

* fix: build error
2023-12-16 20:42:15 +02:00
Vladislav
7ca07a498f
fix(server): Fix client pause and add test (#2298)
Fixes a bug in which we incorrectly determined paused dispatches, which led to not allowing multiple (overlapping) client pauses
2023-12-12 19:28:48 +03:00
Vladislav
d6044edbab
fix(squashing): Reset base command id (#2209)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-11-26 12:40:37 +02:00
Vladislav
604c600166
fix(pytest): Fix renamed flag (#2197)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-11-20 20:54:11 +00:00
Vladislav
d21f82a5f9
chore: connection fixes (#2192)
* chore: add more states to client connections

* fix: clear pipelined messages before close

* fix: skip same thread on backpressure
---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
Co-authored-by: Roman Gershman <roman@dragonflydb.io>
2023-11-20 17:08:12 +00:00
Vladislav
2cb7d30603
fix: skip setting tcp_nodelay for unix domain sockets (#2033)
* fix handling of unix domain sockets

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-10-22 11:00:51 +03:00
Roy Jacobson
5c9c9255d2
chore: Small refactor of DflyInstance (#1951)
* Move to its own file
* Unify self.args and self.params.args earlier so it can be inspected.
2023-09-28 10:11:11 +03:00
Vladislav
d8b99dce93
chore(regtest): Update redis dependency (#1915)
* chore(regtest): Update redis dependency

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-09-23 10:06:21 +03:00
Roman Gershman
6dd51de9fe
fix: fix memcache bugs (#1745)
1. If the first request sent to the connection was large (2 KB or more),
   Dragonfly closed the connection.
2. Changed server side error reporting according to memcache protocol:
   https://github.com/memcached/memcached/blob/master/doc/protocol.txt#L172
3. Fixed the wrong casting in DispatchCommand.
4. Remove practically unused code that translated opstatus to strings.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-08-27 11:29:01 +03:00
Vladislav
ac79167530
fix: Add small timeout to monitor (#1718)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-08-21 10:20:43 +00:00
Vladislav
c65b9cf63d
fix: Fix squashing, pytest arg formatting (#1712)
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-08-18 09:28:19 +03:00
Vladislav
4fbd0e38dd
feat: Pipeline squashing (#1619)
* feat: Pipeline squashing

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
Signed-off-by: Vladislav <vladislav.oleshko@gmail.com>
Co-authored-by: Kostas Kyrimis <kostaskyrim@gmail.com>
2023-08-17 16:06:48 +03:00
Vladislav
71fa2f275e
fix: MONITOR now works for multi transactions (#1675)
* fix: fix monitoring for multi transactions

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-08-17 12:50:16 +03:00
Roy Jacobson
4c85d5825d
tests: Add a password to TLS configurations (#1603)
Add a password to TLS configurations
2023-07-31 08:48:36 +00:00
Kostas Kyrimis
7944af3c62
feat: Add black formatter to the project (#1544)
Add black formatter and run it on pytests
2023-07-17 13:13:12 +03:00
Kostas Kyrimis
77a223d36d
fix: add tls-ca-cert-file and tls-ca-cert-dir flags to allow tls certificate validation (#1515)
1. add tls-ca-cert-file flag
2. add tls-ca-cert-dir flag
3. enables redis-cli to connect over tls without --insecure flag by properly validating certificate with CA
2023-07-11 08:28:18 +03:00
Kostas Kyrimis
15481b81ce
feat(replication): allow non-tls connections between replica and master on admin port #1419 (#1490)
1. Add new flag no_tls_on_admin_port
2. Add replication tests for no_tls_on_admin_port
2023-07-06 14:04:45 +03:00
Roman Gershman
69e6ad799a
fix: remove bad check-fail in the transaction code (#1420)
fix: remove bad check-fail in the transaction code.

Fixes #1421.

The failure reproduces for dragonfly running with a single thread where all the
arguments are grouped within the same ShardData.

Also, we improve verbosity levels inside reply_builder.cc.
For that we extend SinkReplyBuilder to support protocol errors reporting
and we remove ad-hoc code for this from dragonfly_connection.
Required to track errors easily with `--vmodule=reply_builder=1`

Finally, a pytest is added to cover the issue.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2023-06-18 07:03:08 +03:00
Vladislav
e837b3d229
Fix reply builder access issue (#1378)
* fix: Fix invalid reply builder use

---------

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-06-10 00:50:05 +03:00
Vladislav
19732fcd0c
fix: fix dispatch ordering (#1254)
Now `SUBSCRIBE` will respond synchronously.  The change is here so we:

1. Maintain the order in pipelined requests
2. Don't have a "race condition": subscribe needs to update channel store pointers on all threads. While it waits for all threads to complete the callback, some of them might have finished earlier, so they can already start sending messages before the initial ack is sent

Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
2023-05-23 23:55:00 +03:00