Commit graph

671 commits

Author SHA1 Message Date
Borys
0832d23f13
fix: allow cluster node load snapshot bigger than maxmemory (#4394) 2025-01-02 13:50:21 +02:00
adiholden
810af83074
fix(server): debug populate consume less memory (#4384)
* fix server: debug populate consume less memory

Signed-off-by: adi_holden <adi@dragonflydb.io>
2025-01-01 10:49:17 +02:00
Stepan Bagritsevich
5e7acbb396
fix(snapshot_test): Fix test_big_value_serialization_memory_limit after adding streams support (#4383)
* fix(snapshot_test): Fix test_big_value_serialization_memory_limit after adding streams support

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* feat: Temporary run test 30 times

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* fix: Remove repeat

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* Revert "fix: Remove repeat"

This reverts commit b7f1ed90b2.

* Revert "feat: Temporary run test 30 times"

This reverts commit 78bfb0fc26.

* Revert "fix(snapshot_test): Fix test_big_value_serialization_memory_limit after adding streams support"

This reverts commit f33036db2a.

* refactor: address comments

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

---------

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>
2024-12-30 17:38:57 +04:00
Borys
b7532538fe
test: fix test_network_disconnect_during_migration (#4378) 2024-12-30 14:02:12 +02:00
Borys
5b9c7e415a
fix: stream memory counting during snapshot loading (#4346)
* fix: stream memory counting during snapshot loading
2024-12-27 09:02:47 +02:00
Stepan Bagritsevich
0065c27f27
feat(rdb_saver): Support big value serialization for stream (#4376)
fixes dragonflydb#4317

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>
2024-12-26 15:15:35 +00:00
adiholden
14fda354cc
fix(pytest): fix failing test test_replication_timeout_on_full_sync (#4370)
fix: increase timeout
fix: remove skip

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-12-25 14:23:33 +02:00
Roman Gershman
966a1a46fd
feat: support memcache meta responses (#4366)
Fixes #4348 and #3071

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-12-25 07:46:50 +00:00
adiholden
937f1bc058
chore(test): disable failing test untill fixed (#4367)
chore test: disable failing test untill fixed

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-12-25 05:27:00 +02:00
Roman Gershman
a27cce81b1
feat: add support for meta memcache commands (#4362)
This is a stripped down version of supporting the memcache meta requests.

a. Not all meta flags are supported, but TTL, flags, arithmetics are supported.
b. does not include reply support.
c. does not include new semantics that are not part of the older, ascii protocol.

The parser interface has not changed significantly, and the meta commands are emulated
using the old, high level commands like ADD,REPLACE, INCR etc.

See https://raw.githubusercontent.com/memcached/memcached/refs/heads/master/doc/protocol.txt for more details
regarding the meta commands spec.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-12-24 15:33:24 +02:00
Borys
dc81594cd5
test: skip test_network_disconnect_during_migration (#4359) 2024-12-24 10:21:03 +02:00
Borys
76000b9672
fix: test_network_disconnect_during_migration (#4345) 2024-12-22 13:02:53 +02:00
adiholden
e462fc0401
fix(server): use compression for non big values (#4331)
* fix server: use compression for non big values
---------

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-12-18 22:03:45 +02:00
adiholden
3f68028c08
fix(pytest): call stop for all instances even if stop raise exception (#4339)
Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-12-18 18:03:37 +02:00
Borys
6a7931985b
fix: cluster tests stability (#4338) 2024-12-18 13:43:45 +00:00
Shahar Mike
32d71071ae
chore: Disable failing test (#4337) 2024-12-18 12:54:33 +00:00
adiholden
e46717248d
fix (regression tests): skip flaky test (#4336)
skip flaky test

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-12-18 10:35:47 +00:00
Borys
d0d720a375
test: skip test_network_disconnect_during_migration because it is uns… (#4334)
test: skip test_network_disconnect_during_migration because it is unstable
2024-12-18 08:38:03 +00:00
Roman Gershman
bf410b6e0b
chore: add ability to track connections stuck at send (#4330)
* chore: add ability to track connections stuck at send

Add send_delay_seconds/send_delay_ms metrics.

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Signed-off-by: Roman Gershman <romange@gmail.com>
Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com>
2024-12-18 08:56:36 +02:00
Roman Gershman
19164badf9
fix: potential OOM when first request sent in small bits (#4325)
Before: if socket data arrived in small bits, then CheckForHttpProto would grow
io_buf_ capacity exponentially with each iteration. For example, test_match_http test
easily causes OOM.

This PR ensures that there is always a buffer available - but it grows linearly with the input size.
Currently, the total input in CheckForHttpProto is limited to 1024.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-12-17 13:26:33 +02:00
adiholden
0fe5e86a1a
fix(test): seeder test remove check (#4320)
fix test: seeder test remove check

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-12-16 16:12:49 +02:00
Borys
8237d8fa81
refactor: remove serialization_max_chunk_size for cluster tests (#4316) 2024-12-16 13:42:56 +02:00
Shahar Mike
af04b558db
fix: Remove hardcoded @assert_eventually 100 times retry (#4318)
This was a subtle and minor bug. Nice catch Borys!
2024-12-16 10:40:45 +00:00
adiholden
027eff2ad3
server: report redis version 7.2.0 to support Sidekiq (#4286)
Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-12-16 11:06:08 +02:00
Borys
5e2d0d32b8
test: update logs and test for debug purpose (#4309)
test: update logs and test for devug purpose
2024-12-16 10:25:23 +02:00
Roman Gershman
53637790e8
fix: circular dependency in qlist (#4302)
* fix: circular dependency in qlist

fixes #4294

Signed-off-by: Roman Gershman <roman@dragonflydb.io>

* chore: fixes

Signed-off-by: Roman Gershman <roman@dragonflydb.io>

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-12-13 11:04:14 +02:00
Kostas Kyrimis
b37287bf14
chore: test metrics for huge value serialization (#4262)
* fix seeder bugs
* add test
* add assertions for huge value metrics

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-12-12 14:19:14 +02:00
Borys
f892d9b7fb
fix: increase cluster migration default timeout (#4293) 2024-12-11 14:39:41 +00:00
adiholden
03d679ac31
fix(server) : dont apply eviction on rss over limit (#4276)
Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-12-09 14:19:25 +02:00
Shahar Mike
bafd8b3b8b
chore: Fix test_rss_used_mem_gap for all types (#4254)
* chore: Fix `test_rss_used_mem_gap` for all types

The test fails when it checks the gap between `used_memory` and
`object_used_memory`, by assuming that all used memory is consumed by
the `type` it `DEBUG POPULATE`s with.

This assumption is wrong, because there are other overheads, for example
the dash table and string keys.

The test failed for types `STRING` and `LIST` because they used a larger
number of keys as part of the test parameters, which added a larger
overhead.

I fixed the parameters such that all types use the same number of keys,
and also the same number of elements, modifying only the element sizes
(except for `STRING` which doesn't have sub-elements) so that the
overall `min_rss` requirement of 3.5gb still passes.

Fixes #3723

* threshold

* list

* comments test assert

* previous numbers

* ???
2024-12-08 13:34:59 +02:00
Kostas Kyrimis
f9f93b108c
chore: monotonically increasing ports for cluster tests (#4268)
We have cascading failures in cluster tests because on assertion failures the nodes are not properly cleaned up and subsequent test cases that use the same ports fail. I added a monotonically increasing port generator to mitigate this effect.
2024-12-06 12:07:23 +02:00
Borys
17651b2610
fix: test_network_disconnect_during_migration data size was too big f… (#4260)
fix: test_network_disconnect_during_migration data size was too big for timeout
2024-12-05 13:04:58 +00:00
Stepan Bagritsevich
5483d1d05e
fix(eviction): Tune eviction threshold in cache mode (#4142)
* fix(eviction): Tune eviction threshold in cache mode

fixes #4139

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: small fix in tiered_storage_test

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: address comments

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* chore(dragonfly_test): Remove ResetService

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: fix test_cache_eviction_with_rss_deny_oom test

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: address comments

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* fix(dragonfly_test): Fix DflyEngineTest.Bug207

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* fix(dragonfly_test): Increase string size in the test Bug207

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: address comments 3

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: address comments 4

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* fix: Fix failing tests

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: address comments 5

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: resolve conficts

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

---------

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>
2024-12-05 12:26:59 +00:00
Kostas Kyrimis
267d5ab370
chore: remove DbSlice mutex and add ConditionFlag in SliceSnapshot (#4073)
* remove DbSlice mutex
* add ConditionFlag in SliceSnapshot
* disable compression when big value serialization is on
* add metrics

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-12-05 13:24:23 +02:00
Kostas Kyrimis
7ccad66fb1
feat: add support for big values in SeederV2 (#4222)
* add support for big values in SeederV2

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-12-05 08:47:41 +00:00
Borys
071e299971
refactor: remove redundant allocations for streamer (#4225)
* refactor: remove redundant allocations for streamer
2024-12-05 08:15:31 +00:00
adiholden
7a23ec2aac
fix(server): fix memory leak on lua error (#4236)
The bug:
calling lua_error does not return, instead it unwinds the Lua call stack until an error handler is found or the
script exits. This lead to memory leak on object that should release memory in destructor.
Specific example is the absl::FixedArray<string_view, 4> args(argc); which allocates on heap if argc > 4. The free was not called leading to memory leak.
The fix:
Add scoping to to the function so that the destructor is called before calling raise error

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-12-03 16:47:43 +02:00
Shahar Mike
b0d633fb61
chore: Hide replica info in real cluster if --managed_service_info (#4241)
So far we only handled emulated cluster. This PR adds real cluster
support.

Related to #4173
2024-12-02 20:22:07 +02:00
Shahar Mike
779bba71f9
fix: Fix test_network_disconnect_during_migration test (#4224)
There are actually a few failures fixed in this PR, only one of which is a test bug:

* `db_slice_->Traverse()` can yield, causing `fiber_cancelled_`'s value to change
* When a migration is cancelled, it may never finish `WaitForInflightToComplete()` because it has `in_flight_bytes_` that will never reach destination due to the cancellation
* `IterateMap()` with numeric key/values overrode the key's buffer with the value's buffer

Fixes #4207
2024-12-02 15:55:23 +02:00
Borys
dc04b196d5
test: fix and unskip test_migration_timeout_on_sync (#4216) 2024-11-28 14:54:17 +02:00
Borys
d6f2b76666
fix: cluster_mgr script (#4210) 2024-11-27 14:09:19 +00:00
Kostas Kyrimis
66e0fd0908
fix: stream memory tracking (#4067)
* add object memory usage for streams
* add test
2024-11-27 12:41:08 +02:00
Borys
0531c39aae
test: skip test_cluster_mgr because of unclosed instance (#4191) 2024-11-26 09:34:06 +00:00
Roman Gershman
2c663f3833
chore: produce core files in regtests (#4185)
Should work only for self-hosted runners.
The core files will be kept in /var/crash/
We also copy automatically dragonfly binary into /var/crash to be able to debug later.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-25 17:13:11 +00:00
Roman Gershman
63742dd0cf
fix: stop using openssl for container healthchecks (#4181)
Dragonfly responds to ascii based requests to tls port with:
`-ERR Bad TLS header, double check if you enabled TLS for your client.`

Therefore, it is possible to test now both tls and non-tls ports with a plain-text PING.
Fixes #4171

Also, blacklist the bloom-filter test that Dragonfly does not support yet.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-25 17:41:17 +02:00
Borys
43c83d29fa
feat: cluster migrations restarts immediately if timeout happens (#4081)
* feat: cluster migrations restarts immediately if timeout happens

* feat: add DEBUG MIGRATION PAUSE command
2024-11-25 16:02:22 +02:00
Shahar Mike
3c65651c69
feat: Huge values breakdown in cluster migration (#4144)
* feat: Huge values breakdown in cluster migration

Before this PR we used `RESTORE` commands for transferring data between
source and target nodes in cluster slots migration.

While this _works_, it has a side effect of consuming 2x memory for huge
values (i.e. if a single key's value takes 10gb, serializing it will
take 20gb or even 30gb).

With this PR we break down huge keys into multiple commands (`RPUSH`,
`HSET`, etc), respecting the existing `--serialization_max_chunk_size`
flag.

Part of #4100
2024-11-25 15:58:18 +02:00
Shahar Mike
6a7f345bc5
chore: Hide replicas from CLUSTER subcmds in managed mode (#4174)
* chore: Hide replicas from `CLUSTER` subcmds in managed mode

Part of #4173 (see for context)

* server.client()
2024-11-24 13:10:32 +00:00
Roman Gershman
91caa940b9
chore: fix shutdown sequence in Dragonfly server (#4168)
1. Better logging in regtests
2. Release resources in dfly_main in more controlled manner.
3. Switch to ignoring signals when unregister signal handlers during the shutdown.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-24 10:35:00 +02:00
Roman Gershman
b8c2dd888a
chore: log exit code of failing dragonfly in tests (#4166)
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-11-22 11:40:10 +02:00