Commit graph

88 commits

Author SHA1 Message Date
Shahar Mike
24a1ec6ab2
fix: Huge entries fail to load outside RDB / replication (#4154)
* fix: Huge entries fail to load outside RDB / replication

We have an internal utility tool that we use to deserialize values in
some use cases:

* `RESTORE`
* Cluster slot migration
* `RENAME`, if the source and target shards are different

We [recently](https://github.com/dragonflydb/dragonfly/issues/3760)
changed this area of the code, which caused this regression as it only
handled RDB / replication streams.

Fixes #4143
2024-11-20 14:00:07 +00:00
Borys
4bc9ad6f01
test: add test for snapshotting during migration (#4108)
* test: add test for snapshotting during migration

* test: add test to run replication after migration
2024-11-13 13:40:00 +02:00
Borys
e4b468d953
fix: reduce memory consumption during migration (#4017)
* refactor: reduce memory consumption for RestoreStreamer
* fix: add Throttling into RestoreStreamer::WriteBucket
2024-11-03 17:03:45 +02:00
Borys
c80d21fcba
fix: crash if we OOM during migration process (#3968) 2024-10-23 17:04:08 +03:00
Kostas Kyrimis
478a5d476d
chore: disable test_cluster_memory_consumption_migration (#3948)
The test takes more than 10 minutes on CI, which causes the CI run to time out.
2024-10-21 08:41:15 +03:00
Borys
866c82a3fa
test: add test to reproduce a lot of memory consumption during migration (#3939) 2024-10-17 14:23:26 +03:00
Kostas Kyrimis
b19f722011
chore: do not close connections at the end of pytest (#3811)
A common case is that we need to clean up a connection via its .close() method before we exit a test. Otherwise the connection raises a warning that it was left unclosed. However, remembering to call .close() on every connection at the end of each test is cumbersome. Luckily, fixtures in Python can be marked as async, which allows us to:

* cache all clients created by DflyInstance.client()
* clean them all at the end of the fixture in one go
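
The two steps above can be illustrated with a minimal sketch. It is not the code from this commit: `FakeClient`, `FakeInstance`, and `instance_fixture` are hypothetical stand-ins, and an async context manager is used in place of a pytest async fixture so that the sketch runs with the standard library only.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeClient:
    """Hypothetical stand-in for an async client connection."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


class FakeInstance:
    """Hypothetical stand-in for DflyInstance: remembers every client it creates."""

    def __init__(self):
        self.clients = []

    def client(self):
        c = FakeClient()
        self.clients.append(c)  # cache the client so it can be closed later
        return c

    async def close_clients(self):
        for c in self.clients:
            await c.close()
        self.clients.clear()


@asynccontextmanager
async def instance_fixture():
    # With pytest this would be a yield-style async fixture; the shape is the
    # same: set up, yield to the test, then clean up every client in one go.
    instance = FakeInstance()
    try:
        yield instance
    finally:
        await instance.close_clients()


async def demo():
    async with instance_fixture() as inst:
        a, b = inst.client(), inst.client()  # the test never calls close()
    return a.closed and b.closed


all_closed = asyncio.run(demo())
```

Because cleanup lives in the fixture teardown, individual tests no longer need to track or close their own connections.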

---------

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-09-30 09:54:41 +03:00
Borys
93de559977
Update dflycluster slot-migration-status reply (#3707)
* feat: update DFLYCLUSTER SLOT-MIGRATION-STATUS reply
2024-09-15 09:44:40 +03:00
Borys
bae2767707
test: fix test_cluster_replication_migration (#3699) 2024-09-11 23:00:53 +03:00
Borys
35c287b813
test: unskip cluster tests and add debug info (#3681) 2024-09-09 22:21:17 +03:00
Borys
2cc2a23247
fix: deadlock in the cluster migration process (#3653) 2024-09-05 21:55:15 +03:00
Shahar Mike
de5ecc7447
chore: Split --cluster_announce_ip and --replica_announce_ip (#3615)
chore: Split `cluster_announce_ip` and `replica_announce_ip`

This PR partially reverts #3421

Fixes #3541
2024-09-01 12:43:44 +00:00
Kostas Kyrimis
238bf3ee85
fix: disable test_cluster_flushall_during_migration (#3573)
* disable test_cluster_flushall_during_migration
2024-08-26 17:50:49 +03:00
Borys
48a28c3ea3
refactor: set info_replication_valkey_compatible=true (#3467)
* refactor: set info_replication_valkey_compatible=true
* test: mark test_cluster_replication_migration as skipped because it's broken
2024-08-08 21:42:58 +03:00
Vladislav
2ef475865f
test(cluster): Migration replication test (#3417) 2024-08-04 12:45:02 +03:00
Shahar Mike
2aa0b70035
feat(server): Support replica-announce-ip/port (#3421)
* feat: Support `replica-announce-ip`/`port`

Before this PR, we only supported `cluster_announce_ip`.
It's basically the same feature, but used for cluster announcements
instead of replication.

This PR adds support for `replica-announce-ip` and
`replica-announce-port`, which can be set via new flags `--announce_ip=`
and `--announce_port=`. These flags apply to both cluster and replica
announcements.

Tested via running Sentinel, and making sure it is able to connect to
announced ip+port, while it can't connect to announced false /
unavailable ip+port.

Note: this PR deprecates `--cluster_announce_ip`, but continues to
support it. We will remove it in a future version.

Fixes #3380

* fix failing test

* destructure
2024-08-04 12:35:14 +03:00
Borys
e2b6cfb384
chore: skip cluster tests if redis-server wasn't found (#3416)
* chore: skip cluster tests if redis-server wasn't found
2024-08-01 13:04:02 +00:00
Vladislav
f536f8afbd
chore: cancel slot migrations on shutdown (#3405) 2024-07-30 12:47:58 +03:00
Kostas Kyrimis
cd863b89b4
chore: disable cluster_fuzzymigration (#3373)
* mark cluster_fuzzymigration as skipped

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-07-24 11:46:44 +03:00
Roman Gershman
aac90f25b5
fix: failure in test_cluster_fuzzymigration (#3363) 2024-07-22 22:39:41 +03:00
Roman Gershman
4b1574b5c8
chore: fix test_parser_memory_stats flakiness (#3354)
* chore: fix test_parser_memory_stats flakiness

1. Added a robust assert_eventually decorator for pytests
2. Improved the assertion condition in TieredStorageTest.BackgroundOffloading
3. Added total_uploaded stats for tiering that tells how many times offloaded values
   were promoted back to RAM.
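
A retrying assertion decorator of the kind named in item 1 could look like the sketch below. The signature (`times`, `delay`) and the behaviour are assumptions made for illustration, not the helper added by this commit.

```python
import asyncio
import functools


def assert_eventually(times=50, delay=0.01):
    """Hypothetical sketch: re-run an async check until its assertions pass."""

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            for attempt in range(times):
                try:
                    return await func(*args, **kwargs)
                except AssertionError:
                    if attempt == times - 1:
                        raise  # out of retries: surface the last failure
                    await asyncio.sleep(delay)

        return wrapper

    return decorator


counter = {"value": 0}


@assert_eventually(times=10, delay=0.001)
async def check_counter():
    # Simulates a condition that only becomes true on the third poll.
    counter["value"] += 1
    assert counter["value"] >= 3


asyncio.run(check_counter())
attempts = counter["value"]
```

Polling with a bounded number of retries keeps a test from failing on a transient state while still failing once the retry budget is exhausted.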

* chore: skip test_cluster_fuzzymigration
2024-07-22 10:41:26 +00:00
Shahar Mike
2b54fd985f
fix: Cancel outgoing migration when retrying / closing (#3339) 2024-07-19 07:49:49 +00:00
Borys
cad62679a4
Fix blocking commands moved error (#3334)
* fix: BLPOP BZPOP(MIN|MAX) moved error
2024-07-18 20:38:13 +03:00
Borys
3891efac2c
fix: forbid DFLYCLUSTER commands set for emulated cluster mode (#3307)
* fix: forbid DFLYCLUSTER commands set for emulated cluster mode
* feat: add CLUSTER MYID and remove DFLYCLUSTER MYID
* fix(test): __del__ method in python can't be async
* fix: crash and test_replicate_disconnect_cluster
2024-07-16 14:17:28 +03:00
Kostas Kyrimis
bf1b6cef6e
chore: skip test_cluster_flushall_during_migration (#3316)
* skip failing test on ci

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-07-15 09:48:28 +03:00
Borys
84814a7358
fix: fix move error during migration finalization (#3253)
* fix: fix Move error during migration finalization
2024-07-02 14:23:54 +03:00
Kostas Kyrimis
5956275818
chore: replace session wide fixtures with scope (#3251)
* chore: replace session wide fixtures with scope
2024-07-02 10:26:26 +03:00
Shahar Mike
48c6f4bf74
chore: Re-enable previously flaky test (#3196) 2024-06-21 13:12:14 +03:00
Shahar Mike
6024d79bd6
feat(cluster): Support STICK bit in slot migration (#3200) 2024-06-21 08:18:03 +03:00
Shahar Mike
c8f2f253d6
test(cluster): Make sure migration maintains TTL (#3188) 2024-06-20 20:46:38 +03:00
Borys
4e7f6dc6ed
test: improve cluster_fuzzy_migration test (#3197) 2024-06-20 19:09:15 +03:00
Shahar Mike
f66ee5f47d
fix(cluster): Support FLUSHALL while slot migration is in progress (#3173)
* fix(cluster): Support `FLUSHALL` while slot migration is in progress

Fixes #3132

Also do a small refactor to move cancellation logic into
`RestoreStreamer`.
2024-06-20 11:40:23 +03:00
Kostas Kyrimis
d207789610
chore(ci): run replication tests on arm (#3168)
* combine replication tests and reg tests in one flow
* allow replication tests to run on arm
2024-06-18 16:48:35 +03:00
Borys
39dd73fc71
fix: fix bug in cluster/slot_set (#3143)
* fix: fix bug in cluster/slot_set

* fix: fix slot flushes
2024-06-07 14:31:11 +03:00
Borys
66a524a026
test: skip test_cluster_migration_cancel, it is broken (#3146) 2024-06-06 16:33:08 +03:00
Borys
7606af706f
fix: fix RestoreStreamer to prevent buckets skipping #2830 (#3119)
* fix: fix RestoreStreamer to prevent bucket skipping #2830
2024-06-04 11:50:03 +03:00
Borys
644dc3f139
New test for cluster migration: connection issue (#3102)
* test: update test_config_consistency, update test_cluster_data_migration, new cluster migration test for network issues
2024-06-02 09:16:03 +03:00
Borys
b02a789ebf
fix: add timeout for DFLYMIGRATE ACK to prevent deadlock (#3093)
* fix: add timeout for DFLYMIGRATE ACK to prevent deadlock
2024-05-28 17:41:51 +03:00
Borys
0dea257f41
fix: fix cluster incorrect keys status (#3083)
* fix: fix cluster incorrect keys status
2024-05-26 15:10:01 +03:00
Shahar Mike
082aba02ef
fix(cluster-migration): Support cancelling migration right after starting it (#2992)
* fix(cluster-migration): Support cancelling migration right after starting it

This fixes a few small places, but most importantly it does not allow a
migration to start before both the outgoing and incoming side received
the updated config. This solves a few edge cases.

Fixes #2968

* add TODO

* fix test

* gh comments and fixes

* add comment
2024-05-02 15:50:42 +03:00
Borys
415839df79
fix: fix deadlock and slot flush for migration cancel #2968 (#2972)
* fix: fix deadlock and slot flush for migration cancel #2968
2024-04-30 08:44:05 +00:00
Borys
654ec9f1c4
feat: add slot migration error processing (#2957)
* feat: add slot migration error processing
2024-04-29 10:51:23 +03:00
Shahar Mike
322b2e7ac1
fix(test): Unflake fuzzy cluster migration test (#2927)
* WIP WIP WIP: Test if fuzzy migration test is still flaky

* tune down

* rm ci changes
2024-04-19 23:04:01 +03:00
Borys
9a6a9ec198
feat: add ability to reapply config with migration #2924 (#2926)
* feat: add ability to reapply config with migration #2924
2024-04-19 16:21:54 +03:00
Shahar Mike
56965edbe1
feat(cluster): Migration cancellation support (#2869) 2024-04-17 13:19:31 +03:00
Borys
d99b0eda16
feat: retry ACK if the configs are different #2833 (#2906)
* feat: retry ACK if the configs are different #2833
2024-04-16 15:03:30 +03:00
adiholden
9cbe69576e
fix(cluster_replication): replicate redis cluster node bug fix (#2876)
* fix redis replication error handling and set cntx as journal emulated


Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-04-14 22:49:00 +03:00
Shahar Mike
b8693b4805
feat(cluster): Send number of keys for incoming and outgoing migrations. (#2858)
The number of keys in an _incoming_ migration indicates how many keys
were received, while for _outgoing_ it shows the total number. Combining
the two can provide the control plane with percentage.
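
As a rough illustration of how a control plane might combine the two numbers, here is a minimal sketch. The function name and its handling of an empty migration are assumptions, not part of the reply format.

```python
def migration_progress(received_keys: int, total_keys: int) -> float:
    """Hypothetical helper: percentage of keys received by the incoming side."""
    if total_keys <= 0:
        # Assumption: a migration with nothing to send counts as complete.
        return 100.0
    # Clamp, since the two counters are sampled at different moments.
    return min(100.0, 100.0 * received_keys / total_keys)


# Incoming side reports 2500 keys received; outgoing side reports 10000 total.
progress = migration_progress(2500, 10000)
```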

This slightly modified the format of the response.

Fixes #2756
2024-04-08 21:17:03 +03:00
Borys
482bd58787
feat(cluster): add migration removing by config #2835 (#2844) 2024-04-05 11:03:54 +03:00
Borys
7b419c6d10
refactor(cluster): replace sync_id with node_id for slot migration #2835 (#2838) 2024-04-04 10:14:03 +03:00