Commit graph

2977 commits

Author SHA1 Message Date
Roman Gershman
95cd9dfb4c
chore: update helio and improve our stack overflow resiliency (#4349)
1. Run CI/Regression tests with HELIO_STACK_CHECK=4096.
   This will crash if a fiber's stack usage goes below this limit.
2. Increase shard queue stack size to 64KB
3. Increase fiber stack size to 40KB on Debug builds.
4. Updated helio has some changes around the TLS socket code.
   In addition, we add a helper script that generates self-signed certificates, which helps with local development.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-12-23 08:13:45 +00:00
Roman Gershman
28848d0be2
fix: avoid on-stack allocation of lz4 compression context (#4322)
Fixes #4245

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-12-23 09:54:20 +02:00
Stepan Bagritsevich
1fa9a47a86
refactor(search_family): Add Aggregator class (#4290)
* refactor(search_family): Add Aggregator class

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* fix(aggregator_test): Fix tests failing

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: address comments

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: Restore the previous comment

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: address comments 2

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: address comments 3

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* fix(aggregator): Simplify comparator for the case when one of the values is not present

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

---------

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>
2024-12-23 08:43:48 +04:00
Stepan Bagritsevich
8d66c25bc6
chore(rax_tree): Introduce raxFreeWithCallback call in RaxTreeMap destructor (#4255)
Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>
2024-12-23 08:42:47 +04:00
Stepan Bagritsevich
612d50df3b
refactor(rdb_saver): Add SnapshotDataConsumer to SliceSnapshot (#4287)
* refactor(rdb_saver): Add SnapshotDataConsumer to SliceSnapshot

fixes #4218

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: address comments

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

---------

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>
2024-12-23 08:42:13 +04:00
romange
d16209461c chore(helm-chart): update to v1.26.0 2024-12-22 12:53:18 +00:00
Borys
76000b9672
fix: test_network_disconnect_during_migration (#4345) 2024-12-22 13:02:53 +02:00
Stepan Bagritsevich
c5ef553ffc
fix(search_family): Fix logging in ParseFieldWithAtSign (#4343)
Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>
2024-12-19 14:11:13 +00:00
Shahar Mike
79c4a1809b
fix: Stack overflow in DFLYCLUSTER CONFIG (#4342)
fix: Stack overflow in `DFLYCLUSTER CONFIG`

It's fine to use the heap in such cases, latency doesn't matter.
2024-12-19 09:56:33 +02:00
adiholden
e462fc0401
fix(server): use compression for non big values (#4331)
* fix server: use compression for non big values
---------

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-12-18 22:03:45 +02:00
Roman Gershman
904d21d666
fix: add content-type for metrics response (#4340)
chore: add content-type for metrics response.

Also, update the local stack to use Prometheus 3.0.
Finally, hex-escape arguments when logging an error for a command.

Fixes #4277

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-12-18 19:12:00 +00:00
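Note on #4340: the message mentions hex-escaping command arguments before logging them. A minimal Python sketch of that idea follows. It is not the Dragonfly implementation (which is C++); the function name `hex_escape` and the exact `\xNN` output format are assumptions made for illustration.

```python
def hex_escape(data: bytes) -> str:
    """Keep printable ASCII as-is and render every other byte as \\xNN (hypothetical format)."""
    out = []
    for b in data:
        if 0x20 <= b < 0x7F and b != ord("\\"):
            out.append(chr(b))
        else:
            # Control bytes, non-ASCII bytes and the backslash itself are escaped,
            # so the logged line stays single-line and unambiguous.
            out.append(f"\\x{b:02x}")
    return "".join(out)


print(hex_escape(b"SET key\r\n\x00"))
```

Escaping the backslash as well keeps the output reversible: a literal `\x` in the input cannot be confused with an escape produced by the logger.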
adiholden
3f68028c08
fix(pytest): call stop for all instances even if stop raises an exception (#4339)
Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-12-18 18:03:37 +02:00
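Note on #4339: the pattern named in the title (stop every instance even when one `stop()` raises) can be sketched as below. This is an illustrative sketch only; `stop_all` and `FakeInstance` are hypothetical names, not the helpers used in the Dragonfly test suite, and re-raising the first error at the end is an assumed policy.

```python
def stop_all(instances):
    """Call stop() on every instance; re-raise the first error only after all were attempted."""
    first_error = None
    for instance in instances:
        try:
            instance.stop()
        except Exception as exc:  # keep going so later instances are still stopped
            if first_error is None:
                first_error = exc
    if first_error is not None:
        raise first_error


class FakeInstance:
    """Stand-in for a server instance, used only to exercise stop_all()."""

    def __init__(self, fail=False):
        self.fail = fail
        self.stopped = False

    def stop(self):
        self.stopped = True
        if self.fail:
            raise RuntimeError("stop failed")
```

With a plain loop and no `try`/`except`, a failure in the second instance would leave the third one running, which is the situation the fix avoids.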
romange
682a3f6b15 chore(helm-chart): update to v1.25.6 2024-12-18 14:00:03 +00:00
Borys
6a7931985b
fix: cluster tests stability (#4338) 2024-12-18 13:43:45 +00:00
Shahar Mike
32d71071ae
chore: Disable failing test (#4337) 2024-12-18 12:54:33 +00:00
adiholden
e46717248d
fix (regression tests): skip flaky test (#4336)
skip flaky test

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-12-18 10:35:47 +00:00
Roman Gershman
04e21c07da
chore: fix wording around the dispatch fiber in dragonfly_connection (#4333)
Replace dispatch with async because we already have dispatch fiber in proactor code.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-12-18 09:22:55 +00:00
Borys
d0d720a375
test: skip test_network_disconnect_during_migration because it is uns… (#4334)
test: skip test_network_disconnect_during_migration because it is unstable
2024-12-18 08:38:03 +00:00
Roman Gershman
c22c9448b5
fix: do not check-fail in OpRestore (#4332)
fix: do not check-fail OpRestore

In some rare cases, we reach an inconsistent state inside OpRestore where a key already exists even though it should not.
In that case, we log the error instead of crashing the server. In addition, we update the existing entry to the latest restored value.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-12-18 09:53:03 +02:00
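Note on #4332: the behavior change described above (log and overwrite instead of a fatal check) can be sketched in a few lines of Python. This is a simplified model, assuming a plain dict as the keyspace; `restore` and the logger name are hypothetical and do not mirror the C++ code in OpRestore.

```python
import logging

log = logging.getLogger("restore_sketch")


def restore(db, key, value):
    """Store a restored value; if the key unexpectedly exists, log it and overwrite."""
    if key in db:
        # Formerly a fatal check in this model; now the inconsistency is only reported.
        log.error("restore: key %r already exists, overwriting", key)
    db[key] = value  # the latest restored value always wins
    return db
```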
Roman Gershman
bf410b6e0b
chore: add ability to track connections stuck at send (#4330)
* chore: add ability to track connections stuck at send

Add send_delay_seconds/send_delay_ms metrics.

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Signed-off-by: Roman Gershman <romange@gmail.com>
Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com>
2024-12-18 08:56:36 +02:00
Borys
15b293a7ec
fix: crash during getting info about replication (#4328) 2024-12-18 08:44:24 +02:00
Roman Gershman
19164badf9
fix: potential OOM when first request sent in small bits (#4325)
Before: if socket data arrived in small bits, then CheckForHttpProto would grow
io_buf_ capacity exponentially with each iteration. For example, the test_match_http test
easily causes OOM.

This PR ensures that there is always a buffer available - but it grows linearly with the input size.
Currently, the total input in CheckForHttpProto is limited to 1024.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-12-17 13:26:33 +02:00
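Note on #4325: the difference between the two growth patterns can be illustrated with a small Python model. This is a sketch under stated assumptions, not the actual CheckForHttpProto code: the function names, the initial capacity, the 16-byte reserve and the doubling factor are all hypothetical; only the 1024-byte input limit comes from the commit message.

```python
MAX_HTTP_PROBE = 1024  # total input limit mentioned in the commit message


def capacity_doubling(chunk_sizes, initial=16):
    """Assumed 'before' policy: capacity doubles on every read iteration."""
    capacity = initial
    for _ in chunk_sizes:
        capacity *= 2  # grows even if only one byte arrived
    return capacity


def capacity_linear(chunk_sizes, initial=16, reserve=16):
    """Assumed 'after' policy: keep `reserve` free bytes, so capacity tracks input size."""
    capacity, used = initial, 0
    for n in chunk_sizes:
        used += n
        if used >= MAX_HTTP_PROBE:
            break  # stop probing once the input limit is reached
        if capacity - used < reserve:
            capacity = used + reserve
    return capacity


one_byte_reads = [1] * 40
print(capacity_doubling(one_byte_reads))
print(capacity_linear(one_byte_reads))
```

In this model, forty one-byte reads push the doubling policy to 16 * 2**40 bytes, while the linear policy ends at 56 bytes, which is why input arriving in small pieces could exhaust memory before the fix.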
adiholden
0fe5e86a1a
fix(test): seeder test remove check (#4320)
fix test: seeder test remove check

Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-12-16 16:12:49 +02:00
Shahar Mike
dfd942d749
fix: Do not preempt on dispatcher fiber (#4323) 2024-12-16 15:39:35 +02:00
Roman Gershman
03516c2752
chore: factor out CompressorImpl into separate files (#4319)
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-12-16 12:11:44 +00:00
Borys
8237d8fa81
refactor: remove serialization_max_chunk_size for cluster tests (#4316) 2024-12-16 13:42:56 +02:00
Shahar Mike
af04b558db
fix: Remove hardcoded @assert_eventually 100 times retry (#4318)
This was a subtle and minor bug. Nice catch Borys!
2024-12-16 10:40:45 +00:00
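Note on #4318: a retry decorator that honors its `times` argument instead of a hardcoded count could look like the sketch below. This is not the helper from the Dragonfly test utilities; the signature, the default values and the `delay` parameter are assumptions for illustration.

```python
import asyncio
import functools


def assert_eventually(wrapped=None, *, times=100, delay=0.01):
    """Retry an async assertion up to `times` attempts before letting it fail."""
    if wrapped is None:
        # Used with arguments, e.g. @assert_eventually(times=5)
        return functools.partial(assert_eventually, times=times, delay=delay)

    @functools.wraps(wrapped)
    async def wrapper(*args, **kwargs):
        for attempt in range(times):  # uses the caller's value, not a fixed 100
            try:
                return await wrapped(*args, **kwargs)
            except AssertionError:
                if attempt == times - 1:
                    raise  # out of attempts: surface the last failure
                await asyncio.sleep(delay)

    return wrapper


calls = 0


@assert_eventually(times=3, delay=0)
async def check():
    global calls
    calls += 1
    assert calls >= 3  # succeeds on the third attempt


asyncio.run(check())
```

The bug class described in the commit is a loop bound that ignores the parameter; here the loop is driven by `times` directly, so `@assert_eventually(times=3)` really stops after three attempts.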
Roman Gershman
53d6b64233
chore: factor out rdb_load utilities into separate files (#4315)
* chore: factor out rdb_load utilities into separate files

rdb_load.cc is huge and contains many auxiliary classes.
This PR moves DecompressImpl and ErrorRdb code into detail/

It also fixes minor bugs around error conditions with de-compression:
a. Do not check-fail on invalid opcode and return error_code instead.
b. Print LZ4 errors correctly.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>

* chore: fixes

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-12-16 09:16:02 +00:00
adiholden
027eff2ad3
server: report redis version 7.2.0 to support Sidekiq (#4286)
Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-12-16 11:06:08 +02:00
Borys
5e2d0d32b8
test: update logs and test for debug purpose (#4309)
test: update logs and test for debug purpose
2024-12-16 10:25:23 +02:00
Roman Gershman
84fa6bcb73
chore: improve parser state machine (#4313)
* chore: improve parser state machine

1. Separate argument type parsing from argument parsing itself.
2. Handle strings of length 1.

This is done in preparation for improving the parser contract -
so that when it returns INPUT_PENDING, it consumes the entire input.

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-12-16 10:12:03 +02:00
Stepan Bagritsevich
9d6b2a133c
fix(search_family): Fix FT.AGGREGATE output (#4311)
Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>
2024-12-15 21:05:49 +04:00
Shahar Mike
027d76c13d
fix: Protect BumpUp() from running in parallel to serialization (#4307)
* protect BumpUp() from running in parallel to serialization
2024-12-13 11:06:31 +02:00
Borys
55bc981a7e
feat: add migration_finalization_timeout_ms flag (#4301) 2024-12-13 11:05:27 +02:00
Roman Gershman
53637790e8
fix: circular dependency in qlist (#4302)
* fix: circular dependency in qlist

fixes #4294

Signed-off-by: Roman Gershman <roman@dragonflydb.io>

* chore: fixes

Signed-off-by: Roman Gershman <roman@dragonflydb.io>

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-12-13 11:04:14 +02:00
Roman Gershman
f564513235
chore: update reflex version and fix its build on alpine (#4304) 2024-12-13 10:03:11 +02:00
Kostas Kyrimis
b37287bf14
chore: test metrics for huge value serialization (#4262)
* fix seeder bugs
* add test
* add assertions for huge value metrics

Signed-off-by: kostas <kostas@dragonflydb.io>
2024-12-12 14:19:14 +02:00
Borys
f892d9b7fb
fix: increase cluster migration default timeout (#4293) 2024-12-11 14:39:41 +00:00
Stepan Bagritsevich
76f79f0e0b
fix(search_family): Remove the output of extra fields in the FT.AGGREGATE command (#4231)
* fix(search_family): Remove the output of extra fields in the FT.AGGREGATE command

fixes dragonflydb#4230

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

* refactor: address comments

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>

---------

Signed-off-by: Stepan Bagritsevich <stefan@dragonflydb.io>
2024-12-11 11:21:20 +00:00
romange
1e3d9de0f6 chore(helm-chart): update to v1.25.5 2024-12-11 09:57:16 +00:00
Shahar Mike
e63613f5ed
fix: mismatch new-delete in unit test (#4288) 2024-12-11 08:30:58 +00:00
Roman Gershman
5fee668391
chore: add active time to stream consumers (#4285)
* chore: add active time to stream consumers

Adjust XINFO command to output active-time property.
Store active-time and switch to RDB_TYPE_STREAM_LISTPACKS_3 if FLAGS_stream_rdb_encode_v2
is enabled.
---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-12-10 14:11:20 +02:00
Shahar Mike
0562796cac
fix: Do not attempt to defrag StringSet as a StringMap (#4283)
That'd be a total waste of time and energy, not to mention you'll crash.

Fixes #4167
2024-12-10 08:44:52 +00:00
Roman Gershman
44e27efc00
chore: fixes the parse error for xread/xreadgroup with unbalanced ids (#4266)
Partially addresses #4193

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-12-10 10:25:19 +02:00
Roman Gershman
f428dc31be
fix: support loading of 7.x streams correctly (#4281)
Now rdb_load supports RDB_TYPE_STREAM_LISTPACKS, RDB_TYPE_STREAM_LISTPACKS_2 and RDB_TYPE_STREAM_LISTPACKS_3 formats.
rdb_save still saves with RDB_TYPE_STREAM_LISTPACKS format - we want to release the DF version that can load everything first, and
then update the replication format in the next versions.

Also, update rdb_test.cc

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-12-10 09:57:36 +02:00
Roman Gershman
4428480a4e
chore: update command interface for main_service commands (#4265)
Clean up command_registry interface as well.

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-12-09 23:09:22 +00:00
Roman Gershman
bf4f45a7c2
chore: let resp parser provide more useful logs (#4273)
* chore: let resp parser provide more useful logs

1. More warning logs around bad BAD_ARRAYLEN messages
2. Lift the restriction around big bulk strings and log a warning instead.
3. Pull helio

Probably fixes #4213

---------

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
2024-12-09 18:12:07 +02:00
adiholden
03d679ac31
fix(server): don't apply eviction on rss over limit (#4276)
Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-12-09 14:19:25 +02:00
adiholden
d2f479b5da
fix cluster: migration traverse bug (#4279)
Signed-off-by: adi_holden <adi@dragonflydb.io>
2024-12-09 13:22:02 +02:00
Borys
51e16b2ceb
fix: prohibit read commands during takeover (#4267) 2024-12-09 13:02:15 +02:00