* remove DbSlice mutex
* add ConditionFlag in SliceSnapshot
* disable compression when big value serialization is on
* add metrics
---------
Signed-off-by: kostas <kostas@dragonflydb.io>
Specifically:
* `INFO REPLICATION` does not list the replicas, but does still show
`connected_slaves`
* `INFO SERVER` does not show `thread_count` and `os`
Fixes #4173
* fix: enforce load limits when loading snapshot
Prevent loading snapshots whose used memory is higher than the max memory limit.
1. Store the used-memory metadata only inside the summary file.
2. Load the summary file before loading anything else, and if the used memory is higher,
abort the load (see the sketch below).
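A minimal sketch of the guard, with illustrative names (`SummaryInfo`, `used_mem_bytes`) rather than Dragonfly's actual types:

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical metadata parsed from the "<basename>-summary.dfs" file.
struct SummaryInfo {
  uint64_t used_mem_bytes = 0;  // memory used by the server that produced the snapshot
};

// The summary file is read first, so the load can be aborted before any shard
// file is touched if the snapshot would not fit into the configured limit.
void CheckLoadLimits(const SummaryInfo& summary, uint64_t max_memory_limit) {
  if (summary.used_mem_bytes > max_memory_limit) {
    throw std::runtime_error(
        "snapshot used memory (" + std::to_string(summary.used_mem_bytes) +
        ") exceeds maxmemory limit (" + std::to_string(max_memory_limit) + ")");
  }
}
```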
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: optimize info command
The INFO command has high latency when returning all the sections,
but often only a single section is required. Specifically,
the SERVER and REPLICATION sections are often fetched by clients
or management components.
This PR:
1. Removes any hops for the `INFO SERVER` command (see the sketch after this list).
2. Removes some redundant stats.
3. Prints latency stats around the GetMetrics call if it took too long.
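A rough sketch of the idea behind item 1, with placeholder helpers rather than the actual server code: a SERVER-only request is answered from state local to the connection thread, while other sections still pay for the cross-shard metrics aggregation.

```cpp
#include <cstdint>
#include <string>
#include <string_view>

// Placeholder types/helpers; the real server aggregates far more state.
struct Metrics {
  uint64_t total_commands = 0;
};

static std::string BuildServerSection() {        // answered from local state, no hops
  return "# Server\r\n";
}

static Metrics AggregateMetricsAcrossShards() {  // the expensive, hop-per-shard path
  return Metrics{};
}

std::string InfoReply(std::string_view section) {
  if (section == "SERVER")
    return BuildServerSection();
  Metrics m = AggregateMetricsAcrossShards();
  return "# Stats\r\ntotal_commands_processed:" + std::to_string(m.total_commands) + "\r\n";
}
```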
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* Update src/server/server_family.cc
Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com>
Signed-off-by: Roman Gershman <romange@gmail.com>
* chore: remove GetMetrics dependency from the REPLICATION section
Also, address comments
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* fix: clang build
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Signed-off-by: Roman Gershman <romange@gmail.com>
Co-authored-by: Shahar Mike <chakaz@users.noreply.github.com>
* chore: change Namespaces to be a global pointer
Before this change, the namespaces object was defined as a global object.
However, it has a non-trivial destructor that is called after main() exits,
and it is quite dangerous to define non-POD objects globally.
For example, if we used LOG(INFO) inside the Clear function, that would crash Dragonfly on exit.
This PR changes it to be a global pointer.
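A minimal illustration of the pattern, using the names from this PR only as placeholders:

```cpp
struct Namespaces {
  ~Namespaces() { /* non-trivial: e.g. logging here would run after main() exited */ }
};

// Problematic: destroyed during static deinitialization, after main() returns,
// when facilities such as the logger may already be torn down.
// Namespaces namespaces;

// Safer: a plain pointer is POD; destruction happens only if and when we choose.
Namespaces* namespaces = nullptr;

int main() {
  namespaces = new Namespaces();
  // ... run the server ...
  // Either delete explicitly while the process is still fully alive,
  // or intentionally leak and let the OS reclaim memory on exit.
  delete namespaces;
}
```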
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Not everything is supported yet, but manual load/save is.
1. Run dragonfly with `--dir gs://bucket/path`
2. In redis-cli:
a. SET foo bar
b. SAVE DF gsdump
c. DFLY LOAD gs://bucket/path/gsdump-summary.dfs
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
The S3 and file expansion logic had some duplicate code.
This PR refactors it before adding GCS support.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* feat: introduce metrics/logs of when pipelining is being throttled
Fixes #3999, following up on the discussion at #3997.
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: get rid of MutableSlice
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: comments
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
This PR introduces the `DEBUG RECVSIZE ENABLE|DISABLE|tid`
command, which allows tracking of request sizes.
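A hypothetical sketch of the kind of per-thread tracking this enables; the real implementation and histogram layout may differ:

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>

// Collection is toggled globally (ENABLE/DISABLE); the histogram of a given
// thread id can then be dumped on demand.
class RecvSizeTracker {
 public:
  void Enable(bool on) { enabled_.store(on, std::memory_order_relaxed); }

  void Record(size_t bytes) {
    if (!enabled_.load(std::memory_order_relaxed))
      return;
    size_t bucket = 0;
    while (bytes >>= 1)  // log2 bucketing: 0-1, 2-3, 4-7, ...
      ++bucket;
    ++buckets_[bucket < buckets_.size() ? bucket : buckets_.size() - 1];
  }

 private:
  std::atomic<bool> enabled_{false};
  std::array<uint64_t, 32> buckets_{};
};
```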
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
The problem: we used file writes in non-direct mode when writing snapshots in epoll mode.
As a result, lots of data was cached in OS memory. Then, during the rename of
"xxx.dfs.tmp" into "xxx.dfs", the OS flushes the file caches and the thread
gets stuck in the rename system call for a long time.
The fix: use DIRECT mode and avoid caching the data in OS caches at all.
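A minimal POSIX sketch of the direct-mode idea, not Dragonfly's actual I/O layer; direct I/O requires aligned buffers, offsets and sizes:

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <cstdlib>
#include <cstring>

// O_DIRECT bypasses the page cache, so nothing is left for a later rename()
// to flush. The write length is padded to the alignment and trimmed afterwards.
int WriteSnapshotChunk(const char* path, const char* data, size_t len) {
  int fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0644);
  if (fd < 0)
    return -1;

  constexpr size_t kAlign = 4096;
  size_t padded = (len + kAlign - 1) / kAlign * kAlign;

  void* buf = nullptr;
  if (posix_memalign(&buf, kAlign, padded) != 0) {
    close(fd);
    return -1;
  }
  memset(buf, 0, padded);
  memcpy(buf, data, len);

  ssize_t written = write(fd, buf, padded);
  if (written == static_cast<ssize_t>(padded))
    ftruncate(fd, len);  // drop the zero padding at the tail

  free(buf);
  close(fd);
  return written == static_cast<ssize_t>(padded) ? 0 : -1;
}
```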
Fixes #3895
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Use an intrusive queue that allows batching of scheduling calls instead of handling each call separately.
This optimization improves latency and throughput by 3-5%.
In addition, we expose batching statistics in the transaction section of INFO.
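An illustrative sketch of the batching pattern (a lock-free MPSC intrusive list), not the exact queue used in the code:

```cpp
#include <atomic>

struct ScheduleNode {
  ScheduleNode* next = nullptr;
  // ... payload: pointer to the transaction to schedule ...
};

class ScheduleQueue {
 public:
  // Returns true if the queue was empty, i.e. the consumer needs a wakeup.
  bool Push(ScheduleNode* n) {
    ScheduleNode* head = head_.load(std::memory_order_relaxed);
    do {
      n->next = head;
    } while (!head_.compare_exchange_weak(head, n, std::memory_order_release,
                                          std::memory_order_relaxed));
    return head == nullptr;
  }

  // Consumer side: grab everything accumulated since the last drain in one shot.
  // The list comes back in reverse push order; reverse it if FIFO matters.
  ScheduleNode* PopAll() {
    return head_.exchange(nullptr, std::memory_order_acquire);
  }

 private:
  std::atomic<ScheduleNode*> head_{nullptr};
};
```

The producer learns from `Push()` whether a wakeup is needed at all, and the consumer pays one atomic exchange per batch instead of one handoff per scheduling call.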
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Today, some of the failures to load an RDB file passed via
`--dbfilename` cause Dragonfly to terminate with an error code. This is
ok and works as expected.
The problem is that the same code path is used for `DFLY LOAD`, which
means that if there's an error loading the file (such as corrupted
file), Dragonfly will exit instead of returning an error code to the
client.
This change fixes that by exiting only in the code path which loads on
init.
Note to reviewer: apparently we can't call `Future::Get()` more than
once, as the first call resets the state of the future and drops the
previously saved value, so we use a Fiber here instead.
* chore: Forbid replicating a replica
We do not support connecting a replica to a replica, but before this PR
we allowed doing so. This PR disables that behavior.
Fixes #3679
* `replicaof_mu_`
* chore: some renames + fix a typo in RETURN_ON_BAD_STATUS
Renames in transaction.h - no functional changes.
Fix a typo in error.h following #3758
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* fix: improve BreakStalledFlowsInShard heuristic
Before this change, we wrote in a single call whatever record chunks we pulled from the channel.
This can be problematic for 1GB chunks, for example, which might take 10 sec to write.
Lately we added a replication breaker on the master side that breaks the full sync after
a predefined threshold has passed, 10 sec by default. To improve the robustness of this
breaker, we now write chunks of up to 1MB and update last_write_time_ns_ more frequently.
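A minimal sketch of the chunked-write loop, with `Sink` standing in for the replication socket:

```cpp
#include <chrono>
#include <cstdint>
#include <string_view>

// Instead of one write for a potentially huge record, write at most 1MB at a
// time and refresh the watchdog timestamp after every chunk so the stall
// detector sees steady progress.
constexpr size_t kFlushChunkSize = 1u << 20;  // 1MB

inline uint64_t NowNs() {
  return std::chrono::duration_cast<std::chrono::nanoseconds>(
             std::chrono::steady_clock::now().time_since_epoch())
      .count();
}

template <typename Sink>
void WriteInChunks(Sink& sink, std::string_view blob, uint64_t& last_write_time_ns) {
  while (!blob.empty()) {
    size_t n = blob.size() < kFlushChunkSize ? blob.size() : kFlushChunkSize;
    sink.Write(blob.substr(0, n));
    last_write_time_ns = NowNs();  // the stall-breaker heuristic reads this
    blob.remove_prefix(n);
  }
}
```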
Also, we added more logs to make replication delays on both sides more visible,
including logs for when replication is broken on the master side.
Unfortunately, this did not help make BreakStalledFlowsInShard more robust, because the
problem now moved to the replica side, which may take 20+ seconds to parse huge values.
Therefore, I increased the threshold for breaking the replication to 30s.
Finally, instrument the GetMetrics call, as it sometimes takes more than 1 sec.
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
For some cases, this map can grow indefinitely.
This change makes it less detailed by making sure that the number of possible keys is bounded.
It can still provide a good summary of the nature of exec transactions.
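One possible bounding scheme, shown only as an illustration of the idea:

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Once the map reaches a fixed capacity, new keys fold into a single catch-all
// bucket, so memory stays bounded while the most common shapes keep their own counters.
class BoundedCounterMap {
 public:
  explicit BoundedCounterMap(size_t max_keys) : max_keys_(max_keys) {
    counts_["other"] = 0;  // the catch-all bucket always exists
  }

  void Inc(const std::string& key) {
    auto it = counts_.find(key);
    if (it != counts_.end()) {
      ++it->second;
    } else if (counts_.size() < max_keys_) {
      counts_[key] = 1;
    } else {
      ++counts_["other"];
    }
  }

 private:
  size_t max_keys_;
  std::unordered_map<std::string, uint64_t> counts_;
};
```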
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* feat: add slave_repl_offset to the replication section.
In Valkey, slave_repl_offset denotes the replication offset on the replica side during the stable sync phase.
During the full sync phase it appears with a value of 0.
In Dragonfly this field appears only after full sync has completed, thus it allows
checking whether Dragonfly has reached the stable sync phase. The value of this field describes the cumulative progress
of all the replication flows and does not directly correspond to master-side metrics.
In addition, this PR fixes a bug in the wait_available_async() function in our replication tests.
This function is intended to wait until a replica reaches stable state, and it did so by sending PINGs until they stop
responding with a LOADING error; the implicit assumption is that the replica is already in the full sync state.
However, it can happen that master_link_status is "up" but the replica has not reached the full sync state yet, and the PING will succeed
just because wait_available_async() was called before full sync started. The whole approach of polling the state is fragile.
Now we use `slave_repl_offset` explicitly to check whether the replica has reached stable state.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* chore: simplify wait_available_async
* chore: comments
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
Stop supporting DflyVersion::VER0, which is more than a year old.
In addition, rename Metrics fields to make them clearer.
General improvements and a fix for the reconnect metric.
Signed-off-by: Roman Gershman <roman@dragonflydb.io>
* feat(cluster): Allow appending RDB to existing store
The goal of this PR is to support loading multiple RDB files into a single server, for example when migrating from a Valkey cluster to Dragonfly with a different number of nodes.
It makes the following changes:
* Removes `DEBUG LOAD`, as we already have `DFLY LOAD`
* Adds `APPEND` option to `DFLY LOAD` (i.e. `DFLY LOAD <filename> APPEND`) that loads an RDB without first flushing the data store, overriding existing keys
* Does not load keys belonging to unowned slots, if in cluster mode (see the sketch below)
Fixes #2840
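A sketch of the per-key slot filter behind the last item; the slot math follows the Redis/Valkey cluster spec, while `IsSlotOwned()` is a hypothetical stand-in for the real ownership lookup:

```cpp
#include <cstdint>
#include <string_view>

constexpr uint16_t kTotalSlots = 16384;

inline uint16_t Crc16(std::string_view s) {  // CRC-16/XMODEM, as used for hash slots
  uint16_t crc = 0;
  for (unsigned char c : s) {
    crc ^= static_cast<uint16_t>(c) << 8;
    for (int i = 0; i < 8; ++i)
      crc = (crc & 0x8000) ? static_cast<uint16_t>((crc << 1) ^ 0x1021)
                           : static_cast<uint16_t>(crc << 1);
  }
  return crc;
}

inline uint16_t KeySlot(std::string_view key) {
  size_t open = key.find('{');  // hashtag rule: hash only {...} if non-empty
  if (open != std::string_view::npos) {
    size_t close = key.find('}', open + 1);
    if (close != std::string_view::npos && close > open + 1)
      key = key.substr(open + 1, close - open - 1);
  }
  return Crc16(key) % kTotalSlots;
}

inline bool IsSlotOwned(uint16_t /*slot*/) {
  return true;  // placeholder; the real code consults the local cluster config
}

inline bool ShouldLoadKey(std::string_view key, bool cluster_mode) {
  return !cluster_mode || IsSlotOwned(KeySlot(key));
}
```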
* chore: reduce pipelining latency by reusing existing shard fibers
To prove the benefits, run `./dfly_bench --pipeline=50 -n 20000 --ratio 0:1 --qps=0 --key_maximum=1`
Before: the average pipelining latency was 10ms.
After: the average pipelining latency is 5ms.
Avg latency: pipelined_latency_usec / total_pipelined_squashed_commands.
Also, improved the counting of squashed commands so that only commands that were actually squashed are counted.
---------
Signed-off-by: Roman Gershman <roman@dragonflydb.io>