-
Notifications
You must be signed in to change notification settings - Fork 926
use singleton ratelimiter for dynamically created clients #6192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR Overview
This PR implements a new rate limiting mechanism by switching from setting QPS and Burst values directly on REST configurations to using a singleton ratelimiterGetter based on the token bucket algorithm. The changes span multiple components (agent, aggregated-apiserver, metrics-adapter, scheduler, controller-manager, etc.), ensuring that dynamically created clients consistently use the new rate limiter setup.
- Replace direct QPS/Burst assignment with flowcontrol.NewTokenBucketRateLimiter.
- Update client and dynamic client setup functions to accept a ClientOption containing the new RateLimiterGetter.
- Amend tests to align with the new ClientOption structure.
Reviewed Changes
File | Description |
---|---|
cmd/agent/app/agent.go | Updates kubeconfig construction to use RateLimiter. |
cmd/aggregated-apiserver/app/options/options.go | Switches REST config rate limiter to token bucket mechanism. |
cmd/metrics-adapter/app/options/options.go | Modifies both REST config and metrics controller to use the new limiter. |
cmd/scheduler-estimator/app/scheduler-estimator.go | Applies new rate limiter when building dynamic clients. |
cmd/descheduler/app/descheduler.go | Similar rate limiter update for client configuration. |
cmd/scheduler/app/scheduler.go | Replaces QPS/Burst assignment with rate limiter initialization. |
cmd/karmada-search/app/karmada-search.go | Updates client config in search component to use the rate limiter. |
cmd/controller-manager/app/controllermanager.go | Adjusts multiple client creation paths to use new ClientOption. |
cmd/webhook/app/webhook.go | Refactors webhook config to adopt rate limiter setup. |
pkg/metricsadapter/multiclient/client.go | Introduces nil-check logic before applying the new RateLimiterGetter. |
pkg/controllers/status/cluster_status_controller_test.go | Updates test client options to remove explicit QPS/Burst values. |
pkg/controllers/status/cluster_status_controller.go | Changes function signatures to pass ClientOption to dynamic client setup. |
pkg/controllers/status/work_status_controller.go | Modifies dynamic client creation to include the new ClientOption. |
pkg/search/controller.go | Adjusts dynamic client builder to pass nil for ClientOption where applicable. |
pkg/controllers/context/context.go | Adds ClientOption to the controller context structure. |
pkg/controllers/mcs/service_export_controller.go | Updates dynamic client setup to use ClientOption. |
pkg/controllers/multiclusterservice/endpointslice_collect_controller.go | Ensures dynamic client creation includes ClientOption. |
Copilot reviewed 26 out of 26 changed files in this pull request and generated 2 comments.
Signed-off-by: zach593 <zach_li@outlook.com>
592517f
to
410c5d4
Compare
Codecov ReportAttention: Patch coverage is
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## master #6192 +/- ##
==========================================
- Coverage 47.97% 47.95% -0.03%
==========================================
Files 674 675 +1
Lines 55841 55886 +45
==========================================
+ Hits 26789 26798 +9
- Misses 27305 27334 +29
- Partials 1747 1754 +7
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
/remove-kind feature I think the issue with the rate limiter not working is more like a bug, please feel free to correct me if anyone thinks it should not be. |
/assign |
} | ||
clusterClientSet.KubeClient = kubeclientset.NewForConfigOrDie(clusterConfig) | ||
} | ||
return &clusterClientSet, nil | ||
} | ||
|
||
// NewClusterDynamicClientSetFunc is a function that returns a dynamic client for the given member cluster. | ||
type NewClusterDynamicClientSetFunc = func(clusterName string, client client.Client, clientOption *ClientOption) (*DynamicClusterClient, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the difference between removing the equal sign and defining the function variable directly and the current definition method?
type NewClusterDynamicClientSetFunc func(clusterName string, client client.Client, clientOption *ClientOption) (*DynamicClusterClient, error)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no difference in actual results. I personally prefer the way to use type alias (with equal sign) because it changes nothing, it just acts as an abbreviation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Genericly lgtm, ask @RainbowMango to help a look
/cc @RainbowMango
OK. Trying my best... I'm not sure if I will have enough time for this soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/assign
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 20 out of 20 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
pkg/util/membercluster_client_test.go:198
- [nitpick] Add tests for scenarios where ClientOption includes a non-nil RateLimiterGetter to ensure that the rate limiter is correctly applied to the client.
clientOption: &ClientOption{}
clusterConfig, err := BuildClusterConfig(clusterName, clusterGetter(client), secretGetter(client)) | ||
if err != nil { | ||
return nil, err | ||
} | ||
var clusterClientSet = DynamicClusterClient{ClusterName: clusterName} | ||
|
||
if clusterConfig != nil { | ||
if clientOption != nil { | ||
if clientOption.RateLimiterGetter != nil { | ||
clusterConfig.RateLimiter = clientOption.RateLimiterGetter(clusterName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider validating that clientOption.RateLimiterGetter is non-nil or providing a fallback default rate limiter so that client creation always enforces rate limiting.
clusterConfig.RateLimiter = clientOption.RateLimiterGetter(clusterName) | |
clusterConfig.RateLimiter = clientOption.RateLimiterGetter(clusterName) | |
} else { | |
clusterConfig.RateLimiter = flowcontrol.NewTokenBucketRateLimiter(10, 100) // Default rate limiter |
Copilot is powered by AI, so mistakes are possible. Review output carefully before use.
// RateLimiterGetter is a struct to get rate limiter. | ||
type RateLimiterGetter struct { | ||
limiters map[string]flowcontrol.RateLimiter | ||
mu sync.Mutex | ||
qps float32 | ||
burst int | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This structure looks weird, as RateLimiterGetter
actually manages and holds multiple rate-limiter instances; it should not hold the qps
and burst
, which belong to a specific rate-limiter. Otherwise lead to unexpected behavior, think the following flow:
- SetLimits(10, 20)
- SetLimits(15, 30)
- GetRateLimiter("member-cluster") // the rate-limiter will be initilized with qps=15, burst=30
What type of PR is this?
/kind bug
What this PR does / why we need it:
see #6094, this PR add the a ratelimiterGetter singleton for dynamically created cluster clients, to let the rate limiter to more accurately limit the total qps of multiple dynamically created clients .
Which issue(s) this PR fixes:
Fixes #6094
Special notes for your reviewer:
I found it difficult to test using mock client, because the existing fake client implementations do not accept ratelimiter parameters. However, I added some tests in
pkg/util/ratelimiter_test.go
to simulate and illustrate the final effect.If we want to test the rate limit more completely, we may need to introduce envTest (a sub package of controller-runtime), but this requires more general considerations, because the impact is large, so it will not be expanded in this PR.
Does this PR introduce a user-facing change?: