-
Notifications
You must be signed in to change notification settings - Fork 230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add a metric for eviction number in cluster queue #1955
Conversation
Hi @lowang-bh. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
✅ Deploy Preview for kubernetes-sigs-kueue canceled.
|
/assign @alculquicondor @astefanutti |
/ok-to-test |
cc @mimowo |
pkg/metrics/metrics.go
Outdated
@@ -107,21 +107,34 @@ The label 'result' can have the following values: | |||
}, []string{"cluster_queue"}, | |||
) | |||
|
|||
EvictedWorkloads = prometheus.NewGaugeVec( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add some test for the metric, either unit test or integration. I know we have some integration tests for metrics already, so extending an existing integration test to assert of the metrics sounds good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have added the unit test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. My last ask would be that the unit test verifies ReportEvictedWorkloads
works ok, still it would be good to assert in integration tests to make sure it is called.
a865b91
to
8fe0aca
Compare
8fe0aca
to
a31c1d4
Compare
/assign @mimowo |
@lowang-bh is there any progress on the remaining comments:
Let us know if there are any complications with them, we would like to include this feature in 0.7 |
6793dd8
to
fb00e3f
Compare
e9bceb2
to
733992a
Compare
38abe80
to
8380b67
Compare
@@ -293,6 +294,9 @@ var _ = ginkgo.Describe("SchedulerWithWaitForPodsReady", func() { | |||
}) | |||
g.Expect(k8sClient.Status().Update(ctx, prodWl)).Should(gomega.Succeed()) | |||
}, util.Timeout, util.Interval).Should(gomega.Succeed(), "Job reconciler should add an Evicted condition with InactiveWorkload to the Workload") | |||
|
|||
// ReconcileGenericJob didn't run here | |||
// util.ExpectEvictedWorkloadsTotalMetric(prodClusterQ.Name, kueue.WorkloadEvictedByDeactivation, 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this requires some investigation, or explanation why we don't see the metric bumped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
l looked closer into this, and it appears to be fixed here: #2131. The metric should also be bumped into the new section in the workload_controller
now.
Additionally, IIUC we can now remove the block above, please do so: TODO: Once we move a logic to issue the Eviction with InactiveWorkload reason, we need to remove the below updates.
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
l looked closer into this, and it appears to be fixed here: #2131. The metric should also be bumped into the new section in the
workload_controller
now.
Done.
@lowang-bh friendly ping if you can find time to rebase and address the remaining comments. |
Signed-off-by: lowang-bh <[email protected]> Update pkg/metrics/metrics.go Co-authored-by: Aldo Culquicondor <[email protected]> Update pkg/metrics/metrics.go Co-authored-by: Aldo Culquicondor <[email protected]> add testcase about EvictedWorkloads, and record metrics when patch condition succeed Signed-off-by: lowang-bh <[email protected]> update metrics.md for docs Signed-off-by: lowang-bh <[email protected]> add EvictedWorkloadsTotal metrics test in integration Signed-off-by: lowang-bh <[email protected]>
Signed-off-by: lowang-bh <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
Thanks
LGTM label has been added. Git tree hash: ecc5316bb277453c56086e3b0ab44fcc0eee124f
|
/approve |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alculquicondor, lowang-bh, mimowo The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/kind feature
|
What type of PR is this?
/kind feature
What this PR does / why we need it:
Expose a metrics to record workloads eviction number in cluster queue
Which issue(s) this PR fixes:
Fixes #1741
Special notes for your reviewer:
Does this PR introduce a user-facing change?