[00:32:56] FIRING: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of cpu - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity [01:02:56] RESOLVED: ToolforgeKubernetesCapacity: Kubernetes cluster k8s.tools.eqiad1.wikimedia.cloud:6443 in risk of running out of cpu - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesCapacity - https://grafana.wmcloud.org/d/8GiwHDL4k/kubernetes-cluster-overview?orgId=1 - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesCapacity [01:13:34] FIRING: [3x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [05:48:34] FIRING: [3x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [06:18:34] FIRING: [3x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [08:56:37] 10Tool-community-activity-alerts-system: Adding an authentication mechanism - OAuth to the tool so that only wikimedia users can login and add/change labels to the peaks in graph - https://phabricator.wikimedia.org/T400555 (10Kalli_navya) 03NEW [09:13:34] FIRING: [3x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [09:28:34] FIRING: [4x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [09:53:34] FIRING: [4x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [10:08:34] FIRING: [4x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [10:13:34] FIRING: [4x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [10:18:34] FIRING: [4x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [11:34:30] 10Tool-community-activity-alerts-system: Adding Annotation layer on the dashboard - https://phabricator.wikimedia.org/T398996#11037350 (10Nazalnihad) Added annotation input in hover for the label and update the database [12:15:37] 10Tool-community-activity-alerts-system: UI Enhancements in phase 1 - https://phabricator.wikimedia.org/T398998#11037358 (10ecoholic) 05Open→03Resolved [12:16:07] 10Tool-community-activity-alerts-system: Extending the dashboard to report abonormalities in editor counts - https://phabricator.wikimedia.org/T398999#11037359 (10ecoholic) 05Open→03In progress [12:17:59] 10Tool-community-activity-alerts-system: Experiment with alerting and anomaly detection techniques - https://phabricator.wikimedia.org/T399000#11037362 (10Aman03062006) 05Open→03In progress [12:18:03] 10Tool-community-activity-alerts-system: Adding Annotation layer on the dashboard - https://phabricator.wikimedia.org/T398996#11037363 (10Nazalnihad) 05Open→03Resolved a:03Nazalnihad [12:18:50] 10Tool-community-activity-alerts-system: Pandas to Polars migration - https://phabricator.wikimedia.org/T398997#11037366 (10Harikrshnaa) 05Open→03Resolved a:03Harikrshnaa [12:23:28] 10Tool-community-activity-alerts-system: Adding an authentication mechanism - OAuth to the tool so that only wikimedia users can login and add/change labels to the peaks in graph - https://phabricator.wikimedia.org/T400555#11037371 (10Tsrq1234) 05Open→03In progress a:03Tsrq1234 [12:26:08] 10Tool-community-activity-alerts-system: Adding an authentication mechanism - OAuth to the tool so that only wikimedia users can login and add/change labels to the peaks in graph - https://phabricator.wikimedia.org/T400555#11037374 (10naveen28204280) [13:48:34] FIRING: [4x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [13:49:01] 10Tool-inteGraality: Retrieving labels via SPARQL tanks query performance - https://phabricator.wikimedia.org/T400480#11037398 (10Nono314) It's always better to have as few statements as possible withn Group By, yes. This should be as efficient but dtill be standard sparql (without WITH block) : SELECT ?grouping... [14:04:56] FIRING: ProbeDown: Service tools-k8s-haproxy-5:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-5:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [14:09:56] FIRING: [2x] ProbeDown: Service tools-k8s-haproxy-5:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-5:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [14:19:56] RESOLVED: [2x] ProbeDown: Service tools-k8s-haproxy-5:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-5:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [14:28:34] FIRING: [4x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [14:43:34] FIRING: [4x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [14:51:50] FIRING: ProbeDown: Service tools-static-15:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-15:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [14:56:51] RESOLVED: ProbeDown: Service tools-static-15:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-15:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [15:03:27] 10PAWS: [Bug] PAWS server not starting - https://phabricator.wikimedia.org/T400542#11037422 (10Anoop) [15:09:20] FIRING: [3x] ProbeDown: Service tools-k8s-haproxy-5:30000 has failed probes (http_admin_toolforge_org_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [15:14:21] FIRING: [2x] ProbeDown: Service tools-k8s-haproxy-5:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-5:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [15:19:21] RESOLVED: [2x] ProbeDown: Service tools-k8s-haproxy-5:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-5:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [15:52:04] 10Tool-community-activity-alerts-system: Experiment with alerting and anomaly detection techniques - https://phabricator.wikimedia.org/T399000#11037452 (10JJMC89) 05In progress→03Open [17:15:34] (03CR) 10Eugene233: "recheck" [labs/tools/Isa] - 10https://gerrit.wikimedia.org/r/1134334 (owner: 10Ketulucas) [18:18:34] FIRING: [3x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [18:33:34] FIRING: [4x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [18:43:34] FIRING: [4x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [18:48:34] FIRING: [4x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [19:03:34] FIRING: [4x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [19:06:56] FIRING: ProbeDown: Service tools-k8s-haproxy-5:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-5:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [19:11:56] RESOLVED: ProbeDown: Service tools-k8s-haproxy-5:30000 has failed probes (http_admin_toolforge_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-k8s-haproxy-5:30000 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown [19:13:34] FIRING: [4x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [19:31:47] 10PAWS: [bug] - https://phabricator.wikimedia.org/T400580 (10cosmo) 03NEW [20:38:34] FIRING: [3x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [21:15:10] Change on 12wikitech.wikimedia.org a page Help:Toolforge/Running jobs was modified, changed by Dvorapa link https://wikitech.wikimedia.org/w/index.php?diff=2326910 edit summary: /* Job logs */ fix broken appearance [21:25:30] FIRING: PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance toolsbeta-harbor-2 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun [22:03:34] FIRING: [3x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [23:28:34] FIRING: [3x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [23:33:34] FIRING: [4x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess [23:43:34] FIRING: [4x] ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-32 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcess