[00:07:29] FIRING: PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance tf-infra-test in project tf-infra-test - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[00:12:29] RESOLVED: PuppetAgentStaleLastRun: Last Puppet run was over 24 hours ago on instance tf-infra-test in project tf-infra-test - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun
[00:13:55] FIRING: MaxConntrack: Max conntrack at 80.02% on cloudvirt1050:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:16:28] FIRING: InstanceDown: Project tf-infra-test instance tf-infra-test is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[00:21:28] RESOLVED: InstanceDown: Project tf-infra-test instance tf-infra-test is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[00:23:55] RESOLVED: MaxConntrack: Max conntrack at 80.52% on cloudvirt1050:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:29:25] FIRING: MaxConntrack: Max conntrack at 80.1% on cloudvirt1050:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:34:10] RESOLVED: MaxConntrack: Max conntrack at 80.1% on cloudvirt1050:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:45:40] FIRING: MaxConntrack: Max conntrack at 80.46% on cloudvirt1050:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:50:40] RESOLVED: MaxConntrack: Max conntrack at 80.11% on cloudvirt1050:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:52:25] FIRING: MaxConntrack: Max conntrack at 80.88% on cloudvirt1050:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:53:55] FIRING: MaxConntrack: Max conntrack at 90.06% on cloudvirt1050:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:54:03] cloud-services-team: MaxConntrack Netfilter: Maximum number of allowed connection tracking entries alert on cloudvirt1050:9100 - https://phabricator.wikimedia.org/T372693 (phaultfinder) NEW
[00:57:25] RESOLVED: MaxConntrack: Max conntrack at 89.65% on cloudvirt1050:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[00:58:55] RESOLVED: MaxConntrack: Max conntrack at 90.1% on cloudvirt1050:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_conntrack - https://grafana.wikimedia.org/d/oITUqwKIk/netfilter-connection-tracking - https://alerts.wikimedia.org/?q=alertname%3DMaxConntrack
[03:22:47] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.ceph.osd.undrain_node
[03:23:19] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.ceph.osd.undrain_node (exit_code=0)
[03:52:02] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static
[03:54:52] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 29687 bytes in 0.200 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static
[05:31:08] Cloud-VPS, Bitu, Infrastructure-Foundations: Find or create .deb package for mwclient 0.11.0 (or mwclient 0.10.0 with writeapi dependency removed) - https://phabricator.wikimedia.org/T372345#10071136 (taavi) a:taavi I've [[ https://tracker.debian.org/news/1556040/accepted-mwclient-0110-1-source-i...
[05:51:22] Toolforge: [infra,k8s] Upgrade Toolforge Kubernetes to version 1.31 - https://phabricator.wikimedia.org/T372697 (taavi) NEW
[05:51:30] Toolforge: [infra,k8s] Upgrade Toolforge Kubernetes to version 1.31 - https://phabricator.wikimedia.org/T372697#10071166 (taavi)
[05:51:31] Toolforge: [k8s,infra] Upgrade Toolforge to Uwubernetes (1.30) - https://phabricator.wikimedia.org/T362869#10071167 (taavi)
[10:39:53] (update) raymond-ndibe: [jobs-api] multi-replica support for continuous jobs [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/115 (https://phabricator.wikimedia.org/T341066)
[10:59:20] PROBLEM - Wikitech-static main page has content on wikitech-static.wikimedia.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Wikitech-static
[11:02:18] RECOVERY - Wikitech-static main page has content on wikitech-static.wikimedia.org is OK: HTTP OK: HTTP/1.1 200 OK - 29689 bytes in 7.399 second response time https://wikitech.wikimedia.org/wiki/Wikitech-static
[11:32:09] (open) raymond-ndibe: [jobs-cli] remove unknown keys from dump [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/64 (https://phabricator.wikimedia.org/T341066)
[11:32:27] (update) raymond-ndibe: [jobs-cli] remove unknown keys from dump [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/64 (https://phabricator.wikimedia.org/T341066)
[11:40:17] (update) raymond-ndibe: [jobs-api] multi-replica support for continuous jobs [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/115 (https://phabricator.wikimedia.org/T341066)
[12:30:09] Toolforge: disable-tool trying to archive toolsbeta accounts on tools NFS server - https://phabricator.wikimedia.org/T372701 (taavi) NEW
[12:35:15] (update) raymond-ndibe: [jobs-cli] multi-replica support for continuous jobs [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/63 (https://phabricator.wikimedia.org/T341066)
[12:36:26] (update) raymond-ndibe: [jobs-cli] multi-replica support for continuous jobs [repos/cloud/toolforge/jobs-cli] (remove_unknown_keys_in_dump) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/63 (https://phabricator.wikimedia.org/T341066)
[12:36:33] (update) raymond-ndibe: [jobs-cli] multi-replica support for continuous jobs [repos/cloud/toolforge/jobs-cli] (remove_unknown_keys_in_dump) - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/63 (https://phabricator.wikimedia.org/T341066)
[12:38:36] (update) raymond-ndibe: [jobs-api] multi-replica support for continuous jobs [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/115 (https://phabricator.wikimedia.org/T341066)
[12:38:50] (update) raymond-ndibe: [jobs-api] multi-replica support for continuous jobs [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/115 (https://phabricator.wikimedia.org/T341066)
[19:18:57] (PS1) Krinkle: Add missing space to error log prefix [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1063275
[19:20:53] (CR) Krinkle: [C:+2] Add missing space to error log prefix [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1063275 (owner: Krinkle)
[19:21:16] (Merged) jenkins-bot: Add missing space to error log prefix [labs/tools/fileprotectionsync] - https://gerrit.wikimedia.org/r/1063275 (owner: Krinkle)
[23:22:40] Tools: Database query error in Stalk toy - https://phabricator.wikimedia.org/T372711 (Dragoniez) NEW
[23:27:34] Tools: Database query error in Stalk toy - https://phabricator.wikimedia.org/T372711#10071458 (JJMC89) Open→Invalid Issues are tracked at https://github.com/Pathoschild/Wikimedia-contrib/issues.