[00:40:13] Wikibugs: Update wikibugs's Gerrit ssh host keys - https://phabricator.wikimedia.org/T257383#9618077 (bd808) If we land the currently proposed patches for {T359096} this task will become obsolete as the asyncssh config in that MR ignores host keys entirely. This is not risky in my opinion as the bot is only...
[00:45:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[00:48:43] Wikibugs: Tasks with reduced visibility (logged-in-only) are reported incorrectly - https://phabricator.wikimedia.org/T90488#9618082 (bd808) Open→Declined This has likely been fixed by {T1176} for tasks that are visible to any authenticated account. Marking as declined however as I concur with T90488...
[01:00:16] (OpenstackAPIResponse) firing: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[01:36:22] (HAProxyBackendUnavailable) firing: HAProxy service nova-api_backend backend cloudcontrol1005.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[01:36:43] Wikibugs: Some projects get lost - https://phabricator.wikimedia.org/T90267#9618110 (bd808) Open→Resolved With the current codebase `wikibugs2 -vv phorge --ask --start-from 6118017870098117805` generates this "useful_info" record: `lang=json {'url': 'https://phabricator.wikimedia.org/T87282#1055045',...
[01:43:41] (CloudVPSDesignateLeaks) firing: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[01:49:55] Wikibugs: Allow configuring which priority of tasks are sent to a channel - https://phabricator.wikimedia.org/T177798#9618114 (bd808) p:Triage→Medium The ChannelFilter work done in {d8b582cba4636583e03e8cc37a818018ba56f753} should make this even easier. {543aa2b0b1052579918f23be4eaac7dde646ae7f} show...
[01:51:29] Wikibugs: Ignore "changed the status of subtask" Phabricator transactions - https://phabricator.wikimedia.org/T357875#9618117 (bd808) p:Triage→Medium
[01:51:58] Wikibugs, User-bd808: Investigate producing a code quality report for GitLab based on flake8 - https://phabricator.wikimedia.org/T359685#9618119 (bd808) p:Triage→Medium
[01:56:09] Wikibugs, Developer Productivity: Hide postmerge jenkins-bot events in wikibugs - https://phabricator.wikimedia.org/T154094#9618125 (bd808)
[01:56:11] Wikibugs, Release-Engineering-Team: Exclude secondary jenkins-bot/PipelineBot messages from Gerrit in Wikibugs on IRC - https://phabricator.wikimedia.org/T201261#9618122 (bd808)
[01:57:33] Wikibugs: Wikibugs should ignore changes to the security field - https://phabricator.wikimedia.org/T105625#9618126 (bd808) p:Triage→Medium
[02:05:32] Wikibugs: Allow configuring which type of events are sent to a channel - https://phabricator.wikimedia.org/T116939#9618133 (bd808) >>! In T264162#7057295, @LSobanski wrote: > Removing updates is blocked by T116939 Can you help flesh out the requirements here? I see "Remove update notifications" as your orig...
[02:13:41] (CloudVPSDesignateLeaks) firing: (2) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[02:23:41] (CloudVPSDesignateLeaks) resolved: (2) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[02:41:50] (ProbeDown) firing: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[02:46:22] (HAProxyBackendUnavailable) resolved: HAProxy service nova-api_backend backend cloudcontrol1005.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[02:46:50] (ProbeDown) resolved: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[02:59:35] Wikibugs: Show when a task has been closed as a duplicate - https://phabricator.wikimedia.org/T128868#9618213 (bd808) p:Triage→Medium `lang=pycon $ .tox/py39/bin/python3 Python 3.9.18 (main, Aug 24 2023, 18:16:58) [Clang 15.0.0 (clang-1500.0.40.1)] on darwin Type "help", "copyright", "credits" or "li...
[03:06:41] Wikibugs: Use case-insensitive sort for tags added to the irc log - https://phabricator.wikimedia.org/T90339#9618217 (bd808) p:Triage→Medium I think `sorted(projects.keys(), key=str.lower)` in PhorgeMessageBuilder.build_project_text is all that is needed here.
[03:08:51] (ProbeDown) firing: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[03:13:50] (ProbeDown) resolved: Service tools-static-14:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-14:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[03:45:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[04:03:43] Wikibugs: Wikibugs should not accidentally ping SREs by sending text "# page" - https://phabricator.wikimedia.org/T281105#9618255 (bd808) p:Lowest→Low Rather than overthink this, how about just adding `.replace("#page", "# page")` to the existing transforms in `IRCMessageBuilder.escape()` and making...
[04:06:57] Wikibugs: Print events in closed tasks in grey - https://phabricator.wikimedia.org/T140881#9618258 (bd808) p:Triage→Low
[04:20:01] (OpenstackAPIResponse) resolved: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[04:20:11] Wikibugs, Upstream: (grrrrit) Note (or maybe ignore) when a change is a rebase - https://phabricator.wikimedia.org/T59116#9618262 (bd808) The #upstream bug is still open 10+ years later. I guess we should gather some representative events to see if something has changed that would allow the bot to find t...
[04:35:17] Wikibugs: Match on usage of Additional Hashtags, so that project renames don't break the bot - https://phabricator.wikimedia.org/T87825#9618265 (bd808) >>! In T87825#1022996, @valhallasw wrote: > After {T1176} is implemented, this is fairly trivial to add I see a way to do this by rewriting the channels.yam...
[04:35:31] Wikibugs: Match on usage of Additional Hashtags, so that project renames don't break the bot - https://phabricator.wikimedia.org/T87825#9618266 (bd808) p:Medium→Low
[04:37:52] Wikibugs: Wikibugs comments twice, once to the wrong comment - https://phabricator.wikimedia.org/T132354#9618267 (bd808) Open→Invalid Please do reopen if and when a duplicate is found. We'll need to catch things in the debug logs within about 6 days to figure out the root cause.
[04:41:08] Wikibugs: Wikibugs links sometimes to the creation event, not to the mentioned comment - https://phabricator.wikimedia.org/T129246#9618270 (bd808) Open→Resolved Possibly fixed by {T1177}. Please do reopen if not. We'll need to catch things in the debug logs within about 6 days to figure out the root...
[05:14:18] Wikibugs: Allow filtering by CC - https://phabricator.wikimedia.org/T67429#9618278 (bd808) This could be done as a "selector" with `ChannelFilter.channels_for`. See `wikibugs2.gerrit.async_main` & `wikibugs2.gerrit.branch_selector` for an example of how `"branch": "^wmf\/"` configuration in gerrit-channels....
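Editor's note on the 03:06:41 (T90339) and 04:03:43 (T281105) entries: both comments propose one-line string transforms. The sketch below only illustrates what those two expressions do in isolation. The function names `sort_tags` and `defuse_page` and the sample data are hypothetical and are not taken from the wikibugs2 codebase; how the real `PhorgeMessageBuilder.build_project_text` and `IRCMessageBuilder.escape()` are structured is not shown in this log.

```python
def sort_tags(projects):
    """Return the project names sorted without regard to letter case.

    `projects` is assumed to be a dict keyed by project display name,
    matching the `projects.keys()` call quoted in the log.
    """
    return sorted(projects.keys(), key=str.lower)


def defuse_page(text):
    """Insert a space into "#page" so the literal token is never sent."""
    return text.replace("#page", "# page")


# Hypothetical sample data, for illustration only.
projects = {"wikibugs": {}, "Toolforge": {}, "cloud-services-team": {}, "Upstream": {}}

print(sorted(projects.keys()))   # default sort: uppercase names come first
print(sort_tags(projects))       # case-insensitive sort
print(defuse_page("please do not send #page here"))
```

With the default sort, names starting with an uppercase letter are grouped ahead of all lowercase ones because of their code point order; passing `key=str.lower` compares the lowercased names instead while returning the original spellings.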
[05:41:20] Wikibugs: Improve display of IRC logs for tasks with long tag names - https://phabricator.wikimedia.org/T161249#9618289 (bd808) p:Triage→Low #3 happened accidentally when {T1176} was implemented. Almost immediately {T358653} asked that to be treated as a regression. A variation on that however could...
[06:45:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[09:50:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[10:33:09] Wikibugs: Match on usage of Additional Hashtags, so that project renames don't break the bot - https://phabricator.wikimedia.org/T87825#9618339 (valhallasw) My assumption at the time was that the API would not just provide the current tag but also a list of aliases, in which case the current matching logic c...
[12:50:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[13:13:41] (CloudVPSDesignateLeaks) firing: (2) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[13:23:41] (CloudVPSDesignateLeaks) resolved: (2) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[15:27:40] Tools: fatal: detected dubious ownership in repository at '/mnt/nfs/labstore-secondary-tools-project/pywikibot/public_html/core' - https://phabricator.wikimedia.org/T326469#9618428 (valerio.bozzolan) By the way, I do not quite understand why, even if the repository is owned by root, this error is raised. It...
[15:30:57] Grid-Engine-to-K8s-Migration: Migrate mbh from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319883#9618457 (MBH) OK, I have solved most of errors, but got new. I have added last 5 tools to `all.sln` manually * https://github.com/Saisengen/wikibots/commit/ad4aad0f0e578af81...
[15:31:13] Wikibugs: Match on usage of Additional Hashtags, so that project renames don't break the bot - https://phabricator.wikimedia.org/T87825#9618460 (bd808) The API does provide the complete set of hashtags for a given project, but all of our current code and config is based on the display name of a project rathe...
[15:31:29] (CR) BryanDavis: [C: -2] "test" [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/1008016 (https://phabricator.wikimedia.org/T90594) (owner: BryanDavis)
[15:39:27] Wikibugs: Rethink anti-flooding protections - https://phabricator.wikimedia.org/T359753#9618464 (bd808) I just did an accidental test of flood protections using only the irc3 bot and znc limits from the tools.wikibugs-testing deployment. The irc bot there had lost connectivity with redis overnight. When I re...
[15:47:00] Wikibugs, User-bd808: wikibugs having a hard time staying connected to libera.chat IRC network - https://phabricator.wikimedia.org/T357729#9618467 (bd808) In the tools.wikibugs-testing deployment the problem I am seeing now is the irc bot losing it's connection to redis rather than to znc & libera.chat....
[15:50:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[15:52:53] Wikibugs, User-bd808: Bot does not detect when ssh connection to Gerrit is interrupted - https://phabricator.wikimedia.org/T359096#9618470 (CodeReviewBot) bd808 merged https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/14 gerrit: Use asyncssh
[15:56:04] (CR) BryanDavis: [C: -2] "Testing asyncssh on tools.wikibugs" [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/1008016 (https://phabricator.wikimedia.org/T90594) (owner: BryanDavis)
[15:58:26] Wikibugs, User-bd808: Bot does not detect when ssh connection to Gerrit is interrupted - https://phabricator.wikimedia.org/T359096#9618485 (bd808) In progress→Resolved
[16:21:54] Wikibugs: Match on usage of Additional Hashtags, so that project renames don't break the bot - https://phabricator.wikimedia.org/T87825#9618497 (valhallasw) D'oh! Yeah, looks like I completely overlooked that fairly essential step 😅
[18:55:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[19:10:24] Wikibugs: Wikibugs testing task - https://phabricator.wikimedia.org/T90594#9618586 (bd808) test
[19:11:30] (CR) BryanDavis: [C: -2] "testing irc task changes locally" [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/1008016 (https://phabricator.wikimedia.org/T90594) (owner: BryanDavis)
[19:20:05] (CR) BryanDavis: [C: -2] "test" [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/1008016 (https://phabricator.wikimedia.org/T90594) (owner: BryanDavis)
[19:34:47] (CR) BryanDavis: [C: -2] "test" [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/1008016 (https://phabricator.wikimedia.org/T90594) (owner: BryanDavis)
[20:13:56] (SystemdUnitDown) firing: The service unit nova-fullstack.service is in failed status on host cloudcontrol1006. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1006 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[20:23:56] (SystemdUnitDown) resolved: The service unit nova-fullstack.service is in failed status on host cloudcontrol1006. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1006 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[21:00:55] (CR) BryanDavis: [C: -2] "test" [labs/tools/wikibugs2] - https://gerrit.wikimedia.org/r/1008016 (https://phabricator.wikimedia.org/T90594) (owner: BryanDavis)
[21:55:28] (PuppetStaleCertificates) firing: Found non-revoked Puppet certificates for 1 deleted instances on metricsinfra-puppetmaster-1 - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetStaleCertificates
[22:41:41] (CloudVPSDesignateLeaks) firing: Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks
[22:46:41] (CloudVPSDesignateLeaks) firing: (5) Detected 1 stray dns records - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/Designate_record_leaks - https://grafana.wikimedia.org/d/ebJoA6VWz/wmcs-openstack-eqiad-nova-fullstack - https://alerts.wikimedia.org/?q=alertname%3DCloudVPSDesignateLeaks