[00:10:03] (InstanceDown) firing: Project tf-infra-test instance tf-infra-test is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[00:11:03] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[00:11:03] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[00:36:03] (InstanceDown) firing: Project toolsbeta instance toolsbeta-bastion-6 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[01:05:20] (HAProxyBackendUnavailable) firing: HAProxy service neutron-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[01:10:20] (HAProxyBackendUnavailable) resolved: HAProxy service neutron-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[03:10:03] (InstanceDown) firing: Project tf-infra-test instance tf-infra-test is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[03:11:03] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[03:11:03] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[03:36:03] (InstanceDown) firing: Project toolsbeta instance toolsbeta-bastion-6 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[04:22:29] cloud-services-team: openstack: consider automating DB grants - https://phabricator.wikimedia.org/T346619 (Andrew) a:Andrew→None
[04:23:04] cloud-services-team, Goal, Patch-For-Review: Replace cinder-backup process with backy2 - https://phabricator.wikimedia.org/T344065 (Andrew) Open→Resolved
[04:23:09] cloud-services-team (FY2023/2024-Q1-Q2), Goal, Patch-For-Review: cloud NFS: figure out backups for cinder volumes - https://phabricator.wikimedia.org/T292546 (Andrew)
[04:23:40] cloud-services-team: radosgw+keystone chokes on projects with '-' in their id - https://phabricator.wikimedia.org/T341509 (Andrew)
[04:35:53] Cloud-VPS: Unable to delete volume group accounts-oauth - https://phabricator.wikimedia.org/T342381 (Andrew) Open→Resolved I finally gave up on the API and manually marked this deleted in the database.
[04:37:11] cloud-services-team, Patch-For-Review: cinder-backup getting OOM-killed for large volumes - https://phabricator.wikimedia.org/T339830 (Andrew) Open→Resolved
[04:37:14] cloud-services-team (FY2023/2024-Q1-Q2), Goal: cloud NFS: figure out backups for cinder volumes - https://phabricator.wikimedia.org/T292546 (Andrew)
[04:56:03] (TfInfraTestApplyFailed) resolved: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed
[05:05:03] (InstanceDown) resolved: Project tf-infra-test instance tf-infra-test is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[06:11:03] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[06:36:03] (InstanceDown) firing: Project toolsbeta instance toolsbeta-bastion-6 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[07:13:27] (OpenstackAPIResponse) firing: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[08:25:11] Tool-refill: git warning about duplicate file names (case sensitive paths on case insensitive file system) - https://phabricator.wikimedia.org/T352363 (Novem_Linguae)
[09:11:03] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[09:14:00] Tool-refill: Chancetoreview string needs to use stronger language - https://phabricator.wikimedia.org/T352366 (Novem_Linguae)
[09:14:30] Tool-refill: Dependabot PRs need updating - https://phabricator.wikimedia.org/T352367 (Novem_Linguae)
[09:15:37] Tool-refill: Celery 5.2.7 - https://phabricator.wikimedia.org/T352368 (Novem_Linguae)
[09:16:45] Tool-refill: git warning about duplicate file names (case sensitive paths on case insensitive file system) - https://phabricator.wikimedia.org/T352363 (Novem_Linguae) Looks like this was also filed at https://github.com/CurbSafeCharmer/refill/issues/226 by @AManWithNoPlan. Will keep this ticket as the main o...
[09:18:25] Tool-refill, Internet-Archive: web.archive.org URLs should automatically be set as archive-urls - https://phabricator.wikimedia.org/T352370 (Novem_Linguae)
[09:20:04] Tool-refill: Left braces rendering incorrectly in preview sometimes - https://phabricator.wikimedia.org/T352371 (Novem_Linguae)
[09:21:15] Tool-refill: Replace cite pmid -> cite journal - https://phabricator.wikimedia.org/T352372 (Novem_Linguae)
[09:22:22] Tool-refill: Incorrect suffix checks - https://phabricator.wikimedia.org/T352373 (Novem_Linguae)
[09:23:30] Tool-refill: Incomplete multi-character sanitization - https://phabricator.wikimedia.org/T352374 (Novem_Linguae)
[09:24:57] Tool-refill: Should not fill login pages - https://phabricator.wikimedia.org/T352375 (Novem_Linguae)
[09:25:43] Tool-refill: Should not fill login pages - https://phabricator.wikimedia.org/T352375 (Novem_Linguae) Reply by @Curb_Safe_Charmer: The original ref was `https://www.facebook.com/VimaMusicAwards {{bare URL inline|date=February 2022}} {{registration required}}` Entering https://www.facebook.com/VimaMusicAwar...
[09:26:09] Tool-refill: Should not fill login pages - https://phabricator.wikimedia.org/T352375 (Novem_Linguae) Reply by Pppery: Also affects Instagram (https://en.wikipedia.org/w/index.php?title=Gretel_Scarlett&diff=prev&oldid=1070702685)
[09:34:42] Tool-refill: Replace cite pmid -> cite journal - https://phabricator.wikimedia.org/T352372 (Novem_Linguae) Reply by @AManWithNoPlan: Citation Bot has fixes all these on English Wikipedia. Some other Wikis still have cite pmid and this should not be done on those Wikies
[09:36:03] (InstanceDown) firing: Project toolsbeta instance toolsbeta-bastion-6 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[09:37:40] Tool-refill: Mangles dead links - https://phabricator.wikimedia.org/T352376 (Novem_Linguae)
[09:38:50] Tool-refill: Mangles dead links - https://phabricator.wikimedia.org/T352376 (Novem_Linguae) Reply by @Curb_Safe_Charmer: While a diff is useful in reports (thanks) the first thing that has to be done in triage is to extract just the problematic URL and run that through reFill on its own, i.e. http://www.ba...
[09:39:37] Tool-refill: Mangles dead links - https://phabricator.wikimedia.org/T352376 (Novem_Linguae) Reply by @Pppery The CheckIfDead library used by InternetArchiveBot includes code to check for the specific case of redirecting to the domain root (https://github.com/wikimedia/DeadlinkChecker/blob/8841c56c7c5a1cff82f...
[09:40:14] Tool-refill: Add ckbwiki to reFill tool - https://phabricator.wikimedia.org/T352377 (Novem_Linguae)
[09:42:03] Tool-refill: Poor HTTP error handling in refill-api - https://phabricator.wikimedia.org/T352378 (Novem_Linguae)
[09:42:47] Tool-refill: Poor HTTP error handling in refill-api - https://phabricator.wikimedia.org/T352378 (Novem_Linguae) Reply by @Curb_Safe_Charmer: AntiCompositeNumber, on Mar 30, 2021, zhaofengli edited backend/refill/utils/init.py with the description 'backend: Add better request timeouts' (02cf975[[ https://git...
[09:43:28] Tool-refill: deadurl is deprecated and will be unsupported soon - https://phabricator.wikimedia.org/T352379 (Novem_Linguae)
[09:44:26] Tool-refill: Add option not to covert citation format - https://phabricator.wikimedia.org/T352380 (Novem_Linguae)
[09:45:08] Tool-refill: Add option not to covert citation format - https://phabricator.wikimedia.org/T352380 (Novem_Linguae) Reply by reporter123: Links of the form [http://example.com text] are generally expanded by default. These often not the target of my usage. None the less they will be expanded to full cite-web...
[09:45:25] Tool-refill: Add option not to covert citation format - https://phabricator.wikimedia.org/T352380 (Novem_Linguae) Reply by @Nemo_bis: Can you give an example? My experience is the opposite: even some references which are in my opinion "bare" (no title, for instance) are ignored by reFill because they are no...
[09:46:04] Tool-refill: Correct date format for Serbian Wikipedia - https://phabricator.wikimedia.org/T352381 (Novem_Linguae)
[09:46:28] Tool-refill: Correct date format for Serbian Wikipedia - https://phabricator.wikimedia.org/T352381 (Novem_Linguae) Reply by @Curb_Safe_Charmer: According to https://en.wikipedia.org/wiki/Date_and_time_notation_in_Serbia "the space is, however, optional, as dates could be written and without the space (for e...
[09:48:39] Tool-refill: Make reFill available for Urdu - https://phabricator.wikimedia.org/T352382 (Novem_Linguae)
[09:49:18] Tool-refill: Make reFill available for Urdu - https://phabricator.wikimedia.org/T352382 (Novem_Linguae) Reply by BukharanJC: https://en.m.wikipedia.org/wiki/User:Zhaofeng_Li/reFill I think this is not available in urdu see https://tools.wmflabs.org/refill/
[09:52:08] Tool-refill: Does not work well with other wikis - https://phabricator.wikimedia.org/T352383 (Novem_Linguae)
[09:52:51] Tool-refill: Does not work well with other wikis - https://phabricator.wikimedia.org/T352383 (Novem_Linguae) Reply by @Nemo_bis: Thanks for testing it on non-Wikimedia wikis. Patches would be great!
[09:52:56] Tool-refill: Does not work well with other wikis - https://phabricator.wikimedia.org/T352383 (Novem_Linguae) Reply by CRGorman: I have yet to do anything with this since October 2017. When I get some time I will look at this again.
[09:53:07] Tool-refill: Does not work well with other wikis - https://phabricator.wikimedia.org/T352383 (Novem_Linguae) Reply by @Curb_Safe_Charmer : CRGorman it looks to me like you added this issue at a time when zeroserenity.com was based on the original version of reFill, and you've since upgraded it to reFill2. I...
[09:53:51] Tool-refill: More WMF sites (e.g. q:zh:江泽民) - https://phabricator.wikimedia.org/T352384 (Novem_Linguae)
[09:54:45] Tool-refill, Internet-Archive: Take title from bare reference when no title found on page - https://phabricator.wikimedia.org/T352385 (Novem_Linguae)
[09:55:36] Tool-refill: Tool may accept citoid config at wiki and use localized versions of templates - https://phabricator.wikimedia.org/T352386 (Novem_Linguae)
[09:56:18] Tool-refill: Tool may accept citoid config at wiki and use localized versions of templates - https://phabricator.wikimedia.org/T352386 (Novem_Linguae) Reply by @ricordisamoa E.g. API for citoid-template-type-map.json message https://cs.wikipedia.org/w/api.php?action=query&meta=allmessages&ammessages=citoid-...
[10:09:40] Toolforge (Toolforge iteration 02): [tbs][builder] Explore adding support for third-party buildpacks - https://phabricator.wikimedia.org/T352389 (Slst2020)
[10:11:26] Toolforge (Toolforge iteration 02): [tbs][builder] Explore adding support for third-party buildpacks - https://phabricator.wikimedia.org/T352389 (Slst2020)
[11:13:27] (OpenstackAPIResponse) firing: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[11:19:27] Tool-refill: Automatic linking of work/publisher - https://phabricator.wikimedia.org/T352394 (Novem_Linguae)
[11:20:44] Tool-refill: Move options to results page - https://phabricator.wikimedia.org/T352395 (Novem_Linguae)
[11:22:23] Tool-refill: Default citation format - https://phabricator.wikimedia.org/T352396 (Novem_Linguae)
[11:22:57] Tool-refill: Detect and reconcile duplicates - https://phabricator.wikimedia.org/T352397 (Novem_Linguae)
[11:23:35] Tool-refill: Automate {{cite book}} from worldcat.org - https://phabricator.wikimedia.org/T352398 (Novem_Linguae)
[11:23:48] Tool-refill: Automate {{cite book}} from worldcat.org - https://phabricator.wikimedia.org/T352398 (Novem_Linguae) Reply by owcz: This should work with Citoid
[11:24:16] Tool-refill: Automate {{cite book}} from worldcat.org - https://phabricator.wikimedia.org/T352398 (Novem_Linguae) Reply by @Nemo_bis : It should but it doesn't: it gives a cite web instead. Even after stripping down a ref so that it only contained a Worldcat URL, this is what I got: https://en.wikipedia.or...
[11:25:17] Tool-refill: Publish all versions and allow user to select them - https://phabricator.wikimedia.org/T352399 (Novem_Linguae)
[11:25:57] Tool-refill: Edit description should state consolidation of redundant named citations - https://phabricator.wikimedia.org/T352401 (Novem_Linguae)
[11:27:44] Tool-refill: Generate named references - https://phabricator.wikimedia.org/T352402 (Novem_Linguae)
[11:28:13] !log admin fran@wmf3169 START - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node on host 'cloudcontrol1007.eqiad.wmnet' (T348843)
[11:28:19] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[11:28:20] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843
[11:28:29] Tool-refill, Internet-Archive: Wayback Machine - https://phabricator.wikimedia.org/T352403 (Novem_Linguae)
[11:29:10] Tool-refill, Internet-Archive: Wayback Machine - https://phabricator.wikimedia.org/T352403 (Novem_Linguae) Reply by Artoria2e5: Consider using [[ http://timetravel.mementoweb.org/ | Memento ]] and its APIs -- archive.org and archive.is both speak their language. Also check out [[ https://en.wikipedia.or...
[11:29:20] Tool-refill, Internet-Archive: Wayback Machine - https://phabricator.wikimedia.org/T352403 (Novem_Linguae) Reply by JohnRTitor: Consider using IA Bot after every time you refill bare reference. https://tools.wmflabs.org/iabot
[11:30:41] Tool-refill, Internet-Archive: Wayback Machine - https://phabricator.wikimedia.org/T352403 (Novem_Linguae) Reply by @Nemo_bis: In my opinion this task can be declined, in the [[ https://en.wikipedia.org/wiki/Unix_philosophy | UNIX philosophy ]]: use InternetArchiveBot for this purpose, let reFill focus...
[11:31:00] Tool-refill, Internet-Archive: Wayback Machine - https://phabricator.wikimedia.org/T352403 (Novem_Linguae) Reply by @waldyrious : > use InternetArchiveBot for this purpose Link for convenience: https://meta.wikimedia.org/wiki/InternetArchiveBot
[11:31:47] Tool-refill: Format External links - https://phabricator.wikimedia.org/T352404 (Novem_Linguae)
[11:32:43] Tool-refill: Format External links - https://phabricator.wikimedia.org/T352404 (Novem_Linguae) Reply by @Nemo_bis: I agree it would be nice to do both this and the biliography section.
[11:33:46] Tool-refill: Format External links - https://phabricator.wikimedia.org/T352404 (Novem_Linguae) Reply by @waldyrious : I'm not sure that's appropriate. From [[ https://en.wikipedia.org/wiki/Template:URL | Template:URL's documentation ]]: > Note: If you wish to display text instead of the URL (e.g. [[ http:/...
[11:33:54] Tool-refill: total reformatting - https://phabricator.wikimedia.org/T352407 (Novem_Linguae)
[11:35:53] Tool-refill, Internet-Archive: Properly handle WebCite links - https://phabricator.wikimedia.org/T352409 (Novem_Linguae)
[11:36:22] Tool-refill, Internet-Archive: Properly handle WebCite links - https://phabricator.wikimedia.org/T352409 (Novem_Linguae) Reply by razasyedh: Yeah the effort might not be worth it if the 9-digit form is more common. And bare WebCite URL's might be few (3,600 total Google hits vs archive.org's 35,000). If...
[11:36:45] Tool-refill, Internet-Archive: Properly handle WebCite links - https://phabricator.wikimedia.org/T352409 (Novem_Linguae) Reply by zhaofengli: Okay, I've [[ https://github.com/zhaofengli/refill-labsconf/commit/8bcebd724d2806df90027701a82de83e3eb319a5 | blacklisted ]] webcitation.org while we are looking...
[11:36:56] Tool-refill, Internet-Archive: Properly handle WebCite links - https://phabricator.wikimedia.org/T352409 (Novem_Linguae) Reply by @zhaofengli : WebCite pages are pretty problematic as there is likely no useful information in the URLs (most links only include a 9-digit ID), and the returned pages include...
[11:37:10] Tool-refill, Internet-Archive: Properly handle WebCite links - https://phabricator.wikimedia.org/T352409 (Novem_Linguae) Reply by owcz: If you already have a scrape running on the page, the original URL is displayed in WebCite archives at `/html/body/table/tbody/tr[1]/td[2]/a[1]` where `table[@class="to...
[11:37:37] Tool-refill, Internet-Archive: Properly handle WebCite links - https://phabricator.wikimedia.org/T352409 (Novem_Linguae) Reply by Artoria2e5: Consider checking out the workings of [[ https://en.wikipedia.org/wiki/User:Green_Cardamom/WaybackMedic_2.1 | Wayback Medic ]].
[11:38:42] Tool-refill, Internet-Archive: Properly handle WebCite links - https://phabricator.wikimedia.org/T352409 (Novem_Linguae) Reply by @Green_Cardamom : I'm the author of WaybackMedic and can answer questions (User:Green Cardamom). The archive date can be obtained from the 9-digit ID .. it is a base62 number...
[11:39:06] Tool-refill, Internet-Archive: Properly handle WebCite links - https://phabricator.wikimedia.org/T352409 (Novem_Linguae) Reply by @Nemo_bis : Webcitation.org is once again winding down and (on the English Wikipedia) there's already wayback medic to handle the past URLs, so in my opinion this task can be...
[11:39:51] Tool-refill, Internet-Archive: Properly handle WebCite links - https://phabricator.wikimedia.org/T352409 (Novem_Linguae) Reply by @Curb_Safe_Charmer : @green_cardamom do you agree with @nemo_bis that this issue can be closed?
[11:41:19] (HAProxyBackendUnavailable) firing: (12) HAProxy service cinder-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[11:41:24] Tool-refill: Celery 5.2.7 - https://phabricator.wikimedia.org/T352368 (Novem_Linguae) Reply by @AManWithNoPlan: Weird. It is on PyPi. https://pypi.org/simple/celery/ python:3.6-alpine3.8 is pretty old. Perhaps you need a python 3.7 version. [[ https://github.com/CurbSafeCharmer/refill/pull/248 | PR #248 ]...
[11:41:40] (GaleraClusterSizeMismatch) firing: (2) Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[11:42:07] Tool-refill: git warning about duplicate file names (case sensitive paths on case insensitive file system) (capitalization) - https://phabricator.wikimedia.org/T352363 (Novem_Linguae)
[11:44:02] !log admin fran@wmf3169 END (PASS) - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node (exit_code=0) on host 'cloudcontrol1007.eqiad.wmnet' (T348843)
[11:44:08] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[11:44:08] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843
[11:45:13] !log admin fran@wmf3169 START - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node on host 'cloudcontrol1006.eqiad.wmnet' (T348843)
[11:45:17] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[11:46:20] (HAProxyBackendUnavailable) firing: (13) HAProxy service cinder-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[11:46:40] (GaleraClusterSizeMismatch) resolved: Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[11:48:27] (OpenstackAPIResponse) resolved: Openstack API average response time is too high. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/OpenstackAPIResponse - https://grafana.wikimedia.org/d/UUmLqqX4k - https://alerts.wikimedia.org/?q=alertname%3DOpenstackAPIResponse
[11:50:59] Quarry: [bug] query/77794: "This query was stopped" - https://phabricator.wikimedia.org/T352211 (Novem_Linguae)
[11:51:20] (HAProxyBackendUnavailable) resolved: (13) HAProxy service cinder-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[11:53:06] Tool-refill: Copy/paste all ReFill GitHub Issues to Phabricator, then turn off GitHub Issues - https://phabricator.wikimedia.org/T340502 (Curb_Safe_Charmer)
[11:55:55] Tool-refill: Copy/paste all ReFill GitHub Issues to Phabricator, then turn off GitHub Issues - https://phabricator.wikimedia.org/T340502 (Curb_Safe_Charmer)
[11:57:50] (HAProxyBackendUnavailable) firing: (23) HAProxy service cinder-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[11:58:10] (GaleraClusterSizeMismatch) firing: (2) Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[11:59:01] !log admin fran@wmf3169 END (PASS) - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node (exit_code=0) on host 'cloudcontrol1006.eqiad.wmnet' (T348843)
[11:59:06] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[11:59:06] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843
[12:01:55] (GaleraClusterSizeMismatch) resolved: (3) Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[12:02:50] (HAProxyBackendUnavailable) firing: (26) HAProxy service cinder-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[12:03:05] (HAProxyBackendUnavailable) resolved: (26) HAProxy service cinder-api_backend backend cloudcontrol1006.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[12:03:52] Tool-refill: Copy/paste all ReFill GitHub Issues to Phabricator, then turn off GitHub Issues - https://phabricator.wikimedia.org/T340502 (Curb_Safe_Charmer)
[12:03:57] !log admin fran@wmf3169 START - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node on host 'cloudcontrol1005.eqiad.wmnet' (T348843)
[12:04:08] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[12:04:08] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843
[12:06:33] Tool-refill: Copy/paste all ReFill GitHub Issues to Phabricator, then turn off GitHub Issues - https://phabricator.wikimedia.org/T340502 (Novem_Linguae)
[12:07:47] Tool-refill: Copy/paste all ReFill GitHub Issues to Phabricator, then turn off GitHub Issues - https://phabricator.wikimedia.org/T340502 (Novem_Linguae) Open→Resolved a:Novem_Linguae The last two items unticked on the todo list will be taken care of when we turn off the https://github.com/refill-...
[12:09:40] Toolforge: Toolforge Kubernetes quota requests.memory was reduced - https://phabricator.wikimedia.org/T352055 (taavi) Open→Resolved >>! In T352055#9363408, @taavi wrote: > 3. Manually bump quotas for affected tools. This was the winner in the WMCS team meeting yesterday. My understanding is that both...
[12:11:03] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed
[12:14:33] (SystemdUnitDown) firing: The service unit nova-fullstack.service is in failed status on host cloudcontrol1006. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1006 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[12:17:05] (HAProxyBackendUnavailable) firing: (14) HAProxy service glance-api_backend backend cloudcontrol1005.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[12:17:40] (GaleraClusterSizeMismatch) firing: (2) Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[12:18:48] !log admin fran@wmf3169 END (PASS) - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node (exit_code=0) on host 'cloudcontrol1005.eqiad.wmnet' (T348843)
[12:18:51] (HAProxyBackendUnavailable) firing: (26) HAProxy service cinder-api_backend backend cloudcontrol1005.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[12:18:54] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[12:18:54] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843
[12:19:25] (GaleraClusterSizeMismatch) firing: (3) Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[12:22:05] (HAProxyBackendUnavailable) firing: (18) HAProxy service cinder-api_backend backend cloudcontrol1005.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[12:22:40] (GaleraClusterSizeMismatch) resolved: (2) Galera in eqiad1 has 2 nodes - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/GaleraClusterSizeMismatch - https://grafana.wikimedia.org/d/galera-cluster-summary/wmcs-openstack-eqiad-galera-cluster-summary - https://alerts.wikimedia.org/?q=alertname%3DGaleraClusterSizeMismatch
[12:23:49] (HAProxyBackendUnavailable) resolved: (13) HAProxy service cinder-api_backend backend cloudcontrol1005.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable
[12:24:33] (SystemdUnitDown) resolved: The service unit nova-fullstack.service is in failed status on host cloudcontrol1006. - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcontrol1006 - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitDown
[12:27:16] Cloud-VPS (Quota-requests), MinT, Language-Team (Language-2023-October-December): Create large instance for MinT - https://phabricator.wikimedia.org/T352136 (KartikMistry) >>! In T352136#9368052, @Andrew wrote: > We would like to help, but it's hard for us to understand what you're asking here for. >...
[12:36:03] (InstanceDown) firing: Project toolsbeta instance toolsbeta-bastion-6 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[12:39:14] Quarry: [bug] query/77794: "This query was stopped" - https://phabricator.wikimedia.org/T352211 (Boshomi_Phabricator)
[12:41:07] Cloud-VPS (Quota-requests), MinT, Language-Team (Language-2023-October-December): Create large instance for MinT - https://phabricator.wikimedia.org/T352136 (Slst2020) @KartikMistry if I'm understanding right, you are after more RAM and disk space. Could you give us specific numbers for both?
[12:42:31] Quarry: [bug] query/77794: "This query was stopped" - https://phabricator.wikimedia.org/T352211 (taavi) The wiki replicas were accidentally running on shorter-than-usual network timeouts due to the work going on in {T346947}. That's now been fixed, try again?
[12:43:01] (PS1) Btullis: Add dummy keytabs for new hadoop coordinators [labs/private] - https://gerrit.wikimedia.org/r/979088 (https://phabricator.wikimedia.org/T336045)
[12:44:40] (CR) Btullis: [V: +2 C: +2] Add dummy keytabs for new hadoop coordinators [labs/private] - https://gerrit.wikimedia.org/r/979088 (https://phabricator.wikimedia.org/T336045) (owner: Btullis)
[13:17:57] Cloud-VPS (Quota-requests), MinT, Language-Team (Language-2023-October-December): Create large instance for MinT - https://phabricator.wikimedia.org/T352136 (KartikMistry) >>! In T352136#9370894, @Slst2020 wrote: > @KartikMistry if I'm understanding right, you are after more RAM and disk space. Could...
[13:42:58] !log admin fran@wmf3169 START - Cookbook wmcs.openstack.cloudweb.set_maintenance (T348843)
[13:43:03] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[13:43:04] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843
[13:43:51] !log admin fran@wmf3169 END (PASS) - Cookbook wmcs.openstack.cloudweb.set_maintenance (exit_code=0) (T348843)
[13:43:56] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[13:45:07] Data-Services, Quarry: [bug] query/77794: "This query was stopped" - https://phabricator.wikimedia.org/T352211 (taavi) Open→Resolved a:taavi
[13:45:19] Toolforge (Toolforge iteration 02): [maintain-harbor] Manage project quotas via maintain-harbor - https://phabricator.wikimedia.org/T352417 (Slst2020)
[13:55:20] !log admin fran@wmf3169 START - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node on host 'cloudnet1005.eqiad.wmnet' (T348843)
[13:55:25] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[13:55:26] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843
[13:59:10] Toolforge (Toolforge iteration 02): [maintain-harbor] Manage project quotas via maintain-harbor - https://phabricator.wikimedia.org/T352417 (Slst2020) Open→In progress a:Slst2020
[13:59:13] Toolforge (Toolforge iteration 02): [tbs] Improve Harbor quota handling and docs - https://phabricator.wikimedia.org/T351092 (Slst2020)
[14:00:38] Toolforge (Toolforge iteration 02): maintain-harbor: code refactor for readability and quality - https://phabricator.wikimedia.org/T351277 (Slst2020) Open→Resolved
[14:01:58] Toolforge: [envvars,maintain-kubeusers] create and populate envvars for common service names - https://phabricator.wikimedia.org/T347141 (Slst2020)
[14:03:06] !log admin fran@wmf3169 END (PASS) - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node (exit_code=0) on host 'cloudnet1005.eqiad.wmnet' (T348843)
[14:03:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[14:03:12] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843
[14:05:40] (NeutronAgentDown) firing: (2) Neutron neutron-linuxbridge-agent on cloudnet1005 is down - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#Networking_failures - https://grafana.wikimedia.org/d/wKnDJf97z/wmcs-neutron-eqiad1 - https://alerts.wikimedia.org/?q=alertname%3DNeutronAgentDown
[14:16:14] !log admin fran@wmf3169 START - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node on host 'cloudnet1006.eqiad.wmnet' (T348843)
[14:16:20] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[14:16:20] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843
[14:24:00] !log admin fran@wmf3169 END (PASS) - Cookbook wmcs.openstack.cloudcontrol.upgrade_openstack_node (exit_code=0) on host 'cloudnet1006.eqiad.wmnet' (T348843)
[14:30:25] !log admin fran@wmf3169 START - Cookbook wmcs.openstack.cloudweb.unset_maintenance (T348843)
[14:30:44] !log admin fran@wmf3169 END (PASS) - Cookbook wmcs.openstack.cloudweb.unset_maintenance (exit_code=0) (T348843)
[14:38:40] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1067.eqiad.wmnet' (T348843)
[14:38:46] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843
[14:43:28] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1067.eqiad.wmnet' (T348843)
[14:45:10] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1066.eqiad.wmnet' (T348843)
[14:45:16] T348843:
[openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [14:49:56] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1066.eqiad.wmnet' (T348843) [14:56:29] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1060.eqiad.wmnet' (T348843) [14:56:36] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [15:01:55] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1060.eqiad.wmnet' (T348843) [15:01:56] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1061.eqiad.wmnet' (T348843) [15:02:16] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [15:02:43] (CephSlowOps) firing: Ceph cluster in eqiad has 54 slow ops - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephSlowOps - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephSlowOps [15:02:48] 10cloud-services-team: CephSlowOps Ceph cluster in eqiad has slow ops, which might be blocking some writes - https://phabricator.wikimedia.org/T352436 (10phaultfinder) [15:05:45] (CephClusterInWarning) firing: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning [15:07:37] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1061.eqiad.wmnet' 
(T348843) [15:07:38] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1062.eqiad.wmnet' (T348843) [15:07:42] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [15:09:50] 10Grid-Engine-to-K8s-Migration: Migrate steve-adder from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320062 (10komla) >>! In T320062#9322023, @Aklapper wrote: > @komla Who is "we" and who are you addressing? Yes, I was referring to Legoktm [15:11:03] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed [15:12:43] (CephSlowOps) resolved: Ceph cluster in eqiad has 74 slow ops - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephSlowOps - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephSlowOps [15:12:57] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1062.eqiad.wmnet' (T348843) [15:12:58] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1063.eqiad.wmnet' (T348843) [15:13:03] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [15:15:37] (CephClusterInWarning) resolved: Ceph cluster in eqiad is in warning status - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/CephClusterInWarning - https://grafana.wikimedia.org/d/P1tFnn3Mk/wmcs-ceph-eqiad-health?orgId=1&search=open&tag=ceph&tag=health&tag=WMCS - https://alerts.wikimedia.org/?q=alertname%3DCephClusterInWarning [15:17:22] !log 
fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1063.eqiad.wmnet' (T348843) [15:17:23] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1064.eqiad.wmnet' (T348843) [15:22:05] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1064.eqiad.wmnet' (T348843) [15:22:07] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1065.eqiad.wmnet' (T348843) [15:22:11] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [15:26:50] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1065.eqiad.wmnet' (T348843) [15:31:39] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1050.eqiad.wmnet' (T348843) [15:31:45] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [15:36:57] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1050.eqiad.wmnet' (T348843) [15:36:58] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1051.eqiad.wmnet' (T348843) [15:37:03] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [15:38:14] 10Tool-refill, 10Internet-Archive: Wayback Machine - https://phabricator.wikimedia.org/T352403 (10Pppery) 05Open→03Declined >>! In T352403#9370579, @Novem_Linguae wrote: > Reply by @Nemo_bis: > > In my opinion this task can be declined, in the [[ https://en.wikipedia.org/wiki/Unix_philosophy | UNIX philos... 
[15:41:03] (InstanceDown) firing: Project toolsbeta instance toolsbeta-bastion-6 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown [15:41:12] Tool-refill: deadurl is deprecated and will be unsupported soon - https://phabricator.wikimedia.org/T352379 (Pppery) Open→Resolved a:AManWithNoPlan This was done in https://github.com/CurbSafeCharmer/refill/pull/224 [15:41:37] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1051.eqiad.wmnet' (T348843) [15:41:38] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1052.eqiad.wmnet' (T348843) [15:42:06] Tool-refill: Incomplete multi-character sanitization - https://phabricator.wikimedia.org/T352374 (Pppery) That link is a 404 for me. Is there anything left to do here? [15:44:49] Tool-refill: Incomplete multi-character sanitization - https://phabricator.wikimedia.org/T352374 (Novem_Linguae) Looks like TNT ran a code security tool and it turned up some issues. In this case it's related to "Incomplete multi-character sanitization". Would be good to re-run the tool then copy paste the r...
[15:46:41] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1052.eqiad.wmnet' (T348843) [15:46:42] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1053.eqiad.wmnet' (T348843) [15:46:47] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [15:51:17] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1053.eqiad.wmnet' (T348843) [15:51:19] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1054.eqiad.wmnet' (T348843) [15:55:57] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1054.eqiad.wmnet' (T348843) [15:55:58] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1055.eqiad.wmnet' (T348843) [15:56:02] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [16:00:37] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1056.eqiad.wmnet' (T348843) [16:05:32] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1056.eqiad.wmnet' (T348843) [16:05:33] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1057.eqiad.wmnet' (T348843) [16:05:38] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [16:10:37] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1057.eqiad.wmnet' (T348843) [16:10:38] !log 
fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1058.eqiad.wmnet' (T348843) [16:10:43] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [16:15:51] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1058.eqiad.wmnet' (T348843) [16:15:52] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1059.eqiad.wmnet' (T348843) [16:15:57] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [16:20:58] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1059.eqiad.wmnet' (T348843) [16:21:04] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [16:29:50] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1047.eqiad.wmnet' (T348843) [16:29:56] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [16:34:46] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1047.eqiad.wmnet' (T348843) [16:34:47] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1048.eqiad.wmnet' (T348843) [16:39:50] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1048.eqiad.wmnet' (T348843) [16:39:51] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1049.eqiad.wmnet' (T348843) [16:39:56] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - 
https://phabricator.wikimedia.org/T348843 [16:45:04] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1049.eqiad.wmnet' (T348843) [16:45:11] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [16:49:20] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1040.eqiad.wmnet' (T348843) [16:54:27] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1040.eqiad.wmnet' (T348843) [16:54:29] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1041.eqiad.wmnet' (T348843) [16:54:33] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [16:59:15] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1041.eqiad.wmnet' (T348843) [16:59:16] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1042.eqiad.wmnet' (T348843) [17:01:52] 10Grid-Engine-to-K8s-Migration: Migrate steve-adder from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T320062 (10Legoktm) >>! In T320062#9355901, @Izno wrote: > https://steve-adder.toolforge.org/ seems to have gone down, I'd guess as a result of this task. :) Oops sorry, I me... 
[17:03:43] 10PAWS: tofu state file to object storage - https://phabricator.wikimedia.org/T352164 (10rook) https://github.com/opentofu/opentofu/issues/947 [17:04:16] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1042.eqiad.wmnet' (T348843) [17:04:17] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1043.eqiad.wmnet' (T348843) [17:04:21] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [17:09:01] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1043.eqiad.wmnet' (T348843) [17:09:02] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1044.eqiad.wmnet' (T348843) [17:14:07] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1044.eqiad.wmnet' (T348843) [17:14:08] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1045.eqiad.wmnet' (T348843) [17:14:13] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [17:16:03] (InstanceDown) resolved: Project toolsbeta instance toolsbeta-bastion-6 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown [17:19:25] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1045.eqiad.wmnet' (T348843) [17:19:31] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [17:19:58] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirtlocal1001.eqiad.wmnet' (T348843) [17:21:03] 
(PuppetAgentStaleLastRun) firing: Last Puppet run was over 24 hours ago on instance toolsbeta-bastion-6 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun [17:25:21] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirtlocal1001.eqiad.wmnet' (T348843) [17:25:23] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirtlocal1002.eqiad.wmnet' (T348843) [17:25:27] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [17:30:37] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirtlocal1002.eqiad.wmnet' (T348843) [17:30:39] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirtlocal1003.eqiad.wmnet' (T348843) [17:30:44] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [17:36:06] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirtlocal1003.eqiad.wmnet' (T348843) [17:36:12] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [17:37:07] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt-wdqs1001.eqiad.wmnet' (T348843) [17:42:18] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt-wdqs1001.eqiad.wmnet' (T348843) [17:42:20] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt-wdqs1002.eqiad.wmnet' (T348843) [17:42:24] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - 
https://phabricator.wikimedia.org/T348843 [17:42:57] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1031.eqiad.wmnet' [17:47:10] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt-wdqs1002.eqiad.wmnet' (T348843) [17:47:11] !log fnegri@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt-wdqs1003.eqiad.wmnet' (T348843) [17:47:58] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1031.eqiad.wmnet' [17:48:15] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1032.eqiad.wmnet' [17:48:18] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1033.eqiad.wmnet' [17:48:22] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1034.eqiad.wmnet' [17:51:03] (PuppetAgentStaleLastRun) resolved: Last Puppet run was over 24 hours ago on instance toolsbeta-bastion-6 in project toolsbeta - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun [17:52:18] !log fnegri@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt-wdqs1003.eqiad.wmnet' (T348843) [17:52:23] T348843: [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 [17:53:17] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1032.eqiad.wmnet' [17:53:32] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1034.eqiad.wmnet' [17:53:38] !log andrew@cloudcumin1001 
admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1033.eqiad.wmnet' [17:55:19] (HAProxyBackendUnavailable) firing: HAProxy service neutron-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable [17:59:13] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1035.eqiad.wmnet' [17:59:15] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1036.eqiad.wmnet' [17:59:19] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1037.eqiad.wmnet' [18:00:19] (HAProxyBackendUnavailable) resolved: HAProxy service neutron-api_backend backend cloudcontrol1007.private.eqiad.wikimedia.cloud is down - https://wikitech.wikimedia.org/wiki/HAProxy - TODO - https://alerts.wikimedia.org/?q=alertname%3DHAProxyBackendUnavailable [18:04:17] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1035.eqiad.wmnet' [18:04:20] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1036.eqiad.wmnet' [18:04:41] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1037.eqiad.wmnet' [18:05:23] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1038.eqiad.wmnet' [18:05:28] !log andrew@cloudcumin1001 admin START - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack on host 'cloudvirt1039.eqiad.wmnet' [18:10:40] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook 
wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1038.eqiad.wmnet' [18:10:45] !log andrew@cloudcumin1001 admin END (PASS) - Cookbook wmcs.openstack.cloudvirt.live_upgrade_openstack (exit_code=0) on host 'cloudvirt1039.eqiad.wmnet' [18:11:03] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed [18:33:59] 10Cloud-VPS, 10cloud-services-team (FY2023/2024-Q1-Q2), 10Goal: Upgrade cloud-vps openstack to version 'Antelope' - https://phabricator.wikimedia.org/T341285 (10Andrew) [18:34:01] 10Cloud-VPS, 10cloud-services-team (FY2023/2024-Q1-Q2): [openstack] Upgrade eqiad1 cluster to Antelope - https://phabricator.wikimedia.org/T348843 (10Andrew) 05In progress→03Resolved I upgraded the last few cloudvirts so this is done. [18:44:39] vivian-rook opened https://github.com/toolforge/paws/pull/355 [18:59:35] (PuppetAgentDisabled) firing: (2) Puppet agent disabled on instance quarry-dev-03 in project quarry - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentDisabled [18:59:35] (PuppetAgentStaleLastRun) firing: (2) Last Puppet run was over 24 hours ago on instance quarry-dev-03 in project quarry - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun [19:36:24] vivian-rook closed https://github.com/toolforge/paws/pull/355 [19:59:03] (TfInfraTestApplyFailed) firing: Terraform failed to apply/create the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestApplyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestApplyFailed [20:18:10] 10Grid-Engine-to-K8s-Migration: Migrate dplbot from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319701 (10russblau) Hi. 
Please do not shut down this tool in the first stage of the Grid phaseout. (FYI, although this tool has three maintainers, I am the only one who is active.... [21:11:03] (TfInfraTestDestroyFailed) firing: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed [21:31:03] (TfInfraTestDestroyFailed) resolved: Terraform failed to destroy the resounces on tf-bastion - https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/TfInfraTestDestroyFailed - https://prometheus-alerts.wmcloud.org/?q=alertname%3DTfInfraTestDestroyFailed [21:34:03] (InstanceDown) resolved: Project tf-infra-test instance tf-infra-test is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown [21:59:35] (PuppetAgentDisabled) firing: (2) Puppet agent disabled on instance quarry-dev-03 in project quarry - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentDisabled [21:59:35] (PuppetAgentStaleLastRun) firing: (2) Last Puppet run was over 24 hours ago on instance quarry-dev-03 in project quarry - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentStaleLastRun