[01:03:15] !log deployment-prep Added RLazarus as a project member
[01:03:20] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL
[05:51:20] Do I see that Cloud VPS now supports Fedora for some reason?
[09:06:44] !log melos@tools-sgebastion-10 tools.stewardbots Restarted StewardBot
[09:06:47] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[09:10:51] !log lucaswerkmeister@tools-sgebastion-10 tools.bridgebot Double IRC messages to other bridges
[09:10:53] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.bridgebot/SAL
[09:16:44] !log superpes@tools-sgebastion-10 tools.stewardbots Restarted SULWatcher which had quit from IRC
[09:16:47] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[10:03:49] !log superpes@tools-sgebastion-10 tools.stewardbots Restarted Stewardbot and SULWatcher which had quit from IRC
[10:03:52] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[11:03:54] codesearch seems to be having some issues (a lot of 500 Internal Server Error at https://codesearch-backend.wmcloud.org/_health)
[11:36:15] yep, it seems the VM is badly out of resources
[11:36:19] https://www.irccloud.com/pastebin/9AuFTEiM/
[11:37:56] probably some time early this morning https://grafana-rw.wmcloud.org/d/0g9N-7pVz/cloud-vps-project-board?orgId=1&var-project=codesearch&var-instance=All
[11:38:04] I'll try to reboot the instance
[11:38:39] !log codesearch reboot codesearch8 as it's locked out of resources
[11:38:41] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Codesearch/SAL
[11:58:59] codesearch seems to be back
[12:04:58] !log admin restarted nova-api on cloudcontrol1005 as it was very slow
[12:05:02] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[12:07:19] !log admin restarted nova-api on cloudcontrol100* as it was very slow
[12:07:23] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[12:18:52] !log superpes@tools-sgebastion-10 tools.stewardbots Restarted Stewardbot which had quit from IRC
[12:18:55] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[12:20:46] indeed, thank you! (re @wmtelegram_bot: codesearch seems to be back)
[12:35:29] heyo, I will ask something without knowing at all what I am saying since not an expert at all: is it possible to call somehow Harbor within GitLab CI/CD pipeline to use heroku-builder:22?
[12:37:48] Also, 2nd problem, I am getting "ERROR: --mount not explicitly specified on a build service based tool", even when "mount: none" is specified in a "service.template" file at the root of my repo
[12:43:17] !log superpes@tools-sgebastion-10 tools.stewardbots Restarted Stewardbot which had quit from IRC
[12:43:20] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[13:43:27] !log taavi@tools-sgebastion-11 tools.wikibugs toolforge jobs restart redis2irc
[13:43:30] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikibugs/SAL
[14:59:16] !log removing wmf-auto-restart-cron from all VMs without cron via cumin - https://gerrit.wikimedia.org/r/c/operations/puppet/+/1007328/ T358343
[14:59:18] taavi: Unknown project "removing"
[14:59:18] T358343: wmf_auto_restart_cron.service failing in Cloud VPS bookworm instances - https://phabricator.wikimedia.org/T358343
[14:59:28] !log admin removing wmf-auto-restart-cron from all VMs without cron via cumin - https://gerrit.wikimedia.org/r/c/operations/puppet/+/1007328/ T358343
[14:59:32] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[16:00:25] Lofhi: the service.template has to be in the tool home, not the git repository (it's a bit of a chicken-and-egg issue if it's in the repository, we are working on a long-term solution for that)
[16:03:41] @harej: you are probably seeing a VM base image that has leaked out of one of the automated $something as a service OpenStack products that we have been adding. I think there is a phab task about the leaking somewhere. I can't remember if the Fedora bits are from Trove or Magnum.
[16:03:55] https://wikitech.wikimedia.org/wiki/Help:Trove_database_user_guide and https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Magnum
[16:05:29] I think it's magnum, fedora coreos iirc
[16:11:35] dcaro: I thought the service.template could also be in some well-known source code directories?
[16:11:44] e.g. in lexeme-forms I only have one in www/python/src/ AFAICT
[16:12:11] (but don’t ask me what happens if several of the well-known directories have different service.template files ^^)
[16:12:49] oh, I see, but Lofhi was talking about the build service…
[16:12:56] ok that’s different
[16:12:58] yep, I meant in the remote git repo (not in the tool home), like in https://gitlab..../service.template (the repo you build from), not under $HOME/*
[16:12:59] yep
[16:13:47] because the point is you don’t want to have any directories like www/python/src in the tool home directory anymore if it’s just running the image…
[16:22:12] I don't think it's bad if there is a git clone on the bastion, it's just not involved in the build->run process with custom images
[17:44:57] admin.toolforge.org appears to be down
[17:56:03] JJMC89: it must have healed itself pretty quickly. None of the automated alerts for it went off, but I do see an alert about tools-k8s-haproxy-3 being sad that has sense resolved itself.
[17:56:15] *since
[17:56:35] yea, looks good now
[17:58:04] bd808: do you know if there is a tool that shows prod db lag (not cloud replag)?
[17:59:19] hmmm... tendril used to, but something replaced that I think... /me looks
[18:00:04] https://wikitech.wikimedia.org/wiki/MariaDB/monitoring links to https://grafana.wikimedia.org/d/000000303/mysql-replication-lag
[18:00:24] https://noc.wikimedia.org/db.php just tells you to ask MediaWiki for it
[18:00:28] orchestrator, which is unfortunately private
[18:01:37] well something's lagged - mentioned it in -operations but don't know which db it is
[18:04:09] db2194 was very sad per https://grafana.wikimedia.org/goto/XgOHKRAIk?orgId=1
[18:36:54] admin it's been flapping today, I started looking, there's some lighttpd errors and some php errors, I think it's related to the libraries being a bit old (the slim version used is from 2016)
[18:37:11] the buildpack is not working though (probably also because of the versions), so I'm still debugging
[20:46:02] dcaro: there's a phab bug about admin being based on an ancient slim framework shim that I wrote for some other WMF projects that have now been decommed. The TL;DR is that we need to port away from that library, rewrite from scratch, or figure out a complete replacement for the admin webservice (like maybe just redirecting to Striker?)
[20:52:42] Ack, Striker, that sounds like the next best thing to me