[08:32:25] !log admin cleanup neutron agents for cloudvirt1017 (decom)
[08:32:28] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[08:34:55] !log admin cleanup neutron agents for cloudvirt1021/1022 (decom)
[08:34:58] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[10:58:31] !log admin disabled tool "wb" by clicking the disable button at https://toolsadmin.wikimedia.org/tools/id/wb T328693
[10:58:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[10:58:36] T328693: [toolsdb] Migrate "s54518__mw" db to Trove - https://phabricator.wikimedia.org/T328693
[13:56:42] !log admin depooling cloudweb1003 before switch upgrade
[13:56:47] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[14:06:31] #wikimedia-cloud Wikimedia Cloud Services (wikitech.wikimedia.org) | Status: toolsdb + other services under maintenance | Ask questions here, but please provide links and context. | More details and channel logs at https://wikitech.wikimedia.org/wiki/Help:IRC | Code of Conduct applies: https://www.mediawiki.org/wiki/CoC
[14:08:13] The pre-announced switch maintenance is happening now, expect toolsdb failure (and other random failures) for the next 20 minutes or so.
[15:08:08] since around 14:20-14:30 my k8s jobs have started to fail due to not being able to connect to en.wikipedia.org
[15:16:25] Not sure if related but I can't auth to Quarry, with error: "The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application."
[15:17:12] filed as https://phabricator.wikimedia.org/T333370
[15:17:29] T333043 for Quarry
[15:17:30] T333043: [bug] "Internal Server Error" when logging into Quarry - https://phabricator.wikimedia.org/T333043
[15:18:51] FYI andrewbogott ^^ in case related to the maintenance
[15:19:01] rook, any idea what's up with Quarry?
Maybe something is in an inconsistent state after the downtime
[15:20:28] JJMC89: do you have the exact error message you've been getting?
[15:20:47] andrewbogott: quarry/oauth login issues are a known unrelated thing unfortunately
[15:20:55] It's been doing that for a while. So probably not. I was assuming it was something with oauth.
[15:21:03] ah, ok. It's broken for me too if that matters.
[15:21:05] taavi: python is showing: requests.exceptions.ConnectionError: HTTPSConnectionPool(host='en.wikipedia.org', port=443): Max retries exceeded with url: /w/api.php (Caused by NewConnectionError(': Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))
[15:21:18] thanks, `Temporary failure in name resolution` was what I was looking for
[15:21:44] are you still seeing those or was that temporary?
[15:21:56] Herzog have you tried logging in 6 times? That seems to work for some people
[15:22:06] exactly six?
[15:22:11] lol :)
[15:22:18] just had one a few seconds ago, taavi
[15:22:22] hmmm. looking
[15:22:33] Well I've tried reloading the page various times, Rook - not sure if I reached the magic number
[15:24:43] well, now it worked Rook
[15:25:44] Sorry I was a little embarrassed to suggest that, but it has been shown to work...
[15:26:41] Rook: I'm not sure if it'd help but https://meta.wikimedia.org/wiki/Special:OAuthListConsumers?name=Quarry&publisher=&stage=-1 has 2 OAuth clients approved. Shall I disable the older one?
[15:27:10] 1.5 that is
[15:28:34] It really should only be using one of them, hopefully the newer one. So it shouldn't matter that the other is still active. Let me see what we have in the configs
[15:29:06] we also have more old ones https://meta.wikimedia.org/wiki/Special:OAuthListConsumers?name=SQL+Quarry&publisher=&stage=-1
[15:29:16] https://versions.toolforge.org/ says we have 0 wikis, and I also see some “Temporary failure in name resolution” in its error.log
[15:29:22] probably the same issue?
[15:29:23] if not in use, I'd like to disable them
[15:29:30] (it would be trying to resolve noc.w.o i believe)
[15:30:48] yes, we're looking
[15:43:10] herzog: I believe you can disable everything but the newest one (https://meta.wikimedia.org/w/index.php?title=Special:OAuthListConsumers/view/cb1d846203c919792257399726bc0245)
[15:43:48] Rook: ack - leaving the latest and disabling everything else
[15:43:58] Thanks!
[15:48:53] toolforge network issue update: we have identified a fix, working on rolling it out everywhere
[15:51:32] Rook: done :)
[15:58:39] !log tools.stewardbots Stop sulwatcher k8s deployment until network issues are resolved
[15:58:42] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[16:02:38] !log tools.stewardbots Start SULWatcher bots again
[16:02:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[16:05:20] !log tools.stewardbots Restart StewardBot
[16:05:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[16:32:06] Hi! I'm maintainer of vector-dark tool on Toolforge. Recently it started showing 502 Bad gateway, so I decided to try to restart it (webservice stop && webservice start). However the second command fails: File "/usr/lib/python3/dist-packages/requests/adapters.py", line 529, in send
[16:32:07]     raise ReadTimeout(e, request=request)
[16:32:07] requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='k8s.tools.eqiad1.wikimedia.cloud', port=6443): Read timed out. (read timeout=10)
[16:32:15] What can be the reason?
[16:33:03] Oh, It succeeded just now.
Sorry for any trouble :D
[16:33:18] Msz2001: sorry, we just rebooted the whole toolforge worker node fleet
[16:33:34] Okay
[16:46:44] we believe toolforge is back up
[16:51:06] !log tools.bridgebot try restarting after network issues (it stopped bridging #wikimedia-cloud to Telegram, at least, haven’t investigated further)
[16:51:14] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.bridgebot/SAL
[16:52:52] (ok, somehow the bot caught up with the missed messages but sent them out of order o_O)
[16:53:09] but at least things are working again, thanks all
[19:11:48] !log tools.quickcategories unset EXPECTED_DATABASE_ERROR again
[19:11:52] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.quickcategories/SAL
[20:19:20] !log clouddb-services deleted old osmdb servers clouddb1003 and clouddb1004. Both have been shut down for some time.
[20:19:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Clouddb-services/SAL
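
Note on the 15:21:05 report: the jobs failed with `[Errno -3] Temporary failure in name resolution`, which Python raises as `socket.gaierror`. A tool that wants to ride out a short DNS outage like this one could retry the lookup with backoff before giving up. The sketch below is only an illustration, not something anyone in the channel ran; the function name `resolve_with_retry` and its parameters are hypothetical, and the attempt count and delays are arbitrary assumptions.

```python
import socket
import time


def resolve_with_retry(host, port=443, attempts=4, base_delay=1.0,
                       resolver=socket.getaddrinfo, sleep=time.sleep):
    """Resolve host, retrying with exponential backoff on DNS errors.

    Hypothetical helper: returns whatever the resolver returns, or
    re-raises the last socket.gaierror once all attempts are used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return resolver(host, port)
        except socket.gaierror:
            # Covers "Temporary failure in name resolution" and other
            # getaddrinfo failures; give up only on the final attempt.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
```

A job could call `resolve_with_retry("en.wikipedia.org")` before its first API request and exit cleanly if the call still raises. The `resolver` and `sleep` parameters exist only so the retry logic can be exercised without real DNS or real waiting.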