[03:06:12] !log deltaquad@tools-sgebastion-10 tools.stewardbots ./stewardbots/StewardBot/manage.sh restart # RC reader not reading RC
[03:06:16] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[06:30:11] !log bsadowski1@tools-sgebastion-10 tools.stewardbots Restarted StewardBot/SULWatcher because of a connection loss
[06:30:14] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[10:56:19] Hello, I noticed that I had a job stuck for a few days, so I forced a restart. The error message I got said to please report it to the Toolforge admins.
[10:56:44] So here I am. The error is "requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='api.svc.tools.eqiad1.wikimedia.cloud', port=30003): Read timed out. (read timeout=30)"
[10:57:23] And it happened in the webapp that hangs from https://jorobot.toolforge.org
[11:00:50] Joutbis: you got the error when restarting the job, or the job showed that error in its logs after a restart?
[11:01:05] (looks like the former)
[11:19:23] !log admin update spicerack to 8.5.0 on cloudcumin2001
[11:19:27] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[11:54:21] !log lucaswerkmeister@tools-sgebastion-10 tools.bridgebot Double IRC messages to other bridges
[11:54:24] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.bridgebot/SAL
[12:19:43] !log bsadowski1@tools-sgebastion-10 tools.stewardbots Restarted StewardBot/SULWatcher because of a connection loss
[12:19:46] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[12:36:48] Sorry I had to leave the computer. Yes, the error message was when I restarted the job
[13:03:33] Joutbis11: I took a look briefly. I saw a 499 error on the jobs-api side, which could indicate that a brief unavailability happened on the Kubernetes side
[13:08:03] Can that be why the job got stuck?
It hadn't happened before in a few years...
[13:08:48] Anyway, just wanted to mention that in case it was some kind of a system problem. I do have some debugging to do on my side.
[13:37:28] !log deltaquad@tools-sgebastion-10 tools.stewardbots ./stewardbots/StewardBot/manage.sh restart # RC reader not reading RC
[13:37:30] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[16:50:26] !log devtools - rebooted unreachable puppetmaster-1003 - was "no route to host" - but is back now, log had a " /dev/sdb: Can't open blockdev" as well
[16:50:29] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Devtools/SAL
[16:57:46] !log devtools - puppetmaster-1003 reachable again but service fails to start and puppetserver-deploy-code fails
[16:57:49] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Devtools/SAL
[16:58:27] !log copypatrol copypatrol-backend-prod-01 deploy 1622949..02f58af
[16:58:27] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Copypatrol/SAL
[17:02:36] !log copypatrol copypatrol-backend-prod-01 deploy 02f58af..43bc784
[17:02:39] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Copypatrol/SAL
[17:07:29] !log copypatrol copypatrol-backend-prod-01 deploy 43bc784..dfb436b
[17:07:32] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Copypatrol/SAL
[17:51:01] !log devtools - added Notice: /Stage[main]/Profile::Labs::Cindermount::Srv/Cinderutils::Ensure[cinder_on_srv]/Exec[prepare_cinder_volume_/srv]/returns: executed successfully
[17:51:03] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Devtools/SAL
[17:51:07] oops, wrong log
[17:52:19] !log devtools - added profile::labs::cindermount::srv to puppetmaster-1003 in horizon to get missing cinder volume - T360470
[17:52:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Devtools/SAL
[17:52:22] T360470: Update devtools
project puppetmaster - https://phabricator.wikimedia.org/T360470
[20:55:19] !log bd808@tools-bastion-12 tools.wikibugs-testing Restarted irc task to test changes from MR!31
[20:55:23] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikibugs-testing/SAL
[21:08:11] !log anticomposite@tools-sgebastion-10 tools.stewardbots SULWatcher/manage.sh restart # SULWatchers disconnected
[21:08:15] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[21:10:21] !log bd808@tools-bastion-12 tools.wikibugs Built new image from git hash 406e9e18.
[21:10:23] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikibugs/SAL
[21:12:32] !log bd808@tools-bastion-12 tools.wikibugs Restarted irc task to pick up new container image from git hash 406e9e18. (T360353)
[21:12:37] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikibugs/SAL
[21:13:35] Not sure if this is the right place but citoid seems to be down
[21:14:55] !help citoid is down
[21:14:55] If you don't get a response in 15-30 minutes, please email the cloud@ mailing list -- https://wikitech.wikimedia.org/wiki/Help:Cloud_Services_communication
[21:15:28] not the right place, no.
[21:15:51] I think this has been reported at https://phabricator.wikimedia.org/T362379 already
[21:15:54] @bd808 where would be?
[21:16:45] PotsdamLamb: #wikimedia-operations and #wikimedia-tech are the IRC channels that expect to get reports of production service outages.
[21:18:53] @bd808 thanks
[21:23:09] I pinged about the open T362379 bug report in the Editing team's Slack channel.
[21:23:09] T362379: citoid errors inserting new ref in VE - https://phabricator.wikimedia.org/T362379
[21:26:28] bd808 how many IM softwares do you use?
[21:29:19] PotsdamLamb: "many"
[21:30:46] I have two irc networks, slack, and signal open at the moment. At various times I also use Telegram, Discord, and Zulip.
[21:32:40] citoid has been broken for three days? That seems serious
[21:38:15] That is a lot bd
[21:39:27] PotsdamLamb: folks in the movement ask for help in lots of places :)
[22:25:45] lol. So I do have a ? for you @bd808. In creating a bot, if I need to pass custom parameters like logging onto simple do I need to do a local image and not the main image maintained by WMF?
[22:26:45] https://phabricator.wikimedia.org/T362103
[22:27:10] every time I run it, it tries to log me into test
[22:29:35] I wonder if simplewiki isn't in the auth matrix for some reason?
[22:29:42] * bd808 looks at https://gitlab.wikimedia.org/toolforge-repos/pywikibot-buildservice
[22:32:15] https://gitlab.wikimedia.org/toolforge-repos/pywikibot-buildservice/-/blob/toolforge/user-config.py?ref_type=heads looks like it would have all the stuff needed, unless there is something special about how simplewiki needs to be configured in pywikibot.
[22:33:28] simplewiki isn't special - the log in the task correctly shows getting the page from simplewiki
[22:36:02] the issue is missing l10n for simplewiki in that script
[22:37:19] ah, there you go then PotsdamLamb. JJMC89 knows infinitely more about pwb than I do. :)
[22:38:12] Is it possible to use conda in PAWS? (instead of / in addition to pip https://wikitech.wikimedia.org/wiki/PAWS/Python_with_Pip )
[22:39:16] @JJMC89 I thought it was added? I sent it in the ticket or did I send the wrong info?
[22:39:46] no - xqt proposed a patch and asked you to review it
[22:42:43] JJMC89, I did not see where I was asked to review it; I see where Added to reviewer:
[22:42:43] Derick A.
[22:43:10] https://phabricator.wikimedia.org/T362103#9706266
[22:44:22] @Tilman: I don't think so. In a terminal inside PAWS `which conda` returns nothing.
[22:45:47] JJMC89 I do not know how to do or access that. Can you instruct me or link me to something please?
[22:47:32] the patch is https://gerrit.wikimedia.org/r/1018947
[22:53:48] JJMC89 Our notes go above references.
It looks like in the patch they are after.
[22:54:34] don't tell me - respond to xqt
[22:54:40] but it looks like at line 262 it takes care of that if I am reading it right
[22:54:53] no I was preparing ? sorry
[22:56:10] I will post to the ticket to make sure. Thanks for the help
[22:59:17] Thanks for your help and the links @JJMC89. Very much appreciated. I am not getting any emails on that ticket though so I will check later tonight.