[10:53:13] How long should a volume detach take to run?
[10:54:02] (I've been waiting for 30 mins and counting...)
[10:58:39] stw: almost instant! but I guess it depends on any client activity going on?
[10:59:24] I mean, I was doing a parallel delete of a web proxy and a dns recordset too, but I wouldn't have thought that would have conflicted
[11:01:23] accounts-oauth-g-database is the volume in question; attached to accounts-oauth-g; my aim is to destroy both the instance and the volume anyway.
[11:54:56] !paws log Bump requests from 2.27.1 to 2.31.0 in minesweeper
[12:57:05] with the recent bookworm announcement i wanted to spin up another g3.cores16.ram36.disk20 instance for test purposes but it seems that flavor is gone?
[12:58:19] also what's up with g3.cores16.ram32.disk20 having the same specs as g3.cores8.ram32.disk20? (8 vcpus)
[13:00:44] gifti: T337010
[13:00:44] T337010: cloud vps: fix flavor g3.cores16.ram32.disk20 id 37ed9aaa-35b2-4141-8bc4-272ec8bbc303 - https://phabricator.wikimedia.org/T337010
[13:55:52] stw: there's an upstream bug that breaks volume detaching; I need to document it better. In the meantime I will try to clean things up. Can you reconfirm that you want the volume accounts-oauth-g-database deleted?
[13:57:18] stw, T338262
[13:57:19] T338262: Cinder volume stuck in Detaching state - https://phabricator.wikimedia.org/T338262
[14:27:28] !log tools rebooted tools-harbor-1 as it was not responding
[14:27:30] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[14:56:17] Rook IDK if you're here and if this is the right place to chat. Unfortunately, the new jupyter-lab 4.0 is misbehaving just like the old one, but I think I've narrowed it down. Maybe you'd like to see it yourself. Nothing's urgent. I've prepared a few files for you in my home folder (under rook), and I can send you my notes - either here or privately.
[14:56:18] LMK.
[14:56:45] hmm... Let me have a look
[14:58:05] OK. I'll paste my notes here, in case someone else wants to take a look as well:
[14:58:06] JupyterLab 4.0 didn't really help: I've tested my files in two browsers on Linux, and in Chrome on Windows. The same annoying thing is happening: after running a number of cells, the view resets itself to the topmost cell; the table of contents becomes unresponsive and can be brought back to life only by randomly turning on and off "show heading number", "show first-level heading number", "show output headings", "collapse all headings".
[14:58:07] It also autosaves on every change, too often, even when autosave is off, and the save button is always inactive.
[14:58:07] The button does say "in collaborative mode, the doc is saved automatically after every change", but I haven't found a way to turn it off. I wonder if the collaborative mode is causing all the problems after all.
[14:58:08] When I re-open the file, it'll say "there's another collaborative session accessing the same file", even though there isn't another session.
[14:58:08] I've installed JupyterLab 4.0 locally on my laptop and haven't observed anything like that - so far.
[14:58:09] I've made a folder with an ipynb file and three json/csv files that my code needs to load. It looks like the notebook needs to be of some size, so reducing this to a minimal example isn't that easy. This looks like some communication problem to me.
[14:58:09] Run the whole notebook line-by-line. If nothing weird happens, go back somewhere in the middle and re-run. It takes up to 150-200 cell executions, though sometimes even 10 is enough.
[15:13:47] I've made a copy of the files and am running them. I went all the way through a step at a time, then back to the middle a few times; over 400 cell executions at this point. I'm not seeing it pop back to the top. I didn't see the collaboration mode in this case, though I did see that once on another file that I had; I figured it was caching from the upgrade, referencing a past that does not exist to it any longer. Though if it is still happening that would be a problem
[15:15:33] Hm... Your TOC expanded? Save button (diskette) enabled?
[15:17:03] Yes, the TOC is expanded, and it would appear the save is stuck in collaborative mode; the mouseover text gives "In collaborative mode, the document is saved automatically after every change (Ctrl+S)"
[15:17:39] In the toolbar on the left, I have "File browser", "Running terminals and kernels", "Collaboration"
[15:18:09] Is the collab mode something I can turn off, or is it server-side?
[15:18:12] I wonder if it is the collaborative mode that is causing the issue. Actually, could it be memory? JupyterLab seems to believe it has 3G when it has 2G...
[15:19:00] Could be memory. There are some big pandas tables in there.
[15:19:34] The collaborative mode is new to me. The last I knew about it was sharing a link with a token, though this might be separate from that function
[15:19:52] Which is an excessively long way of me saying "I don't know" :p
[15:20:59] Yeah, I don't remember seeing it before. But it has to do with some javascript they're running to keep the notebook and the TOC in sync
[15:21:01] Oh yeah! `jupyter_collaboration` It was complaining about that when I was trying to build it
[15:21:23] It was just added with the move to 4; a bit more information here: https://jupyterlab.readthedocs.io/en/latest/user/rtc.html
[15:22:53] that might be a little too much for us. IDK. It's saving the file way too often imo
[15:23:45] I agree, I added it because an error message was wanting it. Might not be something we really need. Though I don't think it is causing the problem you're seeing, as that was happening on JupyterLab 3.6.3 as well, which did not have the collaboration library
[15:24:11] That's correct.
[15:25:59] Ok, it doesn't appear to be memory; we really are giving out 3G, and aren't getting close to exhausting it.
[15:26:32] NFS might be too slow for the collaboration SQLite save
[15:26:41] I noticed something locally while tinkering that required clearing cookies. Do you see the same if you connect from a private browser window? Maybe stop your server before switching to the private window?
[15:26:55] When I was on Chrome@Windows, keeping the TOC collapsed (and unnumbered) made it possible to run the code for a much longer time; I thought that'd help, but then it started happening there as well
[15:27:52] I switched between browsers, and Chrome@Windows was the one I never used before. I also shut down all kernels and logged out
[15:28:13] `.jupyter_ystore.db` is 75MB for PonoRoboT at this point.
[15:29:07] I deleted a number of those yesterday. They were getting huge, yeah... 100s of MB I think
[15:29:13] I'm seeing if I get the same error when building without collaboration
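(For context on "building without collaboration": in JupyterLab 4 the real-time collaboration feature ships as the separate `jupyter-collaboration` package rather than as part of JupyterLab itself, so it can in principle be switched off server-side. A sketch only, not verified against the PAWS images; the `/home` path and the 50M threshold are assumptions:

```
# Check whether the collaboration server extension is active, then disable
# it - or remove the package from the image entirely.
jupyter server extension list
jupyter server extension disable jupyter_collaboration
pip uninstall -y jupyter-collaboration

# The growing .jupyter_ystore.db files are the extension's SQLite document
# history; the cleanup described above would look roughly like this,
# assuming user homes are mounted under /home:
find /home -maxdepth 2 -name '.jupyter_ystore.db' -size +50M -delete
```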
[15:43:02] Seeing a lot of "Uncaught (in promise) Error: Minified React error #409; visit https://reactjs.org/docs/error-decoder.html?invariant=409 for the full message or use the non-minified dev environment for full errors and additional helpful warnings." in my browser console, as this is happening rn
[15:44:49] Hmm, I wonder what that is...
[15:47:58] "Cannot update an unmounted root." apparently
[15:48:37] each of those is followed by a hundred of:
[15:48:38]     render https://hub-paws.wmcloud.org/user/PonoRoboT/static/lab/3935.905285b8e22c337968ed.js?v=905285b8e22c337968ed:2
[15:48:38]     renderDOM https://hub-paws.wmcloud.org/user/PonoRoboT/static/lab/jlab_core.b473ae48d19e9025bb00.js?v=b473ae48d19e9025bb00:1
[15:48:39]     renderDOM https://hub-paws.wmcloud.org/user/PonoRoboT/static/lab/jlab_core.b473ae48d19e9025bb00.js?v=b473ae48d19e9025bb00:1
[15:48:39]     onUpdateRequest https://hub-paws.wmcloud.org/user/PonoRoboT/static/lab/jlab_core.b473ae48d19e9025bb00.js?v=b473ae48d19e9025bb00:1
[15:48:40]     processMessage https://hub-paws.wmcloud.org/user/PonoRoboT/static/lab/jlab_core.b473ae48d19e9025bb00.js?v=b473ae48d19e9025bb00:1
[15:48:40]     b https://hub-paws.wmcloud.org/user/PonoRoboT/static/lab/jlab_core.b473ae48d19e9025bb00.js?v=b473ae48d19e9025bb00:1
[15:48:41]     o https://hub-paws.wmcloud.org/user/PonoRoboT/static/lab/jlab_core.b473ae48d19e9025bb00.js?v=b473ae48d19e9025bb00:1
[15:48:41]     w https://hub-paws.wmcloud.org/user/PonoRoboT/static/lab/jlab_core.b473ae48d19e9025bb00.js?v=b473ae48d19e9025bb00:1
[15:48:42]     n https://hub-paws.wmcloud.org/user/PonoRoboT/static/lab/jlab_core.b473ae48d19e9025bb00.js?v=b473ae48d19e9025bb00:1
[15:48:42]     promise callback*37192/n https://hub-paws.wmcloud.org/user/PonoRoboT/static/lab/jlab_core.b473ae48d19e9025bb00.js?v=b473ae48d19e9025bb00:1
[15:49:01] Ponor: use a pastebin please :)
[15:49:36] Ponor: what's the version of your browser?
[15:49:43] sorry, running from a browser and not sure where to find it
[15:50:59] It's Firefox 144 64b for Fedora
[15:51:56] Pardon me. Firefox 114.
[15:52:07] Ah thanks
[15:52:31] Alright, so well updated
[16:06:24] when i open up PAWS notebooks within subfolders, there's an odd `RTC:` that's getting prepended to the directory name, which then makes the public paws link incorrect (example: https://public-paws.wmcloud.org/User:Isaac_(WMF)/RTC:Annotation%20Gap/v2_eval_wikidata_quality_model.ipynb which should be https://public-paws.wmcloud.org/User:Isaac_(WMF)/Annotation%20Gap/v2_eval_wikidata_quality_model.ipynb ). just me (though it's happening on chrome and firefox)? want me to create a phab ticket?
[16:06:29] I wonder if we're getting compatibility issues between the single-user containers and the hub container, as they have different versions
[16:07:19] isaacj: sure, a ticket would be good
[16:07:30] :thumbs up: not urgent because i can work around it
[16:07:38] 👍
[16:11:38] T338973
[16:11:38] T338973: PAWS generating odd directory names that break the public links - https://phabricator.wikimedia.org/T338973
[16:11:50] don't hesitate to let me know if I can help with any additional testing etc.
[16:11:54] thanks!
[16:19:01] I'm seeing the RTC: too, on top of the Launcher tab. The "won't save" bug (isaacj reported it as well) didn't show up now that everything's being saved in collab mode, though files sometimes cannot be loaded unless you duplicate them and open up the exact copy.
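(One way to check the version-skew hypothesis raised above, assuming PAWS runs JupyterHub on Kubernetes in the usual zero-to-jupyterhub layout; the namespace, deployment, and pod names here are guesses, not taken from the log:

```
# Compare package versions between the hub container and a single-user
# container; "prod", "hub", and "jupyter-ponorobot" are hypothetical names.
kubectl -n prod exec deploy/hub -- pip show jupyterhub | grep -i version
kubectl -n prod exec jupyter-ponorobot -- pip list | grep -i -e jupyterlab -e jupyterhub -e collaboration
```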
[17:48:55] !log tools.lexeme-forms deployed e9112d022e (l10n updates: es)
[17:48:57] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL
[18:16:00] !log paws Rolling jupyterlab back to last known working version T338981
[18:16:06] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Paws/SAL
[18:16:08] T338981: [Rollback] paws to jupyterlab 4.0 - https://phabricator.wikimedia.org/T338981
[18:19:24] I see where the
[18:29:31] A bit of fishing here - heritage/erfgoedbot has been having troubles. I checked the logs, and since April 7th I have lots of DB-related errors, see https://phabricator.wikimedia.org/T338987. Would anyone know off the top of their head whether something might have changed with ToolsDB ~2 months ago?
[18:31:43] “The new VM is ready, and we plan to point all tools to use it on Apr, 6 2023 at 17:00 UTC.” ref T301949 & T333471 - probably not a coincidence :)
[18:33:23] ```
monuments_db:
  server: 'tools-db'
  db_name: 's51138__heritage_p'
```
[18:33:32] → Looks like I need to change the hostname
[18:45:46] yes, but I doubt the out-of-date hostname is causing the errors in the task
[18:46:44] @JeanFred: my suspicion is that the mariadb version upgrade changed the default `sql_mode`, which means that queries which were previously silently dropping invalid data (or causing just warnings) are now producing actual errors
[18:50:35] Thanks for the hint. Replaced 10.0 with 10.4.28 in my docker-compose setup; will try to see if that happens locally too. (re @wmtelegram_bot: @JeanFred: my suspicion is that the mariadb version upgrade changed the default `sql_mode`, which means that queries whi...)
[19:27:43] ```
docker-compose run --rm bot python -m erfgoedbot.update_database -countrycode:ir -langcode:fa -log
(1366, "Incorrect double value: '' for column `s51138__heritage_p`.`monuments_ir_(fa)`.`lat` at row 1")
```
[19:27:50] At least I can reproduce locally :)
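(That error fits the `sql_mode` theory: MariaDB 10.2.4 and later default to STRICT_TRANS_TABLES, so inserting an empty string into a numeric column now raises error 1366 instead of being coerced to 0 with a warning, which matches the jump from 10.0 to 10.4. A minimal sketch of the behaviour change; the table here is made up for illustration, only the `tools-db` hostname comes from the log:

```
# Show the current mode - on MariaDB >= 10.2.4 it includes STRICT_TRANS_TABLES.
mysql -h tools-db -e "SELECT @@sql_mode;"

# Under strict mode this insert fails with:
# ERROR 1366 (22007): Incorrect double value: '' for column ... 'lat' at row 1
mysql -h tools-db -e "CREATE TABLE t (lat DOUBLE); INSERT INTO t (lat) VALUES ('');"

# The pre-upgrade behaviour (warning, value stored as 0) can be restored per
# session while the bot's queries are fixed:
mysql -h tools-db -e "SET SESSION sql_mode=''; INSERT INTO t (lat) VALUES ('');"
```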
[19:28:14] andrewbogott: confirmed, volume accounts-oauth-g-database (749c0759-5bae-4b4b-8f7a-b01119a1838e) and instance accounts-oauth-g (a1cb2517-4af5-4e7b-ae9e-a17c5d5c9281) can both be unceremoniously deleted :)
[19:28:56] Interestingly, the error I got back was a "couldn't find resource" error, but I'm not sure if that's just Terraform making something up or an actual error from the OpenStack APIs.
[19:29:10] "Error detaching openstack_compute_volume_attach_v2 a1cb2517-4af5-4e7b-ae9e-a17c5d5c9281/749c0759-5bae-4b4b-8f7a-b01119a1838e: couldn't find resource"
[19:30:13] yeah, there might be terraform issues, but I'm saving that for when the command line works
[20:53:55] Adam, do you think that if you give verbal permission here, a friendly cloud admin could give me access to the wikibase-registry project?
[20:57:59] addshore: ^
[21:07:18] Hola
[21:07:25] Hola
[21:07:44] If a cloud admin is up for it, yes, I approve on both my irc and telegram :)
[21:28:04] @harej, @Adam: That endorsement might be more powerful if Adam was actually a member of the project. ;)
[21:28:25] o_O
[21:28:40] I hereby endorse the request as well
[21:28:53] (or if I remember, I'll do it tomorrow when I'm back at my work device)
[21:29:19] (“members” is just confusing OpenStack naming and they can still add other people, right? there's no higher “admins” level?)
[21:31:24] !log wikibase-registry Added harej as project member per IRC/Telegram approval by existing member lucaswerkmeister-wmde
[21:31:26] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Wikibase-registry/SAL
[21:31:33] thx :)
[21:53:26] stw: I got that volume detached, I'll leave the actual deletion to you. It should work now, I think!
[21:54:17] Cheers, I'll give it a go
[21:57:14] woohoo
[21:57:17] "Apply finished, 2 destroyed." Thanks muchly for that :)
[21:57:24] thank you
[22:03:55] @harej: yw
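(For anyone else hitting a T338262-style stuck detach: the log doesn't show what the admin actually ran, but the generic recovery with python-openstackclient looks roughly like the sketch below. IDs and names are taken from the conversation above; the exact steps depend on where the detach got stuck, and the state reset needs admin credentials:

```
# Find volumes stuck mid-detach and inspect the attachment record.
openstack volume list --status detaching
openstack volume show 749c0759-5bae-4b4b-8f7a-b01119a1838e -c status -c attachments

# Retry the detach, then (admin-only) force the state back so the normal
# delete path works again.
openstack server remove volume accounts-oauth-g accounts-oauth-g-database
openstack volume set --state available 749c0759-5bae-4b4b-8f7a-b01119a1838e
openstack volume delete accounts-oauth-g-database
```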