[01:10:22] PROBLEM - MariaDB sustained replica lag on m1 on db2160 is CRITICAL: 9.6 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2160&var-port=13321
[01:12:29] RECOVERY - MariaDB sustained replica lag on m1 on db2160 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2160&var-port=13321
[10:28:33] I've sent a calendar invite to the team alias - it is not a meeting, just a reminder of the row maintenance (for me it is very helpful to have it on the calendar) - let me know if it is not helpful for you and I will remove you
[10:43:20] I'd already been doing that in my personal calendar :)
[13:18:54] (though I just link to the phab item, rather than duplicating the table from it)
[14:02:40] unrelated, but did you see power stuff on db2163, Amir?
[14:03:39] usually it is just a loose cable to ask papaul to tighten, I can help with that if needed
[14:11:42] jynus: is there a reason transferpy isn't available on PyPI? Asking because my cookbook is stuck in CI hell because the CI can't get transferpy. volans suggested I ask the author to upload to pypi...
[14:12:07] you can tell CI to download from a repo
[14:12:19] from setup.py?
[14:12:22] (as a quick patch)
[14:12:54] https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/905595 <-- failing CR
[14:13:08] we are in the middle of something going on, I will attend to you later
[14:14:27] thanks, appreciate it
[14:15:31] sorry, no incident at the moment, but I see scary stuff, will help later
[14:52:47] Emperor: this is how I reference a specific tag from a repo (for wmfmariadbpy): https://gerrit.wikimedia.org/r/plugins/gitiles/operations/software/transferpy/+/refs/heads/master/setup.py#14
[14:53:26] not sure if that helps you in any way
[14:55:56] IMHO we shouldn't include things directly from gerrit/github but proper releases of them
[15:20:35] volans: I can sort of see that, but in prod transferpy comes from the Debian package that is already installed on the cumin nodes, so this is about letting the CI run
[15:20:45] ...ideally without putting more yaks on this already very delayed piece of work
[15:21:15] are the gerrit tags at least signed?
[15:22:32] pass
[15:22:34] at least add a comment on the line before that explains
[15:24:09] Emperor: and please add the deb dependency here too: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/profile/manifests/spicerack.pp#30
[15:24:52] ack
[15:26:38] also that means that every time there is a new release you'll have to manually update the reference
[15:26:47] in setup.py
[15:28:10] Mmm, and/or persuade jynus they want to release on pypi too :)
[15:30:38] Emperor: (possibly a question for jynu.s) but is there a benefit to using transferpy over e.g. sftp?
[15:31:01] when I asked about how best to achieve this when I started, I was told that transferpy was the tool for the job.
[15:31:19] (this == copy the container dbs to a cumin node for processing from a cookbook)
[15:32:41] Emperor: I'll prefix this with: that may well be the correct answer :). But who, and do you know if there is a benefit, e.g. speed, simplicity, etc.?
[15:33:36] ISTR it might well have been volans but it was quite a long time ago now; I think speed & simplicity for a small number (6 per invocation) of root-owned largeish (~100M each) files
[15:34:18] it's certainly reasonably simple to drive - a bit of setup then about 2 lines of code
[15:35:31] ago> 13 Feb according to my notes
[15:35:40] Emperor: ack thanks
[15:37:31] CI at least happy now
[15:45:50] Emperor: cool and thanks, ftr I'm in no way blocking etc., just wanted to get an idea of the general problem/solution for how we may want to change spicerack in the future
[15:46:56] Sure; I've sent my CR to u.random for a review of the swifty bits, but I think comments from you or v.olans as the cookbook maestros would also be good (be gentle, it's my first cookbook)
[15:47:42] but next I need to get back to the container that has consistent but wrong listings...
[15:54:52] Emperor: sorry I am not giving you enough attention, but it became an almost real incident
[15:55:53] 's OK
[16:06:52] Emperor: ack, I'll give it a pass through tomorrow
[16:07:21] TY
[16:19:08] I've updated T327253 but I think the answer for wikipedia-ja-local-public.21 is to delete the 22 objects consistently in the codfw listing that don't exist, leaving the 45 objects that exist in both codfw and eqiad
[16:19:09] T327253: >=27k objects listed in swift containers but not extant - https://phabricator.wikimedia.org/T327253
[16:28:02] "Deletions will continue until container consistency improves"
[20:55:40] PROBLEM - MariaDB sustained replica lag on s6 on db1131 is CRITICAL: 3 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1131&var-port=9104
[20:57:20] RECOVERY - MariaDB sustained replica lag on s6 on db1131 is OK: (C)2 ge (W)1 ge 0.4 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1131&var-port=9104