[00:17:11] PROBLEM - MariaDB sustained replica lag on s4 on db1244 is CRITICAL: 1873 ge 10 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1244&var-port=9104
[00:25:11] RECOVERY - MariaDB sustained replica lag on s4 on db1244 is OK: (C)10 ge (W)5 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1244&var-port=9104
[08:07:59] Hi folks - could I get a +1 to https://gerrit.wikimedia.org/r/c/operations/puppet/+/1182084 please? removing the now-drained thanos-be2005 from the thanos-swift rings so it can have its disk controller swapped
[08:20:27] done
[08:32:20] TY :)
[10:07:24] Last dump for x1 at eqiad (db1216) taken on 2025-08-26 00:00:06 is 82 GiB, but the previous one was 68 GiB, a change of +20.7 %
[10:09:32] jynus: ah, you're back :) Good vacations?
[10:10:11] average
[10:13:21] federico3: Amir1: I received more emails about "private data found" in s3, did you discover what the issue is?
[10:18:05] I understood it had been fixed, when did you receive the emails?
[10:19:07] 4 hours ago
[10:19:18] 6:45 UTC
[10:19:29] sorry 5:45 UTC
[10:33:02] jynus: That could be T351911 but that should make it much smaller, not bigger
[10:33:03] T351911: Compress cx_corpora.cxc_content - https://phabricator.wikimedia.org/T351911
[10:33:19] dhinus: :/ I'll look into it. I'm sure I did fix it
[10:39:50] Better ticket: T399084
[10:39:51] T399084: Compress data on cxc_corpora table in production - https://phabricator.wikimedia.org/T399084
[10:51:02] I realized what's going on, fixing it now
[10:51:05] for good
[10:51:13] (the private data)
[11:07:05] Amir1: thanks :)
[11:27:36] After I edited /etc/network/interfaces I did a git diff 🤦
[11:31:14] etckeeper? :D
[11:33:24] no just force of habit, like when I wrote :wq to close a browser tab
[11:35:56] that's a vile habit ;p
[11:49:56] still better than emacs
[11:50:06] https://www.irccloud.com/pastebin/S59ZOEn9/
[11:50:27] so cx_corpora doesn't seem to be the reason behind the explosion of x1
[12:08:26] Emperor: and I hear it's difficult to quit
[12:17:59] :)
[12:21:06] BTW that's what https://github.com/tridactyl/tridactyl is for (vim-like firefox)
[12:45:21] Amir1: can I start 2025/change_afl_defaults_T401906.py on s6 in codfw, then s2 and s5?
[12:50:37] Sure!
[13:01:25] (Let it finish on s6 first. Just in case bugs happen)
[13:13:35] zabe: just to confirm, we don't write to the old columns of categorylinks anywhere anymore except commonswiki and enwiki
[13:14:05] If so, I want to start dropping the columns before it's merged in code since it's a massive impact
[13:14:35] (also, we should also exclude testcommonswiki which is in s4 too but who cares :D)
[13:15:01] yes, we are only writing to the old columns (cl_to and cl_collation) on enwiki and commons by now
[13:15:33] 🎉
[13:16:31] commons is read new since yesterday
[13:16:37] this time without explosions
[13:17:57] well, the switchover script of s4 broke halfway through, leaving the whole section on RO for several minutes instead of the usual 30s. This section is cursed
[13:24:01] Can I get a +1 to https://gerrit.wikimedia.org/r/c/operations/puppet/+/1182142 please? Prep for adding 4 new eqiad ms frontends. Should be an easy review :)
[13:26:02] Emperor: {{done}}
[13:29:40] TY :)
[13:40:34] Amir1: created a patch for dropping cl_to and cl_collation - https://phabricator.wikimedia.org/T402925
[13:40:44] Thanks!!!!!
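For context on the cl_to / cl_collation drop discussed above: below is a minimal sketch of the kind of pre-flight check one might run to confirm that recent categorylinks writes no longer populate the old columns on a given wiki. It is not taken from the actual T402925 patch or from WMF's schema-change tooling; the host, credentials, and the assumption that migrated writes leave cl_collation empty are illustrative guesses.

```python
# Sketch only: confirm that rows written recently no longer populate the old
# cl_collation column before it (and cl_to) are dropped. Connection details and
# the "migrated writes leave cl_collation empty" convention are assumptions.
import pymysql

conn = pymysql.connect(
    host="db-replica.example",   # placeholder replica host
    user="research",             # placeholder read-only account
    password="...",
    database="enwiki",           # one of the wikis said to still use the old columns
)
try:
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT COUNT(*)
            FROM categorylinks
            WHERE cl_timestamp > NOW() - INTERVAL 1 DAY
              AND cl_collation <> ''
            """
        )
        (recent_old_writes,) = cur.fetchone()
        print(f"rows from the last day still using cl_collation: {recent_old_writes}")
finally:
    conn.close()
```

Under those assumptions, a count of zero on a migrated wiki would match what zabe confirms above, while a non-zero count would suggest some code path still writes the old columns.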
[13:40:48] it is wikishared.cx_corpora: https://phabricator.wikimedia.org/P81762
[13:41:26] jynus: ah, it was the dump, not snapshot
[13:42:39] It is very very likely caused by that change to start compressing HTML blobs but I'm very confused as this should bring it down, not up. Only thing I can think of is double compression or (more likely) they are not removing the old rows?
[13:44:11] Pinged them on the ticket
[13:44:22] Thanks for the investigation Jaime!
[13:44:24] I don't have any operational problem with it
[13:44:36] I just wanted the dbas to be fully aware
[13:44:42] before I silenced the alert
[13:44:51] which I will do now
[13:45:17] yeah, that's good to know our initiatives are doing the opposite effect
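The per-table breakdown jynus pasted (P81762) is not reproduced here; as a rough illustration of how the x1 dump growth can be attributed to wikishared.cx_corpora, the sketch below ranks tables in that database by on-disk footprint via information_schema. The host and credentials are placeholders, and on-disk size is only an approximation of logical dump size.

```python
# Sketch only: rank wikishared tables by on-disk footprint to see which one is
# driving the x1 growth. Placeholder connection details; data_length +
# index_length is a rough proxy for what ends up in the logical dump.
import pymysql

conn = pymysql.connect(
    host="db1216.example",       # placeholder; db1216 is the dump source named above
    user="research",             # placeholder read-only account
    password="...",
    database="information_schema",
)
try:
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT table_name,
                   ROUND((data_length + index_length) / 1024 / 1024 / 1024, 2) AS gib
            FROM tables
            WHERE table_schema = 'wikishared'
            ORDER BY (data_length + index_length) DESC
            LIMIT 10
            """
        )
        for table_name, gib in cur.fetchall():
            print(f"{table_name}\t{gib} GiB")
finally:
    conn.close()
```

If the cxc_content compression were working as intended, the cx_corpora footprint would shrink between dump runs; seeing it grow instead would support the "old uncompressed rows are not being removed" theory raised in the discussion above.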