[02:12:11] (PS1) Eileen: Add confirm modal to batch merge in deduper [wikimedia/fundraising/crm] - https://gerrit.wikimedia.org/r/750651 (https://phabricator.wikimedia.org/T296042)
[02:12:25] Fundraising Sprint Xenomorph Petting Zoo, Fundraising-Backlog, fundraising sprint Wireless Zipline, fundraising sprint Yeet-coaster, and 2 others: Add a second layer of confirmation to the batch merge deduper button - https://phabricator.wikimedia.org/T296042 (Eileenmcnaughton) a:Eileenmcnaugh...
[02:14:45] Fundraising Sprint Technical debt house of horrors, Fundraising Sprint Visual C Saw, Fundraising Sprint Xenomorph Petting Zoo, Fundraising-Backlog, and 10 others: update the docs on the civi-acoustic import and export scripts - https://phabricator.wikimedia.org/T286934 (Eileenmcnaughton)
[02:15:14] Fundraising Sprint Technical debt house of horrors, Fundraising Sprint Visual C Saw, Fundraising Sprint Xenomorph Petting Zoo, Fundraising-Backlog, and 10 others: update the docs on the civi-acoustic import and export scripts - https://phabricator.wikimedia.org/T286934 (Eileenmcnaughton)
[02:37:40] Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM: Proposal - import master suppression list to CiviCRM & put in a group - https://phabricator.wikimedia.org/T298381 (Eileenmcnaughton)
[02:45:27] Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM: Acoustic MG import disabled? - https://phabricator.wikimedia.org/T298382 (Eileenmcnaughton)
[02:53:56] Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM: Proposal - add an activity when we unsubscribe a contact via queue - https://phabricator.wikimedia.org/T298383 (Eileenmcnaughton)
[15:44:40] Fundraising Sprint Xenomorph Petting Zoo, Fundraising-Backlog, fundraising sprint Wireless Zipline, fr-donorservices, MW-1.38-notes (1.38.0-wmf.13; 2021-12-13): applepay donations TY email doesn’t have the donor's name - https://phabricator.wikimedia.org/T296881 (SHust) Latest CID affected: 5...
[19:02:11] PROBLEM - check_load on frdb1003 is CRITICAL: CRITICAL - load average: 40.11, 29.05, 15.29 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frdb1003&service=check_load
[19:07:11] PROBLEM - check_load on frdb1003 is CRITICAL: CRITICAL - load average: 39.21, 35.41, 21.86 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frdb1003&service=check_load
[19:12:11] PROBLEM - check_load on frdb1003 is CRITICAL: CRITICAL - load average: 35.92, 36.42, 26.14 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frdb1003&service=check_load
[19:15:58] AndyRussG, dstrine, ejegg|away, jgleeson|mobile: the stupid over repetitive alerts are back
[19:16:47] thanks RhinosF1, i'll ack that and see what's up
[19:17:01] Ty ejegg|away
[19:17:11] PROBLEM - check_load on frdb1003 is CRITICAL: CRITICAL - load average: 32.01, 34.45, 28.22 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frdb1003&service=check_load
[19:17:20] :)
[19:17:21] * RhinosF1 gets pinged by them so always sees quick and hates how often they go off
[19:19:41] oops, I seem not to have rights to ack that one
[19:19:54] RhinosF1: thx!! btw in case this is useful, as I guess you know, these are about events on the FR cluster, which is in most regards quite separate form the main production cluster... just a thought, depending on what sorts of pings may be useful for u, maybe you'd like see if there's a way to turn off your pings for fundraising cluster alerts? again just a thought
[19:20:17] hmm, no dwisehaupt or Jeff_Green on IRC right now
[19:20:43] ejegg: probably just a runaway query for something, no?
[19:20:57] AndyRussG: maybe, or maybe a replication problem
[19:21:17] RhinosF1: the FR SREs to have their own phone-based pings to hear about anything that does actually become critical
[19:21:59] AndyRussG: unfortunately stupid irc doesn't let me do per channel pings
[19:22:04] RhinosF1: are you able to ack that alert and note that fr-tech is investigating?
[19:22:07] PROBLEM - check_load on frdb1003 is CRITICAL: CRITICAL - load average: 27.76, 30.13, 28.06 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frdb1003&service=check_load
[19:22:08] and there are also a bunch of e-mail alerts sent only to FR engineers for other possible issues, so we do mostly noice when things are blowing up
[19:22:23] *notice
[19:22:41] What I find strangest is the fr alerts go off constantly
[19:22:50] Most alerts only go off when there's a state change
[19:23:50] RhinosF1: my Icinga account doesn't have permissions to ack that, it turns out. Are you able to?
[19:24:07] You can put 'fr-tech is investigating' in the comment
[19:24:12] ejegg: I'm just a normal person. I can find someone
[19:24:32] RhinosF1: ejegg also I think it's not urgent to ack? other than RhinosF1's ping issue?
[19:25:08] ah, OK, I thought maybe getting pinged would on alert would mean you had ack rights. We can contact our ops team via other means
[19:25:09] RhinosF1: just saying, if you can tolerate a few pings, for us it's not urgent enough to wake up any vacationing SREs elsewhere, I think
[19:25:26] AndyRussG: ok
[19:25:37] RhinosF1: hugely appreciated and apologies that it's a bother
[19:25:47] I did raise on another task that alerts shouldn't really go off every 5 minutes
[19:26:04] * RhinosF1 mutes the channel
[19:26:48] RhinosF1: maybe we can somehow change the IRC message to have some FR-specific string that you can include in a regex for ping setup?
[19:27:04] fundraising-tech-ops, observability: check_mysql / load on fr* is extremely spammy - https://phabricator.wikimedia.org/T296811 (RhinosF1) Resolved→Open a:Jgreen→None
[19:27:11] PROBLEM - check_load on frdb1003 is CRITICAL: CRITICAL - load average: 28.41, 29.20, 28.22 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frdb1003&service=check_load
[19:27:15] fundraising-tech-ops, observability: check_mysql / load on fr* is extremely spammy - https://phabricator.wikimedia.org/T296811 (RhinosF1) check_load is the same
[19:27:46] AndyRussG: very unlikely as it's based on CRITICAL:
[19:27:52] I muted the channel for now
[19:30:34] AndyRussG: I guess frdb1003 is the fr-analytics db server?
[19:30:52] so it could be a superset refresh gone awry?
[19:32:11] PROBLEM - check_load on frdb1003 is CRITICAL: CRITICAL - load average: 12.77, 23.01, 26.10 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frdb1003&service=check_load
[20:02:11] RECOVERY - check_load on frdb1003 is OK: OK - load average: 0.48, 0.45, 4.06 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frdb1003&service=check_load
[23:33:01] Fundraising-Backlog, Wikimedia-Fundraising-CiviCRM: Move attempt to guess contact's country & language from silverpop to import scripts to CiviCRM - https://phabricator.wikimedia.org/T298400 (Eileenmcnaughton)