[00:51:07] (03PS1) 10Cstone: Add a new recurring that will fail [wikimedia/fundraising/SmashPig] - 10https://gerrit.wikimedia.org/r/1269084 [00:51:23] larssandergreen: ^ that will put a recurring that will fail onto the donations queue [00:51:34] but you do have to run it from smashpig container do you have one of those locally? [00:51:37] (03CR) 10CI reject: [V:04-1] Add a new recurring that will fail [wikimedia/fundraising/SmashPig] - 10https://gerrit.wikimedia.org/r/1269084 (owner: 10Cstone) [00:52:42] (03PS2) 10Cstone: Add a new recurring that will fail [wikimedia/fundraising/SmashPig] - 10https://gerrit.wikimedia.org/r/1269084 [01:10:54] (03PS1) 10Ejegg: Merge branch 'master' into deployment [wikimedia/fundraising/crm] (deployment) - 10https://gerrit.wikimedia.org/r/1269090 [01:10:58] (03CR) 10Ejegg: [C:03+2] Merge branch 'master' into deployment [wikimedia/fundraising/crm] (deployment) - 10https://gerrit.wikimedia.org/r/1269090 (owner: 10Ejegg) [01:12:10] (03Merged) 10jenkins-bot: Merge branch 'master' into deployment [wikimedia/fundraising/crm] (deployment) - 10https://gerrit.wikimedia.org/r/1269090 (owner: 10Ejegg) [01:14:33] ok so I want to turn off recurring charges for a run to deploy that [01:14:40] the :21 run won't happen [01:17:46] nice, near the start of the run you see the O.G.s [01:18:01] running the .67s and .68s now [01:18:11] and... done [01:18:32] yeah the OGs are hardcore [01:18:38] ooh, first run had some .93s [01:18:51] ejegg: do you think we should change the (auto) max failures reached to show how many that is now? [01:18:55] .93?! 
[01:19:10] i guess that can be a follow on [01:19:31] gotta be ported over from ingenico [01:20:43] !log fundraising civicrm upgraded from e60321bb to d8d3871c [01:20:44] Logged the message at https://wikitech.wikimedia.org/wiki/Fundraising/SAL [01:20:52] ok, let's do some short runs [01:21:50] ok, those are looking good [01:22:05] since it's :21 I'll just do a full run and watch the output a bit [01:22:17] and I'll turn it back on in settings [01:22:41] do we want to split it [01:22:44] while youre in there [01:22:50] ooh yeah [01:22:58] let's get the new median cr_id [01:23:00] it almost ran out of time on the 6th [01:23:11] it was off for one hour but still [01:23:35] 06Fundraising Tech - Chaos Crew, 06Fundraising-Backlog: recurring dlocal gravy charge failed due to cvv issue - https://phabricator.wikimedia.org/T422773 (10AnnWF) 03NEW [01:23:56] 06Fundraising Tech - Chaos Crew, 06Fundraising-Backlog: recurring dlocal gravy charge failed due to cvv issue - https://phabricator.wikimedia.org/T422773#11802722 (10AnnWF) https://wikimedia.slack.com/archives/C070F1DVBRN/p1775612954569329 @Ejegg already slacked gravy for detail [01:25:24] 06Fundraising Tech - Chaos Crew, 06Fundraising-Backlog: recurring dlocal gravy charge failed due to cvv issue - https://phabricator.wikimedia.org/T422773#11802724 (10Ejegg) These are not going to be possible to rescue - we just need to avoid creating any more. Here's the task to do that: {T422566} [01:25:36] oh actually, wait, they had a successful charge once? [01:27:17] 06Fundraising Tech - Chaos Crew, 06Fundraising-Backlog: recurring dlocal gravy charge failed due to cvv issue - https://phabricator.wikimedia.org/T422773#11802728 (10Ejegg) Sorry, I should have read the description of this task better - very interesting that they actually got a successful recurring charge thro... [01:30:48] (03CR) 10Cstone: [C:03+2] "Thanks for all the work on this! [1,2] behaves locally and matches our current setup." 
[wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1267104 (https://phabricator.wikimedia.org/T413905) (owner: 10Ejegg) [01:30:59] woohoo cstone ! [01:31:04] yeah looks good ejegg [01:31:21] im gona eat then look at the rest of the chain [01:32:12] thanks! [01:34:23] ok, nothing being sent to pending here [01:34:42] I'll re-enable the job and push that, then come up with the config for the split jobs [01:35:18] the hard part: do we call them recurring_smashpig_charge_older and _newer? [01:38:49] ok, that's re-enabled, just in case I get distracted [01:38:59] ah yeah naming is hard hah [01:39:33] _ogs and _yungblud ? [01:40:02] _classical & _modern [01:40:30] gonna go with _older and _newer [01:40:51] current halfway point is 2275990 [01:52:55] ok cstone I just pushed the split configuration [01:52:59] (03Merged) 10jenkins-bot: Add setting for arbitrary charge retry cadence [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1267104 (https://phabricator.wikimedia.org/T413905) (owner: 10Ejegg) [01:53:15] for review [01:53:24] (not to civicrm) [01:57:42] cstone: shoot, I'm going to have to get to bed before the current run is over and we can get the split config out [01:57:57] I'll revert that and just paste the configs into an etherpad for review [01:58:43] yeah thanks ejegg i just got back to the comp we can do it tomo [02:00:35] 06Fundraising Tech - Chaos Crew, 06Fundraising-Backlog: Recurring charge job not getting through that days charges in 24 hours (February 28 and future problem) - https://phabricator.wikimedia.org/T418824#11802766 (10Ejegg) Configuration for review: https://etherpad.wikimedia.org/p/recurringChargeSplitConfig [02:01:33] ok, pushed the revert and pasted a link to the etherpad with config ^^ [02:01:36] good night! 
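Editor's note on the split discussed above: the plan is to divide the single recurring charge job into an _older and a _newer job around the median cr_id (2275990 at the time of the chat). The sketch below only illustrates that idea. The job names come from the chat, but the `max_id` / `min_id` keys and the `split_charge_jobs` helper are hypothetical and are not taken from the actual job configuration (which was pasted to an etherpad and is not shown here).

```python
from statistics import median_low


def split_charge_jobs(recur_ids):
    """Split recurring-charge work into two jobs around the median id.

    The dict keys max_id / min_id are illustrative assumptions, not
    real settings of the charge job.
    """
    # median_low always returns an id that is present in the data
    halfway = median_low(recur_ids)
    return {
        "recurring_smashpig_charge_older": {"max_id": halfway},
        "recurring_smashpig_charge_newer": {"min_id": halfway + 1},
    }


if __name__ == "__main__":
    print(split_charge_jobs([5, 1, 9, 3]))
```

Splitting on the median rather than on a fixed id keeps the two jobs roughly balanced, which is presumably why the halfway point was recomputed before writing the config.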
[03:02:44] (03CR) 10Cstone: [C:03+2] Add setting for minimum days between charges [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1268299 (https://phabricator.wikimedia.org/T413905) (owner: 10Ejegg) [03:03:01] (03CR) 10CI reject: [V:04-1] Add setting for minimum days between charges [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1268299 (https://phabricator.wikimedia.org/T413905) (owner: 10Ejegg) [03:03:02] (03CR) 10CI reject: [V:04-1] Skip retry calculations for un-retryable [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1268583 (owner: 10Ejegg) [03:03:03] (03CR) 10CI reject: [V:04-1] Use API4 to update contribution recur [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1268608 (owner: 10Ejegg) [09:58:18] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 2021 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 30 keys, up 22 days 11 hours - memory use is 261.25M (peak 288.24M, 3.33% of max, fragmentation 1.04%), connected_slaves is 5, donations is 2, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 72, jobs-paypal is 4, payments-antifraud is 3, payments-init is 20, recurring is 78, refund is 0, [09:58:18] ibe is 0 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [10:03:16] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 2202 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 34 keys, up 22 days 11 hours - memory use is 261.40M (peak 288.24M, 3.33% of max, fragmentation 1.04%), connected_slaves is 5, donations is 7, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 36, jobs-paypal is 71, payments-antifraud is 1, payments-init is 1, recurring is 67, refund is 0, [10:03:16] ibe is 0 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [10:08:16] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 2384 
2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 32 keys, up 22 days 11 hours - memory use is 262.26M (peak 288.24M, 3.34% of max, fragmentation 1.04%), connected_slaves is 5, donations is 4, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 4, jobs-paypal is 214, payments-antifraud is 0, payments-init is 14, recurring is 771, refund is [10:08:16] cribe is 0 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [10:13:16] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 2534 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 32 keys, up 22 days 11 hours - memory use is 262.67M (peak 288.24M, 3.35% of max, fragmentation 1.04%), connected_slaves is 5, donations is 104, jobs is 0, jobs-adyen is 1, jobs-dlocal is 0, jobs-gravy is 56, jobs-paypal is 222, payments-antifraud is 5, payments-init is 2, recurring is 808, refund i [10:13:16] bscribe is 0 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [10:18:16] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 2732 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 28 keys, up 22 days 11 hours - memory use is 262.99M (peak 288.24M, 3.35% of max, fragmentation 1.04%), connected_slaves is 5, donations is 5, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 39, jobs-paypal is 207, payments-antifraud is 5, payments-init is 22, recurring is 1634, refund i [10:18:16] bscribe is 0 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [10:23:19] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 2843 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 25 keys, up 22 days 11 hours - memory use is 263.55M (peak 288.24M, 3.36% of max, fragmentation 1.04%), connected_slaves is 5, donations is 586, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 
0, jobs-paypal is 196, payments-antifraud is 3, payments-init is 0, recurring is 847, refund is [10:23:19] scribe is 0 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [10:28:19] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 3000 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 26 keys, up 22 days 11 hours - memory use is 263.51M (peak 288.24M, 3.36% of max, fragmentation 1.04%), connected_slaves is 5, donations is 22, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 63, jobs-paypal is 166, payments-antifraud is 4, payments-init is 9, recurring is 1743, refund i [10:28:19] bscribe is 0 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [10:33:17] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 3148 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 28 keys, up 22 days 11 hours - memory use is 264.89M (peak 288.24M, 3.38% of max, fragmentation 1.04%), connected_slaves is 5, donations is 1272, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 26, jobs-paypal is 211, payments-antifraud is 3, payments-init is 3, recurring is 721, refund [10:33:17] ubscribe is 0 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [10:38:19] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 3303 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 29 keys, up 22 days 11 hours - memory use is 263.56M (peak 288.24M, 3.36% of max, fragmentation 1.04%), connected_slaves is 5, donations is 65, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 2, jobs-paypal is 179, payments-antifraud is 4, payments-init is 16, recurring is 1633, refund i [10:38:19] bscribe is 0 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [10:43:17] PROBLEM - check_redis on frqueue1003 
is CRITICAL: CRITICAL: pending is 3472 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 26 keys, up 22 days 11 hours - memory use is 264.74M (peak 288.24M, 3.38% of max, fragmentation 1.04%), connected_slaves is 5, donations is 906, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 57, jobs-paypal is 192, payments-antifraud is 5, payments-init is 5, recurring is 841, refund i [10:43:17] bscribe is 0 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [10:48:19] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 3638 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 25 keys, up 22 days 11 hours - memory use is 263.79M (peak 288.24M, 3.36% of max, fragmentation 1.04%), connected_slaves is 5, donations is 10, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 33, jobs-paypal is 143, payments-antifraud is 3, payments-init is 18, recurring is 1702, refund [10:48:19] ubscribe is 0 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [10:52:49] 06Fundraising Tech - Chaos Crew, 06Fundraising-Backlog: Undefined array key "order_id" Errors in PendingQueueConsumer.php - https://phabricator.wikimedia.org/T422807 (10jgleeson) 03NEW [10:53:17] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 3775 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 27 keys, up 22 days 12 hours - memory use is 264.63M (peak 288.24M, 3.37% of max, fragmentation 1.04%), connected_slaves is 5, donations is 711, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 5, jobs-paypal is 205, payments-antifraud is 4, payments-init is 2, recurring is 807, refund is [10:53:17] scribe is 0 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [10:58:19] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 3931 2000 - REDIS 7.0.15 on 
127.0.0.1:6379 has 1 databases (db0) with 30 keys, up 22 days 12 hours - memory use is 264.19M (peak 288.24M, 3.36% of max, fragmentation 1.04%), connected_slaves is 5, donations is 16, jobs is 0, jobs-adyen is 1, jobs-dlocal is 0, jobs-gravy is 43, jobs-paypal is 158, payments-antifraud is 7, payments-init is 13, recurring is 1656, refund [10:58:19] ubscribe is 0 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [11:03:17] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 4124 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 32 keys, up 22 days 12 hours - memory use is 265.70M (peak 288.24M, 3.39% of max, fragmentation 1.04%), connected_slaves is 5, donations is 1206, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 24, jobs-paypal is 179, payments-antifraud is 3, payments-init is 1, recurring is 810, refund [11:03:17] ubscribe is 0 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [11:04:23] (03PS1) 10Jgleeson: Add order_id to pending queue messages from recurring charges [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1269400 (https://phabricator.wikimedia.org/T422807) [11:05:53] fr-tech ^^ that fixes the pending queue consumer failures bug - however, the volume of pending messages has me worried, so I think we might need to roll Civicrm back to stem the flow before deploying the fix [11:08:19] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 4290 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 29 keys, up 22 days 12 hours - memory use is 264.46M (peak 288.24M, 3.37% of max, fragmentation 1.04%), connected_slaves is 5, donations is 0, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 0, jobs-paypal is 199, payments-antifraud is 0, payments-init is 10, recurring is 1666, refund is [11:08:19] scribe is 0 
https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [11:13:19] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 4457 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 30 keys, up 22 days 12 hours - memory use is 265.50M (peak 288.24M, 3.39% of max, fragmentation 1.04%), connected_slaves is 5, donations is 876, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 15, jobs-paypal is 213, payments-antifraud is 5, payments-init is 4, recurring is 795, refund i [11:13:19] bscribe is 0 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [11:18:19] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 4611 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 30 keys, up 22 days 12 hours - memory use is 264.84M (peak 288.24M, 3.37% of max, fragmentation 1.04%), connected_slaves is 5, donations is 10, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 6, jobs-paypal is 195, payments-antifraud is 2, payments-init is 11, recurring is 1653, refund i [11:18:19] bscribe is 0 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [11:21:22] fr-tech I think this issue is actually a non-issue - apart from the missing order_id field, which we have a fix for. Today's volume bump is due to the recurring charge job processing payments and now deciding which queue to push the transaction to based on its status. Previously, we were not checking the status and instead just pushing them all to the donation queue. Now that we're checking the status, we're hitting a [11:21:22] scenario where we don't get an immediate final status, resulting in a pending state (e.g., capturing). The charge job then picks the pending queue when making its target queue decision, which is fine because the transactions are eventually captured. 
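Editor's note: the behaviour change described at 11:21 (choosing the target queue from the charge status instead of always using the donations queue) can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the status strings and the `pick_queue` helper are hypothetical, and the real charge job may distinguish more states.

```python
# Hypothetical status names; the real processor statuses may differ.
FINAL_SUCCESS = {"complete"}
IN_FLIGHT = {"pending", "capturing"}


def pick_queue(status):
    """Return the queue a charge result should go to, or None if it failed."""
    normalized = status.lower()
    if normalized in FINAL_SUCCESS:
        return "donations"
    if normalized in IN_FLIGHT:
        # Not final yet: park it in pending until the capture notification arrives
        return "pending"
    return None


if __name__ == "__main__":
    for s in ("Complete", "Pending", "Failed"):
        print(s, "->", pick_queue(s))
```

Under this routing, a burst of charges that return a non-final status all land in the pending queue, which would be consistent with the pending count climbing past the 2000 alert threshold in the check_redis output.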
[11:23:14] also, I don't want to rollback Civicrm as it looks like ejegg|away deployed the new recurring work last night, so I'm not sure what state that is, and wouldn't want to risk introducing issues with that [11:23:19] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 4744 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 32 keys, up 22 days 12 hours - memory use is 265.53M (peak 288.24M, 3.38% of max, fragmentation 1.04%), connected_slaves is 5, donations is 635, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 7, jobs-paypal is 181, payments-antifraud is 2, payments-init is 2, recurring is 810, refund is [11:23:19] scribe is 0 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [11:24:16] let's see, maybe we should also patch the pending queue consumer just to use the invoice_id so that the downstream consumer can handle the missing field and quiet down the failures [11:24:41] I could probably self-merge that to get it out and avoid those 4744 pending messages triggering more failures [11:28:19] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 4889 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 32 keys, up 22 days 12 hours - memory use is 265.26M (peak 288.24M, 3.38% of max, fragmentation 1.04%), connected_slaves is 5, donations is 19, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 100, jobs-paypal is 199, payments-antifraud is 1, payments-init is 12, recurring is 1615, refund [11:28:19] subscribe is 0 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [11:33:19] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 5085 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 32 keys, up 22 days 12 hours - memory use is 266.64M (peak 288.24M, 3.40% of max, fragmentation 1.04%), connected_slaves is 5, donations is 1106, jobs is 0, 
jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 87, jobs-paypal is 178, payments-antifraud is 4, payments-init is 3, recurring is 906, refund [11:33:19] ubscribe is 0 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [11:38:19] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 5223 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 31 keys, up 22 days 12 hours - memory use is 265.34M (peak 288.24M, 3.38% of max, fragmentation 1.04%), connected_slaves is 5, donations is 0, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 3, jobs-paypal is 124, payments-antifraud is 3, payments-init is 11, recurring is 1721, refund is [11:38:19] scribe is 0 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [11:42:31] (03PS1) 10Jgleeson: Fall back to invoice_id when order_id missing in pending consumer [wikimedia/fundraising/SmashPig] - 10https://gerrit.wikimedia.org/r/1269411 (https://phabricator.wikimedia.org/T422807) [11:43:17] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 5408 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 28 keys, up 22 days 12 hours - memory use is 266.11M (peak 288.24M, 3.39% of max, fragmentation 1.04%), connected_slaves is 5, donations is 998, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 59, jobs-paypal is 8, payments-antifraud is 2, payments-init is 0, recurring is 194, refund is [11:43:17] cribe is 1 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [11:44:53] fr-tech I'm going to self-merge that fix ^ and push it out to smashpig standalone to help quiet down the failures [11:45:34] (03CR) 10Jgleeson: [C:03+2] "self-merging due to pending failures and no one about." 
[wikimedia/fundraising/SmashPig] - 10https://gerrit.wikimedia.org/r/1269411 (https://phabricator.wikimedia.org/T422807) (owner: 10Jgleeson) [11:46:09] (03Merged) 10jenkins-bot: Fall back to invoice_id when order_id missing in pending consumer [wikimedia/fundraising/SmashPig] - 10https://gerrit.wikimedia.org/r/1269411 (https://phabricator.wikimedia.org/T422807) (owner: 10Jgleeson) [11:48:19] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 5583 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 29 keys, up 22 days 12 hours - memory use is 264.64M (peak 288.24M, 3.37% of max, fragmentation 1.04%), connected_slaves is 5, donations is 4, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 19, jobs-paypal is 15, payments-antifraud is 4, payments-init is 9, recurring is 235, refund is 0 [11:48:19] ribe is 1 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [11:50:13] (03PS1) 10Jgleeson: Merge branch 'master' into deployment [wikimedia/fundraising/SmashPig] (deployment) - 10https://gerrit.wikimedia.org/r/1269415 [11:51:14] (03CR) 10Jgleeson: [C:03+2] Merge branch 'master' into deployment [wikimedia/fundraising/SmashPig] (deployment) - 10https://gerrit.wikimedia.org/r/1269415 (owner: 10Jgleeson) [11:51:43] (03Merged) 10jenkins-bot: Merge branch 'master' into deployment [wikimedia/fundraising/SmashPig] (deployment) - 10https://gerrit.wikimedia.org/r/1269415 (owner: 10Jgleeson) [11:53:18] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 5702 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 30 keys, up 22 days 13 hours - memory use is 264.65M (peak 288.24M, 3.37% of max, fragmentation 1.04%), connected_slaves is 5, donations is 2, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 3, jobs-paypal is 21, payments-antifraud is 6, payments-init is 4, recurring is 28, refund is 0, [11:53:18] be is 1 
https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [11:53:30] !log SmashPig upgraded from 5c083891 to 100101fb [11:53:31] Logged the message at https://wikitech.wikimedia.org/wiki/Fundraising/SAL [11:53:37] ok let's see if that quiets down those failmails [11:55:47] gonna grab some lunch - family here. will check back soon [11:58:18] PROBLEM - check_redis on frqueue1003 is CRITICAL: CRITICAL: pending is 5875 2000 - REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 30 keys, up 22 days 13 hours - memory use is 264.95M (peak 288.24M, 3.38% of max, fragmentation 1.04%), connected_slaves is 5, donations is 16, jobs is 0, jobs-adyen is 2, jobs-dlocal is 0, jobs-gravy is 66, jobs-paypal is 15, payments-antifraud is 0, payments-init is 13, recurring is 72, refund is [11:58:18] cribe is 1 https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [12:03:18] RECOVERY - check_redis on frqueue1003 is OK: OK: REDIS 7.0.15 on 127.0.0.1:6379 has 1 databases (db0) with 30 keys, up 22 days 13 hours - memory use is 259.36M (peak 288.24M, 3.31% of max, fragmentation 1.04%), connected_slaves is 5, donations is 2, jobs is 0, jobs-adyen is 0, jobs-dlocal is 0, jobs-gravy is 27, jobs-paypal is 20, payments-antifraud is 4, payments-init is 3, pending is 0, recurring is 38, refund is 0, unsubscribe is 0 ht [12:03:18] nga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=frqueue1003&service=check_redis [12:08:15] Ahh jgleeson|lunch I'm up too early if you need any more help, ejegg|away was worried about this happening last night and tested it but the charge job starts with the oldest so no gravy's were charged with those tests [12:16:35] (03CR) 10Damilare Adedoyin: [C:03+2] Add order_id to pending queue messages from recurring charges [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1269400 (https://phabricator.wikimedia.org/T422807) (owner: 10Jgleeson) [12:17:43] are we 
actually getting those recurrings into civi from gravy? [12:18:06] one from an hour ago isnt there [12:19:40] damilare: jgleeson|lunch ^ I'm gonna pause the charge job until we know that [12:19:56] ok sounds good [12:20:32] we were literally talking about splitting the charge job last night which would have surfaced this sooner [12:23:34] okay thats paused but I was a minute too slow at it so the :21 job started [12:23:38] so 500 more in this state [12:26:52] ok they are captured in the gravy console so worst case they will come in on the audit tomorrow [12:28:43] seems some are coming in from gravy into civi [12:28:58] is the ipn delayed a bit then? [12:29:50] I just checked a couple of cc recurrings [12:30:04] https://civicrm.wikimedia.org/civicrm/contact/view?reset=1&cid=68670294 this one isnt there yet [12:30:13] which ones did you see not yet in civi [12:30:19] from an hour ago [12:30:28] https://civicrm.wikimedia.org/civicrm/contact/view?reset=1&cid=66385559 [12:30:37] these were the two I looked at lemme look at an older one [12:31:15] ok lemme check the logs too [12:32:11] heres an older one Payment successful - invoice_id: 233777149.9 with status: Pending and processor_id: cb0704b2-928a-43e2-aff1-3a2c1e316e10 [12:32:55] hmm i dont see that one either https://civicrm.wikimedia.org/civicrm/contact/view?reset=1&cid=33407835 [12:34:12] ok looks like we did get the message in the listener [12:34:26] for the first one, lemme check the next [12:34:35] have we been ignoring them for this situation? 
[12:35:02] im gonna tell DR they are not in civi while we figure this out because they are eagle eyed and someone will notice :) [12:43:00] (03Merged) 10jenkins-bot: Add order_id to pending queue messages from recurring charges [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1269400 (https://phabricator.wikimedia.org/T422807) (owner: 10Jgleeson) [12:43:25] ok looks like that transaction was added to the pending queue without the order_id so when the success message came the Gravy RecordCaptureJob could not find it [12:44:52] ah okay and i half asleep misread what jgleeson|lunch had said [12:44:57] seems Jacks fix got the PTR to pick it up but since it was already successful it dropped it [12:45:20] ahh hah [12:45:58] i think we can just revert what elliott and wenjun changed last night, if we want to wait a couple hours for them to get up too [12:46:10] we can split the charge job like we were talking about last night to catch up [12:46:39] im gonna go sleep for one more hour hah [12:47:37] np [12:48:00] deploying the smashpig patch should fix it too [12:48:10] back [12:48:31] wb jgleeson and thanks for the quick fix [12:48:47] np - do we have new bugs ha [12:50:01] nah c.stone just stopped the recurring job so no more recurrings are affected [12:50:18] oh right [12:50:27] thanks for merging that other patch [12:50:35] that should fix the issue at source I think [12:51:04] there will have been ones lost which are unrelated to the fix [12:51:20] as the lack of an order_id will mean the recordcapture job failed [12:51:21] some donations did not make it through because they were successful before ptr got them and the request charge job could not find them in the pending table without the order id [12:51:33] yes exactly [12:51:53] yeah those ones were already lost i guess so the audit will help us there [12:52:01] yeah ^ [12:52:30] nothing like a good meltdown to remind everyone how it all works :D [12:52:52] we can also revert elliotts and wenjuns change from 
last night, this was all for pix recurrings which i dont think we have many of yet [12:53:10] we can leave the charge job off till everyone wakes up cause I wanted to split it last night too [12:53:17] so we can catch up with that later today [12:54:02] cstone: i don't think we need to revert as the push to pending is actually more accurate, it just gets there a different way. previously we were hard coding the donations queue which is technically wrong as some of them might not have been settled [12:54:11] so the audit would have cleaned those up too, from the other end [12:54:19] But we still update everything in civi for the recurrings [12:54:26] Like the next scheduled date [12:54:42] Which I don't think we have a flow for if it's not captured [12:54:56] yeah that's a good point. [12:54:57] So maybe add that in if we don't revert [12:55:11] we should diagram this and the edge cases [12:55:23] let's see what diagrams do we have around recurring [12:55:54] https://wikitech.wikimedia.org/wiki/Fundraising/Data_and_flow/Recurring [12:55:55] I guess this would help with the ach issue but what status should the recurring charge job show for that [12:56:00] looks like a perfect place [12:56:03] Ok I need to sleep one more hour haha but I'll be back [12:56:20] ah right - sleep tight ha! [13:00:25] (03PS1) 10Jgleeson: Merge branch 'master' into deployment [wikimedia/fundraising/crm] (deployment) - 10https://gerrit.wikimedia.org/r/1269446 [13:00:34] I'll push that civicrm fix out [13:00:43] thanks for the merge damilare [13:00:49] np [13:04:46] actually I'm gonna cherry pick the fix. 
that deploy also includes ejegg|away recurring schedule updates https://gerrit.wikimedia.org/r/c/wikimedia/fundraising/crm/+/1267104 which I thought had already gone out [13:04:57] I don't wanna send that out if it hasn't already gone [13:06:01] (03PS1) 10Jgleeson: Add order_id to pending queue messages from recurring charges [wikimedia/fundraising/crm] (deployment) - 10https://gerrit.wikimedia.org/r/1269447 (https://phabricator.wikimedia.org/T422807) [13:06:04] (03Abandoned) 10Jgleeson: Merge branch 'master' into deployment [wikimedia/fundraising/crm] (deployment) - 10https://gerrit.wikimedia.org/r/1269446 (owner: 10Jgleeson) [13:06:17] (03CR) 10Jgleeson: [C:03+2] Add order_id to pending queue messages from recurring charges [wikimedia/fundraising/crm] (deployment) - 10https://gerrit.wikimedia.org/r/1269447 (https://phabricator.wikimedia.org/T422807) (owner: 10Jgleeson) [13:07:30] (03Merged) 10jenkins-bot: Add order_id to pending queue messages from recurring charges [wikimedia/fundraising/crm] (deployment) - 10https://gerrit.wikimedia.org/r/1269447 (https://phabricator.wikimedia.org/T422807) (owner: 10Jgleeson) [13:08:57] !log civicrm upgraded from d8d3871c to 3d3c0a62 [13:08:58] Logged the message at https://wikitech.wikimedia.org/wiki/Fundraising/SAL [13:14:48] ok looks like the recurrings created from the affected transactions are failing to import due to the missing donations [13:15:06] they're in the damaged queue now [13:17:04] oh wait that doesn't make sense... 
these are from one time donations today [13:18:05] huh [13:18:42] some recurring signups are being pushed to the damaged queue because the initial donations didn't come in [13:19:15] it isn't related to the recurring charge job issue you were fixing [13:19:37] ah right [13:20:03] I wonder if the pending queue backlog was preventing the donation being captured [13:21:39] possibly the RecordCaptureJob couldn't find the transaction in the pending table [13:22:09] and by the time the PTR got to it was already successful [13:22:19] and by the time the PTR got to it, it was already successful [13:29:36] 👍 [13:31:41] 06Fundraising-Backlog, 06FR-donorrelations: diagnose Trustly donation failure - https://phabricator.wikimedia.org/T422724#11804407 (10MBeat33) 05Open→03Resolved a:03MBeat33 Closing this, but if we get confirmation of something buggy from the donor may reopen. [13:34:26] 06Fundraising Tech - Chaos Crew, 06Fundraising-Backlog, 06FR-donorrelations: diagnose Trustly donation failure - https://phabricator.wikimedia.org/T422724#11804419 (10jgleeson) [13:37:27] arg! looks like that pending thing caused a mess? [13:37:30] sorry fr-tech [13:37:37] looking at the fix [13:37:53] np ejegg. it's fixed and the affected records will get backfilled by the audit. summary sent in email [13:38:55] dang, missing 'order_id' [13:39:04] yep [13:39:51] ...and just dropped, not even set to 'damaged' [13:40:31] I can snag those from the logs and get them to the donations queue, to clear it up before waiting for the audit [13:40:32] I don't think they were dropped ejegg [13:40:43] they still got saved but with an empty order_id [13:40:45] jgleeson: oh, are they in pending without the order_id? 
[13:40:47] ah [13:41:09] the record capture job uses that field so that's where the "drop" would be [13:45:49] ugh, so my tests last night were bad because it was still just hitting the older non-gravy donations [13:46:43] hah, i would have caught it if I had deployed that split job config! [13:47:04] ok, so I would still prefer to re-send these messages to 'donations' [13:47:12] just to make sure they get recorded in the usual way [13:47:26] undo your change? [13:47:50] oh you mean rescue the ones from today [13:47:59] the audit will send them to the donations queue I think [13:50:51] OK, I re-queued them all to the donations queue [13:51:40] right, i wasn't sure if the audit would get all the details right if it can't find the extra stuff in the pending table due to the missing order_id [13:51:43] that was quick [13:52:00] yeah, I made sure to log the whole JSON blob before sending to pending, so I just grepped them out [13:52:07] and used PopulateQueueFromDump [13:52:14] nice [13:52:26] actually ejegg [13:52:31] some of those will have been patched [13:52:37] with the first fix at the consumer [13:52:48] fix at the consumer? [13:52:51] i.e. 
the log lines won't all be bad ones [13:52:54] lemme pull the latest [13:52:59] yeah let me share a link [13:53:12] that's fine, the donations queue consumer can discard the duplicate [13:53:12] https://phabricator.wikimedia.org/T422807 [13:53:15] s [13:53:28] hotfix was https://gerrit.wikimedia.org/r/c/wikimedia/fundraising/SmashPig/+/1269411 [13:53:35] source fix was https://gerrit.wikimedia.org/r/c/wikimedia/fundraising/crm/+/1269400 [13:54:03] hotfix added order_id on the fly to already queued messages destined to fail [13:54:12] ok, great, that should work [13:54:48] so what I was expecting was that for anything coming back as pending we would quickly get the capture IPN [13:55:10] and that would get it into the donations queue almost as fast as directly sending it there [13:55:32] but would account for the possibility of caputure or ach installment failing [13:55:51] anyway, let me look at the logs to see if that was actually happening for any of them [13:56:27] I think this is a good change (sending to pending) as it more accurately reflects the flow [13:56:40] I think it was intended just for pix? 
but as a byproduct we got it for all [13:57:50] yeah, I was worried it would catch other things, so last night I did a couple slow-starts then monitored a whole run of the charge job [13:57:53] and didn't see any [13:58:01] of course, those were all going through Adyen [13:58:05] since we charge older ones first [13:58:15] I figured we were fine and went to sleep [13:58:27] yh it would be useful for the trustly ach flow as well [13:58:43] but at 8:something UTC when it hit the gravy charges it started to get plenty of pendings [13:59:26] i really should have expected that as we see plenty of pending when we try to capture in sandbox [14:00:17] ok, so let's see if anything got processed via ipn jobs from the pending queue [14:02:32] in related news: I was reminded myself how the audit pushed stuff to the queue here https://github.com/wikimedia/wikimedia-fundraising-crm/blob/979e930f203f881c16ff2079124474339ee048c0/ext/wmf-civicrm/Civi/WMFAudit/BaseAuditProcessor.php#L1445 [14:02:57] I was thinking, this is a big file, but 50% of it is comments ha [14:03:05] good problems [14:03:34] reminding* [14:03:56] ok, 462 of them DID get processed via IPN messages [14:05:06] huh, I don't see anything in pending with null order_id [14:05:19] sorry, lemme read your email, maybe you cleared those out? 
[14:05:42] I didn't clear any out [14:07:26] ejegg: I looked at this code https://github.com/wikimedia/wikimedia-fundraising-SmashPig/blob/697224613c770ec312d87a3913622093aa8f3311/Core/QueueConsumers/PendingQueueConsumer.php#L27 [14:08:15] oh maybe it defaults to '' [14:08:17] let me see [14:08:40] nope, nothing with '' either [14:09:58] hmm, looking up some of the messages from the failmails by gateway_txn_ids and I'm not seeing anything either [14:10:44] looks like that undefined array key must have caused a fatal error [14:11:49] or at least one the queue consumer's error handling didn't know to handle by kicking out to 'damaged' [14:11:50] also looking [14:13:16] PendingDatabase validation just requires one of order_id or gateway_txn_id [14:13:23] and all these seem to have the gateway_txn_id [14:13:43] I thought those were PHP warnings [14:14:04] i.e. non-fatal [14:14:11] yeah, I'm surprised too [14:14:18] that it even alerted [14:14:39] that made more sense to me [14:14:44] we have undefined array keys peppered around the logs still, but not causing failmail [14:14:57] do you see them actually stored in the table though? [14:15:01] I'll look for a few more [14:15:14] just looking for a failed trxn id [14:16:48] I think the record capture job might be doing it [14:16:52] 2026-04-09 09:05:02,079 INFO 2026-04-09T09:05:02+00:00 [WARNING] {SmashPig-Gravy-QueueJobRunner::SPCID-0444502230} Could not find donor details for authorization Reference '8de60e07-6535-4c66-83b3-0025e36121e5' and order ID '246045244.1'. [14:17:08] that's a symptom of the row not being there though ^^^ [14:17:25] oh wait there [14:17:30] anyway, it doesn't delete rows any more, just marks them resolved [14:17:44] the PTR got there [14:17:46] ah right [14:17:59] it looks like the PTR deal with it. 
does that remove them maybe [14:18:10] no, just marks them is_resolved [14:18:24] 2026-04-09 11:40:03,065 ERROR civicrm.wmf.INFO: Payment is not awaiting approval - current status: complete for order_id: 246045244.1 [14:18:33] "pending_id":141657525 [14:18:44] hmm well it must have been there to be picked up [14:18:50] I'll try running one of those through the un-patched pending qc [14:19:06] `/var/log/process-control/gravy_pending_resolver/gravy_pending_resolver-20260409-114002.log` [14:19:06] jgleeson: but that's a .1 order_id [14:19:22] it wouldn't be a recurring charge if it's a .1, right? [14:19:56] ohh man [14:20:02] this is my eyes [14:20:09] look at this email and you will understand [14:20:13] FAILMAIL - ALERT: civi1002 (no_provider) SmashPig-ConsumePendingQueue::SPCID-0174012897 [14:20:22] there's a good one above it (the one I'm looking into) [14:20:25] grrr [14:20:30] ok lemme look at the bad one [14:21:49] yeah, lots of extraneous data in those failmails [14:21:54] look up 230803219.12 [14:25:38] jgleeson: right, it looks like it never got inserted into pending [14:25:56] the IPN came in but the recordcapturejob didn't find details [14:27:10] and then it finally came into the donations queue consumer when I re-queued them all around 13:50 UTC [14:29:19] ohhhhh the PendingQueueConsumer is not like all the civi-based ones [14:29:47] yeah I can't find it either [14:30:03] it doesn't have the same error handling since it's pure SmashPig code [14:30:12] no WmfException stuff [14:30:43] anyway, I got a bad message into the queue, let me try running it [14:32:05] aha, it's in the isTransactionFailed() check [14:32:18] SmashPig\Core\DataStores\PaymentsInitialDatabase::fetchMessageByGatewayOrderId(): Argument #2 ($orderId) must be of type string, null given, called in /srv/smashpig/Core/DataStores/PaymentsInitialDatabase.php on line 24 [14:32:40] so typesafety gives us this uncatchable thing [14:32:47] and the message is just lost [14:33:18] I'll reply to the 
email with these details [14:34:17] ah right [14:34:44] I guess they were lost then so the audit would have pulled them in as a last resort [14:35:17] although you've manually backfilled them which is handy [14:40:43] yeah, the audit would have recorded them oddly absent the details from the pending table [14:46:35] ok, so these 'recurring queue run before donations queue' are completely unrelated to that? [14:47:37] damilare: i see in the backscroll that you were looking at those [14:48:02] do you want me to puzzle over that, as I'm on the chaos crew again this sprint? [14:48:03] they were due to the backlog of the pending ejegg [14:48:07] ohhhh [14:48:08] sure [14:48:31] right, so the pending qc actually quit each time it hit a bad message, huh? [14:48:43] and then it doesn't run again for another whole 10 minutes [14:49:16] dang, it should really be a service-type thing that just restarts every time it quits [14:49:24] by the time the capture success webhook message came in for the new one time donations from the listener, the pending table hadn't been populated with the new transaction [14:49:39] yeah damilare that's pretty bad [14:49:58] since the record capture job just drops it in that case, right? [14:50:19] ok, I'll try to replay the affected messages [14:50:34] (the jobs for the initial donations) [14:51:04] yh [14:51:31] looks like no phab for that particular issue yet, right? [14:51:38] are you also using the PopulateQueueFromDump for those [14:51:46] the replay [14:51:50] damilare: yeah [14:51:50] no, I'd write one now [14:52:07] sure, thanks! 
[14:55:26] man, that pending queue consumer needs to be more reliable [14:55:35] 06Fundraising Tech - Chaos Crew, 06Fundraising-Backlog: MISSING_PREDECESSOR failmail on Recurring Sign-up messages - https://phabricator.wikimedia.org/T422839 (10Damilare) 03NEW [14:55:42] given that so many things can go wrong when it's not running [14:55:44] thanks damilare [14:56:33] * ejegg is tempted to run the darn thing in a while true; loop... [15:06:40] 03Fundraising Sprint G - 2026, 06Fundraising-Backlog, 10FR-Donor-portal: Donor Portal - handling of PayPal donations while edits are not possible - https://phabricator.wikimedia.org/T421962#11804960 (10Damilare) a:03Damilare [15:24:38] (03PS1) 10Ejegg: Handle Errors in queue consumer, not just Exceptions [wikimedia/fundraising/SmashPig] - 10https://gerrit.wikimedia.org/r/1269501 (https://phabricator.wikimedia.org/T422839) [15:24:52] I think ^^^ would have mitigated the impact there [15:26:01] the bad pending messages would still have caused failmail, but would have gone to 'damaged' instead of being dropped, and the show (queue processor) would have gone on [15:32:49] 06Fundraising Tech - Chaos Crew, 06Fundraising-Backlog: Undefined array key "order_id" Errors in PendingQueueConsumer.php - https://phabricator.wikimedia.org/T422807#11805253 (10Ejegg) Thanks so much for these fixes, Jack! And sorry for the noise. After your first fix started getting the messages started int... [15:35:39] 06Fundraising Tech - Chaos Crew, 06Fundraising-Backlog: Undefined array key "order_id" Errors in PendingQueueConsumer.php - https://phabricator.wikimedia.org/T422807#11805289 (10Ejegg) Note this related phab for a problem caused by the pending queue being repeatedly killed due to bad messages: {T422839} This... 
[15:37:10] 06Fundraising Tech - Chaos Crew, 06Fundraising-Backlog, 13Patch-For-Review: MISSING_PREDECESSOR failmail on Recurring Sign-up messages - https://phabricator.wikimedia.org/T422839#11805305 (10Ejegg) Thanks for figuring that out, Damilare! And again, sorry for the mess. So the root cause was the bad pending... [15:37:13] (03CR) 10Damilare Adedoyin: [C:03+2] Handle Errors in queue consumer, not just Exceptions [wikimedia/fundraising/SmashPig] - 10https://gerrit.wikimedia.org/r/1269501 (https://phabricator.wikimedia.org/T422839) (owner: 10Ejegg) [15:38:06] lol even officewiki has a Donate link in the header [15:38:10] (03Merged) 10jenkins-bot: Handle Errors in queue consumer, not just Exceptions [wikimedia/fundraising/SmashPig] - 10https://gerrit.wikimedia.org/r/1269501 (https://phabricator.wikimedia.org/T422839) (owner: 10Ejegg) [15:38:19] thanks for the CR Dami! [15:42:04] I copied the email replies to the phabs to avoid them being buried [15:42:30] make sense, thanks for figuring that out too [15:43:28] heh, PHP7 has only been around 10 yrs or so, better late than never :P [15:45:13] k, let me smoketest that with another bad pending message [15:48:29] yep, ends up in the damaged store now [15:48:48] Well, that probably merits a smash-pig tag [15:49:12] jgleeson: I'll revert your temp consumer fix too, ok? 
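Editor's sketch of the failure mode discussed above, in Python rather than the PHP the consumer is actually written in. All names and sample IDs here are hypothetical stand-ins, not SmashPig's API. It assumes only what the log states: validation accepts a message that has either `order_id` or `gateway_txn_id`, a later strictly typed lookup rejects a missing `order_id`, and the merged fix lets the consumer catch that error, send the message to 'damaged', and keep going.

```python
from collections import deque


def is_valid(message: dict) -> bool:
    # Mirrors the rule quoted in the log: one of the two identifiers is enough.
    return bool(message.get("order_id") or message.get("gateway_txn_id"))


def fetch_by_order_id(gateway: str, order_id: str) -> str:
    # Hypothetical stand-in for a strictly typed lookup; a missing order_id is rejected.
    if not isinstance(order_id, str):
        raise TypeError(f"order_id must be str, got {type(order_id).__name__}")
    return f"{gateway}:{order_id}"


def consume(queue: deque, damaged: list) -> list:
    # Drain the queue; any failing message goes to 'damaged' and the loop continues.
    processed = []
    while queue:
        message = queue.popleft()
        try:
            if not is_valid(message):
                raise ValueError("message has neither order_id nor gateway_txn_id")
            processed.append(fetch_by_order_id(message["gateway"], message.get("order_id")))
        except Exception as error:  # deliberately broad, so type errors are caught too
            damaged.append({"message": message, "error": str(error)})
    return processed


# Made-up sample messages: the second passes validation but has no order_id.
queue = deque([
    {"gateway": "gravy", "order_id": "1001.1", "gateway_txn_id": "aaa"},
    {"gateway": "gravy", "gateway_txn_id": "bbb"},
    {"gateway": "gravy", "order_id": "1002.3", "gateway_txn_id": "ccc"},
])
damaged = []
processed = consume(queue, damaged)
```

With a narrower `except` clause the second message would stop the loop and the third would wait for the next scheduled run, which is roughly the backlog described earlier in the log. In PHP the equivalent distinction is catching `Throwable` rather than only `Exception`, since `TypeError` is an `Error`.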
[15:49:26] sorry ejegg just on the departures call [15:49:31] will check backlog [15:49:52] (03PS1) 10Ejegg: Revert "Fall back to invoice_id when order_id missing in pending consumer" [wikimedia/fundraising/SmashPig] - 10https://gerrit.wikimedia.org/r/1269510 [15:50:29] was just going to do that revert before tagging ^^^ jgleeson [15:50:47] since your fix to the Civi job should ensure they all have order_id now [15:50:58] yeah thanks ejegg [15:51:01] :) [15:51:06] (03CR) 10Ejegg: [C:03+2] Revert "Fall back to invoice_id when order_id missing in pending consumer" [wikimedia/fundraising/SmashPig] - 10https://gerrit.wikimedia.org/r/1269510 (owner: 10Ejegg) [15:51:39] (03Merged) 10jenkins-bot: Revert "Fall back to invoice_id when order_id missing in pending consumer" [wikimedia/fundraising/SmashPig] - 10https://gerrit.wikimedia.org/r/1269510 (owner: 10Ejegg) [15:54:55] 03Fundraising Sprint G - 2026, 06Fundraising-Backlog, 10Wikimedia-Fundraising-CiviCRM: Add custom fields to Direct Mail activities for segment and ask - https://phabricator.wikimedia.org/T422728#11805452 (10Lars) Scheduled for Q4 maintenance window. 
[16:08:09] 03Fundraising Sprint - Floor is Lava, 03Fundraising Sprint G - 2026, 06Fundraising-Backlog, 07fr-current-sprint, 05FY25-26 WE3.5 Donor Identification and recognition: CiviCRM is connected to MediaWiki - https://phabricator.wikimedia.org/T416950#11805500 (10Jdlrobson-WMF) [16:11:27] 06Fundraising Tech - Chaos Crew, 06Fundraising-Backlog: TypeError for extraData from getApplePayErrors - https://phabricator.wikimedia.org/T422855 (10AnnWF) 03NEW [16:12:46] (03PS1) 10Wfan: Do not trim if undefined [extensions/DonationInterface] - 10https://gerrit.wikimedia.org/r/1269515 (https://phabricator.wikimedia.org/T422855) [16:27:40] (03PS2) 10Wfan: Do not trim if undefined [extensions/DonationInterface] - 10https://gerrit.wikimedia.org/r/1269515 (https://phabricator.wikimedia.org/T422855) [16:30:18] (03PS11) 10Ejegg: Add setting for minimum days between charges [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1268299 (https://phabricator.wikimedia.org/T413905) [16:30:18] (03PS7) 10Ejegg: Skip retry calculations for un-retryable [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1268583 [16:30:18] (03PS5) 10Ejegg: Use API4 to update contribution recur [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1268608 [16:30:18] (03PS5) 10Ejegg: Remove unneeded query [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1268609 [16:43:13] (03CR) 10Damilare Adedoyin: [C:03+2] Do not trim if undefined [extensions/DonationInterface] - 10https://gerrit.wikimedia.org/r/1269515 (https://phabricator.wikimedia.org/T422855) (owner: 10Wfan) [16:43:26] thanks Dami~ [16:43:33] best wish for your passport [16:44:27] 06Fundraising Tech - Chaos Crew, 06Fundraising-Backlog, 13Patch-For-Review: TypeError for extraData from getApplePayErrors - https://phabricator.wikimedia.org/T422855#11805714 (10AnnWF) a:03AnnWF [16:45:47] (03Merged) 10jenkins-bot: Do not trim if undefined [extensions/DonationInterface] - 10https://gerrit.wikimedia.org/r/1269515 
(https://phabricator.wikimedia.org/T422855) (owner: 10Wfan) [16:46:17] (03PS12) 10Ejegg: Add setting for minimum days between charges [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1268299 (https://phabricator.wikimedia.org/T413905) [16:46:30] fixed a grammatical error in the setting description ^^ [16:47:42] :) thanks wfan [16:55:24] (03CR) 10CI reject: [V:04-1] Remove unneeded query [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1268609 (owner: 10Ejegg) [16:58:36] ah dang, all of those rebased ones are going to fail [16:58:53] needs one more constructor update in the test [16:59:47] 06Fundraising-Backlog: Look into what async capture means for the smashpig recurring charge job - https://phabricator.wikimedia.org/T422863 (10Cstone) 03NEW [17:01:05] (03PS13) 10Ejegg: Add setting for minimum days between charges [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1268299 (https://phabricator.wikimedia.org/T413905) [17:01:05] (03PS8) 10Ejegg: Skip retry calculations for un-retryable [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1268583 [17:01:05] (03PS6) 10Ejegg: Use API4 to update contribution recur [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1268608 [17:01:06] (03PS6) 10Ejegg: Remove unneeded query [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1268609 [17:01:23] split jobs seem to be running fine [17:01:45] as expected, nothing picked up by the _older job [17:02:25] gonna grab some lunch then will get to the promised code review [17:03:24] oh and i think we can try re-queueing those damaged 'MISSING PRECECESSOR' messages too, I just want to spot check a few IDs [17:37:30] (03CR) 10Ejegg: [C:03+2] Remove unneeded details from export test [wikimedia/fundraising/tools] - 10https://gerrit.wikimedia.org/r/1268688 (https://phabricator.wikimedia.org/T416948) (owner: 10Lars SG) [17:37:52] (03CR) 10Ejegg: [C:03+2] Change AF_lifetime_usd_total in Acoustic export to 
both_funds_lifetime_usd_total [wikimedia/fundraising/tools] - 10https://gerrit.wikimedia.org/r/1268637 (https://phabricator.wikimedia.org/T422533) (owner: 10Lars SG) [17:39:26] (03Merged) 10jenkins-bot: Remove unneeded details from export test [wikimedia/fundraising/tools] - 10https://gerrit.wikimedia.org/r/1268688 (https://phabricator.wikimedia.org/T416948) (owner: 10Lars SG) [17:39:55] (03Merged) 10jenkins-bot: Change AF_lifetime_usd_total in Acoustic export to both_funds_lifetime_usd_total [wikimedia/fundraising/tools] - 10https://gerrit.wikimedia.org/r/1268637 (https://phabricator.wikimedia.org/T422533) (owner: 10Lars SG) [17:50:45] 03Fundraising Sprint - Floor is Lava, 03Fundraising Sprint G - 2026, 06Fundraising-Backlog, 10Wikimedia-Fundraising-CiviCRM, and 3 others: Add token for recurring donation failure link - https://phabricator.wikimedia.org/T419046#11805953 (10Lars) See {T422164} for implementation of country url param from s... [17:51:33] (03CR) 10Ejegg: [C:03+2] "Looks good! Comment is just an idea for a future patch." [wikimedia/fundraising/tools] - 10https://gerrit.wikimedia.org/r/1269019 (https://phabricator.wikimedia.org/T422689) (owner: 10Lars SG) [17:53:07] (03Merged) 10jenkins-bot: Merge foundation and endowment latest fields for Acoustic export [wikimedia/fundraising/tools] - 10https://gerrit.wikimedia.org/r/1269019 (https://phabricator.wikimedia.org/T422689) (owner: 10Lars SG) [17:57:04] 06Fundraising Tech - Chaos Crew, 06Fundraising-Backlog: Recurring charge job not getting through that days charges in 24 hours (February 28 and future problem) - https://phabricator.wikimedia.org/T418824#11805964 (10Ejegg) We have deployed the split job configuration, with one job running the older 1/2 of recu... 
[18:12:18] larssandergreen: it looks like this patch just got stuck in CI limbo: https://gerrit.wikimedia.org/r/c/wikimedia/fundraising/crm/+/1267094 [18:12:36] but I see on the linked task 'Requires T422221: Converts SearchKit batch import mappedRow keys to match standard import format.' [18:12:37] T422221: Converts SearchKit batch import mappedRow keys to match standard import format. - https://phabricator.wikimedia.org/T422221 [18:13:31] and that T422221 task seems to still be open. So should the already-merged-upstream thing be C-1'ed for now, to keep it from merging till T422221 is done? [18:14:17] (03CR) 10Lars SG: "recheck" [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1267094 (https://phabricator.wikimedia.org/T421978) (owner: 10Lars SG) [18:20:50] ejegg: thanks for all the review. That last one is fine to merge, without T422221 it just doesn't do anything as we ignore this importType except for a couple minor bits that run for all imports [18:20:51] T422221: Converts SearchKit batch import mappedRow keys to match standard import format. - https://phabricator.wikimedia.org/T422221 [18:20:59] ok, cool [18:21:15] thanks for the explanation [18:27:14] 06Fundraising Tech - Chaos Crew, 06Fundraising-Backlog, 10Wikimedia-Fundraising-CiviCRM: Backfill location type for emails for existing contacts using Apple, Google, Amazon, Venmo or PayPal - https://phabricator.wikimedia.org/T420992#11806038 (10AnnWF) a:03AnnWF [18:35:14] 06Fundraising-Backlog, 06Fundraising-Tech-Roadmap, 10Wikimedia-Fundraising-CiviCRM, 06FR-donorrelations, and 2 others: Add global Do not contact button to contact summary - https://phabricator.wikimedia.org/T404989#11806062 (10SHust) Hi @SBurnett-WMF! Do you have any updates from legal?
[18:47:55] (03CR) 10Wfan: [C:03+2] Fetch enabled countries on Gravy to determine which currencies are unsupported and requires fallback [extensions/DonationInterface] - 10https://gerrit.wikimedia.org/r/1251522 (https://phabricator.wikimedia.org/T419704) (owner: 10Damilare Adedoyin) [18:49:54] (03Merged) 10jenkins-bot: Fetch enabled countries on Gravy to determine which currencies are unsupported and requires fallback [extensions/DonationInterface] - 10https://gerrit.wikimedia.org/r/1251522 (https://phabricator.wikimedia.org/T419704) (owner: 10Damilare Adedoyin) [18:50:07] thanks wfan [18:51:04] (03PS1) 10Lars SG: Merge branch 'master' into deploy [wikimedia/fundraising/tools] (deploy) - 10https://gerrit.wikimedia.org/r/1269575 [18:56:49] (03CR) 10Lars SG: [C:03+2] Merge branch 'master' into deploy [wikimedia/fundraising/tools] (deploy) - 10https://gerrit.wikimedia.org/r/1269575 (owner: 10Lars SG) [18:57:19] (03Merged) 10jenkins-bot: Merge branch 'master' into deploy [wikimedia/fundraising/tools] (deploy) - 10https://gerrit.wikimedia.org/r/1269575 (owner: 10Lars SG) [19:00:50] !log tools upgraded from 986f7f83 to 9bff5f07 [19:00:51] Logged the message at https://wikitech.wikimedia.org/wiki/Fundraising/SAL [19:53:30] larssandergreen: I'm getting ALLL the contribution IDs when I try to make a search task, not just the selected ones [19:53:40] can't seem to find a property with just the selected ones [19:54:18] hmm, it is posted with the ajax request [19:55:46] Looks like it should be $this->_contributionIds, is that what you're using? [19:55:55] And this is in QF or SK? 
[19:56:09] ok, now i see it in _submitValues['id'] [19:56:15] larssandergreen: QF [19:56:30] in _contributionIds I get all the ones from the search [19:56:37] the selected ones are just in _submitValues [19:56:42] I'll work with that anyway [19:59:18] The class your task form should extend: https://github.com/civicrm/civicrm-core/blob/master/CRM/Contribute/Form/Task.php [19:59:34] Should be loading those for you, if that's not happening correctly, might be worth investigating why [20:00:31] ok, will look [20:01:01] working with _submitValues hits a problem once you actually submit the form too, as the real 'submit' doesn't have those IDs in the list [20:17:48] larssandergreen: $values['radio_ts'] is not set when selecting things from searchkit results [20:18:24] so getSelectedIDs returns [] [20:18:31] and it just falls back to the full list [20:18:42] (in calculateIDS) [20:23:54] ejegg: this one works in SK: CRM_Contribute_Form_Task_PDF [20:25:35] 10fundraising-tech-ops, 06DC-Ops, 10ops-eqiad, 06SRE: Q3:rack/setup/install frdb1008 - https://phabricator.wikimedia.org/T414374#11806465 (10Dwisehaupt) [20:26:05] 10fundraising-tech-ops, 06DC-Ops, 10ops-eqiad, 06SRE: Q3:rack/setup/install frdb1008 - https://phabricator.wikimedia.org/T414374#11806468 (10Dwisehaupt) Host is built and databases cloned. Closing. [20:27:11] ejegg: but it extends CRM_Case_Form_Task which extends CRM_Core_Form_Task, so maybe it's getSelectedIDs that works in there and the contribution specific one is busted? 
[20:33:17] oh weird, ok [20:34:24] 10fundraising-tech-ops, 06DC-Ops, 10ops-eqiad, 06SRE: Q3:rack/setup/install frdb1008 - https://phabricator.wikimedia.org/T414374#11806507 (10Dwisehaupt) 05Open→03Resolved [20:38:03] (03CR) 10Lars SG: "see comment" [wikimedia/fundraising/tools] - 10https://gerrit.wikimedia.org/r/1269019 (https://phabricator.wikimedia.org/T422689) (owner: 10Lars SG) [20:42:33] 06Fundraising-Backlog, 10Wikimedia-Fundraising-CiviCRM: Merge foundation and endowment latest fields for Acoustic export - https://phabricator.wikimedia.org/T422689#11806538 (10Lars) silverpop database updates for this and {T422533} ` -- silverpop_export_stat ALTER TABLE silverpop_export_stat... [21:02:29] (03PS3) 10Ejegg: Permissions check for refund form [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1268662 [21:02:29] (03PS3) 10Ejegg: WIP searchTask version of refund form [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1269005 (https://phabricator.wikimedia.org/T421277) [21:02:56] larssandergreen: I slightly hackishly got it to work ^^^ [21:03:07] Still needs a lot of polish, see TODO in commit message [21:05:41] nice, i'll take a look [21:08:02] 06Fundraising-Backlog, 10Wikimedia-Fundraising-CiviCRM: Merge foundation and endowment latest fields for Acoustic export - https://phabricator.wikimedia.org/T422689#11806607 (10Dwisehaupt) New table layouts: ` MariaDB [silverpop]> describe silverpop_export_stat; +--------------------------------+--------------... [21:17:51] ok, with recurring charges at a lull I'm going to deploy the retry cadence config [21:23:07] oh hah, we need to update the frmon graph [21:23:20] showing no charges since it was scraping from the old job's output [21:23:34] right, not at a lull yet [21:23:42] let's see how many left for today [21:25:52] 514 supposedly [21:26:22] i'll prep the deploy for after this run [21:27:57] oh, anyone want to review this one quick so I only have to reconcile settings the one time? 
https://gerrit.wikimedia.org/r/c/wikimedia/fundraising/crm/+/1268299?usp=dashboard [21:28:53] actual lines of code changed is like 6 [21:29:14] fr-tech ^^ [21:30:53] 03Fundraising Sprint G - 2026, 06Fundraising-Backlog, 10Wikimedia-Fundraising-CiviCRM: Merge foundation and endowment latest fields for Acoustic export - https://phabricator.wikimedia.org/T422689#11806668 (10Lars) [21:31:57] (03CR) 10Ejegg: WIP searchTask version of refund form (031 comment) [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1269005 (https://phabricator.wikimedia.org/T421277) (owner: 10Ejegg) [21:34:06] (03PS1) 10Ejegg: Merge branch 'master' into deployment [wikimedia/fundraising/crm] (deployment) - 10https://gerrit.wikimedia.org/r/1269662 [21:34:18] well, maybe safest to deploy one settings change at a time [21:34:28] ejegg: I was ready to plus to that second one last night [21:34:36] But yeah one at a time [21:34:49] Probably is good hah [21:36:27] Plus to hah [21:39:27] (03CR) 10Ejegg: [C:03+2] Merge branch 'master' into deployment [wikimedia/fundraising/crm] (deployment) - 10https://gerrit.wikimedia.org/r/1269662 (owner: 10Ejegg) [21:39:30] cool cool [21:40:25] (03Merged) 10jenkins-bot: Merge branch 'master' into deployment [wikimedia/fundraising/crm] (deployment) - 10https://gerrit.wikimedia.org/r/1269662 (owner: 10Ejegg) [21:43:59] Failing at signing in on gerrit on my phone ejegg I stopped for food on the way back from X-rays cause the traffic lights have no power and it was a chaos situation hah [21:44:07] But when I get home I'll get that other one [21:49:56] !log fundraising civicrm upgraded from 3d3c0a62 to eb188fa2 [21:49:57] Logged the message at https://wikitech.wikimedia.org/wiki/Fundraising/SAL [21:51:07] hmm, managed reconcile didn't show the new setting on the page, just tried flush [21:51:57] ok, now it shows on the page [21:52:06] but I think reconcile must have created it in the db [21:52:27] anyway, there's been at least one failure rescheduled with 
the new code now [21:52:30] let's check the date [21:53:16] "next_sched_contribution_date": "2026-04-10 21:51:20", [21:53:18] looks good [21:55:33] (03PS1) 10Lars SG: Test updates to switch from lifetime_usd_total to lifetime_including_endowment [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1269669 (https://phabricator.wikimedia.org/T422533) [21:55:38] ok, that's the charge run failed [21:55:49] I mean finished! [21:55:51] not failed [21:56:12] there were some normal failed charges in it, and they were rescheduled as they were supposed to be :) [21:56:48] (03PS2) 10Lars SG: Test updates to switch from lifetime_usd_total to lifetime_including_endowment [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1269669 (https://phabricator.wikimedia.org/T422533) [21:56:50] (03CR) 10Ejegg: [C:03+2] Test updates to switch from lifetime_usd_total to lifetime_including_endowment [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1269669 (https://phabricator.wikimedia.org/T422533) (owner: 10Lars SG) [22:02:56] ok, I'm going to head home.
Might be back on later, might not [22:03:22] 06Fundraising-Backlog, 10fundraising-tech-ops, 10Observability-Alerting: Update firewall rules to allow frtech hosts to send alerts to production alertmanger - https://phabricator.wikimedia.org/T422888 (10Dwisehaupt) 03NEW [22:04:58] 06Fundraising-Backlog, 10fundraising-tech-ops, 10Observability-Alerting, 13Patch-For-Review: Update firewall rules to allow frtech hosts to send alerts to production alertmanger - https://phabricator.wikimedia.org/T422888#11806737 (10Dwisehaupt) [22:09:34] (03PS1) 10Lars SG: Update wmfcivicrm Upgrader for lifetime_including_endowment [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1269674 (https://phabricator.wikimedia.org/T422533) [22:11:21] 06Fundraising-Backlog, 10Wikimedia-Fundraising-CiviCRM: Remove un-needed separate endowment/foundation calculated fields from Civi & Acoustic - https://phabricator.wikimedia.org/T418885#11806744 (10Lars) For Q4 maintenance window, remove all the disabled custom fields. 
[22:18:35] (03CR) 10CI reject: [V:04-1] Test updates to switch from lifetime_usd_total to lifetime_including_endowment [wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1269669 (https://phabricator.wikimedia.org/T422533) (owner: 10Lars SG) [22:19:40] 06Fundraising Tech - Chaos Crew, 06Fundraising-Backlog, 10Wikimedia-Fundraising-CiviCRM: Fill in missing CiviCRM core caching - https://phabricator.wikimedia.org/T422552#11806751 (10Lars) a:03Lars [22:19:53] 06Fundraising Tech - Chaos Crew, 06Fundraising-Backlog, 10Wikimedia-Fundraising-CiviCRM: Add a campaign to email activities for double opt in emails - https://phabricator.wikimedia.org/T422395#11806752 (10Lars) a:03Lars [22:24:35] 03Fundraising Sprint - Floor is Lava, 03Fundraising Sprint G - 2026, 06Fundraising-Backlog, 10Wikimedia-Fundraising-CiviCRM, and 2 others: Add frequency, native currency amount and USD amount to recurring pause and cancel activities from Donor Portal - https://phabricator.wikimedia.org/T421733#11806760 (... 
[22:27:27] 03Fundraising Sprint G - 2026, 06Fundraising-Backlog, 10Wikimedia-Fundraising-CiviCRM, 13Patch-For-Review: Add is_eligible_for_donor_portal bool to Acoustic export - https://phabricator.wikimedia.org/T422670#11806783 (10Lars) a:03Lars [22:27:35] 03Fundraising Sprint - Floor is Lava, 03Fundraising Sprint G - 2026, 06Fundraising-Backlog, 10Wikimedia-Fundraising-CiviCRM: Check for merged contact if no match on contact id in SaveContact - https://phabricator.wikimedia.org/T420263#11806784 (10Lars) 05Open→03Resolved [22:28:22] 03Fundraising Sprint - Floor is Lava, 03Fundraising Sprint G - 2026, 06Fundraising-Backlog, 10Wikimedia-Fundraising-CiviCRM, 10FR-Donor-portal: Add token for CiviCRM emails to conditionally add Donor Portal link - https://phabricator.wikimedia.org/T419437#11806786 (10Lars) a:03Lars [23:28:32] (03PS1) 10Wfan: Merge branch 'master' into deployment [extensions/DonationInterface] (deployment) - 10https://gerrit.wikimedia.org/r/1269714 [23:28:41] (03CR) 10Wfan: [C:03+2] Merge branch 'master' into deployment [extensions/DonationInterface] (deployment) - 10https://gerrit.wikimedia.org/r/1269714 (owner: 10Wfan) [23:29:39] 06Fundraising Tech - Chaos Crew, 06Fundraising-Backlog, 07payments-orchestration: general gravy validation error as bad_request - https://phabricator.wikimedia.org/T421958#11806881 (10AnnWF) p:05Triage→03High [23:31:35] !log payments-wiki upgraded from 064a770e to c017d7e7 [23:31:36] Logged the message at https://wikitech.wikimedia.org/wiki/Fundraising/SAL [23:58:04] (03PS1) 10Lars SG: Add form to edit cancel reason for cancel / completed recurrings. 
[wikimedia/fundraising/crm] - 10https://gerrit.wikimedia.org/r/1269727 (https://phabricator.wikimedia.org/T389197) [23:58:40] 03Fundraising Sprint G - 2026, 06Fundraising-Backlog, 10Wikimedia-Fundraising-CiviCRM, 06FR-donorrelations, and 2 others: Feature Request: Ability to Edit / Change recurring cancellation reason in Civi dropdown - https://phabricator.wikimedia.org/T389197#11806971 (10Lars) a:03Lars