[07:04:55] started a backfill of wikidata in codfw using: python3 cirrus_reindex.py codfw wikidatawiki backfill 2024-05-07T23:30:00 2024-05-14T16:15:00
[10:40:45] lunch
[12:52:11] o/
[14:27:52] quick errand
[15:00:44] will be 5 mins late for the Wed meeting
[16:20:35] break, back in ~30
[16:31:16] dcausse: pfischer: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1031966
[16:56:37] back
[17:28:13] re: airflow build, hatch seems to be working much better than virtualenv
[17:32:37] and the answer is ... $wgCirrusSearchWriteClusters is null, even though in ext-CirrusSearch.php it's set to []
[17:33:18] * ebernhardson wonders if some sort of default-value behaviour is turning the empty array into the default null value
[18:31:33] * ebernhardson apparently hasn't dug through config loading in a few years... so many steps :P
[18:39:35] finally... indeed that's exactly what's happening: // Optimistic: If the global is not set, or is an empty array, replace it entirely.
[18:39:51] so our empty array gets replaced with the default, just needs to switch to provide_default
[21:27:19] random fun thing: while Spark understands hdfs:///foo to mean "use the default Hadoop cluster", pyarrow doesn't understand it :P Not a big deal, but gotta add analytics-hadoop to a couple of URLs
[21:27:53] the default Hadoop utils seem to understand that too, seems like a pyarrow bug
[23:30:49] discolytics updates yesterday didn't 100% work, made some updates and verified it's now working as expected (I hope)
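
A note on the 18:39 exchange: the quoted "Optimistic" comment describes MediaWiki replacing a global that is unset or an empty array with the extension's default, while the provide_default merge strategy only falls back to the default when the variable was never set at all. A rough Python sketch of the difference (illustrative names only, not MediaWiki's actual PHP):

    DEFAULT = None  # stand-in for the extension default of $wgCirrusSearchWriteClusters

    def merge_optimistic(site_has_value, site_value, default):
        # "Optimistic: If the global is not set, or is an empty array, replace it entirely."
        if not site_has_value or site_value == []:
            return default
        return site_value

    def merge_provide_default(site_has_value, site_value, default):
        # provide_default: only use the default when the site config never set
        # the variable, so an explicit [] from ext-CirrusSearch.php survives.
        return site_value if site_has_value else default

    print(merge_optimistic(True, [], DEFAULT))       # None -> the [] was clobbered
    print(merge_provide_default(True, [], DEFAULT))  # []   -> the [] is kept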
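
On the 21:27 hdfs:/// issue: one way to work around it is to qualify scheme-only URIs with an explicit nameservice before handing them to pyarrow; the helper, the path, and hard-coding analytics-hadoop as the default below are illustrative, not the actual change:

    from pyarrow import fs

    def qualify_hdfs_uri(uri: str, nameservice: str = "analytics-hadoop") -> str:
        """Rewrite hdfs:///path (empty authority) to hdfs://<nameservice>/path."""
        prefix = "hdfs:///"
        if uri.startswith(prefix):
            return f"hdfs://{nameservice}/" + uri[len(prefix):]
        return uri

    # Spark resolves hdfs:///some/table via fs.defaultFS, but pyarrow wants the
    # authority spelled out; FileSystem.from_uri then splits the qualified URI
    # back into a filesystem object and a plain path.
    uri = qualify_hdfs_uri("hdfs:///some/table")  # hypothetical path
    filesystem, path = fs.FileSystem.from_uri(uri)
    print(filesystem.get_file_info(path))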