[09:24:49] ebernhardson: that's with airflow templating I guess? Ya IIRC this was one of the reasons we decided not to use airflow templating on SQL
[09:24:55] we only use hive parameterization
[09:25:09] and, actually, we are trying not to use Hive (MapReduce) at all
[09:25:17] instead using SparkSQLOperator
[09:52:32] cloudelastic is not happy
[09:53:09] seems like we banned 2 nodes cloudelastic1001-cloudelastic-omega-eqiad OR cloudelastic1005-cloudelastic-omega-eqiad
[11:00:12] Lunch
[11:00:35] dcausse: that might be a left over from the switch maintenance
[11:02:15] seems like it's related to T329073
[11:02:16] T329073: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073
[11:16:03] lunch
[14:10:15] dcausse what status was it in?
[14:11:22] o/
[14:11:29] inflatador: yellow so I did nothing
[14:11:42] Ah OK, np.
[14:12:14] we'll unban once the switch maintenance is over. the main eqiad cluster is yellow too ATM
[14:13:32] ok
[14:51:53] Switch maintenance is over, unbanning nodes
[14:58:01] OK, we should be good on cloudelastic and the prod eqiad clusters
[15:02:03] dcausse once again I stupidly set up a pairing session without inviting you. If you can join now LMK, if not we can pick it up tomorrow or whenever works for you
[15:02:17] :)
[15:02:22] I'm around
[15:02:26] https://meet.google.com/oxm-jadj-jtp if you can make it
[15:32:27] \o
[15:35:02] o/
[15:49:17] ebernhardson thanks for hanging in there. I'm going to work out but I'll work on the next airflow patch once I get back in ~40m
[17:08:26] inflatador: kk
[17:14:20] inflatador: looks like only 2 wdqs servers are pooled in eqiad
[17:18:35] repooling all of them
[17:19:22] for some reason only 1004 & 1006 were repooled
[17:47:14] dcausse thanks, forgot to finish the job before I worked out. Back
[17:47:50] np!
[17:50:49] just repooled the wdqs-internal hosts too
[17:51:57] thanks!
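Note on the node bans mentioned at 09:53:09 and 14:51:53: the log does not show how the ban was applied. A minimal sketch, assuming it goes through Elasticsearch's shard allocation filtering setting `cluster.routing.allocation.exclude._name` sent as a transient cluster setting. The helper name `ban_settings` is hypothetical, and whether the team uses `_name`, transient settings, or a wrapper tool is an assumption.

```python
import json


def ban_settings(node_names):
    """Build a cluster-settings body that excludes nodes from shard allocation.

    Assumption: bans are expressed via cluster.routing.allocation.exclude._name.
    An empty list yields null, which clears the setting (i.e. an "unban").
    """
    value = ",".join(node_names) if node_names else None
    return {"transient": {"cluster.routing.allocation.exclude._name": value}}


if __name__ == "__main__":
    # Node names copied from the 09:53:09 message.
    banned = [
        "cloudelastic1001-cloudelastic-omega-eqiad",
        "cloudelastic1005-cloudelastic-omega-eqiad",
    ]
    print(json.dumps(ban_settings(banned), indent=2))  # body for a ban
    print(json.dumps(ban_settings([]), indent=2))      # body for an unban
```

The resulting JSON would be the body of a `PUT _cluster/settings` request; the sketch only builds the body and does not contact a cluster.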
[18:34:08] seeing some partitions in refined events, going to unblock the streaming_updater airflow tasks in priority
[18:39:24] lunch, back in ~40
[18:44:03] done ores_predictions_hourly & mediawiki_revision_recommendation_create_hourly
[18:45:57] hm transfer_to_es_hourly.upload_to_swift failed today at 6am :/
[18:47:06] need to run, will check later or tomorrow
[18:55:20] hmm, i wonder if the upload failed because of something to do with the hdfs read-only maintenance. will look
[18:56:47] yea that looks like it, going to reset the execution and let it try now that read-only is over: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby.
[19:02:44] runs fine after resetting
[19:17:08] ryankemper, inflatador: looks like our airflow instance should be ready: T326193
[19:17:09] T326193: Airflow upgrade (refactor deb creation + version bump + switch to PostgreSQL) - https://phabricator.wikimedia.org/T326193
[19:17:33] yea we set it up this morning, i'm currently poking things and will probably turn on the first dag today to see how it goes
[19:18:14] i'll probably switch over from migrating things to deploying the dags we've already migrated and dropping them out of the old installation
[19:18:31] ebernhardson: cool!
[20:01:44] is it possible to somehow use CirrusSearch to query for articles where images are used without alt text? my first attempt in Special:Search with `insource:|alt=*` (predictably) didn't get anywhere.
[20:02:33] or would it be more reasonable to have a weighted flag that gets set in a PageSaveCompleteHook / maintenance script, iterating over image usages in an article and identifying ones which lack an `alt` property?
[20:03:01] (not urgent, just idle questions!)
[20:09:39] kostajh: hmm, plausibly some sort of wikitext magic could work but that only works for one-off queries, we can't really support any volume of regex searches (there is a limit of 10 concurrent insource regex queries). If we wanted to do those kinds of queries regularly it would indeed need some sort of tag attached to the page
[21:56:09] OK, I banned elastic1060-66 (Row D hosts) in all clusters in preparation for https://phabricator.wikimedia.org/T322082 tomorrow, will downtime/depool these hosts shortly as well
[23:20:52] ryankemper: could i get you to scap deploy /srv/deployment/airflow-dags/search? the artifacts should at least work now, hopefully that means it will finish the deploy
[23:29:49] ebernhardson: sure, ~6m
[23:40:26] ebernhardson: seems happy, phab paste upcoming
[23:41:17] ebernhardson: https://phabricator.wikimedia.org/P45330
[23:41:30] cool! indeed i see a bunch of turned off dags on an-airflow1005 now
[23:41:54] thanks
[23:54:03] Groceries, back in ~35m
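Note on the alt-text question (20:01:44 to 20:09:39): the "iterate over image usages and tag the page" idea could look roughly like the sketch below. It is a standalone Python illustration, not MediaWiki hook code, and all function names are hypothetical. It only inspects inline `[[File:...]]` / `[[Image:...]]` links in raw wikitext; images placed through templates or `<gallery>` tags are not covered.

```python
import re

# Start of an inline file link, e.g. "[[File:" or "[[Image:" (case-insensitive).
FILE_LINK_START = re.compile(r"\[\[\s*(?:File|Image)\s*:", re.IGNORECASE)


def file_links(wikitext):
    """Yield the inner text of each file link, honoring nested [[...]] in captions."""
    for match in FILE_LINK_START.finditer(wikitext):
        depth, i = 0, match.start()
        while i < len(wikitext) - 1:
            pair = wikitext[i:i + 2]
            if pair == "[[":
                depth += 1
                i += 2
            elif pair == "]]":
                depth -= 1
                i += 2
                if depth == 0:
                    yield wikitext[match.start() + 2:i - 2]
                    break
            else:
                i += 1


def split_top_level(inner):
    """Split on '|' but ignore pipes inside nested [[...]] or {{...}}."""
    parts, depth, start, i = [], 0, 0, 0
    while i < len(inner):
        pair = inner[i:i + 2]
        if pair in ("[[", "{{"):
            depth += 1
            i += 2
        elif pair in ("]]", "}}"):
            depth -= 1
            i += 2
        elif inner[i] == "|" and depth == 0:
            parts.append(inner[start:i])
            start = i + 1
            i += 1
        else:
            i += 1
    parts.append(inner[start:])
    return parts


def files_missing_alt(wikitext):
    """Return file names whose inline link carries no alt= parameter."""
    missing = []
    for inner in file_links(wikitext):
        name, *params = split_top_level(inner)
        if not any(p.strip().lower().startswith("alt=") for p in params):
            missing.append(name.split(":", 1)[1].strip())
    return missing


if __name__ == "__main__":
    sample = (
        "[[File:A.jpg|thumb|alt=A cat|Caption]] text "
        "[[File:B.png|thumb|See [[Cat|cats]]]] [[Image:C.svg]]"
    )
    print(files_missing_alt(sample))
```

A page for which `files_missing_alt` returns a non-empty list would be the candidate for the kind of page-level tag described at 20:09:39, so that search can filter on the tag instead of running regex queries.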