[01:30:13] PROBLEM - Check unit status of monitor_refine_event on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit monitor_refine_event https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[06:42:32] (PS5) Joal: Update / fix HQL jobs [analytics/refinery] - https://gerrit.wikimedia.org/r/801416
[06:48:57] (PS1) Joal: Fix sqoop page table removing page_restrictions [analytics/refinery] - https://gerrit.wikimedia.org/r/802422
[07:18:12] (CR) Joal: [WIP] Add projectview hql scripts to analytics/refinery/hql path. (4 comments) [analytics/refinery] - https://gerrit.wikimedia.org/r/797240 (https://phabricator.wikimedia.org/T309023) (owner: Snwachukwu)
[07:19:04] (CR) Joal: [V: +2 C: +2] "Self-merging bug for deploy" [analytics/refinery] - https://gerrit.wikimedia.org/r/802422 (owner: Joal)
[07:19:57] (CR) Joal: [V: +2 C: +2] "Merging for deploy" [analytics/refinery] - https://gerrit.wikimedia.org/r/801416 (owner: Joal)
[07:22:04] (CR) Joal: [V: +2 C: +2] "Merging for deploy" [analytics/refinery] - https://gerrit.wikimedia.org/r/798643 (owner: Joal)
[07:26:47] !log Deploy refinery using scap
[07:26:49] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[08:07:53] !log Deploy refinery onto HDFS
[08:07:55] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[08:47:34] !log Merging 2 airflow spark3 jobs now that their refinery counterpart is deployed
[08:47:36] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[08:54:43] !log Deploy airflow with spark3 jobs
[08:54:45] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:46:55] !log Manually mark interlanguage historical tasks failed in airflow
[09:46:57] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:56:14] !log Relaunch sqoop after having deployed a corrective patch
[09:56:16] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:07:07] RECOVERY - Check unit status of refinery-sqoop-whole-mediawiki on an-launcher1002 is OK: OK: Status of the systemd unit refinery-sqoop-whole-mediawiki https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[12:15:17] !log Deploy interlanguage fix to airflow
[12:15:21] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[12:36:58] !log Kill interlanguage-daily oozie job after successful move to airflow
[12:37:00] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[12:44:18] !log Remove wrongly formatted interlanguage data
[12:44:20] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[13:19:32] !log Drop and recreate wmf_raw.mediawiki_page table (field removal)
[13:19:34] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[13:47:59] Data-Engineering-Radar, Cassandra, Generated Data Platform: AQS multi-datacenter cluster expansion - https://phabricator.wikimedia.org/T307641 (Eevans)
[13:52:42] Data-Engineering-Radar, Cassandra, Generated Data Platform: AQS multi-datacenter cluster expansion - https://phabricator.wikimedia.org/T307641 (Eevans)
[14:01:25] Data-Engineering, Data-Catalog, SRE, serviceops, and 2 others: New Service Request: DataHub - https://phabricator.wikimedia.org/T303049 (JMeybohm) >>! In T303049#7898695, @JMeybohm wrote: > I finally managed to verify and document the steps needed to put a service under Ingress. I did also update...
[14:02:19] !log Start browser_general_daily on airflow
[14:02:21] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[14:03:28] Gone for kids team
[14:03:32] See you at standup
[14:06:31] a-team: I'm working on a hotfix for the sqoop job and will be deploying refinery as soon as I can verify
[14:19:43] (CR) Milimetric: "Heh, I had no idea that you can link to a bug without the T prefix, but this is the bug we actually want: https://phabricator.wikimedia.or" [analytics/refinery] - https://gerrit.wikimedia.org/r/802422 (owner: Joal)
[14:25:35] joal: I was chasing your coattails all morning and not realizing it. Thanks for restarting the sqoop, I should've read the IRC backlog
[14:44:38] np milimetric - was away for kids and therefore I've not seen your message :S
[14:45:03] I really think we should define rules for communication - IRC + slack becomes cumbersome
[14:45:09] indeed
[14:45:28] joal: so Amir hasn't responded yet, but basically he's running that page_restrictions drop on shards in production
[14:45:36] hm
[14:45:39] and whenever he runs it on the next shard, it'll break the view (since it's a fullview)
[14:45:46] so I told him to please hold off, but haven't had confirmation
[14:45:53] There are chances sqoop breaks for us in the middle :(
[14:45:57] so the job could fail at any moment because of that, unless we get to him first :)
[14:45:58] Ack milimetric
[14:46:03] makes perfect sense
[14:46:07] Thanks for letting me know
[14:46:16] k, we're synced up then. Anything left for us to deploy then?
[14:46:48] milimetric: I deployed my refinery/airflow stuff earlier today, and I think Sandra deployed with mforns yesterday
[14:46:55] So we're probably good
[14:47:08] Need to get back to kids - Back at about standup time
[14:47:14] sorry!
[14:47:15] later
[14:47:22] no worries ;)
[14:47:32] ok, SandraEbele, let's catch up when you can about deploy
[14:49:52] millimetri
[14:51:20] Milimetric: I’m deploying Wikistats UI with Marcel now.
[14:57:43] oh cool, great
[14:57:50] and the two refinery jobs?
[14:57:59] (remember to update https://etherpad.wikimedia.org/p/analytics-weekly-train)
[15:34:38] (PS1) Mforns: Release 2.9.5 [analytics/wikistats2] - https://gerrit.wikimedia.org/r/802549
[15:49:20] (CR) Mforns: [V: +2 C: +2] "Deploying!" [analytics/wikistats2] - https://gerrit.wikimedia.org/r/802549 (owner: Mforns)
[15:50:16] !log deployed wikistats 2.9.5
[15:50:18] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:59:40] Data-Engineering, Data-Persistence (Consultation): Move Mediawiki QueryPages computation to Hadoop - https://phabricator.wikimedia.org/T309738 (Ladsgroup) Let me know if I can be of any use on the AQS side, I'll do the mediawiki side (if that's fine).
[19:49:46] Data-Engineering-Radar, Cassandra, Generated Data Platform: AQS multi-datacenter cluster expansion - https://phabricator.wikimedia.org/T307641 (Eevans)
[19:52:16] Data-Engineering-Kanban, Patch-For-Review: The effect of sqooping large tables on mediawiki history - https://phabricator.wikimedia.org/T309806 (Milimetric)
[20:28:11] Starting build #13 for job wikimedia-event-utilities-maven-release-docker
[20:28:56] Data-Engineering, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install stat1009 - https://phabricator.wikimedia.org/T299466 (Jclark-ctr) a:Jclark-ctr→Cmjohnson stat1009 B1 U17 cableid 1181 port 5
[20:29:12] Data-Engineering, DC-Ops, SRE, ops-eqiad: Q3:(Need By: TBD) rack/setup/install stat1009 - https://phabricator.wikimedia.org/T299466 (Jclark-ctr)
[20:30:22] Project wikimedia-event-utilities-maven-release-docker build #13: SUCCESS in 2 min 11 sec: https://integration.wikimedia.org/ci/job/wikimedia-event-utilities-maven-release-docker/13/