[00:12:52] Analytics, Analytics-Wikistats, Cloud-Services: Wikistats New Feature - https://phabricator.wikimedia.org/T286359 (Mohd.shafiul)
[01:40:38] PROBLEM - Check unit status of produce_canary_events on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[01:51:34] RECOVERY - Check unit status of produce_canary_events on an-launcher1002 is OK: OK: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[03:21:26] !log Rerun webrequest descendent jobs for 2021-07-08T10:00 problem
[03:21:29] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[04:04:18] PROBLEM - Check unit status of produce_canary_events on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[04:15:12] RECOVERY - Check unit status of produce_canary_events on an-launcher1002 is OK: OK: Status of the systemd unit produce_canary_events https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[04:29:52] PROBLEM - Check unit status of monitor_refine_event_sanitized_analytics_immediate on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit monitor_refine_event_sanitized_analytics_immediate https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[11:09:31] (PS1) Andrew-WMDE: Add aggregations for template data usage in TemplateWizard [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/703838 (https://phabricator.wikimedia.org/T272589)
[11:14:52] (PS2) Andrew-WMDE: Add aggregations for template data usage in VE's template dialog [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/703753 (https://phabricator.wikimedia.org/T272589)
[13:45:12] hello joal, making some coffee, i see a bunch of flappy alerts, will look into those
[13:45:23] i had hoped the druid load would be done now that refine of netflow was good
[13:45:31] i didn't finish migrating events on test, we could do that too
[14:35:48] ok flappy alerts look fine
[14:38:12] Hi ottomata :)
[14:38:16] hello! :)
[14:38:56] joal going to do event on test cluster
[14:39:01] ottomata: ack!
[14:39:11] i haven't done the move yet, but i think i can just move the top level directories around, right?
[14:39:14] mv event event_camus
[14:39:21] mv event_gobblin event
[14:39:32] RECOVERY - Check unit status of monitor_refine_event_sanitized_analytics_immediate on an-launcher1002 is OK: OK: Status of the systemd unit monitor_refine_event_sanitized_analytics_immediate https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[14:39:42] That's what I have done ottomata
[14:40:14] ok
[14:40:38] ottomata: I won't stay long this evening, I have already spent quite some time this morning restarting oozie jobs due to me not being careful enough yesterday with the time problem
[14:40:54] ottomata: I can help now with events if you wish :)
[14:41:17] ok!
[14:41:26] i think i can do it joal
[14:41:28] maybe just look over
[14:41:28] https://gerrit.wikimedia.org/r/c/operations/puppet/+/703786
[14:44:21] (PS1) Ottomata: Finalize event_default_test gobblin migration [analytics/refinery] - https://gerrit.wikimedia.org/r/703852 (https://phabricator.wikimedia.org/T271232)
[14:45:34] (PS2) Ottomata: Finalize event_default_test gobblin migration [analytics/refinery] - https://gerrit.wikimedia.org/r/703852 (https://phabricator.wikimedia.org/T271232)
[14:46:02] (CR) Ottomata: [V: +2 C: +2] Finalize event_default_test gobblin migration [analytics/refinery] - https://gerrit.wikimedia.org/r/703852 (https://phabricator.wikimedia.org/T271232) (owner: Ottomata)
[14:48:28] joal what is
[14:48:29] NOTE: Don't forget to delete the first partially imported hour
[14:48:30] about again?
[14:48:33] maybe doesn't matter for events?
[14:48:47] hm i guess if refine finds it that hour would be missing dat
[14:48:48] a
[14:48:49] hm ok
[14:48:49] ottomata: the first hour imported is partial
[14:48:56] so I deleted it
[14:49:50] cool
[14:49:54] makes sense
[15:03:13] ok ottomata - stopping for today - I'll have an eye later in case, hopefully nothing breaks this weekend
[15:03:23] ok great
[15:03:29] thanks joal
[15:03:36] we can try to finish up with events in prod next week
[15:03:48] That'd be great ottomata :)
[15:03:59] have a good weekend!
[15:04:06] Have a good weekend you too!
[15:12:31] Analytics, Analytics-Kanban, Patch-For-Review: Replace Camus by Gobblin - https://phabricator.wikimedia.org/T271232 (Ottomata) Status report! Test cluster is fully migrated to gobblin, including events. Prod cluster webrequest and netflow are migrated to gobblin. Next week will we do eventlogging_l...
[15:34:26] (CR) WMDE-Fisch: [C: +1] Add aggregations for template data usage in VE's template dialog [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/703753 (https://phabricator.wikimedia.org/T272589) (owner: Andrew-WMDE)
[15:36:54] (CR) WMDE-Fisch: [C: +1] Add aggregations for template data usage in TemplateWizard (1 comment) [analytics/reportupdater-queries] - https://gerrit.wikimedia.org/r/703838 (https://phabricator.wikimedia.org/T272589) (owner: Andrew-WMDE)
[15:53:25] Quarry: Find somewhere else (not NFS) to store Quarry's resultsets - https://phabricator.wikimedia.org/T178520 (dcaro) A SWIFT endpoint would help here? We (WMCS) are planning on adding one soonish on top of the CEPH cluster.
[17:32:58] (PS1) Ottomata: Add event_default gobblin job [analytics/refinery] - https://gerrit.wikimedia.org/r/703866 (https://phabricator.wikimedia.org/T271232)