[00:08:22] <icinga-wm>	 PROBLEM - Check unit status of eventlogging_to_druid_network_internal_flows-sanitization_daily on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit eventlogging_to_druid_network_internal_flows-sanitization_daily https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[00:08:28] <icinga-wm>	 PROBLEM - Check unit status of eventlogging_to_druid_network_internal_flows_daily on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit eventlogging_to_druid_network_internal_flows_daily https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[06:36:52] <wikibugs>	 (PS10) AGueyte: WIP: Basic ipinfo instrument setup [schemas/event/secondary] - https://gerrit.wikimedia.org/r/753548 (https://phabricator.wikimedia.org/T296415)
[06:37:15] <wikibugs>	 (CR) AGueyte: WIP: Basic ipinfo instrument setup (4 comments) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/753548 (https://phabricator.wikimedia.org/T296415) (owner: AGueyte)
[06:37:36] <wikibugs>	 (CR) jerkins-bot: [V: -1] WIP: Basic ipinfo instrument setup [schemas/event/secondary] - https://gerrit.wikimedia.org/r/753548 (https://phabricator.wikimedia.org/T296415) (owner: AGueyte)
[08:43:11] <moritzm>	 FYI, I'll be rebooting the VMs running Turnilo/Hue/Yarn in the next ~ 15 minutes for a maintenance task of our virtualisation cluster, each individual downtime should be brief (1-2 mins per server)
[08:44:33] <elukey>	 +1 from my side
[08:50:31] <moritzm>	 ack, starting with those now
[09:05:20] <moritzm>	 all done
[09:06:40] <moritzm>	 I'm also rebooting the VM parts of an-test* in a bit: an-test-client1001.eqiad.wmnet an-test-druid1001.eqiad.wmnet an-test-presto1001.eqiad.wmnet an-test-ui1001.eqiad.wmnet
[09:07:22] <elukey>	 +1
[09:31:07] <icinga-wm>	 PROBLEM - Check unit status of eventlogging_to_druid_navigationtiming_hourly on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit eventlogging_to_druid_navigationtiming_hourly https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[09:46:22] <wikibugs>	 (CR) ZPapierski: [C: +1] rdf-streaming-updater: add a "reconcile" operation [schemas/event/secondary] - https://gerrit.wikimedia.org/r/737429 (https://phabricator.wikimedia.org/T279541) (owner: DCausse)
[10:02:08] <moritzm>	 I'm also restarting matomo1002 (piwik.wikimedia.org) in a bit
[10:02:48] <icinga-wm>	 RECOVERY - Check unit status of eventlogging_to_druid_navigationtiming_hourly on an-launcher1002 is OK: OK: Status of the systemd unit eventlogging_to_druid_navigationtiming_hourly https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[10:52:37] <moritzm>	 I'm also restarting archiva1002 (archiva.wikimedia.org) in a bit
[11:04:42] <elukey>	 super
[11:05:02] <elukey>	 matomo1002 may have needed a cleaner shutdown (since it hosts a mysql db) but generally it is ok
[11:05:26] <elukey>	 yeah mariadb is fine on it
[12:53:46] <wikibugs>	 (CR) Phuedx: WIP: Basic ipinfo instrument setup (3 comments) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/753548 (https://phabricator.wikimedia.org/T296415) (owner: AGueyte)
[12:59:48] <wikibugs>	 (CR) Phuedx: WIP: Basic ipinfo instrument setup (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/753548 (https://phabricator.wikimedia.org/T296415) (owner: AGueyte)
[14:14:33] <joal>	 26*0.75
[14:14:36] <joal>	 oops :)
[14:14:46] <joal>	 19.5 :)
[14:30:48] <wikibugs>	 (CR) Phuedx: "Sorry for the multiple sets of comments 😅 I was trying to get the user_groups property working locally and uncovered a flaw in the task th" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/753548 (https://phabricator.wikimedia.org/T296415) (owner: AGueyte)
[15:02:50] <wikibugs>	 Data-Engineering, Generated Data Platform, Platform Engineering, SRE, Patch-For-Review: Import Debian package of Cassandra 3.11.11 as 'dev' version - https://phabricator.wikimedia.org/T298805 (MoritzMuehlenhoff) I added component/cassandradev for buster and stretch. For the import we can eith...
[15:11:02] <wikibugs>	 Analytics-Radar, WMDE-Technical-Wishes-Maintenance, WMDE-Templates-FocusArea, Patch-For-Review, WMDE-TechWish (Sprint-2021-02-03): Add missing normalization to CodeMirror Grafana board - https://phabricator.wikimedia.org/T273748 (thiemowmde)
[15:25:08] <milimetric>	 I keep forgetting that people explicitly ask for stuff that I say "in theory" somebody wants: https://phabricator.wikimedia.org/T221397
[15:25:14] <milimetric>	 (link history in this case)
[16:41:14] <wikibugs>	 Data-Engineering, Generated Data Platform, Platform Engineering, SRE: Import Debian package of Cassandra 3.11.11 as 'dev' version - https://phabricator.wikimedia.org/T298805 (Eevans) >>! In T298805#7622471, @MoritzMuehlenhoff wrote: > I added component/cassandradev for buster and stretch. For the...
[17:32:06] <wikibugs>	 Data-Engineering, Data-Engineering-Kanban, Data-Catalog, Epic: Run Atlas on cloud services cluster - https://phabricator.wikimedia.org/T299166 (Ottomata) Nice
[19:37:56] <addshore>	 ottomata: can I just start sending events, or should I wait for https://gerrit.wikimedia.org/r/c/schemas/event/secondary/+/745914/ to be merged first?
[20:26:16] <wikibugs>	 (PS11) AGueyte: WIP: Basic ipinfo instrument setup [schemas/event/secondary] - https://gerrit.wikimedia.org/r/753548 (https://phabricator.wikimedia.org/T296415)
[20:27:02] <wikibugs>	 (CR) jerkins-bot: [V: -1] WIP: Basic ipinfo instrument setup [schemas/event/secondary] - https://gerrit.wikimedia.org/r/753548 (https://phabricator.wikimedia.org/T296415) (owner: AGueyte)
[20:44:04] <wikibugs>	 (PS12) AGueyte: WIP: Basic ipinfo instrument setup [schemas/event/secondary] - https://gerrit.wikimedia.org/r/753548 (https://phabricator.wikimedia.org/T296415)
[20:44:24] <wikibugs>	 (CR) AGueyte: WIP: Basic ipinfo instrument setup (5 comments) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/753548 (https://phabricator.wikimedia.org/T296415) (owner: AGueyte)
[20:44:37] <wikibugs>	 (CR) jerkins-bot: [V: -1] WIP: Basic ipinfo instrument setup [schemas/event/secondary] - https://gerrit.wikimedia.org/r/753548 (https://phabricator.wikimedia.org/T296415) (owner: AGueyte)
[23:24:01] <icinga-wm>	 PROBLEM - Hadoop NodeManager on an-worker1138 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts%23Yarn_Nodemanager_process
[23:43:17] <icinga-wm>	 RECOVERY - Hadoop NodeManager on an-worker1138 is OK: PROCS OK: 1 process with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts%23Yarn_Nodemanager_process