[00:14:41] Data-Engineering, Data-Engineering-Wikistats: Non-mobile UAs on mobile (2g/gprs, etc) IP-blocks - https://phabricator.wikimedia.org/T58628#9640570 (VirginiaPoundstone)
[00:14:51] Analytics, Data Products: Add cawiki to clickstream dataset - https://phabricator.wikimedia.org/T327982#9640571 (VirginiaPoundstone)
[00:17:41] Data-Engineering, Structured-Data-Backlog: Bump memory to enable large artifacts sync on HDFS - https://phabricator.wikimedia.org/T348958#9640583 (Ottomata) Hm, actually, as far as I can tell, reading from HTTP (and many other sources) uses https://filesystem-spec.readthedocs.io/en/stable/api.html#fsspec...
[09:17:49] (PS3) Santiago Faci: Adding a new contextual attribute: performer.activity_token [schemas/event/secondary] - https://gerrit.wikimedia.org/r/1008541 (https://phabricator.wikimedia.org/T358758)
[09:17:57] (PS1) Santiago Faci: ammend [schemas/event/secondary] - https://gerrit.wikimedia.org/r/1012605
[09:18:05] (PS4) Santiago Faci: Adding a new contextual attribute: performer.active_browsing_session_token [schemas/event/secondary] - https://gerrit.wikimedia.org/r/1008541 (https://phabricator.wikimedia.org/T358758)
[09:19:21] (Abandoned) Santiago Faci: ammend [schemas/event/secondary] - https://gerrit.wikimedia.org/r/1012605 (owner: Santiago Faci)
[10:54:44] Data-Engineering, Data Products, Observability-Logging, Traffic, Patch-For-Review: Move analytics log from Varnish to HAProxy - https://phabricator.wikimedia.org/T351117#9641410 (Fabfur)
[10:55:10] Data-Engineering, Data Products, Observability-Logging, Traffic, Patch-For-Review: Move analytics log from Varnish to HAProxy - https://phabricator.wikimedia.org/T351117#9641413 (Fabfur)
[10:56:17] Data-Engineering, Observability-Logging, Traffic, Patch-For-Review: Install new Benthos instance on cp hosts - https://phabricator.wikimedia.org/T358109#9641411 (Fabfur) Open→In progress p:Triage→Medium
[10:58:53] Data-Engineering, Observability-Logging, Traffic, Patch-For-Review: Benthos: better management for unparsable logs - https://phabricator.wikimedia.org/T359627#9641417 (Fabfur) p:Triage→Low Even without metrics generation, this has been fixed with a small processing on the input side. Lea...
[11:13:04] (PS5) Santiago Faci: Adding a new contextual attribute: performer.active_browsing_session_token [schemas/event/secondary] - https://gerrit.wikimedia.org/r/1008541 (https://phabricator.wikimedia.org/T358758)
[14:22:05] (KafkaReplicationFactorTooLow) firing: ...
[14:22:05] Kafka topic eqiad.mediawiki.job.mediaModerationScanFileJob replication factor is too low on jumbo-eqiad - https://wikitech.wikimedia.org/wiki/Kafka/Administration#Increase_a_topic's_replication_factor - https://grafana.wikimedia.org/d/000000234/kafka-by-topic?var-kafka_cluster=jumbo-eqiad&var-kafka_broker=All&var-topic=eqiad.mediawiki.job.mediaModerationScanFileJob&viewPanel=40 - ...
[14:22:05] https://alerts.wikimedia.org/?q=alertname%3DKafkaReplicationFactorTooLow
[14:27:05] (KafkaReplicationFactorTooLow) resolved: ...
[14:27:05] Kafka topic eqiad.mediawiki.job.mediaModerationScanFileJob replication factor is too low on jumbo-eqiad - https://wikitech.wikimedia.org/wiki/Kafka/Administration#Increase_a_topic's_replication_factor - https://grafana.wikimedia.org/d/000000234/kafka-by-topic?var-kafka_cluster=jumbo-eqiad&var-kafka_broker=All&var-topic=eqiad.mediawiki.job.mediaModerationScanFileJob&viewPanel=40 - ...
[14:27:05] https://alerts.wikimedia.org/?q=alertname%3DKafkaReplicationFactorTooLow
[15:51:38] Data-Engineering, Observability-Logging, Traffic: Add $schema key to Benthos payload - https://phabricator.wikimedia.org/T360450 (Fabfur) NEW
[16:32:40] brouberol: o/ https://phabricator.wikimedia.org/T326419#9639228 <3
[16:33:59] Data-Engineering, Observability-Logging, Traffic, Patch-For-Review: Add $schema key to Benthos payload - https://phabricator.wikimedia.org/T360450#9643007 (gmodena) For context: this is the approach we follow with other producers, e.g. [Java](https://gerrit.wikimedia.org/r/plugins/gitiles/wikimed...
[16:37:03] Data-Engineering, Observability-Logging, Traffic, Event-Platform, Patch-For-Review: Add $schema key to Benthos payload - https://phabricator.wikimedia.org/T360450#9643030 (gmodena)
[17:06:23] Data-Engineering, Observability-Logging, Traffic, Patch-For-Review: Install new Benthos instance on cp hosts - https://phabricator.wikimedia.org/T358109#9643142 (Fabfur)
[17:07:51] Data-Engineering, Observability-Logging, Traffic, Event-Platform, Patch-For-Review: Add $schema key to Benthos payload - https://phabricator.wikimedia.org/T360450#9643140 (Fabfur) Open→Resolved p:Triage→Low
[17:09:50] Data-Engineering, Observability-Logging, Traffic: Better Benthos performances - https://phabricator.wikimedia.org/T360454 (Fabfur) NEW
[17:25:17] (CR) Jenniferwang: [C:+2] "The current schema structure looks good to me. If the identifier would have multiple different sources in the future, I also suggest addi" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/1011281 (https://phabricator.wikimedia.org/T354597) (owner: Kosta Harlan)
[17:25:52] (Merged) jenkins-bot: Add ip_reputation/score schema [schemas/event/secondary] - https://gerrit.wikimedia.org/r/1011281 (https://phabricator.wikimedia.org/T354597) (owner: Kosta Harlan)
[19:26:14] Data-Engineering, ops-eqiad, SRE: Degraded RAID on dumpsdata1007 - https://phabricator.wikimedia.org/T359702#9643764 (Jclark-ctr) Replaced disk 5. noticed 2nd disk failure disk 7. opened another ticket for replacement of disk 7 You have successfully submitted request SR187258816.
[19:34:27] Data-Engineering, ops-eqiad, SRE: Degraded RAID on dumpsdata1007 - https://phabricator.wikimedia.org/T359702#9643787 (Jclark-ctr) Open→Resolved replaced disk 7 with onhand disk will put replacement into extra storage when it arrives
[20:58:53] (HiveServerHeapUsage) firing: Hive Server JVM Heap usage is above 80% on an-coord1003:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-coord1003:10100 - https://alerts.wikimedia.org/?q=alertname%3DHiveServerHeapUsage
[21:56:11] Data-Engineering, Data-Persistence: analytics-mysql centralauth returns a MySQL error - https://phabricator.wikimedia.org/T360482 (Urbanecm_WMF) NEW
[21:56:42] Data-Engineering, Data-Persistence: analytics-mysql centralauth returns a MySQL error - https://phabricator.wikimedia.org/T360482#9644203 (Urbanecm_WMF) p:Triage→High Appears to be an outage within the analytics infra, hence triaging as High. Feel free to re-triage.
[23:03:53] (HiveServerHeapUsage) resolved: Hive Server JVM Heap usage is above 80% on an-coord1003:10100 - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive/Alerts#Hive_Server_Heap_Usage - https://grafana.wikimedia.org/d/000000379/hive?panelId=7&fullscreen&orgId=1&var-instance=an-coord1003:10100 - https://alerts.wikimedia.org/?q=alertname%3DHiveServerHeapUsage
[23:18:30] Data-Engineering, Data-Persistence, Data-Platform-SRE: analytics-mysql centralauth returns a MySQL error - https://phabricator.wikimedia.org/T360482#9644337 (lbowmaker)
[23:26:36] Data-Engineering, Data Products, Observability-Logging, Traffic, Patch-For-Review: Move analytics log from Varnish to HAProxy - https://phabricator.wikimedia.org/T351117#9644365 (Fabfur)
[23:27:02] Data-Engineering, Observability-Logging, Traffic, Patch-For-Review: Benthos: better management for unparsable logs - https://phabricator.wikimedia.org/T359627#9644377 (Fabfur) In progress→Resolved
[23:27:14] Data-Engineering, Data Products, Observability-Logging, Traffic, Patch-For-Review: Move analytics log from Varnish to HAProxy - https://phabricator.wikimedia.org/T351117#9644378 (Fabfur)