[02:05:33] (CR) David Martin: Create schema for tracking WikiLambda run-function API endpoints (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/1019138 (https://phabricator.wikimedia.org/T356228) (owner: David Martin)
[02:09:04] FIRING: GobblinKafkaRecordsExtractedNotEqualRecordsExpected: Gobblin job event_default ingested an unexpected number of records for a Kafka topic partition. ...
[02:09:09] - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Gobblin - https://grafana.wikimedia.org/d/pAQaJwEnk/gobblin?orgId=1&var-gobblin_job_name=event_default&var-kafka_topic=codfw.mediawiki.cirrussearch.page_rerender.v1&viewPanel=4 - https://alerts.wikimedia.org/?q=alertname%3DGobblinKafkaRecordsExtractedNotEqualRecordsExpected
[03:09:04] RESOLVED: GobblinKafkaRecordsExtractedNotEqualRecordsExpected: Gobblin job event_default ingested an unexpected number of records for a Kafka topic partition. ...
[03:09:04] - https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Gobblin - https://grafana.wikimedia.org/d/pAQaJwEnk/gobblin?orgId=1&var-gobblin_job_name=event_default&var-kafka_topic=codfw.mediawiki.cirrussearch.page_rerender.v1&viewPanel=4 - https://alerts.wikimedia.org/?q=alertname%3DGobblinKafkaRecordsExtractedNotEqualRecordsExpected
[08:08:37] Data-Engineering, Data Products, Observability-Logging, Traffic, Patch-For-Review: HAProxy log format doesn't support "invalid" request path - https://phabricator.wikimedia.org/T365117#9815424 (Fabfur) Update: opened [[ https://github.com/haproxy/haproxy/issues/2573 | this issue ]] upstream t...
[08:36:36] Data-Engineering, Observability-Logging, Traffic: Umbrella task for Benthos parsing error - https://phabricator.wikimedia.org/T365441 (Fabfur) NEW
[08:36:57] Data-Engineering, Observability-Logging, Traffic: Umbrella task for Benthos parsing error - https://phabricator.wikimedia.org/T365441#9815580 (Fabfur)
[08:36:58] Data-Engineering, Data Products, Observability-Logging, Traffic, Patch-For-Review: HAProxy log format doesn't support "invalid" request path - https://phabricator.wikimedia.org/T365117#9815579 (Fabfur)
[08:38:44] Data-Engineering, Observability-Logging, Traffic: Umbrella task for Benthos parsing error - https://phabricator.wikimedia.org/T365441#9815588 (Fabfur) A missing Host header in the request result in a 400 from Varnish and a parsing error from varnish: `json { "$schema": "/webrequest/1.0.0", "back...
[09:05:14] joal: I'm a few minutes from starting the reimage of an-launcher1002 - Good to go from your perspective, or is there anything else I should do?
[09:07:06] Good morning btullis - All good for me :)
[09:07:27] btullis: let me know if there's anything I can help with
[09:09:28] joal: Thanks. Will do.
[09:10:18] !log Upgrading an-launcher1002 to bullseye
[09:10:20] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[09:17:36] Data-Engineering, Data Pipelines, Patch-For-Review: Fix generation of _IMPORTED flags by Gobblin - https://phabricator.wikimedia.org/T365223#9815698 (JAllemandou)
[09:27:08] joal: I formatted `/srv` on an-launcher1002 by mistake. I had intended to merge this before starting the reimage, but got the order wrong: https://gerrit.wikimedia.org/r/c/operations/puppet/+/1034437
[09:27:34] I don't think that it should be too big a deal, but I will need to do some deploys at the very least.
[09:28:39] https://www.irccloud.com/pastebin/PyaS4kUC/
[09:29:07] This is what was there previously.
[09:29:56] btullis: I think it should be ok from a deploy perspective: any single deploy normally contains all jars etc
[09:32:14] joal: Ack. So I will need to deploy refinery and hdfs-tools. airflow-dags might pull the latest automatically, but I will check that. Does anything else come to mind?
[09:34:25] btullis: you also need to deploy airflow-dags I think :)
[09:34:43] otherwise the code might not be there for airflow
[09:35:53] joal: Right. When I reimaged the other airflow instances recently, I found that puppet had automatically pulled the latest master branch on first setting up the scap::target resource. Anyway, I will definitely check because I could be wrong.
[09:36:19] btullis: interesting! could be a puppet thing from scap?
[09:36:26] btullis: interesting! could be a puppet thing from scap-initialisation?
[09:36:34] sorry for the double message
[09:36:57] Yes, I think so. I will check with a `git log` before trying a deploy, so we know either way.
[09:37:26] great, thank you :)
[09:41:46] (PS8) Gehel: fix(*DatabaseReader): avoid null pointer exception when reading MaxMind [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1032562 (https://phabricator.wikimedia.org/T365197)
[09:44:09] (PS9) Gehel: fix(*DatabaseReader): avoid null pointer exception when reading MaxMind [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1032562 (https://phabricator.wikimedia.org/T365197)
[09:46:20] Data-Engineering (Q4 2024 April 1st - June 30th), Patch-For-Review: [Refine refactoring] Extract refine schema management into a dedicated tool - https://phabricator.wikimedia.org/T356762#9815820 (Antoine_Quhen)
[09:46:25] (PS10) Gehel: fix(*DatabaseReader): avoid null pointer exception when reading MaxMind [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1032562 (https://phabricator.wikimedia.org/T365197)
[10:02:28] Analytics, Data-Engineering, DBA, Event-Platform: Eventually Consistent MediaWiki State Change Events - https://phabricator.wikimedia.org/T120242#9815911 (akosiaris) >>! In T120242#9813821, @Ottomata wrote: >> Jobs can be retried if failed, maybe we could utilize that as a proxy? > > @akosiaris...
[10:05:03] Analytics, Data-Engineering, DBA, Event-Platform: Eventually Consistent MediaWiki State Change Events - https://phabricator.wikimedia.org/T120242#9815915 (akosiaris) >>! In T120242#9813867, @Ottomata wrote: >> As pointed out above, not even MediaWiki products assume 100% > > @akosiaris, just so...
[10:36:31] (PS1) Mforns: Rename Commons Impact Metrics dump queries [analytics/refinery] - https://gerrit.wikimedia.org/r/1034455 (https://phabricator.wikimedia.org/T364875)
[10:37:29] !log Deploy refinery on an-launcher1002 after reimage
[10:37:30] (CR) Mforns: [V:+2 C:+2] "Self merging to fix deployment" [analytics/refinery] - https://gerrit.wikimedia.org/r/1034455 (https://phabricator.wikimedia.org/T364875) (owner: Mforns)
[10:37:30] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:51:34] (CR) Santiago Faci: Create schema for tracking WikiLambda run-function API endpoints (1 comment) [schemas/event/secondary] - https://gerrit.wikimedia.org/r/1019138 (https://phabricator.wikimedia.org/T356228) (owner: David Martin)
[20:01:32] (PS3) Gehel: style(maxmind): fix checkstyle violations for MaxMind package. [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1034558 (https://phabricator.wikimedia.org/T365197)
[20:21:54] (CR) CDanis: [C:+2] Include subdivision ISO code in the geo response [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1032847 (owner: CDanis)
[20:39:53] (Merged) jenkins-bot: Include subdivision ISO code in the geo response [analytics/refinery/source] - https://gerrit.wikimedia.org/r/1032847 (owner: CDanis)
[23:09:12] Quarry: [bug] Access denied for user 'quarry'@'172.16.2.72' (using password: NO) - https://phabricator.wikimedia.org/T365374#9819342 (Liz) Thanks to whomever fixed this problem.
[23:30:52] Data-Engineering-Kanban, Cassandra, Data-Platform-SRE: Properly add aqsloader user (w/ secrets) - https://phabricator.wikimedia.org/T305600#9819381 (Eevans) p:Medium→High
[23:31:07] Data-Engineering, Cassandra, Data Pipelines: Encrypt Spark-Cassandra connection - https://phabricator.wikimedia.org/T310820#9819382 (Eevans) p:Medium→High
[23:31:13] Data-Engineering-Radar, Cassandra: Make Cassandra client encryption non-optional (AQS cluster) - https://phabricator.wikimedia.org/T309229#9819385 (Eevans) p:Medium→High