[06:10:21] Analytics, Analytics-Kanban, Traffic: Review use of realloc in varnishkafka - https://phabricator.wikimedia.org/T287561 (elukey) @odimitrijevic the author of the patch contributed to Varnishkafka before, they seem to know the codebase but the specific realloc patch seems to target a use case that we...
[09:05:36] Analytics, EventStreams: EventStreams doesn't connect to multiple streams - https://phabricator.wikimedia.org/T291505 (SD0001) Open→Invalid Oh, I misunderstood the response format. Looking at an event like ` id: [{"topic":"eqiad.mediawiki.page-delete","partition":0,"offset":-1},{"topic":"codfw.m...
[09:21:08] Analytics, Analytics-Kanban: Check AQS with cassandra (serving + data) - https://phabricator.wikimedia.org/T290068 (BTullis) Commencing the repair of part 2 of 4 of the `local_group_default_T_mediarequest_per_file data` table now. Based on previous experience, this takes around 2 days, 5 hours for each...
[11:36:57] Analytics, Analytics-Kanban: Test snapshot-reload from all instances using pageview-top data table - https://phabricator.wikimedia.org/T291473 (BTullis) I wasn't happy with the number of manual steps required and the possibility for error that brings, so I've written a script to perform the rsync transfe...
[11:43:26] (CR) Joal: [C: +2] "Merge for deploy" [analytics/refinery/source] - https://gerrit.wikimedia.org/r/719290 (https://phabricator.wikimedia.org/T290469) (owner: Joal)
[11:50:28] Analytics, Analytics-Kanban: Test snapshot-reload from all instances using pageview-top data table - https://phabricator.wikimedia.org/T291473 (BTullis)
[11:52:44] (Merged) jenkins-bot: Add num-partitions param to mw-history checkers [analytics/refinery/source] - https://gerrit.wikimedia.org/r/719290 (https://phabricator.wikimedia.org/T290469) (owner: Joal)
[11:54:33] !log release refinery-source v0.1.18 to archiva with Jenkins
[11:54:37] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[11:55:54] Starting build #95 for job analytics-refinery-maven-release-docker
[12:09:00] Project analytics-refinery-maven-release-docker build #95: SUCCESS in 13 min: https://integration.wikimedia.org/ci/job/analytics-refinery-maven-release-docker/95/
[12:15:26] Starting build #53 for job analytics-refinery-update-jars-docker
[12:15:54] (PS1) Maven-release-user: Add refinery-source jars for v0.1.18 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/722848
[12:15:55] Project analytics-refinery-update-jars-docker build #53: SUCCESS in 29 sec: https://integration.wikimedia.org/ci/job/analytics-refinery-update-jars-docker/53/
[13:00:58] (PS1) Joal: Fix update_refinery_source_jar script [analytics/refinery] - https://gerrit.wikimedia.org/r/722861 (https://phabricator.wikimedia.org/T217967)
[13:01:52] (Abandoned) Joal: Add refinery-source jars for v0.1.18 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/722848 (owner: Maven-release-user)
[13:03:21] (CR) Joal: [V: +2 C: +2] "Merging bug to unlock deploy" [analytics/refinery] - https://gerrit.wikimedia.org/r/722861 (https://phabricator.wikimedia.org/T217967) (owner: Joal)
[13:14:05] Analytics, EventStreams: EventStreams doesn't connect to multiple streams - https://phabricator.wikimedia.org/T291505 (Ottomata) Yeah, this is [[ https://html.spec.whatwg.org/multipage/server-sent-events.html#event-stream-interpretation | Server Sent Events / EventSource format ]]. See also https://gith...
[13:19:05] (CR) Ottomata: Fix update_refinery_source_jar script (1 comment) [analytics/refinery] - https://gerrit.wikimedia.org/r/722861 (https://phabricator.wikimedia.org/T217967) (owner: Joal)
[13:19:13] joal: o/ want to talk mw events stuff?
[14:21:23] (CR) Phuedx: [C: +1] "+1 for:" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/714875 (https://phabricator.wikimedia.org/T290074) (owner: Krinkle)
[15:31:41] Analytics, Analytics-Kanban: Test snapshot-reload from all instances using pageview-top data table - https://phabricator.wikimedia.org/T291473 (BTullis) I have written a script to coordinate the reloading of the 8 snapshots. It finds the transferred files into the correct directory structure on each host...
[15:48:42] ottomata: sorry I was with the kids - is now before standup ok for you?
[15:50:07] joal am going to the alluxio talk
[15:50:16] you should come too!
[15:50:42] https://app.hopin.com/events/apachecon-2021-home/sessions/0b7e697b-d4d5-4ac4-8cdb-9fd691a18396
[15:54:04] In there ottomata - thanks for the reminder
[15:57:56] Analytics-Radar, Product-Analytics: Do the messages left for unregistered or logged-out IP editors get read by those editors? - https://phabricator.wikimedia.org/T291297 (nettrom_WMF) Hi @Whatamidoing-WMF ! The Product Analytics team will triage and prioritize this task in our next meeting, on Sept 27.
[16:02:55] Starting build #54 for job analytics-refinery-update-jars-docker
[16:03:23] (PS1) Maven-release-user: Add refinery-source jars for v0.1.18 to artifacts [analytics/refinery] - https://gerrit.wikimedia.org/r/722913
[16:03:24] Project analytics-refinery-update-jars-docker build #54: SUCCESS in 29 sec: https://integration.wikimedia.org/ci/job/analytics-refinery-update-jars-docker/54/
[16:04:25] Analytics, Analytics-Kanban: Repair and reload cassandra2 mediarequest_per_file data table - https://phabricator.wikimedia.org/T291470 (BTullis) I started the repair of part 2 of 4 of the `local_group_default_T_mediarequest_per_file data` table at 09:21 this morning. Based on previous experience, this t...
[16:04:37] (CR) Joal: [V: +2 C: +2] "Merging for deploy" [analytics/refinery] - https://gerrit.wikimedia.org/r/722913 (owner: Maven-release-user)
[16:05:02] Analytics, Analytics-Kanban: Repair and reload cassandra2 mediarequest_per_file data table - https://phabricator.wikimedia.org/T291470 (BTullis) Open→In progress p:Triage→High a:JAllemandou→BTullis
[16:05:04] Analytics, Analytics-Kanban: Check AQS with cassandra (serving + data) - https://phabricator.wikimedia.org/T290068 (BTullis)
[16:07:26] (PS5) Joal: Grow mediawiki-history oozie jobs resources [analytics/refinery] - https://gerrit.wikimedia.org/r/719111 (https://phabricator.wikimedia.org/T290469)
[16:07:31] Analytics, Analytics-Kanban: Snapshot and Reload cassandra2 pageview_per_file data table from all 12 instances - https://phabricator.wikimedia.org/T291472 (BTullis) p:Triage→High a:JAllemandou→BTullis This is blocked by successful testing of the methodology in {T291473}
[16:07:49] (CR) Joal: [V: +2 C: +2] "Merging for deploy" [analytics/refinery] - https://gerrit.wikimedia.org/r/719111 (https://phabricator.wikimedia.org/T290469) (owner: Joal)
[16:47:53] Analytics, Analytics-Kanban, Data-Engineering: Add a presto query logger - https://phabricator.wikimedia.org/T269832 (BTullis) I wonder if this would be useful: https://github.com/zz22394/presto-audit This explains why we only have them in the UI: https://stackoverflow.com/questions/62386932/presto-...
[17:05:32] !log Kill-restart oozie jobs after deploy (mediawiki-history-denormalize-coord, mediawiki-history-check_denormalize-coord, mediawiki-history-dumps-coord, mediawiki-history-reduced-coord)
[17:05:37] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:07:15] Analytics, Event-Platform, Patch-For-Review: Users should run explicit commands to materialize schema versions, rather than using magic git hooks - https://phabricator.wikimedia.org/T290074 (Ottomata) @mpopov @nshahquinn-wmf any thoughts on this?
[17:13:19] Analytics, Analytics-Kanban, Data-Engineering: Add a presto query logger - https://phabricator.wikimedia.org/T269832 (BTullis) We chatted about this and decided that it would be really useful as a first step to try to get the Presto Web UI working. https://prestodb.io/docs/current/admin/web-interface...
[17:23:54] !log razzi@an-test-coord1001:/etc/presto$ sudo systemctl restart presto-server
[17:24:00] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[17:24:10] That was to enable http ui for testing, not a production configuration!!!
[17:34:32] Analytics, Analytics-Kanban, Data-Engineering: Add a presto query logger - https://phabricator.wikimedia.org/T269832 (BTullis) We tested it and were able to get to the web UI over HTTP, using SSH forwarding, after enabling the `http-server.http.enabled` value manually on an-test-coord1001. {F34651072...
[17:36:24] joal: The loading of `local_group_default_T_top_pageviews/data` is proceeding well, now that I've scripted it. I'll check later this evening and let you know whether you can proceed to your testing.
[17:37:58] super btullis - many thanks :)
[18:06:35] Analytics, Wikipedia-iOS-App-Backlog, Product-Analytics (Kanban), User-Johan: Understand impact of Apple's Relay Service - https://phabricator.wikimedia.org/T289795 (JAllemandou) Flagging Analytics as IPs are used to flag readers as automated or not.
[20:07:48] Analytics, Analytics-Kanban: Test snapshot-reload from all instances using pageview-top data table - https://phabricator.wikimedia.org/T291473 (BTullis) a:BTullis→JAllemandou The script finished successfully, so that table is now fully imported. Assigning back to @JAllemandou so that he can conti...
[20:08:04] Analytics, Analytics-Kanban: Test snapshot-reload from all instances using pageview-top data table - https://phabricator.wikimedia.org/T291473 (BTullis)
[20:13:26] Analytics, Analytics-Kanban: Repair and reload cassandra2 mediarequest_per_file data table - https://phabricator.wikimedia.org/T291470 (BTullis) Progress is fairly good. It has completed 32% of the 2nd snapshot in 12 hours, which if linear would mean around 36 hours in total to complete this snapshot. Ma...
[20:15:37] joal: `local_group_default_T_top_pageviews.data` is ready for you to start testing on the new cluster whenever you like.
[20:24:22] this is more of a general SRE question but I remember seeing it from one of you guys first so I figured I'd try here - what was the site to search across all the wikimedia projects at once?
[20:42:42] Analytics, Wikipedia-iOS-App-Backlog, Product-Analytics (Kanban), User-Johan: Understand impact of Apple's Relay Service - https://phabricator.wikimedia.org/T289795 (DLynch) I would suspect that these IPs wouldn't be producing many (any?) automated actions, as inherently any requests from them sh...
[20:56:04] (PS1) Sharvaniharan: Migrate MobileWikiAppDailyStats to MEP [schemas/event/secondary] - https://gerrit.wikimedia.org/r/722964
[20:56:41] (CR) jerkins-bot: [V: -1] Migrate MobileWikiAppDailyStats to MEP [schemas/event/secondary] - https://gerrit.wikimedia.org/r/722964 (owner: Sharvaniharan)
[21:01:41] (PS2) Sharvaniharan: Migrate MobileWikiAppDailyStats to MEP [schemas/event/secondary] - https://gerrit.wikimedia.org/r/722964
[21:02:29] (PS3) Sharvaniharan: Migrate MobileWikiAppDailyStats to MEP Bug: T286000 Change-Id: Ibf65637926bb3be80c64b2c373958a52b034aedb [schemas/event/secondary] - https://gerrit.wikimedia.org/r/722964 (https://phabricator.wikimedia.org/T286000)
[21:05:09] (CR) Sharvaniharan: "Hi @Ottomata Please review this migration and advise any changes." [schemas/event/secondary] - https://gerrit.wikimedia.org/r/722964 (https://phabricator.wikimedia.org/T286000) (owner: Sharvaniharan)
[21:10:45] Analytics, Wikipedia-iOS-App-Backlog, Product-Analytics (Kanban), User-Johan: Understand impact of Apple's Relay Service - https://phabricator.wikimedia.org/T289795 (Urbanecm) >>! In T289795#7373088, @DLynch wrote: > I would suspect that these IPs wouldn't be producing many (any?) automated actio...
[21:14:56] (CR) Ottomata: "Oh, cool! So this is not a legacy schema migration, but a brand new schema with similar fields. Cool." [schemas/event/secondary] - https://gerrit.wikimedia.org/r/722964 (https://phabricator.wikimedia.org/T286000) (owner: Sharvaniharan)
[21:15:17] Analytics, Wikipedia-iOS-App-Backlog, Product-Analytics (Kanban), User-Johan: Understand impact of Apple's Relay Service - https://phabricator.wikimedia.org/T289795 (Base) > Generally an IP block will block all users, registered or not. This depends on block settings, but blocks under No open pr...
[21:15:31] (CR) Ottomata: [C: +1] "I haven't gone deep but LGTM!" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/722964 (https://phabricator.wikimedia.org/T286000) (owner: Sharvaniharan)
[22:12:02] Analytics-Clusters, DC-Ops, Data-Engineering, SRE, ops-eqiad: Q1:(Need By: ASAP) rack/setup/install an-db100[12].eqiad.wmnet - https://phabricator.wikimedia.org/T289632 (Jclark-ctr) an-db1001 A6 U26 cableid1951 port 28 an-db1002 C5 U21 cableid1842 port 13
[22:12:37] Analytics-Clusters, DC-Ops, Data-Engineering, SRE, ops-eqiad: Q1:(Need By: ASAP) rack/setup/install an-db100[12].eqiad.wmnet - https://phabricator.wikimedia.org/T289632 (Jclark-ctr)
[22:12:56] Analytics-Clusters, DC-Ops, Data-Engineering, SRE, ops-eqiad: Q1:(Need By: ASAP) rack/setup/install an-db100[12].eqiad.wmnet - https://phabricator.wikimedia.org/T289632 (Jclark-ctr) a:Jclark-ctr→Cmjohnson
[23:04:33] PROBLEM - Check unit status of hdfs-cleaner-tmp on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit hdfs-cleaner-tmp https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[23:27:43] PROBLEM - Check unit status of hdfs-cleaner-tmp-analytics on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit hdfs-cleaner-tmp-analytics https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[23:37:29] PROBLEM - Check unit status of hdfs-cleaner-tmp-druid on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit hdfs-cleaner-tmp-druid https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers