Wikimedia IRC logs browser - #wikimedia-operations

Displaying 1211 items:

2025-02-19 00:10:35 <jinxer-wm> FIRING: Wikidata Reliability Metrics - Median loading time alert: <no value> - https://alerts.wikimedia.org/?q=alertname%3DWikidata+Reliability+Metrics+-+Median+loading+time+alert
2025-02-19 00:10:57 <icinga-wm> PROBLEM - MariaDB Replica Lag: s1 on db2141 is CRITICAL: CRITICAL slave_sql_lag Replication lag: 649.23 seconds https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Depooling_a_replica
2025-02-19 00:24:40 <wikibugs> 'ops-eqiad, ''SRE, ''DC-Ops: PDU sensor over limit - https://phabricator.wikimedia.org/T383383#10562012 (''phaultfinder)'
2025-02-19 00:27:04 <wikibugs> ('CR) ''Scott French: "Thanks, Riccardo!" [software/spicerack] - ''https://gerrit.wikimedia.org/r/1120648 (https://phabricator.wikimedia.org/T383324) (owner: ''Scott French)'
2025-02-19 00:29:32 <jinxer-wm> FIRING: [4x] SystemdUnitFailed: etcd-backup.service on aux-k8s-etcd2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
2025-02-19 00:30:35 <jinxer-wm> RESOLVED: Wikidata Reliability Metrics - Median loading time alert: <no value> - https://alerts.wikimedia.org/?q=alertname%3DWikidata+Reliability+Metrics+-+Median+loading+time+alert
2025-02-19 00:33:40 <jinxer-wm> RESOLVED: KubernetesRsyslogDown: rsyslog on wikikube-worker1149:9105 is missing kubernetes logs - https://wikitech.wikimedia.org/wiki/Kubernetes/Logging#Common_issues - https://grafana.wikimedia.org/d/OagQjQmnk?var-server=wikikube-worker1149 - https://alerts.wikimedia.org/?q=alertname%3DKubernetesRsyslogDown
2025-02-19 00:38:34 <wikibugs> ('PS1) ''TrainBranchBot: Branch commit for wmf/branch_cut_pretest [core] (wmf/branch_cut_pretest) - ''https://gerrit.wikimedia.org/r/1120682'
2025-02-19 00:38:34 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] Branch commit for wmf/branch_cut_pretest [core] (wmf/branch_cut_pretest) - ''https://gerrit.wikimedia.org/r/1120682 (owner: ''TrainBranchBot)'
2025-02-19 00:39:10 <wikibugs> ('PS1) ''Andrew Bogott: vendordata.txt: include rudimentary clouds.yaml in initial VM [puppet] - ''https://gerrit.wikimedia.org/r/1120683 (https://phabricator.wikimedia.org/T379030)'
2025-02-19 00:39:11 <wikibugs> ('PS1) ''Andrew Bogott: nova vendordata: set fqdn from project_name rather than project_id [puppet] - ''https://gerrit.wikimedia.org/r/1120684 (https://phabricator.wikimedia.org/T379030)'
2025-02-19 00:39:25 <wikibugs> ('CR) ''Andrew Bogott: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1120684 (https://phabricator.wikimedia.org/T379030) (owner: ''Andrew Bogott)'
2025-02-19 00:45:04 <wikibugs> ('PS1) ''Jdlrobson: Remove init event from Search AB test and also remove ABTestEnrollment.js. [extensions/WikimediaEvents] (wmf/1.44.0-wmf.16) - ''https://gerrit.wikimedia.org/r/1120685 (https://phabricator.wikimedia.org/T386243)'
2025-02-19 00:45:28 <wikibugs> ('PS2) ''Jdlrobson: Remove init event from Search AB test and also remove ABTestEnrollment.js. [extensions/WikimediaEvents] (wmf/1.44.0-wmf.16) - ''https://gerrit.wikimedia.org/r/1120685 (https://phabricator.wikimedia.org/T386734)'
2025-02-19 00:46:02 <wikibugs> ('CR) ''Brennen Bearnes: [C:''+1] "Yep, definitely in favor at this point, filters out expected noise." [puppet] - ''https://gerrit.wikimedia.org/r/1056221 (https://phabricator.wikimedia.org/T371633) (owner: ''Ahmon Dancy)'
2025-02-19 00:49:04 <wikibugs> ('Merged) ''jenkins-bot: Branch commit for wmf/branch_cut_pretest [core] (wmf/branch_cut_pretest) - ''https://gerrit.wikimedia.org/r/1120682 (owner: ''TrainBranchBot)'
2025-02-19 00:59:37 <wikibugs> 'ops-eqiad, ''SRE, ''DC-Ops: PDU sensor over limit - https://phabricator.wikimedia.org/T383383#10562048 (''phaultfinder)'
2025-02-19 01:08:39 <wikibugs> ('PS1) ''TrainBranchBot: Branch commit for wmf/next [core] (wmf/next) - ''https://gerrit.wikimedia.org/r/1120686'
2025-02-19 01:08:39 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] Branch commit for wmf/next [core] (wmf/next) - ''https://gerrit.wikimedia.org/r/1120686 (owner: ''TrainBranchBot)'
2025-02-19 01:19:44 <wikibugs> 'ops-eqiad, ''SRE, ''DC-Ops: PDU sensor over limit - https://phabricator.wikimedia.org/T383383#10562062 (''phaultfinder)'
2025-02-19 01:23:44 <wikibugs> ('PS2) ''Andrew Bogott: nova vendordata: set fqdn from project_name rather than project_id [puppet] - ''https://gerrit.wikimedia.org/r/1120684 (https://phabricator.wikimedia.org/T379030)'
2025-02-19 01:23:55 <wikibugs> ('CR) ''Andrew Bogott: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1120684 (https://phabricator.wikimedia.org/T379030) (owner: ''Andrew Bogott)'
2025-02-19 01:29:56 <wikibugs> ('PS1) ''Zabe: Prepare satwiktionary [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120688 (https://phabricator.wikimedia.org/T386619)'
2025-02-19 01:30:42 <wikibugs> ('Merged) ''jenkins-bot: Branch commit for wmf/next [core] (wmf/next) - ''https://gerrit.wikimedia.org/r/1120686 (owner: ''TrainBranchBot)'
2025-02-19 01:37:16 <wikibugs> ('CR) ''Zabe: [C:''+2] Prepare satwiktionary [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120688 (https://phabricator.wikimedia.org/T386619) (owner: ''Zabe)'
2025-02-19 01:37:59 <wikibugs> ('Merged) ''jenkins-bot: Prepare satwiktionary [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120688 (https://phabricator.wikimedia.org/T386619) (owner: ''Zabe)'
2025-02-19 01:41:27 <logmsgbot> !log zabe@deploy2002 Started scap sync-world: Backport for [[gerrit:1120688|Prepare satwiktionary (T386619)]]
2025-02-19 01:41:30 <stashbot> T386619: Create Wiktionary Santali - https://phabricator.wikimedia.org/T386619
2025-02-19 01:44:00 <wikibugs> ('PS3) ''Andrew Bogott: nova vendordata: set fqdn from project_name rather than project_id [puppet] - ''https://gerrit.wikimedia.org/r/1120684 (https://phabricator.wikimedia.org/T379030)'
2025-02-19 01:44:23 <wikibugs> ('CR) ''Andrew Bogott: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1120684 (https://phabricator.wikimedia.org/T379030) (owner: ''Andrew Bogott)'
2025-02-19 01:44:27 <logmsgbot> !log zabe@deploy2002 zabe: Backport for [[gerrit:1120688|Prepare satwiktionary (T386619)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
2025-02-19 01:44:35 <logmsgbot> !log zabe@deploy2002 zabe: Continuing with sync
2025-02-19 01:51:12 <logmsgbot> !log zabe@deploy2002 Finished scap sync-world: Backport for [[gerrit:1120688|Prepare satwiktionary (T386619)]] (duration: 09m 45s)
2025-02-19 01:51:17 <stashbot> T386619: Create Wiktionary Santali - https://phabricator.wikimedia.org/T386619
2025-02-19 01:52:18 <wikibugs> ('PS1) ''Zabe: Increase revision-slots cache expiry back to default for 3 wikis [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120689 (https://phabricator.wikimedia.org/T183490)'
2025-02-19 01:54:31 <wikibugs> ('CR) ''Zabe: [C:''+2] Increase revision-slots cache expiry back to default for 3 wikis [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120689 (https://phabricator.wikimedia.org/T183490) (owner: ''Zabe)'
2025-02-19 01:55:11 <wikibugs> ('Merged) ''jenkins-bot: Increase revision-slots cache expiry back to default for 3 wikis [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120689 (https://phabricator.wikimedia.org/T183490) (owner: ''Zabe)'
2025-02-19 01:55:16 <wikibugs> ('PS1) ''Zabe: Activate satwiktionary [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120690 (https://phabricator.wikimedia.org/T386619)'
2025-02-19 01:55:33 <wikibugs> ('CR) ''Zabe: [C:''+2] Activate satwiktionary [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120690 (https://phabricator.wikimedia.org/T386619) (owner: ''Zabe)'
2025-02-19 01:56:16 <wikibugs> ('Merged) ''jenkins-bot: Activate satwiktionary [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120690 (https://phabricator.wikimedia.org/T386619) (owner: ''Zabe)'
2025-02-19 01:56:56 <logmsgbot> !log zabe@deploy2002 Started scap sync-world: Backport for [[gerrit:1120690|Activate satwiktionary (T386619)]], [[gerrit:1120689|Increase revision-slots cache expiry back to default for 3 wikis (T183490)]]
2025-02-19 01:57:02 <stashbot> T386619: Create Wiktionary Santali - https://phabricator.wikimedia.org/T386619
2025-02-19 01:57:02 <stashbot> T183490: MCR schema migration stage 4: Migrate External Store URLs (wmf production) - https://phabricator.wikimedia.org/T183490
2025-02-19 01:59:56 <logmsgbot> !log zabe@deploy2002 zabe: Backport for [[gerrit:1120690|Activate satwiktionary (T386619)]], [[gerrit:1120689|Increase revision-slots cache expiry back to default for 3 wikis (T183490)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
2025-02-19 02:00:56 <wikibugs> ('PS4) ''Andrew Bogott: nova vendordata: set fqdn from project_name rather than project_id [puppet] - ''https://gerrit.wikimedia.org/r/1120684 (https://phabricator.wikimedia.org/T379030)'
2025-02-19 02:01:06 <wikibugs> ('CR) ''Andrew Bogott: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1120684 (https://phabricator.wikimedia.org/T379030) (owner: ''Andrew Bogott)'
2025-02-19 02:01:15 <logmsgbot> !log zabe@deploy2002 zabe: Continuing with sync
2025-02-19 02:05:07 <wikibugs> ('CR) ''Scott French: dbctl: pass DbCtlConfiguration to DbConfig (''1 comment) [software/spicerack] - ''https://gerrit.wikimedia.org/r/1120648 (https://phabricator.wikimedia.org/T383324) (owner: ''Scott French)'
2025-02-19 02:07:52 <logmsgbot> !log zabe@deploy2002 Finished scap sync-world: Backport for [[gerrit:1120690|Activate satwiktionary (T386619)]], [[gerrit:1120689|Increase revision-slots cache expiry back to default for 3 wikis (T183490)]] (duration: 10m 55s)
2025-02-19 02:07:58 <stashbot> T386619: Create Wiktionary Santali - https://phabricator.wikimedia.org/T386619
2025-02-19 02:07:58 <stashbot> T183490: MCR schema migration stage 4: Migrate External Store URLs (wmf production) - https://phabricator.wikimedia.org/T183490
2025-02-19 02:08:57 <icinga-wm> RECOVERY - MariaDB Replica Lag: s1 on db2141 is OK: OK slave_sql_lag Replication lag: 0.23 seconds https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Depooling_a_replica
2025-02-19 02:10:48 <wikibugs> ('PS1) ''Zabe: Update interwiki cache [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120691 (https://phabricator.wikimedia.org/T386619)'
2025-02-19 02:10:49 <wikibugs> ('CR) ''Zabe: [C:''+2] Update interwiki cache [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120691 (https://phabricator.wikimedia.org/T386619) (owner: ''Zabe)'
2025-02-19 02:11:32 <wikibugs> ('Merged) ''jenkins-bot: Update interwiki cache [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120691 (https://phabricator.wikimedia.org/T386619) (owner: ''Zabe)'
2025-02-19 02:11:50 <logmsgbot> !log zabe@deploy2002 Started scap sync-world: T386619
2025-02-19 02:15:22 <wikibugs> ('PS5) ''Andrew Bogott: nova vendordata: set fqdn from project_name rather than project_id [puppet] - ''https://gerrit.wikimedia.org/r/1120684 (https://phabricator.wikimedia.org/T379030)'
2025-02-19 02:15:30 <wikibugs> ('CR) ''Andrew Bogott: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1120684 (https://phabricator.wikimedia.org/T379030) (owner: ''Andrew Bogott)'
2025-02-19 02:16:31 <icinga-wm> RECOVERY - Host ms-be2075 is UP: PING OK - Packet loss = 0%, RTA = 1.60 ms
2025-02-19 02:21:35 <logmsgbot> !log zabe@deploy2002 Finished scap sync-world: T386619 (duration: 09m 44s)
2025-02-19 02:21:39 <stashbot> T386619: Create Wiktionary Santali - https://phabricator.wikimedia.org/T386619
2025-02-19 02:22:55 <icinga-wm> PROBLEM - Host ms-be2075 is DOWN: PING CRITICAL - Packet loss = 100%
2025-02-19 02:27:10 <wikibugs> ('PS6) ''Andrew Bogott: nova vendordata: set fqdn from project_name rather than project_id [puppet] - ''https://gerrit.wikimedia.org/r/1120684 (https://phabricator.wikimedia.org/T379030)'
2025-02-19 02:36:42 <jinxer-wm> FIRING: JobUnavailable: Reduced availability for job sidekiq in ops@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
2025-02-19 02:48:56 <jinxer-wm> FIRING: [2x] ProbeDown: Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
2025-02-19 02:53:56 <jinxer-wm> RESOLVED: [2x] ProbeDown: Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
2025-02-19 03:01:42 <jinxer-wm> RESOLVED: JobUnavailable: Reduced availability for job sidekiq in ops@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
2025-02-19 03:36:14 <wikibugs> ('CR) ''Subramanya Sastry: [C:''+1] Turn on Parsoid Read Views for 31 wiktionaries [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120679 (https://phabricator.wikimedia.org/T386762) (owner: ''Arlolra)'
2025-02-19 03:36:50 <wikibugs> ('PS1) ''RLazarus: deployment_server: Refactor some utility functions into a Job class [puppet] - ''https://gerrit.wikimedia.org/r/1120700'
2025-02-19 04:32:21 <jinxer-wm> FIRING: [4x] SystemdUnitFailed: etcd-backup.service on aux-k8s-etcd2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
2025-02-19 05:12:25 <wikibugs> ('CR) ''Jgiannelos: [C:''+1] Bust cache for recreated pages [deployment-charts] - ''https://gerrit.wikimedia.org/r/1118890 (https://phabricator.wikimedia.org/T386244) (owner: ''Arlolra)'
2025-02-19 05:32:37 <icinga-wm> PROBLEM - Hadoop NodeManager on an-worker1158 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts%23Yarn_Nodemanager_process
2025-02-19 05:42:59 <wikibugs> ('PS1) ''KartikMistry: Update cxserver to 2025-02-14-191041-production [deployment-charts] - ''https://gerrit.wikimedia.org/r/1120709 (https://phabricator.wikimedia.org/T386464)'
2025-02-19 05:44:37 <icinga-wm> RECOVERY - Hadoop NodeManager on an-worker1158 is OK: PROCS OK: 1 process with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts%23Yarn_Nodemanager_process
2025-02-19 06:02:52 <kart_> Deploying MinT. Staging first.
2025-02-19 06:06:37 <logmsgbot> !log kartik@deploy2002 helmfile [staging] START helmfile.d/services/machinetranslation: apply
2025-02-19 06:12:23 <logmsgbot> !log kartik@deploy2002 helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
2025-02-19 06:27:21 <jinxer-wm> FIRING: [4x] SystemdUnitFailed: etcd-backup.service on aux-k8s-etcd2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
2025-02-19 06:28:40 <logmsgbot> !log kartik@deploy2002 helmfile [codfw] START helmfile.d/services/machinetranslation: apply
2025-02-19 06:37:34 <wikibugs> ('CR) ''Aklapper: "Hmm, is the timer still in place? Wondering as I still receive this email..." [puppet] - ''https://gerrit.wikimedia.org/r/1117489 (https://phabricator.wikimedia.org/T304792) (owner: ''Aklapper)'
2025-02-19 06:38:09 <logmsgbot> !log kartik@deploy2002 helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
2025-02-19 06:46:48 <wikibugs> 'SRE, ''Wikimedia-Incident: 503 Service Unavailable on all production - https://phabricator.wikimedia.org/T386740#10562253 (''Iniquity) >>! In T386740#10561252, @ssingh wrote: >>>! In T386740#10561043, @Iniquity wrote: >> I want to know for the future, this is not the first time I have reported about "Service...'
2025-02-19 06:50:05 <logmsgbot> !log kartik@deploy2002 helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
2025-02-19 06:52:22 <logmsgbot> !log jmm@cumin2002 START - Cookbook sre.hosts.reimage for host ganeti1023.eqiad.wmnet with OS bookworm
2025-02-19 06:52:30 <wikibugs> 'SRE, ''Ganeti, ''Infrastructure-Foundations: Update remaining Ganeti servers in eqiad to Bookworm - https://phabricator.wikimedia.org/T382507#10562254 (''ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jmm@cumin2002 for host ganeti1023.eqiad.wmnet with OS bookworm'
2025-02-19 07:00:05 <jouncebot> Deploy window MediaWiki infrastructure (UTC early) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T0700)
2025-02-19 07:05:58 <logmsgbot> !log kartik@deploy2002 helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
2025-02-19 07:12:09 <wikibugs> 'SRE, ''Ganeti, ''Infrastructure-Foundations: Update remaining Ganeti servers in eqiad to Bookworm - https://phabricator.wikimedia.org/T382507#10562275 (''MoritzMuehlenhoff)'
2025-02-19 07:12:40 <logmsgbot> !log jmm@cumin2002 START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1023.eqiad.wmnet with reason: host reimage
2025-02-19 07:16:30 <logmsgbot> !log jmm@cumin2002 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1023.eqiad.wmnet with reason: host reimage
2025-02-19 07:17:30 <kart_> !log Updated MinT to 2025-02-05-115716-production (T383750, T385552)
2025-02-19 07:17:34 <stashbot> Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
2025-02-19 07:17:34 <stashbot> T383750: MinT: Fails to download models/files from peopleweb.discovery.wmnet - https://phabricator.wikimedia.org/T383750
2025-02-19 07:17:34 <stashbot> T385552: MinT: Add support for Obolo, Central Dusun, Iban and, South Ndebele - https://phabricator.wikimedia.org/T385552
2025-02-19 07:20:33 <wikibugs> ('PS2) ''Michael Große: testwiki: enable surfacing structured task experiment [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120646 (https://phabricator.wikimedia.org/T386739)'
2025-02-19 07:24:03 <vgutierrez> !log upload haproxy 2.8.14 to apt.wm.o (bullseye-wikimedia) - T386751
2025-02-19 07:24:06 <stashbot> Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
2025-02-19 07:24:07 <stashbot> T386751: update haproxy to version 2.8.14 - https://phabricator.wikimedia.org/T386751
2025-02-19 07:29:44 <logmsgbot> !log vgutierrez@cumin1002 START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P{cp4052.ulsfo.wmnet,cp4044.ulsfo.wmnet} and A:cp
2025-02-19 07:34:24 <logmsgbot> !log vgutierrez@cumin1002 END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on P{cp4052.ulsfo.wmnet,cp4044.ulsfo.wmnet} and A:cp
2025-02-19 07:37:51 <logmsgbot> !log jmm@cumin2002 END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1023.eqiad.wmnet with OS bookworm
2025-02-19 07:38:01 <wikibugs> 'SRE, ''Ganeti, ''Infrastructure-Foundations: Update remaining Ganeti servers in eqiad to Bookworm - https://phabricator.wikimedia.org/T382507#10562320 (''ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jmm@cumin2002 for host ganeti1023.eqiad.wmnet with OS bookworm completed: - ganeti102...'
2025-02-19 07:39:09 <logmsgbot> !log jmm@cumin2002 START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
2025-02-19 07:39:27 <wikibugs> ('PS2) ''Anzx: satwiktionary: add sitename, timezone, projectnamespace [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120701 (https://phabricator.wikimedia.org/T386631)'
2025-02-19 07:39:45 <wikibugs> ('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Wednesday, February 19 UTC morning backport window](https://wikitech.wikimedia.org/wiki/Deployments#deplo" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120701 (https://phabricator.wikimedia.org/T386631) (owner: ''Anzx)'
2025-02-19 07:39:53 <wikibugs> ('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Wednesday, February 19 UTC morning backport window](https://wikitech.wikimedia.org/wiki/Deployments#deplo" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120646 (https://phabricator.wikimedia.org/T386739) (owner: ''Michael Große)'
2025-02-19 07:40:11 <wikibugs> ('PS2) ''Anzx: uzwikiquote: add logos [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120699 (https://phabricator.wikimedia.org/T386569)'
2025-02-19 07:40:34 <wikibugs> ('CR) ''Urbanecm: [C:''+1] "LGTM" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120646 (https://phabricator.wikimedia.org/T386739) (owner: ''Michael Große)'
2025-02-19 07:40:38 <wikibugs> ('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Wednesday, February 19 UTC morning backport window](https://wikitech.wikimedia.org/wiki/Deployments#deplo" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120699 (https://phabricator.wikimedia.org/T386569) (owner: ''Anzx)'
2025-02-19 07:47:32 <logmsgbot> !log jmm@cumin2002 END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet
2025-02-19 07:49:42 <moritzm> !log installing openjdk-11 security updates
2025-02-19 07:49:43 <stashbot> Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
2025-02-19 07:51:01 <logmsgbot> !log jmm@cumin2002 START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet
2025-02-19 07:53:22 <logmsgbot> !log vgutierrez@cumin1002 START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo and not P{cp4052.*} and A:cp
2025-02-19 07:53:48 <wikibugs> ('PS2) ''Anzx: madwiki: add namespace aliases [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120888 (https://phabricator.wikimedia.org/T382087)'
2025-02-19 07:53:59 <wikibugs> ('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Wednesday, February 19 UTC morning backport window](https://wikitech.wikimedia.org/wiki/Deployments#deplo" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120888 (https://phabricator.wikimedia.org/T382087) (owner: ''Anzx)'
2025-02-19 07:54:49 <logmsgbot> !log vgutierrez@cumin1002 START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo and not P{cp4044.*} and A:cp
2025-02-19 07:55:54 <wikibugs> ('PS1) ''Arnaudb: ferm: remove moscovium from allowlist [puppet] - ''https://gerrit.wikimedia.org/r/1120889 (https://phabricator.wikimedia.org/T385777)'
2025-02-19 07:59:14 <logmsgbot> !log jmm@cumin2002 END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet
2025-02-19 08:00:05 <jouncebot> Amir1, Urbanecm, and awight: #bothumor I ♥ Unicode. All rise for UTC morning backport window deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T0800).
2025-02-19 08:00:05 <jouncebot> anzx and MichaelG_WMF: A patch you scheduled for UTC morning backport window is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker.
2025-02-19 08:00:17 <anzx> o/
2025-02-19 08:00:20 <MichaelG_WMF> o/
2025-02-19 08:03:06 <logmsgbot> !log jmm@cumin2002 START - Cookbook sre.ganeti.addnode for new host ganeti1023.eqiad.wmnet to cluster eqiad and group A
2025-02-19 08:03:44 <wikibugs> ('PS1) ''Elukey: role::kubernetes::worker: add kartotherian-k8s-ssl to the lvs pools [puppet] - ''https://gerrit.wikimedia.org/r/1120893 (https://phabricator.wikimedia.org/T386648)'
2025-02-19 08:04:21 <logmsgbot> !log jmm@cumin2002 END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1023.eqiad.wmnet to cluster eqiad and group A
2025-02-19 08:04:54 <logmsgbot> !log jmm@cumin2002 START - Cookbook sre.ganeti.addnode for new host ganeti1033.eqiad.wmnet to cluster eqiad and group D
2025-02-19 08:07:01 <wikibugs> ('CR) ''Elukey: [V:''+1] "PCC SUCCESS (CORE_DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/label=puppet7-compiler-node/4953/co" [puppet] - ''https://gerrit.wikimedia.org/r/1120893 (https://phabricator.wikimedia.org/T386648) (owner: ''Elukey)'
2025-02-19 08:07:11 <icinga-wm> PROBLEM - Router interfaces on cr2-eqord is CRITICAL: CRITICAL: host 208.80.154.198, interfaces up: 44, down: 2, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
2025-02-19 08:07:39 <icinga-wm> PROBLEM - Router interfaces on cr2-codfw is CRITICAL: CRITICAL: host 208.80.153.193, interfaces up: 112, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
2025-02-19 08:07:41 <icinga-wm> PROBLEM - Router interfaces on cr3-ulsfo is CRITICAL: CRITICAL: host 198.35.26.192, interfaces up: 69, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
2025-02-19 08:08:58 <logmsgbot> !log jmm@cumin2002 END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1033.eqiad.wmnet to cluster eqiad and group D
2025-02-19 08:11:21 <logmsgbot> !log vgutierrez@cumin1002 END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo and not P{cp4052.*} and A:cp
2025-02-19 08:15:27 <dcausse> jouncebot: nowandnext
2025-02-19 08:15:27 <jouncebot> For the next 0 hour(s) and 44 minute(s): UTC morning backport window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T0800)
2025-02-19 08:15:27 <jouncebot> In 0 hour(s) and 44 minute(s): MediaWiki train - Utc-7+Utc-0 Version (secondary timeslot) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T0900)
2025-02-19 08:15:58 <logmsgbot> !log vgutierrez@cumin1002 END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo and not P{cp4044.*} and A:cp
2025-02-19 08:18:20 <wikibugs> ('PS1) ''DCausse: Do not update the search index if the assessment did not change [extensions/PageAssessments] (wmf/1.44.0-wmf.16) - ''https://gerrit.wikimedia.org/r/1120895'
2025-02-19 08:18:32 <wikibugs> ('PS1) ''DCausse: Do not update the search index if the assessment did not change [extensions/PageAssessments] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1120896'
2025-02-19 08:19:10 <logmsgbot> !log vgutierrez@cumin1002 START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru and A:cp
2025-02-19 08:19:27 <logmsgbot> !log vgutierrez@cumin1002 START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru and A:cp
2025-02-19 08:22:34 <dcausse> anzx, MichaelG_WMF do you have a deployer?
2025-02-19 08:22:56 <MichaelG_WMF> no, not yet.
2025-02-19 08:23:47 <MichaelG_WMF> If you could do it, that would be great. Though I could also wait to the next window if necessary (but that is usually quite full)
2025-02-19 08:23:58 <dcausse> I think I can deploy
2025-02-19 08:24:08 <dcausse> anzx: are you still around?
2025-02-19 08:24:13 <MichaelG_WMF> YaY, thank you dcausse!
2025-02-19 08:24:29 <anzx> dcausse: yes I am around
2025-02-19 08:25:06 <dcausse> anzx: can I ship all your 3 config changes at once or would you prefer to test them individually?
2025-02-19 08:25:10 <wikibugs> ('Abandoned) ''Brouberol: opensearch: include the minor version in the apt component name [puppet] - ''https://gerrit.wikimedia.org/r/1120140 (https://phabricator.wikimedia.org/T380752) (owner: ''Brouberol)'
2025-02-19 08:25:19 <wikibugs> 'SRE, ''Ganeti, ''Infrastructure-Foundations: Update remaining Ganeti servers in eqiad to Bookworm - https://phabricator.wikimedia.org/T382507#10562405 (''MoritzMuehlenhoff)'
2025-02-19 08:25:49 <anzx> dcausse: ship all at once
2025-02-19 08:25:55 <dcausse> ack
2025-02-19 08:27:16 <wikibugs> ('CR) ''DCausse: [C:''+1] satwiktionary: add sitename, timezone, projectnamespace [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120701 (https://phabricator.wikimedia.org/T386631) (owner: ''Anzx)'
2025-02-19 08:28:56 <wikibugs> ('CR) ''DCausse: [C:''+1] uzwikiquote: add logos [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120699 (https://phabricator.wikimedia.org/T386569) (owner: ''Anzx)'
2025-02-19 08:29:17 <wikibugs> ('PS1) ''Elukey: role::etcd::v3::aux_k8s_etcd: remove backups [puppet] - ''https://gerrit.wikimedia.org/r/1120899 (https://phabricator.wikimedia.org/T385727)'
2025-02-19 08:30:03 <wikibugs> ('CR) ''DCausse: [C:''+1] madwiki: add namespace aliases [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120888 (https://phabricator.wikimedia.org/T382087) (owner: ''Anzx)'
2025-02-19 08:30:34 <wikibugs> ('CR) ''Elukey: [V:''+1] "PCC SUCCESS (CORE_DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/label=puppet7-compiler-node/4954/co" [puppet] - ''https://gerrit.wikimedia.org/r/1120899 (https://phabricator.wikimedia.org/T385727) (owner: ''Elukey)'
2025-02-19 08:32:19 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] "Approved by dcausse@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120701 (https://phabricator.wikimedia.org/T386631) (owner: ''Anzx)'
2025-02-19 08:32:20 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] "Approved by dcausse@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120888 (https://phabricator.wikimedia.org/T382087) (owner: ''Anzx)'
2025-02-19 08:32:20 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] "Approved by dcausse@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120699 (https://phabricator.wikimedia.org/T386569) (owner: ''Anzx)'
2025-02-19 08:33:05 <wikibugs> ('Merged) ''jenkins-bot: satwiktionary: add sitename, timezone, projectnamespace [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120701 (https://phabricator.wikimedia.org/T386631) (owner: ''Anzx)'
2025-02-19 08:33:08 <wikibugs> ('Merged) ''jenkins-bot: madwiki: add namespace aliases [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120888 (https://phabricator.wikimedia.org/T382087) (owner: ''Anzx)'
2025-02-19 08:33:10 <wikibugs> ('Merged) ''jenkins-bot: uzwikiquote: add logos [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120699 (https://phabricator.wikimedia.org/T386569) (owner: ''Anzx)'
2025-02-19 08:33:53 <logmsgbot> !log dcausse@deploy2002 Started scap sync-world: Backport for [[gerrit:1120701|satwiktionary: add sitename, timezone, projectnamespace (T386631)]], [[gerrit:1120888|madwiki: add namespace aliases (T382087)]], [[gerrit:1120699|uzwikiquote: add logos (T386569)]]
2025-02-19 08:33:59 <stashbot> T386631: Post-creation work for satwiktionary - https://phabricator.wikimedia.org/T386631
2025-02-19 08:33:59 <stashbot> T382087: Add Indonesian language fallback aliases for Namespaces in Madurese - https://phabricator.wikimedia.org/T382087
2025-02-19 08:34:00 <stashbot> T386569: Proposed Revisions to the Uzbek Wikiquote Logo - https://phabricator.wikimedia.org/T386569
2025-02-19 08:34:37 <wikibugs> ('PS1) ''Brouberol: cirrus: add the opensearch.motd file [puppet] - ''https://gerrit.wikimedia.org/r/1120900 (https://phabricator.wikimedia.org/T380752)'
2025-02-19 08:37:00 <logmsgbot> !log dcausse@deploy2002 dcausse, anzx: Backport for [[gerrit:1120701|satwiktionary: add sitename, timezone, projectnamespace (T386631)]], [[gerrit:1120888|madwiki: add namespace aliases (T382087)]], [[gerrit:1120699|uzwikiquote: add logos (T386569)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
2025-02-19 08:37:05 <anzx> dcausse: checking
2025-02-19 08:37:09 <wikibugs> ('PS2) ''Elukey: role::etcd::v3::aux_k8s_etcd: remove backups [puppet] - ''https://gerrit.wikimedia.org/r/1120899 (https://phabricator.wikimedia.org/T385727)'
2025-02-19 08:37:30 <wikibugs> ('CR) ''CI reject: [V:''-1] role::etcd::v3::aux_k8s_etcd: remove backups [puppet] - ''https://gerrit.wikimedia.org/r/1120899 (https://phabricator.wikimedia.org/T385727) (owner: ''Elukey)'
2025-02-19 08:37:58 <logmsgbot> !log arnaudb@cumin1002 START - Cookbook sre.hosts.decommission for hosts moscovium.eqiad.wmnet
2025-02-19 08:38:23 <wikibugs> ('CR) ''Brouberol: [C:''+2] cirrus: add the opensearch.motd file [puppet] - ''https://gerrit.wikimedia.org/r/1120900 (https://phabricator.wikimedia.org/T380752) (owner: ''Brouberol)'
2025-02-19 08:39:03 <wikibugs> ('PS3) ''Elukey: role::etcd::v3::aux_k8s_etcd: remove backups [puppet] - ''https://gerrit.wikimedia.org/r/1120899 (https://phabricator.wikimedia.org/T385727)'
2025-02-19 08:39:56 <anzx> dcausse: all patches look good
2025-02-19 08:40:07 <dcausse> anzx: ack, deploying
2025-02-19 08:40:13 <logmsgbot> !log dcausse@deploy2002 dcausse, anzx: Continuing with sync
2025-02-19 08:40:15 <fabfur> !log upgrading haproxykafka package on apt repo to 0.3.5 (T374128)
2025-02-19 08:40:18 <stashbot> Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
2025-02-19 08:40:18 <stashbot> T374128: haproxykafka features - https://phabricator.wikimedia.org/T374128
2025-02-19 08:40:26 <wikibugs> 'SRE, ''Ganeti, ''Infrastructure-Foundations: Update remaining Ganeti servers in eqiad to Bookworm - https://phabricator.wikimedia.org/T382507#10562456 (''MoritzMuehlenhoff)'
2025-02-19 08:40:37 <wikibugs> ('PS1) ''Arnaudb: rt: remove cname [dns] - ''https://gerrit.wikimedia.org/r/1120901 (https://phabricator.wikimedia.org/T385777)'
2025-02-19 08:41:07 <logmsgbot> !log jmm@cumin2002 START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
2025-02-19 08:41:08 <wikibugs> ('CR) ''Elukey: [V:''+1] "PCC SUCCESS (CORE_DIFF 4): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/label=puppet7-compiler-node/4956/co" [puppet] - ''https://gerrit.wikimedia.org/r/1120899 (https://phabricator.wikimedia.org/T385727) (owner: ''Elukey)'
2025-02-19 08:41:20 <wikibugs> 'SRE, ''Ganeti, ''Infrastructure-Foundations: Update remaining Ganeti servers in eqiad to Bookworm - https://phabricator.wikimedia.org/T382507#10562457 (''ops-monitoring-bot) Draining ganeti1025.eqiad.wmnet of running VMs'
2025-02-19 08:41:27 <wikibugs> ('CR) ''Kamila Součková: [C:''+1] role::kubernetes::worker: add kartotherian-k8s-ssl to the lvs pools [puppet] - ''https://gerrit.wikimedia.org/r/1120893 (https://phabricator.wikimedia.org/T386648) (owner: ''Elukey)'
2025-02-19 08:41:42 <logmsgbot> !log jmm@cumin2002 END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
2025-02-19 08:41:51 <wikibugs> ('CR) ''Elukey: [V:''+1 C:''-1] "WIP sorry" [puppet] - ''https://gerrit.wikimedia.org/r/1120899 (https://phabricator.wikimedia.org/T385727) (owner: ''Elukey)'
2025-02-19 08:42:12 <wikibugs> ('PS1) ''Muehlenhoff: Switch ganeti1025 to nftables [puppet] - ''https://gerrit.wikimedia.org/r/1120902'
2025-02-19 08:42:15 <logmsgbot> !log jmm@cumin2002 START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
2025-02-19 08:42:30 <wikibugs> 'SRE, ''Ganeti, ''Infrastructure-Foundations: Update remaining Ganeti servers in eqiad to Bookworm - https://phabricator.wikimedia.org/T382507#10562458 (''ops-monitoring-bot) Draining ganeti1025.eqiad.wmnet of running VMs'
2025-02-19 08:42:35 <logmsgbot> !log vgutierrez@cumin1002 END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru and A:cp
2025-02-19 08:42:52 <logmsgbot> !log arnaudb@cumin1002 START - Cookbook sre.dns.netbox
2025-02-19 08:45:21 <logmsgbot> !log vgutierrez@cumin1002 END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru and A:cp
2025-02-19 08:45:36 <fabfur> !log upgrading haproxykafka to 0.3.5 on cp4037 to test new feature (T374128)
2025-02-19 08:45:39 <stashbot> Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
2025-02-19 08:45:40 <stashbot> T374128: haproxykafka features - https://phabricator.wikimedia.org/T374128
2025-02-19 08:46:25 <wikibugs> ('PS4) ''Elukey: role::etcd::v3::aux_k8s_etcd: remove backups [puppet] - ''https://gerrit.wikimedia.org/r/1120899 (https://phabricator.wikimedia.org/T385727)'
2025-02-19 08:46:30 <logmsgbot> !log arnaudb@cumin1002 START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moscovium.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
2025-02-19 08:46:51 <logmsgbot> !log arnaudb@cumin1002 END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moscovium.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - arnaudb@cumin1002"
2025-02-19 08:46:51 <logmsgbot> !log arnaudb@cumin1002 END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
2025-02-19 08:46:52 <logmsgbot> !log arnaudb@cumin1002 END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts moscovium.eqiad.wmnet
2025-02-19 08:46:54 <logmsgbot> !log dcausse@deploy2002 Finished scap sync-world: Backport for [[gerrit:1120701|satwiktionary: add sitename, timezone, projectnamespace (T386631)]], [[gerrit:1120888|madwiki: add namespace aliases (T382087)]], [[gerrit:1120699|uzwikiquote: add logos (T386569)]] (duration: 13m 00s)
2025-02-19 08:47:02 <stashbot> T386631: Post-creation work for satwiktionary - https://phabricator.wikimedia.org/T386631
2025-02-19 08:47:02 <stashbot> T382087: Add Indonesian language fallback aliases for Namespaces in Madurese - https://phabricator.wikimedia.org/T382087
2025-02-19 08:47:02 <stashbot> T386569: Proposed Revisions to the Uzbek Wikiquote Logo - https://phabricator.wikimedia.org/T386569
2025-02-19 08:47:07 <wikibugs> ('CR) ''Elukey: "reverted back to a simpler patch, I'll do the clean up manually, a more generic code needs some thinking to avoid polluting too many nodes" [puppet] - ''https://gerrit.wikimedia.org/r/1120899 (https://phabricator.wikimedia.org/T385727) (owner: ''Elukey)'
2025-02-19 08:47:09 <dcausse> anzx: should be live
2025-02-19 08:47:13 <anzx> dcausse: thank you
2025-02-19 08:47:18 <dcausse> yw! :)
2025-02-19 08:47:50 <dcausse> MichaelG_WMF: going to ship your patch, are you still around?
2025-02-19 08:47:55 <MichaelG_WMF> yes
2025-02-19 08:47:58 <dcausse> ack
2025-02-19 08:48:43 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] "Approved by dcausse@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120646 (https://phabricator.wikimedia.org/T386739) (owner: ''Michael Große)'
2025-02-19 08:49:20 <wikibugs> ('PS1) ''Brouberol: opensearcgh:cirrus: include the diffie-hellman parameter file [puppet] - ''https://gerrit.wikimedia.org/r/1120903 (https://phabricator.wikimedia.org/T380752)'
2025-02-19 08:50:05 <wikibugs> ('Merged) ''jenkins-bot: testwiki: enable surfacing structured task experiment [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120646 (https://phabricator.wikimedia.org/T386739) (owner: ''Michael Große)'
2025-02-19 08:50:31 <logmsgbot> !log dcausse@deploy2002 Started scap sync-world: Backport for [[gerrit:1120646|testwiki: enable surfacing structured task experiment (T386739)]]
2025-02-19 08:50:35 <stashbot> T386739: Surfacing "Add a link" Structured Tasks: Test Wikipedia Release - https://phabricator.wikimedia.org/T386739
2025-02-19 08:52:09 <wikibugs> ('CR) ''Elukey: "Thanks for working on this!" [puppet] - ''https://gerrit.wikimedia.org/r/1120602 (https://phabricator.wikimedia.org/T385727) (owner: ''Herron)'
2025-02-19 08:52:37 <wikibugs> ('CR) ''Elukey: [V:''+1 C:''+2] role::kubernetes::worker: add kartotherian-k8s-ssl to the lvs pools [puppet] - ''https://gerrit.wikimedia.org/r/1120893 (https://phabricator.wikimedia.org/T386648) (owner: ''Elukey)'
2025-02-19 08:53:03 <wikibugs> ('CR) ''Muehlenhoff: [C:''+1] "LGTM" [puppet] - ''https://gerrit.wikimedia.org/r/1120899 (https://phabricator.wikimedia.org/T385727) (owner: ''Elukey)'
2025-02-19 08:53:23 <logmsgbot> !log dcausse@deploy2002 migr, dcausse: Backport for [[gerrit:1120646|testwiki: enable surfacing structured task experiment (T386739)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
2025-02-19 08:53:36 <wikibugs> ('CR) ''Elukey: [C:''+2] role::etcd::v3::aux_k8s_etcd: remove backups [puppet] - ''https://gerrit.wikimedia.org/r/1120899 (https://phabricator.wikimedia.org/T385727) (owner: ''Elukey)'
2025-02-19 08:53:36 <MichaelG_WMF> is testing
2025-02-19 08:54:19 <MichaelG_WMF> @dcausse It works as expected, thank you!
2025-02-19 08:54:27 <wikibugs> ('CR) ''Brouberol: [C:''+2] opensearcgh:cirrus: include the diffie-hellman parameter file [puppet] - ''https://gerrit.wikimedia.org/r/1120903 (https://phabricator.wikimedia.org/T380752) (owner: ''Brouberol)'
2025-02-19 08:54:37 <dcausse> MichaelG_WMF: cool, shipping then
2025-02-19 08:54:39 <logmsgbot> !log dcausse@deploy2002 migr, dcausse: Continuing with sync
2025-02-19 08:55:13 <wikibugs> ('PS4) ''Anzx: knwiki, knwikisource, tcywikisource: add confirmed user usergroup [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120891 (https://phabricator.wikimedia.org/T386781)'
2025-02-19 08:55:56 <wikibugs> ('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Wednesday, February 19 UTC afternoon backport window](https://wikitech.wikimedia.org/wiki/Deployments#dep" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120891 (https://phabricator.wikimedia.org/T386781) (owner: ''Anzx)'
2025-02-19 08:56:30 <dcausse> jouncebot: next
2025-02-19 08:56:31 <jouncebot> In 0 hour(s) and 3 minute(s): MediaWiki train - Utc-7+Utc-0 Version (secondary timeslot) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T0900)
2025-02-19 08:56:54 <dcausse> I'll have two backports to ship after this one, hope it's ok to take a bit of the mw train time
2025-02-19 08:57:33 <dcausse> but please let me know if not
2025-02-19 08:58:16 <logmsgbot> !log vgutierrez@cumin1002 START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp
2025-02-19 08:58:26 <logmsgbot> !log vgutierrez@cumin1002 START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp
2025-02-19 08:59:24 <icinga-wm> RECOVERY - Elasticsearch HTTPS for relforge-eqiad-small-alpha on relforge1004 is OK: SSL OK - Certificate relforge1004.eqiad.wmnet valid until 2025-03-12 19:54:00 +0000 (expires in 21 days) https://wikitech.wikimedia.org/wiki/Search
2025-02-19 08:59:24 <icinga-wm> RECOVERY - Elasticsearch HTTPS for relforge-eqiad on relforge1004 is OK: SSL OK - Certificate relforge1004.eqiad.wmnet valid until 2025-03-12 19:54:00 +0000 (expires in 21 days) https://wikitech.wikimedia.org/wiki/Search
2025-02-19 09:00:05 <jouncebot> dancy and andre: I, the Bot under the Fountain, call upon thee, The Deployer, to do MediaWiki train - Utc-7+Utc-0 Version (secondary timeslot) deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T0900).
2025-02-19 09:00:36 <dcausse> dancy, andre the backport window is running a bit late, sorry about that
2025-02-19 09:01:07 <andre> dcausse: we don't run the train for the next 10 hours, no problem :)
2025-02-19 09:01:12 <logmsgbot> !log dcausse@deploy2002 Finished scap sync-world: Backport for [[gerrit:1120646|testwiki: enable surfacing structured task experiment (T386739)]] (duration: 10m 41s)
2025-02-19 09:01:13 <dcausse> ah
2025-02-19 09:01:16 <stashbot> T386739: Surfacing "Add a link" Structured Tasks: Test Wikipedia Release - https://phabricator.wikimedia.org/T386739
2025-02-19 09:01:18 <dcausse> andre: good for me :)
2025-02-19 09:01:22 <andre> hehe
2025-02-19 09:02:09 <dcausse> MichaelG_WMF: should be live
2025-02-19 09:02:20 <elukey> !log elukey@cumin1002:~$ sudo cumin --m async 'aux-k8s-etcd*' 'systemctl stop etcd-backup.timer etcd-backup.service' 'rm /lib/systemd/system/etcd-backup.service /lib/systemd/system/etcd-backup.timer' 'systemctl daemon-reload' - T385727
2025-02-19 09:02:23 <stashbot> Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
2025-02-19 09:02:24 <stashbot> T385727: etcd: adapt etcd-backup.py for etcd 3.4 - https://phabricator.wikimedia.org/T385727
2025-02-19 09:02:32 <MichaelG_WMF> @dcausse Thanks! 🙏
2025-02-19 09:02:36 <dcausse> yw! :)
2025-02-19 09:02:57 <dcausse> extending the backport window a bit to ship two more patches
2025-02-19 09:04:25 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] "Approved by dcausse@deploy2002 using scap backport" [extensions/PageAssessments] (wmf/1.44.0-wmf.16) - ''https://gerrit.wikimedia.org/r/1120895 (owner: ''DCausse)'
2025-02-19 09:04:25 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] "Approved by dcausse@deploy2002 using scap backport" [extensions/PageAssessments] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1120896 (owner: ''DCausse)'
2025-02-19 09:04:32 <jinxer-wm> RESOLVED: [3x] SystemdUnitFailed: etcd-backup.service on aux-k8s-etcd2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
2025-02-19 09:05:46 <icinga-wm> RECOVERY - Router interfaces on cr2-codfw is OK: OK: host 208.80.153.193, interfaces up: 113, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
2025-02-19 09:05:46 <icinga-wm> RECOVERY - Router interfaces on cr3-ulsfo is OK: OK: host 198.35.26.192, interfaces up: 70, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
2025-02-19 09:05:48 <wikibugs> ('PS1) ''Brouberol: opensearch:cirrus: install curator for opensearch [puppet] - ''https://gerrit.wikimedia.org/r/1120908 (https://phabricator.wikimedia.org/T380752)'
2025-02-19 09:06:12 <icinga-wm> RECOVERY - Router interfaces on cr2-eqord is OK: OK: host 208.80.154.198, interfaces up: 46, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
2025-02-19 09:07:14 <wikibugs> ('Merged) ''jenkins-bot: Do not update the search index if the assessment did not change [extensions/PageAssessments] (wmf/1.44.0-wmf.16) - ''https://gerrit.wikimedia.org/r/1120895 (owner: ''DCausse)'
2025-02-19 09:07:14 <wikibugs> ('Merged) ''jenkins-bot: Do not update the search index if the assessment did not change [extensions/PageAssessments] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1120896 (owner: ''DCausse)'
2025-02-19 09:07:28 <fabfur> !log upgrading haproxykafka to 0.3.5 on ulsfo (T374128)
2025-02-19 09:07:31 <stashbot> Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
2025-02-19 09:07:31 <stashbot> T374128: haproxykafka features - https://phabricator.wikimedia.org/T374128
2025-02-19 09:07:44 <logmsgbot> !log dcausse@deploy2002 Started scap sync-world: Backport for [[gerrit:1120895|Do not update the search index if the assessment did not change]], [[gerrit:1120896|Do not update the search index if the assessment did not change]]
2025-02-19 09:08:36 <wikibugs> ('CR) ''Brouberol: [C:''+2] opensearch:cirrus: install curator for opensearch [puppet] - ''https://gerrit.wikimedia.org/r/1120908 (https://phabricator.wikimedia.org/T380752) (owner: ''Brouberol)'
2025-02-19 09:09:37 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=yes; selector: name=wikikube-worker200.*.codfw.wmnet,dc=codfw,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 09:10:34 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=yes; selector: name=wikikube-worker100.*.eqiad.wmnet,dc=eqiad,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 09:10:39 <logmsgbot> !log dcausse@deploy2002 dcausse: Backport for [[gerrit:1120895|Do not update the search index if the assessment did not change]], [[gerrit:1120896|Do not update the search index if the assessment did not change]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
2025-02-19 09:10:59 <logmsgbot> !log dcausse@deploy2002 dcausse: Continuing with sync
2025-02-19 09:11:16 <jinxer-wm> FIRING: MediaWikiLatencyExceeded: p75 latency high: eqiad mw-parsoid/main (k8s) 1.264s - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook#Average_latency_exceeded - https://grafana.wikimedia.org/d/U7JT--knk/mw-on-k8s?orgId=1&viewPanel=55&var-dc=eqiad%20prometheus/k8s&var-service=mediawiki&var-namespace=mw-parsoid&var-release=main - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiLatencyExceeded
2025-02-19 09:16:00 <logmsgbot> !log klausman@cumin1002 START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1002
2025-02-19 09:16:16 <jinxer-wm> RESOLVED: MediaWikiLatencyExceeded: p75 latency high: eqiad mw-parsoid/main (k8s) 1.264s - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook#Average_latency_exceeded - https://grafana.wikimedia.org/d/U7JT--knk/mw-on-k8s?orgId=1&viewPanel=55&var-dc=eqiad%20prometheus/k8s&var-service=mediawiki&var-namespace=mw-parsoid&var-release=main - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiLatencyExceeded
2025-02-19 09:17:36 <logmsgbot> !log dcausse@deploy2002 Finished scap sync-world: Backport for [[gerrit:1120895|Do not update the search index if the assessment did not change]], [[gerrit:1120896|Do not update the search index if the assessment did not change]] (duration: 09m 51s)
2025-02-19 09:18:32 <dcausse> !log closing the UTC morning backport window
2025-02-19 09:18:33 <stashbot> Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
2025-02-19 09:20:25 <wikibugs> ('PS1) ''Michael Große: Growth: increase minimum tasks per topic on idwiki; ruwiki => default [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120904 (https://phabricator.wikimedia.org/T385343)'
2025-02-19 09:27:35 <logmsgbot> !log vgutierrez@cumin1002 END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp
2025-02-19 09:28:04 <wikibugs> ('PS1) ''Brouberol: opensearch:cirrus: pin elasticsearch-curator version [puppet] - ''https://gerrit.wikimedia.org/r/1120914 (https://phabricator.wikimedia.org/T380752)'
2025-02-19 09:28:25 <wikibugs> ('CR) ''CI reject: [V:''-1] opensearch:cirrus: pin elasticsearch-curator version [puppet] - ''https://gerrit.wikimedia.org/r/1120914 (https://phabricator.wikimedia.org/T380752) (owner: ''Brouberol)'
2025-02-19 09:28:31 <wikibugs> ('PS2) ''Brouberol: opensearch:cirrus: pin elasticsearch-curator version [puppet] - ''https://gerrit.wikimedia.org/r/1120914 (https://phabricator.wikimedia.org/T380752)'
2025-02-19 09:28:51 <wikibugs> ('CR) ''CI reject: [V:''-1] opensearch:cirrus: pin elasticsearch-curator version [puppet] - ''https://gerrit.wikimedia.org/r/1120914 (https://phabricator.wikimedia.org/T380752) (owner: ''Brouberol)'
2025-02-19 09:28:54 <logmsgbot> !log vgutierrez@cumin1002 END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp
2025-02-19 09:29:19 <wikibugs> ('CR) ''Brouberol: [V:''+1] "PCC SUCCESS (CORE_DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/label=puppet7-compiler-node/4957/co" [puppet] - ''https://gerrit.wikimedia.org/r/1120914 (https://phabricator.wikimedia.org/T380752) (owner: ''Brouberol)'
2025-02-19 09:30:03 <wikibugs> ('PS3) ''Brouberol: opensearch:cirrus: pin elasticsearch-curator version [puppet] - ''https://gerrit.wikimedia.org/r/1120914 (https://phabricator.wikimedia.org/T380752)'
2025-02-19 09:33:12 <wikibugs> ('CR) ''Brouberol: [C:''+2] opensearch:cirrus: pin elasticsearch-curator version [puppet] - ''https://gerrit.wikimedia.org/r/1120914 (https://phabricator.wikimedia.org/T380752) (owner: ''Brouberol)'
2025-02-19 09:33:42 <logmsgbot> !log klausman@cumin1002 END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1002
2025-02-19 09:33:58 <logmsgbot> !log klausman@cumin1002 START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1002
2025-02-19 09:35:42 <Emperor> !log restart envoy/swift on ms-fe1013
2025-02-19 09:35:44 <stashbot> Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
2025-02-19 09:36:27 <logmsgbot> !log vgutierrez@cumin1002 START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp
2025-02-19 09:36:33 <logmsgbot> !log vgutierrez@cumin1002 START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_codfw and A:cp
2025-02-19 09:37:16 <Emperor> !log restart envoy/swift on ms-fe201[2-4]
2025-02-19 09:37:18 <stashbot> Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
2025-02-19 09:38:40 <jinxer-wm> FIRING: KubernetesRsyslogDown: rsyslog on wikikube-worker1150:9105 is missing kubernetes logs - https://wikitech.wikimedia.org/wiki/Kubernetes/Logging#Common_issues - https://grafana.wikimedia.org/d/OagQjQmnk?var-server=wikikube-worker1150 - https://alerts.wikimedia.org/?q=alertname%3DKubernetesRsyslogDown
2025-02-19 09:40:50 <icinga-wm> PROBLEM - Router interfaces on cr1-codfw is CRITICAL: CRITICAL: host 208.80.153.192, interfaces up: 128, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
2025-02-19 09:40:50 <icinga-wm> PROBLEM - Router interfaces on cr4-ulsfo is CRITICAL: CRITICAL: host 198.35.26.193, interfaces up: 70, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
2025-02-19 09:43:32 <wikibugs> ('PS1) ''Arnaudb: moscovium: remove from site.pp [puppet] - ''https://gerrit.wikimedia.org/r/1120917 (https://phabricator.wikimedia.org/T385777)'
2025-02-19 09:43:40 <jinxer-wm> RESOLVED: KubernetesRsyslogDown: rsyslog on wikikube-worker1150:9105 is missing kubernetes logs - https://wikitech.wikimedia.org/wiki/Kubernetes/Logging#Common_issues - https://grafana.wikimedia.org/d/OagQjQmnk?var-server=wikikube-worker1150 - https://alerts.wikimedia.org/?q=alertname%3DKubernetesRsyslogDown
2025-02-19 09:51:31 <logmsgbot> !log klausman@cumin1002 END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1002
2025-02-19 09:58:15 <wikibugs> ('CR) ''Muehlenhoff: [C:''+1] moscovium: remove from site.pp [puppet] - ''https://gerrit.wikimedia.org/r/1120917 (https://phabricator.wikimedia.org/T385777) (owner: ''Arnaudb)'
2025-02-19 09:58:27 <logmsgbot> !log vgutierrez@cumin1002 END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp
2025-02-19 10:00:06 <logmsgbot> !log vgutierrez@cumin1002 END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_codfw and A:cp
2025-02-19 10:00:44 <logmsgbot> !log vgutierrez@cumin1002 START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp
2025-02-19 10:00:55 <logmsgbot> !log vgutierrez@cumin1002 START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp
2025-02-19 10:09:51 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=yes:weight=30; selector: name=wikikube-worker1138.codfw.wmnet,dc=eqiad,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 10:09:59 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=yes:weight=30; selector: name=wikikube-worker1138.eqiad.wmnet,dc=eqiad,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 10:10:32 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=yes:weight=30; selector: name=wikikube-worker1002.eqiad.wmnet,dc=eqiad,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 10:16:54 <wikibugs> ('CR) ''MVernon: "Hi," [puppet] - ''https://gerrit.wikimedia.org/r/1120496 (https://phabricator.wikimedia.org/T385564) (owner: ''Vgutierrez)'
2025-02-19 10:24:59 <logmsgbot> !log vgutierrez@cumin1002 END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp
2025-02-19 10:30:15 <wikibugs> ('PS5) ''Arturo Borrero Gonzalez: cloudgw: move icmp checks under wmcs [puppet] - ''https://gerrit.wikimedia.org/r/1100819 (https://phabricator.wikimedia.org/T381580) (owner: ''Tiziano Fogli)'
2025-02-19 10:30:36 <wikibugs> ('CR) ''Arturo Borrero Gonzalez: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1100819 (https://phabricator.wikimedia.org/T381580) (owner: ''Tiziano Fogli)'
2025-02-19 10:31:10 <logmsgbot> !log vgutierrez@cumin1002 END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp
2025-02-19 10:31:21 <wikibugs> ('CR) ''Arturo Borrero Gonzalez: "let me know how this looks @tfogli@wikimedia.org" [puppet] - ''https://gerrit.wikimedia.org/r/1100819 (https://phabricator.wikimedia.org/T381580) (owner: ''Tiziano Fogli)'
2025-02-19 10:32:36 <wikibugs> ('PS6) ''Arturo Borrero Gonzalez: cloudgw: move icmp checks under wmcs [puppet] - ''https://gerrit.wikimedia.org/r/1100819 (https://phabricator.wikimedia.org/T381580) (owner: ''Tiziano Fogli)'
2025-02-19 10:32:47 <wikibugs> ('CR) ''Arturo Borrero Gonzalez: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1100819 (https://phabricator.wikimedia.org/T381580) (owner: ''Tiziano Fogli)'
2025-02-19 10:33:01 <icinga-wm> RECOVERY - Router interfaces on cr1-codfw is OK: OK: host 208.80.153.192, interfaces up: 129, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
2025-02-19 10:33:09 <wikibugs> ('PS1) ''Fabfur: haproxykafka: limit memory usage to 5% of total physical memory [puppet] - ''https://gerrit.wikimedia.org/r/1120922 (https://phabricator.wikimedia.org/T386753)'
2025-02-19 10:33:47 <wikibugs> ('PS2) ''Fabfur: haproxykafka: limit memory usage to 5% of total physical memory [puppet] - ''https://gerrit.wikimedia.org/r/1120922 (https://phabricator.wikimedia.org/T386753)'
2025-02-19 10:34:01 <icinga-wm> RECOVERY - Router interfaces on cr4-ulsfo is OK: OK: host 198.35.26.193, interfaces up: 71, down: 0, dormant: 0, excluded: 0, unused: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
2025-02-19 10:36:16 <wikibugs> ('CR) ''Urbanecm: [C:''+1] "lgtm" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120904 (https://phabricator.wikimedia.org/T385343) (owner: ''Michael Große)'
2025-02-19 10:38:45 <wikibugs> ('PS1) ''Filippo Giunchedi: o11y: promote thanos compact alerts to critical [alerts] - ''https://gerrit.wikimedia.org/r/1120923'
2025-02-19 10:38:51 <wikibugs> ('CR) ''CI reject: [V:''-1] o11y: promote thanos compact alerts to critical [alerts] - ''https://gerrit.wikimedia.org/r/1120923 (owner: ''Filippo Giunchedi)'
2025-02-19 10:39:29 <icinga-wm> RECOVERY - Disk space on titan2001 is OK: DISK OK https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/d/000000377/host-overview?var-server=titan2001&var-datasource=codfw+prometheus/ops
2025-02-19 10:39:40 <wikibugs> ('CR) ''Filippo Giunchedi: "recheck" [alerts] - ''https://gerrit.wikimedia.org/r/1120923 (owner: ''Filippo Giunchedi)'
2025-02-19 10:40:33 <wikibugs> ('CR) ''Michael Große: "I think this also needs to set `wgGESurfacingStructuredTasksEnabled` to `true` for this wiki." [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120505 (https://phabricator.wikimedia.org/T385343) (owner: ''Sergio Gimeno)'
2025-02-19 10:42:49 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=no; selector: name=maps2005.codfw.wmnet,dc=codfw,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 10:43:01 <wikibugs> ('PS1) ''Fabfur: hiera: reasonable message batches number [puppet] - ''https://gerrit.wikimedia.org/r/1120924 (https://phabricator.wikimedia.org/T386753)'
2025-02-19 10:43:50 <wikibugs> ('CR) ''Fabfur: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1120924 (https://phabricator.wikimedia.org/T386753) (owner: ''Fabfur)'
2025-02-19 10:44:22 <wikibugs> ('CR) ''Vgutierrez: [C:''+1] "commit message nitpick aside, LGTM" [puppet] - ''https://gerrit.wikimedia.org/r/1120924 (https://phabricator.wikimedia.org/T386753) (owner: ''Fabfur)'
2025-02-19 10:45:39 <wikibugs> ('CR) ''Filippo Giunchedi: "recheck" [alerts] - ''https://gerrit.wikimedia.org/r/1120923 (owner: ''Filippo Giunchedi)'
2025-02-19 10:45:42 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=inactive; selector: name=maps2006.codfw.wmnet,dc=codfw,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 10:45:46 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=inactive; selector: name=maps2005.codfw.wmnet,dc=codfw,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 10:46:30 <wikibugs> ('PS1) ''Urbanecm: [Growth] enwiki: Release Add Link to 15% of newcomers [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120925 (https://phabricator.wikimedia.org/T386029)'
2025-02-19 10:47:10 <wikibugs> ('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Wednesday, February 19 UTC afternoon backport window](https://wikitech.wikimedia.org/wiki/Deployments#dep" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120904 (https://phabricator.wikimedia.org/T385343) (owner: ''Michael Große)'
2025-02-19 10:48:06 <wikibugs> ('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Wednesday, February 19 UTC afternoon backport window](https://wikitech.wikimedia.org/wiki/Deployments#dep" [extensions/GrowthExperiments] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1120618 (https://phabricator.wikimedia.org/T386490) (owner: ''Michael Große)'
2025-02-19 10:48:14 <wikibugs> ('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Wednesday, February 19 UTC afternoon backport window](https://wikitech.wikimedia.org/wiki/Deployments#dep" [extensions/GrowthExperiments] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1120620 (https://phabricator.wikimedia.org/T386490) (owner: ''Michael Große)'
2025-02-19 10:48:19 <wikibugs> ('PS1) ''Vgutierrez: aptrepo,haproxy: Allow installing HAProxy 1.3 on bullseye [puppet] - ''https://gerrit.wikimedia.org/r/1120926 (https://phabricator.wikimedia.org/T386796)'
2025-02-19 10:48:22 <wikibugs> ('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Wednesday, February 19 UTC afternoon backport window](https://wikitech.wikimedia.org/wiki/Deployments#dep" [extensions/GrowthExperiments] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1120643 (owner: ''Michael Große)'
2025-02-19 10:48:39 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=inactive; selector: name=maps1005.eqiad.wmnet,dc=eqiad,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 10:52:24 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/weight=10; selector: name=wikikube-worker1002.eqiad.wmnet.eqiad.wmnet,dc=eqiad,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 10:53:15 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=yes:weight=10; selector: name=wikikube-worker1002.eqiad.wmnet,dc=eqiad,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 10:53:59 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=inactive; selector: name=maps1006.eqiad.wmnet,dc=eqiad,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 10:55:02 <wikibugs> ('PS2) ''Fabfur: hiera: reasonable message batches number [puppet] - ''https://gerrit.wikimedia.org/r/1120924 (https://phabricator.wikimedia.org/T386753)'
2025-02-19 10:56:00 <wikibugs> ('CR) ''Vgutierrez: haproxykafka: limit memory usage to 5% of total physical memory (''1 comment) [puppet] - ''https://gerrit.wikimedia.org/r/1120922 (https://phabricator.wikimedia.org/T386753) (owner: ''Fabfur)'
2025-02-19 10:56:48 <wikibugs> ('CR) ''Fabfur: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1120924 (https://phabricator.wikimedia.org/T386753) (owner: ''Fabfur)'
2025-02-19 10:57:56 <wikibugs> 'SRE, ''Infrastructure-Foundations, ''netops: Upgrade network devices to Junos 20+ - https://phabricator.wikimedia.org/T316539#10562858 (''ayounsi) ''Open→''Resolved a:''ayounsi Nop, thanks for the ping. There is now {T364092}'
2025-02-19 11:00:05 <jouncebot> Deploy window MediaWiki infrastructure (UTC mid-day) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T1100)
2025-02-19 11:00:24 <wikibugs> ('PS3) ''Fabfur: hiera, hpk: reasonable message batches number [puppet] - ''https://gerrit.wikimedia.org/r/1120924 (https://phabricator.wikimedia.org/T386753)'
2025-02-19 11:01:02 <wikibugs> ('CR) ''Fabfur: hiera, hpk: reasonable message batches number (''1 comment) [puppet] - ''https://gerrit.wikimedia.org/r/1120924 (https://phabricator.wikimedia.org/T386753) (owner: ''Fabfur)'
2025-02-19 11:01:09 <wikibugs> ('CR) ''Fabfur: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1120924 (https://phabricator.wikimedia.org/T386753) (owner: ''Fabfur)'
2025-02-19 11:01:56 <wikibugs> ('CR) ''Vgutierrez: [C:''+1] hiera, hpk: reasonable message batches number [puppet] - ''https://gerrit.wikimedia.org/r/1120924 (https://phabricator.wikimedia.org/T386753) (owner: ''Fabfur)'
2025-02-19 11:04:43 <wikibugs> ('CR) ''Fabfur: [C:''+2] hiera, hpk: reasonable message batches number [puppet] - ''https://gerrit.wikimedia.org/r/1120924 (https://phabricator.wikimedia.org/T386753) (owner: ''Fabfur)'
2025-02-19 11:06:37 <wikibugs> ('PS1) ''Ladsgroup: ChangeTagsStore: Lengthen cache times [core] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1120928 (https://phabricator.wikimedia.org/T384921)'
2025-02-19 11:06:43 <Amir1> jouncebot: nowandnext
2025-02-19 11:06:43 <jouncebot> For the next 0 hour(s) and 53 minute(s): MediaWiki infrastructure (UTC mid-day) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T1100)
2025-02-19 11:06:43 <jouncebot> In 0 hour(s) and 53 minute(s): Services – Citoid / Zotero (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T1200)
2025-02-19 11:07:13 <wikibugs> ('PS1) ''Ladsgroup: ChangeTagsStore: Lengthen cache times [core] (wmf/1.44.0-wmf.16) - ''https://gerrit.wikimedia.org/r/1120929 (https://phabricator.wikimedia.org/T384921)'
2025-02-19 11:07:18 <wikibugs> ('CR) ''Ladsgroup: [C:''+2] ChangeTagsStore: Lengthen cache times [core] (wmf/1.44.0-wmf.16) - ''https://gerrit.wikimedia.org/r/1120929 (https://phabricator.wikimedia.org/T384921) (owner: ''Ladsgroup)'
2025-02-19 11:07:22 <wikibugs> ('CR) ''Ladsgroup: [C:''+2] ChangeTagsStore: Lengthen cache times [core] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1120928 (https://phabricator.wikimedia.org/T384921) (owner: ''Ladsgroup)'
2025-02-19 11:09:14 <fabfur> !log upgrading haproxykafka to 0.3.5 on all DCs (T374128)
2025-02-19 11:09:17 <stashbot> Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
2025-02-19 11:09:18 <stashbot> T374128: haproxykafka features - https://phabricator.wikimedia.org/T374128
2025-02-19 11:13:07 <wikibugs> ('CR) ''Filippo Giunchedi: "Please take a look, modulo CI failures which are tracked at https://phabricator.wikimedia.org/T386784" [alerts] - ''https://gerrit.wikimedia.org/r/1120923 (owner: ''Filippo Giunchedi)'
2025-02-19 11:16:16 <wikibugs> ('CR) ''Michael Große: [C:''+1] [Growth] enwiki: Release Add Link to 15% of newcomers [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120925 (https://phabricator.wikimedia.org/T386029) (owner: ''Urbanecm)'
2025-02-19 11:17:49 <wikibugs> ('CR) ''CI reject: [V:''-1] ChangeTagsStore: Lengthen cache times [core] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1120928 (https://phabricator.wikimedia.org/T384921) (owner: ''Ladsgroup)'
2025-02-19 11:18:38 <wikibugs> ('Merged) ''jenkins-bot: ChangeTagsStore: Lengthen cache times [core] (wmf/1.44.0-wmf.16) - ''https://gerrit.wikimedia.org/r/1120929 (https://phabricator.wikimedia.org/T384921) (owner: ''Ladsgroup)'
2025-02-19 11:19:03 <wikibugs> ('Merged) ''jenkins-bot: ChangeTagsStore: Lengthen cache times [core] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1120928 (https://phabricator.wikimedia.org/T384921) (owner: ''Ladsgroup)'
2025-02-19 11:24:28 <logmsgbot> !log ladsgroup@deploy2002 Started scap sync-world: Backport for [[gerrit:1120928|ChangeTagsStore: Lengthen cache times (T384921)]], [[gerrit:1120929|ChangeTagsStore: Lengthen cache times (T384921)]]
2025-02-19 11:24:30 <logmsgbot> !log jmm@cumin2002 END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
2025-02-19 11:24:32 <stashbot> T384921: OAuth oauth_registered_consumer table: Reads to the table are exceeding transaction profiler limits at a rate of ~4 per second - https://phabricator.wikimedia.org/T384921
2025-02-19 11:27:29 <logmsgbot> !log ladsgroup@deploy2002 ladsgroup: Backport for [[gerrit:1120928|ChangeTagsStore: Lengthen cache times (T384921)]], [[gerrit:1120929|ChangeTagsStore: Lengthen cache times (T384921)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
2025-02-19 11:27:59 <logmsgbot> !log ladsgroup@deploy2002 ladsgroup: Continuing with sync
2025-02-19 11:28:24 <logmsgbot> !log fabfur@cumin1002 START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp
2025-02-19 11:29:15 <logmsgbot> !log jmm@cumin2002 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ganeti1025.eqiad.wmnet with reason: remove from cluster for reimage
2025-02-19 11:29:22 <wikibugs> 'SRE, ''Ganeti, ''Infrastructure-Foundations: Update remaining Ganeti servers in eqiad to Bookworm - https://phabricator.wikimedia.org/T382507#10562969 (''ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=f7811566-34c7-44ec-b90f-ec261439dabd) set by jmm@cumin2002 for 1 day, 0:00:00 on 1 host(...'
2025-02-19 11:29:59 <wikibugs> ('CR) ''Muehlenhoff: [C:''+2] Switch ganeti1025 to nftables [puppet] - ''https://gerrit.wikimedia.org/r/1120902 (owner: ''Muehlenhoff)'
2025-02-19 11:33:00 <logmsgbot> !log jmm@cumin2002 START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ganeti1025.eqiad.wmnet
2025-02-19 11:34:44 <logmsgbot> !log ladsgroup@deploy2002 Finished scap sync-world: Backport for [[gerrit:1120928|ChangeTagsStore: Lengthen cache times (T384921)]], [[gerrit:1120929|ChangeTagsStore: Lengthen cache times (T384921)]] (duration: 10m 16s)
2025-02-19 11:34:48 <stashbot> T384921: OAuth oauth_registered_consumer table: Reads to the table are exceeding transaction profiler limits at a rate of ~4 per second - https://phabricator.wikimedia.org/T384921
2025-02-19 11:35:15 <jinxer-wm> FIRING: PHPFPMTooBusy: Not enough idle PHP-FPM workers for Mediawiki mw-web/canary at codfw: 25% idle - https://bit.ly/wmf-fpmsat - https://grafana.wikimedia.org/d/U7JT--knk/mw-on-k8s?orgId=1&viewPanel=84&var-dc=codfw%20prometheus/k8s&var-service=mediawiki&var-namespace=mw-web&var-container_name=All&var-release=canary - https://alerts.wikimedia.org/?q=alertname%3DPHPFPMTooBusy
2025-02-19 11:35:25 <wikibugs> ('CR) ''Ladsgroup: [C:''-1] "We have to do this for important reasons, unfortunately we can't just fully revert it back. At least not yet. Can you give us list of para" [puppet] - ''https://gerrit.wikimedia.org/r/1080357 (https://phabricator.wikimedia.org/T318285) (owner: ''Simon04)'
2025-02-19 11:40:15 <jinxer-wm> RESOLVED: PHPFPMTooBusy: Not enough idle PHP-FPM workers for Mediawiki mw-web/canary at codfw: 25% idle - https://bit.ly/wmf-fpmsat - https://grafana.wikimedia.org/d/U7JT--knk/mw-on-k8s?orgId=1&viewPanel=84&var-dc=codfw%20prometheus/k8s&var-service=mediawiki&var-namespace=mw-web&var-container_name=All&var-release=canary - https://alerts.wikimedia.org/?q=alertname%3DPHPFPMTooBusy
2025-02-19 11:43:25 <wikibugs> ('CR) ''Andrew Bogott: [C:''+2] cloudgw1003: take over cloudgw1001 [puppet] - ''https://gerrit.wikimedia.org/r/1114997 (https://phabricator.wikimedia.org/T382356) (owner: ''Arturo Borrero Gonzalez)'
2025-02-19 11:44:09 <kart_> OK to do deployment of MinT/machinetranslation service?
2025-02-19 11:44:36 <hnowlan> jouncebot: nowandnext
2025-02-19 11:44:36 <jouncebot> For the next 0 hour(s) and 15 minute(s): MediaWiki infrastructure (UTC mid-day) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T1100)
2025-02-19 11:44:36 <jouncebot> In 0 hour(s) and 15 minute(s): Services – Citoid / Zotero (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T1200)
2025-02-19 11:45:03 <hnowlan> lgtm? assuming Amir1 doesn't have any further deploys
2025-02-19 11:45:14 <Amir1> I don't!
2025-02-19 11:45:29 <kart_> Cool. Thanks. Attempting :)
2025-02-19 11:45:45 <logmsgbot> !log jmm@cumin2002 START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
2025-02-19 11:46:05 <wikibugs> 'SRE, ''Ganeti, ''Infrastructure-Foundations: Update remaining Ganeti servers in eqiad to Bookworm - https://phabricator.wikimedia.org/T382507#10563016 (''ops-monitoring-bot) Draining ganeti1036.eqiad.wmnet of running VMs'
2025-02-19 11:46:28 <logmsgbot> !log kartik@deploy2002 helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
2025-02-19 11:46:42 <logmsgbot> !log jmm@cumin2002 END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
2025-02-19 11:47:15 <wikibugs> ('PS1) ''Muehlenhoff: Switch ganeti1036 to nftables [puppet] - ''https://gerrit.wikimedia.org/r/1120934'
2025-02-19 11:48:08 <logmsgbot> !log andrew@cumin1002 START - Cookbook sre.hosts.reimage for host cloudgw1003.eqiad.wmnet with OS bullseye
2025-02-19 11:48:35 <logmsgbot> !log jmm@cumin2002 START - Cookbook sre.hosts.reboot-single for host ganeti1025.eqiad.wmnet
2025-02-19 11:48:48 <wikibugs> ('CR) ''Arnaudb: [C:''+2] moscovium: remove from site.pp [puppet] - ''https://gerrit.wikimedia.org/r/1120917 (https://phabricator.wikimedia.org/T385777) (owner: ''Arnaudb)'
2025-02-19 11:49:04 <logmsgbot> !log fabfur@cumin1002 END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp
2025-02-19 11:49:50 <logmsgbot> !log andrew@cumin1002 START - Cookbook sre.hosts.reimage for host cloudgw1001.eqiad.wmnet with OS bookworm
2025-02-19 11:51:02 <logmsgbot> !log jmm@cumin2002 START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1004.eqiad.wmnet to drbd
2025-02-19 11:51:28 <wikibugs> 'SRE, ''Ganeti, ''Infrastructure-Foundations: Update remaining Ganeti servers in eqiad to Bookworm - https://phabricator.wikimedia.org/T382507#10563029 (''ops-monitoring-bot) VM kubestagemaster1004.eqiad.wmnet switching disk type to drbd'
2025-02-19 11:52:53 <wikibugs> ('PS3) ''Fabfur: haproxykafka: limit memory usage to 5% of total physical memory [puppet] - ''https://gerrit.wikimedia.org/r/1120922 (https://phabricator.wikimedia.org/T386747)'
2025-02-19 11:53:36 <logmsgbot> !log fabfur@cumin1002 START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp
2025-02-19 11:55:32 <wikibugs> ('CR) ''Fabfur: haproxykafka: limit memory usage to 5% of total physical memory (''1 comment) [puppet] - ''https://gerrit.wikimedia.org/r/1120922 (https://phabricator.wikimedia.org/T386747) (owner: ''Fabfur)'
2025-02-19 11:55:46 <wikibugs> ('CR) ''Fabfur: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1120922 (https://phabricator.wikimedia.org/T386747) (owner: ''Fabfur)'
2025-02-19 11:59:37 <logmsgbot> !log jmm@cumin2002 END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1025.eqiad.wmnet
2025-02-19 11:59:39 <logmsgbot> !log jmm@cumin2002 END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ganeti1025.eqiad.wmnet
2025-02-19 12:00:04 <jouncebot> mvolz: I, the Bot under the Fountain, call upon thee, The Deployer, to do Services – Citoid / Zotero deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T1200).
2025-02-19 12:01:08 <logmsgbot> !log kartik@deploy2002 helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
2025-02-19 12:06:31 <logmsgbot> !log andrew@cumin1002 START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw1001.eqiad.wmnet with reason: host reimage
2025-02-19 12:06:45 <logmsgbot> !log jmm@cumin2002 END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1004.eqiad.wmnet to drbd
2025-02-19 12:06:46 <icinga-wm> PROBLEM - Host kubestagemaster1004 is DOWN: PING CRITICAL - Packet loss = 100%
2025-02-19 12:07:19 <logmsgbot> !log jmm@cumin2002 START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
2025-02-19 12:07:21 <jinxer-wm> FIRING: [2x] ProbeDown: Service kubestagemaster1004:6443 has failed probes (http_staging_eqiad_kube_apiserver_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#kubestagemaster1004:6443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
2025-02-19 12:07:30 <wikibugs> 'SRE, ''Ganeti, ''Infrastructure-Foundations: Update remaining Ganeti servers in eqiad to Bookworm - https://phabricator.wikimedia.org/T382507#10563106 (''ops-monitoring-bot) Draining ganeti1036.eqiad.wmnet of running VMs'
2025-02-19 12:07:34 <icinga-wm> RECOVERY - Host kubestagemaster1004 is UP: PING OK - Packet loss = 0%, RTA = 0.86 ms
2025-02-19 12:07:39 <logmsgbot> !log jmm@cumin2002 END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
2025-02-19 12:09:32 <jinxer-wm> RESOLVED: [2x] ProbeDown: Service kubestagemaster1004:6443 has failed probes (http_staging_eqiad_kube_apiserver_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#kubestagemaster1004:6443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
2025-02-19 12:09:44 <logmsgbot> !log jmm@cumin2002 START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1004.eqiad.wmnet to plain
2025-02-19 12:10:10 <wikibugs> 'SRE, ''Ganeti, ''Infrastructure-Foundations: Update remaining Ganeti servers in eqiad to Bookworm - https://phabricator.wikimedia.org/T382507#10563124 (''ops-monitoring-bot) VM kubestagemaster1004.eqiad.wmnet switching disk type to plain'
2025-02-19 12:10:28 <logmsgbot> !log jmm@cumin2002 END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1004.eqiad.wmnet to plain
2025-02-19 12:10:32 <logmsgbot> !log andrew@cumin1002 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw1001.eqiad.wmnet with reason: host reimage
2025-02-19 12:10:45 <logmsgbot> !log fabfur@cumin1002 END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp
2025-02-19 12:12:12 <logmsgbot> !log jmm@cumin2002 START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
2025-02-19 12:12:25 <wikibugs> 'SRE, ''Ganeti, ''Infrastructure-Foundations: Update remaining Ganeti servers in eqiad to Bookworm - https://phabricator.wikimedia.org/T382507#10563128 (''ops-monitoring-bot) Draining ganeti1036.eqiad.wmnet of running VMs'
2025-02-19 12:20:53 <wikibugs> 'SRE, ''Infrastructure-Foundations, ''netops: Gaps in gNMI network statistics in eqiad - https://phabricator.wikimedia.org/T386807 (''cmooney) ''NEW p:''Triage→''Low'
2025-02-19 12:20:59 <wikibugs> 'SRE, ''Infrastructure-Foundations, ''netops: Gaps in gNMI network statistics in eqiad - https://phabricator.wikimedia.org/T386807#10563148 (''cmooney)'
2025-02-19 12:21:01 <wikibugs> 'SRE, ''Infrastructure-Foundations, ''netops: Productionize gnmic network telemetry pipeline - https://phabricator.wikimedia.org/T369384#10563149 (''cmooney)'
2025-02-19 12:22:17 <logmsgbot> !log fabfur@cumin1002 START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_esams and A:cp
2025-02-19 12:25:10 <wikibugs> ('CR) ''Muehlenhoff: [C:''+1] "LGTM" [dns] - ''https://gerrit.wikimedia.org/r/1120901 (https://phabricator.wikimedia.org/T385777) (owner: ''Arnaudb)'
2025-02-19 12:25:50 <wikibugs> ('CR) ''Muehlenhoff: [C:''+1] "LGTM" [puppet] - ''https://gerrit.wikimedia.org/r/1120889 (https://phabricator.wikimedia.org/T385777) (owner: ''Arnaudb)'
2025-02-19 12:26:05 <wikibugs> ('CR) ''Arnaudb: [C:''+2] ferm: remove moscovium from allowlist [puppet] - ''https://gerrit.wikimedia.org/r/1120889 (https://phabricator.wikimedia.org/T385777) (owner: ''Arnaudb)'
2025-02-19 12:27:36 <logmsgbot> !log andrew@cumin1002 END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudgw1003.eqiad.wmnet with OS bullseye
2025-02-19 12:27:37 <wikibugs> ('CR) ''Arnaudb: [C:''+2] rt: remove cname [dns] - ''https://gerrit.wikimedia.org/r/1120901 (https://phabricator.wikimedia.org/T385777) (owner: ''Arnaudb)'
2025-02-19 12:27:50 <logmsgbot> !log arnaudb@dns1004 START - running authdns-update
2025-02-19 12:28:39 <logmsgbot> !log andrew@cumin1002 END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw1001.eqiad.wmnet with OS bookworm
2025-02-19 12:29:12 <wikibugs> ('CR) ''Muehlenhoff: [C:''+1] "Looks good" [puppet] - ''https://gerrit.wikimedia.org/r/1120926 (https://phabricator.wikimedia.org/T386796) (owner: ''Vgutierrez)'
2025-02-19 12:29:46 <logmsgbot> !log arnaudb@dns1004 END - running authdns-update
2025-02-19 12:30:24 <logmsgbot> !log andrew@cumin1002 START - Cookbook sre.hosts.reimage for host cloudgw1003.eqiad.wmnet with OS bullseye
2025-02-19 12:31:58 <logmsgbot> !log jmm@cumin2002 START - Cookbook sre.hosts.reimage for host ganeti1025.eqiad.wmnet with OS bookworm
2025-02-19 12:32:12 <wikibugs> 'SRE, ''Ganeti, ''Infrastructure-Foundations: Update remaining Ganeti servers in eqiad to Bookworm - https://phabricator.wikimedia.org/T382507#10563175 (''ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jmm@cumin2002 for host ganeti1025.eqiad.wmnet with OS bookworm'
2025-02-19 12:46:16 <logmsgbot> !log fabfur@cumin1002 END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_esams and A:cp
2025-02-19 12:46:35 <logmsgbot> !log andrew@cumin1002 START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw1003.eqiad.wmnet with reason: host reimage
2025-02-19 12:47:17 <wikibugs> ('PS1) ''Muehlenhoff: sre.hardware.upgrade-hardware: Mention possibly long run time [cookbooks] - ''https://gerrit.wikimedia.org/r/1120948 (https://phabricator.wikimedia.org/T385873)'
2025-02-19 12:48:44 <wikibugs> ('CR) ''Arturo Borrero Gonzalez: [C:''+1] cloudgw: move icmp checks under wmcs [puppet] - ''https://gerrit.wikimedia.org/r/1100819 (https://phabricator.wikimedia.org/T381580) (owner: ''Tiziano Fogli)'
2025-02-19 12:50:07 <logmsgbot> !log andrew@cumin1002 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw1003.eqiad.wmnet with reason: host reimage
2025-02-19 12:50:46 <logmsgbot> !log aborrero@cumin1002 START - Cookbook sre.dns.netbox
2025-02-19 12:51:41 <logmsgbot> !log jmm@cumin2002 START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1025.eqiad.wmnet with reason: host reimage
2025-02-19 12:54:29 <logmsgbot> !log jmm@cumin2002 END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1025.eqiad.wmnet with reason: host reimage
2025-02-19 12:54:30 <logmsgbot> !log aborrero@cumin1002 START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw updates - aborrero@cumin1002"
2025-02-19 12:54:35 <logmsgbot> !log aborrero@cumin1002 END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw updates - aborrero@cumin1002"
2025-02-19 12:54:36 <logmsgbot> !log aborrero@cumin1002 END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
2025-02-19 12:55:41 <wikibugs> ('CR) ''Muehlenhoff: [C:''+1] "That sounds plausible, yes" [puppet] - ''https://gerrit.wikimedia.org/r/1119718 (https://phabricator.wikimedia.org/T386297) (owner: ''Jelto)'
2025-02-19 13:00:56 <jinxer-wm> FIRING: [2x] ProbeDown: Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
2025-02-19 13:03:45 <wikibugs> ('CR) ''Filippo Giunchedi: "I might be missing something here, though I'm not sure what's wrong with keeping https ?" [puppet] - ''https://gerrit.wikimedia.org/r/1120160 (https://phabricator.wikimedia.org/T385750) (owner: ''Phedenskog)'
2025-02-19 13:05:56 <jinxer-wm> RESOLVED: [2x] ProbeDown: Service mirror1001:443 has failed probes (http_mirrors_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
2025-02-19 13:09:06 <logmsgbot> !log andrew@cumin1002 END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw1003.eqiad.wmnet with OS bullseye
2025-02-19 13:11:35 <logmsgbot> !log aborrero@cumin1002 START - Cookbook sre.dns.wipe-cache vlan1120.cloudgw1003.eqiad1.wikimediacloud.org on all recursors
2025-02-19 13:11:38 <logmsgbot> !log aborrero@cumin1002 END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) vlan1120.cloudgw1003.eqiad1.wikimediacloud.org on all recursors
2025-02-19 13:13:50 <logmsgbot> !log jmm@cumin2002 END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1025.eqiad.wmnet with OS bookworm
2025-02-19 13:13:54 <wikibugs> 'SRE, ''Ganeti, ''Infrastructure-Foundations: Update remaining Ganeti servers in eqiad to Bookworm - https://phabricator.wikimedia.org/T382507#10563237 (''ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jmm@cumin2002 for host ganeti1025.eqiad.wmnet with OS bookworm completed: - ganeti102...'
2025-02-19 13:18:48 <wikibugs> ('PS2) ''Anzx: Lift IP cap for edit-a-thon on 2025-02-26 [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120954 (https://phabricator.wikimedia.org/T386793)'
2025-02-19 13:18:56 <wikibugs> ('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Wednesday, February 19 UTC afternoon backport window](https://wikitech.wikimedia.org/wiki/Deployments#dep" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120954 (https://phabricator.wikimedia.org/T386793) (owner: ''Anzx)'
2025-02-19 13:20:25 <jinxer-wm> FIRING: SystemdUnitFailed: check_netbox_uncommitted_dns_changes.service on netbox1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
2025-02-19 13:21:34 <wikibugs> ('CR) ''Volans: [C:''+1] "LGTM" [cookbooks] - ''https://gerrit.wikimedia.org/r/1120948 (https://phabricator.wikimedia.org/T385873) (owner: ''Muehlenhoff)'
2025-02-19 13:22:00 <logmsgbot> !log andrew@cumin1002 START - Cookbook sre.hosts.decommission for hosts cloudgw1001.eqiad.wmnet
2025-02-19 13:23:11 <logmsgbot> !log jmm@cumin2002 START - Cookbook sre.hosts.reboot-single for host ganeti1025.eqiad.wmnet
2025-02-19 13:25:25 <jinxer-wm> RESOLVED: SystemdUnitFailed: check_netbox_uncommitted_dns_changes.service on netbox1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
2025-02-19 13:26:49 <logmsgbot> !log andrew@cumin1002 START - Cookbook sre.dns.netbox
2025-02-19 13:30:30 <logmsgbot> !log andrew@cumin1002 START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
2025-02-19 13:31:09 <logmsgbot> !log andrew@cumin1002 END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
2025-02-19 13:31:09 <logmsgbot> !log andrew@cumin1002 END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
2025-02-19 13:31:10 <logmsgbot> !log andrew@cumin1002 END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudgw1001.eqiad.wmnet
2025-02-19 13:31:35 <logmsgbot> !log jmm@cumin2002 END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1025.eqiad.wmnet
2025-02-19 13:32:02 <wikibugs> ('PS1) ''Andrew Bogott: Clean up refs to cloudgw100[12] [puppet] - ''https://gerrit.wikimedia.org/r/1120958 (https://phabricator.wikimedia.org/T386810)'
2025-02-19 13:32:18 <logmsgbot> !log andrew@cumin1002 START - Cookbook sre.hosts.decommission for hosts cloudgw1002.eqiad.wmnet
2025-02-19 13:35:17 <wikibugs> ('CR) ''Muehlenhoff: [C:''+2] sre.hardware.upgrade-hardware: Mention possibly long run time [cookbooks] - ''https://gerrit.wikimedia.org/r/1120948 (https://phabricator.wikimedia.org/T385873) (owner: ''Muehlenhoff)'
2025-02-19 13:35:36 <wikibugs> ('CR) ''Andrew Bogott: [C:''+2] Clean up refs to cloudgw100[12] [puppet] - ''https://gerrit.wikimedia.org/r/1120958 (https://phabricator.wikimedia.org/T386810) (owner: ''Andrew Bogott)'
2025-02-19 13:36:23 <wikibugs> 'SRE, ''SRE-tools, ''Infrastructure-Foundations, ''Patch-For-Review: sre.hardware.upgrade-firmware: Firmware update hangs on Dell PowerEdge R440 - https://phabricator.wikimedia.org/T385873#10563373 (''MoritzMuehlenhoff) ''Open→''Resolved a:''MoritzMuehlenhoff I've modified the sre.hardware.upgra...'
2025-02-19 13:37:00 <logmsgbot> !log andrew@cumin1002 START - Cookbook sre.dns.netbox
2025-02-19 13:41:20 <logmsgbot> !log andrew@cumin1002 START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
2025-02-19 13:41:31 <logmsgbot> !log dcausse@deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
2025-02-19 13:41:33 <moritzm> !log installing libtasn1-6 security updates
2025-02-19 13:41:35 <stashbot> Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
2025-02-19 13:41:46 <logmsgbot> !log dcausse@deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
2025-02-19 13:41:55 <logmsgbot> !log andrew@cumin1002 END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"
2025-02-19 13:41:55 <logmsgbot> !log andrew@cumin1002 END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
2025-02-19 13:41:56 <logmsgbot> !log andrew@cumin1002 END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cloudgw1002.eqiad.wmnet
2025-02-19 13:41:57 <logmsgbot> !log dcausse@deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
2025-02-19 13:42:20 <logmsgbot> !log dcausse@deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
2025-02-19 13:43:01 <wikibugs> ('PS2) ''Muehlenhoff: Bump versions of Java 11/17 production images [docker-images/production-images] - ''https://gerrit.wikimedia.org/r/1120544'
2025-02-19 13:43:10 <wikibugs> ('CR) ''Muehlenhoff: Bump versions of Java 11/17 production images (''1 comment) [docker-images/production-images] - ''https://gerrit.wikimedia.org/r/1120544 (owner: ''Muehlenhoff)'
2025-02-19 13:43:52 <wikibugs> 'ops-eqiad, ''cloud-services-team, ''DC-Ops, ''decommission-hardware, ''Patch-For-Review: decommission cloudgw100[12] - https://phabricator.wikimedia.org/T386810#10563401 (''Andrew) a:''Andrew→''None'
2025-02-19 13:44:39 <wikibugs> ('PS2) ''Arnaudb: rt: discarding modules about request tracker [puppet] - ''https://gerrit.wikimedia.org/r/1117530 (https://phabricator.wikimedia.org/T384595)'
2025-02-19 13:45:28 <logmsgbot> !log fabfur@cumin1002 START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_esams and A:cp
2025-02-19 13:45:30 <wikibugs> ('CR) ''Arnaudb: "I guess we can now move up the relation chain to clean up rt artifacts" [puppet] - ''https://gerrit.wikimedia.org/r/1117530 (https://phabricator.wikimedia.org/T384595) (owner: ''Arnaudb)'
2025-02-19 13:45:54 <wikibugs> ('PS2) ''Arnaudb: rt: discarding templates [puppet] - ''https://gerrit.wikimedia.org/r/1117531 (https://phabricator.wikimedia.org/T384595)'
2025-02-19 13:46:40 <wikibugs> ('CR) ''Jgiannelos: [C:''+2] mobileapps: pipeline bot promote [deployment-charts] - ''https://gerrit.wikimedia.org/r/1120469 (owner: ''PipelineBot)'
2025-02-19 13:47:38 <wikibugs> ('CR) ''Muehlenhoff: rt: discarding modules about request tracker (''1 comment) [puppet] - ''https://gerrit.wikimedia.org/r/1117530 (https://phabricator.wikimedia.org/T384595) (owner: ''Arnaudb)'
2025-02-19 13:47:49 <wikibugs> ('Merged) ''jenkins-bot: mobileapps: pipeline bot promote [deployment-charts] - ''https://gerrit.wikimedia.org/r/1120469 (owner: ''PipelineBot)'
2025-02-19 13:48:07 <wikibugs> ('PS3) ''Arnaudb: rt: discarding modules about request tracker [puppet] - ''https://gerrit.wikimedia.org/r/1117530 (https://phabricator.wikimedia.org/T384595)'
2025-02-19 13:48:13 <wikibugs> ('PS3) ''Arnaudb: rt: discarding templates [puppet] - ''https://gerrit.wikimedia.org/r/1117531 (https://phabricator.wikimedia.org/T384595)'
2025-02-19 13:55:18 <icinga-wm> PROBLEM - BGP status on cr1-eqiad is CRITICAL: BGP CRITICAL - No response from remote host 208.80.154.196 https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
2025-02-19 13:57:38 <wikibugs> ('CR) ''Daimona Eaytoy: [C:''-1] Introduce config setting to disable default event-organizer group [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120632 (https://phabricator.wikimedia.org/T386290) (owner: ''Daimona Eaytoy)'
2025-02-19 13:59:47 <logmsgbot> !log jgiannelos@deploy2002 helmfile [staging] START helmfile.d/services/mobileapps: apply
2025-02-19 14:00:05 <jouncebot> Lucas_WMDE, Urbanecm, and TheresNoTime: I, the Bot under the Fountain, call upon thee, The Deployer, to do UTC afternoon backport window deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T1400).
2025-02-19 14:00:05 <jouncebot> Daimona, anzx, and MichaelG_WMF: A patch you scheduled for UTC afternoon backport window is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker.
2025-02-19 14:00:09 <anzx> o/
2025-02-19 14:00:13 <logmsgbot> !log jgiannelos@deploy2002 helmfile [staging] DONE helmfile.d/services/mobileapps: apply
2025-02-19 14:00:18 <Lucas_WMDE> o/
2025-02-19 14:00:23 <wikibugs> ('PS2) ''Daimona Eaytoy: Introduce config setting to disable default event-organizer group [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120632 (https://phabricator.wikimedia.org/T386290)'
2025-02-19 14:00:27 <Daimona> o/
2025-02-19 14:00:35 <Lucas_WMDE> I’m trying to figure out which of the changes are deployable
2025-02-19 14:00:40 <Lucas_WMDE> (I can deploy)
2025-02-19 14:01:53 <Daimona> Mine are deployable, I just made a last-minute change.
2025-02-19 14:02:04 <wikibugs> ('CR) ''Lucas Werkmeister (WMDE): Lift IP cap for edit-a-thon on 2025-02-26 (''1 comment) [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120954 (https://phabricator.wikimedia.org/T386793) (owner: ''Anzx)'
2025-02-19 14:02:52 <wikibugs> ('PS3) ''Anzx: Lift IP cap for edit-a-thon on 2025-02-26 [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120954 (https://phabricator.wikimedia.org/T386793)'
2025-02-19 14:03:10 <wikibugs> ('CR) ''Anzx: Lift IP cap for edit-a-thon on 2025-02-26 (''1 comment) [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120954 (https://phabricator.wikimedia.org/T386793) (owner: ''Anzx)'
2025-02-19 14:03:36 <Lucas_WMDE> ok, then let’s start with Daimona
2025-02-19 14:03:51 <Lucas_WMDE> is it okay to deploy both of those config changes at once?
2025-02-19 14:04:08 <Lucas_WMDE> *together
2025-02-19 14:05:04 <Daimona> I think so. The first one should be a no-op.
2025-02-19 14:05:20 <Lucas_WMDE> alright
2025-02-19 14:05:23 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] "Approved by lucaswerkmeister-wmde@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120632 (https://phabricator.wikimedia.org/T386290) (owner: ''Daimona Eaytoy)'
2025-02-19 14:05:24 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] "Approved by lucaswerkmeister-wmde@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120626 (https://phabricator.wikimedia.org/T386290) (owner: ''Daimona Eaytoy)'
2025-02-19 14:06:15 <wikibugs> ('Merged) ''jenkins-bot: Introduce config setting to disable default event-organizer group [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120632 (https://phabricator.wikimedia.org/T386290) (owner: ''Daimona Eaytoy)'
2025-02-19 14:07:21 <Lucas_WMDE> “Gerrit could not merge the change '1120626' as is and could require a rebase”
2025-02-19 14:07:33 <wikibugs> ('CR) ''Arnaudb: rt: discarding modules about request tracker (''1 comment) [puppet] - ''https://gerrit.wikimedia.org/r/1117530 (https://phabricator.wikimedia.org/T384595) (owner: ''Arnaudb)'
2025-02-19 14:07:45 <Lucas_WMDE> ah, it wasn’t rebased
2025-02-19 14:07:50 <wikibugs> ('PS4) ''Daimona Eaytoy: enwiki, mswikt: Enable the CampaignEvents extension [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120626 (https://phabricator.wikimedia.org/T386290)'
2025-02-19 14:07:57 <wikibugs> ('CR) ''TrainBranchBot: "Approved by lucaswerkmeister-wmde@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120626 (https://phabricator.wikimedia.org/T386290) (owner: ''Daimona Eaytoy)'
2025-02-19 14:08:20 <logmsgbot> !log fabfur@cumin1002 END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_esams and A:cp
2025-02-19 14:08:54 <wikibugs> ('Merged) ''jenkins-bot: enwiki, mswikt: Enable the CampaignEvents extension [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120626 (https://phabricator.wikimedia.org/T386290) (owner: ''Daimona Eaytoy)'
2025-02-19 14:09:20 <logmsgbot> !log lucaswerkmeister-wmde@deploy2002 Started scap sync-world: Backport for [[gerrit:1120632|Introduce config setting to disable default event-organizer group (T386290)]], [[gerrit:1120626|enwiki, mswikt: Enable the CampaignEvents extension (T386290 T386538)]]
2025-02-19 14:09:25 <stashbot> T386290: Enable CampaignEvents Extension on English Wikipedia - https://phabricator.wikimedia.org/T386290
2025-02-19 14:09:25 <stashbot> T386538: Enable CampaignEvents Extension on mswikt - https://phabricator.wikimedia.org/T386538
2025-02-19 14:09:27 <Lucas_WMDE> anzx: I’m trying to understand the groups change
2025-02-19 14:09:35 <Lucas_WMDE> especially regarding https://phabricator.wikimedia.org/T386781#10562387
2025-02-19 14:10:42 <Lucas_WMDE> AFAICT that comment is wrong… what do you think?
2025-02-19 14:11:42 <Lucas_WMDE> wait o_O
2025-02-19 14:12:00 <Lucas_WMDE> why is there no "confirmed" group in https://kn.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=usergroups&formatversion=2
2025-02-19 14:12:04 <Lucas_WMDE> but there is one in https://kn.wikipedia.org/w/index.php?title=%E0%B2%B5%E0%B2%BF%E0%B2%B6%E0%B3%87%E0%B2%B7:ListGroupRights&uselang=en ?
2025-02-19 14:12:19 <logmsgbot> !log lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde, daimona: Backport for [[gerrit:1120632|Introduce config setting to disable default event-organizer group (T386290)]], [[gerrit:1120626|enwiki, mswikt: Enable the CampaignEvents extension (T386290 T386538)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
2025-02-19 14:12:20 <anzx> yeah comment does seem misleading, since i am creating new groups on those wikis
2025-02-19 14:12:33 <Lucas_WMDE> Daimona: please test
2025-02-19 14:12:40 <Daimona> doing
2025-02-19 14:13:18 <wikibugs> ('CR) ''Muehlenhoff: [C:''+1] "LGTM" [puppet] - ''https://gerrit.wikimedia.org/r/1117531 (https://phabricator.wikimedia.org/T384595) (owner: ''Arnaudb)'
2025-02-19 14:14:13 <Lucas_WMDE> ok, https://www.wikidata.org/w/api.php?action=query&meta=siteinfo&siprop=usergroups&formatversion=2 vs. https://www.wikidata.org/wiki/Special:ListGroupRights has the same confusing behavior
2025-02-19 14:14:27 <Lucas_WMDE> where a “confirmed” group apparently exists (I can also see it at https://www.wikidata.org/wiki/Special:UserRights/Lucas_Werkmeister_(WMDE)) but not in the API output
2025-02-19 14:14:45 <Lucas_WMDE> (except in add/remove, i.e. other groups are allowed to add to / remove from this group)
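The comparison Lucas_WMDE describes above can be reproduced roughly as follows. This is only a sketch: the helper names are hypothetical, the `format=json` parameter is an addition not present in the pasted URLs, and the sample response is invented for illustration rather than copied from any wiki. It looks for groups that other groups may add or remove but that the siteinfo output does not itself list.

```python
from urllib.parse import urlencode


def siteinfo_usergroups_url(host):
    """Build the siteinfo query from the chat, plus format=json (an assumption)."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "usergroups",
        "format": "json",
        "formatversion": "2",
    }
    return f"https://{host}/w/api.php?{urlencode(params)}"


def groups_defined(response):
    """Names of the groups the API lists as entries of their own."""
    return {group["name"] for group in response["query"]["usergroups"]}


def groups_referenced(response):
    """Names of groups that appear only as add/remove targets of other groups."""
    referenced = set()
    for group in response["query"]["usergroups"]:
        for key in ("add", "remove", "add-self", "remove-self"):
            referenced.update(group.get(key, []))
    return referenced


# Invented sample in the formatversion=2 shape; not real API output.
sample = {
    "query": {
        "usergroups": [
            {"name": "autoconfirmed", "rights": ["editsemiprotected"]},
            {"name": "sysop", "rights": ["block"],
             "add": ["confirmed"], "remove": ["confirmed"]},
        ]
    }
}

missing = groups_referenced(sample) - groups_defined(sample)
print(siteinfo_usergroups_url("www.wikidata.org"))
print(sorted(missing))
```

Any name left in `missing` is a group that can be granted or revoked yet has no entry of its own in the API output, which is the mismatch being discussed.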
2025-02-19 14:16:23 <Lucas_WMDE> ok, it seems to be because wgGroupInheritsPermissions has the confirmed group inherit from the autoconfirmed group
2025-02-19 14:16:24 <wikibugs> ('CR) ''Muehlenhoff: [C:''+1] "Looks good! Make sure to also remove passwords::misc::rt from" [puppet] - ''https://gerrit.wikimedia.org/r/1117530 (https://phabricator.wikimedia.org/T384595) (owner: ''Arnaudb)'
2025-02-19 14:16:25 <Lucas_WMDE> (on all wikis)
2025-02-19 14:16:40 <Lucas_WMDE> and I guess the siteinfo API doesn’t account for that
2025-02-19 14:17:46 <wikibugs> ('CR) ''Filippo Giunchedi: "Clearing up my review queue -- also I don't think we should be mimicking check_proc and rely instead of systemd to do the right thing in m" [puppet] - ''https://gerrit.wikimedia.org/r/1004672 (owner: ''Slyngshede)'
2025-02-19 14:18:22 <Lucas_WMDE> aaand it’s a known issue T357846
2025-02-19 14:18:23 <stashbot> T357846: siteinfo API module does not correctly process groups defined using $wgGroupInheritsPermissions - https://phabricator.wikimedia.org/T357846
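A simplified model of the behaviour discussed above, to show how a group that exists only through inheritance can be missed by a listing that looks at direct permissions alone. This is not MediaWiki's implementation: the two dictionaries only loosely mirror the shape of `$wgGroupPermissions` and `$wgGroupInheritsPermissions`, and all function names and the sample right are hypothetical.

```python
def effective_permissions(group, group_permissions, group_inherits):
    """Collect granted rights for a group, following its inheritance chain."""
    rights = set()
    seen = set()
    while group is not None and group not in seen:
        seen.add(group)  # guard against accidental inheritance cycles
        granted = group_permissions.get(group, {})
        rights |= {right for right, allowed in granted.items() if allowed}
        group = group_inherits.get(group)
    return rights


def naive_group_list(group_permissions):
    """Lists only groups that have direct permission entries."""
    return sorted(group_permissions)


def inheritance_aware_group_list(group_permissions, group_inherits):
    """Also lists groups that are defined purely by inheriting from another."""
    return sorted(set(group_permissions) | set(group_inherits))


# Illustrative configuration: "confirmed" has no direct entry and
# inherits everything from "autoconfirmed".
group_permissions = {"autoconfirmed": {"editsemiprotected": True}}
group_inherits = {"confirmed": "autoconfirmed"}

print(naive_group_list(group_permissions))
print(inheritance_aware_group_list(group_permissions, group_inherits))
print(effective_permissions("confirmed", group_permissions, group_inherits))
```

In this model the naive listing omits `confirmed` even though the group holds rights, which matches the symptom described in T357846.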
2025-02-19 14:19:33 <Daimona> Lucas_WMDE: everything looks OK AFAICT. As a side note, I still need to figure out why ResourceLoader always reports a module as not existing on the first page load, but that's for later. Probably some caching issue.
2025-02-19 14:19:43 <logmsgbot> !log lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde, daimona: Continuing with sync
2025-02-19 14:19:45 <Lucas_WMDE> ok, thanks!
2025-02-19 14:22:50 <wikibugs> ('CR) ''Dzahn: "the cache/text part can and should be merged. the gerrit/phab config parts are pretty unrelated and probably warrant a chat before just re" [puppet] - ''https://gerrit.wikimedia.org/r/1117531 (https://phabricator.wikimedia.org/T384595) (owner: ''Arnaudb)'
2025-02-19 14:23:44 <wikibugs> ('PS3) ''Ssingh: Release dnsdist 1.9.8-1+wmf12u1 [debs/dnsdist] - ''https://gerrit.wikimedia.org/r/1120607'
2025-02-19 14:24:53 <wikibugs> ('CR) ''CI reject: [V:''-1] Release dnsdist 1.9.8-1+wmf12u1 [debs/dnsdist] - ''https://gerrit.wikimedia.org/r/1120607 (owner: ''Ssingh)'
2025-02-19 14:26:15 <jinxer-wm> FIRING: PHPFPMTooBusy: Not enough idle PHP-FPM workers for Mediawiki mw-web/canary at codfw: 20.83% idle - https://bit.ly/wmf-fpmsat - https://grafana.wikimedia.org/d/U7JT--knk/mw-on-k8s?orgId=1&viewPanel=84&var-dc=codfw%20prometheus/k8s&var-service=mediawiki&var-namespace=mw-web&var-container_name=All&var-release=canary - https://alerts.wikimedia.org/?q=alertname%3DPHPFPMTooBusy
2025-02-19 14:26:19 <logmsgbot> !log lucaswerkmeister-wmde@deploy2002 Finished scap sync-world: Backport for [[gerrit:1120632|Introduce config setting to disable default event-organizer group (T386290)]], [[gerrit:1120626|enwiki, mswikt: Enable the CampaignEvents extension (T386290 T386538)]] (duration: 16m 58s)
2025-02-19 14:26:24 <stashbot> T386290: Enable CampaignEvents Extension on English Wikipedia - https://phabricator.wikimedia.org/T386290
2025-02-19 14:26:24 <stashbot> T386538: Enable CampaignEvents Extension on mswikt - https://phabricator.wikimedia.org/T386538
2025-02-19 14:26:38 <Daimona> Noice, thank you :)
2025-02-19 14:26:40 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=yes; selector: name=maps1006.eqiad.wmnet,dc=eqiad,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 14:26:45 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=yes; selector: name=maps1005.eqiad.wmnet,dc=eqiad,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 14:27:16 <wikibugs> ('PS1) ''Filippo Giunchedi: profile: don't require realm production for netbox::data [puppet] - ''https://gerrit.wikimedia.org/r/1120967'
2025-02-19 14:27:29 <wikibugs> ('PS1) ''Gergő Tisza: CentralAuth: Enable SUL3 signup on group 0 [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120968 (https://phabricator.wikimedia.org/T384007)'
2025-02-19 14:27:35 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=inactive; selector: name=wikikube-worker100.*.eqiad.wmnet,dc=eqiad,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 14:27:40 <Lucas_WMDE> np :)
2025-02-19 14:27:57 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=yes; selector: name=maps2005.codfw.wmnet,dc=codfw,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 14:28:02 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=yes; selector: name=maps2006.codfw.wmnet,dc=codfw,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 14:28:24 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=inactive; selector: name=wikikube-worker200.*.codfw.wmnet,dc=codfw,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 14:29:09 <wikibugs> ('CR) ''Lucas Werkmeister (WMDE): [C:''-1] "It looks like the `groupOverrides` changes shouldn’t be necessary, because the `confirmed` group automatically inherits permissions from t" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120891 (https://phabricator.wikimedia.org/T386781) (owner: ''Anzx)'
2025-02-19 14:30:04 <Lucas_WMDE> alright, let’s continue with the throttling exception for anzx
2025-02-19 14:30:13 <Lucas_WMDE> (and the groups changes will have to wait a bit)
2025-02-19 14:30:28 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] "Approved by lucaswerkmeister-wmde@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120954 (https://phabricator.wikimedia.org/T386793) (owner: ''Anzx)'
2025-02-19 14:30:47 <wikibugs> ('PS1) ''Bking: relforge: define opensearch datadir as 'opensearch' [puppet] - ''https://gerrit.wikimedia.org/r/1120969 (https://phabricator.wikimedia.org/T380752)'
2025-02-19 14:31:15 <jinxer-wm> RESOLVED: PHPFPMTooBusy: Not enough idle PHP-FPM workers for Mediawiki mw-web/canary at codfw: 20.83% idle - https://bit.ly/wmf-fpmsat - https://grafana.wikimedia.org/d/U7JT--knk/mw-on-k8s?orgId=1&viewPanel=84&var-dc=codfw%20prometheus/k8s&var-service=mediawiki&var-namespace=mw-web&var-container_name=All&var-release=canary - https://alerts.wikimedia.org/?q=alertname%3DPHPFPMTooBusy
2025-02-19 14:31:15 <wikibugs> ('Merged) ''jenkins-bot: Lift IP cap for edit-a-thon on 2025-02-26 [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120954 (https://phabricator.wikimedia.org/T386793) (owner: ''Anzx)'
2025-02-19 14:31:45 <logmsgbot> !log lucaswerkmeister-wmde@deploy2002 Started scap sync-world: Backport for [[gerrit:1120954|Lift IP cap for edit-a-thon on 2025-02-26 (T386793)]]
2025-02-19 14:31:49 <stashbot> T386793: IP Lift for Wikithon at Leeds University Weds 26th February - https://phabricator.wikimedia.org/T386793
2025-02-19 14:32:39 <wikibugs> ('CR) ''ArielGlenn: [C:''+1] "Here we go..." [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120968 (https://phabricator.wikimedia.org/T384007) (owner: ''Gergő Tisza)'
2025-02-19 14:34:30 <wikibugs> ('CR) ''Bking: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1120969 (https://phabricator.wikimedia.org/T380752) (owner: ''Bking)'
2025-02-19 14:34:38 <anzx> Lucas_WMDE: but there is no group default for confirmed user in https://github.com/wikimedia/operations-mediawiki-config/blob/47b79412442f37a096de304fc9de1ea018fbcd9b/wmf-config/core-Permissions.php#L3220
2025-02-19 14:34:40 <logmsgbot> !log lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde, anzx: Backport for [[gerrit:1120954|Lift IP cap for edit-a-thon on 2025-02-26 (T386793)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
2025-02-19 14:35:16 <Lucas_WMDE> anzx: can you test the IP cap change?
2025-02-19 14:35:19 <Lucas_WMDE> (I’m guessing no ^^)
2025-02-19 14:35:27 <anzx> Lucas_WMDE: no
2025-02-19 14:35:30 <logmsgbot> !log lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde, anzx: Continuing with sync
2025-02-19 14:35:34 <Lucas_WMDE> alright, then let’s roll forward with that
2025-02-19 14:36:10 <Lucas_WMDE> anzx: that’s right, the confirmed group is defined via https://github.com/wikimedia/operations-mediawiki-config/blob/47b79412442f37a096de304fc9de1ea018fbcd9b/wmf-config/InitialiseSettings.php#L3285 instead
2025-02-19 14:36:41 <wikibugs> ('PS1) ''Muehlenhoff: Extend access for aitolkyn [puppet] - ''https://gerrit.wikimedia.org/r/1120970'
2025-02-19 14:36:42 <jinxer-wm> FIRING: JobUnavailable: Reduced availability for job sidekiq in ops@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
2025-02-19 14:36:45 <Lucas_WMDE> it doesn’t show up in the siteinfo API output due to a bug, but it does exist, you can see it e.g. on Special:ListGroupRights
2025-02-19 14:36:51 <logmsgbot> !log klausman@cumin1002 START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
2025-02-19 14:37:36 <wikibugs> ('CR) ''Brouberol: [C:''+1] relforge: define opensearch datadir as 'opensearch' [puppet] - ''https://gerrit.wikimedia.org/r/1120969 (https://phabricator.wikimedia.org/T380752) (owner: ''Bking)'
2025-02-19 14:39:27 <wikibugs> ('CR) ''Volans: [C:''+1] "The hiera lookups seems to have a default in:" [puppet] - ''https://gerrit.wikimedia.org/r/1120967 (owner: ''Filippo Giunchedi)'
2025-02-19 14:39:33 <wikibugs> ('PS5) ''Anzx: knwiki, knwikisource, tcywikisource: add confirmed user usergroup [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120891 (https://phabricator.wikimedia.org/T386781)'
2025-02-19 14:40:22 <Lucas_WMDE> MichaelG_WMF: should we deploy your changes together or separately? (once we get to them)
2025-02-19 14:40:26 <wikibugs> ('CR) ''Anzx: "removed `groupOverrides`" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120891 (https://phabricator.wikimedia.org/T386781) (owner: ''Anzx)'
2025-02-19 14:40:42 <MichaelG_WMF> Lucas_WMDE: yes please
2025-02-19 14:40:47 <wikibugs> ('CR) ''Bking: [C:''+2] relforge: define opensearch datadir as 'opensearch' [puppet] - ''https://gerrit.wikimedia.org/r/1120969 (https://phabricator.wikimedia.org/T380752) (owner: ''Bking)'
2025-02-19 14:40:56 <Lucas_WMDE> that was supposed to be an exclusive or :P
2025-02-19 14:41:06 <MichaelG_WMF> XD
2025-02-19 14:41:10 <MichaelG_WMF> together
2025-02-19 14:41:14 <Lucas_WMDE> ok ^^
2025-02-19 14:41:24 <wikibugs> ('CR) ''Muehlenhoff: [C:''+2] Extend access for aitolkyn [puppet] - ''https://gerrit.wikimedia.org/r/1120970 (owner: ''Muehlenhoff)'
2025-02-19 14:41:35 <wikibugs> ('CR) ''Lucas Werkmeister (WMDE): [C:''+1] knwiki, knwikisource, tcywikisource: add confirmed user usergroup [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120891 (https://phabricator.wikimedia.org/T386781) (owner: ''Anzx)'
2025-02-19 14:41:42 <Lucas_WMDE> though I’ll do ^ first, it looks good to me now
2025-02-19 14:41:44 <Lucas_WMDE> jouncebot: next
2025-02-19 14:41:44 <jouncebot> In 0 hour(s) and 18 minute(s): Wikifunctions Services UTC Afternoon (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T1500)
2025-02-19 14:41:56 <tgr|away> I'll have one more backport soon, will self-deploy
2025-02-19 14:42:09 <logmsgbot> !log lucaswerkmeister-wmde@deploy2002 Finished scap sync-world: Backport for [[gerrit:1120954|Lift IP cap for edit-a-thon on 2025-02-26 (T386793)]] (duration: 10m 24s)
2025-02-19 14:42:13 <stashbot> T386793: IP Lift for Wikithon at Leeds University Weds 26th February - https://phabricator.wikimedia.org/T386793
2025-02-19 14:42:20 <wikibugs> ('CR) ''Lucas Werkmeister (WMDE): [C:''+1] "thanks! LGTM now" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120891 (https://phabricator.wikimedia.org/T386781) (owner: ''Anzx)'
2025-02-19 14:42:26 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] "Approved by lucaswerkmeister-wmde@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120891 (https://phabricator.wikimedia.org/T386781) (owner: ''Anzx)'
2025-02-19 14:42:27 <MichaelG_WMF> (I hope that it is faster than in the past, because GE no longer depends on Wikibase in CI except for the gate jobs)
2025-02-19 14:42:36 <logmsgbot> !log eevans@cumin1002 START - Cookbook sre.cassandra.roll-restart for nodes matching A:sessionstore: Apply JDK 11 update - eevans@cumin1002
2025-02-19 14:42:54 <logmsgbot> !log klausman@cumin1002 END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
2025-02-19 14:42:59 <Lucas_WMDE> James_F: backports will probably overrun into your window, I’m guessing that’s okay as usual
2025-02-19 14:43:13 <Lucas_WMDE> MichaelG_WMF: let’s start the backport gate-and-submits already
2025-02-19 14:43:24 <wikibugs> ('Merged) ''jenkins-bot: knwiki, knwikisource, tcywikisource: add confirmed user usergroup [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120891 (https://phabricator.wikimedia.org/T386781) (owner: ''Anzx)'
2025-02-19 14:43:30 <wikibugs> ('CR) ''Lucas Werkmeister (WMDE): [C:''+2] "starting gate-and-submit ahead of deployment" [extensions/GrowthExperiments] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1120618 (https://phabricator.wikimedia.org/T386490) (owner: ''Michael Große)'
2025-02-19 14:43:34 <wikibugs> ('CR) ''Lucas Werkmeister (WMDE): [C:''+2] "starting gate-and-submit ahead of deployment" [extensions/GrowthExperiments] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1120620 (https://phabricator.wikimedia.org/T386490) (owner: ''Michael Große)'
2025-02-19 14:43:37 <wikibugs> ('CR) ''Lucas Werkmeister (WMDE): [C:''+2] "starting gate-and-submit ahead of deployment" [extensions/GrowthExperiments] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1120643 (owner: ''Michael Große)'
2025-02-19 14:43:46 <Lucas_WMDE> (but not the config change just yet)
2025-02-19 14:43:54 <logmsgbot> !log lucaswerkmeister-wmde@deploy2002 Started scap sync-world: Backport for [[gerrit:1120891|knwiki, knwikisource, tcywikisource: add confirmed user usergroup (T386781)]]
2025-02-19 14:43:57 <stashbot> T386781: Allow sysops to add/revoke Confirmed user usergroup on knwiki, knwikisource, tcywikisource - https://phabricator.wikimedia.org/T386781
2025-02-19 14:44:19 <wikibugs> ('PS4) ''Arnaudb: rt: discarding templates [puppet] - ''https://gerrit.wikimedia.org/r/1117531 (https://phabricator.wikimedia.org/T384595)'
2025-02-19 14:44:52 <icinga-wm> RECOVERY - Disk space on archiva1002 is OK: DISK OK https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/d/000000377/host-overview?var-server=archiva1002&var-datasource=eqiad+prometheus/ops
2025-02-19 14:46:12 <wikibugs> ('PS1) ''Gergő Tisza: Add configuration options and global preference for the SUL3 rolllout [extensions/CentralAuth] (wmf/1.44.0-wmf.16) - ''https://gerrit.wikimedia.org/r/1120977 (https://phabricator.wikimedia.org/T384549)'
2025-02-19 14:46:13 <wikibugs> ('PS5) ''Arnaudb: rt: sunsetting caching [puppet] - ''https://gerrit.wikimedia.org/r/1117531 (https://phabricator.wikimedia.org/T384595)'
2025-02-19 14:46:14 <wikibugs> ('CR) ''Arnaudb: "Files have been restored, this commit now only impacts cache hieradata" [puppet] - ''https://gerrit.wikimedia.org/r/1117531 (https://phabricator.wikimedia.org/T384595) (owner: ''Arnaudb)'
2025-02-19 14:46:44 <wikibugs> ('PS1) ''Gergő Tisza: Add configuration options and global preference for the SUL3 rolllout [extensions/CentralAuth] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1120978 (https://phabricator.wikimedia.org/T384549)'
2025-02-19 14:46:52 <logmsgbot> !log lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde, anzx: Backport for [[gerrit:1120891|knwiki, knwikisource, tcywikisource: add confirmed user usergroup (T386781)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
2025-02-19 14:46:56 <anzx> Lucas_WMDE: checking
2025-02-19 14:46:59 <Lucas_WMDE> thanks :)
2025-02-19 14:47:15 <wikibugs> ('PS3) ''Brouberol: airflow: add kafka-{test,jumbo}-eqiad connections to the remaining instances [puppet] - ''https://gerrit.wikimedia.org/r/1118831 (https://phabricator.wikimedia.org/T379676)'
2025-02-19 14:48:34 <Lucas_WMDE> changes look good to me so far
2025-02-19 14:48:51 <anzx> Lucas_WMDE:
2025-02-19 14:48:58 <anzx> looks good to me
2025-02-19 14:49:00 <logmsgbot> !log lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde, anzx: Continuing with sync
2025-02-19 14:49:05 <Lucas_WMDE> great, thank you!
2025-02-19 14:49:07 <wikibugs> ('CR) ''Brouberol: [V:''+1] "PCC SUCCESS (CORE_DIFF 3): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/label=puppet7-compiler-node/4958/co" [puppet] - ''https://gerrit.wikimedia.org/r/1118831 (https://phabricator.wikimedia.org/T379676) (owner: ''Brouberol)'
2025-02-19 14:49:19 <wikibugs> ('CR) ''Brouberol: [V:''+1 C:''+2] airflow: add kafka-{test,jumbo}-eqiad connections to the remaining instances [puppet] - ''https://gerrit.wikimedia.org/r/1118831 (https://phabricator.wikimedia.org/T379676) (owner: ''Brouberol)'
2025-02-19 14:50:39 <wikibugs> ('CR) ''Filippo Giunchedi: [C:''+2] "Thank you for the quick review! -- I can confirm that the profile works fine in realm labs" [puppet] - ''https://gerrit.wikimedia.org/r/1120967 (owner: ''Filippo Giunchedi)'
2025-02-19 14:53:41 <wikibugs> ('CR) ''Arnaudb: "I found the private repo entry, still trying to find the stub in labs/private" [puppet] - ''https://gerrit.wikimedia.org/r/1117530 (https://phabricator.wikimedia.org/T384595) (owner: ''Arnaudb)'
2025-02-19 14:53:57 <wikibugs> ('CR) ''Arturo Borrero Gonzalez: [C:''+1] [wmcs::kubeadm::core] remove kubeadm-flags.env [puppet] - ''https://gerrit.wikimedia.org/r/1113194 (https://phabricator.wikimedia.org/T374193) (owner: ''Raymond Ndibe)'
2025-02-19 14:54:17 <wikibugs> 'SRE, ''Infrastructure-Foundations, ''netops, ''observability, and 3 others: Prevent BGP alerts triggering when K8s host maintenance is being done - https://phabricator.wikimedia.org/T384731#10563685 (''ayounsi) Thanks ! >>! In T384731#10556225, @fgiunchedi wrote: > Since we have to overwrite `instance`...'
2025-02-19 14:55:35 <logmsgbot> !log lucaswerkmeister-wmde@deploy2002 Finished scap sync-world: Backport for [[gerrit:1120891|knwiki, knwikisource, tcywikisource: add confirmed user usergroup (T386781)]] (duration: 11m 41s)
2025-02-19 14:55:39 <stashbot> T386781: Allow sysops to add/revoke Confirmed user usergroup on knwiki, knwikisource, tcywikisource - https://phabricator.wikimedia.org/T386781
2025-02-19 14:56:06 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] "Approved by lucaswerkmeister-wmde@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120904 (https://phabricator.wikimedia.org/T385343) (owner: ''Michael Große)'
2025-02-19 14:56:07 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] "Approved by lucaswerkmeister-wmde@deploy2002 using scap backport" [extensions/GrowthExperiments] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1120618 (https://phabricator.wikimedia.org/T386490) (owner: ''Michael Große)'
2025-02-19 14:56:07 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] "Approved by lucaswerkmeister-wmde@deploy2002 using scap backport" [extensions/GrowthExperiments] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1120620 (https://phabricator.wikimedia.org/T386490) (owner: ''Michael Große)'
2025-02-19 14:56:09 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] "Approved by lucaswerkmeister-wmde@deploy2002 using scap backport" [extensions/GrowthExperiments] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1120643 (owner: ''Michael Große)'
2025-02-19 14:56:24 <anzx> Lucas_WMDE: thank you
2025-02-19 14:56:28 <Lucas_WMDE> np :)
2025-02-19 14:56:59 <wikibugs> ('CR) ''Muehlenhoff: [C:''+1] rt: sunsetting caching [puppet] - ''https://gerrit.wikimedia.org/r/1117531 (https://phabricator.wikimedia.org/T384595) (owner: ''Arnaudb)'
2025-02-19 14:57:07 <wikibugs> ('Merged) ''jenkins-bot: Growth: increase minimum tasks per topic on idwiki; ruwiki => default [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120904 (https://phabricator.wikimedia.org/T385343) (owner: ''Michael Große)'
2025-02-19 14:57:41 <wikibugs> ('Merged) ''jenkins-bot: fix(Surfacing): make instrumentation platform-aware [extensions/GrowthExperiments] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1120618 (https://phabricator.wikimedia.org/T386490) (owner: ''Michael Große)'
2025-02-19 14:57:43 <wikibugs> ('Merged) ''jenkins-bot: feat(Surfacing): track performance metrics with statslib [extensions/GrowthExperiments] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1120620 (https://phabricator.wikimedia.org/T386490) (owner: ''Michael Große)'
2025-02-19 14:57:44 <wikibugs> ('Merged) ''jenkins-bot: fix(surfacing): add dependency for link-icon in popup header [extensions/GrowthExperiments] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1120643 (owner: ''Michael Große)'
2025-02-19 14:58:11 <MichaelG_WMF> is here and ready to test when you are :)
2025-02-19 14:58:19 <logmsgbot> !log lucaswerkmeister-wmde@deploy2002 Started scap sync-world: Backport for [[gerrit:1120904|Growth: increase minimum tasks per topic on idwiki; ruwiki => default (T385343)]], [[gerrit:1120618|fix(Surfacing): make instrumentation platform-aware (T386490)]], [[gerrit:1120620|feat(Surfacing): track performance metrics with statslib (T386490)]], [[gerrit:1120643|fix(surfacing): add dependency for link-icon in popup header]]
2025-02-19 14:58:22 <wikibugs> ('CR) ''Muehlenhoff: [C:''+1] "It's in private/modules/passwords/manifests/init.pp" [puppet] - ''https://gerrit.wikimedia.org/r/1117530 (https://phabricator.wikimedia.org/T384595) (owner: ''Arnaudb)'
2025-02-19 14:58:23 <stashbot> T385343: Surfacing "Add a link" Structured Tasks: Experiment Release (FY24/25 WE1.2.9) - https://phabricator.wikimedia.org/T385343
2025-02-19 14:58:24 <stashbot> T386490: Update Surfacing Add a Link intrumentation and tracking to desktop and statslib - https://phabricator.wikimedia.org/T386490
2025-02-19 14:58:47 <James_F> Lucas_WMDE: Yeah, fine.
2025-02-19 14:59:00 <Lucas_WMDE> 2x ack :)
2025-02-19 14:59:55 <MichaelG_WMF> We can do my config change another time, that is not urgent and I can just come back about it in the backport window tonight or tomorrow
2025-02-19 15:00:05 <jouncebot> Deploy window Wikifunctions Services UTC Afternoon (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T1500)
2025-02-19 15:00:18 <Lucas_WMDE> MichaelG_WMF: scap is already running
2025-02-19 15:00:45 <MichaelG_WMF> Lucas_WMDE: ah, that is also fine, thanks!
2025-02-19 15:01:18 <logmsgbot> !log lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde, migr: Backport for [[gerrit:1120904|Growth: increase minimum tasks per topic on idwiki; ruwiki => default (T385343)]], [[gerrit:1120618|fix(Surfacing): make instrumentation platform-aware (T386490)]], [[gerrit:1120620|feat(Surfacing): track performance metrics with statslib (T386490)]], [[gerrit:1120643|fix(surfacing): add dependency for link-icon in popup header]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
2025-02-19 15:01:25 <MichaelG_WMF> there is nothing to test for the config change (it changes behavior of a maintenance script which will be picked up later)
2025-02-19 15:01:34 <Lucas_WMDE> and anything for the backports?
2025-02-19 15:01:39 <MichaelG_WMF> yes!
2025-02-19 15:01:53 <MichaelG_WMF> though testing that backport will be quick
2025-02-19 15:02:06 <Lucas_WMDE> ok
2025-02-19 15:03:33 <wikibugs> ('PS1) ''Jforrester: wikifunctions: Update orchestrator from 2025-02-12-171406 to 2025-02-19-134350 [deployment-charts] - ''https://gerrit.wikimedia.org/r/1120991 (https://phabricator.wikimedia.org/T383631)'
2025-02-19 15:03:35 <wikibugs> ('PS1) ''Jforrester: wikifunctions: Update evaluators from 2025-02-11-155338 to 2025-02-19-135838 [deployment-charts] - ''https://gerrit.wikimedia.org/r/1120992 (https://phabricator.wikimedia.org/T383644)'
2025-02-19 15:03:36 <MichaelG_WMF> Lucas_WMDE: Looks good with mwdebug!
2025-02-19 15:03:46 <MichaelG_WMF> Ready to move forward from my side
2025-02-19 15:03:56 <logmsgbot> !log lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde, migr: Continuing with sync
2025-02-19 15:03:58 <Lucas_WMDE> great, thanks!
2025-02-19 15:04:21 <wikibugs> ('CR) ''FNegri: [C:''+2] [wmcs::kubeadm::core] remove kubeadm-flags.env [puppet] - ''https://gerrit.wikimedia.org/r/1113194 (https://phabricator.wikimedia.org/T374193) (owner: ''Raymond Ndibe)'
2025-02-19 15:04:32 <ottomata> !log upgrading eventgate-analytics in eqiad to node20 - T383814
2025-02-19 15:04:35 <logmsgbot> !log otto@deploy2002 helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
2025-02-19 15:04:35 <stashbot> Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
2025-02-19 15:04:36 <stashbot> T383814: Upgrade eventgate-wikimedia to node20 - https://phabricator.wikimedia.org/T383814
2025-02-19 15:04:50 <James_F> ottomata: This is our services window. :-P
2025-02-19 15:05:07 <tgr|away> James_F: is it ok to do another half an hour or so of MediaWiki backports? AIUI it doesn't interfere with the Wikifunctions window
2025-02-19 15:05:26 <James_F> tgr|away: Yes, it shouldn't be an issue.
2025-02-19 15:05:33 <logmsgbot> !log otto@deploy2002 helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
2025-02-19 15:05:36 <ottomata> James_F: ah I'm sorry. i usually look but didn't today. waited until afternoon backport was over.
2025-02-19 15:05:45 <ottomata> James_F: it shouldn't be related or interfere at all
2025-02-19 15:05:47 <James_F> ottomata: Afternoon backport also isn't over.
2025-02-19 15:06:09 <ottomata> well crap sorry. just waited until scheduled window time was over.
2025-02-19 15:06:17 <James_F> I mean, the /window/ is over, but the deploying isn't.
2025-02-19 15:06:19 <ottomata> i should have checked in.
2025-02-19 15:06:26 <James_F> No worries. :-)
2025-02-19 15:06:33 <ottomata> will do next time.
2025-02-19 15:06:42 <jinxer-wm> RESOLVED: JobUnavailable: Reduced availability for job sidekiq in ops@eqiad - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
2025-02-19 15:07:24 <apergos> thanks James (for letting us continue with MW backports)
2025-02-19 15:07:32 <tgr|away> a scap backport and a helm-based deploy running in parallel should be fine though, right?
2025-02-19 15:07:40 <wikibugs> (CR) Genoveva Galarza: [C:+2] wikifunctions: Update orchestrator from 2025-02-12-171406 to 2025-02-19-134350 [deployment-charts] - https://gerrit.wikimedia.org/r/1120991 (https://phabricator.wikimedia.org/T383631) (owner: Jforrester)
2025-02-19 15:07:49 <James_F> tgr|away: Depends on what they talk to, but in this case yes.
2025-02-19 15:08:43 <wikibugs> (PS3) Awight: [beta] Enable Community Configuration for Cite [mediawiki-config] - https://gerrit.wikimedia.org/r/1120466 (https://phabricator.wikimedia.org/T385597)
2025-02-19 15:08:52 <wikibugs> (Merged) jenkins-bot: wikifunctions: Update orchestrator from 2025-02-12-171406 to 2025-02-19-134350 [deployment-charts] - https://gerrit.wikimedia.org/r/1120991 (https://phabricator.wikimedia.org/T383631) (owner: Jforrester)
2025-02-19 15:10:32 <logmsgbot> !log gengh@deploy2002 helmfile [staging] START helmfile.d/services/wikifunctions: apply
2025-02-19 15:10:32 <logmsgbot> !log lucaswerkmeister-wmde@deploy2002 Finished scap sync-world: Backport for [[gerrit:1120904|Growth: increase minimum tasks per topic on idwiki; ruwiki => default (T385343)]], [[gerrit:1120618|fix(Surfacing): make instrumentation platform-aware (T386490)]], [[gerrit:1120620|feat(Surfacing): track performance metrics with statslib (T386490)]], [[gerrit:1120643|fix(surfacing): add dependency for link-icon in popup header]]
2025-02-19 15:10:32 <logmsgbot> (duration: 12m 13s)
2025-02-19 15:10:39 <stashbot> T385343: Surfacing "Add a link" Structured Tasks: Experiment Release (FY24/25 WE1.2.9) - https://phabricator.wikimedia.org/T385343
2025-02-19 15:10:40 <stashbot> T386490: Update Surfacing Add a Link intrumentation and tracking to desktop and statslib - https://phabricator.wikimedia.org/T386490
2025-02-19 15:10:40 <Lucas_WMDE> done deploying
2025-02-19 15:10:44 <Lucas_WMDE> tgr|away: over to you
2025-02-19 15:10:53 <wikibugs> (CR) Awight: [C:+2] "Beta deployment" [mediawiki-config] - https://gerrit.wikimedia.org/r/1120217 (https://phabricator.wikimedia.org/T373307) (owner: WMDE-Fisch)
2025-02-19 15:11:01 <wikibugs> (CR) Awight: [C:+2] "Beta deployment" [mediawiki-config] - https://gerrit.wikimedia.org/r/1120466 (https://phabricator.wikimedia.org/T385597) (owner: Awight)
2025-02-19 15:11:13 <logmsgbot> !log gengh@deploy2002 helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
2025-02-19 15:11:45 <wikibugs> (Merged) jenkins-bot: [beta] Change sub-referencing feature flag to new name [mediawiki-config] - https://gerrit.wikimedia.org/r/1120217 (https://phabricator.wikimedia.org/T373307) (owner: WMDE-Fisch)
2025-02-19 15:11:48 <wikibugs> (CR) CI reject: [V:-1] [beta] Enable Community Configuration for Cite [mediawiki-config] - https://gerrit.wikimedia.org/r/1120466 (https://phabricator.wikimedia.org/T385597) (owner: Awight)
2025-02-19 15:12:11 <wikibugs> (PS1) Andrew Bogott: wmcs-novastats-cephleaks: don't crash if trying to delete a missing file [puppet] - https://gerrit.wikimedia.org/r/1120996 (https://phabricator.wikimedia.org/T383796)
2025-02-19 15:12:22 <tgr|away> thx
2025-02-19 15:12:38 <wikibugs> (CR) Gergő Tisza: [C:+2] Add configuration options and global preference for the SUL3 rolllout [extensions/CentralAuth] (wmf/1.44.0-wmf.16) - https://gerrit.wikimedia.org/r/1120977 (https://phabricator.wikimedia.org/T384549) (owner: Gergő Tisza)
2025-02-19 15:12:40 <wikibugs> (CR) Gergő Tisza: [C:+2] Add configuration options and global preference for the SUL3 rolllout [extensions/CentralAuth] (wmf/1.44.0-wmf.17) - https://gerrit.wikimedia.org/r/1120978 (https://phabricator.wikimedia.org/T384549) (owner: Gergő Tisza)
2025-02-19 15:12:47 <logmsgbot> !log gengh@deploy2002 helmfile [codfw] START helmfile.d/services/wikifunctions: apply
2025-02-19 15:12:56 <wikibugs> (CR) Andrew Bogott: [C:+2] wmcs-novastats-cephleaks: don't crash if trying to delete a missing file [puppet] - https://gerrit.wikimedia.org/r/1120996 (https://phabricator.wikimedia.org/T383796) (owner: Andrew Bogott)
2025-02-19 15:13:20 <wikibugs> (PS4) Awight: [beta] Enable Community Configuration for Cite [mediawiki-config] - https://gerrit.wikimedia.org/r/1120466 (https://phabricator.wikimedia.org/T385597)
2025-02-19 15:13:35 <logmsgbot> !log gengh@deploy2002 helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
2025-02-19 15:13:48 <wikibugs> (CR) Awight: [C:+2] "Beta deployment" [mediawiki-config] - https://gerrit.wikimedia.org/r/1120466 (https://phabricator.wikimedia.org/T385597) (owner: Awight)
2025-02-19 15:13:51 <logmsgbot> !log gengh@deploy2002 helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
2025-02-19 15:14:36 <wikibugs> (Merged) jenkins-bot: [beta] Enable Community Configuration for Cite [mediawiki-config] - https://gerrit.wikimedia.org/r/1120466 (https://phabricator.wikimedia.org/T385597) (owner: Awight)
2025-02-19 15:14:45 <logmsgbot> !log gengh@deploy2002 helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
2025-02-19 15:15:16 <jinxer-wm> FIRING: MediaWikiLatencyExceeded: p75 latency high: eqiad mw-web/next (k8s) 1.008s - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook#Average_latency_exceeded - https://grafana.wikimedia.org/d/U7JT--knk/mw-on-k8s?orgId=1&viewPanel=55&var-dc=eqiad%20prometheus/k8s&var-service=mediawiki&var-namespace=mw-web&var-release=next - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiLatencyExceeded
2025-02-19 15:17:02 <wikibugs> (CR) Pcoombe: "Just "search" and "uselang" should be all" [puppet] - https://gerrit.wikimedia.org/r/1080357 (https://phabricator.wikimedia.org/T318285) (owner: Simon04)
2025-02-19 15:18:09 <wikibugs> (CR) Genoveva Galarza: [C:+2] wikifunctions: Update evaluators from 2025-02-11-155338 to 2025-02-19-135838 [deployment-charts] - https://gerrit.wikimedia.org/r/1120992 (https://phabricator.wikimedia.org/T383644) (owner: Jforrester)
2025-02-19 15:18:26 <icinga-wm> PROBLEM - Checks that the local airflow scheduler for airflow @analytics is working properly on an-launcher1002 is CRITICAL: CRITICAL: /usr/bin/env PYTHONPATH=/srv/deployment/airflow-dags/analytics AIRFLOW_HOME=/srv/airflow-analytics /usr/lib/airflow/bin/airflow jobs check --job-type SchedulerJob --hostname an-launcher1002.eqiad.wmnet did not succeed https://wikitech.wikimedia.org/wiki/Analytics/Systems/Airflow
2025-02-19 15:18:39 <logmsgbot> !log eevans@cumin1002 END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:sessionstore: Apply JDK 11 update - eevans@cumin1002
2025-02-19 15:19:18 <wikibugs> (Merged) jenkins-bot: wikifunctions: Update evaluators from 2025-02-11-155338 to 2025-02-19-135838 [deployment-charts] - https://gerrit.wikimedia.org/r/1120992 (https://phabricator.wikimedia.org/T383644) (owner: Jforrester)
2025-02-19 15:19:26 <icinga-wm> RECOVERY - Checks that the local airflow scheduler for airflow @analytics is working properly on an-launcher1002 is OK: OK: /usr/bin/env PYTHONPATH=/srv/deployment/airflow-dags/analytics AIRFLOW_HOME=/srv/airflow-analytics /usr/lib/airflow/bin/airflow jobs check --job-type SchedulerJob --hostname an-launcher1002.eqiad.wmnet succeeded https://wikitech.wikimedia.org/wiki/Analytics/Systems/Airflow
2025-02-19 15:20:16 <jinxer-wm> RESOLVED: MediaWikiLatencyExceeded: p75 latency high: eqiad mw-web/next (k8s) 1.008s - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook#Average_latency_exceeded - https://grafana.wikimedia.org/d/U7JT--knk/mw-on-k8s?orgId=1&viewPanel=55&var-dc=eqiad%20prometheus/k8s&var-service=mediawiki&var-namespace=mw-web&var-release=next - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiLatencyExceeded
2025-02-19 15:20:52 <logmsgbot> !log gengh@deploy2002 helmfile [staging] START helmfile.d/services/wikifunctions: apply
2025-02-19 15:20:57 <wikibugs> SRE, LDAP, Patch-For-Review: ldap-admins POSIX group does not actually give any permissions to its members - https://phabricator.wikimedia.org/T386472#10563839 (MoritzMuehlenhoff) I did a little Puppet archeology: * The name of the modify-ldap-user command was moved from sbin to bin in Puppet in 2016...
2025-02-19 15:21:10 <wikibugs> (CR) Muehlenhoff: [C:-1] "Let's put this on hold until the discussion on https://phabricator.wikimedia.org/T386472 is complete" [puppet] - https://gerrit.wikimedia.org/r/1120592 (https://phabricator.wikimedia.org/T386472) (owner: Dzahn)
2025-02-19 15:21:55 <logmsgbot> !log gengh@deploy2002 helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
2025-02-19 15:22:06 <wikibugs> (Merged) jenkins-bot: Add configuration options and global preference for the SUL3 rolllout [extensions/CentralAuth] (wmf/1.44.0-wmf.16) - https://gerrit.wikimedia.org/r/1120977 (https://phabricator.wikimedia.org/T384549) (owner: Gergő Tisza)
2025-02-19 15:22:15 <wikibugs> (Merged) jenkins-bot: Add configuration options and global preference for the SUL3 rolllout [extensions/CentralAuth] (wmf/1.44.0-wmf.17) - https://gerrit.wikimedia.org/r/1120978 (https://phabricator.wikimedia.org/T384549) (owner: Gergő Tisza)
2025-02-19 15:22:46 <logmsgbot> !log gengh@deploy2002 helmfile [codfw] START helmfile.d/services/wikifunctions: apply
2025-02-19 15:23:33 <logmsgbot> !log gengh@deploy2002 helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
2025-02-19 15:23:50 <logmsgbot> !log gengh@deploy2002 helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
2025-02-19 15:24:40 <logmsgbot> !log jgiannelos@deploy2002 helmfile [codfw] START helmfile.d/services/mobileapps: apply
2025-02-19 15:24:50 <logmsgbot> !log gengh@deploy2002 helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
2025-02-19 15:25:11 <logmsgbot> !log jgiannelos@deploy2002 helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
2025-02-19 15:25:22 <logmsgbot> !log jgiannelos@deploy2002 helmfile [eqiad] START helmfile.d/services/mobileapps: apply
2025-02-19 15:26:15 <logmsgbot> !log jgiannelos@deploy2002 helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
2025-02-19 15:29:02 <logmsgbot> !log tgr@deploy2002 Started scap sync-world: Backport for [[gerrit:1120977|Add configuration options and global preference for the SUL3 rolllout (T384549 T377144 T384552 T384215)]], [[gerrit:1120978|Add configuration options and global preference for the SUL3 rolllout (T384549 T377144 T384552 T384215)]]
2025-02-19 15:29:10 <stashbot> T384549: Create a per-user flag for enabling SUL3 - https://phabricator.wikimedia.org/T384549
2025-02-19 15:29:10 <stashbot> T377144: Create method for deterministically opting new users into SUL3 rollout - https://phabricator.wikimedia.org/T377144
2025-02-19 15:29:11 <stashbot> T384552: Create method for staged opt-in of new users into SUL3 rollout - https://phabricator.wikimedia.org/T384552
2025-02-19 15:29:11 <stashbot> T384215: Create method for staged opt-in of existing users into SUL3 rollout - https://phabricator.wikimedia.org/T384215
2025-02-19 15:30:56 <tgr|away> awight: please do a git rebase after merging beta patches next time, unexpected patches confuse scap backport
2025-02-19 15:32:38 <Lucas_WMDE> (ftr awight had asked about this in -releng and I wasn’t sure if just +2ing was okay or not – good to know)
2025-02-19 15:32:53 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=yes; selector: name=wikikube-worker2001.codfw.wmnet,dc=codfw,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 15:33:08 <dancy> awight: Another way to put that is: Always run `scap backport` for any operations/mediawiki-config change, even if they're beta-only. `scap backport` is smart enough to shortcut the deployment if it sees a beta-only config change.
2025-02-19 15:33:50 <tgr|away> oh cool, wasn't aware of that
2025-02-19 15:33:53 <awight> tgr|away: Oof, sorry that I chose the busiest possible moment to "sneak" some cruft into the mix.
2025-02-19 15:34:01 <apergos> oh will it do the rebase and etc first, then bail? that's nice
2025-02-19 15:34:15 <dancy> Yeah
2025-02-19 15:34:26 <dancy> Please never use git commands in /srv/mediawiki-staging again. :-))
2025-02-19 15:34:33 <apergos> yay!
2025-02-19 15:34:34 <awight> :-D
2025-02-19 15:34:38 <wikibugs> ops-eqiad, SRE, DC-Ops: PDU sensor over limit - https://phabricator.wikimedia.org/T383383#10563987 (phaultfinder)
2025-02-19 15:37:06 <wikibugs> (CR) Arnaudb: [C:+2] rt: discarding modules about request tracker [puppet] - https://gerrit.wikimedia.org/r/1117530 (https://phabricator.wikimedia.org/T384595) (owner: Arnaudb)
2025-02-19 15:38:59 <wikibugs> (CR) Arnaudb: [C:+2] rt: sunsetting caching [puppet] - https://gerrit.wikimedia.org/r/1117531 (https://phabricator.wikimedia.org/T384595) (owner: Arnaudb)
2025-02-19 15:40:00 <tgr|away> Lucas_WMDE: do you by any chance have an idea what this error means? https://logstash.wikimedia.org/app/discover#/doc/logstash-*/logstash-deploy-1-7.0.0-1-2025.02.19?id=whDaHpUBLmySI1N_YsRT
2025-02-19 15:40:20 <Lucas_WMDE> looks
2025-02-19 15:40:28 <Lucas_WMDE> oh
2025-02-19 15:40:28 <Lucas_WMDE> oh fuck
2025-02-19 15:40:32 <tgr|away> it's breaking one of the scap tests, but it's not obvious to me how it could be related to the patch being deployed
2025-02-19 15:40:41 <Lucas_WMDE> this is bizarre
2025-02-19 15:40:45 <tgr|away> only happening on mwdebug though
2025-02-19 15:40:49 <Lucas_WMDE> we just started seeing errors like this in CI too https://phabricator.wikimedia.org/T386836
2025-02-19 15:40:58 <Lucas_WMDE> but how on earth could it sneak into production now
2025-02-19 15:41:18 <tgr|away> cold cache or something like that?
2025-02-19 15:42:02 <tgr|away> you can see it at https://test.wikidata.org/wiki/Q232463 on mwdebug, but on the normal servers it works
2025-02-19 15:42:29 <Lucas_WMDE> I have a feeling it must be caused by https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CentralAuth/+/1115497 ?
2025-02-19 15:42:43 <Lucas_WMDE> if we started seeing it in CI soon after that was merged on master, and it’s also whining on mwdebug during backport
2025-02-19 15:42:48 <Lucas_WMDE> even if I have no idea yet how it could be related
2025-02-19 15:42:52 <Lucas_WMDE> looks at the change
2025-02-19 15:43:06 <Lucas_WMDE> (I seem to recall that Wikibase CI does indeed pull in CentralAuth through some transitive dependency)
2025-02-19 15:43:29 <tgr|away> that would make sense, but nothing in that patch interferes with content model registration afaik
2025-02-19 15:43:56 <apergos> not directly, that's certain
2025-02-19 15:44:09 <Lucas_WMDE> it’s probably something pretty arcane
2025-02-19 15:44:18 <apergos> anything else from that req id that might lead us to an earlier message?
2025-02-19 15:44:20 <Lucas_WMDE> e.g. I could imagine that your patch causes some services to be initialized in a different order
2025-02-19 15:44:31 <Lucas_WMDE> and now some Wikibase hook runs too late to register the content models
2025-02-19 15:46:04 <Lucas_WMDE> let me try to see how wikibase registers those content models
2025-02-19 15:47:40 <apergos> sigh no related log entries in logstash
2025-02-19 15:49:06 <wikibugs> (PS1) Scott French: setup.py: add with-dbctl extra to conftool dependency [software/spicerack] - https://gerrit.wikimedia.org/r/1121021
2025-02-19 15:49:30 <tgr|away> one of the CA hooks runs on SetupAfterCache which is quite early, and the patch changes its dependencies
2025-02-19 15:50:05 <Lucas_WMDE> yeah, and Wikibase also registers its content models in onSetupAfterCache
2025-02-19 15:50:06 <tgr|away> the new dependencies are PreferencesFactory and UserNameUtils
2025-02-19 15:50:08 <wikibugs> (CR) Scott French: dbctl: pass DbCtlConfiguration to DbConfig (1 comment) [software/spicerack] - https://gerrit.wikimedia.org/r/1120648 (https://phabricator.wikimedia.org/T383324) (owner: Scott French)
2025-02-19 15:50:09 <Lucas_WMDE> I think I remember that being an issue before
2025-02-19 15:50:10 <wikibugs> (CR) Scott French: [C:+2] dbctl: pass DbCtlConfiguration to DbConfig [software/spicerack] - https://gerrit.wikimedia.org/r/1120648 (https://phabricator.wikimedia.org/T383324) (owner: Scott French)
2025-02-19 15:50:39 <Lucas_WMDE> maybe https://phabricator.wikimedia.org/T288819
2025-02-19 15:50:47 <tgr|away> anyway, if it's CI reproducible, I'll just roll back
2025-02-19 15:51:14 <tgr|away> thanks for the quick response
2025-02-19 15:51:28 <Lucas_WMDE> yeah UserNameUtils pulls in NamespaceInfo
2025-02-19 15:51:39 <Lucas_WMDE> via ContentLanguage -> LanguageFactory -> NamespaceInfo
2025-02-19 15:51:39 <wikibugs> (PS4) Ssingh: Release dnsdist 1.9.8-1+wmf12u1 [debs/dnsdist] - https://gerrit.wikimedia.org/r/1120607
2025-02-19 15:51:47 <Lucas_WMDE> tgr|away: ack, thanks
2025-02-19 15:52:19 <Lucas_WMDE> I’m *really* glad this was caught on mwdebug
2025-02-19 15:53:01 <wikibugs> (PS1) TrainBranchBot: Revert "Add configuration options and global preference for the SUL3 rolllout" [extensions/CentralAuth] (wmf/1.44.0-wmf.16) - https://gerrit.wikimedia.org/r/1121023
2025-02-19 15:53:02 <wikibugs> (CR) TrainBranchBot: "tgr@deploy2002 created a revert of this change as I823add719e2eaa8889c9f1676492c1cfe3d23a1c" [extensions/CentralAuth] (wmf/1.44.0-wmf.16) - https://gerrit.wikimedia.org/r/1120977 (https://phabricator.wikimedia.org/T384549) (owner: Gergő Tisza)
2025-02-19 15:53:09 <wikibugs> (PS1) TrainBranchBot: Revert "Add configuration options and global preference for the SUL3 rolllout" [extensions/CentralAuth] (wmf/1.44.0-wmf.17) - https://gerrit.wikimedia.org/r/1121024
2025-02-19 15:53:10 <wikibugs> (CR) TrainBranchBot: "tgr@deploy2002 created a revert of this change as I2719f041db9f2a62aaf82001979a9296bba8b835" [extensions/CentralAuth] (wmf/1.44.0-wmf.17) - https://gerrit.wikimedia.org/r/1120978 (https://phabricator.wikimedia.org/T384549) (owner: Gergő Tisza)
2025-02-19 15:54:09 <wikibugs> (CR) TrainBranchBot: [C:+2] "Approved by tgr@deploy2002 using scap backport" [extensions/CentralAuth] (wmf/1.44.0-wmf.16) - https://gerrit.wikimedia.org/r/1121023 (owner: TrainBranchBot)
2025-02-19 15:54:10 <wikibugs> (CR) TrainBranchBot: [C:+2] "Approved by tgr@deploy2002 using scap backport" [extensions/CentralAuth] (wmf/1.44.0-wmf.17) - https://gerrit.wikimedia.org/r/1121024 (owner: TrainBranchBot)
2025-02-19 15:58:52 <wikibugs> (CR) Dzahn: [C:+1] rt: sunsetting caching [puppet] - https://gerrit.wikimedia.org/r/1117531 (https://phabricator.wikimedia.org/T384595) (owner: Arnaudb)
2025-02-19 15:59:10 <Lucas_WMDE> btw I just tested it and real Wikidata would also have been broken (e.g. https://www.wikidata.org/wiki/Q42)
2025-02-19 15:59:10 <wikibugs> (CR) Dzahn: [C:+1] rt: remove cname [dns] - https://gerrit.wikimedia.org/r/1120901 (https://phabricator.wikimedia.org/T385777) (owner: Arnaudb)
2025-02-19 15:59:37 <wikibugs> (CR) Dzahn: [C:+1] ferm: remove moscovium from allowlist [puppet] - https://gerrit.wikimedia.org/r/1120889 (https://phabricator.wikimedia.org/T385777) (owner: Arnaudb)
2025-02-19 15:59:53 <wikibugs> (CR) Dzahn: [C:+1] moscovium: remove from site.pp [puppet] - https://gerrit.wikimedia.org/r/1120917 (https://phabricator.wikimedia.org/T385777) (owner: Arnaudb)
2025-02-19 16:00:28 <wikibugs> SRE, Infrastructure-Foundations, netops: Gaps in gNMI network statistics in eqiad - https://phabricator.wikimedia.org/T386807#10564100 (cmooney)
2025-02-19 16:00:57 <wikibugs> (CR) David Caro: nova vendordata: set fqdn from project_name rather than project_id (1 comment) [puppet] - https://gerrit.wikimedia.org/r/1120684 (https://phabricator.wikimedia.org/T379030) (owner: Andrew Bogott)
2025-02-19 16:01:51 <wikibugs> (Merged) jenkins-bot: Revert "Add configuration options and global preference for the SUL3 rolllout" [extensions/CentralAuth] (wmf/1.44.0-wmf.16) - https://gerrit.wikimedia.org/r/1121023 (owner: TrainBranchBot)
2025-02-19 16:02:34 <wikibugs> (Merged) jenkins-bot: Revert "Add configuration options and global preference for the SUL3 rolllout" [extensions/CentralAuth] (wmf/1.44.0-wmf.17) - https://gerrit.wikimedia.org/r/1121024 (owner: TrainBranchBot)
2025-02-19 16:02:47 <wikibugs> (PS1) Scott French: setup.py: add with-dbctl extra to conftool dependency [software/spicerack] - https://gerrit.wikimedia.org/r/1121021
2025-02-19 16:02:47 <wikibugs> (CR) Scott French: "Thanks in advance for the review, Riccardo!" [software/spicerack] - https://gerrit.wikimedia.org/r/1121021 (owner: Scott French)
2025-02-19 16:02:56 <wikibugs> (PS1) Giuseppe Lavagetto: kartotherian: add extra FQDN [deployment-charts] - https://gerrit.wikimedia.org/r/1121035
2025-02-19 16:03:06 <logmsgbot> !log tgr@deploy2002 Started scap sync-world: Backport for [[gerrit:1121023|Revert "Add configuration options and global preference for the SUL3 rolllout"]], [[gerrit:1121024|Revert "Add configuration options and global preference for the SUL3 rolllout"]]
2025-02-19 16:03:26 <wikibugs> (CR) Volans: [C:+1] "LGTM" [software/spicerack] - https://gerrit.wikimedia.org/r/1121021 (owner: Scott French)
2025-02-19 16:06:07 <logmsgbot> !log tgr@deploy2002 tgr, trainbranchbot: Backport for [[gerrit:1121023|Revert "Add configuration options and global preference for the SUL3 rolllout"]], [[gerrit:1121024|Revert "Add configuration options and global preference for the SUL3 rolllout"]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
2025-02-19 16:07:18 <wikibugs> (CR) CI reject: [V:-1] Release dnsdist 1.9.8-1+wmf12u1 [debs/dnsdist] - https://gerrit.wikimedia.org/r/1120607 (owner: Ssingh)
2025-02-19 16:08:17 <wikibugs> (CR) David Caro: [C:+1] "Got a question, LGTM anyhow" [puppet] - https://gerrit.wikimedia.org/r/1120684 (https://phabricator.wikimedia.org/T379030) (owner: Andrew Bogott)
2025-02-19 16:08:51 <wikibugs> (PS2) Giuseppe Lavagetto: kartotherian: add extra FQDN [deployment-charts] - https://gerrit.wikimedia.org/r/1121035
2025-02-19 16:09:10 <logmsgbot> !log tgr@deploy2002 tgr, trainbranchbot: Continuing with sync
2025-02-19 16:10:49 <wikibugs> ops-eqiad, SRE, Ceph, cloud-services-team, and 2 others: evaluate new drives in cloudcephosd102[123] - https://phabricator.wikimedia.org/T386725#10564210 (Andrew) p:Triage→Medium
2025-02-19 16:11:42 <wikibugs> (PS5) Ssingh: Release dnsdist 1.9.8-1+wmf12u1 [debs/dnsdist] - https://gerrit.wikimedia.org/r/1120607
2025-02-19 16:12:31 <wikibugs> (Merged) jenkins-bot: dbctl: pass DbCtlConfiguration to DbConfig [software/spicerack] - https://gerrit.wikimedia.org/r/1120648 (https://phabricator.wikimedia.org/T383324) (owner: Scott French)
2025-02-19 16:15:11 <wikibugs> (CR) Scott French: [C:+2] "Thanks, Riccard" [software/spicerack] - https://gerrit.wikimedia.org/r/1121021 (owner: Scott French)
2025-02-19 16:15:36 <wikibugs> (CR) Pppery: "Could you explain what those reasons are? The initial patch is completely lacking any reasoning." [puppet] - https://gerrit.wikimedia.org/r/1080357 (https://phabricator.wikimedia.org/T318285) (owner: Simon04)
2025-02-19 16:15:50 <logmsgbot> !log tgr@deploy2002 Finished scap sync-world: Backport for [[gerrit:1121023|Revert "Add configuration options and global preference for the SUL3 rolllout"]], [[gerrit:1121024|Revert "Add configuration options and global preference for the SUL3 rolllout"]] (duration: 12m 43s)
2025-02-19 16:19:03 <wikibugs> (PS3) Giuseppe Lavagetto: kartotherian: add extra FQDN [deployment-charts] - https://gerrit.wikimedia.org/r/1121035
2025-02-19 16:19:09 <wikibugs> (PS3) Muehlenhoff: openssh: Remove code to disable NIST key exchange [puppet] - https://gerrit.wikimedia.org/r/1074381
2025-02-19 16:20:45 <wikibugs> (PS6) BryanDavis: toolhub: Add pod.kubernetes.io/sidecars annotation to CronJob [deployment-charts] - https://gerrit.wikimedia.org/r/1119198 (https://phabricator.wikimedia.org/T292861)
2025-02-19 16:22:19 <wikibugs> (CR) Muehlenhoff: "check experimental" [puppet] - https://gerrit.wikimedia.org/r/1074381 (owner: Muehlenhoff)
2025-02-19 16:26:50 <wikibugs> (CR) BryanDavis: [C:+2] toolhub: Add pod.kubernetes.io/sidecars annotation to CronJob [deployment-charts] - https://gerrit.wikimedia.org/r/1119198 (https://phabricator.wikimedia.org/T292861) (owner: BryanDavis)
2025-02-19 16:26:50 <wikibugs> SRE, Infrastructure-Foundations, netops, observability, and 3 others: Prevent BGP alerts triggering when K8s host maintenance is being done - https://phabricator.wikimedia.org/T384731#10564274 (cmooney) >>! In T384731#10563685, @ayounsi wrote: >>! In T384731#10556225, @fgiunchedi wrote: >> I also...
2025-02-19 16:28:04 <wikibugs> (CR) Elukey: [C:+2] kartotherian: add extra FQDN [deployment-charts] - https://gerrit.wikimedia.org/r/1121035 (owner: Giuseppe Lavagetto)
2025-02-19 16:28:18 <wikibugs> (Merged) jenkins-bot: toolhub: Add pod.kubernetes.io/sidecars annotation to CronJob [deployment-charts] - https://gerrit.wikimedia.org/r/1119198 (https://phabricator.wikimedia.org/T292861) (owner: BryanDavis)
2025-02-19 16:29:22 <logmsgbot> !log bd808@deploy2002 helmfile [staging] START helmfile.d/services/toolhub: apply
2025-02-19 16:30:07 <logmsgbot> !log jmm@cumin2002 END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
2025-02-19 16:30:36 <logmsgbot> !log elukey@deploy2002 helmfile [staging] START helmfile.d/services/kartotherian: sync
2025-02-19 16:30:42 <logmsgbot> !log elukey@deploy2002 helmfile [staging] DONE helmfile.d/services/kartotherian: sync
2025-02-19 16:30:45 <wikibugs> (CR) Ssingh: "Ready for review." [debs/dnsdist] - https://gerrit.wikimedia.org/r/1120607 (owner: Ssingh)
2025-02-19 16:30:52 <logmsgbot> !log bd808@deploy2002 helmfile [staging] DONE helmfile.d/services/toolhub: apply
2025-02-19 16:31:07 <logmsgbot> !log bd808@deploy2002 helmfile [codfw] START helmfile.d/services/toolhub: apply
2025-02-19 16:31:49 <logmsgbot> !log elukey@deploy2002 helmfile [codfw] START helmfile.d/services/kartotherian: sync
2025-02-19 16:31:51 <logmsgbot> !log elukey@deploy2002 helmfile [codfw] DONE helmfile.d/services/kartotherian: sync
2025-02-19 16:32:11 <logmsgbot> !log elukey@deploy2002 helmfile [eqiad] START helmfile.d/services/kartotherian: sync
2025-02-19 16:32:15 <logmsgbot> !log elukey@deploy2002 helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
2025-02-19 16:32:31 <logmsgbot> !log bd808@deploy2002 helmfile [codfw] DONE helmfile.d/services/toolhub: apply
2025-02-19 16:32:55 <logmsgbot> !log bd808@deploy2002 helmfile [eqiad] START helmfile.d/services/toolhub: apply
2025-02-19 16:33:59 <logmsgbot> !log bd808@deploy2002 helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
2025-02-19 16:34:25 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=yes; selector: name=wikikube-worker200.*.codfw.wmnet,dc=codfw,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 16:35:14 <wikibugs> SRE, serviceops, Wikimedia-Mailing-lists: Set up memcached for mailman3 - https://phabricator.wikimedia.org/T282931#10564303 (jijiki) Open→Stalled
2025-02-19 16:35:30 <wikibugs> (Merged) jenkins-bot: setup.py: add with-dbctl extra to conftool dependency [software/spicerack] - https://gerrit.wikimedia.org/r/1121021 (owner: Scott French)
2025-02-19 16:37:45 <wikibugs> SRE, serviceops, Wikimedia-Mailing-lists: Set up memcached for mailman3 - https://phabricator.wikimedia.org/T282931#10564314 (jijiki) p:Medium→Low
2025-02-19 16:38:27 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=yes; selector: name=wikikube-worker100.*.eqiad.wmnet,dc=eqiad,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 16:39:33 <wikibugs> (CR) Jgiannelos: [C:+1] "@hnowlan@wikimedia.org Looks good to me but can you also take a look? We can do the deployments." [deployment-charts] - https://gerrit.wikimedia.org/r/1118890 (https://phabricator.wikimedia.org/T386244) (owner: Arlolra)
2025-02-19 16:44:41 <wikibugs> (PS1) Aklapper: Rename a variable to be clearer [phabricator/antivandalism] (wmf/stable) - https://gerrit.wikimedia.org/r/1121050
2025-02-19 16:44:53 <wikibugs> (CR) Aklapper: [V:+2 C:+2] Rename a variable to be clearer [phabricator/antivandalism] (wmf/stable) - https://gerrit.wikimedia.org/r/1121050 (owner: Aklapper)
2025-02-19 16:49:44 <wikibugs> (CR) Scott French: [C:+1] "Looks good. Thanks!" [puppet] - https://gerrit.wikimedia.org/r/1120700 (owner: RLazarus)
2025-02-19 16:52:07 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=inactive; selector: name=wikikube-worker100.*.eqiad.wmnet,dc=eqiad,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 16:52:17 <logmsgbot> !log elukey@puppetserver1001 conftool action : set/pooled=inactive; selector: name=wikikube-worker200.*.codfw.wmnet,dc=codfw,cluster=maps,service=kartotherian-k8s-ssl
2025-02-19 16:59:40 <wikibugs> SRE, Traffic: Define an event stream and schema for haproxy_requestctl analytics pipeline ingestion - https://phabricator.wikimedia.org/T383392#10564431 (Fabfur) >>! In T383392#10560361, @Ottomata wrote: > @Fabfur {T383914} has been deployed, so it should be possible to remove the `meta.domain` field...
2025-02-19 17:01:12 <wikibugs> (PS1) DCausse: Revert "cirrus: enable mlr-2025 for select wikis" [mediawiki-config] - https://gerrit.wikimedia.org/r/1120534 (owner: Gmodena)
2025-02-19 17:02:06 <wikibugs> (CR) Vgutierrez: "lvs2013 (low-traffic LVS) hasn't any IPIP services till now, so we had IPIP support disabled there, enabling it deploys ipip-multiqueue-op" [puppet] - https://gerrit.wikimedia.org/r/1120496 (https://phabricator.wikimedia.org/T385564) (owner: Vgutierrez)
2025-02-19 17:05:13 <wikibugs> (CR) Jdlrobson: [C:-1] Update Search AB test config, increase bucketing/sampling rates for eu/ca, deploy to testwiki (1 comment) [mediawiki-config] - https://gerrit.wikimedia.org/r/1120609 (https://phabricator.wikimedia.org/T386734) (owner: Bernard Wang)
2025-02-19 17:09:26 <wikibugs> (PS1) Aklapper: Add some comments in editscore section [phabricator/antivandalism] (wmf/stable) - https://gerrit.wikimedia.org/r/1121054
2025-02-19 17:10:05 <wikibugs> (CR) Aklapper: [V:+2 C:+2] Add some comments in editscore section [phabricator/antivandalism] (wmf/stable) - https://gerrit.wikimedia.org/r/1121054 (owner: Aklapper)
2025-02-19 17:13:55 <wikibugs> (PS4) Cwhite: Profiler: emit both statsd and dogstatsd [mediawiki-config] - https://gerrit.wikimedia.org/r/1081461 (https://phabricator.wikimedia.org/T359385)
2025-02-19 17:15:21 <wikibugs> (CR) Krinkle: [C:+1] "LGTM for deployment." [mediawiki-config] - https://gerrit.wikimedia.org/r/1081461 (https://phabricator.wikimedia.org/T359385) (owner: Cwhite)
2025-02-19 17:18:24 <wikibugs> (CR) Hnowlan: [C:-1] "Change makes sense to me, but the chart will need a version bump for this to be rolled out successfully." [deployment-charts] - https://gerrit.wikimedia.org/r/1118890 (https://phabricator.wikimedia.org/T386244) (owner: Arlolra)
2025-02-19 17:28:47 <wikibugs> (CR) Andrew Bogott: nova vendordata: set fqdn from project_name rather than project_id (2 comments) [puppet] - https://gerrit.wikimedia.org/r/1120684 (https://phabricator.wikimedia.org/T379030) (owner: Andrew Bogott)
2025-02-19 17:29:04 <wikibugs> (PS2) Andrew Bogott: vendordata.txt: include rudimentary clouds.yaml in initial VM [puppet] - https://gerrit.wikimedia.org/r/1120683 (https://phabricator.wikimedia.org/T379030)
2025-02-19 17:29:04 <wikibugs> (PS7) Andrew Bogott: nova vendordata: set fqdn from project_name rather than project_id [puppet] - https://gerrit.wikimedia.org/r/1120684 (https://phabricator.wikimedia.org/T379030)
2025-02-19 17:31:10 <wikibugs> (PS1) Gergő Tisza: NewUserMessage: Enable on test2wiki [mediawiki-config] - https://gerrit.wikimedia.org/r/1121055
2025-02-19 17:31:34 <wikibugs> ('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Wednesday, February 19 UTC late backport window](https://wikitech.wikimedia.org/wiki/Deployments#deployca" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1121055 (owner: ''Gergő Tisza)'
2025-02-19 17:33:38 <icinga-wm> PROBLEM - Hadoop NodeManager on an-worker1105 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts%23Yarn_Nodemanager_process
2025-02-19 17:44:38 <icinga-wm> RECOVERY - Hadoop NodeManager on an-worker1105 is OK: PROCS OK: 1 process with command name java, args org.apache.hadoop.yarn.server.nodemanager.NodeManager https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Alerts%23Yarn_Nodemanager_process
2025-02-19 17:45:22 <icinga-wm> PROBLEM - BGP status on cr2-eqiad is CRITICAL: BGP CRITICAL - ASunknown/IPv6: Active https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
2025-02-19 17:47:01 <wikibugs> ('CR) ''MVernon: [C:''+1] "Thanks for the explanation :)" [puppet] - ''https://gerrit.wikimedia.org/r/1120496 (https://phabricator.wikimedia.org/T385564) (owner: ''Vgutierrez)'
2025-02-19 17:47:21 <wikibugs> ('CR) ''MVernon: [C:''+1] hiera: Enable IPIP on ms-fe@eqiad [puppet] - ''https://gerrit.wikimedia.org/r/1120603 (https://phabricator.wikimedia.org/T385564) (owner: ''Vgutierrez)'
2025-02-19 17:47:56 <wikibugs> ('CR) ''RLazarus: [C:''+2] deployment_server: Refactor some utility functions into a Job class [puppet] - ''https://gerrit.wikimedia.org/r/1120700 (owner: ''RLazarus)'
2025-02-19 17:54:26 <wikibugs> ('CR) ''Dzahn: [C:''+2] logspam: Consolidate CurlFactory cURL errors [puppet] - ''https://gerrit.wikimedia.org/r/1056221 (https://phabricator.wikimedia.org/T371633) (owner: ''Ahmon Dancy)'
2025-02-19 18:00:05 <jouncebot> Deploy window MediaWiki infrastructure (UTC late) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T1800)
2025-02-19 18:11:45 <tgr|away> Lucas_WMDE: we'd like to backport the CentralAuth patch today or tomorrow as it's needed for SUL3 rollout. Would you feel comfortable with the MediaWikiServices patch also being backported, or should we look for a workaround for now?
2025-02-19 18:31:16 <wikibugs> ('CR) ''Daimona Eaytoy: [C:''+1] "Thank you! LGTM now." [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120152 (https://phabricator.wikimedia.org/T386622) (owner: ''LD)'
2025-02-19 18:31:23 <wikibugs> ('CR) ''CI reject: [V:''-1] frwiki: Enable the CampaignEvents extension [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120152 (https://phabricator.wikimedia.org/T386622) (owner: ''LD)'
2025-02-19 18:32:20 <wikibugs> 'SRE, ''SRE-Access-Requests: Requesting access to stewards-users for Melos - https://phabricator.wikimedia.org/T386581#10565042 (''KFrancis) Hello all, the NDA is out for signatures. I'll confirm when it's complete.'
2025-02-19 18:42:15 <wikibugs> ('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Wednesday, February 19 UTC late backport window](https://wikitech.wikimedia.org/wiki/Deployments#deployca" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1119579 (https://phabricator.wikimedia.org/T384619) (owner: ''Jdlrobson)'
2025-02-19 18:43:34 <wikibugs> ('PS2) ''Jdlrobson: Footer: Wikimedia icon should collapse at lower resolutions [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1119579 (https://phabricator.wikimedia.org/T384619)'
2025-02-19 18:43:40 <wikibugs> ('CR) ''Jdlrobson: Footer: Wikimedia icon should collapse at lower resolutions (''1 comment) [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1119579 (https://phabricator.wikimedia.org/T384619) (owner: ''Jdlrobson)'
2025-02-19 18:45:57 <wikibugs> ('PS3) ''Bernard Wang: Update Search AB test config, increase bucketing/sampling rates for eu/ca, deploy to testwiki [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120609 (https://phabricator.wikimedia.org/T386734)'
2025-02-19 18:46:00 <wikibugs> ('CR) ''Jdlrobson: [C:''+1] Update Search AB test config, increase bucketing/sampling rates for eu/ca, deploy to testwiki (''1 comment) [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120609 (https://phabricator.wikimedia.org/T386734) (owner: ''Bernard Wang)'
2025-02-19 18:46:14 <wikibugs> ('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Wednesday, February 19 UTC late backport window](https://wikitech.wikimedia.org/wiki/Deployments#deployca" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120609 (https://phabricator.wikimedia.org/T386734) (owner: ''Bernard Wang)'
2025-02-19 18:51:06 <wikibugs> ('PS1) ''Jdlrobson: Lazy image loading Grade C fallback is broken [extensions/MobileFrontend] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1121077 (https://phabricator.wikimedia.org/T386400)'
2025-02-19 18:51:34 <wikibugs> ('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Wednesday, February 19 UTC late backport window](https://wikitech.wikimedia.org/wiki/Deployments#deployca" [extensions/MobileFrontend] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1121077 (https://phabricator.wikimedia.org/T386400) (owner: ''Jdlrobson)'
2025-02-19 18:54:45 <logmsgbot> !log fab@deploy2002 Started deploy [airflow-dags/research@95b14c7]: (no justification provided)
2025-02-19 18:54:54 <logmsgbot> !log fab@deploy2002 Finished deploy [airflow-dags/research@95b14c7]: (no justification provided) (duration: 00m 11s)
2025-02-19 18:57:40 <wikibugs> ('PS3) ''Arlolra: Bust cache for recreated pages [deployment-charts] - ''https://gerrit.wikimedia.org/r/1118890 (https://phabricator.wikimedia.org/T386244)'
2025-02-19 19:00:05 <jouncebot> dancy and andre: #bothumor I � Unicode. All rise for MediaWiki train - Utc-7+Utc-0 Version deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T1900).
2025-02-19 19:00:07 <wikibugs> ('CR) ''Ladsgroup: "LGTM, haven't tested it but I will do later. Want me to deploy it today?" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1119579 (https://phabricator.wikimedia.org/T384619) (owner: ''Jdlrobson)'
2025-02-19 19:01:34 <wikibugs> ('CR) ''Herron: "Thanks for the quick review!" [puppet] - ''https://gerrit.wikimedia.org/r/1120602 (https://phabricator.wikimedia.org/T385727) (owner: ''Herron)'
2025-02-19 19:05:19 <wikibugs> ('CR) ''Herron: [C:''+1] "makes sense to me 👍" [alerts] - ''https://gerrit.wikimedia.org/r/1120923 (owner: ''Filippo Giunchedi)'
2025-02-19 19:05:44 <dancy> o/
2025-02-19 19:07:31 <wikibugs> ('PS1) ''TrainBranchBot: group1 to 1.44.0-wmf.17 [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1121078 (https://phabricator.wikimedia.org/T382368)'
2025-02-19 19:07:33 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] group1 to 1.44.0-wmf.17 [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1121078 (https://phabricator.wikimedia.org/T382368) (owner: ''TrainBranchBot)'
2025-02-19 19:08:39 <wikibugs> 'SRE, ''Infrastructure-Foundations, ''netops, ''observability, and 3 others: Prevent BGP alerts triggering when K8s host maintenance is being done - https://phabricator.wikimedia.org/T384731#10565308 (''cmooney) >>! In T384731#10563685, @ayounsi wrote: > Is it possible to duplicate the metric, before the...'
2025-02-19 19:08:43 <wikibugs> ('Merged) ''jenkins-bot: group1 to 1.44.0-wmf.17 [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1121078 (https://phabricator.wikimedia.org/T382368) (owner: ''TrainBranchBot)'
2025-02-19 19:18:03 <logmsgbot> !log dancy@deploy2002 rebuilt and synchronized wikiversions files: group1 to 1.44.0-wmf.17 refs T382368
2025-02-19 19:18:07 <stashbot> T382368: 1.44.0-wmf.17 deployment blockers - https://phabricator.wikimedia.org/T382368
2025-02-19 19:18:45 <wikibugs> ('PS1) ''Dzahn: puppetserver: fix puppet dir dependency issue in cloudvps masters [puppet] - ''https://gerrit.wikimedia.org/r/1121079 (https://phabricator.wikimedia.org/T382960)'
2025-02-19 19:19:08 <wikibugs> ('CR) ''CI reject: [V:''-1] puppetserver: fix puppet dir dependency issue in cloudvps masters [puppet] - ''https://gerrit.wikimedia.org/r/1121079 (https://phabricator.wikimedia.org/T382960) (owner: ''Dzahn)'
2025-02-19 19:20:44 <wikibugs> ('PS2) ''Dzahn: puppetserver: fix puppet dir dependency issue in cloudvps masters [puppet] - ''https://gerrit.wikimedia.org/r/1121079 (https://phabricator.wikimedia.org/T382960)'
2025-02-19 19:21:06 <wikibugs> ('CR) ''CI reject: [V:''-1] puppetserver: fix puppet dir dependency issue in cloudvps masters [puppet] - ''https://gerrit.wikimedia.org/r/1121079 (https://phabricator.wikimedia.org/T382960) (owner: ''Dzahn)'
2025-02-19 19:22:21 <mutante> wow, kudos, CI detected that I typed "pupppet" instead of puppet :)
2025-02-19 19:23:06 <dancy> Nice work jenkinsbot
2025-02-19 19:23:24 <wikibugs> ('PS3) ''Dzahn: puppetserver: fix puppet dir dependency issue in cloudvps masters [puppet] - ''https://gerrit.wikimedia.org/r/1121079 (https://phabricator.wikimedia.org/T382960)'
2025-02-19 19:23:54 <mutante> and whoever added that string to the typos file after doing it before
2025-02-19 19:24:14 <wikibugs> ('CR) ''Jforrester: [C:''+1] Footer: Wikimedia icon should collapse at lower resolutions [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1119579 (https://phabricator.wikimedia.org/T384619) (owner: ''Jdlrobson)'
2025-02-19 19:27:15 <logmsgbot> !log fab@deploy2002 Started deploy [airflow-dags/research@b5ce354]: (no justification provided)
2025-02-19 19:28:01 <logmsgbot> !log fab@deploy2002 Finished deploy [airflow-dags/research@b5ce354]: (no justification provided) (duration: 00m 46s)
2025-02-19 19:28:21 <dduvall> jouncebot: nowandnext
2025-02-19 19:28:21 <jouncebot> For the next 1 hour(s) and 31 minute(s): MediaWiki train - Utc-7+Utc-0 Version (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T1900)
2025-02-19 19:28:21 <jouncebot> In 1 hour(s) and 31 minute(s): UTC late backport window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T2100)
2025-02-19 19:29:57 <dduvall> dancy: are you done with train. i need to restart jenkins
2025-02-19 19:30:09 <dduvall> ^ sorry, that's a question :)
2025-02-19 19:30:20 <dancy> Yep. Train looks good.
2025-02-19 19:34:14 <wikibugs> ('PS1) ''Daimona Eaytoy: Enable $wgCampaignEventsEnableEventInvitation on most wikis [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1121080 (https://phabricator.wikimedia.org/T383800)'
2025-02-19 19:35:07 <wikibugs> ('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Thursday, February 20 UTC afternoon backport window](https://wikitech.wikimedia.org/wiki/Deployments#depl" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1121080 (https://phabricator.wikimedia.org/T383800) (owner: ''Daimona Eaytoy)'
2025-02-19 19:35:13 <dduvall> !log restarting jenkins to fix git related issues following java update (T386755)
2025-02-19 19:35:15 <stashbot> Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
2025-02-19 19:35:16 <stashbot> T386755: Multiple *-pipeline-test jobs failing to load pipelinelib with git error - https://phabricator.wikimedia.org/T386755
2025-02-19 19:36:33 <dduvall> doing a "safe" restart so this might be awhile. the build queue is going to fill up quite a bit as well
2025-02-19 19:51:20 <bvibber> fyi a bad type hint made it into JsonConfig on the next and pretest-cut branches
2025-02-19 19:51:36 <bvibber> wanna make sure that doesn't make it to production (it's live on beta, revert is merging)
2025-02-19 19:53:14 <dancy> Thanks bvibber!
2025-02-19 19:53:50 <dancy> The images created from those branches do not currently run anywhere.
2025-02-19 19:54:51 <bvibber> yay
2025-02-19 19:57:00 <bvibber> will the release branches be cut straight from master later? if so we should be good then :D
2025-02-19 19:57:26 <dduvall> oh fun. the queued castor-save-workspace-cache builds are blocking the completion of all the other jobs. i will cancel them
2025-02-19 19:57:34 <logmsgbot> !log fab@deploy2002 Started deploy [airflow-dags/research@b5ce354]: (no justification provided)
2025-02-19 19:57:43 <logmsgbot> !log fab@deploy2002 Finished deploy [airflow-dags/research@b5ce354]: (no justification provided) (duration: 00m 10s)
2025-02-19 19:58:05 <dduvall> !log cancelling queued castor builds to unblock completed builds and jenkins restart
2025-02-19 19:58:07 <stashbot> Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
2025-02-19 19:58:42 <logmsgbot> !log fab@deploy2002 Started deploy [airflow-dags/research@b5ce354]: (no justification provided)
2025-02-19 19:58:51 <logmsgbot> !log fab@deploy2002 Finished deploy [airflow-dags/research@b5ce354]: (no justification provided) (duration: 00m 10s)
2025-02-19 19:59:17 <wikibugs> ('PS1) ''BCornwall: provision: Adjust thermal profile for F4 [cookbooks] - ''https://gerrit.wikimedia.org/r/1121086 (https://phabricator.wikimedia.org/T373993)'
2025-02-19 19:59:34 <logmsgbot> !log fab@deploy2002 Started deploy [airflow-dags/research@b5ce354]: (no justification provided)
2025-02-19 19:59:37 <wikibugs> 'ops-esams, ''ops-magru, ''SRE, ''DC-Ops, and 2 others: CPU temperature issues in cp hosts - https://phabricator.wikimedia.org/T373993#10565447 (''BCornwall) I don't see any change in performance - The throttling notifications only come sparingly so I doubt we'd see much of a difference until resources be...'
2025-02-19 19:59:43 <logmsgbot> !log fab@deploy2002 Finished deploy [airflow-dags/research@b5ce354]: (no justification provided) (duration: 00m 10s)
2025-02-19 20:01:26 <logmsgbot> !log fab@deploy2002 Started deploy [airflow-dags/research@b5ce354]: (no justification provided)
2025-02-19 20:01:36 <logmsgbot> !log fab@deploy2002 Finished deploy [airflow-dags/research@b5ce354]: (no justification provided) (duration: 00m 11s)
2025-02-19 20:03:03 <logmsgbot> !log fab@deploy2002 Started deploy [airflow-dags/research@b5ce354]: (no justification provided)
2025-02-19 20:03:26 <icinga-wm> PROBLEM - jenkins_service_running on contint1002 is CRITICAL: PROCS CRITICAL: 0 processes with regex args .*/bin/java .*-jar /usr/share/java/jenkins.war https://wikitech.wikimedia.org/wiki/Jenkins
2025-02-19 20:03:42 <logmsgbot> !log fab@deploy2002 Finished deploy [airflow-dags/research@b5ce354]: (no justification provided) (duration: 00m 40s)
2025-02-19 20:03:48 <dduvall> !log restarting jenkins via systemctl due to crash
2025-02-19 20:03:51 <stashbot> Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
2025-02-19 20:04:26 <icinga-wm> RECOVERY - jenkins_service_running on contint1002 is OK: PROCS OK: 1 process with regex args .*/bin/java .*-jar /usr/share/java/jenkins.war https://wikitech.wikimedia.org/wiki/Jenkins
2025-02-19 20:06:16 <mutante> ah! thanks dduvall
2025-02-19 20:06:17 <dduvall> !log jenkins successfully restarted via `systemctl restart jenkins`
2025-02-19 20:06:19 <stashbot> Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
2025-02-19 20:06:26 <dduvall> mutante: np
2025-02-19 20:06:49 <dduvall> seems the "safe" restart was not so safe :)
2025-02-19 20:06:58 <mutante> hah, ack :)
2025-02-19 20:08:53 <wikibugs> ('CR) ''Dzahn: "yep, sounds good. on hold." [puppet] - ''https://gerrit.wikimedia.org/r/1120592 (https://phabricator.wikimedia.org/T386472) (owner: ''Dzahn)'
2025-02-19 20:11:28 <wikibugs> ('CR) ''Dzahn: [C:''-1] "Brandon said I should not return "normal" but nothing as it's a special value" [puppet] - ''https://gerrit.wikimedia.org/r/1117941 (https://phabricator.wikimedia.org/T274228) (owner: ''Dzahn)'
2025-02-19 20:12:57 <wikibugs> ('PS3) ''Dzahn: varnish: create new policy that allows websockets but also caches [puppet] - ''https://gerrit.wikimedia.org/r/1117941 (https://phabricator.wikimedia.org/T274228)'
2025-02-19 20:15:05 <wikibugs> 'SRE, ''SRE-Access-Requests: Requesting access to deployment for arthurtaylor - https://phabricator.wikimedia.org/T386349#10565491 (''Dzahn) Arthur confirmed via email that this is the correct key and it has not been used elsewhere / in cloud before. Checking that box off as well.'
2025-02-19 20:15:51 <wikibugs> 'SRE, ''SRE-Access-Requests: Requesting access to deployment for arthurtaylor - https://phabricator.wikimedia.org/T386349#10565492 (''Dzahn)'
2025-02-19 20:17:57 <wikibugs> 'SRE, ''SRE-Access-Requests, ''LDAP-Access-Requests: Requesting access to Dashboards in Superset / Hive interfaces (like Hue) that do access private data for Mariya Shilova - https://phabricator.wikimedia.org/T386754#10565495 (''Dzahn)'
2025-02-19 20:19:09 <wikibugs> 'SRE, ''SRE-Access-Requests, ''LDAP-Access-Requests: Requesting access to Dashboards in Superset / Hive interfaces (like Hue) that do access private data for Mariya Shilova - https://phabricator.wikimedia.org/T386754#10565497 (''Dzahn) Hello @Ahoelzl, this request will need your approval. Please comment her...'
2025-02-19 20:20:35 <wikibugs> 'SRE, ''SRE-Access-Requests, ''LDAP-Access-Requests: Requesting access to Dashboards in Superset / Hive interfaces (like Hue) that do access private data for Mariya Shilova - https://phabricator.wikimedia.org/T386754#10565499 (''Dzahn) Hello @MShilova_WMF, please take a look at L3 and sign it if you agree.'
2025-02-19 20:21:15 <wikibugs> ('CR) ''Dzahn: varnish: create new policy that allows websockets but also caches [puppet] - ''https://gerrit.wikimedia.org/r/1117941 (https://phabricator.wikimedia.org/T274228) (owner: ''Dzahn)'
2025-02-19 20:23:21 <wikibugs> ('PS1) ''Bking: cirrus: add commands to configure opensearch keystore [puppet] - ''https://gerrit.wikimedia.org/r/1121087 (https://phabricator.wikimedia.org/T380752)'
2025-02-19 20:23:43 <wikibugs> ('CR) ''CI reject: [V:''-1] cirrus: add commands to configure opensearch keystore [puppet] - ''https://gerrit.wikimedia.org/r/1121087 (https://phabricator.wikimedia.org/T380752) (owner: ''Bking)'
2025-02-19 20:24:52 <wikibugs> 'SRE, ''SRE-Access-Requests: Requesting access to deployment for arthurtaylor - https://phabricator.wikimedia.org/T386349#10565521 (''Dzahn) Noticed now Arthur already has other non-deployment but production shell access, using this key: ` ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIL/OQjQqWzDvDCW9JNQxNAXEwlJ1BL2D...'
2025-02-19 20:28:00 <wikibugs> ('PS1) ''Dzahn: admin: upgrade arthurtaylor from restricted to deployment [puppet] - ''https://gerrit.wikimedia.org/r/1121088 (https://phabricator.wikimedia.org/T386349)'
2025-02-19 20:29:16 <wikibugs> ('CR) ''Dzahn: "This assumes the existing prod access key stays the same. (The new access request lists a new key)." [puppet] - ''https://gerrit.wikimedia.org/r/1121088 (https://phabricator.wikimedia.org/T386349) (owner: ''Dzahn)'
2025-02-19 20:30:26 <wikibugs> 'SRE, ''SRE-Access-Requests, ''Patch-For-Review: Requesting access to deployment for arthurtaylor - https://phabricator.wikimedia.org/T386349#10565532 (''Dzahn) ''Open→''In progress'
2025-02-19 20:46:37 <seanleong-wmde> Hi, I am sorry for the inconvenience caused, I didn't realize that I cannot use the phab to test, I will be using https://phab.wmflabs.org/ to test from now on.
2025-02-19 20:53:45 <wikibugs> ('CR) ''JHathaway: [C:''+1] puppetserver: fix puppet dir dependency issue in cloudvps masters [puppet] - ''https://gerrit.wikimedia.org/r/1121079 (https://phabricator.wikimedia.org/T382960) (owner: ''Dzahn)'
2025-02-19 20:58:16 <wikibugs> ('PS2) ''Bking: cirrus: add commands to configure opensearch keystore [puppet] - ''https://gerrit.wikimedia.org/r/1121087 (https://phabricator.wikimedia.org/T380752)'
2025-02-19 21:00:05 <jouncebot> RoanKattouw, Urbanecm, cjming, TheresNoTime, and kindrobot: That opportune time for a UTC late backport window deploy is upon us again. Don't be afraid. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T2100).
2025-02-19 21:00:05 <jouncebot> tgr and Jdlrobson: A patch you scheduled for UTC late backport window is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker.
2025-02-19 21:00:20 <tgr|away> o/
2025-02-19 21:00:45 <Jdlrobson> o/
2025-02-19 21:03:04 <cjming> o/
2025-02-19 21:03:09 <cjming> hi - i can deploy
2025-02-19 21:03:39 <wikibugs> ('PS2) ''Gergő Tisza: NewUserMessage: Enable on test2wiki [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1121055'
2025-02-19 21:03:46 <wikibugs> ('PS1) ''Arlolra: Revert parsoid read views on frwiktionary [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1121092 (https://phabricator.wikimedia.org/T356718)'
2025-02-19 21:04:08 <arlolra> o/
2025-02-19 21:04:46 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] "Approved by cjming@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1121055 (owner: ''Gergő Tisza)'
2025-02-19 21:05:00 <Jdlrobson> thanks cjming :)
2025-02-19 21:05:16 <wikibugs> ('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Thursday, February 20 UTC morning backport window](https://wikitech.wikimedia.org/wiki/Deployments#deploy" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1121092 (https://phabricator.wikimedia.org/T356718) (owner: ''Arlolra)'
2025-02-19 21:05:51 <wikibugs> ('Merged) ''jenkins-bot: NewUserMessage: Enable on test2wiki [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1121055 (owner: ''Gergő Tisza)'
2025-02-19 21:06:20 <logmsgbot> !log cjming@deploy2002 Started scap sync-world: Backport for [[gerrit:1121055|NewUserMessage: Enable on test2wiki]]
2025-02-19 21:07:11 <arlolra> cjming: I just tacked on a config change, hopefully we can squeeze that in
2025-02-19 21:08:36 <cjming> tgr: on test servers if you want to check
2025-02-19 21:08:45 <cjming> arlolra: np!
2025-02-19 21:09:24 <logmsgbot> !log cjming@deploy2002 tgr, cjming: Backport for [[gerrit:1121055|NewUserMessage: Enable on test2wiki]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
2025-02-19 21:10:13 <wikibugs> ('PS3) ''Jdlrobson: Footer: Wikimedia icon should collapse at lower resolutions [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1119579 (https://phabricator.wikimedia.org/T384619)'
2025-02-19 21:10:27 <cjming> tgr: ok to sync?
2025-02-19 21:10:36 <tgr|away> thanks cjming! looks good
2025-02-19 21:10:40 <logmsgbot> !log cjming@deploy2002 tgr, cjming: Continuing with sync
2025-02-19 21:11:16 <icinga-wm> PROBLEM - Router interfaces on cr1-codfw is CRITICAL: CRITICAL: host 208.80.153.192, interfaces up: 128, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
2025-02-19 21:11:57 <wikibugs> ('PS3) ''Bking: cirrus: add commands to configure opensearch keystore [puppet] - ''https://gerrit.wikimedia.org/r/1121087 (https://phabricator.wikimedia.org/T380752)'
2025-02-19 21:12:10 <icinga-wm> PROBLEM - Router interfaces on cr1-eqiad is CRITICAL: CRITICAL: host 208.80.154.196, interfaces up: 219, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
2025-02-19 21:12:17 <wikibugs> ('CR) ''CI reject: [V:''-1] cirrus: add commands to configure opensearch keystore [puppet] - ''https://gerrit.wikimedia.org/r/1121087 (https://phabricator.wikimedia.org/T380752) (owner: ''Bking)'
2025-02-19 21:14:42 <wikibugs> ('CR) ''Ahmon Dancy: [C:''+1] admin: upgrade arthurtaylor from restricted to deployment [puppet] - ''https://gerrit.wikimedia.org/r/1121088 (https://phabricator.wikimedia.org/T386349) (owner: ''Dzahn)'
2025-02-19 21:17:12 <logmsgbot> !log cjming@deploy2002 Finished scap sync-world: Backport for [[gerrit:1121055|NewUserMessage: Enable on test2wiki]] (duration: 10m 52s)
2025-02-19 21:17:58 <cjming> Jdlrobson: ok if i do your 2 config patches together? i'm a little time-crunched
2025-02-19 21:18:31 <cjming> i can also do separately - np
2025-02-19 21:19:15 <jinxer-wm> FIRING: PHPFPMTooBusy: Not enough idle PHP-FPM workers for Mediawiki mw-api-ext/canary at eqiad: 16.07% idle - https://bit.ly/wmf-fpmsat - https://grafana.wikimedia.org/d/U7JT--knk/mw-on-k8s?orgId=1&viewPanel=84&var-dc=eqiad%20prometheus/k8s&var-service=mediawiki&var-namespace=mw-api-ext&var-container_name=All&var-release=canary - https://alerts.wikimedia.org/?q=alertname%3DPHPFPMTooBusy
2025-02-19 21:20:40 <Jdlrobson> cjming: yep
2025-02-19 21:20:43 <Jdlrobson> they can all go out together
2025-02-19 21:20:53 <cjming> cool - thx!
2025-02-19 21:21:05 <wikibugs> ('PS4) ''Bernard Wang: Update Search AB test config, increase bucketing/sampling rates for eu/ca, deploy to testwiki [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120609 (https://phabricator.wikimedia.org/T386734)'
2025-02-19 21:23:38 <wikibugs> 'ops-eqiad, ''SRE, ''DC-Ops: PDU sensor over limit - https://phabricator.wikimedia.org/T383383#10565691 (''phaultfinder)'
2025-02-19 21:24:15 <jinxer-wm> RESOLVED: PHPFPMTooBusy: Not enough idle PHP-FPM workers for Mediawiki mw-api-ext/canary at eqiad: 16.07% idle - https://bit.ly/wmf-fpmsat - https://grafana.wikimedia.org/d/U7JT--knk/mw-on-k8s?orgId=1&viewPanel=84&var-dc=eqiad%20prometheus/k8s&var-service=mediawiki&var-namespace=mw-api-ext&var-container_name=All&var-release=canary - https://alerts.wikimedia.org/?q=alertname%3DPHPFPMTooBusy
2025-02-19 21:24:16 <jinxer-wm> FIRING: MediaWikiLatencyExceeded: p75 latency high: eqiad mw-parsoid/main (k8s) 1.261s - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook#Average_latency_exceeded - https://grafana.wikimedia.org/d/U7JT--knk/mw-on-k8s?orgId=1&viewPanel=55&var-dc=eqiad%20prometheus/k8s&var-service=mediawiki&var-namespace=mw-parsoid&var-release=main - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiLatencyExceeded
2025-02-19 21:26:48 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] "Approved by cjming@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1119579 (https://phabricator.wikimedia.org/T384619) (owner: ''Jdlrobson)'
2025-02-19 21:26:48 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] "Approved by cjming@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120609 (https://phabricator.wikimedia.org/T386734) (owner: ''Bernard Wang)'
2025-02-19 21:27:35 <wikibugs> ('Merged) ''jenkins-bot: Footer: Wikimedia icon should collapse at lower resolutions [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1119579 (https://phabricator.wikimedia.org/T384619) (owner: ''Jdlrobson)'
2025-02-19 21:27:40 <wikibugs> ('Merged) ''jenkins-bot: Update Search AB test config, increase bucketing/sampling rates for eu/ca, deploy to testwiki [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1120609 (https://phabricator.wikimedia.org/T386734) (owner: ''Bernard Wang)'
2025-02-19 21:28:07 <logmsgbot> !log cjming@deploy2002 Started scap sync-world: Backport for [[gerrit:1119579|Footer: Wikimedia icon should collapse at lower resolutions (T384619)]], [[gerrit:1120609|Update Search AB test config, increase bucketing/sampling rates for eu/ca, deploy to testwiki (T386734)]]
2025-02-19 21:28:12 <stashbot> T384619: Update skins to support different logos at different resolutions - https://phabricator.wikimedia.org/T384619
2025-02-19 21:28:12 <stashbot> T386734: Deploy updated Search A/B test to eu/ca/test wiki - https://phabricator.wikimedia.org/T386734
2025-02-19 21:28:47 <wikibugs> ('CR) ''Clare Ming: [C:''+2] Lazy image loading Grade C fallback is broken [extensions/MobileFrontend] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1121077 (https://phabricator.wikimedia.org/T386400) (owner: ''Jdlrobson)'
2025-02-19 21:29:16 <jinxer-wm> RESOLVED: MediaWikiLatencyExceeded: p75 latency high: eqiad mw-parsoid/main (k8s) 1.261s - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook#Average_latency_exceeded - https://grafana.wikimedia.org/d/U7JT--knk/mw-on-k8s?orgId=1&viewPanel=55&var-dc=eqiad%20prometheus/k8s&var-service=mediawiki&var-namespace=mw-parsoid&var-release=main - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiLatencyExceeded
2025-02-19 21:30:26 <cjming> Jdlrobson: your config patches are up on test servers if you'd like to check
2025-02-19 21:30:41 <Jdlrobson> cjming: on it
2025-02-19 21:31:10 <logmsgbot> !log cjming@deploy2002 jdlrobson, cjming, bwang: Backport for [[gerrit:1119579|Footer: Wikimedia icon should collapse at lower resolutions (T384619)]], [[gerrit:1120609|Update Search AB test config, increase bucketing/sampling rates for eu/ca, deploy to testwiki (T386734)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
2025-02-19 21:31:49 <Jdlrobson> cjming: unfortunately https://gerrit.wikimedia.org/r/1119579 doesn't look like it's working correctly :( I messed up the syntax
2025-02-19 21:31:58 <Jdlrobson> other one looks good though
2025-02-19 21:32:04 <Jdlrobson> sorry.. what's best in this situation?
2025-02-19 21:32:31 <cjming> shoot - i should have done it separately - i guess can i sync and revert 1119579?
2025-02-19 21:32:43 <Jdlrobson> i can also do a follow up if helpful
2025-02-19 21:32:50 <cjming> let's do a follow up
2025-02-19 21:32:57 <cjming> i'm assuming it will be quick
2025-02-19 21:33:01 <cjming> so i can sync for now?
2025-02-19 21:33:14 <Jdlrobson> yes
2025-02-19 21:33:18 <logmsgbot> !log cjming@deploy2002 jdlrobson, cjming, bwang: Continuing with sync
2025-02-19 21:33:31 <Jdlrobson> (as long as we revert https://gerrit.wikimedia.org/r/1119579 quickly after)
2025-02-19 21:34:04 <wikibugs> ('PS4) ''Bking: cirrus: add commands to configure opensearch keystore [puppet] - ''https://gerrit.wikimedia.org/r/1121087 (https://phabricator.wikimedia.org/T380752)'
2025-02-19 21:34:22 <cjming> sure thing - i guess it's just still broken?
2025-02-19 21:34:25 <wikibugs> ('CR) ''CI reject: [V:''-1] cirrus: add commands to configure opensearch keystore [puppet] - ''https://gerrit.wikimedia.org/r/1121087 (https://phabricator.wikimedia.org/T380752) (owner: ''Bking)'
2025-02-19 21:35:58 <Jdlrobson> no no it's working now but https://gerrit.wikimedia.org/r/1119579 is very broken
2025-02-19 21:36:19 <cjming> oh whoops - ok - i'll revert as soon as it finishes syncing
2025-02-19 21:36:48 <cjming> and then do your other backport
2025-02-19 21:37:57 <wikibugs> ('PS5) ''Bking: cirrus: add commands to configure opensearch keystore [puppet] - ''https://gerrit.wikimedia.org/r/1121087 (https://phabricator.wikimedia.org/T380752)'
2025-02-19 21:38:19 <wikibugs> ('CR) ''CI reject: [V:''-1] cirrus: add commands to configure opensearch keystore [puppet] - ''https://gerrit.wikimedia.org/r/1121087 (https://phabricator.wikimedia.org/T380752) (owner: ''Bking)'
2025-02-19 21:39:41 <wikibugs> ('Merged) ''jenkins-bot: Lazy image loading Grade C fallback is broken [extensions/MobileFrontend] (wmf/1.44.0-wmf.17) - ''https://gerrit.wikimedia.org/r/1121077 (https://phabricator.wikimedia.org/T386400) (owner: ''Jdlrobson)'
2025-02-19 21:39:53 <wikibugs> ('PS6) ''Bking: cirrus: add commands to configure opensearch keystore [puppet] - ''https://gerrit.wikimedia.org/r/1121087 (https://phabricator.wikimedia.org/T380752)'
2025-02-19 21:39:55 <logmsgbot> !log cjming@deploy2002 Finished scap sync-world: Backport for [[gerrit:1119579|Footer: Wikimedia icon should collapse at lower resolutions (T384619)]], [[gerrit:1120609|Update Search AB test config, increase bucketing/sampling rates for eu/ca, deploy to testwiki (T386734)]] (duration: 11m 47s)
2025-02-19 21:40:00 <stashbot> T384619: Update skins to support different logos at different resolutions - https://phabricator.wikimedia.org/T384619
2025-02-19 21:40:00 <stashbot> T386734: Deploy updated Search A/B test to eu/ca/test wiki - https://phabricator.wikimedia.org/T386734
2025-02-19 21:40:14 <wikibugs> ('PS1) ''TrainBranchBot: Revert "Footer: Wikimedia icon should collapse at lower resolutions" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1121094'
2025-02-19 21:40:14 <wikibugs> ('CR) ''TrainBranchBot: "cjming@deploy2002 created a revert of this change as I6e16295ded46abf6ad2f7245921315ffca20d8b5" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1119579 (https://phabricator.wikimedia.org/T384619) (owner: ''Jdlrobson)'
2025-02-19 21:40:14 <wikibugs> ('CR) ''CI reject: [V:''-1] cirrus: add commands to configure opensearch keystore [puppet] - ''https://gerrit.wikimedia.org/r/1121087 (https://phabricator.wikimedia.org/T380752) (owner: ''Bking)'
2025-02-19 21:40:30 <Jdlrobson> cjming: looking at it
2025-02-19 21:40:46 <cjming> looking at what?
2025-02-19 21:40:51 <Jdlrobson> the follow up
2025-02-19 21:40:55 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] "Approved by cjming@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1121094 (owner: ''TrainBranchBot)'
2025-02-19 21:41:26 <cjming> Jdlrobson: reverting 1119579 now
2025-02-19 21:41:27 <wikibugs> ('PS7) ''Bking: cirrus: add commands to configure opensearch keystore [puppet] - ''https://gerrit.wikimedia.org/r/1121087 (https://phabricator.wikimedia.org/T380752)'
2025-02-19 21:41:41 <wikibugs> ('Merged) ''jenkins-bot: Revert "Footer: Wikimedia icon should collapse at lower resolutions" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1121094 (owner: ''TrainBranchBot)'
2025-02-19 21:42:13 <logmsgbot> !log cjming@deploy2002 Started scap sync-world: Backport for [[gerrit:1121094|Revert "Footer: Wikimedia icon should collapse at lower resolutions"]]
2025-02-19 21:45:12 <logmsgbot> !log cjming@deploy2002 trainbranchbot, cjming: Backport for [[gerrit:1121094|Revert "Footer: Wikimedia icon should collapse at lower resolutions"]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
2025-02-19 21:45:16 <logmsgbot> !log cjming@deploy2002 trainbranchbot, cjming: Continuing with sync
2025-02-19 21:45:38 <Jdlrobson> cjming: ok looks like it is fixed in production now phew
2025-02-19 21:46:11 <cjming> ya - sorry about that - tried to cut corners and it ends up taking longer anyway
2025-02-19 21:46:54 <logmsgbot> !log fab@deploy2002 Started deploy [airflow-dags/research@b5ce354]: (no justification provided)
2025-02-19 21:47:34 <logmsgbot> !log fab@deploy2002 Finished deploy [airflow-dags/research@b5ce354]: (no justification provided) (duration: 01m 19s)
2025-02-19 21:47:41 <cjming> Jdlrobson: as soon as revert finishes, i'll move onto your backport - should be quick
2025-02-19 21:48:31 <Jdlrobson> cjming: sounds good
2025-02-19 21:49:32 <wikibugs> ('PS1) ''BryanDavis: toolhub: Add config for crawler jobs history limits [deployment-charts] - ''https://gerrit.wikimedia.org/r/1121095 (https://phabricator.wikimedia.org/T292861)'
2025-02-19 21:49:33 <wikibugs> ('PS1) ''BryanDavis: toolhub: Reduce crawler history limits [deployment-charts] - ''https://gerrit.wikimedia.org/r/1121096 (https://phabricator.wikimedia.org/T292861)'
2025-02-19 21:49:36 <wikibugs> ('PS1) ''BryanDavis: toolhub: Bump container to 2025-02-19-214003-production [deployment-charts] - ''https://gerrit.wikimedia.org/r/1121097 (https://phabricator.wikimedia.org/T292861)'
2025-02-19 21:51:37 <wikibugs> ('PS1) ''Jdlrobson: Take 2: Footer: Wikimedia icon should collapse at lower resolutions"" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1121098 (https://phabricator.wikimedia.org/T384619)'
2025-02-19 21:52:01 <logmsgbot> !log cjming@deploy2002 Finished scap sync-world: Backport for [[gerrit:1121094|Revert "Footer: Wikimedia icon should collapse at lower resolutions"]] (duration: 09m 48s)
2025-02-19 21:52:37 <logmsgbot> !log cjming@deploy2002 Started scap sync-world: Backport for [[gerrit:1121077|Lazy image loading Grade C fallback is broken (T386400)]]
2025-02-19 21:52:41 <stashbot> T386400: [Regression] Lazy image loading Grade C fallback is broken - https://phabricator.wikimedia.org/T386400
2025-02-19 21:54:53 <cjming> Jdlrobson: backport on mwdebug if you want to check
2025-02-19 21:54:57 <Jdlrobson> on it
2025-02-19 21:55:39 <logmsgbot> !log cjming@deploy2002 cjming, jdlrobson: Backport for [[gerrit:1121077|Lazy image loading Grade C fallback is broken (T386400)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
2025-02-19 21:56:02 <wikibugs> ('PS2) ''Arlolra: Revert parsoid read views on frwiktionary [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1121092 (https://phabricator.wikimedia.org/T356718)'
2025-02-19 21:56:20 <cjming> arlolra: still around? i can do your patch next
2025-02-19 21:56:29 <arlolra> yup, thanks
2025-02-19 21:57:40 <Jdlrobson> cjming: please sync
2025-02-19 21:57:44 <logmsgbot> !log cjming@deploy2002 cjming, jdlrobson: Continuing with sync
2025-02-19 22:00:05 <jouncebot> Deploy window Wikifunctions Services UTC Late (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T2200)
2025-02-19 22:00:38 <cjming> Jdlrobson: i need to run in a few minutes after i do arlolra's config patch -- sorry for the footer icon debacle - i thought there'd be more time to fix the revert
2025-02-19 22:01:09 <cjming> but the rest of your changes should be live -- backport will be live shortly
2025-02-19 22:01:26 <Jdlrobson> np
2025-02-19 22:01:28 <Jdlrobson> the footer icon can wait
2025-02-19 22:01:31 <Jdlrobson> not urgent!
2025-02-19 22:01:47 <cjming> cool - thx
2025-02-19 22:02:18 <cjming> if Abstract Wikipedia folks are around - is it ok to do one more config patch?
2025-02-19 22:04:11 <arlolra> I think they would agree in the abstract
2025-02-19 22:04:19 <logmsgbot> !log cjming@deploy2002 Finished scap sync-world: Backport for [[gerrit:1121077|Lazy image loading Grade C fallback is broken (T386400)]] (duration: 11m 41s)
2025-02-19 22:04:23 <stashbot> T386400: [Regression] Lazy image loading Grade C fallback is broken - https://phabricator.wikimedia.org/T386400
2025-02-19 22:04:30 <wikibugs> ('CR) ''TrainBranchBot: [C:''+2] "Approved by cjming@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1121092 (https://phabricator.wikimedia.org/T356718) (owner: ''Arlolra)'
2025-02-19 22:05:16 <wikibugs> ('Merged) ''jenkins-bot: Revert parsoid read views on frwiktionary [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1121092 (https://phabricator.wikimedia.org/T356718) (owner: ''Arlolra)'
2025-02-19 22:05:42 <logmsgbot> !log cjming@deploy2002 Started scap sync-world: Backport for [[gerrit:1121092|Revert parsoid read views on frwiktionary (T356718 T386272)]]
2025-02-19 22:05:47 <stashbot> T356718: Support nested special page transclusion - https://phabricator.wikimedia.org/T356718
2025-02-19 22:05:48 <stashbot> T386272: Parsoid Read Views to Wiktionary deploy ~2025-02-13 - https://phabricator.wikimedia.org/T386272
2025-02-19 22:06:16 <jinxer-wm> FIRING: MediaWikiLatencyExceeded: p75 latency high: eqiad mw-parsoid/main (k8s) 1.378s - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook#Average_latency_exceeded - https://grafana.wikimedia.org/d/U7JT--knk/mw-on-k8s?orgId=1&viewPanel=55&var-dc=eqiad%20prometheus/k8s&var-service=mediawiki&var-namespace=mw-parsoid&var-release=main - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiLatencyExceeded
2025-02-19 22:08:12 <cjming> arlolra: on test servers if you want to verify - lmk if i can sync
2025-02-19 22:08:15 <wikibugs> ('CR) ''Bking: [V:''+2 C:''+2] "self-merging in the interest of time, as this will not affect any production hosts." [puppet] - ''https://gerrit.wikimedia.org/r/1121087 (https://phabricator.wikimedia.org/T380752) (owner: ''Bking)'
2025-02-19 22:08:35 <arlolra> cjming: looks good, please continue
2025-02-19 22:08:43 <logmsgbot> !log cjming@deploy2002 arlolra, cjming: Backport for [[gerrit:1121092|Revert parsoid read views on frwiktionary (T356718 T386272)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
2025-02-19 22:09:19 <logmsgbot> !log cjming@deploy2002 arlolra, cjming: Continuing with sync
2025-02-19 22:10:16 <wikibugs> 'SRE, ''SRE-Access-Requests, ''LDAP-Access-Requests: Requesting access to Dashboards in Superset / Hive interfaces (like Hue) that do access private data for Mariya Shilova - https://phabricator.wikimedia.org/T386754#10565822 (''MShilova_WMF) Thank you, @Dzahn . I confirm that I signed the document.'
2025-02-19 22:11:16 <jinxer-wm> RESOLVED: MediaWikiLatencyExceeded: p75 latency high: eqiad mw-parsoid/main (k8s) 1.378s - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook#Average_latency_exceeded - https://grafana.wikimedia.org/d/U7JT--knk/mw-on-k8s?orgId=1&viewPanel=55&var-dc=eqiad%20prometheus/k8s&var-service=mediawiki&var-namespace=mw-parsoid&var-release=main - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiLatencyExceeded
2025-02-19 22:15:53 <logmsgbot> !log cjming@deploy2002 Finished scap sync-world: Backport for [[gerrit:1121092|Revert parsoid read views on frwiktionary (T356718 T386272)]] (duration: 10m 10s)
2025-02-19 22:16:02 <cjming> arlolra: should be live :)
2025-02-19 22:16:06 <arlolra> cjming: thank you!
2025-02-19 22:16:10 <icinga-wm> PROBLEM - Router interfaces on cr1-eqiad is CRITICAL: CRITICAL: host 208.80.154.196, interfaces up: 219, down: 1, dormant: 0, excluded: 0, unused: 0: https://wikitech.wikimedia.org/wiki/Network_monitoring%23Router_interface_down
2025-02-19 22:16:10 <cjming> yw!
2025-02-19 22:16:46 <cjming> !log end of UTC late backport window
2025-02-19 22:18:07 <wikibugs> ('CR) ''BryanDavis: [C:''+2] toolhub: Add config for crawler jobs history limits [deployment-charts] - ''https://gerrit.wikimedia.org/r/1121095 (https://phabricator.wikimedia.org/T292861) (owner: ''BryanDavis)'
2025-02-19 22:19:06 <wikibugs> ('CR) ''BryanDavis: [C:''+2] toolhub: Reduce crawler history limits [deployment-charts] - ''https://gerrit.wikimedia.org/r/1121096 (https://phabricator.wikimedia.org/T292861) (owner: ''BryanDavis)'
2025-02-19 22:19:20 <wikibugs> ('Merged) ''jenkins-bot: toolhub: Add config for crawler jobs history limits [deployment-charts] - ''https://gerrit.wikimedia.org/r/1121095 (https://phabricator.wikimedia.org/T292861) (owner: ''BryanDavis)'
2025-02-19 22:20:15 <wikibugs> ('Merged) ''jenkins-bot: toolhub: Reduce crawler history limits [deployment-charts] - ''https://gerrit.wikimedia.org/r/1121096 (https://phabricator.wikimedia.org/T292861) (owner: ''BryanDavis)'
2025-02-19 22:20:50 <wikibugs> ('CR) ''BryanDavis: [C:''+2] toolhub: Bump container to 2025-02-19-214003-production [deployment-charts] - ''https://gerrit.wikimedia.org/r/1121097 (https://phabricator.wikimedia.org/T292861) (owner: ''BryanDavis)'
2025-02-19 22:22:05 <wikibugs> ('Merged) ''jenkins-bot: toolhub: Bump container to 2025-02-19-214003-production [deployment-charts] - ''https://gerrit.wikimedia.org/r/1121097 (https://phabricator.wikimedia.org/T292861) (owner: ''BryanDavis)'
2025-02-19 22:25:26 <wikibugs> ('CR) ''Alexandros Kosiaris: [C:''-1] "Minor pedantic comment, plus waiting for Moritz's answer on the sysuser thing, but otherwise LGTM" [puppet] - ''https://gerrit.wikimedia.org/r/1094531 (https://phabricator.wikimedia.org/T383945) (owner: ''Ahmon Dancy)'
2025-02-19 22:30:02 <logmsgbot> !log fab@deploy2002 Started deploy [airflow-dags/research@b5ce354]: (no justification provided)
2025-02-19 22:30:21 <logmsgbot> !log bd808@deploy2002 helmfile [staging] START helmfile.d/services/toolhub: apply
2025-02-19 22:30:37 <logmsgbot> !log fab@deploy2002 Finished deploy [airflow-dags/research@b5ce354]: (no justification provided) (duration: 00m 38s)
2025-02-19 22:30:59 <wikibugs> ('CR) ''Alexandros Kosiaris: [C:''-1] "I 've just noticed that what also be required is an include of" [puppet] - ''https://gerrit.wikimedia.org/r/1094531 (https://phabricator.wikimedia.org/T383945) (owner: ''Ahmon Dancy)'
2025-02-19 22:31:31 <logmsgbot> !log bd808@deploy2002 helmfile [staging] DONE helmfile.d/services/toolhub: apply
2025-02-19 22:34:03 <logmsgbot> !log bd808@deploy2002 helmfile [codfw] START helmfile.d/services/toolhub: apply
2025-02-19 22:35:18 <icinga-wm> PROBLEM - Host ganeti1025 is DOWN: PING CRITICAL - Packet loss = 100%
2025-02-19 22:35:34 <icinga-wm> PROBLEM - BGP status on cr1-eqiad is CRITICAL: BGP CRITICAL - AS6939/IPv4: Active - HE, AS6939/IPv6: Connect - HE https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
2025-02-19 22:35:40 <logmsgbot> !log bd808@deploy2002 helmfile [codfw] DONE helmfile.d/services/toolhub: apply
2025-02-19 22:35:48 <logmsgbot> !log bd808@deploy2002 helmfile [eqiad] START helmfile.d/services/toolhub: apply
2025-02-19 22:36:16 <icinga-wm> PROBLEM - Host mr1-ulsfo.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
2025-02-19 22:36:16 <icinga-wm> PROBLEM - BGP status on cr2-eqiad is CRITICAL: BGP CRITICAL - AS64606/IPv4: Connect - kubernetes-ml-eqiad, AS64605/IPv4: Connect - Anycast https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
2025-02-19 22:36:44 <icinga-wm> RECOVERY - Host ganeti1025 is UP: PING OK - Packet loss = 0%, RTA = 0.29 ms
2025-02-19 22:36:58 <logmsgbot> !log bd808@deploy2002 helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
2025-02-19 22:37:06 <icinga-wm> PROBLEM - Host ml-serve-ctrl1001 is DOWN: PING CRITICAL - Packet loss = 100%
2025-02-19 22:37:24 <icinga-wm> PROBLEM - Host centrallog1002 is DOWN: PING CRITICAL - Packet loss = 100%
2025-02-19 22:37:36 <icinga-wm> PROBLEM - Etcd cluster health on kubestagemaster1004 is CRITICAL: The etcd server is unhealthy https://wikitech.wikimedia.org/wiki/Etcd
2025-02-19 22:38:24 <icinga-wm> RECOVERY - Host centrallog1002 is UP: PING OK - Packet loss = 0%, RTA = 0.38 ms
2025-02-19 22:38:30 <icinga-wm> PROBLEM - Host netboxdb1003 is DOWN: PING CRITICAL - Packet loss = 100%
2025-02-19 22:38:36 <icinga-wm> RECOVERY - Etcd cluster health on kubestagemaster1004 is OK: The etcd server is healthy https://wikitech.wikimedia.org/wiki/Etcd
2025-02-19 22:39:22 <icinga-wm> PROBLEM - BFD status on cr1-eqiad is CRITICAL: Down: 1 https://wikitech.wikimedia.org/wiki/Network_monitoring%23BFD_status
2025-02-19 22:39:32 <jinxer-wm> FIRING: [3x] ProbeDown: Service kubestagemaster1004:6443 has failed probes (http_staging_eqiad_kube_apiserver_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
2025-02-19 22:39:34 <wikibugs> ('PS1) ''Bking: cirrus: rename s3 resources [puppet] - ''https://gerrit.wikimedia.org/r/1121101 (https://phabricator.wikimedia.org/T380752)'
2025-02-19 22:39:36 <icinga-wm> RECOVERY - Host ml-serve-ctrl1001 is UP: PING OK - Packet loss = 0%, RTA = 0.88 ms
2025-02-19 22:40:00 <icinga-wm> RECOVERY - Host netboxdb1003 is UP: PING OK - Packet loss = 0%, RTA = 0.56 ms
2025-02-19 22:40:20 <icinga-wm> RECOVERY - BFD status on cr1-eqiad is OK: UP: 19 AdminDown: 0 Down: 0 https://wikitech.wikimedia.org/wiki/Network_monitoring%23BFD_status
2025-02-19 22:40:30 <wikibugs> ('PS2) ''Bking: cirrus: rename s3 resources [puppet] - ''https://gerrit.wikimedia.org/r/1121101 (https://phabricator.wikimedia.org/T380752)'
2025-02-19 22:40:35 <wikibugs> ('CR) ''Bking: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1121101 (https://phabricator.wikimedia.org/T380752) (owner: ''Bking)'
2025-02-19 22:41:18 <icinga-wm> RECOVERY - Host mr1-ulsfo.oob IPv6 is UP: PING OK - Packet loss = 0%, RTA = 71.92 ms
2025-02-19 22:41:20 <icinga-wm> PROBLEM - BGP status on cr2-eqiad is CRITICAL: BGP CRITICAL - AS64600/IPv4: Connect - PyBal https://wikitech.wikimedia.org/wiki/Network_monitoring%23BGP_status
2025-02-19 22:42:04 <icinga-wm> PROBLEM - Host lvs1017 is DOWN: PING CRITICAL - Packet loss = 100%
2025-02-19 22:42:21 <jinxer-wm> RESOLVED: [3x] ProbeDown: Service kubestagemaster1004:6443 has failed probes (http_staging_eqiad_kube_apiserver_ip4) - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
2025-02-19 22:42:44 <icinga-wm> RECOVERY - Host lvs1017 is UP: PING OK - Packet loss = 0%, RTA = 0.22 ms
2025-02-19 22:44:14 <wikibugs> ('CR) ''Bking: [C:''+2] cirrus: rename s3 resources [puppet] - ''https://gerrit.wikimedia.org/r/1121101 (https://phabricator.wikimedia.org/T380752) (owner: ''Bking)'
2025-02-19 22:44:36 <wikibugs> ('CR) ''Bking: [V:''+2 C:''+2] "self-merging, as this does not affect production hosts." [puppet] - ''https://gerrit.wikimedia.org/r/1121101 (https://phabricator.wikimedia.org/T380752) (owner: ''Bking)'
2025-02-19 22:49:00 <icinga-wm> PROBLEM - mailman list info on lists1004 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Mailman/Monitoring
2025-02-19 22:49:00 <icinga-wm> PROBLEM - mailman archives on lists1004 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Mailman/Monitoring
2025-02-19 22:49:50 <icinga-wm> RECOVERY - mailman archives on lists1004 is OK: HTTP OK: HTTP/1.1 200 OK - 53513 bytes in 0.115 second response time https://wikitech.wikimedia.org/wiki/Mailman/Monitoring
2025-02-19 22:49:50 <icinga-wm> RECOVERY - mailman list info on lists1004 is OK: HTTP OK: HTTP/1.1 200 OK - 8922 bytes in 0.180 second response time https://wikitech.wikimedia.org/wiki/Mailman/Monitoring
2025-02-19 22:52:04 <wikibugs> ('PS1) ''Eevans: cassandra: setup 'dev' target for Cassandra 4.1.8 [puppet] - ''https://gerrit.wikimedia.org/r/1121102 (https://phabricator.wikimedia.org/T385819)'
2025-02-19 22:52:20 <bvibber> ok confirmed our temp revert of the broken type hints has hit beta, and all is well in JsonConfig-land <3
2025-02-19 22:52:20 <bvibber> after more thorough testing the fixed version patch will be restored
2025-02-19 22:56:34 <wikibugs> 'SRE, ''SRE-Access-Requests, ''LDAP-Access-Requests: Requesting access to Dashboards in Superset / Hive interfaces (like Hue) that do access private data for Mariya Shilova - https://phabricator.wikimedia.org/T386754#10565955 (''Dzahn)'
2025-02-19 22:56:38 <icinga-wm> PROBLEM - Host mr1-magru.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
2025-02-19 22:57:20 <icinga-wm> PROBLEM - Host mr1-esams.oob IPv6 is DOWN: PING CRITICAL - Packet loss = 100%
2025-02-19 23:00:05 <jouncebot> Deploy window Web Team deployment window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20250219T2300)
2025-02-19 23:01:40 <icinga-wm> RECOVERY - Host mr1-magru.oob IPv6 is UP: PING OK - Packet loss = 0%, RTA = 123.14 ms
2025-02-19 23:02:22 <icinga-wm> RECOVERY - Host mr1-esams.oob IPv6 is UP: PING OK - Packet loss = 0%, RTA = 94.29 ms
2025-02-19 23:04:43 <wikibugs> ('PS1) ''Bking: Revert "cirrus: rename s3 resources" [puppet] - ''https://gerrit.wikimedia.org/r/1121104'
2025-02-19 23:06:22 <wikibugs> ('CR) ''Ryan Kemper: [C:''+1] "we fixed the puppetserver hiera secret path which made this code unnecessary" [puppet] - ''https://gerrit.wikimedia.org/r/1121104 (owner: ''Bking)'
2025-02-19 23:06:35 <wikibugs> ('CR) ''Bking: [V:''+2 C:''+2] Revert "cirrus: rename s3 resources" [puppet] - ''https://gerrit.wikimedia.org/r/1121104 (owner: ''Bking)'
2025-02-19 23:07:19 <wikibugs> ('PS1) ''Bking: Revert "cirrus: add commands to configure opensearch keystore" [puppet] - ''https://gerrit.wikimedia.org/r/1121106'
2025-02-19 23:07:24 <wikibugs> ('CR) ''Ryan Kemper: [C:''+1] "we fixed the puppetserver hiera secret path which made this code unnecessary" [puppet] - ''https://gerrit.wikimedia.org/r/1121106 (owner: ''Bking)'
2025-02-19 23:07:29 <wikibugs> ('CR) ''Ryan Kemper: [C:''+2] Revert "cirrus: add commands to configure opensearch keystore" [puppet] - ''https://gerrit.wikimedia.org/r/1121106 (owner: ''Bking)'
2025-02-19 23:07:31 <wikibugs> ('CR) ''Ryan Kemper: [V:''+2 C:''+2] Revert "cirrus: add commands to configure opensearch keystore" [puppet] - ''https://gerrit.wikimedia.org/r/1121106 (owner: ''Bking)'
2025-02-19 23:18:43 <jinxer-wm> FIRING: [2x] RipeAtlasAnchorUnreachable: ipv6 ping to codfw RIPE Atlas anchor: failures over threshold for measurement 32391312 - https://wikitech.wikimedia.org/wiki/Network_monitoring#Atlas_alerts - https://grafana.wikimedia.org/d/K1qm1j-Wz/ripe-atlas?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DRipeAtlasAnchorUnreachable
2025-02-19 23:23:43 <jinxer-wm> RESOLVED: [2x] RipeAtlasAnchorUnreachable: ipv6 ping to codfw RIPE Atlas anchor: failures over threshold for measurement 32391312 - https://wikitech.wikimedia.org/wiki/Network_monitoring#Atlas_alerts - https://grafana.wikimedia.org/d/K1qm1j-Wz/ripe-atlas?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DRipeAtlasAnchorUnreachable
2025-02-19 23:42:06 <wikibugs> 'ops-eqiad, ''SRE, ''DC-Ops, ''Infrastructure-Foundations: Q2:rack/setup/install ganeti105[34].eqiad.wmnet - https://phabricator.wikimedia.org/T381576#10566082 (''Papaul) @VRiley-WMF any updates on those 2 hosts?'
