|
2026-01-15 00:00:47
|
<wikibugs>
|
(Merged) jenkins-bot: Start reading from il_target_id on testwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/1226965 (https://phabricator.wikimedia.org/T413669) (owner: Zabe)
|
|
2026-01-15 00:01:23
|
<logmsgbot>
|
!log zabe@deploy2002 Started scap sync-world: Backport for [[gerrit:1226965|Start reading from il_target_id on testwiki (T413669)]]
|
|
2026-01-15 00:01:29
|
<stashbot>
|
T413669: Set imagelinks migration to read new - https://phabricator.wikimedia.org/T413669
|
|
2026-01-15 00:03:30
|
<logmsgbot>
|
!log zabe@deploy2002 zabe: Backport for [[gerrit:1226965|Start reading from il_target_id on testwiki (T413669)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
|
|
2026-01-15 00:05:30
|
<logmsgbot>
|
!log zabe@deploy2002 zabe: Continuing with sync
|
|
2026-01-15 00:09:36
|
<logmsgbot>
|
!log zabe@deploy2002 Finished scap sync-world: Backport for [[gerrit:1226965|Start reading from il_target_id on testwiki (T413669)]] (duration: 08m 13s)
|
|
2026-01-15 00:09:41
|
<stashbot>
|
T413669: Set imagelinks migration to read new - https://phabricator.wikimedia.org/T413669
|
|
2026-01-15 00:14:45
|
<wikibugs>
|
ops-ulsfo, SRE, DC-Ops, Infrastructure-Foundations, netops: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11523618 (Papaul) Phase 1 of ULSFO migration which was changing the loopback addresses of cr1, cr4, mr1 and the IP address of the link between cr3 and cr4 was...
|
|
2026-01-15 00:23:57
|
<icinga-wm>
|
PROBLEM - Host an-worker1159 is DOWN: PING CRITICAL - Packet loss = 100%
|
|
2026-01-15 00:23:57
|
<icinga-wm>
|
PROBLEM - Host an-worker1160 is DOWN: PING CRITICAL - Packet loss = 100%
|
|
2026-01-15 00:38:40
|
<jinxer-wm>
|
FIRING: SystemdUnitFailed: send_tile_invalidations.service on maps1011:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
|
|
2026-01-15 00:41:16
|
<wikibugs>
|
(PS1) TrainBranchBot: Branch commit for wmf/branch_cut_pretest [core] (wmf/branch_cut_pretest) - https://gerrit.wikimedia.org/r/1226973
|
|
2026-01-15 00:41:16
|
<wikibugs>
|
(CR) TrainBranchBot: [C:+2] Branch commit for wmf/branch_cut_pretest [core] (wmf/branch_cut_pretest) - https://gerrit.wikimedia.org/r/1226973 (owner: TrainBranchBot)
|
|
2026-01-15 00:50:20
|
<wikibugs>
|
(PS1) Sbisson: CX3 Build 1.0.0+20260114 [extensions/ContentTranslation] (wmf/1.46.0-wmf.11) - https://gerrit.wikimedia.org/r/1226976 (https://phabricator.wikimedia.org/T413646)
|
|
2026-01-15 00:50:43
|
<wikibugs>
|
(PS1) Sbisson: Fallback to source title if target title is not provided by cxserver [extensions/ContentTranslation] (wmf/1.46.0-wmf.11) - https://gerrit.wikimedia.org/r/1226977 (https://phabricator.wikimedia.org/T414558)
|
|
2026-01-15 00:51:41
|
<wikibugs>
|
(CR) ScheduleDeploymentBot: "Scheduled for deployment in the [Thursday, January 15 UTC afternoon backport window](https://wikitech.wikimedia.org/wiki/Deployments#deplo"; [extensions/ContentTranslation] (wmf/1.46.0-wmf.11) - https://gerrit.wikimedia.org/r/1226976 (https://phabricator.wikimedia.org/T413646) (owner: Sbisson)
|
|
2026-01-15 00:52:08
|
<wikibugs>
|
(CR) ScheduleDeploymentBot: "Scheduled for deployment in the [Thursday, January 15 UTC afternoon backport window](https://wikitech.wikimedia.org/wiki/Deployments#deplo"; [extensions/ContentTranslation] (wmf/1.46.0-wmf.11) - https://gerrit.wikimedia.org/r/1226977 (https://phabricator.wikimedia.org/T414558) (owner: Sbisson)
|
|
2026-01-15 00:54:26
|
<wikibugs>
|
(Merged) jenkins-bot: Branch commit for wmf/branch_cut_pretest [core] (wmf/branch_cut_pretest) - https://gerrit.wikimedia.org/r/1226973 (owner: TrainBranchBot)
|
|
2026-01-15 00:56:59
|
<logmsgbot>
|
ryankemper@cumin2002 reboot-workers (PID 2845277) is awaiting input
|
|
2026-01-15 00:57:44
|
<logmsgbot>
|
!log ryankemper@cumin2002 END (FAIL) - Cookbook sre.hadoop.reboot-workers (exit_code=99) for Hadoop analytics cluster
|
|
2026-01-15 01:00:49
|
<logmsgbot>
|
!log mwpresync@deploy2002 Started scap build-images: Publishing wmf/next image
|
|
2026-01-15 01:10:46
|
<wikibugs>
|
(PS1) TrainBranchBot: Branch commit for wmf/next [core] (wmf/next) - https://gerrit.wikimedia.org/r/1226980
|
|
2026-01-15 01:10:47
|
<wikibugs>
|
(CR) TrainBranchBot: [C:+2] Branch commit for wmf/next [core] (wmf/next) - https://gerrit.wikimedia.org/r/1226980 (owner: TrainBranchBot)
|
|
2026-01-15 01:13:47
|
<logmsgbot>
|
!log mwpresync@deploy2002 Finished scap build-images: Publishing wmf/next image (duration: 12m 57s)
|
|
2026-01-15 01:18:39
|
<wikibugs>
|
(PS1) Jdrewniak: Update portals submodule for WP25 birthday preview. [mediawiki-config] - https://gerrit.wikimedia.org/r/1226981 (https://phabricator.wikimedia.org/T128546)
|
|
2026-01-15 01:24:11
|
<jinxer-wm>
|
FIRING: KubernetesCalicoDown: ml-serve2004.codfw.wmnet is not running calico-node Pod - https://wikitech.wikimedia.org/wiki/Calico#Operations - https://grafana.wikimedia.org/d/G8zPL7-Wz/?var-dc=codfw%20prometheus%2Fk8s-mlserve&var-instance=ml-serve2004.codfw.wmnet - https://alerts.wikimedia.org/?q=alertname%3DKubernetesCalicoDown
|
|
2026-01-15 01:27:28
|
<wikibugs>
|
(Abandoned) Jdrewniak: Bumping portals to master [mediawiki-config] - https://gerrit.wikimedia.org/r/1226477 (https://phabricator.wikimedia.org/T128546) (owner: Jdrewniak)
|
|
2026-01-15 01:33:20
|
<wikibugs>
|
(Merged) jenkins-bot: Branch commit for wmf/next [core] (wmf/next) - https://gerrit.wikimedia.org/r/1226980 (owner: TrainBranchBot)
|
|
2026-01-15 01:41:45
|
<jinxer-wm>
|
FIRING: [4x] LibericaUnhealthyRealserverPooled: Liberica service gerrit-sshlb6_29418 has 2 unhealthy realservers pooled on lvs7001:3003 - https://wikitech.wikimedia.org/wiki/Liberica#LibericaUnhealthyRealserverPooled - https://alerts.wikimedia.org/?q=alertname%3DLibericaUnhealthyRealserverPooled
|
|
2026-01-15 02:38:06
|
<jinxer-wm>
|
FIRING: CoreRouterInterfaceDown: Core router interface down - pfw1-codfw:reth1 (Subnet frack-fundraising-codfw in F5) - https://wikitech.wikimedia.org/wiki/Network_monitoring#Router_interface_down - https://grafana.wikimedia.org/d/fb403d62-5f03-434a-9dff-bd02b9fff504/network-device-overview?var-instance=pfw1-codfw:9804 - https://alerts.wikimedia.org/?q=alertname%3DCoreRouterInterfaceDown
|
|
2026-01-15 03:40:12
|
<wikibugs>
|
ops-ulsfo, SRE, DC-Ops, Infrastructure-Foundations, netops: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11523794 (Papaul)
|
|
2026-01-15 04:00:37
|
<wikibugs>
|
(PS1) Clare Ming: Enable Test Kitchen on all prod wikis [mediawiki-config] - https://gerrit.wikimedia.org/r/1227004 (https://phabricator.wikimedia.org/T407806)
|
|
2026-01-15 04:02:17
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db2172 (T413525)', diff saved to https://phabricator.wikimedia.org/P87525 and previous config saved to /var/cache/conftool/dbconfig/20260115-040216-marostegui.json
|
|
2026-01-15 04:02:22
|
<stashbot>
|
T413525: Add il_target_id to imagelinks table in wmf production - https://phabricator.wikimedia.org/T413525
|
|
2026-01-15 04:06:59
|
<icinga-wm>
|
PROBLEM - Backup freshness on backup1014 is CRITICAL: Stale: 1 (dbprov1004), Fresh: 139 jobs https://wikitech.wikimedia.org/wiki/Bacula%23Monitoring
|
|
2026-01-15 04:12:26
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P87526 and previous config saved to /var/cache/conftool/dbconfig/20260115-041225-marostegui.json
|
|
2026-01-15 04:22:35
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P87527 and previous config saved to /var/cache/conftool/dbconfig/20260115-042233-marostegui.json
|
|
2026-01-15 04:28:45
|
<wikibugs>
|
ops-ulsfo, SRE, DC-Ops, Infrastructure-Foundations, netops: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11523834 (Papaul)
|
|
2026-01-15 04:32:43
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db2172 (T413525)', diff saved to https://phabricator.wikimedia.org/P87528 and previous config saved to /var/cache/conftool/dbconfig/20260115-043242-marostegui.json
|
|
2026-01-15 04:32:48
|
<stashbot>
|
T413525: Add il_target_id to imagelinks table in wmf production - https://phabricator.wikimedia.org/T413525
|
|
2026-01-15 04:33:00
|
<logmsgbot>
|
!log marostegui@cumin1003 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2199.codfw.wmnet with reason: Maintenance
|
|
2026-01-15 04:38:40
|
<jinxer-wm>
|
FIRING: SystemdUnitFailed: send_tile_invalidations.service on maps1011:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
|
|
2026-01-15 05:04:49
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db1261 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P87529 and previous config saved to /var/cache/conftool/dbconfig/20260115-050448-marostegui.json
|
|
2026-01-15 05:04:55
|
<stashbot>
|
T411163: Drop ar_sha1 from archive table in wmf production - https://phabricator.wikimedia.org/T411163
|
|
2026-01-15 05:04:55
|
<stashbot>
|
T411164: Drop rev_sha1 from revision table in wmf production - https://phabricator.wikimedia.org/T411164
|
|
2026-01-15 05:09:11
|
<jinxer-wm>
|
FIRING: [2x] JobUnavailable: Reduced availability for job sidekiq in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
|
|
2026-01-15 05:14:56
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P87530 and previous config saved to /var/cache/conftool/dbconfig/20260115-051455-marostegui.json
|
|
2026-01-15 05:24:11
|
<jinxer-wm>
|
FIRING: KubernetesCalicoDown: ml-serve2004.codfw.wmnet is not running calico-node Pod - https://wikitech.wikimedia.org/wiki/Calico#Operations - https://grafana.wikimedia.org/d/G8zPL7-Wz/?var-dc=codfw%20prometheus%2Fk8s-mlserve&var-instance=ml-serve2004.codfw.wmnet - https://alerts.wikimedia.org/?q=alertname%3DKubernetesCalicoDown
|
|
2026-01-15 05:25:04
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P87532 and previous config saved to /var/cache/conftool/dbconfig/20260115-052504-marostegui.json
|
|
2026-01-15 05:34:11
|
<jinxer-wm>
|
RESOLVED: [2x] JobUnavailable: Reduced availability for job sidekiq in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
|
|
2026-01-15 05:35:09
|
<wikibugs>
|
ops-ulsfo, SRE, DC-Ops, Infrastructure-Foundations, netops: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11523872 (Papaul)
|
|
2026-01-15 05:35:13
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db1261 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P87533 and previous config saved to /var/cache/conftool/dbconfig/20260115-053512-marostegui.json
|
|
2026-01-15 05:35:19
|
<stashbot>
|
T411163: Drop ar_sha1 from archive table in wmf production - https://phabricator.wikimedia.org/T411163
|
|
2026-01-15 05:35:19
|
<stashbot>
|
T411164: Drop rev_sha1 from revision table in wmf production - https://phabricator.wikimedia.org/T411164
|
|
2026-01-15 05:35:30
|
<logmsgbot>
|
!log marostegui@cumin1003 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1262.eqiad.wmnet with reason: Maintenance
|
|
2026-01-15 05:35:38
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Depooling db1262 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P87534 and previous config saved to /var/cache/conftool/dbconfig/20260115-053537-marostegui.json
|
|
2026-01-15 06:28:55
|
<logmsgbot>
|
!log marostegui@cumin1003 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
|
|
2026-01-15 06:29:03
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Depooling db1169 (T413525)', diff saved to https://phabricator.wikimedia.org/P87535 and previous config saved to /var/cache/conftool/dbconfig/20260115-062902-marostegui.json
|
|
2026-01-15 06:29:07
|
<stashbot>
|
T413525: Add il_target_id to imagelinks table in wmf production - https://phabricator.wikimedia.org/T413525
|
|
2026-01-15 06:30:12
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db1169 (T413525)', diff saved to https://phabricator.wikimedia.org/P87536 and previous config saved to /var/cache/conftool/dbconfig/20260115-063011-marostegui.json
|
|
2026-01-15 06:32:55
|
<logmsgbot>
|
!log marostegui@cumin1003 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
|
|
2026-01-15 06:33:21
|
<logmsgbot>
|
!log marostegui@cumin1003 START - Cookbook sre.mysql.pool db1169 gradually with 4 steps - After schema change
|
|
2026-01-15 06:35:25
|
<wikibugs>
|
(CR) Marostegui: [C:+1] sre.mysql.newpool: [de]pool various section kinds [cookbooks] - https://gerrit.wikimedia.org/r/1215575 (https://phabricator.wikimedia.org/T411573) (owner: Federico Ceratto)
|
|
2026-01-15 06:38:06
|
<jinxer-wm>
|
FIRING: CoreRouterInterfaceDown: Core router interface down - pfw1-codfw:reth1 (Subnet frack-fundraising-codfw in F5) - https://wikitech.wikimedia.org/wiki/Network_monitoring#Router_interface_down - https://grafana.wikimedia.org/d/fb403d62-5f03-434a-9dff-bd02b9fff504/network-device-overview?var-instance=pfw1-codfw:9804 - https://alerts.wikimedia.org/?q=alertname%3DCoreRouterInterfaceDown
|
|
2026-01-15 06:43:12
|
<wikibugs>
|
(PS1) Giuseppe Lavagetto: cache::upload: rate-limit rather than blocking bingbot [puppet] - https://gerrit.wikimedia.org/r/1227202
|
|
2026-01-15 06:45:13
|
<wikibugs>
|
SRE, collaboration-services, Patch-For-Review, PES1.3.3 WP25 Easter Eggs: Request: Wikipedia 25 microsite hosting - https://phabricator.wikimedia.org/T408592#11523917 (Dzahn) a:ATitkov→Dzahn - site updated to version: 2026-01-14-150341 https://gerrit.wikimedia.org/r/c/operations/deploymen...
|
|
2026-01-15 06:46:01
|
<wikibugs>
|
SRE, collaboration-services, Patch-For-Review, PES1.3.3 WP25 Easter Eggs: Request: Wikipedia 25 microsite hosting - https://phabricator.wikimedia.org/T408592#11523919 (Dzahn) Open→In progress
|
|
2026-01-15 07:00:05
|
<jouncebot>
|
Deploy window MediaWiki infrastructure (UTC early) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T0700)
|
|
2026-01-15 07:00:05
|
<jouncebot>
|
marostegui, Amir1, and federico3: gettimeofday() says it's time for Primary database switchover. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T0700)
|
|
2026-01-15 07:01:17
|
<XioNoX>
|
!log restart snmp and MIB processes on asw1-b12-drmrs - T413181
|
|
2026-01-15 07:01:20
|
<stashbot>
|
Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
|
|
2026-01-15 07:01:21
|
<stashbot>
|
T413181: asw1-b12-drmrs stopped reporting metrics - https://phabricator.wikimedia.org/T413181
|
|
2026-01-15 07:02:46
|
<wikibugs>
|
(CR) Dzahn: [C:+2] Revert "trafficserver: disable wikipedia25" [puppet] - https://gerrit.wikimedia.org/r/1224959 (https://phabricator.wikimedia.org/T408592) (owner: Dzahn)
|
|
2026-01-15 07:03:43
|
<logmsgbot>
|
!log marostegui@cumin1003 END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db1169 gradually with 4 steps - After schema change
|
|
2026-01-15 07:06:43
|
<wikibugs>
|
(PS1) Marostegui: dbproxy2005: Add Debian Trixie note [puppet] - https://gerrit.wikimedia.org/r/1227204 (https://phabricator.wikimedia.org/T409398)
|
|
2026-01-15 07:08:55
|
<wikibugs>
|
(CR) Marostegui: [C:+2] dbproxy2005: Add Debian Trixie note [puppet] - https://gerrit.wikimedia.org/r/1227204 (https://phabricator.wikimedia.org/T409398) (owner: Marostegui)
|
|
2026-01-15 07:16:14
|
<wikibugs>
|
(CR) JMeybohm: [C:+1] "sgtm" [puppet] - https://gerrit.wikimedia.org/r/1226914 (https://phabricator.wikimedia.org/T394476) (owner: Elukey)
|
|
2026-01-15 07:18:13
|
<wikibugs>
|
SRE, collaboration-services, Patch-For-Review, PES1.3.3 WP25 Easter Eggs: Request: Wikipedia 25 microsite hosting - https://phabricator.wikimedia.org/T408592#11523949 (Dzahn) The site is active: https://www.wikipedia25.org
|
|
2026-01-15 07:25:26
|
<wikibugs>
|
(PS1) Superpes15: [slwiki] Fix temporary logo for Wikipedia 25 [mediawiki-config] - https://gerrit.wikimedia.org/r/1227210 (https://phabricator.wikimedia.org/T414265)
|
|
2026-01-15 07:28:34
|
<logmsgbot>
|
!log marostegui@cumin1003 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2141.codfw.wmnet with reason: Maintenance
|
|
2026-01-15 07:33:14
|
<wikibugs>
|
SRE, collaboration-services, Patch-For-Review, PES1.3.3 WP25 Easter Eggs: Request: Wikipedia 25 microsite hosting - https://phabricator.wikimedia.org/T408592#11523959 (A_smart_kitten) Just a note (apologies if there's a better place to raise this): When I click on any of the 'Transcript' buttons...
|
|
2026-01-15 07:51:59
|
<wikibugs>
|
(PS1) Muehlenhoff: Record LDAP access for tadeleye [puppet] - https://gerrit.wikimedia.org/r/1227214
|
|
2026-01-15 07:53:48
|
<wikibugs>
|
(CR) Muehlenhoff: [C:+2] Record LDAP access for tadeleye [puppet] - https://gerrit.wikimedia.org/r/1227214 (owner: Muehlenhoff)
|
|
2026-01-15 07:54:36
|
<logmsgbot>
|
!log marostegui@cumin1003 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
|
|
2026-01-15 07:54:44
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Depooling db2206 (T413525)', diff saved to https://phabricator.wikimedia.org/P87540 and previous config saved to /var/cache/conftool/dbconfig/20260115-075444-marostegui.json
|
|
2026-01-15 07:54:49
|
<stashbot>
|
T413525: Add il_target_id to imagelinks table in wmf production - https://phabricator.wikimedia.org/T413525
|
|
2026-01-15 07:54:56
|
<wikibugs>
|
(PS2) Gergő Tisza: debug: Add X-Provenance header to Logstash [mediawiki-config] - https://gerrit.wikimedia.org/r/1226903 (https://phabricator.wikimedia.org/T412396)
|
|
2026-01-15 07:55:05
|
<wikibugs>
|
(CR) CI reject: [V:-1] debug: Add X-Provenance header to Logstash [mediawiki-config] - https://gerrit.wikimedia.org/r/1226903 (https://phabricator.wikimedia.org/T412396) (owner: Gergő Tisza)
|
|
2026-01-15 07:55:42
|
<wikibugs>
|
(PS3) Gergő Tisza: debug: Add X-Provenance header to Logstash [mediawiki-config] - https://gerrit.wikimedia.org/r/1226903 (https://phabricator.wikimedia.org/T412396)
|
|
2026-01-15 07:55:51
|
<wikibugs>
|
(CR) CI reject: [V:-1] debug: Add X-Provenance header to Logstash [mediawiki-config] - https://gerrit.wikimedia.org/r/1226903 (https://phabricator.wikimedia.org/T412396) (owner: Gergő Tisza)
|
|
2026-01-15 07:55:54
|
<wikibugs>
|
(PS4) Gergő Tisza: debug: Add X-Provenance header to Logstash [mediawiki-config] - https://gerrit.wikimedia.org/r/1226903 (https://phabricator.wikimedia.org/T412396)
|
|
2026-01-15 08:00:52
|
<hashar>
|
good morning
|
|
2026-01-15 08:01:37
|
<hashar>
|
Superpes: hello, I'll deploy your change
|
|
2026-01-15 08:01:47
|
<Superpes>
|
Hi thanks hashar :)
|
|
2026-01-15 08:01:57
|
<hashar>
|
artemkloko: good morning, I am going to deploy the WP25 change for portals
|
|
2026-01-15 08:02:22
|
<hashar>
|
reads the changes
|
|
2026-01-15 08:03:32
|
<wikibugs>
|
SRE, Data-Platform-SRE (2026.01.05 - 2026.01.23), Patch-For-Review: October 2025 Bullseye reboots: Data Platform Engineering-owned hosts - https://phabricator.wikimedia.org/T411568#11523973 (RKemper) Got about 40 `an-worker*` hosts done, but there's still another ~80 left to be done
|
|
2026-01-15 08:04:43
|
<hashar>
|
I'll start
|
|
2026-01-15 08:05:46
|
<wikibugs>
|
(CR) TrainBranchBot: [C:+2] "Approved by hashar@deploy2002 using scap backport" [mediawiki-config] - https://gerrit.wikimedia.org/r/1227210 (https://phabricator.wikimedia.org/T414265) (owner: Superpes15)
|
|
2026-01-15 08:05:46
|
<wikibugs>
|
(CR) TrainBranchBot: [C:+2] "Approved by hashar@deploy2002 using scap backport" [mediawiki-config] - https://gerrit.wikimedia.org/r/1226981 (https://phabricator.wikimedia.org/T128546) (owner: Jdrewniak)
|
|
2026-01-15 08:06:32
|
<hashar>
|
changes are in the pipe https://integration.wikimedia.org/zuul/#q=mediawiki-config
|
|
2026-01-15 08:06:38
|
<wikibugs>
|
(Merged) jenkins-bot: [slwiki] Fix temporary logo for Wikipedia 25 [mediawiki-config] - https://gerrit.wikimedia.org/r/1227210 (https://phabricator.wikimedia.org/T414265) (owner: Superpes15)
|
|
2026-01-15 08:06:42
|
<wikibugs>
|
(Merged) jenkins-bot: Update portals submodule for WP25 birthday preview. [mediawiki-config] - https://gerrit.wikimedia.org/r/1226981 (https://phabricator.wikimedia.org/T128546) (owner: Jdrewniak)
|
|
2026-01-15 08:07:52
|
<logmsgbot>
|
!log hashar@deploy2002 Started scap sync-world: Backport for [[gerrit:1227210|[slwiki] Fix temporary logo for Wikipedia 25 (T414265)]], [[gerrit:1226981|Update portals submodule for WP25 birthday preview. (T128546)]]
|
|
2026-01-15 08:07:57
|
<stashbot>
|
T414265: Requesting temporary logo change for sl.wikipedia.org (WP25) - https://phabricator.wikimedia.org/T414265
|
|
2026-01-15 08:07:57
|
<stashbot>
|
T128546: [Recurring Task] Update Wikipedia and sister projects portals statistics - https://phabricator.wikimedia.org/T128546
|
|
2026-01-15 08:10:18
|
<logmsgbot>
|
!log hashar@deploy2002 hashar, jdrewniak, superpes: Backport for [[gerrit:1227210|[slwiki] Fix temporary logo for Wikipedia 25 (T414265)]], [[gerrit:1226981|Update portals submodule for WP25 birthday preview. (T128546)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
|
|
2026-01-15 08:10:24
|
<Superpes>
|
Testing!
|
|
2026-01-15 08:12:55
|
<Superpes>
|
Uhm... looks weird! hashar Are you able to quickly test via browser?
|
|
2026-01-15 08:13:12
|
<Superpes>
|
Oh now it looks fine lmao
|
|
2026-01-15 08:13:18
|
<Superpes>
|
Maybe a cache issue?
|
|
2026-01-15 08:13:18
|
<hashar>
|
caches!! :b
|
|
2026-01-15 08:13:38
|
<Superpes>
|
Yep lol It's fine thanks :)
|
|
2026-01-15 08:13:47
|
<hashar>
|
of course I have a wrong link
|
|
2026-01-15 08:13:48
|
<hashar>
|
:b
|
|
2026-01-15 08:14:09
|
<hashar>
|
artemkloko: I have pushed the change for the portal and the orange button points to a link that does not exist :/
|
|
2026-01-15 08:14:40
|
<hashar>
|
I guess cause the wikimediafoundation.org page has not been published
|
|
2026-01-15 08:14:43
|
<wikibugs>
|
(PS4) Dreamy Jazz: Write new for CheckUser user agent table migration on group1 [mediawiki-config] - https://gerrit.wikimedia.org/r/1223674 (https://phabricator.wikimedia.org/T361196)
|
|
2026-01-15 08:14:44
|
<wikibugs>
|
(PS4) Dreamy Jazz: Write new for CheckUser user agent table migration everywhere [mediawiki-config] - https://gerrit.wikimedia.org/r/1223675 (https://phabricator.wikimedia.org/T361196)
|
|
2026-01-15 08:15:21
|
<hashar>
|
Superpes: great thanks
|
|
2026-01-15 08:15:37
|
<hashar>
|
I'll most probably cancel, revert the portals update change and deploy again
|
|
2026-01-15 08:16:22
|
<logmsgbot>
|
!log hashar@deploy2002 Sync cancelled.
|
|
2026-01-15 08:17:58
|
<wikibugs>
|
SRE, collaboration-services, Patch-For-Review, PES1.3.3 WP25 Easter Eggs: Request: Wikipedia 25 microsite hosting - https://phabricator.wikimedia.org/T408592#11523999 (Dzahn) @A_smart_kitten Thanks for reporting. The issue is known and currently a fix is being worked on.
|
|
2026-01-15 08:18:44
|
<wikibugs>
|
(PS1) Hashar: Revert "Update portals submodule for WP25 birthday preview." [mediawiki-config] - https://gerrit.wikimedia.org/r/1227256 (https://phabricator.wikimedia.org/T128546)
|
|
2026-01-15 08:19:24
|
<wikibugs>
|
(CR) TrainBranchBot: [C:+2] "Approved by dreamyjazz@deploy2002 using scap backport" [mediawiki-config] - https://gerrit.wikimedia.org/r/1223674 (https://phabricator.wikimedia.org/T361196) (owner: Dreamy Jazz)
|
|
2026-01-15 08:20:04
|
<Dreamy_Jazz>
|
jouncebot: nowandnext
|
|
2026-01-15 08:20:05
|
<jouncebot>
|
For the next 0 hour(s) and 39 minute(s): UTC morning backport window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T0800)
|
|
2026-01-15 08:20:05
|
<jouncebot>
|
In 2 hour(s) and 39 minute(s): MediaWiki infrastructure (UTC mid-day) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T1100)
|
|
2026-01-15 08:20:30
|
<Dreamy_Jazz>
|
I've stopped the +2, waiting for others to finish their changes
|
|
2026-01-15 08:21:00
|
<Dreamy_Jazz>
|
hashar: Could you ping me when you are done?
|
|
2026-01-15 08:21:35
|
<wikibugs>
|
(CR) TrainBranchBot: [C:+2] "Approved by hashar@deploy2002 using scap backport" [mediawiki-config] - https://gerrit.wikimedia.org/r/1227256 (https://phabricator.wikimedia.org/T128546) (owner: Hashar)
|
|
2026-01-15 08:21:37
|
<hashar>
|
Dreamy_Jazz: sure!
|
|
2026-01-15 08:22:23
|
<wikibugs>
|
(Merged) jenkins-bot: Revert "Update portals submodule for WP25 birthday preview." [mediawiki-config] - https://gerrit.wikimedia.org/r/1227256 (https://phabricator.wikimedia.org/T128546) (owner: Hashar)
|
|
2026-01-15 08:22:54
|
<logmsgbot>
|
!log hashar@deploy2002 Started scap sync-world: Backport for [[gerrit:1227256|Revert "Update portals submodule for WP25 birthday preview." (T128546 T414533)]]
|
|
2026-01-15 08:23:00
|
<stashbot>
|
T128546: [Recurring Task] Update Wikipedia and sister projects portals statistics - https://phabricator.wikimedia.org/T128546
|
|
2026-01-15 08:23:00
|
<stashbot>
|
T414533: Update the url of the CTA button for Wikipedia25 portal customisation - https://phabricator.wikimedia.org/T414533
|
|
2026-01-15 08:23:43
|
<wikibugs>
|
(PS1) Hashar: Update portals submodule for WP25 birthday preview [2] [mediawiki-config] - https://gerrit.wikimedia.org/r/1227258 (https://phabricator.wikimedia.org/T128546)
|
|
2026-01-15 08:25:15
|
<logmsgbot>
|
!log hashar@deploy2002 hashar: Backport for [[gerrit:1227256|Revert "Update portals submodule for WP25 birthday preview." (T128546 T414533)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
|
|
2026-01-15 08:25:47
|
<wikibugs>
|
SRE-Sprint-Week-Sustainability-March2023, DBA, Sustainability (Incident Followup): Improve slow read query handling - https://phabricator.wikimedia.org/T293530#11524041 (Marostegui) Open→Resolved a:Ladsgroup I think we can consider this done. @Ladsgroup has done lots of work to 1) re...
|
|
2026-01-15 08:25:52
|
<logmsgbot>
|
!log hashar@deploy2002 hashar: Continuing with sync
|
|
2026-01-15 08:28:28
|
<hashar>
|
Dreamy_Jazz: my changes are syncing
|
|
2026-01-15 08:29:21
|
<Dreamy_Jazz>
|
Thanks
|
|
2026-01-15 08:29:52
|
<wikibugs>
|
SRE, collaboration-services, Wikimedia-Mailing-lists, Patch-For-Review: Put lists.wikimedia.org web interface behind LVS - https://phabricator.wikimedia.org/T286066#11524049 (ABran-WMF)
|
|
2026-01-15 08:29:58
|
<logmsgbot>
|
!log hashar@deploy2002 Finished scap sync-world: Backport for [[gerrit:1227256|Revert "Update portals submodule for WP25 birthday preview." (T128546 T414533)]] (duration: 07m 04s)
|
|
2026-01-15 08:30:03
|
<stashbot>
|
T128546: [Recurring Task] Update Wikipedia and sister projects portals statistics - https://phabricator.wikimedia.org/T128546
|
|
2026-01-15 08:30:04
|
<stashbot>
|
T414533: Update the url of the CTA button for Wikipedia25 portal customisation - https://phabricator.wikimedia.org/T414533
|
|
2026-01-15 08:30:56
|
<wikibugs>
|
(CR) TrainBranchBot: [C:+2] "Approved by dreamyjazz@deploy2002 using scap backport" [mediawiki-config] - https://gerrit.wikimedia.org/r/1223674 (https://phabricator.wikimedia.org/T361196) (owner: Dreamy Jazz)
|
|
2026-01-15 08:31:46
|
<wikibugs>
|
(CR) Vgutierrez: [C:+1] "VTCs are happy and condition properly matches the intended traffic" [puppet] - https://gerrit.wikimedia.org/r/1227202 (owner: Giuseppe Lavagetto)
|
|
2026-01-15 08:31:50
|
<wikibugs>
|
(Merged) jenkins-bot: Write new for CheckUser user agent table migration on group1 [mediawiki-config] - https://gerrit.wikimedia.org/r/1223674 (https://phabricator.wikimedia.org/T361196) (owner: Dreamy Jazz)
|
|
2026-01-15 08:32:21
|
<logmsgbot>
|
!log dreamyjazz@deploy2002 Started scap sync-world: Backport for [[gerrit:1223674|Write new for CheckUser user agent table migration on group1 (T361196)]]
|
|
2026-01-15 08:32:25
|
<stashbot>
|
T361196: Write to the cu_useragent table and agent_id columns on WMF wikis - https://phabricator.wikimedia.org/T361196
|
|
2026-01-15 08:32:53
|
<hashar>
|
still running
|
|
2026-01-15 08:32:56
|
<wikibugs>
|
SRE, collaboration-services, Patch-For-Review, PES1.3.3 WP25 Easter Eggs: Request: Wikipedia 25 microsite hosting - https://phabricator.wikimedia.org/T408592#11524061 (Dzahn) Unrelated to the issue reported above, but for the record. We had an initial problem with the bare domain without www be...
|
|
2026-01-15 08:34:27
|
<phuedx>
|
jouncebot: next
|
|
2026-01-15 08:34:27
|
<jouncebot>
|
In 2 hour(s) and 25 minute(s): MediaWiki infrastructure (UTC mid-day) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T1100)
|
|
2026-01-15 08:34:32
|
<logmsgbot>
|
!log dreamyjazz@deploy2002 dreamyjazz: Backport for [[gerrit:1223674|Write new for CheckUser user agent table migration on group1 (T361196)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
|
|
2026-01-15 08:36:17
|
<logmsgbot>
|
!log dreamyjazz@deploy2002 dreamyjazz: Continuing with sync
|
|
2026-01-15 08:36:40
|
<Dreamy_Jazz>
|
I hadn't finished testing?
|
|
2026-01-15 08:36:48
|
<hashar>
|
I did
|
|
2026-01-15 08:36:56
|
<Dreamy_Jazz>
|
Okay
|
|
2026-01-15 08:36:57
|
<hashar>
|
I pushed a rollback :b
|
|
2026-01-15 08:37:56
|
<Dreamy_Jazz>
|
Ah, okay
|
|
2026-01-15 08:38:07
|
<hashar>
|
pff
|
|
2026-01-15 08:38:12
|
<hashar>
|
of course the page has been published now
|
|
2026-01-15 08:38:17
|
<wikibugs>
|
(PS1) Dzahn: miscweb: update wikipedia25 image to latest version [deployment-charts] - https://gerrit.wikimedia.org/r/1227260 (https://phabricator.wikimedia.org/T408592)
|
|
2026-01-15 08:38:19
|
<phuedx>
|
hashar, Dreamy_Jazz: I'd like to enable the TestKitchen extension everywhere. It looks like we've got a lot of time after the window. If not, I can do it in the afternoon window
|
|
2026-01-15 08:38:20
|
<hashar>
|
so I gotta deploy again
|
|
2026-01-15 08:38:30
|
<Dreamy_Jazz>
|
:D
|
|
2026-01-15 08:38:32
|
<phuedx>
|
Or maybe not :D :D :D
|
|
2026-01-15 08:38:40
|
<jinxer-wm>
|
FIRING: SystemdUnitFailed: send_tile_invalidations.service on maps1011:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
|
|
2026-01-15 08:39:09
|
<hashar>
|
phuedx: has that TestKitchen extension been fixed? It overlapped/clashed with MetricsPlatform :b
|
|
2026-01-15 08:39:25
|
<wikibugs>
|
('CR) ''Dzahn: [C:''+2] miscweb: update wikipedia25 image to latest version [deployment-charts] - ''https://gerrit.wikimedia.org/r/1227260 (https://phabricator.wikimedia.org/T408592) (owner: ''Dzahn)'
|
|
2026-01-15 08:39:48
|
<hashar>
|
(I suspect the code got copy-pasted between repos, losing the history, but I digress)
|
|
2026-01-15 08:39:51
|
<hashar>
|
anyway yea
|
|
2026-01-15 08:39:58
|
<hashar>
|
but I have to push that portals update change again
|
|
2026-01-15 08:40:15
|
<mutante>
|
same here with updating the birthday page.. in progress
|
|
2026-01-15 08:40:15
|
<logmsgbot>
|
!log dreamyjazz@deploy2002 Finished scap sync-world: Backport for [[gerrit:1223674|Write new for CheckUser user agent table migration on group1 (T361196)]] (duration: 07m 54s)
|
|
2026-01-15 08:40:19
|
<stashbot>
|
T361196: Write to the cu_useragent table and agent_id columns on WMF wikis - https://phabricator.wikimedia.org/T361196
|
|
2026-01-15 08:40:39
|
<phuedx>
|
hashar: Yes. It's currently enabled on testwiki. I believe the CI issues have been fixed
|
|
2026-01-15 08:41:25
|
<wikibugs>
|
('Merged) ''jenkins-bot: miscweb: update wikipedia25 image to latest version [deployment-charts] - ''https://gerrit.wikimedia.org/r/1227260 (https://phabricator.wikimedia.org/T408592) (owner: ''Dzahn)'
|
|
2026-01-15 08:41:46
|
<hashar>
|
phuedx: great :]
|
|
2026-01-15 08:41:51
|
<wikibugs>
|
('CR) ''TrainBranchBot: [C:''+2] "Approved by hashar@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227258 (https://phabricator.wikimedia.org/T128546) (owner: ''Hashar)'
|
|
2026-01-15 08:42:41
|
<logmsgbot>
|
!log dzahn@deploy2002 helmfile [staging] START helmfile.d/services/miscweb: apply
|
|
2026-01-15 08:42:44
|
<wikibugs>
|
('Merged) ''jenkins-bot: Update portals submodule for WP25 birthday preview [2] [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227258 (https://phabricator.wikimedia.org/T128546) (owner: ''Hashar)'
|
|
2026-01-15 08:43:02
|
<logmsgbot>
|
!log dzahn@deploy2002 helmfile [staging] DONE helmfile.d/services/miscweb: apply
|
|
2026-01-15 08:43:15
|
<logmsgbot>
|
!log hashar@deploy2002 Started scap sync-world: Backport for [[gerrit:1227258|Update portals submodule for WP25 birthday preview [2] (T128546 T414533)]]
|
|
2026-01-15 08:43:21
|
<stashbot>
|
T128546: [Recurring Task] Update Wikipedia and sister projects portals statistics - https://phabricator.wikimedia.org/T128546
|
|
2026-01-15 08:43:21
|
<logmsgbot>
|
!log dzahn@deploy2002 helmfile [codfw] START helmfile.d/services/miscweb: apply
|
|
2026-01-15 08:43:21
|
<stashbot>
|
T414533: Update the url of the CTA button for Wikipedia25 portal customisation - https://phabricator.wikimedia.org/T414533
|
|
2026-01-15 08:43:40
|
<logmsgbot>
|
!log dzahn@deploy2002 helmfile [codfw] DONE helmfile.d/services/miscweb: apply
|
|
2026-01-15 08:44:06
|
<logmsgbot>
|
!log dzahn@deploy2002 helmfile [eqiad] START helmfile.d/services/miscweb: apply
|
|
2026-01-15 08:44:30
|
<logmsgbot>
|
!log dzahn@deploy2002 helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
|
|
2026-01-15 08:44:45
|
<wikibugs>
|
('CR) ''Muehlenhoff: [C:''+2] Remove profile::puppet::agent::force_puppet7 from observability roles [puppet] - ''https://gerrit.wikimedia.org/r/1226178 (https://phabricator.wikimedia.org/T365798) (owner: ''Muehlenhoff)'
|
|
2026-01-15 08:45:30
|
<logmsgbot>
|
!log hashar@deploy2002 hashar: Backport for [[gerrit:1227258|Update portals submodule for WP25 birthday preview [2] (T128546 T414533)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
|
|
2026-01-15 08:45:52
|
<logmsgbot>
|
!log hashar@deploy2002 hashar: Continuing with sync
|
|
2026-01-15 08:46:05
|
<hashar>
|
ah this time the link worked
|
|
2026-01-15 08:46:29
|
<hashar>
|
so that is poor synchronization with me deploying the www.wikipedia.org update before the target page got published by comm
|
|
2026-01-15 08:46:31
|
<hashar>
|
fun times
|
|
2026-01-15 08:46:48
|
<wikibugs>
|
'SRE, ''collaboration-services, ''Patch-For-Review, ''PES1.3.3 WP25 Easter Eggs: Request: Wikipedia 25 microsite hosting - https://phabricator.wikimedia.org/T408592#11524096 (''Dzahn) deployed latest version 2026-01-15-080024 - @A_smart_kitten is it gone for you too?'
|
|
2026-01-15 08:48:11
|
<wikibugs>
|
'SRE, ''collaboration-services, ''Patch-For-Review, ''PES1.3.3 WP25 Easter Eggs: Request: Wikipedia 25 microsite hosting - https://phabricator.wikimedia.org/T408592#11524099 (''A_smart_kitten) @dzahn checking just now on the device I used before, the 'Not Found' page was initially cached, but once I refre...'
|
|
2026-01-15 08:49:57
|
<logmsgbot>
|
!log hashar@deploy2002 Finished scap sync-world: Backport for [[gerrit:1227258|Update portals submodule for WP25 birthday preview [2] (T128546 T414533)]] (duration: 06m 42s)
|
|
2026-01-15 08:50:03
|
<stashbot>
|
T128546: [Recurring Task] Update Wikipedia and sister projects portals statistics - https://phabricator.wikimedia.org/T128546
|
|
2026-01-15 08:50:03
|
<stashbot>
|
T414533: Update the url of the CTA button for Wikipedia25 portal customisation - https://phabricator.wikimedia.org/T414533
|
|
2026-01-15 08:50:52
|
<hashar>
|
let's bust the cache
|
|
2026-01-15 08:52:28
|
<hashar>
|
!log purged portals URLs using: `cat /srv/mediawiki-staging/portals/urls-to-purge.txt | MEDIAWIKI_STAGING_DIR=/srv/mediawiki-staging mwscript purgeList.php` # T414533
|
|
2026-01-15 08:52:31
|
<stashbot>
|
Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
|
|
2026-01-15 08:52:34
|
<hashar>
|
!log https://www.wikipedia.org/ and click that orange button! # T414533
|
|
2026-01-15 08:52:38
|
<stashbot>
|
Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
|
|
2026-01-15 08:52:43
|
<hashar>
|
artemkloko: change is live!
|
|
2026-01-15 08:52:52
|
<hashar>
|
Dreamy_Jazz: phuedx: it is all yours
|
|
2026-01-15 08:53:01
|
<hashar>
|
https://www.wikipedia.org/ has been updated
|
|
2026-01-15 08:53:44
|
<Superpes>
|
hashar What about my patch? :)
|
|
2026-01-15 08:53:59
|
<hashar>
|
Superpes: yes it should be live now
|
|
2026-01-15 08:54:55
|
<Superpes>
|
Wonderful! I asked because I didn't check SAL
|
|
2026-01-15 08:55:00
|
<Superpes>
|
Thanks for your assistance :3
|
|
2026-01-15 08:55:49
|
<hashar>
|
Superpes: thank you for the logo fix!
|
|
2026-01-15 08:56:45
|
<wikibugs>
|
('PS1) ''Muehlenhoff: Remove profile::puppet::agent::force_puppet7 from serviceops roles [puppet] - ''https://gerrit.wikimedia.org/r/1227261 (https://phabricator.wikimedia.org/T365798)'
|
|
2026-01-15 08:57:44
|
<wikibugs>
|
'SRE, ''collaboration-services, ''Patch-For-Review, ''PES1.3.3 WP25 Easter Eggs: Request: Wikipedia 25 microsite hosting - https://phabricator.wikimedia.org/T408592#11524113 (''Dzahn) @A_smart_kitten Yea, that is also what we saw over here. Thanks!:)'
|
|
2026-01-15 08:57:46
|
<phuedx>
|
Hrrm. I think I can see a bug with the TestKitchen config. I'm going to hold off on the deployment until others in my team are online
|
|
2026-01-15 08:57:54
|
<phuedx>
|
hashar: I think you can close the window now
|
|
2026-01-15 08:58:54
|
<Dreamy_Jazz>
|
Thanks for the ping hashar, mine should have been done by that one scap I did
|
|
2026-01-15 08:59:05
|
<wikibugs>
|
'SRE, ''collaboration-services, ''Patch-For-Review, ''PES1.3.3 WP25 Easter Eggs: Request: Wikipedia 25 microsite hosting - https://phabricator.wikimedia.org/T408592#11524114 (''Dzahn) ''In progress→''Resolved We are live - QA happening now.'
|
|
2026-01-15 09:04:01
|
<icinga-wm>
|
PROBLEM - Check unit status of statograph_post on alert1002 is CRITICAL: CRITICAL: Status of the systemd unit statograph_post https://wikitech.wikimedia.org/wiki/Monitoring/systemd_unit_state
|
|
2026-01-15 09:04:46
|
<wikibugs>
|
'SRE, ''SRE-Access-Requests: Requesting access to L3 data access for kimpham (developer name Kim.pham) - https://phabricator.wikimedia.org/T414660 (''kimpham) ''NEW'
|
|
2026-01-15 09:05:12
|
<kostajh>
|
hashar: I have a patch to backport to wmf.11, but I could do it later as well
|
|
2026-01-15 09:06:13
|
<hashar>
|
kostajh: looks like phuedx and Dreamy_Jazz have finished so feel free to deploy
|
|
2026-01-15 09:06:23
|
<hashar>
|
I am off, I have an appointment
|
|
2026-01-15 09:06:29
|
<artemkloko>
|
Hello everyone, is there someone knowledgeable about how to deploy the portals?
|
|
2026-01-15 09:06:45
|
<artemkloko>
|
We just deployed a version, but it seems to need a fix
|
|
2026-01-15 09:06:51
|
<wikibugs>
|
('CR) ''Elukey: [C:''+2] profile::docker_registry: tune the s3 config for /restricted [puppet] - ''https://gerrit.wikimedia.org/r/1226914 (https://phabricator.wikimedia.org/T394476) (owner: ''Elukey)'
|
|
2026-01-15 09:07:17
|
<mutante>
|
artemkloko: hashar just had to go
|
|
2026-01-15 09:07:32
|
<kostajh>
|
thanks
|
|
2026-01-15 09:07:45
|
<kostajh>
|
will start deployment soon
|
|
2026-01-15 09:08:04
|
<mutante>
|
kostajh: would you be able to deploy portal changes like hashar just did?
|
|
2026-01-15 09:08:10
|
<mutante>
|
to help out artemkloko
|
|
2026-01-15 09:08:23
|
<icinga-wm>
|
RECOVERY - Host an-conf1006 is UP: PING OK - Packet loss = 0%, RTA = 0.26 ms
|
|
2026-01-15 09:08:49
|
<artemkloko>
|
i have a doc that could help kostajh
|
|
2026-01-15 09:09:03
|
<kostajh>
|
artemkloko: sure, I can look at it
|
|
2026-01-15 09:09:13
|
<kostajh>
|
can you share the document with me please?
|
|
2026-01-15 09:09:13
|
<hashar>
|
I think there is an issue in the build step that generates the assets for wikimedia/portals/deploy
|
|
2026-01-15 09:09:33
|
<hashar>
|
there is a Gulp project in wikimedia/portals which is built/invoked by a CI job which builds the assets
|
|
2026-01-15 09:09:48
|
<hashar>
|
and some .webm files are not added to the assets dir
|
|
2026-01-15 09:10:12
|
<hashar>
|
they are thus not added when doing a `git commit -A`
|
|
2026-01-15 09:10:44
|
<hashar>
|
it looks like an issue with the `npm run build-all-portals` script from wikimedia/portals
|
|
2026-01-15 09:10:58
|
<hashar>
|
thus I imagine that potentially needs Jan to look into it
|
|
2026-01-15 09:11:56
|
<hashar>
|
and the job building the assets is https://integration.wikimedia.org/ci/job/wikimedia-portals-build/ (which results in publishing a change for the deploy repo at https://gerrit.wikimedia.org/r/q/project:wikimedia/portals/deploy )
|
|
2026-01-15 09:12:01
|
<hashar>
|
so it is not trivial :\
|
|
2026-01-15 09:12:02
|
<wikibugs>
|
('PS1) ''Muehlenhoff: Remove profile::puppet::agent::force_puppet7 from cloud roles [puppet] - ''https://gerrit.wikimedia.org/r/1227264 (https://phabricator.wikimedia.org/T365798)'
|
|
2026-01-15 09:12:08
|
<hashar>
|
I am off for that appointment, I'll be back at 13:30
|
|
2026-01-15 09:13:15
|
<wikibugs>
|
('PS1) ''Kosta Harlan: WebRequest::getSecurityLogContext: Log if user is a bot [core] (wmf/1.46.0-wmf.11) - ''https://gerrit.wikimedia.org/r/1227265 (https://phabricator.wikimedia.org/T395204)'
|
|
2026-01-15 09:13:39
|
<wikibugs>
|
('CR) ''TrainBranchBot: [C:''+2] "Approved by kharlan@deploy2002 using scap backport" [core] (wmf/1.46.0-wmf.11) - ''https://gerrit.wikimedia.org/r/1227265 (https://phabricator.wikimedia.org/T395204) (owner: ''Kosta Harlan)'
|
|
2026-01-15 09:13:47
|
<kostajh>
|
artemkloko: which patch are you trying to deploy?
|
|
2026-01-15 09:14:01
|
<icinga-wm>
|
RECOVERY - Check unit status of statograph_post on alert1002 is OK: OK: Status of the systemd unit statograph_post https://wikitech.wikimedia.org/wiki/Monitoring/systemd_unit_state
|
|
2026-01-15 09:16:55
|
<artemkloko>
|
I am still looking into the bug, have to look into what hashar mentioned
|
|
2026-01-15 09:18:47
|
<wikibugs>
|
('PS2) ''Dzahn: microsites: monitor wikipedia25.org (WIP) [puppet] - ''https://gerrit.wikimedia.org/r/1224575'
|
|
2026-01-15 09:19:13
|
<wikibugs>
|
('PS3) ''Dzahn: microsites: monitor wikipedia25.org [puppet] - ''https://gerrit.wikimedia.org/r/1224575'
|
|
2026-01-15 09:19:25
|
<wikibugs>
|
('CR) ''Dzahn: microsites: monitor wikipedia25.org [puppet] - ''https://gerrit.wikimedia.org/r/1224575 (owner: ''Dzahn)'
|
|
2026-01-15 09:22:20
|
<wikibugs>
|
'SRE, ''SRE-Access-Requests, ''Patch-For-Review: Yubikey-SSH-FIDO access for dduvall - https://phabricator.wikimedia.org/T414619#11524174 (''JMeybohm) a:''MoritzMuehlenhoff @MoritzMuehlenhoff assigning to you so the next clinic duty person knows you're working on this with Dan, thanks'
|
|
2026-01-15 09:22:33
|
<wikibugs>
|
('PS4) ''Dzahn: microsites: monitor wikipedia25.org [puppet] - ''https://gerrit.wikimedia.org/r/1224575'
|
|
2026-01-15 09:24:11
|
<jinxer-wm>
|
FIRING: KubernetesCalicoDown: ml-serve2004.codfw.wmnet is not running calico-node Pod - https://wikitech.wikimedia.org/wiki/Calico#Operations - https://grafana.wikimedia.org/d/G8zPL7-Wz/?var-dc=codfw%20prometheus%2Fk8s-mlserve&var-instance=ml-serve2004.codfw.wmnet - https://alerts.wikimedia.org/?q=alertname%3DKubernetesCalicoDown
|
|
2026-01-15 09:24:34
|
<wikibugs>
|
('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Thursday, January 15 UTC afternoon backport window](https://wikitech.wikimedia.org/wiki/Deployments#deplo" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227004 (https://phabricator.wikimedia.org/T407806) (owner: ''Clare Ming)'
|
|
2026-01-15 09:24:57
|
<wikibugs>
|
('PS1) ''Muehlenhoff: Remove profile::puppet::agent::force_puppet7 from search roles [puppet] - ''https://gerrit.wikimedia.org/r/1227270 (https://phabricator.wikimedia.org/T365798)'
|
|
2026-01-15 09:25:00
|
<wikibugs>
|
('PS13) ''Daniel Kinzler: rest gateway: add tests for chart rendering [deployment-charts] - ''https://gerrit.wikimedia.org/r/1225085'
|
|
2026-01-15 09:25:06
|
<wikibugs>
|
('CR) ''Daniel Kinzler: rest gateway: add tests for chart rendering (''1 comment) [deployment-charts] - ''https://gerrit.wikimedia.org/r/1225085 (owner: ''Daniel Kinzler)'
|
|
2026-01-15 09:26:48
|
<wikibugs>
|
('PS2) ''Arnaudb: gerrit: Switchover gerrit1003 → gerrit2003 [puppet] - ''https://gerrit.wikimedia.org/r/1217133 (https://phabricator.wikimedia.org/T338470)'
|
|
2026-01-15 09:26:49
|
<wikibugs>
|
('PS5) ''Dzahn: microsites: monitor wikipedia25.org [puppet] - ''https://gerrit.wikimedia.org/r/1224575'
|
|
2026-01-15 09:27:13
|
<wikibugs>
|
('CR) ''Arnaudb: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1217133 (https://phabricator.wikimedia.org/T338470) (owner: ''Arnaudb)'
|
|
2026-01-15 09:27:17
|
<wikibugs>
|
('Merged) ''jenkins-bot: WebRequest::getSecurityLogContext: Log if user is a bot [core] (wmf/1.46.0-wmf.11) - ''https://gerrit.wikimedia.org/r/1227265 (https://phabricator.wikimedia.org/T395204) (owner: ''Kosta Harlan)'
|
|
2026-01-15 09:27:47
|
<logmsgbot>
|
!log kharlan@deploy2002 Started scap sync-world: Backport for [[gerrit:1227265|WebRequest::getSecurityLogContext: Log if user is a bot (T395204)]]
|
|
2026-01-15 09:27:52
|
<stashbot>
|
T395204: MediaWiki should log request information (IP, user agent, referrer, HTTP method, etc) in a more uniform and predictable way - https://phabricator.wikimedia.org/T395204
|
|
2026-01-15 09:28:52
|
<wikibugs>
|
'SRE, ''Infrastructure-Foundations, ''netops: Cloudcephosd: migrate to single network uplink - https://phabricator.wikimedia.org/T399180#11524188 (''fgiunchedi) ''Open→''Resolved a:''fgiunchedi All hosts that are not pending decom have been migrated to single uplink, resolving.'
|
|
2026-01-15 09:29:53
|
<logmsgbot>
|
!log kharlan@deploy2002 kharlan: Backport for [[gerrit:1227265|WebRequest::getSecurityLogContext: Log if user is a bot (T395204)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
|
|
2026-01-15 09:30:12
|
<wikibugs>
|
'SRE, ''SRE-Access-Requests, ''Data-Engineering, ''Data-Platform-SRE: Grant Access to analytics-privatedata-users for hmonroy - https://phabricator.wikimedia.org/T414375#11524193 (''JMeybohm) >>! In T414375#11523067, @HMonroy wrote: > @JMeybohm Hi! I'm trying a query wmf.mediawiki_history in superset. I'm...'
|
|
2026-01-15 09:32:47
|
<logmsgbot>
|
!log kharlan@deploy2002 kharlan: Continuing with sync
|
|
2026-01-15 09:33:13
|
<wikibugs>
|
('PS4) ''Daniel Kinzler: rest gateway: implement per-policy shadow mode [deployment-charts] - ''https://gerrit.wikimedia.org/r/1225699 (https://phabricator.wikimedia.org/T413183)'
|
|
2026-01-15 09:36:51
|
<logmsgbot>
|
!log kharlan@deploy2002 Finished scap sync-world: Backport for [[gerrit:1227265|WebRequest::getSecurityLogContext: Log if user is a bot (T395204)]] (duration: 09m 04s)
|
|
2026-01-15 09:36:55
|
<stashbot>
|
T395204: MediaWiki should log request information (IP, user agent, referrer, HTTP method, etc) in a more uniform and predictable way - https://phabricator.wikimedia.org/T395204
|
|
2026-01-15 09:36:59
|
<wikibugs>
|
('CR) ''Dzahn: [C:''+2] microsites: monitor wikipedia25.org [puppet] - ''https://gerrit.wikimedia.org/r/1224575 (owner: ''Dzahn)'
|
|
2026-01-15 09:37:56
|
<wikibugs>
|
('PS5) ''Daniel Kinzler: rest-gateway: generate retry-after header for rate-limited requests [deployment-charts] - ''https://gerrit.wikimedia.org/r/1224937 (https://phabricator.wikimedia.org/T405636)'
|
|
2026-01-15 09:38:21
|
<wikibugs>
|
('CR) ''JMeybohm: [C:''+1] "🎉" [puppet] - ''https://gerrit.wikimedia.org/r/1227261 (https://phabricator.wikimedia.org/T365798) (owner: ''Muehlenhoff)'
|
|
2026-01-15 09:38:43
|
<wikibugs>
|
('CR) ''Daniel Kinzler: rest-gateway: generate retry-after header for rate-limited requests (''2 comments) [deployment-charts] - ''https://gerrit.wikimedia.org/r/1224937 (https://phabricator.wikimedia.org/T405636) (owner: ''Daniel Kinzler)'
|
|
2026-01-15 09:39:32
|
<wikibugs>
|
('PS2) ''Daniel Kinzler: rest gateway: include a meaningful body with 429 responses [deployment-charts] - ''https://gerrit.wikimedia.org/r/1226827 (https://phabricator.wikimedia.org/T405636)'
|
|
2026-01-15 09:39:41
|
<wikibugs>
|
('CR) ''Majavah: [C:''+1] Remove profile::puppet::agent::force_puppet7 from cloud roles [puppet] - ''https://gerrit.wikimedia.org/r/1227264 (https://phabricator.wikimedia.org/T365798) (owner: ''Muehlenhoff)'
|
|
2026-01-15 09:42:58
|
<wikibugs>
|
('PS14) ''Daniel Kinzler: charts: add redioscope chart and service [deployment-charts] - ''https://gerrit.wikimedia.org/r/1207256 (https://phabricator.wikimedia.org/T407999)'
|
|
2026-01-15 09:44:05
|
<wikibugs>
|
('CR) ''Muehlenhoff: [C:''+1] "Looks good" [puppet] - ''https://gerrit.wikimedia.org/r/1226774 (https://phabricator.wikimedia.org/T402512) (owner: ''Elukey)'
|
|
2026-01-15 09:45:55
|
<wikibugs>
|
('CR) ''Filippo Giunchedi: [C:''+1] Remove profile::puppet::agent::force_puppet7 from cloud roles [puppet] - ''https://gerrit.wikimedia.org/r/1227264 (https://phabricator.wikimedia.org/T365798) (owner: ''Muehlenhoff)'
|
|
2026-01-15 09:46:38
|
<wikibugs>
|
('CR) ''Muehlenhoff: "Looks good" [puppet] - ''https://gerrit.wikimedia.org/r/1226775 (https://phabricator.wikimedia.org/T402512) (owner: ''Elukey)'
|
|
2026-01-15 09:47:43
|
<wikibugs>
|
('CR) ''Elukey: [C:''+2] admin: add the analytics-sre uid and gid [puppet] - ''https://gerrit.wikimedia.org/r/1226774 (https://phabricator.wikimedia.org/T402512) (owner: ''Elukey)'
|
|
2026-01-15 09:56:54
|
<logmsgbot>
|
!log btullis@cumin1003 START - Cookbook sre.hosts.reboot-single for host an-worker1200.eqiad.wmnet
|
|
2026-01-15 09:57:16
|
<wikibugs>
|
'ops-eqiad, ''SRE, ''DC-Ops, ''Data-Platform-SRE (2026.01.05 - 2026.01.23): Degraded RAID on an-worker1200 - https://phabricator.wikimedia.org/T413360#11524257 (''ops-monitoring-bot) Host an-worker1200.eqiad.wmnet rebooted by btullis@cumin1003 with reason: Rebooting to allow unmounting failed disk'
|
|
2026-01-15 09:58:32
|
<logmsgbot>
|
!log dzahn@cumin2002 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on tcp-proxy1001.eqiad.wmnet with reason: remove nftables
|
|
2026-01-15 10:04:01
|
<wikibugs>
|
('PS1) ''D3r1ck01: Control: Handle accepted consumers with "auth-only" grants [extensions/OAuth] (wmf/1.46.0-wmf.10) - ''https://gerrit.wikimedia.org/r/1227280 (https://phabricator.wikimedia.org/T413947)'
|
|
2026-01-15 10:04:36
|
<wikibugs>
|
('PS1) ''D3r1ck01: Control: When saving grants, ensure array has no gaps [extensions/OAuth] (wmf/1.46.0-wmf.11) - ''https://gerrit.wikimedia.org/r/1227281'
|
|
2026-01-15 10:05:01
|
<wikibugs>
|
('PS1) ''D3r1ck01: Control: Keep irrevocable grants when accepting new OAuth 2 consumers [extensions/OAuth] (wmf/1.46.0-wmf.11) - ''https://gerrit.wikimedia.org/r/1227282 (https://phabricator.wikimedia.org/T413947)'
|
|
2026-01-15 10:05:28
|
<logmsgbot>
|
!log dzahn@cumin2002 START - Cookbook sre.hosts.reboot-cluster
|
|
2026-01-15 10:05:29
|
<logmsgbot>
|
!log dzahn@cumin2002 END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
|
|
2026-01-15 10:06:01
|
<logmsgbot>
|
!log dzahn@cumin2002 START - Cookbook sre.hosts.reboot-cluster
|
|
2026-01-15 10:06:02
|
<logmsgbot>
|
!log dzahn@cumin2002 END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
|
|
2026-01-15 10:07:19
|
<logmsgbot>
|
!log dzahn@cumin2002 START - Cookbook sre.hosts.reboot-single for host tcp-proxy1001.eqiad.wmnet
|
|
2026-01-15 10:07:20
|
<wikibugs>
|
('Abandoned) ''D3r1ck01: Control: Handle accepted consumers with "auth-only" grants [extensions/OAuth] (wmf/1.46.0-wmf.10) - ''https://gerrit.wikimedia.org/r/1227280 (https://phabricator.wikimedia.org/T413947) (owner: ''D3r1ck01)'
|
|
2026-01-15 10:08:43
|
<wikibugs>
|
('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Thursday, January 15 UTC late backport window](https://wikitech.wikimedia.org/wiki/Deployments#deploycal-" [extensions/OAuth] (wmf/1.46.0-wmf.11) - ''https://gerrit.wikimedia.org/r/1227281 (owner: ''D3r1ck01)'
|
|
2026-01-15 10:08:57
|
<wikibugs>
|
('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Thursday, January 15 UTC late backport window](https://wikitech.wikimedia.org/wiki/Deployments#deploycal-" [extensions/OAuth] (wmf/1.46.0-wmf.11) - ''https://gerrit.wikimedia.org/r/1227282 (https://phabricator.wikimedia.org/T413947) (owner: ''D3r1ck01)'
|
|
2026-01-15 10:09:21
|
<wikibugs>
|
('CR) ''Vgutierrez: [C:''+2] cache::upload: rate-limit rather than blocking bingbot [puppet] - ''https://gerrit.wikimedia.org/r/1227202 (owner: ''Giuseppe Lavagetto)'
|
|
2026-01-15 10:10:39
|
<wikibugs>
|
'ops-ulsfo, ''SRE, ''DC-Ops, ''Infrastructure-Foundations, ''netops: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11524278 (''cmooney) >>! In T408892#11523618, @Papaul wrote: > Phase 1 of ULSFO migration which was changing the loopback addresses of cr1,cr4 ,mr1 and the IP...'
|
|
2026-01-15 10:11:07
|
<logmsgbot>
|
!log dzahn@cumin2002 END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy1001.eqiad.wmnet
|
|
2026-01-15 10:12:05
|
<wikibugs>
|
('PS2) ''Elukey: role::puppetserver: deploy kerberos keytab for analytics-sre [puppet] - ''https://gerrit.wikimedia.org/r/1226775 (https://phabricator.wikimedia.org/T402512)'
|
|
2026-01-15 10:13:24
|
<wikibugs>
|
('CR) ''Elukey: [C:''+2] role::puppetserver: deploy kerberos keytab for analytics-sre [puppet] - ''https://gerrit.wikimedia.org/r/1226775 (https://phabricator.wikimedia.org/T402512) (owner: ''Elukey)'
|
|
2026-01-15 10:14:54
|
<wikibugs>
|
('PS2) ''Elukey: WIP: profile::puppetserver::volatile: add hdfs rsync job [puppet] - ''https://gerrit.wikimedia.org/r/1226776 (https://phabricator.wikimedia.org/T402512)'
|
|
2026-01-15 10:16:57
|
<icinga-wm>
|
PROBLEM - Host an-worker1200 is DOWN: PING CRITICAL - Packet loss = 100%
|
|
2026-01-15 10:19:41
|
<wikibugs>
|
'SRE, ''Kubernetes, ''ServiceOps new: Failing docker registry tests - https://phabricator.wikimedia.org/T414576#11524310 (''JMeybohm) p:''Triage→''Medium The 403 vs. 401 or 404 are the result of the tests being run against a read-only (`profile::docker_registry::read_only_mode`) instance of the registry...'
|
|
2026-01-15 10:19:53
|
<wikibugs>
|
'SRE, ''Kubernetes, ''ServiceOps new: Failing docker registry httpbb tests - https://phabricator.wikimedia.org/T414576#11524313 (''JMeybohm)'
|
|
2026-01-15 10:20:24
|
<wikibugs>
|
('CR) ''Muehlenhoff: [C:''+2] Remove profile::puppet::agent::force_puppet7 from cloud roles [puppet] - ''https://gerrit.wikimedia.org/r/1227264 (https://phabricator.wikimedia.org/T365798) (owner: ''Muehlenhoff)'
|
|
2026-01-15 10:22:32
|
<wikibugs>
|
('CR) ''Dzahn: [C:''+2] "had to follow-up and remove the nftables package via cumin and reboot the hosts - normally we don't have this case where we move from nfta" [puppet] - ''https://gerrit.wikimedia.org/r/1215284 (https://phabricator.wikimedia.org/T408532) (owner: ''Dzahn)'
|
|
2026-01-15 10:23:03
|
<wikibugs>
|
('PS3) ''Elukey: WIP: profile::puppetserver::volatile: add hdfs rsync job [puppet] - ''https://gerrit.wikimedia.org/r/1226776 (https://phabricator.wikimedia.org/T402512)'
|
|
2026-01-15 10:23:03
|
<wikibugs>
|
('PS1) ''Elukey: role::puppetserver: add the profile to fetch the krb keytabs [puppet] - ''https://gerrit.wikimedia.org/r/1227285 (https://phabricator.wikimedia.org/T402512)'
|
|
2026-01-15 10:23:49
|
<wikibugs>
|
('CR) ''Elukey: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1226776 (https://phabricator.wikimedia.org/T402512) (owner: ''Elukey)'
|
|
2026-01-15 10:26:05
|
<wikibugs>
|
('CR) ''Muehlenhoff: [C:''+2] Remove profile::puppet::agent::force_puppet7 from serviceops roles [puppet] - ''https://gerrit.wikimedia.org/r/1227261 (https://phabricator.wikimedia.org/T365798) (owner: ''Muehlenhoff)'
|
|
2026-01-15 10:27:36
|
<wikibugs>
|
('PS4) ''Elukey: WIP: profile::puppetserver::volatile: add hdfs rsync job [puppet] - ''https://gerrit.wikimedia.org/r/1226776 (https://phabricator.wikimedia.org/T402512)'
|
|
2026-01-15 10:27:47
|
<wikibugs>
|
('CR) ''Elukey: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1226776 (https://phabricator.wikimedia.org/T402512) (owner: ''Elukey)'
|
|
2026-01-15 10:28:24
|
<wikibugs>
|
('CR) ''Elukey: [C:''+2] role::puppetserver: add the profile to fetch the krb keytabs [puppet] - ''https://gerrit.wikimedia.org/r/1227285 (https://phabricator.wikimedia.org/T402512) (owner: ''Elukey)'
|
|
2026-01-15 10:30:45
|
<logmsgbot>
|
!log marostegui@cumin1003 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1190.eqiad.wmnet with reason: Maintenance
|
|
2026-01-15 10:30:53
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Depooling db1190 (T413525)', diff saved to https://phabricator.wikimedia.org/P87541 and previous config saved to /var/cache/conftool/dbconfig/20260115-103053-marostegui.json
|
|
2026-01-15 10:30:57
|
<stashbot>
|
T413525: Add il_target_id to imagelinks table in wmf production - https://phabricator.wikimedia.org/T413525
|
|
2026-01-15 10:34:40
|
<wikibugs>
|
('PS1) ''Elukey: Add fake kerberos keytabs for the Puppetserver hosts [labs/private] - ''https://gerrit.wikimedia.org/r/1227290 (https://phabricator.wikimedia.org/T402512)'
|
|
2026-01-15 10:35:01
|
<wikibugs>
|
('CR) ''Elukey: [V:''+2 C:''+2] Add fake kerberos keytabs for the Puppetserver hosts [labs/private] - ''https://gerrit.wikimedia.org/r/1227290 (https://phabricator.wikimedia.org/T402512) (owner: ''Elukey)'
|
|
2026-01-15 10:35:47
|
<wikibugs>
|
('CR) ''Elukey: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1226776 (https://phabricator.wikimedia.org/T402512) (owner: ''Elukey)'
|
|
2026-01-15 10:35:59
|
<wikibugs>
|
('PS5) ''Elukey: WIP: profile::puppetserver::volatile: add hdfs rsync job [puppet] - ''https://gerrit.wikimedia.org/r/1226776 (https://phabricator.wikimedia.org/T402512)'
|
|
2026-01-15 10:38:06
|
<jinxer-wm>
|
FIRING: CoreRouterInterfaceDown: Core router interface down - pfw1-codfw:reth1 (Subnet frack-fundraising-codfw in F5) - https://wikitech.wikimedia.org/wiki/Network_monitoring#Router_interface_down - https://grafana.wikimedia.org/d/fb403d62-5f03-434a-9dff-bd02b9fff504/network-device-overview?var-instance=pfw1-codfw:9804 - https://alerts.wikimedia.org/?q=alertname%3DCoreRouterInterfaceDown
|
|
2026-01-15 10:38:25
|
<jinxer-wm>
|
FIRING: [14x] SystemdUnitFailed: send_tile_invalidations.service on maps1011:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
|
|
2026-01-15 10:39:11
|
<wikibugs>
|
('PS6) ''Elukey: WIP: profile::puppetserver::volatile: add hdfs rsync job [puppet] - ''https://gerrit.wikimedia.org/r/1226776 (https://phabricator.wikimedia.org/T402512)'
|
|
2026-01-15 10:39:55
|
<wikibugs>
|
('CR) ''Elukey: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1226776 (https://phabricator.wikimedia.org/T402512) (owner: ''Elukey)'
|
|
2026-01-15 10:41:49
|
<wikibugs>
|
('PS7) ''Elukey: WIP: profile::puppetserver::volatile: add hdfs rsync job [puppet] - ''https://gerrit.wikimedia.org/r/1226776 (https://phabricator.wikimedia.org/T402512)'
|
|
2026-01-15 10:42:14
|
<wikibugs>
|
('CR) ''Elukey: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1226776 (https://phabricator.wikimedia.org/T402512) (owner: ''Elukey)'
|
|
2026-01-15 10:42:21
|
<wikibugs>
|
'ops-eqiad, ''SRE, ''DC-Ops, ''Data-Platform-SRE (2026.01.05 - 2026.01.23): hw troubleshooting: PERC1 battery failure for an-worker1148 - https://phabricator.wikimedia.org/T411919#11524338 (''BTullis) The RAID controller firmware is already the latest version. {F71530261} {F71530265} I'm continuing to...'
|
|
2026-01-15 10:51:00
|
<logmsgbot>
|
!log vgutierrez@cumin1003 START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp - haproxy 2.8.18 upgrade (T414318)
|
|
2026-01-15 10:51:04
|
<stashbot>
|
T414318: upgrade to HAProxy 2.8.18 - https://phabricator.wikimedia.org/T414318
|
|
2026-01-15 10:51:16
|
<logmsgbot>
|
!log vgutierrez@cumin1003 START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - haproxy 2.8.18 upgrade (T414318)
|
|
2026-01-15 11:00:05
|
<jouncebot>
|
Deploy window MediaWiki infrastructure (UTC mid-day) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T1100)
|
|
2026-01-15 11:03:25
|
<jinxer-wm>
|
FIRING: [15x] SystemdUnitFailed: send_tile_invalidations.service on maps1011:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
|
|
2026-01-15 11:06:45
|
<wikibugs>
|
('Abandoned) ''Giuseppe Lavagetto: Revert "Move status, commit status/history to database" [software/hiddenparma/deploy] - ''https://gerrit.wikimedia.org/r/1226867 (owner: ''Giuseppe Lavagetto)'
|
|
2026-01-15 11:10:00
|
<jynus>
|
!log force dbprov1004 restart
|
|
2026-01-15 11:10:01
|
<stashbot>
|
Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
|
|
2026-01-15 11:11:49
|
<wikibugs>
|
('PS15) ''Daniel Kinzler: charts: add redioscope chart and service [deployment-charts] - ''https://gerrit.wikimedia.org/r/1207256 (https://phabricator.wikimedia.org/T407999)'
|
|
2026-01-15 11:11:58
|
<wikibugs>
|
('CR) ''CI reject: [V:''-1] charts: add redioscope chart and service [deployment-charts] - ''https://gerrit.wikimedia.org/r/1207256 (https://phabricator.wikimedia.org/T407999) (owner: ''Daniel Kinzler)'
|
|
2026-01-15 11:12:02
|
<wikibugs>
|
('CR) ''Daniel Kinzler: charts: add redioscope chart and service (''8 comments) [deployment-charts] - ''https://gerrit.wikimedia.org/r/1207256 (https://phabricator.wikimedia.org/T407999) (owner: ''Daniel Kinzler)'
|
|
2026-01-15 11:13:24
|
<wikibugs>
|
('PS1) ''Muehlenhoff: Remove profile::puppet::agent::force_puppet7 from IF roles [puppet] - ''https://gerrit.wikimedia.org/r/1227292 (https://phabricator.wikimedia.org/T365798)'
|
|
2026-01-15 11:15:27
|
<wikibugs>
|
'SRE, ''collaboration-services, ''Patch-For-Review, ''PES1.3.3 WP25 Easter Eggs: Request: Wikipedia 25 microsite hosting - https://phabricator.wikimedia.org/T408592#11524436 (''ATitkov) QA was successful. Some people report needed a refresh for the first visit on https://wikipedia25.org/ or https://w...'
|
|
2026-01-15 11:16:56
|
<logmsgbot>
|
!log btullis@cumin1003 END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1200.eqiad.wmnet
|
|
2026-01-15 11:21:13
|
<moritzm>
|
!log installing nginx security updates
|
|
2026-01-15 11:21:15
|
<stashbot>
|
Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
|
|
2026-01-15 11:25:19
|
<wikibugs>
|
('CR) ''Elukey: [C:''+1] Remove profile::puppet::agent::force_puppet7 from IF roles [puppet] - ''https://gerrit.wikimedia.org/r/1227292 (https://phabricator.wikimedia.org/T365798) (owner: ''Muehlenhoff)'
|
|
2026-01-15 11:26:18
|
<wikibugs>
|
('PS1) ''Vgutierrez: tcpproxy: Accept connections from the internet [puppet] - ''https://gerrit.wikimedia.org/r/1227294'
|
|
2026-01-15 11:26:41
|
<wikibugs>
|
('CR) ''Vgutierrez: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1227294 (owner: ''Vgutierrez)'
|
|
2026-01-15 11:26:48
|
<wikibugs>
|
('CR) ''CI reject: [V:''-1] tcpproxy: Accept connections from the internet [puppet] - ''https://gerrit.wikimedia.org/r/1227294 (owner: ''Vgutierrez)'
|
|
2026-01-15 11:26:54
|
<wikibugs>
|
('CR) ''Muehlenhoff: [C:''+2] Remove profile::puppet::agent::force_puppet7 from IF roles [puppet] - ''https://gerrit.wikimedia.org/r/1227292 (https://phabricator.wikimedia.org/T365798) (owner: ''Muehlenhoff)'
|
|
2026-01-15 11:29:26
|
<wikibugs>
|
('PS2) ''Vgutierrez: tcpproxy: Accept connections from the internet [puppet] - ''https://gerrit.wikimedia.org/r/1227294'
|
|
2026-01-15 11:29:39
|
<logmsgbot>
|
!log vgutierrez@cumin1003 END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp - haproxy 2.8.18 upgrade (T414318)
|
|
2026-01-15 11:29:42
|
<stashbot>
|
T414318: upgrade to HAProxy 2.8.18 - https://phabricator.wikimedia.org/T414318
|
|
2026-01-15 11:29:51
|
<wikibugs>
|
'SRE, ''collaboration-services, ''Patch-For-Review, ''PES1.3.3 WP25 Easter Eggs: Request: Wikipedia 25 microsite hosting - https://phabricator.wikimedia.org/T408592#11524474 (''ATitkov) I know it might look too soon, but I want to request either scheduled re-deployments or the ability to deploy myself...'
|
|
2026-01-15 11:30:19
|
<wikibugs>
|
'ops-eqiad, ''DC-Ops: dbprov1004 lost connectivity, leading to a pause in eqiad database backups - https://phabricator.wikimedia.org/T414668 (''jcrespo) ''NEW'
|
|
2026-01-15 11:31:45
|
<wikibugs>
|
('CR) ''Vgutierrez: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1227294 (owner: ''Vgutierrez)'
|
|
2026-01-15 11:33:26
|
<wikibugs>
|
('PS1) ''Gkyziridis: ml-services: Deploy rr-multilingual model using bookworm base image. [deployment-charts] - ''https://gerrit.wikimedia.org/r/1227296 (https://phabricator.wikimedia.org/T411786)'
|
|
2026-01-15 11:33:51
|
<logmsgbot>
|
!log vgutierrez@cumin1003 END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - haproxy 2.8.18 upgrade (T414318)
|
|
2026-01-15 11:35:52
|
<wikibugs>
|
('PS2) ''Gkyziridis: ml-services: Deploy rr-multilingual model using bookworm base image. [deployment-charts] - ''https://gerrit.wikimedia.org/r/1227296 (https://phabricator.wikimedia.org/T411786)'
|
|
2026-01-15 11:37:00
|
<wikibugs>
|
'SRE, ''Observability-Metrics: Change units for "network utilization" on "host overview" dashboard to bits/sec - https://phabricator.wikimedia.org/T414670 (''cmooney) ''NEW p:''Triage→''Low'
|
|
2026-01-15 11:37:21
|
<wikibugs>
|
'SRE, ''Observability-Metrics: Change units for "network utilization" on "host overview" dashboard to bits/sec - https://phabricator.wikimedia.org/T414670#11524521 (''cmooney)'
|
|
2026-01-15 11:37:52
|
<wikibugs>
|
'SRE, ''SRE-Access-Requests: Requesting access to L3 data access for kimpham (developer name Kim.pham) - https://phabricator.wikimedia.org/T414660#11524522 (''WMDE-leszek) I approve this request on WMDE's end. Thank you'
|
|
2026-01-15 11:39:18
|
<wikibugs>
|
('CR) ''Kevin Bazira: [C:''+1] ml-services: Deploy rr-multilingual model using bookworm base image. [deployment-charts] - ''https://gerrit.wikimedia.org/r/1227296 (https://phabricator.wikimedia.org/T411786) (owner: ''Gkyziridis)'
|
|
2026-01-15 11:40:16
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db2206 (T413525)', diff saved to https://phabricator.wikimedia.org/P87542 and previous config saved to /var/cache/conftool/dbconfig/20260115-114015-marostegui.json
|
|
2026-01-15 11:40:19
|
<stashbot>
|
T413525: Add il_target_id to imagelinks table in wmf production - https://phabricator.wikimedia.org/T413525
|
|
2026-01-15 11:46:17
|
<wikibugs>
|
('CR) ''Gkyziridis: [C:''+2] ml-services: Deploy rr-multilingual model using bookworm base image. [deployment-charts] - ''https://gerrit.wikimedia.org/r/1227296 (https://phabricator.wikimedia.org/T411786) (owner: ''Gkyziridis)'
|
|
2026-01-15 11:48:06
|
<wikibugs>
|
('Merged) ''jenkins-bot: ml-services: Deploy rr-multilingual model using bookworm base image. [deployment-charts] - ''https://gerrit.wikimedia.org/r/1227296 (https://phabricator.wikimedia.org/T411786) (owner: ''Gkyziridis)'
|
|
2026-01-15 11:50:24
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P87543 and previous config saved to /var/cache/conftool/dbconfig/20260115-115023-marostegui.json
|
|
2026-01-15 11:51:03
|
<wikibugs>
|
('PS2) ''Muehlenhoff: Remove profile::puppet::agent::force_puppet7 from traffic hosts [puppet] - ''https://gerrit.wikimedia.org/r/1225524 (https://phabricator.wikimedia.org/T365798)'
|
|
2026-01-15 11:51:40
|
<wikibugs>
|
'ops-eqiad, ''DC-Ops: dbprov1004 lost connectivity, leading to a pause in eqiad database backups - https://phabricator.wikimedia.org/T414668#11524548 (''jcrespo) For context, rebooting the host didn't fix the issue.'
|
|
2026-01-15 11:52:11
|
<logmsgbot>
|
!log gkyziridis@deploy2002 helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
|
|
2026-01-15 11:52:28
|
<logmsgbot>
|
!log gkyziridis@deploy2002 helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
|
|
2026-01-15 11:52:34
|
<wikibugs>
|
('CR) ''Muehlenhoff: "Thanks, these were already removed (hcaptcha via https://gerrit.wikimedia.org/r/c/operations/puppet/+/1227261 and the insetup role via htt" [puppet] - ''https://gerrit.wikimedia.org/r/1225524 (https://phabricator.wikimedia.org/T365798) (owner: ''Muehlenhoff)'
|
|
2026-01-15 12:00:32
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P87544 and previous config saved to /var/cache/conftool/dbconfig/20260115-120032-marostegui.json
|
|
2026-01-15 12:00:51
|
<wikibugs>
|
'SRE, ''SRE-Access-Requests: Requesting access to SRE/production access for Kim.pham (kimpham in phab) - https://phabricator.wikimedia.org/T414671 (''kimpham) ''NEW'
|
|
2026-01-15 12:02:46
|
<wikibugs>
|
'SRE, ''Infrastructure-Foundations, ''netops, ''Data-Platform-SRE (2026.01.05 - 2026.01.23): Socket leaking on some dse-k8s row C & D hosts - https://phabricator.wikimedia.org/T414460#11524578 (''cmooney) //dse-k8s-worker1013// seems fairly happy in terms of the original problem since we made the change y...'
|
|
2026-01-15 12:10:41
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db2206 (T413525)', diff saved to https://phabricator.wikimedia.org/P87545 and previous config saved to /var/cache/conftool/dbconfig/20260115-121040-marostegui.json
|
|
2026-01-15 12:10:44
|
<stashbot>
|
T413525: Add il_target_id to imagelinks table in wmf production - https://phabricator.wikimedia.org/T413525
|
|
2026-01-15 12:10:57
|
<logmsgbot>
|
!log marostegui@cumin1003 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2210.codfw.wmnet with reason: Maintenance
|
|
2026-01-15 12:11:05
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Depooling db2210 (T413525)', diff saved to https://phabricator.wikimedia.org/P87546 and previous config saved to /var/cache/conftool/dbconfig/20260115-121105-marostegui.json
|
|
2026-01-15 12:16:36
|
<wikibugs>
|
'SRE, ''Infrastructure-Foundations, ''netops, ''Data-Platform-SRE (2026.01.05 - 2026.01.23): Socket leaking on some dse-k8s row C & D hosts - https://phabricator.wikimedia.org/T414460#11524635 (''BTullis) >>! In T414460#11521367, @CDanis wrote: >>>! In T414460#11521085, @cmooney wrote: >> The k8s host sen...'
|
|
2026-01-15 12:22:08
|
<wikibugs>
|
('PS1) ''Muehlenhoff: conf/etcd: Remove now obsolete cert [puppet] - ''https://gerrit.wikimedia.org/r/1227307 (https://phabricator.wikimedia.org/T352245)'
|
|
2026-01-15 12:23:17
|
<wikibugs>
|
('PS1) ''Muehlenhoff: conf/etcd: Remove now obsolete cert [puppet] - ''https://gerrit.wikimedia.org/r/1227309 (https://phabricator.wikimedia.org/T352245)'
|
|
2026-01-15 12:23:43
|
<wikibugs>
|
('CR) ''Muehlenhoff: [C:''+2] wikidough: Enable Bird 2.18 for all servers [puppet] - ''https://gerrit.wikimedia.org/r/1224708 (https://phabricator.wikimedia.org/T413740) (owner: ''Muehlenhoff)'
|
|
2026-01-15 12:24:31
|
<wikibugs>
|
'SRE, ''Infrastructure-Foundations: Integrate Bookworm 12.12 point update - https://phabricator.wikimedia.org/T403852#11524649 (''MoritzMuehlenhoff)'
|
|
2026-01-15 12:26:14
|
<wikibugs>
|
('PS1) ''PipelineBot: mobileapps: pipeline bot promote [deployment-charts] - ''https://gerrit.wikimedia.org/r/1227310'
|
|
2026-01-15 12:27:50
|
<wikibugs>
|
'SRE, ''serviceops, ''Kubernetes: Fix nginx config and caching for docker registry - https://phabricator.wikimedia.org/T256762#11524650 (''JMeybohm) ''Open→''Resolved a:''JMeybohm Since there is clearly no need for optimization here, I'll resolve this now.'
|
|
2026-01-15 12:28:34
|
<wikibugs>
|
('PS1) ''JMeybohm: httpbb: Remove assertions for X-Cache-Status [puppet] - ''https://gerrit.wikimedia.org/r/1227311 (https://phabricator.wikimedia.org/T414576)'
|
|
2026-01-15 12:28:43
|
<ihurbain>
|
jouncebot: nowandnext
|
|
2026-01-15 12:28:43
|
<jouncebot>
|
No deployments scheduled for the next 0 hour(s) and 31 minute(s)
|
|
2026-01-15 12:28:43
|
<jouncebot>
|
In 0 hour(s) and 31 minute(s): Mobileapps/RESTBase/Wikifeeds (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T1300)
|
|
2026-01-15 12:29:48
|
<ihurbain>
|
can I deploy a config patch? (https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1226232)
|
|
2026-01-15 12:31:52
|
<claime>
|
ihurbain: no objection from me. That sampling rate definition is confusing af
|
|
2026-01-15 12:33:57
|
<wikibugs>
|
('PS1) ''Muehlenhoff: Remove profile::puppet::agent::force_puppet7 from Data Platform roles [puppet] - ''https://gerrit.wikimedia.org/r/1227313 (https://phabricator.wikimedia.org/T365798)'
|
|
2026-01-15 12:34:48
|
<ihurbain>
|
claime: the fact that i got confused by it is probably a good sign (but it's also how we apparently sample, and i get that, integers are good, etc)
|
|
2026-01-15 12:34:56
|
<ihurbain>
|
anyway spiderpigging.
|
|
2026-01-15 12:35:42
|
<wikibugs>
|
('CR) ''TrainBranchBot: [C:''+2] "Approved by ihurbain@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1226232 (https://phabricator.wikimedia.org/T412803) (owner: ''Isabelle Hurbain-Palatin)'
|
|
2026-01-15 12:36:30
|
<wikibugs>
|
('Merged) ''jenkins-bot: Turn on debugging for unsafe postproc cache entries logging [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1226232 (https://phabricator.wikimedia.org/T412803) (owner: ''Isabelle Hurbain-Palatin)'
|
|
2026-01-15 12:37:05
|
<logmsgbot>
|
!log ihurbain@deploy2002 Started scap sync-world: Backport for [[gerrit:1226232|Turn on debugging for unsafe postproc cache entries logging (T412803)]]
|
|
2026-01-15 12:37:09
|
<stashbot>
|
T412803: Tweak unsafe post-processing cache keys - https://phabricator.wikimedia.org/T412803
|
|
2026-01-15 12:39:14
|
<logmsgbot>
|
!log ihurbain@deploy2002 ihurbain: Backport for [[gerrit:1226232|Turn on debugging for unsafe postproc cache entries logging (T412803)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
|
|
2026-01-15 12:39:35
|
<icinga-wm>
|
RECOVERY - Host an-worker1200 is UP: PING OK - Packet loss = 0%, RTA = 0.20 ms
|
|
2026-01-15 12:41:09
|
<wikibugs>
|
('PS2) ''Muehlenhoff: Remove profile::puppet::agent::force_puppet7 from Data Platform roles [puppet] - ''https://gerrit.wikimedia.org/r/1227313 (https://phabricator.wikimedia.org/T365798)'
|
|
2026-01-15 12:41:24
|
<logmsgbot>
|
!log ihurbain@deploy2002 ihurbain: Continuing with sync
|
|
2026-01-15 12:45:27
|
<icinga-wm>
|
RECOVERY - Host an-worker1148 is UP: PING OK - Packet loss = 0%, RTA = 0.34 ms
|
|
2026-01-15 12:45:29
|
<logmsgbot>
|
!log ihurbain@deploy2002 Finished scap sync-world: Backport for [[gerrit:1226232|Turn on debugging for unsafe postproc cache entries logging (T412803)]] (duration: 08m 24s)
|
|
2026-01-15 12:45:33
|
<stashbot>
|
T412803: Tweak unsafe post-processing cache keys - https://phabricator.wikimedia.org/T412803
|
|
2026-01-15 12:45:38
|
<ihurbain>
|
woot.
|
|
2026-01-15 12:46:27
|
<ihurbain>
|
and yay, i'm seeing my new logs!
|
|
2026-01-15 12:48:01
|
<wikibugs>
|
('CR) ''Muehlenhoff: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1227307 (https://phabricator.wikimedia.org/T352245) (owner: ''Muehlenhoff)'
|
|
2026-01-15 12:49:07
|
<icinga-wm>
|
RECOVERY - Dell PowerEdge or Supermicro Broadcom RAID Controller on an-worker1200 is OK: communication: 0 OK : controller: 0 OK : physical_disk: 0 OK : virtual_disk: 0 OK : bbu: 0 OK : enclosure: 0 OK https://wikitech.wikimedia.org/wiki/PERCCli%23Monitoring
|
|
2026-01-15 12:50:37
|
<wikibugs>
|
'SRE-SLO, ''Citoid, ''VisualEditor, ''Editing-team (Tracking): Seperate SLO for requests made from Citoid Extension, possible wmf deployed extension only, vs bots etc. - https://phabricator.wikimedia.org/T345627#11524721 (''Mvolz) So we're running at around 10% error for mediawikijs requests, we're allowe...'
|
|
2026-01-15 12:51:23
|
<wikibugs>
|
('CR) ''Muehlenhoff: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1227309 (https://phabricator.wikimedia.org/T352245) (owner: ''Muehlenhoff)'
|
|
2026-01-15 12:53:42
|
<topranks>
|
!log drainin Arelion transit circuit on cr1-codfw in advance of adding second 10G port to bundle
|
|
2026-01-15 12:53:44
|
<stashbot>
|
Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
|
|
2026-01-15 12:55:00
|
<wikibugs>
|
('PS4) ''Muehlenhoff: etcd: Remove the use_pki_certs flag [puppet] - ''https://gerrit.wikimedia.org/r/978615'
|
|
2026-01-15 12:55:28
|
<wikibugs>
|
'SRE-SLO, ''Citoid, ''VisualEditor, ''Editing-team (Tracking): Seperate SLO for requests made from Citoid Extension, possible wmf deployed extension only, vs bots etc. - https://phabricator.wikimedia.org/T345627#11524727 (''Mvolz) If you look for https://thanos.wikimedia.org/graph?g0.expr=sum(rate(citoid_...'
|
|
2026-01-15 12:57:11
|
<wikibugs>
|
('CR) ''Elukey: [C:''+1] httpbb: Remove assertions for X-Cache-Status [puppet] - ''https://gerrit.wikimedia.org/r/1227311 (https://phabricator.wikimedia.org/T414576) (owner: ''JMeybohm)'
|
|
2026-01-15 12:59:08
|
<phuedx>
|
jouncebot: next
|
|
2026-01-15 12:59:09
|
<jouncebot>
|
In 0 hour(s) and 0 minute(s): Mobileapps/RESTBase/Wikifeeds (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T1300)
|
|
2026-01-15 12:59:17
|
<phuedx>
|
jouncebot: nowandnext
|
|
2026-01-15 12:59:17
|
<jouncebot>
|
No deployments scheduled for the next 0 hour(s) and 0 minute(s)
|
|
2026-01-15 12:59:17
|
<jouncebot>
|
In 0 hour(s) and 0 minute(s): Mobileapps/RESTBase/Wikifeeds (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T1300)
|
|
2026-01-15 13:00:05
|
<jouncebot>
|
Deploy window Mobileapps/RESTBase/Wikifeeds (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T1300)
|
|
2026-01-15 13:00:10
|
<phuedx>
|
You win this time jouncebot
|
|
2026-01-15 13:01:36
|
<wikibugs>
|
('PS1) ''Majavah: P:toolforge: k8s: haproxy: Handle plain toolforge.org domain [puppet] - ''https://gerrit.wikimedia.org/r/1227321 (https://phabricator.wikimedia.org/T414674)'
|
|
2026-01-15 13:01:44
|
<wikibugs>
|
('CR) ''Muehlenhoff: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/978615 (owner: ''Muehlenhoff)'
|
|
2026-01-15 13:03:25
|
<jinxer-wm>
|
FIRING: [15x] SystemdUnitFailed: send_tile_invalidations.service on maps1011:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
|
|
2026-01-15 13:05:01
|
<icinga-wm>
|
PROBLEM - Check unit status of statograph_post on alert1002 is CRITICAL: CRITICAL: Status of the systemd unit statograph_post https://wikitech.wikimedia.org/wiki/Monitoring/systemd_unit_state
|
|
2026-01-15 13:08:10
|
<wikibugs>
|
('CR) ''JMeybohm: [C:''+2] httpbb: Remove assertions for X-Cache-Status [puppet] - ''https://gerrit.wikimedia.org/r/1227311 (https://phabricator.wikimedia.org/T414576) (owner: ''JMeybohm)'
|
|
2026-01-15 13:09:37
|
<wikibugs>
|
('PS1) ''Muehlenhoff: Remove profile::puppet::agent::force_puppet7 for Cloud VPS [puppet] - ''https://gerrit.wikimedia.org/r/1227322 (https://phabricator.wikimedia.org/T365798)'
|
|
2026-01-15 13:15:01
|
<icinga-wm>
|
RECOVERY - Check unit status of statograph_post on alert1002 is OK: OK: Status of the systemd unit statograph_post https://wikitech.wikimedia.org/wiki/Monitoring/systemd_unit_state
|
|
2026-01-15 13:22:57
|
<wikibugs>
|
('CR) ''Elukey: "Left a nit but we are close!" [docker-images/production-images] - ''https://gerrit.wikimedia.org/r/1146891 (https://phabricator.wikimedia.org/T385173) (owner: ''Kevin Bazira)'
|
|
2026-01-15 13:23:58
|
<wikibugs>
|
'SRE, ''Kubernetes, ''Patch-For-Review, ''ServiceOps new: Failing docker registry httpbb tests - https://phabricator.wikimedia.org/T414576#11524771 (''JMeybohm) a:''DPogorzelski-WMF The X-Cache-Status failures are gone now: ` jayme@cumin1003:~$ sudo httpbb /srv/deployment/httpbb-tests/docker-registry/te...'
|
|
2026-01-15 13:24:11
|
<jinxer-wm>
|
FIRING: KubernetesCalicoDown: ml-serve2004.codfw.wmnet is not running calico-node Pod - https://wikitech.wikimedia.org/wiki/Calico#Operations - https://grafana.wikimedia.org/d/G8zPL7-Wz/?var-dc=codfw%20prometheus%2Fk8s-mlserve&var-instance=ml-serve2004.codfw.wmnet - https://alerts.wikimedia.org/?q=alertname%3DKubernetesCalicoDown
|
|
2026-01-15 13:25:51
|
<logmsgbot>
|
!log jclark@cumin1003 END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mwlog1003.eqiad.wmnet with OS bookworm
|
|
2026-01-15 13:26:01
|
<wikibugs>
|
'ops-eqiad, ''SRE, ''DC-Ops, ''SRE Observability (FY2025/2026-Q3): Q2:rack/setup/install mwlog1003 - https://phabricator.wikimedia.org/T412230#11524779 (''ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host mwlog1003.eqiad.wmnet with OS bookworm executed with erro...'
|
|
2026-01-15 13:26:17
|
<wikibugs>
|
('PS1) ''Jdrewniak: Bumping portals to master [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227330 (https://phabricator.wikimedia.org/T128546)'
|
|
2026-01-15 13:26:24
|
<wikibugs>
|
('PS1) ''Muehlenhoff: Record LDAP access for aramilferaxa [puppet] - ''https://gerrit.wikimedia.org/r/1227331'
|
|
2026-01-15 13:27:05
|
<logmsgbot>
|
!log jclark@cumin1003 START - Cookbook sre.hosts.reimage for host mwlog1003.eqiad.wmnet with OS bookworm
|
|
2026-01-15 13:27:08
|
<wikibugs>
|
('CR) ''CI reject: [V:''-1] Record LDAP access for aramilferaxa [puppet] - ''https://gerrit.wikimedia.org/r/1227331 (owner: ''Muehlenhoff)'
|
|
2026-01-15 13:27:16
|
<wikibugs>
|
'ops-eqiad, ''SRE, ''DC-Ops, ''SRE Observability (FY2025/2026-Q3): Q2:rack/setup/install mwlog1003 - https://phabricator.wikimedia.org/T412230#11524785 (''ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host mwlog1003.eqiad.wmnet with OS bookworm'
|
|
2026-01-15 13:27:33
|
<wikibugs>
|
('CR) ''Filippo Giunchedi: [C:''+1] P:toolforge: k8s: haproxy: Handle plain toolforge.org domain [puppet] - ''https://gerrit.wikimedia.org/r/1227321 (https://phabricator.wikimedia.org/T414674) (owner: ''Majavah)'
|
|
2026-01-15 13:27:48
|
<moritzm>
|
!log installing squid security updates
|
|
2026-01-15 13:27:50
|
<stashbot>
|
Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
|
|
2026-01-15 13:27:57
|
<wikibugs>
|
('CR) ''Filippo Giunchedi: [C:''+1] Remove profile::puppet::agent::force_puppet7 for Cloud VPS [puppet] - ''https://gerrit.wikimedia.org/r/1227322 (https://phabricator.wikimedia.org/T365798) (owner: ''Muehlenhoff)'
|
|
2026-01-15 13:28:34
|
<wikibugs>
|
('CR) ''Majavah: [V:''+1] "PCC SUCCESS (CORE_DIFF 1): https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/label=puppet7-compiler-node/7899/co" [puppet] - ''https://gerrit.wikimedia.org/r/1227321 (https://phabricator.wikimedia.org/T414674) (owner: ''Majavah)'
|
|
2026-01-15 13:29:06
|
<wikibugs>
|
('CR) ''Majavah: [V:''+1 C:''+2] P:toolforge: k8s: haproxy: Handle plain toolforge.org domain [puppet] - ''https://gerrit.wikimedia.org/r/1227321 (https://phabricator.wikimedia.org/T414674) (owner: ''Majavah)'
|
|
2026-01-15 13:29:31
|
<hashar>
|
jan_drewniak: we can do https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1227330 I think
|
|
2026-01-15 13:29:35
|
<hashar>
|
jouncebot: nowandnext
|
|
2026-01-15 13:29:35
|
<jouncebot>
|
For the next 0 hour(s) and 30 minute(s): Mobileapps/RESTBase/Wikifeeds (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T1300)
|
|
2026-01-15 13:29:35
|
<jouncebot>
|
In 0 hour(s) and 30 minute(s): UTC afternoon backport window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T1400)
|
|
2026-01-15 13:30:01
|
<jan_drewniak>
|
hey folks, I'm going to be deploying a portals updates now just ahead of the backport window
|
|
2026-01-15 13:30:02
|
<wikibugs>
|
('CR) ''Hashar: [C:''+1] Bumping portals to master [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227330 (https://phabricator.wikimedia.org/T128546) (owner: ''Jdrewniak)'
|
|
2026-01-15 13:31:32
|
<wikibugs>
|
('CR) ''TrainBranchBot: [C:''+2] "Approved by jdrewniak@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227330 (https://phabricator.wikimedia.org/T128546) (owner: ''Jdrewniak)'
|
|
2026-01-15 13:32:40
|
<wikibugs>
|
('Merged) ''jenkins-bot: Bumping portals to master [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227330 (https://phabricator.wikimedia.org/T128546) (owner: ''Jdrewniak)'
|
|
2026-01-15 13:33:12
|
<logmsgbot>
|
!log jdrewniak@deploy2002 Started scap sync-world: Backport for [[gerrit:1227330|Bumping portals to master (T128546)]]
|
|
2026-01-15 13:33:17
|
<stashbot>
|
T128546: [Recurring Task] Update Wikipedia and sister projects portals statistics - https://phabricator.wikimedia.org/T128546
|
|
2026-01-15 13:34:24
|
<wikibugs>
|
('Abandoned) ''Muehlenhoff: Record LDAP access for aramilferaxa [puppet] - ''https://gerrit.wikimedia.org/r/1227331 (owner: ''Muehlenhoff)'
|
|
2026-01-15 13:35:18
|
<wikibugs>
|
('PS1) ''Filippo Giunchedi: wmcs: remove value from CephSlowOps summary [alerts] - ''https://gerrit.wikimedia.org/r/1227334 (https://phabricator.wikimedia.org/T414669)'
|
|
2026-01-15 13:35:26
|
<logmsgbot>
|
!log jdrewniak@deploy2002 jdrewniak: Backport for [[gerrit:1227330|Bumping portals to master (T128546)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
|
|
2026-01-15 13:36:27
|
<wikibugs>
|
'SRE, ''SRE-Access-Requests: Requesting access to SRE/production access for Kim.pham (kimpham in phab) - https://phabricator.wikimedia.org/T414671#11524827 (''Novem_Linguae) Are you requesting `deployment` access? > backlog deployment windows Do you mean [[ https://wikitech.wikimedia.org/wiki/Backport_windo...'
|
|
2026-01-15 13:37:07
|
<logmsgbot>
|
!log jdrewniak@deploy2002 jdrewniak: Continuing with sync
|
|
2026-01-15 13:37:51
|
<wikibugs>
|
('CR) ''Majavah: [C:''-1] "The number is useful to see in some form, so can it be added to the description if it can't be in the summary?" [alerts] - ''https://gerrit.wikimedia.org/r/1227334 (https://phabricator.wikimedia.org/T414669) (owner: ''Filippo Giunchedi)'
|
|
2026-01-15 13:40:23
|
<wikibugs>
|
('PS2) ''Filippo Giunchedi: wmcs: remove value from CephSlowOps summary [alerts] - ''https://gerrit.wikimedia.org/r/1227334 (https://phabricator.wikimedia.org/T414669)'
|
|
2026-01-15 13:40:35
|
<wikibugs>
|
('CR) ''Filippo Giunchedi: "Fair point, {{done}}" [alerts] - ''https://gerrit.wikimedia.org/r/1227334 (https://phabricator.wikimedia.org/T414669) (owner: ''Filippo Giunchedi)'
|
|
2026-01-15 13:41:10
|
<logmsgbot>
|
!log jdrewniak@deploy2002 Finished scap sync-world: Backport for [[gerrit:1227330|Bumping portals to master (T128546)]] (duration: 07m 58s)
|
|
2026-01-15 13:41:14
|
<stashbot>
|
T128546: [Recurring Task] Update Wikipedia and sister projects portals statistics - https://phabricator.wikimedia.org/T128546
|
|
2026-01-15 13:42:02
|
<moritzm>
|
!log upgrade wikidough to Bird 2.18 T413740
|
|
2026-01-15 13:42:05
|
<stashbot>
|
Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
|
|
2026-01-15 13:42:06
|
<stashbot>
|
T413740: Backport and test Bird 2.18 - https://phabricator.wikimedia.org/T413740
|
|
2026-01-15 13:42:46
|
<wikibugs>
|
('CR) ''Majavah: [C:''+1] wmcs: remove value from CephSlowOps summary [alerts] - ''https://gerrit.wikimedia.org/r/1227334 (https://phabricator.wikimedia.org/T414669) (owner: ''Filippo Giunchedi)'
|
|
2026-01-15 13:43:11
|
<wikibugs>
|
('CR) ''Muehlenhoff: [C:''+2] Remove profile::puppet::agent::force_puppet7 for Cloud VPS [puppet] - ''https://gerrit.wikimedia.org/r/1227322 (https://phabricator.wikimedia.org/T365798) (owner: ''Muehlenhoff)'
|
|
2026-01-15 13:43:59
|
<jan_drewniak>
|
hashar: I just ran the sync through spiderpig. Now I logged into deploy2002 and run `MEDIAWIKI_STAGING_DIR=/srv/mediawiki-staging | mwscript purgeList.php`
|
|
2026-01-15 13:44:43
|
<wikibugs>
|
('PS1) ''Filippo Giunchedi: sre: remove value from MaxConntrack summary [alerts] - ''https://gerrit.wikimedia.org/r/1227335 (https://phabricator.wikimedia.org/T414669)'
|
|
2026-01-15 13:45:38
|
<wikibugs>
|
('CR) ''Filippo Giunchedi: [C:''+2] wmcs: remove value from CephSlowOps summary [alerts] - ''https://gerrit.wikimedia.org/r/1227334 (https://phabricator.wikimedia.org/T414669) (owner: ''Filippo Giunchedi)'
|
|
2026-01-15 13:47:23
|
<jan_drewniak>
|
hashar: ok, deployed and purged successfully!
|
|
2026-01-15 13:47:33
|
<hashar>
|
well done!
|
|
2026-01-15 13:48:04
|
<hashar>
|
I have sent some changes to the docs on https://gerrit.wikimedia.org/r/q/project:wikimedia/portals+is:open+owner:hashar
|
|
2026-01-15 13:48:11
|
<hashar>
|
then I don't know whether they are accurate
|
|
2026-01-15 13:57:30
|
<wikibugs>
|
'SRE, ''Infrastructure-Foundations: Integrate Bookworm 12.13 point update - https://phabricator.wikimedia.org/T414205#11524895 (''MoritzMuehlenhoff)'
|
|
2026-01-15 13:58:19
|
<wikibugs>
|
'ops-eqiad, ''SRE, ''DC-Ops: dbprov1004 lost connectivity, leading to a pause in eqiad database backups - https://phabricator.wikimedia.org/T414668#11524898 (''Jclark-ctr) a:''Jclark-ctr'
|
|
2026-01-15 13:58:45
|
<wikibugs>
|
('PS1) ''Elukey: role::puppetserver: remove kerberos config [puppet] - ''https://gerrit.wikimedia.org/r/1227338 (https://phabricator.wikimedia.org/T402512)'
|
|
2026-01-15 14:00:04
|
<jouncebot>
|
Lucas_WMDE, Urbanecm, and TheresNoTime: May I have your attention please! UTC afternoon backport window. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T1400)
|
|
2026-01-15 14:00:05
|
<jouncebot>
|
Seawolf35, JSherman, stephanebisson, and phuedx: A patch you scheduled for UTC afternoon backport window is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker.
|
|
2026-01-15 14:00:08
|
<JSherman>
|
o/
|
|
2026-01-15 14:00:12
|
<phuedx>
|
o/
|
|
2026-01-15 14:00:14
|
<Lucas_WMDE>
|
o/
|
|
2026-01-15 14:00:18
|
<Seawolf35>
|
o/
|
|
2026-01-15 14:00:19
|
<stephanebisson>
|
o/
|
|
2026-01-15 14:00:26
|
<Lucas_WMDE>
|
I can deploy!
|
|
2026-01-15 14:00:43
|
<Lucas_WMDE>
|
let’s start with Seawolf35 ^^
|
|
2026-01-15 14:00:52
|
<Seawolf35>
|
Ok
|
|
2026-01-15 14:01:04
|
<wikibugs>
|
('CR) ''TrainBranchBot: [C:''+2] "Approved by lucaswerkmeister-wmde@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1225596 (https://phabricator.wikimedia.org/T414277) (owner: ''Seawolf35gerrit)'
|
|
2026-01-15 14:01:18
|
<wikibugs>
|
('Abandoned) ''Elukey: WIP: profile::puppetserver::volatile: add hdfs rsync job [puppet] - ''https://gerrit.wikimedia.org/r/1226776 (https://phabricator.wikimedia.org/T402512) (owner: ''Elukey)'
|
|
2026-01-15 14:01:51
|
<wikibugs>
|
('PS1) ''Cathal Mooney: Remove offload of Comcast traffic from Arelion [homer/public] - ''https://gerrit.wikimedia.org/r/1227341 (https://phabricator.wikimedia.org/T261867)'
|
|
2026-01-15 14:02:17
|
<wikibugs>
|
('Merged) ''jenkins-bot: ukwiki: Various changes to user rights. [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1225596 (https://phabricator.wikimedia.org/T414277) (owner: ''Seawolf35gerrit)'
|
|
2026-01-15 14:02:49
|
<logmsgbot>
|
!log lucaswerkmeister-wmde@deploy2002 Started scap sync-world: Backport for [[gerrit:1225596|ukwiki: Various changes to user rights. (T414277)]]
|
|
2026-01-15 14:02:53
|
<stashbot>
|
T414277: Some changes in user group rights in ukwiki - https://phabricator.wikimedia.org/T414277
|
|
2026-01-15 14:05:00
|
<logmsgbot>
|
!log lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde, seawolf35gerrit: Backport for [[gerrit:1225596|ukwiki: Various changes to user rights. (T414277)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
|
|
2026-01-15 14:05:28
|
<Lucas_WMDE>
|
Seawolf35: please test!
|
|
2026-01-15 14:05:49
|
<Seawolf35>
|
I’m using the debug cookie on my phone fyi
|
|
2026-01-15 14:06:18
|
<Lucas_WMDE>
|
hmm, I still see the movestable right in the autoconfirmed group I think
|
|
2026-01-15 14:06:19
|
<wikibugs>
|
('PS2) ''Cathal Mooney: Remove offload of Comcast traffic from Arelion [homer/public] - ''https://gerrit.wikimedia.org/r/1227341 (https://phabricator.wikimedia.org/T261867)'
|
|
2026-01-15 14:06:25
|
<icinga-wm>
|
RECOVERY - Host dbprov1004 is UP: PING OK - Packet loss = 0%, RTA = 0.35 ms
|
|
2026-01-15 14:06:51
|
<wikibugs>
|
'ops-eqiad, ''SRE, ''DC-Ops: dbprov1004 lost connectivity, leading to a pause in eqiad database backups - https://phabricator.wikimedia.org/T414668#11524921 (''Jclark-ctr) @jcrespo Replaced Dac cable link came up.'
|
|
2026-01-15 14:07:04
|
<Lucas_WMDE>
|
same for the confirmed group
|
|
2026-01-15 14:07:26
|
<Seawolf35>
|
Everything else seemed to work.
|
|
2026-01-15 14:08:01
|
<wikibugs>
|
('CR) ''Ayounsi: [C:''+1] "lgtm" [homer/public] - ''https://gerrit.wikimedia.org/r/1227341 (https://phabricator.wikimedia.org/T261867) (owner: ''Cathal Mooney)'
|
|
2026-01-15 14:08:37
|
<wikibugs>
|
'ops-eqiad, ''SRE, ''DC-Ops: dbprov1004 lost connectivity, leading to a pause in eqiad database backups - https://phabricator.wikimedia.org/T414668#11524927 (''Jclark-ctr) ''Open→''Resolved updated netbox cableid'
|
|
2026-01-15 14:09:01
|
<Lucas_WMDE>
|
looks like the same is also true for ruwikinews, despite its 'autoconfirmed' => [ 'movestable' => false, ]
|
|
2026-01-15 14:09:04
|
<wikibugs>
|
('CR) ''Elukey: [C:''+2] role::puppetserver: remove kerberos config [puppet] - ''https://gerrit.wikimedia.org/r/1227338 (https://phabricator.wikimedia.org/T402512) (owner: ''Elukey)'
|
|
2026-01-15 14:09:22
|
<Lucas_WMDE>
|
searches phabricator
|
|
2026-01-15 14:09:24
|
<wikibugs>
|
('CR) ''Muehlenhoff: [C:''+2] proton: Bump image [deployment-charts] - ''https://gerrit.wikimedia.org/r/1226218 (owner: ''Muehlenhoff)'
|
|
2026-01-15 14:09:51
|
<wikibugs>
|
('PS1) ''Elukey: Revert "Add fake kerberos keytabs for the Puppetserver hosts" [labs/private] - ''https://gerrit.wikimedia.org/r/1227342'
|
|
2026-01-15 14:09:56
|
<wikibugs>
|
('CR) ''Elukey: [V:''+2 C:''+2] Revert "Add fake kerberos keytabs for the Puppetserver hosts" [labs/private] - ''https://gerrit.wikimedia.org/r/1227342 (owner: ''Elukey)'
|
|
2026-01-15 14:10:26
|
<wikibugs>
|
('CR) ''Cathal Mooney: [C:''+2] Remove offload of Comcast traffic from Arelion [homer/public] - ''https://gerrit.wikimedia.org/r/1227341 (https://phabricator.wikimedia.org/T261867) (owner: ''Cathal Mooney)'
|
|
2026-01-15 14:11:05
|
<Lucas_WMDE>
|
Seawolf35: I think let’s deploy the config change anyway, but the task should then stay open for further investigation what’s going on with this right
|
|
2026-01-15 14:11:07
|
<Lucas_WMDE>
|
does that sound okay?
|
|
2026-01-15 14:11:35
|
<Seawolf35>
|
Sounds good.
|
|
2026-01-15 14:11:47
|
<wikibugs>
|
('Merged) ''jenkins-bot: Remove offload of Comcast traffic from Arelion [homer/public] - ''https://gerrit.wikimedia.org/r/1227341 (https://phabricator.wikimedia.org/T261867) (owner: ''Cathal Mooney)'
|
|
2026-01-15 14:11:52
|
<logmsgbot>
|
!log lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde, seawolf35gerrit: Continuing with sync
|
|
2026-01-15 14:11:56
|
<Seawolf35>
|
Everything else like change tags looks good on my end
|
|
2026-01-15 14:11:57
|
<logmsgbot>
|
!log jmm@deploy2002 helmfile [staging] START helmfile.d/services/proton: apply
|
|
2026-01-15 14:12:05
|
<Lucas_WMDE>
|
alright, thanks
|
|
2026-01-15 14:13:18
|
<wikibugs>
|
'SRE, ''SRE-Access-Requests: Requesting access to analytics-privatedata-users for johannesrichterwmde - https://phabricator.wikimedia.org/T414678 (''Johannes_Richter_WMDE) ''NEW'
|
|
2026-01-15 14:13:57
|
<logmsgbot>
|
!log jmm@deploy2002 helmfile [staging] DONE helmfile.d/services/proton: apply
|
|
2026-01-15 14:15:28
|
<Lucas_WMDE>
|
JSherman: want to self-service once the current deployment is done?
|
|
2026-01-15 14:16:02
|
<logmsgbot>
|
!log lucaswerkmeister-wmde@deploy2002 Finished scap sync-world: Backport for [[gerrit:1225596|ukwiki: Various changes to user rights. (T414277)]] (duration: 13m 13s)
|
|
2026-01-15 14:16:06
|
<stashbot>
|
T414277: Some changes in user group rights in ukwiki - https://phabricator.wikimedia.org/T414277
|
|
2026-01-15 14:16:06
|
<JSherman>
|
Lucas_WMDE: on it
|
|
2026-01-15 14:16:10
|
<logmsgbot>
|
jclark@cumin1003 reimage (PID 1651082) is awaiting input
|
|
2026-01-15 14:16:10
|
<JSherman>
|
sounds good
|
|
2026-01-15 14:16:15
|
<Lucas_WMDE>
|
ok!
|
|
2026-01-15 14:16:19
|
<logmsgbot>
|
!log jmm@deploy2002 helmfile [codfw] START helmfile.d/services/proton: apply
|
|
2026-01-15 14:17:20
|
<Lucas_WMDE>
|
(my spiderpig finished, you’re good to go)
|
|
2026-01-15 14:17:24
|
<wikibugs>
|
'SRE, ''SRE-Access-Requests: Requesting access to analytics-privatedata-users for johannesrichterwmde - https://phabricator.wikimedia.org/T414678#11524963 (''Johannes_Richter_WMDE) By the way I noticed {T358578} – is that still common practice @Dzahn? (I'm not in the #wmf-nda group despite signing the NDA in...'
|
|
2026-01-15 14:17:31
|
<logmsgbot>
|
!log jmm@deploy2002 helmfile [codfw] DONE helmfile.d/services/proton: apply
|
|
2026-01-15 14:18:00
|
<wikibugs>
|
('CR) ''TrainBranchBot: [C:''+2] "Approved by jsn@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1226862 (https://phabricator.wikimedia.org/T403982) (owner: ''Jsn.sherman)'
|
|
2026-01-15 14:18:27
|
<logmsgbot>
|
!log jmm@deploy2002 helmfile [eqiad] START helmfile.d/services/proton: apply
|
|
2026-01-15 14:18:37
|
<wikibugs>
|
'SRE, ''SRE-Access-Requests: Requesting access to analytics-privatedata-users for johannesrichterwmde - https://phabricator.wikimedia.org/T414678#11524970 (''Tobi_WMDE_SW) @Johannes_Richter_WMDE is part of the WMDE TechWish team, and I endorse this request.'
|
|
2026-01-15 14:18:43
|
<wikibugs>
|
('Merged) ''jenkins-bot: Deploy PersonalDashboard to testwiki [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1226862 (https://phabricator.wikimedia.org/T403982) (owner: ''Jsn.sherman)'
|
|
2026-01-15 14:19:04
|
<logmsgbot>
|
!log cmooney@cumin1003 START - Cookbook sre.dns.netbox
|
|
2026-01-15 14:19:08
|
<Lucas_WMDE>
|
stephanebisson: do you want to do your deploy afterwards? you could probably start the gate-and-submit builds already
|
|
2026-01-15 14:19:14
|
<logmsgbot>
|
!log jsn@deploy2002 Started scap sync-world: Backport for [[gerrit:1226862|Deploy PersonalDashboard to testwiki (T403982)]]
|
|
2026-01-15 14:19:18
|
<stashbot>
|
T403982: Create and deploy Extension:PersonalDashboard - https://phabricator.wikimedia.org/T403982
|
|
2026-01-15 14:19:44
|
<logmsgbot>
|
!log jmm@deploy2002 helmfile [eqiad] DONE helmfile.d/services/proton: apply
|
|
2026-01-15 14:20:08
|
<stephanebisson>
|
Lucas_WMDE: yes I'll do them, getting started soon
|
|
2026-01-15 14:20:37
|
<Lucas_WMDE>
|
ok!
|
|
2026-01-15 14:21:24
|
<logmsgbot>
|
!log jsn@deploy2002 jsn: Backport for [[gerrit:1226862|Deploy PersonalDashboard to testwiki (T403982)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
|
|
2026-01-15 14:22:06
|
<wikibugs>
|
('CR) ''CDanis: [C:''+2] tcpproxy: Accept connections from the internet [puppet] - ''https://gerrit.wikimedia.org/r/1227294 (owner: ''Vgutierrez)'
|
|
2026-01-15 14:22:22
|
<vgutierrez>
|
that was a highly motivated review lol
|
|
2026-01-15 14:22:38
|
<logmsgbot>
|
!log jsn@deploy2002 jsn: Continuing with sync
|
|
2026-01-15 14:22:43
|
<logmsgbot>
|
!log cmooney@cumin1003 START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update reverse dns entries for arelion link ips - cmooney@cumin1003"
|
|
2026-01-15 14:23:27
|
<logmsgbot>
|
!log cmooney@cumin1003 END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update reverse dns entries for arelion link ips - cmooney@cumin1003"
|
|
2026-01-15 14:23:27
|
<logmsgbot>
|
!log cmooney@cumin1003 END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
|
|
2026-01-15 14:25:01
|
<wikibugs>
|
('CR) ''Muehlenhoff: [C:''+1] "Looks good" [alerts] - ''https://gerrit.wikimedia.org/r/1227335 (https://phabricator.wikimedia.org/T414669) (owner: ''Filippo Giunchedi)'
|
|
2026-01-15 14:25:20
|
<logmsgbot>
|
!log btullis@cumin1003 START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts an-worker1159.eqiad.wmnet
|
|
2026-01-15 14:25:31
|
<logmsgbot>
|
!log btullis@cumin1003 END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts an-worker1159.eqiad.wmnet
|
|
2026-01-15 14:26:26
|
<wikibugs>
|
'SRE, ''Infrastructure-Foundations, ''netops, ''Data-Platform-SRE (2026.01.05 - 2026.01.23): Socket leaking on some dse-k8s row C & D hosts - https://phabricator.wikimedia.org/T414460#11525006 (''CDanis) >>! In T414460#11524635, @BTullis wrote: > My assumption is that this is more likely related to the ce...'
|
|
2026-01-15 14:27:46
|
<stephanebisson>
|
Lucas_WMDE can I just +2 the patches manually and start the real deployment later?
|
|
2026-01-15 14:28:12
|
<Lucas_WMDE>
|
stephanebisson: yes
|
|
2026-01-15 14:28:19
|
<wikibugs>
|
('CR) ''Sbisson: [C:''+2] CX3 Build 1.0.0+20260114 [extensions/ContentTranslation] (wmf/1.46.0-wmf.11) - ''https://gerrit.wikimedia.org/r/1226976 (https://phabricator.wikimedia.org/T413646) (owner: ''Sbisson)'
|
|
2026-01-15 14:28:23
|
<Lucas_WMDE>
|
as long as nobody else is planning to deploy, because then they would pull in your changes ww
|
|
2026-01-15 14:28:24
|
<Lucas_WMDE>
|
* ^^
|
|
2026-01-15 14:28:33
|
<wikibugs>
|
('PS8) ''CDanis: gerrit/Liberica: expand to drmrs [puppet] - ''https://gerrit.wikimedia.org/r/1215693 (https://phabricator.wikimedia.org/T411895)'
|
|
2026-01-15 14:28:42
|
<stephanebisson>
|
I think I'm next in line
|
|
2026-01-15 14:28:45
|
<Lucas_WMDE>
|
yeah
|
|
2026-01-15 14:28:50
|
<wikibugs>
|
('CR) ''Sbisson: [C:''+2] Fallback to source title if target title is not provided by cxserver [extensions/ContentTranslation] (wmf/1.46.0-wmf.11) - ''https://gerrit.wikimedia.org/r/1226977 (https://phabricator.wikimedia.org/T414558) (owner: ''Sbisson)'
|
|
2026-01-15 14:28:50
|
<JSherman>
|
we're about 3/4 through syncing prod k8s on mine, so I think you're good to +2
|
|
2026-01-15 14:28:56
|
<logmsgbot>
|
!log jsn@deploy2002 Finished scap sync-world: Backport for [[gerrit:1226862|Deploy PersonalDashboard to testwiki (T403982)]] (duration: 09m 41s)
|
|
2026-01-15 14:29:00
|
<stashbot>
|
T403982: Create and deploy Extension:PersonalDashboard - https://phabricator.wikimedia.org/T403982
|
|
2026-01-15 14:29:01
|
<JSherman>
|
stephanebisson: over to you
|
|
2026-01-15 14:29:03
|
<JSherman>
|
finished!
|
|
2026-01-15 14:29:07
|
<stephanebisson>
|
Thanks!
|
|
2026-01-15 14:29:34
|
<wikibugs>
|
('CR) ''TrainBranchBot: [C:''+2] "Approved by sbisson@deploy2002 using scap backport" [extensions/ContentTranslation] (wmf/1.46.0-wmf.11) - ''https://gerrit.wikimedia.org/r/1226976 (https://phabricator.wikimedia.org/T413646) (owner: ''Sbisson)'
|
|
2026-01-15 14:29:34
|
<wikibugs>
|
('CR) ''TrainBranchBot: [C:''+2] "Approved by sbisson@deploy2002 using scap backport" [extensions/ContentTranslation] (wmf/1.46.0-wmf.11) - ''https://gerrit.wikimedia.org/r/1226977 (https://phabricator.wikimedia.org/T414558) (owner: ''Sbisson)'
|
|
2026-01-15 14:30:18
|
<Lucas_WMDE>
|
depending on how long that gate-and-submit will take we could’ve tried to squeeze in phuedx in between
|
|
2026-01-15 14:30:22
|
<wikibugs>
|
('CR) ''CDanis: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1215693 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 14:30:27
|
<Lucas_WMDE>
|
but I don’t think it’s necessary, there should be enough time afterwards
|
|
2026-01-15 14:32:06
|
<A_smart_kitten>
|
Lucas_WMDE: fwiw my gut instinct is that the movestable permissions thing might be something to do with FlaggedRevs
|
|
2026-01-15 14:33:11
|
<Lucas_WMDE>
|
ah, our favorite codebase?
|
|
2026-01-15 14:33:20
|
<A_smart_kitten>
|
just the one :D
|
|
2026-01-15 14:33:34
|
<Lucas_WMDE>
|
when in doubt, blame FlaggedRevs
|
|
2026-01-15 14:33:49
|
<Seawolf35>
|
Beyond my pay grade
|
|
2026-01-15 14:33:52
|
<wikibugs>
|
('PS9) ''CDanis: gerrit/Liberica: expand to drmrs [puppet] - ''https://gerrit.wikimedia.org/r/1215693 (https://phabricator.wikimedia.org/T411895)'
|
|
2026-01-15 14:33:54
|
<A_smart_kitten>
|
maybe some subtasks of T225144 are similar
|
|
2026-01-15 14:33:54
|
<stashbot>
|
T225144: Flagged Revs configuration may be broken - https://phabricator.wikimedia.org/T225144
|
|
2026-01-15 14:34:01
|
<wikibugs>
|
('CR) ''CDanis: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1215693 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 14:34:02
|
<Lucas_WMDE>
|
(I found some other Phabricator tasks that sounded related, though not quite the same)
|
|
2026-01-15 14:34:21
|
<Lucas_WMDE>
|
T275370
|
|
2026-01-15 14:34:22
|
<stashbot>
|
T275370: Unable to move pages despite being autoconfirmed on wikis with FlaggedRevs - https://phabricator.wikimedia.org/T275370
|
|
2026-01-15 14:35:15
|
<A_smart_kitten>
|
my gut instinct (untested) would be to move the FlaggedRevs user group-related config that isn't currently working out of core-Permissions.php & add it to the MediaWikiServices hook in flaggedrevs.php
|
|
2026-01-15 14:37:41
|
<wikibugs>
|
'SRE, ''SRE-Access-Requests: Requesting access to SRE/production access for Kim.pham (kimpham in phab) - https://phabricator.wikimedia.org/T414671#11525053 (''WMDE-leszek) I approve this request on WMDE's end, and take the responsibility for the backlog instead of backport brainfart. @kimpham should not have...'
|
|
2026-01-15 14:38:06
|
<jinxer-wm>
|
FIRING: CoreRouterInterfaceDown: Core router interface down - pfw1-codfw:reth1 (Subnet frack-fundraising-codfw in F5) - https://wikitech.wikimedia.org/wiki/Network_monitoring#Router_interface_down - https://grafana.wikimedia.org/d/fb403d62-5f03-434a-9dff-bd02b9fff504/network-device-overview?var-instance=pfw1-codfw:9804 - https://alerts.wikimedia.org/?q=alertname%3DCoreRouterInterfaceDown
|
|
2026-01-15 14:38:14
|
<Lucas_WMDE>
|
A_smart_kitten: geeeez
|
|
2026-01-15 14:38:18
|
<wikibugs>
|
('Merged) ''jenkins-bot: CX3 Build 1.0.0+20260114 [extensions/ContentTranslation] (wmf/1.46.0-wmf.11) - ''https://gerrit.wikimedia.org/r/1226976 (https://phabricator.wikimedia.org/T413646) (owner: ''Sbisson)'
|
|
2026-01-15 14:38:27
|
<wikibugs>
|
('PS1) ''Jsn.sherman: Deploy PersonalDashboard to testwiki [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227346 (https://phabricator.wikimedia.org/T403982)'
|
|
2026-01-15 14:38:30
|
<Lucas_WMDE>
|
I hadn’t seen that hook before. that’s… something
|
|
2026-01-15 14:38:50
|
<wikibugs>
|
('Merged) ''jenkins-bot: Fallback to source title if target title is not provided by cxserver [extensions/ContentTranslation] (wmf/1.46.0-wmf.11) - ''https://gerrit.wikimedia.org/r/1226977 (https://phabricator.wikimedia.org/T414558) (owner: ''Sbisson)'
|
|
2026-01-15 14:39:00
|
<Lucas_WMDE>
|
yeah there’s some stuff like $wgGroupPermissions['editor']['autoreview'] = false; there
|
|
2026-01-15 14:39:03
|
<wikibugs>
|
('CR) ''Dzahn: [C:''+1] tcpproxy: Accept connections from the internet [puppet] - ''https://gerrit.wikimedia.org/r/1227294 (owner: ''Vgutierrez)'
|
|
2026-01-15 14:39:11
|
<wikibugs>
|
'SRE, ''SRE-Access-Requests: Requesting access to SRE/production access for Kim.pham (kimpham in phab) - https://phabricator.wikimedia.org/T414671#11525059 (''WMDE-leszek)'
|
|
2026-01-15 14:39:24
|
<logmsgbot>
|
!log sbisson@deploy2002 Started scap sync-world: Backport for [[gerrit:1226976|CX3 Build 1.0.0+20260114 (T413646)]], [[gerrit:1226977|Fallback to source title if target title is not provided by cxserver (T414558)]]
|
|
2026-01-15 14:39:28
|
<Lucas_WMDE>
|
I’ll go make a task
|
|
2026-01-15 14:39:30
|
<stashbot>
|
T413646: Content Translation: cannot select an existing target article; section translation is published to a redirect instead of the main article (target language: Russian). - https://phabricator.wikimedia.org/T413646
|
|
2026-01-15 14:39:31
|
<stashbot>
|
T414558: Wikipedia Content Translation Tool displays blank page and never loads - https://phabricator.wikimedia.org/T414558
|
|
2026-01-15 14:39:39
|
<wikibugs>
|
('PS10) ''CDanis: gerrit/Liberica: expand to drmrs [puppet] - ''https://gerrit.wikimedia.org/r/1215693 (https://phabricator.wikimedia.org/T411895)'
|
|
2026-01-15 14:39:42
|
<wikibugs>
|
('CR) ''CDanis: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1215693 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 14:40:15
|
<wikibugs>
|
('CR) ''Vgutierrez: [C:''-1] "hiera files target eqsin, not drmrs" [puppet] - ''https://gerrit.wikimedia.org/r/1215693 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 14:41:32
|
<logmsgbot>
|
!log sbisson@deploy2002 sbisson: Backport for [[gerrit:1226976|CX3 Build 1.0.0+20260114 (T413646)]], [[gerrit:1226977|Fallback to source title if target title is not provided by cxserver (T414558)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
|
|
2026-01-15 14:42:27
|
<wikibugs>
|
('CR) ''Muehlenhoff: [C:''+2] hcaptcha proxy: Enable Bird 2.18 for all servers [puppet] - ''https://gerrit.wikimedia.org/r/1224709 (https://phabricator.wikimedia.org/T413740) (owner: ''Muehlenhoff)'
|
|
2026-01-15 14:43:19
|
<logmsgbot>
|
!log sbisson@deploy2002 sbisson: Continuing with sync
|
|
2026-01-15 14:43:35
|
<JSherman>
|
Lucas_WMDE: just noting that I forgot to add the extension load to common settings to enable personaldashboard on testwiki, making my patch a noop. I just kept it moving and created a new patch to complete the enablement. Will followup in another window.
|
|
2026-01-15 14:43:48
|
<wikibugs>
|
'ops-eqiad, ''DC-Ops: Power Supply - PS1 Status - issue on clouddb1024:9290 - https://phabricator.wikimedia.org/T414681 (''phaultfinder) ''NEW'
|
|
2026-01-15 14:44:08
|
<moritzm>
|
!log installing net-snmp security updates
|
|
2026-01-15 14:44:10
|
<stashbot>
|
Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
|
|
2026-01-15 14:44:31
|
<wikibugs>
|
('CR) ''Filippo Giunchedi: [C:''+2] sre: remove value from MaxConntrack summary [alerts] - ''https://gerrit.wikimedia.org/r/1227335 (https://phabricator.wikimedia.org/T414669) (owner: ''Filippo Giunchedi)'
|
|
2026-01-15 14:45:06
|
<wikibugs>
|
('PS11) ''CDanis: gerrit/Liberica: expand to drmrs [puppet] - ''https://gerrit.wikimedia.org/r/1215693 (https://phabricator.wikimedia.org/T411895)'
|
|
2026-01-15 14:45:06
|
<wikibugs>
|
('PS1) ''CDanis: gerrit/Liberica: eqsin [puppet] - ''https://gerrit.wikimedia.org/r/1227348 (https://phabricator.wikimedia.org/T411895)'
|
|
2026-01-15 14:45:22
|
<wikibugs>
|
('CR) ''CDanis: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1227348 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 14:45:24
|
<wikibugs>
|
('CR) ''CDanis: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1215693 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 14:46:22
|
<Lucas_WMDE>
|
JSherman: ack
|
|
2026-01-15 14:47:00
|
<wikibugs>
|
'SRE, ''collaboration-services, ''Patch-For-Review, ''PES1.3.3 WP25 Easter Eggs: Request: Wikipedia 25 microsite hosting - https://phabricator.wikimedia.org/T408592#11525113 (''Dzahn) I can help with another deployment tomorrow, Friday 16, but not after that until next month. Whether deployment right...'
|
|
2026-01-15 14:47:31
|
<logmsgbot>
|
!log sbisson@deploy2002 Finished scap sync-world: Backport for [[gerrit:1226976|CX3 Build 1.0.0+20260114 (T413646)]], [[gerrit:1226977|Fallback to source title if target title is not provided by cxserver (T414558)]] (duration: 08m 07s)
|
|
2026-01-15 14:47:36
|
<stashbot>
|
T413646: Content Translation: cannot select an existing target article; section translation is published to a redirect instead of the main article (target language: Russian). - https://phabricator.wikimedia.org/T413646
|
|
2026-01-15 14:47:37
|
<stashbot>
|
T414558: Wikipedia Content Translation Tool displays blank page and never loads - https://phabricator.wikimedia.org/T414558
|
|
2026-01-15 14:48:30
|
<Lucas_WMDE>
|
phuedx: over to you, do you also want to self-service?
|
|
2026-01-15 14:49:09
|
<phuedx>
|
I can self service
|
|
2026-01-15 14:49:16
|
<wikibugs>
|
('CR) ''Vgutierrez: [C:''+1] gerrit/Liberica: expand to drmrs [puppet] - ''https://gerrit.wikimedia.org/r/1215693 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 14:49:16
|
<Lucas_WMDE>
|
ok, go ahead :)
|
|
2026-01-15 14:50:06
|
<wikibugs>
|
('CR) ''TrainBranchBot: [C:''+2] "Approved by phuedx@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227004 (https://phabricator.wikimedia.org/T407806) (owner: ''Clare Ming)'
|
|
2026-01-15 14:50:56
|
<wikibugs>
|
('PS12) ''CDanis: gerrit/Liberica: expand to drmrs [puppet] - ''https://gerrit.wikimedia.org/r/1215693 (https://phabricator.wikimedia.org/T411895)'
|
|
2026-01-15 14:50:56
|
<wikibugs>
|
('PS2) ''CDanis: gerrit/Liberica: eqsin [puppet] - ''https://gerrit.wikimedia.org/r/1227348 (https://phabricator.wikimedia.org/T411895)'
|
|
2026-01-15 14:50:59
|
<wikibugs>
|
('Merged) ''jenkins-bot: Enable Test Kitchen on all prod wikis [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227004 (https://phabricator.wikimedia.org/T407806) (owner: ''Clare Ming)'
|
|
2026-01-15 14:51:00
|
<wikibugs>
|
('CR) ''CDanis: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1215693 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 14:51:06
|
<wikibugs>
|
('CR) ''CDanis: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1227348 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 14:51:30
|
<logmsgbot>
|
!log phuedx@deploy2002 Started scap sync-world: Backport for [[gerrit:1227004|Enable Test Kitchen on all prod wikis (T407806)]]
|
|
2026-01-15 14:51:34
|
<stashbot>
|
T407806: Rename Metrics Platform Extension to Test Kitchen - https://phabricator.wikimedia.org/T407806
|
|
2026-01-15 14:51:47
|
<A_smart_kitten>
|
Lucas_WMDE: aaahhhh the autoconfirmed movestable permission is *overridden* in the flaggedrevs.php MediaWikiServices hook
|
|
2026-01-15 14:51:48
|
<A_smart_kitten>
|
https://gerrit.wikimedia.org/g/operations/mediawiki-config/+/19adfae2241be7a72c651d64dd318dd57f560c59/wmf-config/flaggedrevs.php#207
|
|
2026-01-15 14:51:58
|
<logmsgbot>
|
!log cdanis@cumin1003 conftool action : set/pooled=yes; selector: cluster=tcp-proxy,service=gerrit
|
|
2026-01-15 14:53:52
|
<logmsgbot>
|
!log phuedx@deploy2002 cjming, phuedx: Backport for [[gerrit:1227004|Enable Test Kitchen on all prod wikis (T407806)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
|
|
2026-01-15 14:53:58
|
<Lucas_WMDE>
|
A_smart_kitten: created T414684
|
|
2026-01-15 14:53:58
|
<stashbot>
|
T414684: FlaggedRevs-specific group rights from core-Permissions.php get overridden by flaggedrevs.php - https://phabricator.wikimedia.org/T414684
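
(Editor's sketch, not part of the channel log: a language-neutral Python model of the override described above, where a right set in static per-wiki config is later reassigned by a hook that runs afterwards. The names `STATIC_GROUP_PERMISSIONS`, `late_hook_overriding`, and `late_hook_respecting` are hypothetical and do not correspond to anything in mediawiki-config; the second hook is only one possible way to avoid the clobbering, not the fix chosen in T414684.)

```python
from copy import deepcopy

# Hypothetical stand-in for static per-wiki settings
# (the role core-Permissions.php plays in the discussion above).
STATIC_GROUP_PERMISSIONS = {
    "autoconfirmed": {"movestable": False},
}


def late_hook_overriding(perms):
    """Models a late hook that assigns the right unconditionally."""
    perms.setdefault("autoconfirmed", {})["movestable"] = True
    return perms


def late_hook_respecting(perms):
    """Models a late hook that only supplies a default when none was set."""
    perms.setdefault("autoconfirmed", {}).setdefault("movestable", True)
    return perms


def effective(hook):
    """Apply the static config first, then the hook, and report the result."""
    perms = hook(deepcopy(STATIC_GROUP_PERMISSIONS))
    return perms["autoconfirmed"]["movestable"]


print(effective(late_hook_overriding))   # the static False is lost
print(effective(late_hook_respecting))   # the static False survives
```

In this model the last writer wins, which matches the observation that the explicit `'movestable' => false` had no visible effect.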
|
|
2026-01-15 14:54:30
|
<phuedx>
|
Looking at the test servers now
|
|
2026-01-15 14:54:52
|
<A_smart_kitten>
|
ty Lucas_WMDE!
|
|
2026-01-15 14:56:07
|
<wikibugs>
|
('CR) ''Scott French: [C:''+1] conf/etcd: Remove now obsolete cert [puppet] - ''https://gerrit.wikimedia.org/r/1227307 (https://phabricator.wikimedia.org/T352245) (owner: ''Muehlenhoff)'
|
|
2026-01-15 14:56:09
|
<wikibugs>
|
('CR) ''Scott French: [C:''+1] conf/etcd: Remove now obsolete cert [puppet] - ''https://gerrit.wikimedia.org/r/1227309 (https://phabricator.wikimedia.org/T352245) (owner: ''Muehlenhoff)'
|
|
2026-01-15 14:56:42
|
<phuedx>
|
Configuration is coming through OK. There aren't any instruments or experiments using TestKitchen codepaths so I'm not expecting to see anything in the console
|
|
2026-01-15 14:57:28
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db1190 (T413525)', diff saved to https://phabricator.wikimedia.org/P87549 and previous config saved to /var/cache/conftool/dbconfig/20260115-145727-marostegui.json
|
|
2026-01-15 14:57:31
|
<stashbot>
|
T413525: Add il_target_id to imagelinks table in wmf production - https://phabricator.wikimedia.org/T413525
|
|
2026-01-15 14:57:50
|
<wikibugs>
|
('CR) ''Scott French: [C:''+1] "Thanks, Moritz!" [puppet] - ''https://gerrit.wikimedia.org/r/978615 (owner: ''Muehlenhoff)'
|
|
2026-01-15 14:59:08
|
<phuedx>
|
The SDKs are available as expected
|
|
2026-01-15 14:59:57
|
<Lucas_WMDE>
|
I’m going afk, I hope everything goes fine with the rest of the window
|
|
2026-01-15 15:00:25
|
<icinga-wm>
|
RECOVERY - Host an-worker1159 is UP: PING OK - Packet loss = 0%, RTA = 0.26 ms
|
|
2026-01-15 15:00:47
|
<icinga-wm>
|
PROBLEM - SSH on an-worker1159 is CRITICAL: connect to address 10.64.153.4 and port 22: Connection refused https://wikitech.wikimedia.org/wiki/SSH/monitoring
|
|
2026-01-15 15:01:06
|
<phuedx>
|
Continuing with sync
|
|
2026-01-15 15:01:15
|
<logmsgbot>
|
!log phuedx@deploy2002 cjming, phuedx: Continuing with sync
|
|
2026-01-15 15:02:15
|
<wikibugs>
|
('CR) ''CDanis: [V:''+1 C:''+2] "https://puppet-compiler.wmflabs.org/output/1215693/5634/" [puppet] - ''https://gerrit.wikimedia.org/r/1215693 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 15:05:16
|
<logmsgbot>
|
!log phuedx@deploy2002 Finished scap sync-world: Backport for [[gerrit:1227004|Enable Test Kitchen on all prod wikis (T407806)]] (duration: 13m 46s)
|
|
2026-01-15 15:05:18
|
<logmsgbot>
|
!log cdanis@cumin1003 START - Cookbook sre.loadbalancer.admin config_reloading P{lvs6003.drmrs.wmnet} and A:liberica
|
|
2026-01-15 15:05:20
|
<stashbot>
|
T407806: Rename Metrics Platform Extension to Test Kitchen - https://phabricator.wikimedia.org/T407806
|
|
2026-01-15 15:05:37
|
<logmsgbot>
|
!log cdanis@cumin1003 END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P{lvs6003.drmrs.wmnet} and A:liberica
|
|
2026-01-15 15:06:03
|
<logmsgbot>
|
!log cdanis@cumin1003 START - Cookbook sre.loadbalancer.admin config_reloading P{lvs6001.drmrs.wmnet} and A:liberica
|
|
2026-01-15 15:06:23
|
<logmsgbot>
|
!log cdanis@cumin1003 END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P{lvs6001.drmrs.wmnet} and A:liberica
|
|
2026-01-15 15:06:30
|
<jinxer-wm>
|
FIRING: LibericaStaleConfig: Liberica instance lvs6003 is running a stale configuration - https://wikitech.wikimedia.org/wiki/Liberica#LibericaStaleConfig - https://grafana.wikimedia.org/d/fa4de97a-7114-48c7-a91a-f56089ef554f/liberica?orgId=1&viewPanel=10&var-site=drmrs&var-instance=lvs6003 - https://alerts.wikimedia.org/?q=alertname%3DLibericaStaleConfig
|
|
2026-01-15 15:06:40
|
<cdanis>
|
lol
|
|
2026-01-15 15:07:15
|
<taavi>
|
hey, at least the alerting works!
|
|
2026-01-15 15:07:36
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P87551 and previous config saved to /var/cache/conftool/dbconfig/20260115-150735-marostegui.json
|
|
2026-01-15 15:09:06
|
<wikibugs>
|
('CR) ''Kevin Bazira: Add vLLM image in ML namespace (''1 comment) [docker-images/production-images] - ''https://gerrit.wikimedia.org/r/1146891 (https://phabricator.wikimedia.org/T385173) (owner: ''Kevin Bazira)'
|
|
2026-01-15 15:09:11
|
<jinxer-wm>
|
FIRING: [2x] JobUnavailable: Reduced availability for job sidekiq in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
|
|
2026-01-15 15:11:30
|
<jinxer-wm>
|
RESOLVED: LibericaStaleConfig: Liberica instance lvs6003 is running a stale configuration - https://wikitech.wikimedia.org/wiki/Liberica#LibericaStaleConfig - https://grafana.wikimedia.org/d/fa4de97a-7114-48c7-a91a-f56089ef554f/liberica?orgId=1&viewPanel=10&var-site=drmrs&var-instance=lvs6003 - https://alerts.wikimedia.org/?q=alertname%3DLibericaStaleConfig
|
|
2026-01-15 15:11:33
|
<wikibugs>
|
('PS1) ''DCausse: search: pull wme secrets out of the connections array [deployment-charts] - ''https://gerrit.wikimedia.org/r/1227351'
|
|
2026-01-15 15:12:03
|
<jinxer-wm>
|
FIRING: MediaWikiEditFailures: Elevated MediaWiki edit failures (session_loss) for cluster - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook - https://grafana.wikimedia.org/d/000000208/edit-count?orgId=1&viewPanel=13 - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiEditFailures
|
|
2026-01-15 15:13:14
|
<wikibugs>
|
('CR) ''CI reject: [V:''-1] search: pull wme secrets out of the connections array [deployment-charts] - ''https://gerrit.wikimedia.org/r/1227351 (owner: ''DCausse)'
|
|
2026-01-15 15:13:24
|
<wikibugs>
|
('CR) ''Cwhite: [C:''+1] Remove profile::puppet::agent::force_puppet7 from search roles [puppet] - ''https://gerrit.wikimedia.org/r/1227270 (https://phabricator.wikimedia.org/T365798) (owner: ''Muehlenhoff)'
|
|
2026-01-15 15:13:59
|
<wikibugs>
|
('CR) ''Cwhite: [C:''+1] Remove profile::puppet::agent::force_puppet7 from Data Platform roles [puppet] - ''https://gerrit.wikimedia.org/r/1227313 (https://phabricator.wikimedia.org/T365798) (owner: ''Muehlenhoff)'
|
|
2026-01-15 15:14:56
|
<wikibugs>
|
('CR) ''Elukey: [C:''+1] "LGTM for a test" [docker-images/production-images] - ''https://gerrit.wikimedia.org/r/1146891 (https://phabricator.wikimedia.org/T385173) (owner: ''Kevin Bazira)'
|
|
2026-01-15 15:15:01
|
<wikibugs>
|
('PS1) ''Ayounsi: Routed ganeti: move v6_prefixes to Hiera [puppet] - ''https://gerrit.wikimedia.org/r/1227352 (https://phabricator.wikimedia.org/T410314)'
|
|
2026-01-15 15:16:53
|
<icinga-wm>
|
PROBLEM - Host an-worker1159 is DOWN: PING CRITICAL - Packet loss = 100%
|
|
2026-01-15 15:17:06
|
<wikibugs>
|
('CR) ''Muehlenhoff: [C:''+2] Remove profile::puppet::agent::force_puppet7 from search roles [puppet] - ''https://gerrit.wikimedia.org/r/1227270 (https://phabricator.wikimedia.org/T365798) (owner: ''Muehlenhoff)'
|
|
2026-01-15 15:17:40
|
<wikibugs>
|
('PS2) ''Ayounsi: Routed ganeti: move v6_prefixes to Hiera [puppet] - ''https://gerrit.wikimedia.org/r/1227352 (https://phabricator.wikimedia.org/T410314)'
|
|
2026-01-15 15:17:44
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P87552 and previous config saved to /var/cache/conftool/dbconfig/20260115-151744-marostegui.json
|
|
2026-01-15 15:17:50
|
<wikibugs>
|
('CR) ''Ayounsi: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1227352 (https://phabricator.wikimedia.org/T410314) (owner: ''Ayounsi)'
|
|
2026-01-15 15:21:50
|
<wikibugs>
|
('CR) ''Muehlenhoff: [C:''+2] Remove profile::puppet::agent::force_puppet7 from Data Platform roles [puppet] - ''https://gerrit.wikimedia.org/r/1227313 (https://phabricator.wikimedia.org/T365798) (owner: ''Muehlenhoff)'
|
|
2026-01-15 15:21:55
|
<icinga-wm>
|
RECOVERY - Host an-worker1159 is UP: PING OK - Packet loss = 0%, RTA = 0.30 ms
|
|
2026-01-15 15:22:27
|
<wikibugs>
|
('PS2) ''DCausse: search: pull wme secrets out of the connections array [deployment-charts] - ''https://gerrit.wikimedia.org/r/1227351'
|
|
2026-01-15 15:23:01
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db1262 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P87553 and previous config saved to /var/cache/conftool/dbconfig/20260115-152301-marostegui.json
|
|
2026-01-15 15:23:07
|
<stashbot>
|
T411163: Drop ar_sha1 from archive table in wmf production - https://phabricator.wikimedia.org/T411163
|
|
2026-01-15 15:23:07
|
<stashbot>
|
T411164: Drop rev_sha1 from revision table in wmf production - https://phabricator.wikimedia.org/T411164
|
|
2026-01-15 15:24:10
|
<wikibugs>
|
('CR) ''CI reject: [V:''-1] search: pull wme secrets out of the connections array [deployment-charts] - ''https://gerrit.wikimedia.org/r/1227351 (owner: ''DCausse)'
|
|
2026-01-15 15:25:59
|
<wikibugs>
|
('PS3) ''DCausse: search: pull wme secrets out of the connections array [deployment-charts] - ''https://gerrit.wikimedia.org/r/1227351'
|
|
2026-01-15 15:27:53
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db1190 (T413525)', diff saved to https://phabricator.wikimedia.org/P87554 and previous config saved to /var/cache/conftool/dbconfig/20260115-152752-marostegui.json
|
|
2026-01-15 15:27:57
|
<stashbot>
|
T413525: Add il_target_id to imagelinks table in wmf production - https://phabricator.wikimedia.org/T413525
|
|
2026-01-15 15:28:09
|
<logmsgbot>
|
!log marostegui@cumin1003 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1199.eqiad.wmnet with reason: Maintenance
|
|
2026-01-15 15:28:16
|
<wikibugs>
|
('PS1) ''CDanis: Liberica/gerrit: 🌍‼️ 🎊 [puppet] - ''https://gerrit.wikimedia.org/r/1227356 (https://phabricator.wikimedia.org/T411895)'
|
|
2026-01-15 15:28:17
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Depooling db1199 (T413525)', diff saved to https://phabricator.wikimedia.org/P87555 and previous config saved to /var/cache/conftool/dbconfig/20260115-152817-marostegui.json
|
|
2026-01-15 15:28:19
|
<icinga-wm>
|
PROBLEM - Host an-worker1159 is DOWN: PING CRITICAL - Packet loss = 100%
|
|
2026-01-15 15:28:53
|
<wikibugs>
|
('CR) ''CDanis: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1227356 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 15:29:49
|
<icinga-wm>
|
RECOVERY - SSH on an-worker1159 is OK: SSH OK - OpenSSH_8.4p1 Debian-5+deb11u5 (protocol 2.0) https://wikitech.wikimedia.org/wiki/SSH/monitoring
|
|
2026-01-15 15:29:51
|
<icinga-wm>
|
RECOVERY - Host an-worker1159 is UP: PING OK - Packet loss = 0%, RTA = 0.26 ms
|
|
2026-01-15 15:30:05
|
<jouncebot>
|
Deploy window Test Kitchen Experiment Deployment Window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T1530)
|
|
2026-01-15 15:32:51
|
<wikibugs>
|
('CR) ''Vgutierrez: [C:''+1] Liberica/gerrit: 🌍‼️ 🎊 [puppet] - ''https://gerrit.wikimedia.org/r/1227356 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 15:33:10
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P87556 and previous config saved to /var/cache/conftool/dbconfig/20260115-153309-marostegui.json
|
|
2026-01-15 15:33:49
|
<logmsgbot>
|
!log cmooney@cumin1003 START - Cookbook sre.dns.netbox
|
|
2026-01-15 15:34:11
|
<jinxer-wm>
|
RESOLVED: [2x] JobUnavailable: Reduced availability for job sidekiq in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
|
|
2026-01-15 15:35:31
|
<wikibugs>
|
('PS4) ''Ayounsi: Routed ganeti: move v6_prefixes to Hiera [puppet] - ''https://gerrit.wikimedia.org/r/1227352 (https://phabricator.wikimedia.org/T410314)'
|
|
2026-01-15 15:36:05
|
<wikibugs>
|
('CR) ''CDanis: [C:''+2] Liberica/gerrit: 🌍‼️ 🎊 [puppet] - ''https://gerrit.wikimedia.org/r/1227356 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 15:36:24
|
<wikibugs>
|
('CR) ''CDanis: [C:''+2] gerrit/Liberica: eqsin [puppet] - ''https://gerrit.wikimedia.org/r/1227348 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 15:39:42
|
<logmsgbot>
|
cmooney@cumin1003 netbox (PID 1669792) is awaiting input
|
|
2026-01-15 15:41:50
|
<logmsgbot>
|
!log cdanis@cumin1003 START - Cookbook sre.loadbalancer.admin config_reloading P{lvs4*} and A:liberica
|
|
2026-01-15 15:42:30
|
<jinxer-wm>
|
FIRING: [6x] LibericaStaleConfig: Liberica instance lvs3008 is running a stale configuration - https://wikitech.wikimedia.org/wiki/Liberica#LibericaStaleConfig - https://alerts.wikimedia.org/?q=alertname%3DLibericaStaleConfig
|
|
2026-01-15 15:43:12
|
<wikibugs>
|
('PS1) ''Sbisson: Fallback to source title if target title is not provided by cxserver [extensions/ContentTranslation] (wmf/1.46.0-wmf.10) - ''https://gerrit.wikimedia.org/r/1227361 (https://phabricator.wikimedia.org/T414558)'
|
|
2026-01-15 15:43:14
|
<dancy>
|
jouncebot nowandnext
|
|
2026-01-15 15:43:14
|
<jouncebot>
|
For the next 0 hour(s) and 16 minute(s): Test Kitchen Experiment Deployment Window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T1530)
|
|
2026-01-15 15:43:14
|
<jouncebot>
|
In 1 hour(s) and 16 minute(s): Puppet request window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T1700)
|
|
2026-01-15 15:43:18
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P87557 and previous config saved to /var/cache/conftool/dbconfig/20260115-154317-marostegui.json
|
|
2026-01-15 15:43:34
|
<logmsgbot>
|
!log cdanis@cumin1003 END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P{lvs4*} and A:liberica
|
|
2026-01-15 15:43:36
|
<logmsgbot>
|
!log dancy@deploy2002 Installing scap version "4.232.0" for 2 host(s)
|
|
2026-01-15 15:43:51
|
<wikibugs>
|
('Abandoned) ''Sbisson: Fallback to source title if target title is not provided by cxserver [extensions/ContentTranslation] (wmf/1.46.0-wmf.10) - ''https://gerrit.wikimedia.org/r/1227361 (https://phabricator.wikimedia.org/T414558) (owner: ''Sbisson)'
|
|
2026-01-15 15:43:51
|
<logmsgbot>
|
!log cdanis@cumin1003 START - Cookbook sre.loadbalancer.admin config_reloading P{lvs5*} and A:liberica
|
|
2026-01-15 15:44:04
|
<vgutierrez>
|
cdanis: poor high-traffic2 lvs reloading config for a NOOP ;P
|
|
2026-01-15 15:44:34
|
<cdanis>
|
I love all my liberica children
|
|
2026-01-15 15:45:27
|
<logmsgbot>
|
!log dancy@deploy2002 Installation of scap version "4.232.0" completed for 2 hosts
|
|
2026-01-15 15:45:46
|
<logmsgbot>
|
!log cdanis@cumin1003 START - Cookbook sre.loadbalancer.admin config_reloading P{lvs3*} and A:liberica
|
|
2026-01-15 15:45:54
|
<logmsgbot>
|
!log cdanis@cumin1003 END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P{lvs5*} and A:liberica
|
|
2026-01-15 15:47:30
|
<jinxer-wm>
|
FIRING: [6x] LibericaStaleConfig: Liberica instance lvs3008 is running a stale configuration - https://wikitech.wikimedia.org/wiki/Liberica#LibericaStaleConfig - https://alerts.wikimedia.org/?q=alertname%3DLibericaStaleConfig
|
|
2026-01-15 15:47:41
|
<logmsgbot>
|
!log cdanis@cumin1003 END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P{lvs3*} and A:liberica
|
|
2026-01-15 15:47:53
|
<vgutierrez>
|
some timing issue :)
|
|
2026-01-15 15:48:16
|
<wikibugs>
|
('CR) ''Bking: [C:''+2] search: pull wme secrets out of the connections array [deployment-charts] - ''https://gerrit.wikimedia.org/r/1227351 (owner: ''DCausse)'
|
|
2026-01-15 15:50:04
|
<wikibugs>
|
('Merged) ''jenkins-bot: search: pull wme secrets out of the connections array [deployment-charts] - ''https://gerrit.wikimedia.org/r/1227351 (owner: ''DCausse)'
|
|
2026-01-15 15:51:10
|
<wikibugs>
|
('PS1) ''CDanis: LVS/gerrit: eqiad [puppet] - ''https://gerrit.wikimedia.org/r/1227363 (https://phabricator.wikimedia.org/T411895)'
|
|
2026-01-15 15:51:32
|
<wikibugs>
|
('CR) ''CDanis: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1227363 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 15:52:00
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db2210 (T413525)', diff saved to https://phabricator.wikimedia.org/P87558 and previous config saved to /var/cache/conftool/dbconfig/20260115-155159-marostegui.json
|
|
2026-01-15 15:52:04
|
<stashbot>
|
T413525: Add il_target_id to imagelinks table in wmf production - https://phabricator.wikimedia.org/T413525
|
|
2026-01-15 15:52:20
|
<wikibugs>
|
('PS1) ''Trueg: blazegraph: alert on ratio of failed queries increase [alerts] - ''https://gerrit.wikimedia.org/r/1227364 (https://phabricator.wikimedia.org/T414306)'
|
|
2026-01-15 15:52:30
|
<jinxer-wm>
|
RESOLVED: [6x] LibericaStaleConfig: Liberica instance lvs3008 is running a stale configuration - https://wikitech.wikimedia.org/wiki/Liberica#LibericaStaleConfig - https://alerts.wikimedia.org/?q=alertname%3DLibericaStaleConfig
|
|
2026-01-15 15:53:27
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db1262 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P87560 and previous config saved to /var/cache/conftool/dbconfig/20260115-155326-marostegui.json
|
|
2026-01-15 15:53:33
|
<stashbot>
|
T411163: Drop ar_sha1 from archive table in wmf production - https://phabricator.wikimedia.org/T411163
|
|
2026-01-15 15:53:34
|
<stashbot>
|
T411164: Drop rev_sha1 from revision table in wmf production - https://phabricator.wikimedia.org/T411164
|
|
2026-01-15 15:53:43
|
<logmsgbot>
|
!log marostegui@cumin1003 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1263.eqiad.wmnet with reason: Maintenance
|
|
2026-01-15 15:53:52
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Depooling db1263 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P87561 and previous config saved to /var/cache/conftool/dbconfig/20260115-155351-marostegui.json
|
|
2026-01-15 15:57:23
|
<wikibugs>
|
('CR) ''CDanis: [V:''+1] "https://puppet-compiler.wmflabs.org/output/1227363/5639/" [puppet] - ''https://gerrit.wikimedia.org/r/1227363 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 16:01:50
|
<wikibugs>
|
'SRE, ''Release-Engineering-Team, ''Scap, ''serviceops, ''Datacenter-Switchover: Add scap lock/unlock steps to sre.switchdc.mediawiki cookbook - https://phabricator.wikimedia.org/T330996#11525496 (''dancy)'
|
|
2026-01-15 16:02:08
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P87562 and previous config saved to /var/cache/conftool/dbconfig/20260115-160208-marostegui.json
|
|
2026-01-15 16:03:20
|
<wikibugs>
|
'SRE, ''Release-Engineering-Team, ''Scap, ''serviceops, ''Datacenter-Switchover: Add scap lock/unlock steps to sre.switchdc.mediawiki cookbook - https://phabricator.wikimedia.org/T330996#11525507 (''dancy) @Blake I've installed a new release of scap on the deploy servers. You can now use `scap lock --a...'
|
|
2026-01-15 16:03:49
|
<wikibugs>
|
('CR) ''Vgutierrez: [C:''+1] LVS/gerrit: eqiad [puppet] - ''https://gerrit.wikimedia.org/r/1227363 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 16:04:07
|
<wikibugs>
|
('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Thursday, January 15 UTC late backport window](https://wikitech.wikimedia.org/wiki/Deployments#deploycal-" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227346 (https://phabricator.wikimedia.org/T403982) (owner: ''Jsn.sherman)'
|
|
2026-01-15 16:06:15
|
<logmsgbot>
|
!log dcausse@deploy2002 helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
|
|
2026-01-15 16:07:37
|
<icinga-wm>
|
PROBLEM - Host titan1002 is DOWN: PING CRITICAL - Packet loss = 100%
|
|
2026-01-15 16:07:42
|
<logmsgbot>
|
!log dcausse@deploy2002 helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
|
|
2026-01-15 16:09:11
|
<jinxer-wm>
|
FIRING: [2x] ProbeDown: Service titan1002:443 has failed probes (http_thanos_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#titan1002:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
|
|
2026-01-15 16:11:31
|
<icinga-wm>
|
RECOVERY - Host titan1002 is UP: PING OK - Packet loss = 0%, RTA = 10.77 ms
|
|
2026-01-15 16:12:03
|
<jinxer-wm>
|
RESOLVED: MediaWikiEditFailures: Elevated MediaWiki edit failures (session_loss) for cluster - https://wikitech.wikimedia.org/wiki/Application_servers/Runbook - https://grafana.wikimedia.org/d/000000208/edit-count?orgId=1&viewPanel=13 - https://alerts.wikimedia.org/?q=alertname%3DMediaWikiEditFailures
|
|
2026-01-15 16:12:17
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P87563 and previous config saved to /var/cache/conftool/dbconfig/20260115-161216-marostegui.json
|
|
2026-01-15 16:14:11
|
<jinxer-wm>
|
RESOLVED: [2x] ProbeDown: Service titan1002:443 has failed probes (http_thanos_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#titan1002:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
|
|
2026-01-15 16:14:11
|
<jinxer-wm>
|
FIRING: JobUnavailable: Reduced availability for job thanos-compact in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
|
|
2026-01-15 16:22:25
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db2210 (T413525)', diff saved to https://phabricator.wikimedia.org/P87564 and previous config saved to /var/cache/conftool/dbconfig/20260115-162224-marostegui.json
|
|
2026-01-15 16:22:29
|
<stashbot>
|
T413525: Add il_target_id to imagelinks table in wmf production - https://phabricator.wikimedia.org/T413525
|
|
2026-01-15 16:22:41
|
<logmsgbot>
|
!log marostegui@cumin1003 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
|
|
2026-01-15 16:22:48
|
<wikibugs>
|
('CR) ''Trueg: "To start the discussion: I think 1.0 is way too high as a threshold." [alerts] - ''https://gerrit.wikimedia.org/r/1227364 (https://phabricator.wikimedia.org/T414306) (owner: ''Trueg)'
|
|
2026-01-15 16:22:49
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Depooling db2219 (T413525)', diff saved to https://phabricator.wikimedia.org/P87565 and previous config saved to /var/cache/conftool/dbconfig/20260115-162249-marostegui.json
|
|
2026-01-15 16:24:19
|
<wikibugs>
|
('CR) ''Majavah: [C:''-1] "-1 for the prometheus_nodes issue specifically, but in general I'm not a huge fan of this as it relies on the realm global and in general " [puppet] - ''https://gerrit.wikimedia.org/r/1226944 (https://phabricator.wikimedia.org/T411089) (owner: ''JHathaway)'
|
|
2026-01-15 16:24:36
|
<wikibugs>
|
('CR) ''Ssingh: [C:''+1] "Sorry, my bad." [puppet] - ''https://gerrit.wikimedia.org/r/1225524 (https://phabricator.wikimedia.org/T365798) (owner: ''Muehlenhoff)'
|
|
2026-01-15 16:24:41
|
<wikibugs>
|
'SRE, ''SRE-Access-Requests, ''Data-Platform-SRE: Requesting deployment access for AKhatun - https://phabricator.wikimedia.org/T414347#11525622 (''AKhatun_WMF) I also don't have access to `ssh an-launcher1003.eqiad.wmnet`. I get a permission denied. Is this related? Are we waiting for another approval (fro...'
|
|
2026-01-15 16:27:24
|
<wikibugs>
|
('CR) ''Gmodena: "Nice!" [alerts] - ''https://gerrit.wikimedia.org/r/1227364 (https://phabricator.wikimedia.org/T414306) (owner: ''Trueg)'
|
|
2026-01-15 16:33:48
|
<wikibugs>
|
('PS1) ''Vgutierrez: varnish: Drop leading commas when X-E-E is present on Vary [puppet] - ''https://gerrit.wikimedia.org/r/1227373'
|
|
2026-01-15 16:34:24
|
<wikibugs>
|
('CR) ''CI reject: [V:''-1] varnish: Drop leading commas when X-E-E is present on Vary [puppet] - ''https://gerrit.wikimedia.org/r/1227373 (owner: ''Vgutierrez)'
|
|
2026-01-15 16:35:15
|
<wikibugs>
|
('PS2) ''Vgutierrez: varnish: Drop leading commas when X-E-E is present on Vary [puppet] - ''https://gerrit.wikimedia.org/r/1227373'
|
|
2026-01-15 16:42:20
|
<wikibugs>
|
('PS3) ''Vgutierrez: varnish: Drop leading commas when X-E-E is present on Vary [puppet] - ''https://gerrit.wikimedia.org/r/1227373'
|
|
2026-01-15 16:43:06
|
<wikibugs>
|
('CR) ''Vgutierrez: [V:''+1] "VTC is happy: # top TEST /wikimedia/varnish/text/55-vary-xee.vtc passed (3.024)" [puppet] - ''https://gerrit.wikimedia.org/r/1227373 (owner: ''Vgutierrez)'
|
|
2026-01-15 16:45:28
|
<hnowlan>
|
jouncebot: nowandnext
|
|
2026-01-15 16:45:28
|
<jouncebot>
|
No deployments scheduled for the next 0 hour(s) and 14 minute(s)
|
|
2026-01-15 16:45:28
|
<jouncebot>
|
In 0 hour(s) and 14 minute(s): Puppet request window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T1700)
|
|
2026-01-15 16:48:21
|
<wikibugs>
|
('CR) ''Hnowlan: [C:''+2] thumbor: reimplement SVG max size feature [deployment-charts] - ''https://gerrit.wikimedia.org/r/1226286 (https://phabricator.wikimedia.org/T411076) (owner: ''Hnowlan)'
|
|
2026-01-15 16:48:50
|
<wikibugs>
|
('CR) ''Hnowlan: thumbor: reimplement SVG max size feature [deployment-charts] - ''https://gerrit.wikimedia.org/r/1226286 (https://phabricator.wikimedia.org/T411076) (owner: ''Hnowlan)'
|
|
2026-01-15 16:51:11
|
<wikibugs>
|
('PS2) ''Hnowlan: thumbor: reimplement SVG max size feature [deployment-charts] - ''https://gerrit.wikimedia.org/r/1226286 (https://phabricator.wikimedia.org/T411076)'
|
|
2026-01-15 16:52:00
|
<wikibugs>
|
('CR) ''CDanis: [C:''+1] varnish: Drop leading commas when X-E-E is present on Vary [puppet] - ''https://gerrit.wikimedia.org/r/1227373 (owner: ''Vgutierrez)'
|
|
2026-01-15 16:52:12
|
<wikibugs>
|
('CR) ''Trueg: "thresholds are indeed way too high." [alerts] - ''https://gerrit.wikimedia.org/r/1227364 (https://phabricator.wikimedia.org/T414306) (owner: ''Trueg)'
|
|
2026-01-15 16:54:05
|
<wikibugs>
|
('CR) ''Scott French: [C:''+1] varnish: Drop leading commas when X-E-E is present on Vary [puppet] - ''https://gerrit.wikimedia.org/r/1227373 (owner: ''Vgutierrez)'
|
|
2026-01-15 16:54:14
|
<wikibugs>
|
('CR) ''Vgutierrez: [V:''+1 C:''+2] varnish: Drop leading commas when X-E-E is present on Vary [puppet] - ''https://gerrit.wikimedia.org/r/1227373 (owner: ''Vgutierrez)'
|
|
2026-01-15 16:54:20
|
<wikibugs>
|
('PS1) ''Bking: java: create openjdk-21 image [docker-images/production-images] - ''https://gerrit.wikimedia.org/r/1227376 (https://phabricator.wikimedia.org/T414695)'
|
|
2026-01-15 16:55:42
|
<logmsgbot>
|
!log cmooney@cumin1003 START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update reverse dns entries for arelion link ips - cmooney@cumin1003"
|
|
2026-01-15 16:55:49
|
<logmsgbot>
|
!log cmooney@cumin1003 END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update reverse dns entries for arelion link ips - cmooney@cumin1003"
|
|
2026-01-15 16:55:49
|
<logmsgbot>
|
!log cmooney@cumin1003 END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
|
|
2026-01-15 17:00:05
|
<jouncebot>
|
jhathaway and rzl: It is that lovely time of the day again! You are hereby commanded to deploy Puppet request window. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T1700).
|
|
2026-01-15 17:00:05
|
<jouncebot>
|
No Gerrit patches in the queue for this window AFAICS.
|
|
2026-01-15 17:02:10
|
<cdanis>
|
!log 💙cdanis@cumin1003.eqiad.wmnet ~ 🕛☕ sudo cumin 'A:lvs-eqiad' 'disable-puppet T411895'
|
|
2026-01-15 17:02:13
|
<stashbot>
|
Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
|
|
2026-01-15 17:02:14
|
<stashbot>
|
T411895: gerrit behind CDN - https://phabricator.wikimedia.org/T411895
|
|
2026-01-15 17:02:52
|
<wikibugs>
|
('CR) ''CDanis: [V:''+1 C:''+2] LVS/gerrit: eqiad [puppet] - ''https://gerrit.wikimedia.org/r/1227363 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 17:03:40
|
<jinxer-wm>
|
FIRING: [14x] SystemdUnitFailed: prometheus-node-textfile-check-nft.service on tcp-proxy1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
|
|
2026-01-15 17:04:10
|
<cdanis>
|
lol
|
|
2026-01-15 17:05:03
|
<mutante>
|
grmbl
|
|
2026-01-15 17:06:51
|
<mutante>
|
wanted to silence/ACK it but already gone?
|
|
2026-01-15 17:08:38
|
<mutante>
|
removing unit file and resetting state in a moment
|
|
2026-01-15 17:09:09
|
<cdanis>
|
!log 💙cdanis@cumin1003.eqiad.wmnet ~ 🕛☕ sudo cumin A:lvs-secondary-eqiad 'systemctl restart pybal.service'
|
|
2026-01-15 17:09:10
|
<stashbot>
|
Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
|
|
2026-01-15 17:10:35
|
<mutante>
|
!log [cumin2002:~] $ sudo cumin -b 15 'tcp-proxy*' 'rm /lib/systemd/system/prometheus-node-textfile-check-nft*'
|
|
2026-01-15 17:10:37
|
<stashbot>
|
Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
|
|
2026-01-15 17:11:07
|
<mutante>
|
!log [cumin2002:~] $ sudo cumin -b 15 'tcp-proxy*' 'systemctl reset-failed'
|
|
2026-01-15 17:11:09
|
<stashbot>
|
Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
|
|
2026-01-15 17:13:25
|
<jinxer-wm>
|
RESOLVED: [14x] SystemdUnitFailed: prometheus-node-textfile-check-nft.service on tcp-proxy1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
|
|
2026-01-15 17:17:35
|
<cdanis>
|
!log 💙cdanis@cumin1003.eqiad.wmnet ~ 🕛☕ sudo cumin A:lvs-high-traffic1-eqiad 'systemctl restart pybal.service'
|
|
2026-01-15 17:17:37
|
<stashbot>
|
Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
|
|
2026-01-15 17:22:39
|
<wikibugs>
|
('PS1) ''CDanis: LVS/gerrit: codfw [puppet] - ''https://gerrit.wikimedia.org/r/1227391 (https://phabricator.wikimedia.org/T411895)'
|
|
2026-01-15 17:24:11
|
<jinxer-wm>
|
FIRING: KubernetesCalicoDown: ml-serve2004.codfw.wmnet is not running calico-node Pod - https://wikitech.wikimedia.org/wiki/Calico#Operations - https://grafana.wikimedia.org/d/G8zPL7-Wz/?var-dc=codfw%20prometheus%2Fk8s-mlserve&var-instance=ml-serve2004.codfw.wmnet - https://alerts.wikimedia.org/?q=alertname%3DKubernetesCalicoDown
|
|
2026-01-15 17:33:20
|
<wikibugs>
|
'ops-codfw, ''DC-Ops: wikikube-worker2346 DOA - https://phabricator.wikimedia.org/T414708 (''Jhancock.wm) ''NEW'
|
|
2026-01-15 17:34:19
|
<wikibugs>
|
('CR) ''CDanis: [C:''+2] LVS/gerrit: codfw [puppet] - ''https://gerrit.wikimedia.org/r/1227391 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 17:34:25
|
<jinxer-wm>
|
FIRING: [2x] SystemdUnitFailed: prometheus-node-textfile-check-nft.service on tcp-proxy4001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
|
|
2026-01-15 17:34:38
|
<cdanis>
|
!log 💙cdanis@cumin1003.eqiad.wmnet ~ 🕧☕ sudo cumin 'A:lvs-codfw' 'disable-puppet T411895'
|
|
2026-01-15 17:34:42
|
<stashbot>
|
Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
|
|
2026-01-15 17:34:43
|
<stashbot>
|
T411895: gerrit behind CDN - https://phabricator.wikimedia.org/T411895
|
|
2026-01-15 17:36:46
|
<wikibugs>
|
'ops-codfw, ''DC-Ops: wikikube-worker2346 DOA - https://phabricator.wikimedia.org/T414708#11525824 (''Jhancock.wm)'
|
|
2026-01-15 17:37:16
|
<wikibugs>
|
'ops-codfw, ''DC-Ops: wikikube-worker2346 DOA - https://phabricator.wikimedia.org/T414708#11525825 (''Jhancock.wm)'
|
|
2026-01-15 17:37:43
|
<cdanis>
|
!log 💙cdanis@cumin1003.eqiad.wmnet ~ 🕧☕ sudo cumin A:lvs-secondary-codfw 'systemctl restart pybal.service'
|
|
2026-01-15 17:37:44
|
<stashbot>
|
Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
|
|
2026-01-15 17:39:25
|
<jinxer-wm>
|
FIRING: [14x] SystemdUnitFailed: prometheus-node-textfile-check-nft.service on tcp-proxy1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
|
|
2026-01-15 17:41:53
|
<cdanis>
|
!log 💙cdanis@cumin1003.eqiad.wmnet ~ 🕧☕ sudo cumin A:lvs-high-traffic1-codfw 'systemctl restart pybal.service'
|
|
2026-01-15 17:41:56
|
<stashbot>
|
Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
|
|
2026-01-15 17:44:59
|
<cdanis>
|
!log 💙cdanis@cumin1003.eqiad.wmnet ~ 🕧☕ sudo cumin 'A:lvs-codfw or A:lvs-eqiad' 'enable-puppet T411895'
|
|
2026-01-15 17:45:03
|
<stashbot>
|
Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
|
|
2026-01-15 17:45:04
|
<stashbot>
|
T411895: gerrit behind CDN - https://phabricator.wikimedia.org/T411895
|
|
2026-01-15 17:56:13
|
<wikibugs>
|
('PS1) ''Milimetric: eventgate-analytics: increase instances to 30 [deployment-charts] - ''https://gerrit.wikimedia.org/r/1227392 (https://phabricator.wikimedia.org/T411454)'
|
|
2026-01-15 18:00:05
|
<jouncebot>
|
bd808: #bothumor Q:How do functions break up? A:They stop calling each other. Rise for Cloud Services/Technical Documentation weekly deploy (Toolhub, Developer portal, Striker) deploy. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T1800).
|
|
2026-01-15 18:00:05
|
<jouncebot>
|
Deploy window MediaWiki infrastructure (UTC late) (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T1800)
|
|
2026-01-15 18:00:23
|
<wikibugs>
|
('PS1) ''CDanis: tunnelencabulator: Gerrit/CDN 🚀 [debs/wmf-laptop] - ''https://gerrit.wikimedia.org/r/1227395 (https://phabricator.wikimedia.org/T411895)'
|
|
2026-01-15 18:05:56
|
<wikibugs>
|
('PS2) ''Seawolf35gerrit: ukwiki: Add "changetags" to sysop user group. [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227394'
|
|
2026-01-15 18:06:00
|
<wikibugs>
|
('CR) ''Ssingh: [C:''+1] "Strictly basing it on the additions to the existing code and modification for gerrit-cdn. I have not tested it so leave it to you :)" [debs/wmf-laptop] - ''https://gerrit.wikimedia.org/r/1227395 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 18:06:25
|
<jinxer-wm>
|
FIRING: SystemdUnitFailed: dump_proxy_ranges.service on puppetserver1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
|
|
2026-01-15 18:06:25
|
<wikibugs>
|
('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Thursday, January 15 UTC late backport window](https://wikitech.wikimedia.org/wiki/Deployments#deploycal-" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227394 (owner: ''Seawolf35gerrit)'
|
|
2026-01-15 18:07:13
|
<bd808>
|
Nothing to deploy in my window today
|
|
2026-01-15 18:07:46
|
<wikibugs>
|
('PS3) ''Seawolf35gerrit: ukwiki: Add "changetags" to sysop user group. [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227394 (https://phabricator.wikimedia.org/T414277)'
|
|
2026-01-15 18:14:01
|
<wikibugs>
|
('CR) ''Btullis: java: create openjdk-21 image (''1 comment) [docker-images/production-images] - ''https://gerrit.wikimedia.org/r/1227376 (https://phabricator.wikimedia.org/T414695) (owner: ''Bking)'
|
|
2026-01-15 18:23:11
|
<wikibugs>
|
('CR) ''Bking: java: create openjdk-21 image (''1 comment) [docker-images/production-images] - ''https://gerrit.wikimedia.org/r/1227376 (https://phabricator.wikimedia.org/T414695) (owner: ''Bking)'
|
|
2026-01-15 18:35:38
|
<wikibugs>
|
('PS1) ''Bking: opensearch-ipoid: Add codfw to list of sites [puppet] - ''https://gerrit.wikimedia.org/r/1227406 (https://phabricator.wikimedia.org/T412447)'
|
|
2026-01-15 18:38:06
|
<jinxer-wm>
|
FIRING: CoreRouterInterfaceDown: Core router interface down - pfw1-codfw:reth1 (Subnet frack-fundraising-codfw in F5) - https://wikitech.wikimedia.org/wiki/Network_monitoring#Router_interface_down - https://grafana.wikimedia.org/d/fb403d62-5f03-434a-9dff-bd02b9fff504/network-device-overview?var-instance=pfw1-codfw:9804 - https://alerts.wikimedia.org/?q=alertname%3DCoreRouterInterfaceDown
|
|
2026-01-15 18:41:48
|
<wikibugs>
|
('CR) ''Bking: [C:''+2] opensearch-ipoid: Add codfw to list of sites [puppet] - ''https://gerrit.wikimedia.org/r/1227406 (https://phabricator.wikimedia.org/T412447) (owner: ''Bking)'
|
|
2026-01-15 18:44:41
|
<logmsgbot>
|
!log pt1979@cumin2002 START - Cookbook sre.dns.netbox
|
|
2026-01-15 18:45:35
|
<wikibugs>
|
('CR) ''Muehlenhoff: [C:''+1] "Looks good and verified out of band" [puppet] - ''https://gerrit.wikimedia.org/r/1226922 (https://phabricator.wikimedia.org/T414619) (owner: ''Dduvall)'
|
|
2026-01-15 18:45:50
|
<wikibugs>
|
('CR) ''Muehlenhoff: [C:''+2] admin: Add new yubikey-ssh-fido keys for dduvall [puppet] - ''https://gerrit.wikimedia.org/r/1226922 (https://phabricator.wikimedia.org/T414619) (owner: ''Dduvall)'
|
|
2026-01-15 18:46:56
|
<logmsgbot>
|
!log pt1979@cumin2002 END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
|
|
2026-01-15 18:47:33
|
<wikibugs>
|
('PS2) ''CDanis: tunnelencabulator: Gerrit/CDN 🚀 [debs/wmf-laptop] - ''https://gerrit.wikimedia.org/r/1227395 (https://phabricator.wikimedia.org/T411895)'
|
|
2026-01-15 18:49:16
|
<wikibugs>
|
('CR) ''Ssingh: [C:''+1] "Yes, fair enough :) [PS2-PS1 diff]" [debs/wmf-laptop] - ''https://gerrit.wikimedia.org/r/1227395 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 18:49:48
|
<wikibugs>
|
('PS3) ''CDanis: tunnelencabulator: Gerrit/CDN 🚀 [debs/wmf-laptop] - ''https://gerrit.wikimedia.org/r/1227395 (https://phabricator.wikimedia.org/T411895)'
|
|
2026-01-15 18:50:03
|
<logmsgbot>
|
!log pt1979@cumin2002 START - Cookbook sre.dns.netbox
|
|
2026-01-15 18:53:44
|
<logmsgbot>
|
!log pt1979@cumin2002 START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new wikikube-worker nodes - pt1979@cumin2002"
|
|
2026-01-15 18:53:44
|
<wikibugs>
|
('CR) ''Ssingh: [C:''+1] tunnelencabulator: Gerrit/CDN 🚀 [debs/wmf-laptop] - ''https://gerrit.wikimedia.org/r/1227395 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 18:53:49
|
<logmsgbot>
|
!log pt1979@cumin2002 END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new wikikube-worker nodes - pt1979@cumin2002"
|
|
2026-01-15 18:53:49
|
<logmsgbot>
|
!log pt1979@cumin2002 END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
|
|
2026-01-15 18:55:48
|
<wikibugs>
|
('CR) ''CDanis: [V:''+2 C:''+2] tunnelencabulator: Gerrit/CDN 🚀 [debs/wmf-laptop] - ''https://gerrit.wikimedia.org/r/1227395 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 18:58:52
|
<wikibugs>
|
('CR) ''Ssingh: "@vgutierrez@wikimedia.org: We discussed this during the meeting and decided it was fine to merge. Can you stamp this please?" [puppet] - ''https://gerrit.wikimedia.org/r/1218817 (https://phabricator.wikimedia.org/T412863) (owner: ''Milimetric)'
|
|
2026-01-15 18:59:58
|
<wikibugs>
|
'SRE, ''SRE-Access-Requests, ''Data-Engineering, ''Patch-For-Review: Requesting access to analytics-privatedata-users for kareid - https://phabricator.wikimedia.org/T413364#11526053 (''thcipriani) >>! In T413364#11521115, @JMeybohm wrote: > @thcipriani this needs sign-off from you as the approver for the...'
|
|
2026-01-15 19:00:05
|
<jouncebot>
|
jeena and dduvall: #bothumor Q:Why did functions stop calling each other? A:They had arguments. Rise for MediaWiki train - Utc-7 Version . (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T1900).
|
|
2026-01-15 19:01:51
|
<wikibugs>
|
'SRE, ''SRE-Access-Requests, ''Data-Platform-SRE: Requesting deployment access for AKhatun - https://phabricator.wikimedia.org/T414347#11526060 (''thcipriani) >>! In T414347#11512705, @BTullis wrote: > We will need approval from @Ahoelzl as your manager and from @thcipriani as the approver for the `deployme...'
|
|
2026-01-15 19:04:22
|
<wikibugs>
|
('PS1) ''TrainBranchBot: group2 to 1.46.0-wmf.11 [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227410 (https://phabricator.wikimedia.org/T413802)'
|
|
2026-01-15 19:04:25
|
<wikibugs>
|
('CR) ''TrainBranchBot: [C:''+2] "Initiated by jhuneidi@deploy2002" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227410 (https://phabricator.wikimedia.org/T413802) (owner: ''TrainBranchBot)'
|
|
2026-01-15 19:05:17
|
<wikibugs>
|
'ops-codfw, ''SRE, ''Data-Persistence, ''DC-Ops: Q3:rack/setup/install ms-backup200[34] - https://phabricator.wikimedia.org/T414717 (''RobH) ''NEW'
|
|
2026-01-15 19:05:21
|
<wikibugs>
|
('Merged) ''jenkins-bot: group2 to 1.46.0-wmf.11 [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227410 (https://phabricator.wikimedia.org/T413802) (owner: ''TrainBranchBot)'
|
|
2026-01-15 19:05:22
|
<wikibugs>
|
'ops-eqiad, ''SRE, ''Data-Persistence, ''DC-Ops: Q3:rack/setup/install ms-backup100[34] - https://phabricator.wikimedia.org/T414718 (''RobH) ''NEW'
|
|
2026-01-15 19:05:42
|
<wikibugs>
|
'ops-codfw, ''SRE, ''Data-Persistence, ''DC-Ops: Q3:rack/setup/install ms-backup200[34] - https://phabricator.wikimedia.org/T414717#11526094 (''RobH)'
|
|
2026-01-15 19:08:39
|
<wikibugs>
|
'ops-eqiad, ''SRE, ''Data-Persistence, ''DC-Ops: Q3:rack/setup/install ms-backup100[34] - https://phabricator.wikimedia.org/T414718#11526107 (''RobH)'
|
|
2026-01-15 19:09:02
|
<wikibugs>
|
'ops-codfw, ''SRE, ''Data-Persistence, ''DC-Ops: Q3:rack/setup/install ms-backup200[34] - https://phabricator.wikimedia.org/T414717#11526108 (''RobH) a:''jcrespo Jaime, I made assumptions on the racking details based on the existing ms-backup hosts. Please double-check the racking details in this task...'
|
|
2026-01-15 19:09:05
|
<wikibugs>
|
'ops-eqiad, ''SRE, ''Data-Persistence, ''DC-Ops: Q3:rack/setup/install ms-backup100[34] - https://phabricator.wikimedia.org/T414718#11526112 (''RobH) a:''jcrespo Jaime, I made assumptions on the racking details based on the existing ms-backup hosts. Please double-check the racking details in this task...'
|
|
2026-01-15 19:09:36
|
<wikibugs>
|
'ops-codfw, ''SRE, ''Data-Persistence, ''DC-Ops: Q3:rack/setup/install ms-backup200[34] - https://phabricator.wikimedia.org/T414717#11526116 (''RobH)'
|
|
2026-01-15 19:09:37
|
<wikibugs>
|
'ops-codfw, ''SRE, ''Data-Persistence, ''DC-Ops: Q3:rack/setup/install ms-backup200[34] - https://phabricator.wikimedia.org/T414717#11526117 (''jcrespo) Will do.'
|
|
2026-01-15 19:09:44
|
<wikibugs>
|
'ops-eqiad, ''SRE, ''Data-Persistence, ''DC-Ops: Q3:rack/setup/install ms-backup100[34] - https://phabricator.wikimedia.org/T414718#11526118 (''RobH)'
|
|
2026-01-15 19:11:27
|
<logmsgbot>
|
!log jhuneidi@deploy2002 rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.11 refs T413802
|
|
2026-01-15 19:11:32
|
<stashbot>
|
T413802: 1.46.0-wmf.11 deployment blockers - https://phabricator.wikimedia.org/T413802
|
|
2026-01-15 19:27:59
|
<wikibugs>
|
('PS3) ''A smart kitten: ukwiki: Move assignments of FlaggedRevs permissions to flaggedrevs.php [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227385 (https://phabricator.wikimedia.org/T414277)'
|
|
2026-01-15 19:28:54
|
<wikibugs>
|
('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Thursday, January 15 UTC late backport window](https://wikitech.wikimedia.org/wiki/Deployments#deploycal-" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227385 (https://phabricator.wikimedia.org/T414277) (owner: ''A smart kitten)'
|
|
2026-01-15 19:29:12
|
<wikibugs>
|
('CR) ''A smart kitten: "Did some testing locally, this approach seems like it should (hopefully) work :)" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227385 (https://phabricator.wikimedia.org/T414277) (owner: ''A smart kitten)'
|
|
2026-01-15 19:30:08
|
<wikibugs>
|
('CR) ''A smart kitten: ukwiki: Various changes to user rights. (''1 comment) [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1225596 (https://phabricator.wikimedia.org/T414277) (owner: ''Seawolf35gerrit)'
|
|
2026-01-15 19:30:32
|
<logmsgbot>
|
!log marostegui@cumin1003 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2145.codfw.wmnet with reason: Maintenance
|
|
2026-01-15 19:30:41
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Depooling db2145 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P87566 and previous config saved to /var/cache/conftool/dbconfig/20260115-193040-marostegui.json
|
|
2026-01-15 19:30:47
|
<stashbot>
|
T411163: Drop ar_sha1 from archive table in wmf production - https://phabricator.wikimedia.org/T411163
|
|
2026-01-15 19:30:47
|
<stashbot>
|
T411164: Drop rev_sha1 from revision table in wmf production - https://phabricator.wikimedia.org/T411164
|
|
2026-01-15 19:31:04
|
<wikibugs>
|
'SRE, ''DNS, ''serviceops, ''Traffic, and 2 others: Set up DNS for abstract.wikipedia.org to be recognised - https://phabricator.wikimedia.org/T411724#11526184 (''ssingh) This is typically done as part of a new wiki creation process, but Traffic is happy to help as required.'
|
|
2026-01-15 19:43:36
|
<wikibugs>
|
('CR) ''Seawolf35gerrit: ukwiki: Various changes to user rights. (''1 comment) [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1225596 (https://phabricator.wikimedia.org/T414277) (owner: ''Seawolf35gerrit)'
|
|
2026-01-15 19:48:52
|
<wikibugs>
|
'ops-codfw, ''SRE, ''Data-Persistence-Backup, ''DC-Ops: Q3:rack/setup/install backup2015 - https://phabricator.wikimedia.org/T414724 (''RobH) ''NEW'
|
|
2026-01-15 19:49:20
|
<wikibugs>
|
'ops-codfw, ''SRE, ''Data-Persistence-Backup, ''DC-Ops: Q3:rack/setup/install backup2015 - https://phabricator.wikimedia.org/T414724#11526283 (''RobH)'
|
|
2026-01-15 19:49:31
|
<logmsgbot>
|
!log jasmine@cumin1003 START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2003-2004,2007-2010,2019-2032,2040,2043,2045,2048].codfw.wmnet
|
|
2026-01-15 19:50:21
|
<logmsgbot>
|
!log jasmine@cumin1003 END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2003-2004,2007-2010,2019-2032,2040,2043,2045,2048].codfw.wmnet
|
|
2026-01-15 19:51:10
|
<wikibugs>
|
'ops-codfw, ''SRE, ''Data-Persistence-Backup, ''DC-Ops: Q3:rack/setup/install backup2015 - https://phabricator.wikimedia.org/T414724#11526289 (''RobH) a:''jcrespo Jaime, I had to split up the expansion and refresh budget lines for backup this quarter, so this racking task (and its parent order task) on...'
|
|
2026-01-15 19:51:14
|
<wikibugs>
|
('CR) ''Jasmine: [C:''+2] wikikube: decommission worker[2003-2004,2007-2010,2019-2032,2040,2043,2045,2048] [puppet] - ''https://gerrit.wikimedia.org/r/1205225 (https://phabricator.wikimedia.org/T409102) (owner: ''Jasmine)'
|
|
2026-01-15 19:51:54
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db1199 (T413525)', diff saved to https://phabricator.wikimedia.org/P87567 and previous config saved to /var/cache/conftool/dbconfig/20260115-195153-marostegui.json
|
|
2026-01-15 19:51:59
|
<stashbot>
|
T413525: Add il_target_id to imagelinks table in wmf production - https://phabricator.wikimedia.org/T413525
|
|
2026-01-15 19:52:30
|
<wikibugs>
|
'ops-codfw, ''SRE, ''Data-Persistence, ''DC-Ops: Q3:rack/setup/install backup2015 - https://phabricator.wikimedia.org/T414724#11526301 (''RobH)'
|
|
2026-01-15 19:52:51
|
<wikibugs>
|
'ops-eqiad, ''SRE, ''Data-Persistence, ''DC-Ops: Q#:rack/setup/install X - https://phabricator.wikimedia.org/T414725 (''RobH) ''NEW'
|
|
2026-01-15 19:53:21
|
<wikibugs>
|
'ops-eqiad, ''SRE, ''Data-Persistence, ''DC-Ops: Q#:rack/setup/install X - https://phabricator.wikimedia.org/T414725#11526321 (''RobH) a:''jcrespo Jaime, I had to split up the expansion and refresh budget lines for backup this quarter, so this racking task (and its parent order task) only covers the li...'
|
|
2026-01-15 19:53:53
|
<wikibugs>
|
'ops-eqiad, ''SRE, ''Data-Persistence, ''DC-Ops: Q#:rack/setup/install X - https://phabricator.wikimedia.org/T414725#11526331 (''RobH)'
|
|
2026-01-15 19:54:01
|
<jasmine_>
|
!log “homer run T409102”
|
|
2026-01-15 19:54:05
|
<stashbot>
|
Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
|
|
2026-01-15 19:54:05
|
<stashbot>
|
T409102: decommission wikikube-worker[2003-2004,2007-2010,2019-2032,2040,2043,2045,2048].codfw.wmnet - https://phabricator.wikimedia.org/T409102
|
|
2026-01-15 19:56:15
|
<wikibugs>
|
'SRE, ''DNS, ''serviceops, ''Traffic, and 2 others: Set up DNS for abstract.wikipedia.org to be recognised - https://phabricator.wikimedia.org/T411724#11526339 (''Jdforrester-WMF) >>! In T411724#11526184, @ssingh wrote: > This is typically done as part of a new wiki creation process, but Traffic is happy...'
|
|
2026-01-15 20:02:03
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P87568 and previous config saved to /var/cache/conftool/dbconfig/20260115-200202-marostegui.json
|
|
2026-01-15 20:02:52
|
<wikibugs>
|
('PS1) ''CDanis: services: gerrit* --> monitoring_setup [puppet] - ''https://gerrit.wikimedia.org/r/1227423 (https://phabricator.wikimedia.org/T411895)'
|
|
2026-01-15 20:03:05
|
<wikibugs>
|
('CR) ''CDanis: "check experimental" [puppet] - ''https://gerrit.wikimedia.org/r/1227423 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 20:05:17
|
<wikibugs>
|
'ops-codfw, ''DC-Ops, ''decommission-hardware, ''serviceops, ''Patch-For-Review: decommission wikikube-worker[2003-2004,2007-2010,2019-2032,2040,2043,2045,2048].codfw.wmnet - https://phabricator.wikimedia.org/T409102#11526351 (''jasmine_)'
|
|
2026-01-15 20:07:22
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db2219 (T413525)', diff saved to https://phabricator.wikimedia.org/P87569 and previous config saved to /var/cache/conftool/dbconfig/20260115-200721-marostegui.json
|
|
2026-01-15 20:07:26
|
<stashbot>
|
T413525: Add il_target_id to imagelinks table in wmf production - https://phabricator.wikimedia.org/T413525
|
|
2026-01-15 20:12:11
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P87570 and previous config saved to /var/cache/conftool/dbconfig/20260115-201210-marostegui.json
|
|
2026-01-15 20:12:53
|
<jinxer-wm>
|
FIRING: KubernetesAPILatency: High Kubernetes API latency (LIST secrets) on k8s@codfw - https://wikitech.wikimedia.org/wiki/Kubernetes - https://grafana.wikimedia.org/d/ddNd-sLnk/kubernetes-api-details?var-site=codfw&var-cluster=k8s&var-latency_percentile=0.95&var-verb=LIST - https://alerts.wikimedia.org/?q=alertname%3DKubernetesAPILatency
|
|
2026-01-15 20:14:11
|
<jinxer-wm>
|
FIRING: JobUnavailable: Reduced availability for job thanos-compact in ops@codfw - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/NEJu05xZz/prometheus-targets - https://alerts.wikimedia.org/?q=alertname%3DJobUnavailable
|
|
2026-01-15 20:17:30
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P87571 and previous config saved to /var/cache/conftool/dbconfig/20260115-201730-marostegui.json
|
|
2026-01-15 20:19:05
|
<icinga-wm>
|
PROBLEM - mailman list info ssl expiry on lists1004 is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Mailman/Monitoring
|
|
2026-01-15 20:19:55
|
<icinga-wm>
|
RECOVERY - mailman list info ssl expiry on lists1004 is OK: OK - Certificate lists.wikimedia.org will expire on Sat 04 Apr 2026 07:22:16 PM GMT +0000. https://wikitech.wikimedia.org/wiki/Mailman/Monitoring
|
|
2026-01-15 20:19:57
|
<wikibugs>
|
('PS4) ''A smart kitten: ukwiki: Move assignments of FlaggedRevs permissions to flaggedrevs.php [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227385 (https://phabricator.wikimedia.org/T414277)'
|
|
2026-01-15 20:20:56
|
<wikibugs>
|
('CR) ''A smart kitten: "PS4 is a rebase on top of https://gerrit.wikimedia.org/r/1227394, after I realised the two patches would probably have merge conflicts wit" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227385 (https://phabricator.wikimedia.org/T414277) (owner: ''A smart kitten)'
|
|
2026-01-15 20:22:19
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db1199 (T413525)', diff saved to https://phabricator.wikimedia.org/P87572 and previous config saved to /var/cache/conftool/dbconfig/20260115-202218-marostegui.json
|
|
2026-01-15 20:22:23
|
<stashbot>
|
T413525: Add il_target_id to imagelinks table in wmf production - https://phabricator.wikimedia.org/T413525
|
|
2026-01-15 20:22:36
|
<logmsgbot>
|
!log marostegui@cumin1003 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1221.eqiad.wmnet with reason: Maintenance
|
|
2026-01-15 20:22:53
|
<jinxer-wm>
|
RESOLVED: KubernetesAPILatency: High Kubernetes API latency (LIST secrets) on k8s@codfw - https://wikitech.wikimedia.org/wiki/Kubernetes - https://grafana.wikimedia.org/d/ddNd-sLnk/kubernetes-api-details?var-site=codfw&var-cluster=k8s&var-latency_percentile=0.95&var-verb=LIST - https://alerts.wikimedia.org/?q=alertname%3DKubernetesAPILatency
|
|
2026-01-15 20:22:57
|
<logmsgbot>
|
!log marostegui@cumin1003 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
|
|
2026-01-15 20:23:06
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Depooling db1221 (T413525)', diff saved to https://phabricator.wikimedia.org/P87573 and previous config saved to /var/cache/conftool/dbconfig/20260115-202305-marostegui.json
|
|
2026-01-15 20:27:38
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P87574 and previous config saved to /var/cache/conftool/dbconfig/20260115-202738-marostegui.json
|
|
2026-01-15 20:31:35
|
<wikibugs>
|
'ops-codfw, ''SRE, ''Data-Persistence, ''DC-Ops: Q3:rack/setup/install backup20[16-20] - https://phabricator.wikimedia.org/T414727 (''RobH) ''NEW'
|
|
2026-01-15 20:31:52
|
<wikibugs>
|
'ops-codfw, ''SRE, ''Data-Persistence, ''DC-Ops: Q3:rack/setup/install backup20[16-20] - https://phabricator.wikimedia.org/T414727#11526457 (''RobH)'
|
|
2026-01-15 20:33:07
|
<wikibugs>
|
'ops-eqiad, ''SRE, ''Data-Persistence, ''DC-Ops: Q3:rack/setup/install backup10[16-20] - https://phabricator.wikimedia.org/T414728 (''RobH) ''NEW'
|
|
2026-01-15 20:33:28
|
<wikibugs>
|
'ops-eqiad, ''SRE, ''Data-Persistence, ''DC-Ops: Q3:rack/setup/install backup10[16-20] - https://phabricator.wikimedia.org/T414728#11526474 (''RobH)'
|
|
2026-01-15 20:35:46
|
<wikibugs>
|
'ops-eqiad, ''SRE, ''Data-Persistence, ''DC-Ops: Q3:rack/setup/install backup1015 - https://phabricator.wikimedia.org/T414725#11526489 (''RobH)'
|
|
2026-01-15 20:36:35
|
<wikibugs>
|
'ops-codfw, ''SRE, ''Data-Persistence, ''DC-Ops: Q3:rack/setup/install backup20[16-20] - https://phabricator.wikimedia.org/T414727#11526492 (''RobH) a:''jcrespo Jaime, I had to split up the expansion and refresh budget lines for backup this quarter, so this racking task (and its parent order task) only...'
|
|
2026-01-15 20:36:39
|
<wikibugs>
|
'ops-eqiad, ''SRE, ''Data-Persistence, ''DC-Ops: Q3:rack/setup/install backup10[16-20] - https://phabricator.wikimedia.org/T414728#11526498 (''RobH) a:''jcrespo Jaime, I had to split up the expansion and refresh budget lines for backup this quarter, so this racking task (and its parent order task) only...'
|
|
2026-01-15 20:37:47
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Repooling after maintenance db2219 (T413525)', diff saved to https://phabricator.wikimedia.org/P87575 and previous config saved to /var/cache/conftool/dbconfig/20260115-203746-marostegui.json
|
|
2026-01-15 20:37:51
|
<stashbot>
|
T413525: Add il_target_id to imagelinks table in wmf production - https://phabricator.wikimedia.org/T413525
|
|
2026-01-15 20:37:52
|
<logmsgbot>
|
!log marostegui@cumin1003 DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
|
|
2026-01-15 20:38:00
|
<logmsgbot>
|
!log marostegui@cumin1003 dbctl commit (dc=all): 'Depooling db2236 (T413525)', diff saved to https://phabricator.wikimedia.org/P87576 and previous config saved to /var/cache/conftool/dbconfig/20260115-203759-marostegui.json
|
|
2026-01-15 20:39:07
|
<wikibugs>
|
('PS1) ''Jasmine: wikikube: decommission wikikube-worker[2052-2054,2063,2079-2084,2096-2101].codfw.wmnet [puppet] - ''https://gerrit.wikimedia.org/r/1227431 (https://phabricator.wikimedia.org/T409103)'
|
|
2026-01-15 20:39:38
|
<wikibugs>
|
('CR) ''CI reject: [V:''-1] wikikube: decommission wikikube-worker[2052-2054,2063,2079-2084,2096-2101].codfw.wmnet [puppet] - ''https://gerrit.wikimedia.org/r/1227431 (https://phabricator.wikimedia.org/T409103) (owner: ''Jasmine)'
|
|
2026-01-15 20:41:10
|
<wikibugs>
|
('PS2) ''Jasmine: wikikube: decommission worker[2052-2054,2063,2079-2084,2096-2101].codfw.wmnet [puppet] - ''https://gerrit.wikimedia.org/r/1227431 (https://phabricator.wikimedia.org/T409103)'
|
|
2026-01-15 20:44:47
|
<wikibugs>
|
('CR) ''Andrew Bogott: [C:''+2] Revert "wmcs cinder backups: move all backups to 2003 so 2004 can be reimaged" [puppet] - ''https://gerrit.wikimedia.org/r/1226952 (owner: ''Andrew Bogott)'
|
|
2026-01-15 20:59:10
|
<wikibugs>
|
('PS1) ''Clare Ming: Update experiment code for JS, PHP SDKs testing of TK [extensions/TestKitchen] (wmf/1.46.0-wmf.11) - ''https://gerrit.wikimedia.org/r/1227435 (https://phabricator.wikimedia.org/T414528)'
|
|
2026-01-15 20:59:30
|
<wikibugs>
|
('CR) ''ScheduleDeploymentBot: "Scheduled for deployment in the [Thursday, January 15 UTC late backport window](https://wikitech.wikimedia.org/wiki/Deployments#deploycal-" [extensions/TestKitchen] (wmf/1.46.0-wmf.11) - ''https://gerrit.wikimedia.org/r/1227435 (https://phabricator.wikimedia.org/T414528) (owner: ''Clare Ming)'
|
|
2026-01-15 21:00:05
|
<jouncebot>
|
RoanKattouw, Urbanecm, TheresNoTime, kindrobot, and cjming: #bothumor Q:Why did functions stop calling each other? A:They had arguments. Rise for UTC late backport window . (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T2100).
|
|
2026-01-15 21:00:05
|
<jouncebot>
|
xSavitar, katherine_g, Seawolf35, A_smart_kitten, and cjming: A patch you scheduled for UTC late backport window is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker.
|
|
2026-01-15 21:00:16
|
<A_smart_kitten>
|
heya, i'm here :)
|
|
2026-01-15 21:00:17
|
<katherine_g>
|
o/
|
|
2026-01-15 21:00:20
|
<xSavitar>
|
o/
|
|
2026-01-15 21:00:29
|
<Seawolf35>
|
o/
|
|
2026-01-15 21:01:05
|
<xSavitar>
|
I can self-service my backports then deployers/others can carry one 🙏🏽
|
|
2026-01-15 21:01:21
|
<xSavitar>
|
*on
|
|
2026-01-15 21:01:23
|
<jeena>
|
I can help with backporting if needed
|
|
2026-01-15 21:01:29
|
<A_smart_kitten>
|
I will need a deployer
|
|
2026-01-15 21:01:36
|
<Seawolf35>
|
Me as well
|
|
2026-01-15 21:01:39
|
<xSavitar>
|
jeena, I'll poke you once I'm done.
|
|
2026-01-15 21:01:58
|
<jeena>
|
xSavitar: 👍 Thank you!
|
|
2026-01-15 21:03:12
|
<wikibugs>
|
('CR) ''TrainBranchBot: [C:''+2] "Approved by derick@deploy2002 using scap backport" [extensions/OAuth] (wmf/1.46.0-wmf.11) - ''https://gerrit.wikimedia.org/r/1227281 (owner: ''D3r1ck01)'
|
|
2026-01-15 21:03:12
|
<wikibugs>
|
('CR) ''TrainBranchBot: [C:''+2] "Approved by derick@deploy2002 using scap backport" [extensions/OAuth] (wmf/1.46.0-wmf.11) - ''https://gerrit.wikimedia.org/r/1227282 (https://phabricator.wikimedia.org/T413947) (owner: ''D3r1ck01)'
|
|
2026-01-15 21:04:01
|
<cjming>
|
also happy to deploy if needed - will self-service when it's my turn
|
|
2026-01-15 21:05:25
|
<wikibugs>
|
('CR) ''Gmodena: blazegraph: alert on ratio of failed queries increase (''1 comment) [alerts] - ''https://gerrit.wikimedia.org/r/1227364 (https://phabricator.wikimedia.org/T414306) (owner: ''Trueg)'
|
|
2026-01-15 21:06:18
|
<wikibugs>
|
('Merged) ''jenkins-bot: Control: When saving grants, ensure array has no gaps [extensions/OAuth] (wmf/1.46.0-wmf.11) - ''https://gerrit.wikimedia.org/r/1227281 (owner: ''D3r1ck01)'
|
|
2026-01-15 21:06:18
|
<wikibugs>
|
('Merged) ''jenkins-bot: Control: Keep irrevocable grants when accepting new OAuth 2 consumers [extensions/OAuth] (wmf/1.46.0-wmf.11) - ''https://gerrit.wikimedia.org/r/1227282 (https://phabricator.wikimedia.org/T413947) (owner: ''D3r1ck01)'
|
|
2026-01-15 21:06:39
|
<logmsgbot>
|
!log derick@deploy2002 Started scap sync-world: Backport for [[gerrit:1227281|Control: When saving grants, ensure array has no gaps]], [[gerrit:1227282|Control: Keep irrevocable grants when accepting new OAuth 2 consumers (T413947)]]
|
|
2026-01-15 21:06:43
|
<stashbot>
|
T413947: Updating grants (via Special:OAuthManageMyGrants) of OAuth accepted consumers overrides its grants with empty array - https://phabricator.wikimedia.org/T413947
|
|
2026-01-15 21:08:36
|
<logmsgbot>
|
!log derick@deploy2002 derick, d3r1ck01: Backport for [[gerrit:1227281|Control: When saving grants, ensure array has no gaps]], [[gerrit:1227282|Control: Keep irrevocable grants when accepting new OAuth 2 consumers (T413947)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
|
|
2026-01-15 21:09:07
|
<xSavitar>
|
testing...
|
|
2026-01-15 21:09:54
|
<jinxer-wm>
|
FIRING: KubernetesAPILatency: High Kubernetes API latency (LIST secrets) on k8s@codfw - https://wikitech.wikimedia.org/wiki/Kubernetes - https://grafana.wikimedia.org/d/ddNd-sLnk/kubernetes-api-details?var-site=codfw&var-cluster=k8s&var-latency_percentile=0.95&var-verb=LIST - https://alerts.wikimedia.org/?q=alertname%3DKubernetesAPILatency
|
|
2026-01-15 21:11:58
|
<xSavitar>
|
Things look good and issues seem to have been resolved. Syncing
|
|
2026-01-15 21:12:09
|
<logmsgbot>
|
!log derick@deploy2002 derick, d3r1ck01: Continuing with sync
|
|
2026-01-15 21:14:53
|
<jinxer-wm>
|
RESOLVED: KubernetesAPILatency: High Kubernetes API latency (LIST secrets) on k8s@codfw - https://wikitech.wikimedia.org/wiki/Kubernetes - https://grafana.wikimedia.org/d/ddNd-sLnk/kubernetes-api-details?var-site=codfw&var-cluster=k8s&var-latency_percentile=0.95&var-verb=LIST - https://alerts.wikimedia.org/?q=alertname%3DKubernetesAPILatency
|
|
2026-01-15 21:16:20
|
<logmsgbot>
|
!log derick@deploy2002 Finished scap sync-world: Backport for [[gerrit:1227281|Control: When saving grants, ensure array has no gaps]], [[gerrit:1227282|Control: Keep irrevocable grants when accepting new OAuth 2 consumers (T413947)]] (duration: 09m 41s)
|
|
2026-01-15 21:16:24
|
<stashbot>
|
T413947: Updating grants (via Special:OAuthManageMyGrants) of OAuth accepted consumers overrides its grants with empty array - https://phabricator.wikimedia.org/T413947
|
|
2026-01-15 21:16:27
|
<Seawolf35>
|
I’ll be afk for 5 min or so, I’ll be back in time for my patch.
|
|
2026-01-15 21:16:47
|
<xSavitar>
|
respectfully yours, jeena / cjming, over to you 🙏🏽
|
|
2026-01-15 21:17:01
|
<xSavitar>
|
I'm done!
|
|
2026-01-15 21:17:57
|
<jeena>
|
okay well I was going to see if I could do A_smart_kitten and Seawolf35 's one together
|
|
2026-01-15 21:18:21
|
<jeena>
|
but do you want to go ahead first cjming ?
|
|
2026-01-15 21:18:22
|
<A_smart_kitten>
|
jeena: that's actually what i was thinking myself as well (so long as Seawolf35 is okay with it)
|
|
2026-01-15 21:18:30
|
<cjming>
|
bows to jeena
|
|
2026-01-15 21:18:32
|
<jeena>
|
yeah they just left the channel
|
|
2026-01-15 21:18:41
|
<A_smart_kitten>
|
yeah, as they're AFK right now maybe cjming or katherine_g could go first?
|
|
2026-01-15 21:18:50
|
<cjming>
|
jeena: go ahead - i have to fiddle with something first
|
|
2026-01-15 21:19:01
|
<jeena>
|
okay I'll do yours katherine_g
|
|
2026-01-15 21:19:07
|
<katherine_g>
|
ok, ty
|
|
2026-01-15 21:19:47
|
<wikibugs>
|
('CR) ''TrainBranchBot: [C:''+2] "Approved by jhuneidi@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227346 (https://phabricator.wikimedia.org/T403982) (owner: ''Jsn.sherman)'
|
|
2026-01-15 21:20:40
|
<wikibugs>
|
('Merged) ''jenkins-bot: Deploy PersonalDashboard to testwiki [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227346 (https://phabricator.wikimedia.org/T403982) (owner: ''Jsn.sherman)'
|
|
2026-01-15 21:20:59
|
<logmsgbot>
|
!log jhuneidi@deploy2002 Started scap sync-world: Backport for [[gerrit:1227346|Deploy PersonalDashboard to testwiki (T403982)]]
|
|
2026-01-15 21:21:04
|
<stashbot>
|
T403982: Create and deploy Extension:PersonalDashboard - https://phabricator.wikimedia.org/T403982
|
|
2026-01-15 21:21:57
|
<wikibugs>
|
('Abandoned) ''Arlolra: Support incremental roll out of Parsoid Read Views [extensions/ParserMigration] (wmf/1.46.0-wmf.7) - ''https://gerrit.wikimedia.org/r/1224837 (https://phabricator.wikimedia.org/T391881) (owner: ''Arlolra)'
|
|
2026-01-15 21:22:59
|
<logmsgbot>
|
!log jhuneidi@deploy2002 jhuneidi, jsn: Backport for [[gerrit:1227346|Deploy PersonalDashboard to testwiki (T403982)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
|
|
2026-01-15 21:23:06
|
<Seawolf35>
|
Back
|
|
2026-01-15 21:24:11
|
<jinxer-wm>
|
FIRING: KubernetesCalicoDown: ml-serve2004.codfw.wmnet is not running calico-node Pod - https://wikitech.wikimedia.org/wiki/Calico#Operations - https://grafana.wikimedia.org/d/G8zPL7-Wz/?var-dc=codfw%20prometheus%2Fk8s-mlserve&var-instance=ml-serve2004.codfw.wmnet - https://alerts.wikimedia.org/?q=alertname%3DKubernetesCalicoDown
|
|
2026-01-15 21:25:45
|
<jeena>
|
katherine_g: do you need to check anything on the testservers?
|
|
2026-01-15 21:25:50
|
<katherine_g>
|
alright, tested and it looks good on my end
|
|
2026-01-15 21:25:57
|
<katherine_g>
|
jeena: ty
|
|
2026-01-15 21:25:59
|
<jeena>
|
cool thanks!
|
|
2026-01-15 21:26:06
|
<logmsgbot>
|
!log jhuneidi@deploy2002 jhuneidi, jsn: Continuing with sync
|
|
2026-01-15 21:26:09
|
<wikibugs>
|
('CR) ''Ssingh: services: gerrit* --> monitoring_setup (''1 comment) [puppet] - ''https://gerrit.wikimedia.org/r/1227423 (https://phabricator.wikimedia.org/T411895) (owner: ''CDanis)'
|
|
2026-01-15 21:27:26
|
<jeena>
|
Seawolf35: we were wondering if your change can be deployed with A_smart_kitten 's together?
|
|
2026-01-15 21:27:36
|
<Seawolf35>
|
Sure
|
|
2026-01-15 21:27:45
|
<A_smart_kitten>
|
:)
|
|
2026-01-15 21:30:17
|
<logmsgbot>
|
!log jhuneidi@deploy2002 Finished scap sync-world: Backport for [[gerrit:1227346|Deploy PersonalDashboard to testwiki (T403982)]] (duration: 09m 18s)
|
|
2026-01-15 21:30:21
|
<stashbot>
|
T403982: Create and deploy Extension:PersonalDashboard - https://phabricator.wikimedia.org/T403982
|
|
2026-01-15 21:31:14
|
<jeena>
|
cjming: I'm going to go ahead and do the remaining two now
|
|
2026-01-15 21:31:28
|
<cjming>
|
jeena: great - thanks!
|
|
2026-01-15 21:32:40
|
<wikibugs>
|
('CR) ''TrainBranchBot: [C:''+2] "Approved by jhuneidi@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227385 (https://phabricator.wikimedia.org/T414277) (owner: ''A smart kitten)'
|
|
2026-01-15 21:32:40
|
<wikibugs>
|
('CR) ''TrainBranchBot: [C:''+2] "Approved by jhuneidi@deploy2002 using scap backport" [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227394 (https://phabricator.wikimedia.org/T414277) (owner: ''Seawolf35gerrit)'
|
|
2026-01-15 21:33:33
|
<wikibugs>
|
('Merged) ''jenkins-bot: ukwiki: Add "changetags" to sysop user group. [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227394 (https://phabricator.wikimedia.org/T414277) (owner: ''Seawolf35gerrit)'
|
|
2026-01-15 21:33:35
|
<wikibugs>
|
('Merged) ''jenkins-bot: ukwiki: Move assignments of FlaggedRevs permissions to flaggedrevs.php [mediawiki-config] - ''https://gerrit.wikimedia.org/r/1227385 (https://phabricator.wikimedia.org/T414277) (owner: ''A smart kitten)'
|
|
2026-01-15 21:33:53
|
<logmsgbot>
|
!log jhuneidi@deploy2002 Started scap sync-world: Backport for [[gerrit:1227385|ukwiki: Move assignments of FlaggedRevs permissions to flaggedrevs.php (T414277 T414684)]], [[gerrit:1227394|ukwiki: Add "changetags" to sysop user group. (T414277)]]
|
|
2026-01-15 21:33:59
|
<stashbot>
|
T414277: Some changes in user group rights in ukwiki - https://phabricator.wikimedia.org/T414277
|
|
2026-01-15 21:33:59
|
<stashbot>
|
T414684: FlaggedRevs-specific group rights from core-Permissions.php get overridden by flaggedrevs.php - https://phabricator.wikimedia.org/T414684
|
|
2026-01-15 21:35:50
|
<logmsgbot>
|
!log jhuneidi@deploy2002 asmartkitten, seawolf35gerrit, jhuneidi: Backport for [[gerrit:1227385|ukwiki: Move assignments of FlaggedRevs permissions to flaggedrevs.php (T414277 T414684)]], [[gerrit:1227394|ukwiki: Add "changetags" to sysop user group. (T414277)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
|
|
2026-01-15 21:36:09
|
<A_smart_kitten>
|
looking (cc Seawolf35)
|
|
2026-01-15 21:36:12
|
<Seawolf35>
|
Testing
|
|
2026-01-15 21:37:06
|
<A_smart_kitten>
|
my patch looks good AFAICS :]
|
|
2026-01-15 21:37:20
|
<Seawolf35>
|
Mine lgtm
|
|
2026-01-15 21:37:28
|
<jeena>
|
thanks!
|
|
2026-01-15 21:37:36
|
<logmsgbot>
|
!log jhuneidi@deploy2002 asmartkitten, seawolf35gerrit, jhuneidi: Continuing with sync
|
|
2026-01-15 21:39:40
|
<jinxer-wm>
|
FIRING: [14x] SystemdUnitFailed: prometheus-node-textfile-check-nft.service on tcp-proxy1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
|
|
2026-01-15 21:41:39
|
<logmsgbot>
|
!log jhuneidi@deploy2002 Finished scap sync-world: Backport for [[gerrit:1227385|ukwiki: Move assignments of FlaggedRevs permissions to flaggedrevs.php (T414277 T414684)]], [[gerrit:1227394|ukwiki: Add "changetags" to sysop user group. (T414277)]] (duration: 07m 46s)
|
|
2026-01-15 21:41:45
|
<stashbot>
|
T414277: Some changes in user group rights in ukwiki - https://phabricator.wikimedia.org/T414277
|
|
2026-01-15 21:41:45
|
<stashbot>
|
T414684: FlaggedRevs-specific group rights from core-Permissions.php get overridden by flaggedrevs.php - https://phabricator.wikimedia.org/T414684
|
|
2026-01-15 21:42:01
|
<A_smart_kitten>
|
jeena: thank you for deploying!
|
|
2026-01-15 21:42:04
|
<jeena>
|
cjming: ready for you
|
|
2026-01-15 21:42:10
|
<jeena>
|
A_smart_kitten: yw!
|
|
2026-01-15 21:42:10
|
<cjming>
|
tysm!
|
|
2026-01-15 21:42:20
|
<Seawolf35>
|
jeena ty!
|
|
2026-01-15 21:42:27
|
<jeena>
|
yw!
|
|
2026-01-15 21:43:16
|
<wikibugs>
|
('CR) ''TrainBranchBot: [C:''+2] "Approved by cjming@deploy2002 using scap backport" [extensions/TestKitchen] (wmf/1.46.0-wmf.11) - ''https://gerrit.wikimedia.org/r/1227435 (https://phabricator.wikimedia.org/T414528) (owner: ''Clare Ming)'
|
|
2026-01-15 21:43:37
|
<icinga-wm>
|
PROBLEM - Host titan1002 is DOWN: PING CRITICAL - Packet loss = 100%
|
|
2026-01-15 21:43:55
|
<icinga-wm>
|
RECOVERY - Host titan1002 is UP: PING WARNING - Packet loss = 33%, RTA = 1653.67 ms
|
|
2026-01-15 21:44:11
|
<jinxer-wm>
|
FIRING: [2x] ProbeDown: Service titan1002:443 has failed probes (http_thanos_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#titan1002:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
|
|
2026-01-15 21:45:08
|
<jinxer-wm>
|
FIRING: [2x] ProbeDown: Service titan1002:443 has failed probes (http_thanos_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#titan1002:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
|
|
2026-01-15 21:47:17
|
<wikibugs>
|
'SRE, ''Release Pipeline, ''serviceops, ''Release-Engineering-Team (Seen): Kask functional testing with Cassandra via the Deployment Pipeline - https://phabricator.wikimedia.org/T224041#11526839 (''Eevans)'
|
|
2026-01-15 21:49:11
|
<jinxer-wm>
|
RESOLVED: [2x] ProbeDown: Service titan1002:443 has failed probes (http_thanos_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#titan1002:443 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown
|
|
2026-01-15 21:52:45
|
<wikibugs>
|
(Merged) jenkins-bot: Update experiment code for JS, PHP SDKs testing of TK [extensions/TestKitchen] (wmf/1.46.0-wmf.11) - https://gerrit.wikimedia.org/r/1227435 (https://phabricator.wikimedia.org/T414528) (owner: Clare Ming)
|
|
2026-01-15 21:53:06
|
<logmsgbot>
|
!log cjming@deploy2002 Started scap sync-world: Backport for [[gerrit:1227435|Update experiment code for JS, PHP SDKs testing of TK (T414528 T414530)]]
|
|
2026-01-15 21:53:12
|
<wikibugs>
|
SRE-OnFire, Cassandra, MediaWiki-Platform-Team (Radar), Sustainability (Incident Followup): Provision anonymous session storage - https://phabricator.wikimedia.org/T408935#11526882 (Eevans)
|
|
2026-01-15 21:53:13
|
<stashbot>
|
T414528: Run synthetic experiment using Javascript SDK in Test Kitchen - https://phabricator.wikimedia.org/T414528
|
|
2026-01-15 21:53:13
|
<stashbot>
|
T414530: Run synthetic experiment using PHP SDK in Test Kitchen - https://phabricator.wikimedia.org/T414530
|
|
2026-01-15 21:55:07
|
<logmsgbot>
|
!log cjming@deploy2002 cjming: Backport for [[gerrit:1227435|Update experiment code for JS, PHP SDKs testing of TK (T414528 T414530)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
|
|
2026-01-15 21:56:07
|
<cjming>
|
checking
|
|
2026-01-15 21:58:18
|
<cjming>
|
syncing
|
|
2026-01-15 21:58:30
|
<logmsgbot>
|
!log cjming@deploy2002 cjming: Continuing with sync
|
|
2026-01-15 22:00:04
|
<jouncebot>
|
Deploy window Web Team deployment window (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20260115T2200)
|
|
2026-01-15 22:02:29
|
<logmsgbot>
|
!log cjming@deploy2002 Finished scap sync-world: Backport for [[gerrit:1227435|Update experiment code for JS, PHP SDKs testing of TK (T414528 T414530)]] (duration: 09m 23s)
|
|
2026-01-15 22:02:34
|
<stashbot>
|
T414528: Run synthetic experiment using Javascript SDK in Test Kitchen - https://phabricator.wikimedia.org/T414528
|
|
2026-01-15 22:02:35
|
<stashbot>
|
T414530: Run synthetic experiment using PHP SDK in Test Kitchen - https://phabricator.wikimedia.org/T414530
|
|
2026-01-15 22:03:47
|
<wikibugs>
|
(CR) CI reject: [V:-1] logos: Add WP25 temporary logo for Hausa Wikipedia (hawiki) [mediawiki-config] - https://gerrit.wikimedia.org/r/1227443 (https://phabricator.wikimedia.org/T414736) (owner: SarthakSingh2904)
|
|
2026-01-15 22:06:40
|
<jinxer-wm>
|
FIRING: SystemdUnitFailed: dump_proxy_ranges.service on puppetserver1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
|
|
2026-01-15 22:23:45
|
<wikibugs>
|
(PS2) SarthakSingh2904: logos: Add WP25 temporary logo for Hausa Wikipedia (hawiki) [mediawiki-config] - https://gerrit.wikimedia.org/r/1227443 (https://phabricator.wikimedia.org/T414736)
|
|
2026-01-15 22:24:47
|
<wikibugs>
|
(CR) CI reject: [V:-1] logos: Add WP25 temporary logo for Hausa Wikipedia (hawiki) [mediawiki-config] - https://gerrit.wikimedia.org/r/1227443 (https://phabricator.wikimedia.org/T414736) (owner: SarthakSingh2904)
|
|
2026-01-15 22:25:10
|
<wikibugs>
|
(CR) Ryan Kemper: [C:+1] java: create openjdk-21 image [docker-images/production-images] - https://gerrit.wikimedia.org/r/1227376 (https://phabricator.wikimedia.org/T414695) (owner: Bking)
|
|
2026-01-15 22:25:32
|
<wikibugs>
|
(CR) Bking: [C:+2] java: create openjdk-21 image [docker-images/production-images] - https://gerrit.wikimedia.org/r/1227376 (https://phabricator.wikimedia.org/T414695) (owner: Bking)
|
|
2026-01-15 22:25:43
|
<wikibugs>
|
(CR) Bking: [V:+2 C:+2] java: create openjdk-21 image [docker-images/production-images] - https://gerrit.wikimedia.org/r/1227376 (https://phabricator.wikimedia.org/T414695) (owner: Bking)
|
|
2026-01-15 22:32:14
|
<wikibugs>
|
(Abandoned) SarthakSingh2904: logos: Add WP25 temporary logo for Hausa Wikipedia (hawiki) [mediawiki-config] - https://gerrit.wikimedia.org/r/1227443 (https://phabricator.wikimedia.org/T414736) (owner: SarthakSingh2904)
|
|
2026-01-15 22:38:06
|
<jinxer-wm>
|
FIRING: CoreRouterInterfaceDown: Core router interface down - pfw1-codfw:reth1 (Subnet frack-fundraising-codfw in F5) - https://wikitech.wikimedia.org/wiki/Network_monitoring#Router_interface_down - https://grafana.wikimedia.org/d/fb403d62-5f03-434a-9dff-bd02b9fff504/network-device-overview?var-instance=pfw1-codfw:9804 - https://alerts.wikimedia.org/?q=alertname%3DCoreRouterInterfaceDown
|
|
2026-01-15 22:59:07
|
<logmsgbot>
|
!log pt1979@cumin2002 START - Cookbook sre.dns.netbox
|
|
2026-01-15 23:02:37
|
<logmsgbot>
|
!log pt1979@cumin2002 START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new wikikube-worker nodes - pt1979@cumin2002"
|
|
2026-01-15 23:02:43
|
<logmsgbot>
|
!log pt1979@cumin2002 END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new wikikube-worker nodes - pt1979@cumin2002"
|
|
2026-01-15 23:02:43
|
<logmsgbot>
|
!log pt1979@cumin2002 END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
|
|
2026-01-15 23:04:25
|
<jinxer-wm>
|
FIRING: [14x] SystemdUnitFailed: prometheus-node-textfile-check-nft.service on tcp-proxy1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
|
|
2026-01-15 23:04:27
|
<wikibugs>
|
ops-codfw, SRE, DC-Ops, serviceops: Q2:rack/setup/install wikikube-worker2332-56 - https://phabricator.wikimedia.org/T408757#11527136 (Papaul)
|
|
2026-01-15 23:09:25
|
<jinxer-wm>
|
FIRING: [14x] SystemdUnitFailed: prometheus-node-textfile-check-nft.service on tcp-proxy1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
|
|
2026-01-15 23:13:13
|
<wikibugs>
|
ops-codfw, SRE, DC-Ops, serviceops: Q2:rack/setup/install wikikube-worker2332-56 - https://phabricator.wikimedia.org/T408757#11527155 (Papaul)
|
|
2026-01-15 23:13:46
|
<wikibugs>
|
(PS1) Jasmine: wikikube: decommission wikikube-worker[2116-2123,2216-2241].codfw.wmnet [puppet] - https://gerrit.wikimedia.org/r/1227454 (https://phabricator.wikimedia.org/T409104)
|
|
2026-01-15 23:52:01
|
<icinga-wm>
|
PROBLEM - OSPF status on cr2-eqiad is CRITICAL: OSPFv2: 6/7 UP : OSPFv3: 6/7 UP https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
|
|
2026-01-15 23:52:23
|
<icinga-wm>
|
PROBLEM - OSPF status on cr1-drmrs is CRITICAL: OSPFv2: 1/2 UP : OSPFv3: 1/2 UP https://wikitech.wikimedia.org/wiki/Network_monitoring%23OSPF_status
|
|
2026-01-15 23:54:10
|
<jinxer-wm>
|
FIRING: [2x] BFDdown: BFD session down between cr2-eqiad and 185.15.58.139 - https://wikitech.wikimedia.org/wiki/Network_monitoring#BFD_status - https://grafana.wikimedia.org/d/fb403d62-5f03-434a-9dff-bd02b9fff504/network-device-overview?var-instance=cr2-eqiad:9804 - https://alerts.wikimedia.org/?q=alertname%3DBFDdown
|
|
2026-01-15 23:54:39
|
<jinxer-wm>
|
FIRING: [4x] CoreBGPDown: Core BGP session down between cr1-drmrs and cr2-eqiad (185.15.58.138) - group Confed_eqiad - https://wikitech.wikimedia.org/wiki/Network_monitoring#BGP_status - https://alerts.wikimedia.org/?q=alertname%3DCoreBGPDown
|