[03:04:40] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/debian synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[03:11:05] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[03:14:25] RESOLVED: MirrorHighLag: Mirrors - /srv/mirrors/debian synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[04:11:05] RESOLVED: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[05:39:53] netops, DC-Ops, Infrastructure-Foundations: Take advantage of 10Gb NICs in the new network stack - https://phabricator.wikimedia.org/T360297#11122271 (ayounsi) Open→Resolved a:ayounsi Closing that task as more specific subtasks exist.
[07:28:24] I'm deploying the Homer patch to add install2005 and it flags the addition of frdata2002, which looking at https://phabricator.wikimedia.org/T400275 seems fine, so I'll merge it along
[08:23:45] sounds good
[09:21:54] Hello, getting an error on a homer codfw diff check `Policy error: Policy kubedse_import referenced but not defined In [edit] (policy-options)` on the devices `cr2-codfw.wikimedia.org` and `cr1-codfw.wikimedia.org`
[09:23:04] The change is to add dse-k8s-codfw-ctrl and dse-k8s-codfw-worker hosts to BGP
[09:27:30] Has anyone encountered a similar error before? cc XioNoX topranks
[09:28:27] stevemunene: let me double check, perhaps we missed some of the templates needed for that group
[09:29:02] Ack, thanks topranks
[09:33:58] topranks: yeah, there is a small misconfig
[09:34:09] https://github.com/wikimedia/operations-homer-public/blob/master/config/sites.yaml#L11C6-L11C29 should be under codfw
[09:34:29] https://gerrit.wikimedia.org/r/c/operations/homer/public/+/1182500
[09:34:32] yeah
[09:34:57] +1
[09:35:45] thanks
[09:46:09] stevemunene: can you try again?
[09:46:27] you need to run "sudo run-puppet-agent" on the cumin host you're using first
[09:47:11] Looks good, thanks for the fix topranks XioNoX
[10:19:25] FIRING: SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:55:48] FIRING: PuppetZeroResources: Puppet has failed generate resources on puppetmaster1003:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[11:00:49] FIRING: [2x] PuppetZeroResources: Puppet has failed generate resources on puppetmaster1003:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[11:05:48] FIRING: [3x] PuppetZeroResources: Puppet has failed generate resources on puppetmaster1003:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[11:20:48] FIRING: [4x] PuppetZeroResources: Puppet has failed generate resources on puppetmaster1001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[11:30:49] FIRING: [4x] PuppetZeroResources: Puppet has failed generate resources on puppetmaster1001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[11:40:48] FIRING: [4x] PuppetZeroResources: Puppet has failed generate resources on puppetmaster1001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[11:45:48] RESOLVED: [4x] PuppetZeroResources: Puppet has failed generate resources on puppetmaster1001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[12:39:25] FIRING: [2x] SystemdUnitFailed: wmf_auto_restart_atftpd.service on install2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:20:21] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Management routers: use BGP instead of OSPF - https://phabricator.wikimedia.org/T294845#11123850 (Papaul) Diff on mr1-ulsfo ` + bgp { + group Production { + type external; + import BGP_Default; + expo...
[13:55:38] netops, Infrastructure-Foundations, SRE: Investigate using BGP addpath for unicast IBGP spine/leaf pods - https://phabricator.wikimedia.org/T402640#11124076 (cmooney) >>! In T402640#11121128, @ayounsi wrote: > If I understand correctly we currently get some "per rack" load balancing, where `E3` might...
[15:00:49] FIRING: PuppetConstantChange: Puppet performing a change on every puppet run on cumin1002:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange