
2020-12-11: IAC Team Status Update

This week has been wild. Lots of community contributions, lots of releases, DSC updates, removing inappropriate language, Cloud CI progress, PDK 2.0, and more! Check out below for the details.

Community Contributions

We’d like to thank the following people in the Puppet Community for their contributions over this past week:

On Community Monday we processed 17 PRs (5 on tooling, 12 on modules) and released 6 modules, despite early vacations starting to eat into our availability. Next week we’ll see if we can go beyond PRs into the issues and tickets and start processing some of that backlog. Please swing by our office hours on Monday on Slack if you want to contact us directly about anything module related.

New Module / Gem Releases - Puppet 7 support

Our supported modules are now officially compatible with Puppet 7. We’re committed to rolling out the module releases as fast as we can, incorporating as many valuable changes as possible from the Community into the releases too. Some modules pushed to the Forge so far are:

Tales from the Intern

Disha’s week 15 was very busy too. Find out more on her blog.

DSC

This was a busy week for DSC! On Monday, we rebuilt everything on the Forge (incrementing the last digit of all released versions by 1) to take advantage of some new improvements to the type generation (more on that below!). We have also started the process of backporting all prior-released versions of the PowerShell modules with DSC resources already found on the Forge. Finally, we pushed out a quick bugfix to the base provider in puppetlabs-pwshlib, so be sure to update your pins to 0.6.2!

Changes to Puppetized DSC Resource Types

If you’d like to dig into these changes more deeply, you can check out our latest changelog entry. In short:

  • Cleaned up some generated documentation
  • Added handling for non-retrievable DSC Resource properties as parameters (no more flapping on specifying credentials or Force, etc)
  • Added handling for read-only properties so they show up in run reports
  • Collapsed the dsc_ensure and ensure keywords so you only need to specify dsc_ensure and then only if the DSC Resource is actually ensurable (fixing flapping and ensuring resources as absent!)
  • Fixed bug in type mapping for nested CIM Instances

Inappropriate Terminology

With everything else going on, we do not want to forget our effort to remove exclusionary and inappropriate terminology from our code base. Here’s a snapshot of our current progress:

  • 31 modules updated
  • 4 in progress
  • 15 remaining

That’s 62% completed.

Content Cloud CI Project

The project really has been turning a corner. This week we onboarded a number of modules with no issues at all, including the first module from another team.

As part of our Honeycomb investment, I’ve published an enhancement to their buildevents tooling that will allow us to collect the entire stack of a nightly acceptance test run in one place. This goes from the GitHub Actions job through the rake task, through the process rake starts, through the webservice call, through our provisioning backend, to the call to terraform apply. It also includes our rspec tests running in that job, through litmus and bolt into the newly provisioned machine.

Here’s a screenshot of how it looks:

Screenshot of a Honeycomb waterfall trace diagram showing the flow of calls from GitHub into our backend service and beyond.

Sidenote: Thanks to Heston for more dynamic build-matrix calculations: the development version of litmus can now also calculate from metadata which puppet collections are necessary for testing.
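To make the idea concrete, here is a rough sketch of how a list of puppet collections could be derived from a module's metadata.json. This is not the litmus implementation (litmus is written in Ruby). The function names, the `KNOWN_MAJORS` list, and the restriction to requirements of the form `>= X < Y` are all assumptions made for illustration.

```python
import re

# Assumption: the Puppet major versions we would consider testing against.
KNOWN_MAJORS = (5, 6, 7)


def parse_version(text):
    """Turn '6.0.0' or '6' into a three-part tuple such as (6, 0, 0)."""
    parts = [int(p) for p in text.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def puppet_collections(metadata):
    """Return collection names whose major version overlaps the puppet requirement.

    Only the common '>= X < Y' form is understood; other operators are ignored.
    """
    requirement = next(
        (r["version_requirement"] for r in metadata.get("requirements", [])
         if r["name"] == "puppet"),
        None,
    )
    if requirement is None:
        return []

    lower = re.search(r">=\s*(\d+(?:\.\d+)*)", requirement)
    upper = re.search(r"<\s*(\d+(?:\.\d+)*)", requirement)
    lower_v = parse_version(lower.group(1)) if lower else (0, 0, 0)
    upper_v = parse_version(upper.group(1)) if upper else None

    collections = []
    for major in KNOWN_MAJORS:
        # The major series spans [major.0.0, (major + 1).0.0).
        starts_before_series_ends = lower_v < (major + 1, 0, 0)
        ends_after_series_starts = upper_v is None or upper_v > (major, 0, 0)
        if starts_before_series_ends and ends_after_series_starts:
            collections.append(f"puppet{major}")
    return collections


example = {"requirements": [{"name": "puppet", "version_requirement": ">= 6.0.0 < 8.0.0"}]}
print(puppet_collections(example))  # ['puppet6', 'puppet7']
```

The overlap check means a module only needs to support some release in a major series for that collection to be included in the matrix.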

Porting Progress

As of earlier this morning we have 30 of 47 modules ported over to the new CI. Through the community scripts we have a public report on the nightly CI runs. While last night was especially rough with some timeouts and other issues across multiple modules, on a cell-by-cell basis 435 of 447 cells succeeded (97.3%), with an all-week average of 2265/2361 (95.9%). For everyone except the few people who have access to the Honeycomb stats, here’s a screenshot of the top-level success/failure chart:

Screenshot of a stacked area graph showing job successes and failures over the last 24 hours. There is a block of tests running from midnight through half three with an arbitrary value highlighted showing 21 cells succeeding and 2 failing for this particular 5 minute span.

Once the aforementioned improvements to the tooling get merged and released, we can drill into the failing cells from this view and get all the details on what went wrong.
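For anyone who wants to double-check the success rates quoted above, the arithmetic is a plain ratio of successful cells to total cells. The helper name `success_rate` is just for this example.

```python
def success_rate(succeeded, total):
    """Return the share of successful cells as a percentage with one decimal."""
    return round(100 * succeeded / total, 1)


nightly = success_rate(435, 447)    # last night's run
weekly = success_rate(2265, 2361)   # all-week average

print(f"last night: {nightly}%")    # last night: 97.3%
print(f"week average: {weekly}%")   # week average: 95.9%
```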

Currently open issues:

  • several timeout issues in communication indicate a need for some level of retrying to improve robustness. We’ll have to investigate where that is most appropriate.
  • apt: false negative as we don’t need cloud-ci for apt and GitHub is not amused about the empty matrix
  • concat: bolt windows exit code issue
  • java_ks: transient failure with keytool availability
  • kubernetes: will need more work to get a full cluster working in cloud-ci
  • reboot, tagmail: provisioning failed on one machine each
  • our result tracking is not dealing well yet with re-runs (kvrhdn/gha-buildevents#16)
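On the timeout item in the list above: we haven't decided where retries belong yet, but the general shape would be something like the sketch below. It is a generic retry-with-exponential-backoff helper, not code from our tooling. The function name, the defaults, and the choice of `TimeoutError` as the retryable error are assumptions for illustration.

```python
import time


def call_with_retries(operation, attempts=3, base_delay=1.0,
                      retry_on=(TimeoutError,), sleep=time.sleep):
    """Run operation(), retrying on the given exceptions with exponential backoff.

    The delay doubles after each failed attempt: base_delay, 2 * base_delay, ...
    The last failure is re-raised so callers still see the real error.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


# Demo with a hypothetical operation that times out twice, then succeeds.
# The sleep function is swapped for a recorder so the demo runs instantly.
calls = []
recorded_delays = []


def flaky_provision():
    calls.append(1)
    if len(calls) < 3:
        raise TimeoutError("backend did not answer in time")
    return "provisioned"


result = call_with_retries(flaky_provision, sleep=recorded_delays.append)
print(result, recorded_delays)  # provisioned [1.0, 2.0]
```

Passing the sleep function in as a parameter keeps the helper easy to test, and limiting `retry_on` to timeouts avoids masking real failures.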

In general we are definitely benefitting (in the loosest sense) from the GCP virtual machines being close to real deployments in their setup and runtime behaviour. Through this we are now detecting edge cases and robustness issues with our puppet code that we can fix once and for all in the modules or the core puppet code, leading to a better experience for all.

Preparing for PDK 2.0

While it’s gonna be a small-ish release, PDK 2.0 is in the works, dropping the Ruby 2.1 runtime and with it Puppet 4. We’re gonna take that opportunity and do some early spring cleaning on some components we’ve been dragging along. The first thing to land for that is a fresh 1.0 release of the puppet-module-gems with Ruby 2.1 and Ruby 2.3 dropped, and a current rubocop. This also requires an update to the pdk-templates to provide the new defaults. While we try to sequence this work in a way that minimizes impact, please be aware that this might impact you over the next couple of days if you’re tracking the main branch of pdk-templates.

“Other” changes