Unit testing & Devops

I gave a small talk about this a few years ago at the LA Devops meetup. The slides are here.

I should start with some background on why I did this talk in the first place. Unit testing (along with integration and end-to-end testing) is familiar to even a junior developer. It’s definitely an extensive and hotly debated topic: everything from TDD to the quality of tests to the significance of coverage to prioritizing integration over unit tests, and so on. These are interesting subjects, but they are for a different post.

Unfortunately, unit testing is often a foreign concept in many devops/syseng/ops teams. The way I look at systems engineering or devops (I hate the term when used as a job description, though I think I lost that battle) is that in a modern organization they are fundamentally software development teams staffed with software engineers. It just so happens that their domain of expertise is systems/observability/scalability, etc., and their product is often the tooling or the glue or the platform that makes all of this happen.

But if your job is primarily writing software and you take an infrastructure-as-code approach, then tests are absolutely mandatory. My goal with this talk was to give a quick intro to testing and lay out a few options that are available to folks in this world. A lot of the details are in the slides, which I am not going to rehash here, but some of the things that can be useful:

  • moto – an excellent library for mocking out boto and a large number of AWS services. Python scripts and tools are fairly common in ops, and the boto library is quite good. I often use moto for testing in combination with unittest and others (see the sketch below).
  • bats, roundup, and shunit2 – bash scripts can also use some love and testing.
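
As a quick illustration, here is a minimal sketch of what a moto-backed unit test can look like. The upload_config helper is hypothetical, invented for this example, and decorator names have shifted across moto versions; the idea is that inside the mock, boto3 calls never leave the process:

  import unittest

  import boto3
  from moto import mock_s3


  # Hypothetical helper under test: pushes a config blob to S3.
  def upload_config(bucket, key, body):
      s3 = boto3.client("s3", region_name="us-east-1")
      s3.put_object(Bucket=bucket, Key=key, Body=body)


  class TestUploadConfig(unittest.TestCase):
      @mock_s3
      def test_upload_writes_object(self):
          # Inside @mock_s3, boto3 talks to moto's in-memory backend,
          # so no real AWS account or credentials are needed.
          s3 = boto3.client("s3", region_name="us-east-1")
          s3.create_bucket(Bucket="configs")
          upload_config("configs", "app.conf", b"debug=false")
          obj = s3.get_object(Bucket="configs", Key="app.conf")
          self.assertEqual(obj["Body"].read(), b"debug=false")


  if __name__ == "__main__":
      unittest.main()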
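
And on the bash side, a bats test file is just an annotated shell script. This one assumes a hypothetical ./deploy.sh that should exit non-zero when called without an environment argument:

  #!/usr/bin/env bats

  @test "deploy fails without an environment argument" {
    run ./deploy.sh
    [ "$status" -ne 0 ]
  }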

For configuration management tools there are plenty of testing frameworks available as well. My teams tended to settle on Ansible, so I am more intimately familiar with it, but a lot of these ideas apply across the board. The tricky part is deciding where to draw the line and what can be reasonably tested. A few general things that are typically important to test:

  • Complex conditionals – sometimes these are unavoidable, and they often come paired with fairly complex data structures that are being evaluated. You want to make sure you check your logic.
  • Logic in templating engines (Jinja2 in Ansible) – anything very complicated there is often a code smell, but there are exceptions to every rule (see the sketch after this list).
  • Variable inheritance – in Ansible you can define variables in roughly two dozen places, and the precedence is sometimes unclear. When the same variable is set in multiple places, you need to test that the right value wins.
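
One lightweight way to check template logic is to render snippets directly with the jinja2 library in a plain Python test, outside of Ansible entirely. Note that Ansible layers its own filters and plugins on top of stock Jinja2, so this only covers pure-Jinja2 constructs; the template and data below are made up for illustration:

  from jinja2 import Environment


  def render(template_source, **context):
      # Render a template string with stock Jinja2 (no Ansible extras).
      return Environment().from_string(template_source).render(**context)


  def test_only_active_hosts_are_rendered():
      template = "{% for h in hosts if h.active %}{{ h.name }};{% endfor %}"
      output = render(
          template,
          hosts=[
              {"name": "web1", "active": True},
              {"name": "web2", "active": False},
          ],
      )
      assert output == "web1;"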

More specifically, with Ansible you can use a combination of Molecule (to orchestrate test runs) and goss (to validate the converged system), which is what the workflow below is built on.

As a reference example, this is a workflow that a few of my teams have used in the past. It’s an Ansible example, but other config management systems will be reasonably similar. A typical role folder/file structure will look like this:

r-role_name
  .bumpversion.cfg
  molecule.yml
  playbook.yml
  defaults/
    main.yml
  files/
    goss/
      test.yml
  handlers/
    main.yml
  tasks/
    main.yml
  templates/
    main.yml
  tests/
    test.yml
  vars/
    main.yml

The majority of the directory structure is the same as what ansible-galaxy init generates. The rest is:

  • .bumpversion.cfg – used by the build system to auto-version roles.
  • molecule.yml – holds the Molecule configuration. This is where we’d specify the docker container, test sequences, and the tool used for testing, among other things (a sketch follows below).
  • files/goss/test.yml – a set of tests for this particular role. If using goss, it will generally include some combination of goss checks and/or additional custom testing code.
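
Molecule’s file layout and schema have changed across major versions (a root-level molecule.yml reflects an older release; newer ones keep it under molecule/<scenario>/), so treat this as an illustrative sketch of the moving parts rather than a copy-paste config:

  ---
  driver:
    name: docker
  platforms:
    - name: instance
      image: ubuntu:18.04
  provisioner:
    name: ansible
  verifier:
    name: goss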
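
The goss file itself is a declarative list of assertions about the converged system. A small example for a role that installs nginx (a made-up case):

  # files/goss/test.yml
  service:
    nginx:
      enabled: true
      running: true
  port:
    tcp:80:
      listening: true
  file:
    /etc/nginx/nginx.conf:
      exists: true
      owner: root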

When an engineer commits changes to the role, the build system triggers a ‘build’, executes the tests defined for it, and increments the version if they pass. This is especially helpful with complex roles that handle many scenarios: someone might make a small modification whose overall impact is unclear, and assuming the tests are well written and cover the main use cases, they will help the engineer catch the error early.
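
Boiled down to commands, the per-commit pipeline is roughly the following (exact invocations will depend on your CI system and tool versions; bumpversion reads the .bumpversion.cfg shown in the tree above):

  # Run the full Molecule sequence: create the container, converge
  # the role, run the verifier, then destroy the container.
  molecule test

  # On success, bump the role's version per .bumpversion.cfg
  # (and tag, if configured to), then push.
  bumpversion patch
  git push && git push --tags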

No organization is going to be perfect and have 100% test coverage that captures every corner case. But adopting some structure and process, and setting aspirational goals, will go a long way, especially in larger teams. What’s not acceptable is ignoring testing altogether or saying that it can’t be done.

Review: Devops in Practice

I read DevOps in Practice from O’Reilly. It’s more of a booklet, but I am not sure how they categorize it. Either way, it’s not a bad read. It’s authored by J. Paul Reed and covers devops experiences at a couple of organizations, namely Nordstrom and Texas.gov.

This is a topic for a different (and much longer) post, but devops in general tends to lack specific guidelines, and I think that hinders adoption at larger companies. In the enterprise space, people often like to have formal methodologies (ITIL and Agile/Scrum come to mind), even though following the process without understanding the underlying philosophy isn’t going to get you very far. But I digress.

What I liked about DevOps in Practice is that it highlights the cultural challenges of a devops transformation. In my mind that’s the first and foremost component, and it’s often circumvented by the “let’s create a devops/tools team” approach.

The first company discussed in the book is Nordstrom. It seems like they had all the usual suspects in place: separate dev and ops, monolithic apps (it’s arguable whether that’s incompatible with devops), lots of one-off scripts, and so on. As they needed to deliver faster, they started figuring out that the “throw it over the wall” mentality wasn’t cutting it anymore.

The solution, or rather the path to a solution, revolved around embedding people into teams, optimizing for speed-to-value, and figuring out their value stream. They also discovered that you need to “optimize the whole”, a lean concept that applies to ops/devops/infrastructure as well.

At the end, it briefly talks about wrapping services into APIs and the organic spread of their model. The second part of the booklet is briefer, but touches on integrating security practices with a devops model at Texas.gov.

All in all, it’s a good high-level summary of the approaches taken at two different organizations. The details are very sparse, and I would have liked to see more about what failed and why in their earlier attempts (outside of the one example given). It would also have been great to read more about the techniques they used to propagate the culture. Still, it’s a good introduction and could be a valuable example/reference if you’re trying to sell the mythical devops in your own organization.

Do Things that Don’t Scale

I was reading the latest Paul Graham essay, “Do Things That Don’t Scale”, and it got me thinking about devops, infrastructure automation, and why “tools” teams fail. I am sure I am suffering from a bit of confirmation bias here, but the essay can easily be applied to a team or a culture within a company. If you see yourself as a startup, you have to think about who the users/customers/stakeholders of your product are going to be and what you need to do to delight them. That’s a path to success.

The quote that jumped out to me the most was this:

If you can find someone with a problem that needs solving and you can solve it manually, go ahead and do that for as long as you can, and then gradually automate the bottlenecks. It would be a little frightening to be solving users’ problems in a way that wasn’t yet automatic, but less frightening than the far more common case of having something automatic that doesn’t yet solve anyone’s problems.

I think that ties right back into the devops philosophy of people over process over tools, and why a simple “automate all the things” approach is sometimes not enough.