In defense of rabbit holes

There’s something about struggling with a problem that really fixes it in the brain.

Photo by Gary Bendig / Unsplash

I spent a week recently in a hole. A rabbit hole. A big rabbit hole. A yak hole, if you will.

I started the week with one problem in mind: to write an application in Elixir. In order to solve that problem, I had to solve another problem. And another. And another. Until I was far, far away from the original thing I was trying to do in the first place — which I probably could have accomplished by now, if I’d found a way around the problem that triggered the first detour, instead of trying to solve it.


I’m Nat Bennett, a writer, occasional software consultant, and Elixir learner. You’re reading Simpler Machines, your weekly yak-and-rabbit report.

(No rabbits were harmed in the creation of this newsletter. I've just had rabbits on the brain lately, since I learned how to pet them properly.)


I’ve included the full story of how I got into this particular rabbit hole at the end of this essay, but the short version is: I started trying to write a web server that accepts requests signed with EdDSA keys, and I ended up digging deep into how Elixir’s Mix tool starts and configures processes. Along the way I learned how to test Mix tasks, found out about Agents, implemented an asynchronous stack, set up several projects with Elixir’s OpenTelemetry modules, and learned how to test traces.

I don’t normally end up in rabbit holes like this, long adventures into obscure problems. Normally, I would have gotten to the first problem — some difficulty understanding the status of a long-running Mix task — gone, “huh, that’s too bad,” and just waited. Or maybe I would have copied the code in question directly into the task so that I could add some output to it directly. Something fast, that got the job done, at the cost of a little elegance.

Sometimes this is because I’m just being practical. Sometimes it’s because I’m pairing. I don’t want to waste my pair’s time. Or they— wisely— point out that we don’t really have to solve whatever problem it is, once it starts to get hard. As a pair, we can use our cleverness to avoid rabbit holes, not to catch rabbits.


Some things that I have learned accidentally

  • The “forming, storming, norming, and performing” model (from reading dungeon master advice)
  • How to network by participating in communities online (writing a roleplaying game blog)
  • How to type at 90+ words per minute (writing novels)
  • What a “PATH” is (trying to run a Ruby script on a Windows machine)
  • How to write Ruby scripts that test webpages (looking for performance testing tools)
  • The basics of SQL injection (writing a script to check the results of a large set of API requests, and instead discovering a bunch of 500 errors)
  • What a “native extension” is, and how to deal with a gem that won’t compile because of one (reading the makefile generation script for a fork of a fork of Postgres)
  • How to write Golang at a somewhat reasonable pace (taking a break, and spending about six months reading and writing Java)
  • The basics of OAuth (trying to write a web application framework based on data stored in Google Sheets)
  • How Gunicorn's process forking works (while trying to instrument a Python application with Honeycomb)

One of the reasons people pair is to “avoid rabbit holes.”

Pairs are less likely to wander off task, get caught up in personal curiosity, or spend hours or days pursuing unproductive work. Your pair catches you, and you catch your pair, before you end up down a hole, in the dark, covered in fur.

Photo by Степан Галагаев / Unsplash

I worked once on a templating system that for complicated human reasons had to be written in Bash. It had complicated if/then logic, producing a variety of different results based on a set of parameters. It was the kind of problem that wants to be test-driven, the same way that marble from Carrara wants to be a statue. Given this, then that.

But the first version we wrote had no tests. Because: Bash. And it was a mess. Programs that are fundamentally made out of if statements like that almost have to be refactored. There’s almost always a clean, simple internal structure, but it’s never obvious at first. The way you find it is to define the behavior, as a set of input/output cases, build a mass of branching logic, and then carve the simple structure out of it. The tests let you experiment, ensuring that while you change the structure, you keep the behavior intact.

Without tests, that first version was stuck as a mass of if statements. We couldn’t easily add new behavior to it, because we didn’t understand the existing behavior. Any change we made risked breaking something else.

A pair of new engineers joined the team, also experienced with test-driven development, with the way. And they were horrified. How, they asked, had we possibly written something like this without tests?

They insisted that it be rewritten. Properly. With tests. And we protested, because it was Bash. How could we possibly test it?

With RSpec, they said.

They wrote a test rig, and demonstrated.

The program, remember, was simple: A set of arguments in. A data structure out. As a Bash process, really, even simpler: A string in. A string out.

Write the entire script like a Python or Go program, with a main function that calls out to other functions. Then source the script to load those functions into your environment, and call them individually.

Provide inputs. Check outputs.

The rig looked like this:

require 'open3'

# capture3 runs the command once and returns [stdout, stderr, status].
let(:std_out) { command_results[0] }
let(:std_error) { command_results[1] }
let(:result) { command_results[2] }

let(:command_results) do
  Open3.capture3(command)
end

The tests looked like this:

context 'when it does not get 2 args' do
  let(:command) { ". ./tools/prepare-deployments && validate_infrastructure ''" }

  it 'prints an error' do
    expect(result).to_not be_success
    expect(std_error).to include "ERROR: infrastructure should not be empty"
  end
end

People say something funny about pairing sometimes: That it's not good for exploring. That it’s good when you’re executing, when you’re building, once you know what you’re building, but that working in a pair holds back research.

The reason this is funny is that people also say that pairing is good for learning. People who don’t otherwise value pairing will admit this. “Pairing is good for onboarding, and for juniors,” they’ll say, “but it’s a waste of time if you’re not transferring knowledge.” Others— wiser, I think— will credit it with most of the learning they’ve done in their careers.

I’ve talked to a lot of people about why they pair, and why they find value in pairing, and I’ve noticed that sometimes people will hold these two ideas together, without noticing any contradiction. Good for learning. But not for exploring.


Was figuring out how to write Bash tests in RSpec a rabbit hole?

If we’d only had a general idea that it was possible, and had to spend time figuring out how to implement it, probably. If we hadn’t found, or known about, Ruby’s standard open3 module, and had tried to write its capabilities ourselves. If we hadn’t had someone with a clear picture of the Unix process model, a clear idea of how to structure the Bash script, invoke it in Ruby code, and capture its output.

A pair that hadn’t known these things might have had the thought that they could, probably, write tests. And they might have tried to find a way. And they might have spent a few hours trying it, encountering and solving sub-problems, and then decided, no, this is too hard, too unclear, and abandoned that rabbit to its business.

Photo by Hassan Pasha / Unsplash

I remember the exact moment that I learned what an array was — really learned it, well enough to write code that looped over one. They seem simple now, but when I first encountered the concept — at age 12, working through a BASIC tutorial — they were so baffling that I abandoned programming for ten years.

When I returned, to Ruby this time, arrays still made no sense. They were too abstract for the concept to fix in my mind, too slippery. I couldn’t see them, couldn’t see how [1,2,3] became anything other than the literal symbols, couldn’t see how the [] transformed its contents, couldn’t see how the , separated them.

Something about Ruby, and about .each, provided the tiniest of mental toe-holds. I could see, dimly, how .each do was saying “do this, for each item.” And through that, could start to see an Array as a list of distinct items, with an order. I couldn’t write code that worked, but I began to grasp the concept.

So I spent a weekend writing code. I had an array of arrays, and I wanted to make a particular HTTP call with each of the values of each of the interior arrays. It took a whole weekend, something like 14 hours altogether, a weekend of fiddling with irb and deciphering error messages, but I did eventually write the script. And some indescribable something snapped into place inside my mind, and I could see Arrays. They were easy. They were simple.

There’s something about struggling with a problem that really fixes it in the brain.


I spent a long time trying to explain Honeycomb to people.

That we should get this or that bit of data into it. That the code would be easy.

It never worked. “Aren’t metrics fine?” “This is useless if it doesn’t let us see the order the logs came in.” “Sure, maybe this is valuable in theory, but it sounds like an awful lot of work. We have more important things to do right now.”

I could see it. I could see that the code to get data in would be easy to write. But I couldn’t actually write it. I could see how it would be straightforward, once I learned to manipulate the data structures, parse the logs, and study the framework well enough to plug into it.

But I couldn’t do it. I just wasn’t a strong enough programmer. They were simple problems, but I hadn’t solved them before.

And so I couldn’t explain it. So I couldn’t get the pair, the team, to help me.

I eventually gave up explaining. Instead, I spent a week or so outside of work taking existing JSON-formatted logs, feeding them into a dataset, and looking for interesting queries. I discovered that over 80% of the logs in that data set were just one small obscure component saying, “hi, I’m here,” over and over again. I showed that team. They fixed the problem.

I integrated basic instrumentation into a Rails codebase, as part of a piece of performance testing work. The instrumentation wasn’t strictly necessary to collect the data I needed, but I knew: It would mostly be easy, the parts that weren’t easy would be worth learning, and once I had the instrumentation in I would learn something valuable. And I did: The instrumentation made it simple to see exactly which components were responsible for errors, which outbound calls within the calls I was testing had failed.

Now it takes me an hour or so to instrument a new application with Honeycomb. Maybe a few hours, if I have to learn a new language while I'm doing it.


So what is a rabbit?

A rabbit is something that you learned accidentally, while you were trying to do something else, or while you were engrossed in play. Pure curiosity.

Sometimes rabbits are useful. Sometimes they’re not. Usually, they’re distractions.

But I think I’m a better programmer since I started going into rabbit holes, and coming out with rabbits.

Photo by Aswathy N / Unsplash

The full story of the Elixir problem

I was working on an implementation of the Spring83 spec in Elixir. For testing, I needed to be able to generate EdDSA keys with particular properties. Generating such a key takes up to twenty minutes: Many keys must be generated and checked. As I was waiting for my Mix task to return, I got antsy, staring at an unmoving cursor.

I decided to have the Mix task emit little .s while it was working. The way that I’d written the generator function, though, was with an asynchronous pipe. It starts up a collection of processes to generate and check keys, and they don’t communicate back to the main process until they find one. So I started learning about inter-process communication in Elixir, and eventually concluded that what I probably needed was an Agent that could hold a stack of messages. The generators would notify the Agent when they checked a key, and then the Mix task, with access to CLI output, would read messages out of the Agent as they came in.
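
Here’s a minimal sketch of that shape. The names are mine, not the real project’s, and it’s only the skeleton of the idea: generators push onto a list the Agent holds as a stack, and the Mix task drains it whenever it wants to print.

defmodule KeygenProgress do
  use Agent

  # The Agent's state is a plain list, used as a stack.
  def start_link(_opts) do
    Agent.start_link(fn -> [] end, name: __MODULE__)
  end

  # Generator processes call this after each key they check.
  def push(message) do
    Agent.update(__MODULE__, fn stack -> [message | stack] end)
  end

  # The Mix task calls this in a loop, printing a dot per message,
  # then sleeping briefly before polling again.
  def drain do
    Agent.get_and_update(__MODULE__, fn stack -> {Enum.reverse(stack), []} end)
  end
end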

I got this working and then discovered that the tests — unsurprising for code that deals heavily with concurrent operations that don’t necessarily have a consistent order — weren’t deterministic. Sometimes they passed. Sometimes one set failed. And sometimes another set failed.

I realized shortly afterwards that what was happening was that the tests were all sharing the same Agent, so sometimes they interfered with messages that other tests expected the Agent to contain. By then, though, it was too late. I had already started down the last rabbit hole. The ultimate rabbit hole.
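
For what it’s worth, ExUnit has a standard fix for that kind of sharing: give each test its own Agent with start_supervised!, which also stops it when the test exits. A sketch, assuming the Agent doesn’t need a globally registered name:

setup do
  # Each test gets a private Agent; ExUnit stops it when the test exits.
  stack = start_supervised!({Agent, fn -> [] end})
  %{stack: stack}
end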

I had started writing a module that would emit test results as trace spans, primarily to Honeycomb. (An ExUnit.Formatter, for the Elixir fans out there.) To do that, I had to figure out how to capture spans and inspect them in tests, how to properly configure the telemetry exporter after several false starts, and then how to keep the test span capture from interfering with other spans, so I could run the module on its own tests.
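
Here’s roughly what the skeleton of such a formatter looks like, as far as I understand the ExUnit and OpenTelemetry APIs. One caveat: with_span stamps the span with the formatter’s own clock, not the test’s actual start and end times, so treat this as a sketch of the shape, not the finished thing.

defmodule SpanFormatter do
  # An ExUnit formatter is a GenServer; ExUnit delivers events as casts.
  use GenServer
  require OpenTelemetry.Tracer, as: Tracer

  @impl true
  def init(_opts), do: {:ok, nil}

  @impl true
  def handle_cast({:test_finished, %ExUnit.Test{} = test}, state) do
    Tracer.with_span "#{inspect(test.module)} #{test.name}" do
      Tracer.set_attributes(%{
        "test.result" => result_of(test),
        "test.duration_us" => test.time
      })
    end

    {:noreply, state}
  end

  def handle_cast(_event, state), do: {:noreply, state}

  defp result_of(%ExUnit.Test{state: nil}), do: "passed"
  defp result_of(%ExUnit.Test{state: {kind, _}}), do: to_string(kind)
end

It gets registered in test_helper.exs with ExUnit.start(formatters: [ExUnit.CLIFormatter, SpanFormatter]).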

(In case you are also learning Elixir and trying to wrap your head around how Elixir and Mix use the OTP model, the most illuminating functions were in Application. Application.started_applications() and Application.get_all_env(app) began to unravel the mysteries. Realizing that, by default, mix only loads config/config.exs was also key.)
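
For example, in an iex -S mix session (the return values below are invented, just to show the shape):

# Which OTP applications are actually running, with their versions:
Application.started_applications()
#=> [{:my_app, ~c"my_app", ~c"0.1.0"}, {:opentelemetry, ...}, ...]

# The full application environment for a single app:
Application.get_all_env(:opentelemetry)
#=> [traces_exporter: {:otel_exporter_stdout, []}, ...]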