TDD a CLI Caching Script - Part Three - Help

2020-01-23

This is part three in a series about writing a general-purpose script to cache CLI output. In part two we added support for a TTL option and specifying acceptable status codes to cache.

In this post, we'll add support for --help and improve our option parsing.

`--help`

We're already parsing options in our script, so responding to --help isn't much work.

What should --help do? It should give basic usage instructions and document each option our script can take. We can write a simple test to that end.

@test "documents options with --help" {
  run ./cache --help
  [ "$status" -eq 0 ]
  echo $output | grep -- --ttl
  echo $output | grep -- --cache-status
  echo $output | grep -- --help
}

We're not making any specific claims about what the help documentation tells us about each option. We're only asserting that every option is mentioned (including --help itself).

The double-dash after grep is important in these assertions. In grep -- --ttl, the -- tells grep we're not trying to pass --ttl as an option, but that it should be interpreted as a positional argument (in this case, the thing we want to search for). There's an implication for our cache script here too that we'll address shortly.
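
To make that concrete, here's a minimal sketch. The mentions_option helper and the shortened help text are made up for illustration and aren't part of the cache script; the helper only wraps the same kind of grep call the test uses.

```bash
#!/usr/bin/env bash

# Hypothetical helper for illustration: succeeds if TEXT mentions OPTION.
mentions_option() {
    local text=$1 option=$2
    # -q suppresses output and -F matches the pattern literally.
    # The -- ends grep's own options, so "$option" is read as the pattern
    # even though it starts with dashes.
    printf '%s\n' "$text" | grep -qF -- "$option"
}

# Shortened, made-up help text for the demonstration.
help_text="usage: cache [--ttl SECONDS] [cache-key] [command]"

if mentions_option "$help_text" --ttl; then
    echo "--ttl is documented"
fi

# Without the --, grep would try to parse --ttl as one of its own options
# and complain instead of searching for it:
#   printf '%s\n' "$help_text" | grep --ttl
```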

In the meantime, let's make this test pass by responding to --help.

We'll add a usage function after set -e and before we parse our options:

usage() {
    echo "usage: cache [--ttl SECONDS] [--cache-status CACHEABLE-STATUSES] [cache-key] [command] [args for command]"
    echo "        --ttl SECONDS   # Treat previously cached content as fresh if fewer than SECONDS seconds have passed"
    echo "        --cache-status  # Quoted and space-delimited exit statuses for [command] that are acceptable to cache."
    echo "        --help          # show this help documentation"
}

Within our option parsing, we'll add a case for --help to call usage and exit.

        --help)
            usage
            exit 0
            ;;

Here's what the output of cache --help looks like:

usage: cache [--ttl SECONDS] [--cache-status CACHEABLE-STATUSES] [cache-key] [command] [args for command]
        --ttl SECONDS   # Treat previously cached content as fresh if fewer than SECONDS seconds have passed
        --cache-status  # Quoted and space-delimited exit statuses for [command] that are acceptable to cache.
        --help          # show this help documentation

Not bad.
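
For context, here's a rough sketch of how the --help case sits inside a parsing loop. The loop is a simplified stand-in for the one built in the earlier posts, not the script's exact code. The parse_options wrapper, the trimmed usage text, and the ttl= line it prints are assumptions added only so the sketch runs on its own.

```bash
#!/usr/bin/env bash

# Trimmed usage text; the real function documents every option.
usage() {
    echo "usage: cache [--ttl SECONDS] [--cache-status CACHEABLE-STATUSES] [cache-key] [command] [args for command]"
    echo "        --help          # show this help documentation"
}

# Hypothetical wrapper so the sketch is self-contained; in the real script
# the loop lives at the top level.
parse_options() {
    local ttl=""
    while [ $# -gt 0 ]; do
        case $1 in
            --ttl)
                ttl=$2
                shift # drop --ttl
                shift # drop its value
                ;;
            --help)
                usage
                exit 0
                ;;
            *)
                shift # ignore everything else in this sketch
                ;;
        esac
    done
    echo "ttl=$ttl"
}

# Run in a subshell so that exit 0 ends the subshell, not our session.
(parse_options --help)
echo "exit status: $?"
```

The subshell is only needed because the sketch wraps the loop in a function. In the script itself, exit 0 simply ends the run after printing the help text.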

Better positional argument parsing

Remember that bit about -- and positional arguments in the grep usage? Anything after -- is treated as a positional argument even if it looks like an option. Let's think about why that matters. Imagine we're trying to cache the --help output of a command. That may sound silly, but it'll illustrate our point. Consider cache some-cache-key grep --help. Can you guess what the output will be?

If you guessed the --help content for our cache script, you're right. Our cache script sees --help anywhere in the command and says "Ah, I know how to do that!" That's not ideal.

We should support the -- like we see in this updated command: cache -- some-cache-key grep --help

We'll write a test:

@test "stops parsing arguments after --" {
  run ./cache -- $TEST_KEY grep --help
  [ "$status" -eq 2 ]
  echo $output | grep -- "usage: grep"
}

Note the expectation that the status will be 2. That's the code grep --help exits with on my system (implementations differ; GNU grep, for example, exits with 0). It is a little strange, perhaps, since the command didn't exactly fail. But it didn't exactly succeed either. git --help exits with 0. I'm fine with cache --help exiting with 0, but this is an idiosyncrasy worth knowing about.

This test fails, as every good initial test should. Our $status is 0 and our $output is the help text of our own cache script rather than grep's.

Fortunately our option-parsing code lends itself well to handling --. We'll add the following case:

        --)
            shift
            positional+=("$@")
            break
            ;;

And we're green. And now we're a better citizen of the command line.
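
Here's a small, self-contained sketch of the idea. It assumes the loop collects non-option arguments into a positional array, as ours does at this point. The split_args function and the text it prints are made up for the demonstration.

```bash
#!/usr/bin/env bash

# Hypothetical demo function: prints each positional argument on its own
# line, or a marker line if it decides --help was meant for it.
split_args() {
    local positional=()
    while [ $# -gt 0 ]; do
        case $1 in
            --help)
                echo "cache help requested"
                return 0
                ;;
            --)
                shift                 # drop the -- itself
                positional+=("$@")    # keep everything after it untouched
                break
                ;;
            *)
                positional+=("$1")
                shift
                ;;
        esac
    done
    printf '%s\n' "${positional[@]}"
}

# Without --, the --help meant for grep is claimed by our loop.
split_args some-cache-key grep --help

# With --, all three arguments survive as positional arguments.
split_args -- some-cache-key grep --help
```

The first call prints the marker line. The second prints some-cache-key, grep, and --help, one per line.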

Flexible Parsing and Confusion

Handling the double-dash is the right thing to do. But cache some-cache-key grep --help without the double-dash seems pretty reasonable too. We really only have two types of positional arguments: the cache key, and the command and its arguments.

Passing options before the cache key positional argument like cache --ttl 1 some-cache-key grep --help seems valid. So does passing options after the cache key positional argument like cache some-cache-key --ttl 1 grep --help.

What doesn't make sense is providing options after the command starts. Something like cache some-cache-key grep --ttl 1 --help shouldn't be interpreted as providing a TTL of 1 to our cache script.

We could explicitly stop parsing options after our cache key. We'd essentially be treating this as an implicit --. But is this worth the effort or should we trust the user to do the right thing? That's a classic question, right? You need to weigh the value of protecting against weird input against the complexity the protection adds to the code and the difficulty of implementation. Also there's the nebulous question of how much effort it'll take to support this feature as time marches on.

In this case, I'm learning in my free time, so I'll go the extra mile to see how it pans out.

First let's add a test illustrating how we already parse options before or after the cache key. This test will keep us from accidentally breaking things:

@test "parses options before and after the cache key" {
  # isn't cached because exit status 0 isn't allowed by our options
  run ./cache --cache-status "2" $TEST_KEY exit 0
  [ "$status" -eq 0 ]
  [ ! -f "$TMPDIR$TEST_KEY" ]

  # succeeds because the status is allowed by our option before the
  # cache key
  run ./cache --cache-status "0 2" $TEST_KEY exit 2
  [ "$status" -eq 2 ]
  [ -f "$TMPDIR$TEST_KEY" ]

  rm "$TMPDIR$TEST_KEY"

  # succeeds because the status is allowed by our option after the
  # cache key
  run ./cache $TEST_KEY --cache-status "0 2" exit 2
  [ "$status" -eq 2 ]
  [ -f "$TMPDIR$TEST_KEY" ]
}

That already passes. Now let's add a failing test.

@test "stops parsing options after the command starts" {
  run ./cache $TEST_KEY echo --ttl 1 --help
  [ "$status" -eq 0 ]
  [ "$output" = "--ttl 1 --help" ]
}

This fails because we're still parsing --ttl 1 and --help after the command (echo) starts. That means that $output is the result of cache --help.

We'll update the default branch of our case statement.

        *) # default
            if [ -z "$cache_key" ]; then
                cache_key=$1
                shift
            else
                break
            fi
            ;;

If the $cache_key isn't set, we set it and remove it from the argument list. If it is set, we break because we only want to parse one positional argument (for the cache key) and treat all remaining arguments as the provided command and its args.

Next we'll replace the -- branch of the case statement with

        --)
            cache_key=$2
            shift # drop the --
            shift # drop the cache key
            break
            ;;

We can now remove these lines after the case statement

set -- "${positional[@]}" # restore positional parameters
cache_key=$1
shift

and also remove any other references to the positional variable since we are no longer using it.

 ✓ initial run is uncached
 ✓ works for quoted arguments
 ✓ preserves the status code of the original command
 ✓ subsequent runs are cached
 ✓ respects a TTL
 ✓ only caches 0 exit status by default
 ✓ allows specifying exit statuses to cache
 ✓ allows specifying * to allow caching all statuses
 ✓ returns the cached exit status
 ✓ documents options with --help
 ✓ stops parsing arguments after --
 ✓ parses options before and after the cache key
 ✓ stops parsing options after the command starts

13 tests, 0 failures
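
Putting the pieces together, the parsing rules can be sketched roughly like this. It's a simplified illustration rather than the script's exact code: the parse wrapper is hypothetical, --cache-status is left out for brevity, and instead of running a command it reports what it parsed.

```bash
#!/usr/bin/env bash

# Hypothetical wrapper for illustration. It reports what was parsed
# instead of caching or running anything.
parse() {
    local cache_key="" ttl=""
    while [ $# -gt 0 ]; do
        case $1 in
            --ttl)
                ttl=$2
                shift # drop --ttl
                shift # drop its value
                ;;
            --help)
                echo "cache help"
                return 0
                ;;
            --)
                cache_key=$2
                shift # drop the --
                shift # drop the cache key
                break
                ;;
            *) # default
                if [ -z "$cache_key" ]; then
                    cache_key=$1
                    shift
                else
                    break # the command has started; stop parsing options
                fi
                ;;
        esac
    done
    # Whatever is left in "$@" is the command and its arguments.
    echo "key=$cache_key ttl=$ttl command=$*"
}

parse --ttl 1 some-cache-key grep --help  # key=some-cache-key ttl=1 command=grep --help
parse some-cache-key --ttl 1 grep --help  # key=some-cache-key ttl=1 command=grep --help
parse some-cache-key grep --ttl 1 --help  # key=some-cache-key ttl= command=grep --ttl 1 --help
parse -- some-cache-key grep --help       # key=some-cache-key ttl= command=grep --help
```

Options are honored before or after the cache key, but once the command starts, everything is passed through untouched.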

All this work might have you questioning whether supporting positional arguments is worth it. We could require the cache key to be provided as --key. This is an ergonomics-versus-aesthetics argument that ultimately comes down to opinion. Do whatever you feel is best and document accordingly.

Closing

You might be asking: --help is nice, but what about writing a man page? I haven't taken the plunge here yet, but if I were going to try it, I'd probably lean on (the always wonderful) pandoc. Search for "Man page" on their demos.

Here's the diff for adding --help and -- support and the diff for not parsing options after the command starts.

Stay tuned for our next post where we'll take a quick look at a shell script linter that can help us avoid errors and improve code quality.
