(chibi shell)

Process Combinators

Running a command in a subprocess basically amounts to fork followed by exec. What becomes interesting is combining multiple commands: conditionally on their exit codes, by connecting their inputs and outputs, or both. More generally, a variety of parameters or resources of the subprocess may be configured before the command is executed, including:

fileno configuration
environment variables
signal masks
running user
process groups
resource limits (CPU, memory, disk I/O, network)
prioritization
namespace isolation
virtual filesystems

Some of these can be specified by posix_spawn(3), but the more general features come from cgroups.

We can build process combinators by abstracting this configuration from the execution. The most basic case is a single command:

(shell-command (list <command> <args> ...))

This returns a procedure of two arguments, both thunks to run in the child process after the fork but before exec (one for input and one for output). For example,

((shell-command '("ls")) (lambda () #t) (lambda () #t))

would run the ls command in a subprocess with no changes from the parent process, i.e. it would write to the parent process' stdout.
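Since the no-op thunks recur in every example that leaves one side untouched, it can help to name them. The following is a minimal sketch; no-op and run-unmodified are hypothetical helpers introduced here for illustration and are not part of (chibi shell).

```scheme
(import (scheme base) (chibi shell))

;; Hypothetical helper: a thunk that leaves the child process unchanged.
(define (no-op) #t)

;; Hypothetical helper: fork and exec the command list with the parent's
;; stdin and stdout, i.e. with no-op thunks on both sides.
(define (run-unmodified cmd)
  ((shell-command cmd) no-op no-op))

(run-unmodified '("ls" "-l"))
```

Keeping the two thunks as separate arguments is what lets later combinators replace one side (for example, the output side of a pipe stage) without touching the other.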

Redirecting stdio to or from files is achieved by opening the file in the child process and duplicating its descriptor onto the appropriate stdio fileno (dup2() semantics):

((shell-command '("ls"))
 (lambda () #t)
 (lambda ()
   (duplicate-file-descriptor-to
    (open "out" (bitwise-ior open/write open/create open/truncate))
    1)))

((shell-command '("grep" "define"))
 (lambda ()
   (duplicate-file-descriptor-to
    (open "shell.scm" open/read)
    0))
 (lambda () #t))

This looks like a common pattern, so let's provide some utilities:

(define (redirect file mode fileno)
  (duplicate-file-descriptor-to (open file mode) fileno))

(define (in< file) (redirect file open/read 0))
(define (out> file)
  (redirect file (bitwise-ior open/write open/create open/truncate) 1))
(define (err> file)
  (redirect file (bitwise-ior open/write open/create open/truncate) 2))

so we can rewrite the examples as:

((shell-command '("ls")) (lambda () #t) (lambda () (out> "out")))
((shell-command '("grep" "define"))
 (lambda () (in< "shell.scm")) (lambda () #t))
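Other open modes follow the same pattern. As a minimal sketch, an appending redirect could be written as below. The name append-out> is hypothetical, chosen to avoid clashing with the library's own out>>. The sketch assumes that (chibi filesystem) exports an open/append flag alongside the flags used above, and that bitwise-ior is available from (srfi 151).

```scheme
(import (scheme base)
        (only (srfi 151) bitwise-ior)
        (chibi filesystem)
        (chibi shell))

;; Hypothetical helper: like out>, but append instead of truncating.
;; Assumes open/append is exported by (chibi filesystem).
(define (append-out> file)
  (redirect file (bitwise-ior open/write open/create open/append) 1))

;; Append the output of date to the file "log" on each run.
((shell-command '("date"))
 (lambda () #t)
 (lambda () (append-out> "log")))
```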

We can use these combinators for more than I/O redirection. For example, we can change the current working directory. The semantics of many commands depend on the current working directory, so much so that some commands provide options to change the directory on startup (e.g. -C for git and make). For commands that don't offer this convenience we can use process combinators to change directory only in the child without invoking extra processes:

((shell-command '("cmake"))
 (lambda () (change-directory project-dir))
 (lambda () #t))

Another resource we may want to change is the user, e.g. via setuid. Since we control the order of resource changes we can do things like the following example. Here we run as root, providing access to the secret data in /etc/shadow, but extract only the row relevant to a specific user and write to a file owned by them:

(let ((user "alice"))
  ((shell-command (list "grep" (string-append "^" user ":")))
   (lambda ()
     (in< "/etc/shadow")   ; read as root
     (set-current-user-id! (user-id (user-information user))))
   (lambda ()
     (out> "my-shadow")))) ; written as user

This is already impossible in bash (or with posix_spawn) without resorting to additional subprocesses.
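The same ordering applies to any pair of changes. The sketch below assumes, as the /etc/shadow example implies, that the input thunk runs before the output thunk. The directory change then happens first, so the relative name "listing" given to out> should resolve inside the new directory. The /tmp path and the file name are illustrative only.

```scheme
(import (scheme base) (chibi filesystem) (chibi shell))

;; Assumption: the first thunk runs before the second in the child.
((shell-command '("ls" "/"))
 (lambda () (change-directory "/tmp"))  ; runs first: cwd becomes /tmp
 (lambda () (out> "listing")))          ; then opens "listing" relative to /tmp
```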

In a similar manner we can also change the priority with nice, the filesystem root with chroot, and the cgroup, which is otherwise generally done with a wrapper script.

Things get more interesting when we want to combine multiple commands. We can connect the output of one process as the input to another with a pipe. The following pipes the output of echo to tr, outputting "HELLO" to stdout:

((shell-pipe (shell-command '(echo "hello"))
             (shell-command '(tr "a-z" "A-Z")))
 (lambda () #t)
 (lambda () #t))
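The signature (shell-pipe cmd . cmds) suggests that more than two stages can be chained in one call. A minimal sketch under that assumption, using the standard printf, sort and uniq utilities:

```scheme
(import (scheme base) (chibi shell))

;; printf emits three lines, sort orders them, and uniq drops the
;; adjacent duplicate. The final stage writes to the parent's stdout.
((shell-pipe (shell-command '("printf" "b\\na\\nb\\n"))
             (shell-command '("sort"))
             (shell-command '("uniq")))
 (lambda () #t)
 (lambda () #t))
```

The two thunks passed at the end apply to the pipeline as a whole: the first to the input of the first stage, the second to the output of the last.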

We can continue to build on these combinators, but for practical use a concise syntax is handy. We provide the syntax shell, similar to SCSH's run, except that a single top-level pipe is implied. The above becomes:

(shell (echo "hello") (tr "a-z" "A-Z"))

A command without any arguments can be written as a single symbol without a list:

(shell (echo "hello") rev)

=> "olleh"

You can chain together any number of commands, implicitly joined in a pipe. I/O redirection works by putting the redirection operator after the command it modifies:

(shell cat (< "input.txt") (tr "a-z" "A-Z") (> "out"))

for the following operators:

(< input): redirect stdin from the file input
(<< obj): redirect stdin from the displayed output of obj
(> output): redirect stdout to the file output
(>> output): append stdout to the file output
(err> output): redirect stderr to the file output
(err>> output): append stderr to the file output
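As a brief illustration of the less common operators, the sketch below feeds a literal string to rev with << and appends the result to a file with >>, then captures error output with err>. The file names are placeholders.

```scheme
(import (scheme base) (chibi shell))

;; stdin comes from the displayed form of the string; stdout is
;; appended to reversed.log rather than overwriting it.
(shell rev (<< "hello world") (>> "reversed.log"))

;; ls complains on stderr about the missing file; err> sends that
;; message to errors.log instead of the terminal.
(shell (ls "no-such-file") (err> "errors.log"))
```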

Commands can also be combined logically with several operators:

(do cmd1 cmd2 ...): run the commands in sequence
(and cmd1 cmd2 ...): run the commands in sequence until the first fails
(or cmd1 cmd2 ...): run the commands in sequence until the first succeeds
(>< cmd1 cmd2 ...): pipe the output of each command to the input of the next
(if test pass fail): if test succeeds run pass, else fail
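A short sketch of the logical operators in use follows. It assumes the standard test, mkdir, touch and echo utilities are on the PATH; the build directory and stamp file are illustrative names.

```scheme
(import (scheme base) (chibi shell))

;; Create the directory only when it is missing.
(shell (if (test "-d" "build")
           (echo "build already exists")
           (mkdir "build")))

;; touch runs only if mkdir succeeded.
(shell (and (mkdir "-p" "build") (touch "build/stamp")))

;; The message is printed only if the test fails.
(shell (or (test "-f" "build/stamp") (echo "stamp is missing")))
```

Option flags are written as strings here to avoid relying on how the reader treats symbols such as -d.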

Note that although piping is implicit in the shell syntax itself, the >< operator can be useful for nested pipelines, or to structure a pipeline in one expression so you can group all I/O modifiers for it as a whole, e.g.

(shell (< x) cat rev (> y))

could also be written as

(shell (>< cat rev) (< x) (> y))

As a convenience, to collect the output to a string we have shell->string:

(shell->string (echo "hello") (tr "a-z" "A-Z")) => "HELLO"

Similarly, the following variants are provided:

shell->string-list: returns a list of one string per line
shell->sexp: returns the output parsed as a sexp
shell->sexp-list: returns a list of one sexp per line
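These variants are convenient when the output feeds directly into Scheme code. The following is a minimal sketch, assuming the variants accept the same command syntax as shell->string; the variable names are illustrative.

```scheme
(import (scheme base) (scheme write) (chibi shell))

;; One string per line of output from ls.
(define entries (shell->string-list (ls "-1")))
(for-each (lambda (name) (write name) (newline)) entries)

;; echo prints a datum in its written form; shell->sexp reads it back.
(define numbers (shell->sexp (echo "(1 2 3)")))
(write (apply + numbers))
(newline)
```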

`(shell-pipe cmd . cmds)`

`(shell-if test pass . o)`

`(shell-and cmd . cmds)`

`(shell-or cmd . cmds)`

`(shell-do cmd . cmds)`

`(redirect file mode fileno)`

`(in< file)`

`(out> file)`

`(out>> file)`

`(err> file)`

`(err>> file)`

`(with-in< file cmd)`

`(call-with-shell-io cmd proc)`

`(shell& cmd ...)`

`(shell cmd ...)`

Returns the exit status of the last command in the pipeline.