It's user error.

Debu.gs


Inferno Part 1: Shell

Inferno has, hands down, the best shell I have ever used.

“But that’s a bold statement, and who the hell are you?”

I’m just a guy on the internet, where bold statements can be and are made without repercussions. When I heard that they will let just anyone have a blog, and they don’t even care what you say on it, I said “Sign me up!”

Seriously, though, here’s a little background on shells I’ve used; you can skip this section if you don’t care. I’ve used bash as my primary shell for quite some time, and have gotten used to its quirks.

More recently, I’ve used rc(1) and had fun. (For the interested, there’s an introduction to rc by the shell’s author, Tom Duff.) It is of course the default shell on Plan 9, but it also runs on Unix. On Unix, in the absence of a plumber(4), I usually use rlwrap with it.

I can cope reasonably when presented with csh, tcsh, ksh, zsh, and plain POSIX sh. zsh is the most fun of those, but they are all roughly similar except for minor niceties, syntax differences, and interface tweaks.

So, that should explain somewhat where I’m coming from.

The section after the section that you skip if you don’t care

This one is more code-heavy after the introduction than the previous Inferno post. I recommend downloading and installing Inferno so that you can play with the examples. (The short version of the installation instructions: get the tarball, hg pull, hg update, build using a 32-bit machine or 32-bit chroot.)

Inferno’s shell, despite author Roger Peppé’s modest claims of only minor originality, is radically different.

Very different

It’s pretty similar to rc, although the spots where it differs are brilliant.

In a bare, freshly started shell, there are no conditionals, no loop constructs, no functions, no aliases. There are lambdas after a fashion, though, in the form of dynamically (as opposed to lexically) scoped blocks with local variables, and variables are lists rather than strings. Still with me? You might suspect me of lying about the missing pieces, but I can assure you that I am technically not lying very much at all.

The lambdas/blocks are simple enough to understand: they’re delimited by { and }, and they can have local variables and arguments, which are provided as a list in $*, as a single string in $"*, and individually in $1, $2, $3, etc.

Technically not lying very much at all

There are && and ||, and one would think that it would be easy to build up an if combinator by means of clever variable usage and lambdas, but that won’t work:

; ifc = { $1 && $2 }
; $ifc {echo this} {echo worked}
sh: and: './and' file does not exist

What happened? && and || are just syntactical sugar:

; echo $ifc
{and {$1} {$2}}
; echo $ifc {echo this} {echo worked}
{and {$1} {$2}} {echo this} {echo worked}

Huh. Well, if we wanted, we could write an “and” program that made this work, right? (We are ignoring for now that it wouldn’t work anyway, since the inner blocks have no arguments, and thus $1 and $2 will be nil.) You technically could do that, but what you actually want to do is load the std module.

Loadable modules…in a shell!

Loadable modules! Unloadable as well, but that will make sense in a bit. If you want to do any scripting in such a sparse environment, you’re crippled without a feature like that. The presence of loadable modules, in fact, is why the shell can be so spartan but still be so great: a module can be written to allow the shell access to any library on the system.

The std module provides facilities for flow control (looping, conditionals, exceptions), predicates, list manipulation, functions, and substitution functions. There are several other libraries which will be introduced as they come up.

Modules are loaded from /dis/sh, and are expected to conform to a simple interface and to define some “builtin” functions. Straightforward enough, right?

There are very few builtins that come with the shell, but you can use loaded (which gives a line-separated list of functions and their origin) to check what they are, and whatis to see what something is.

The basics of flow control.

Finally, upon loading the standard library, we get an “if”! It’s a lot more like Common Lisp’s cond than a straightforward “if” statement, in that it accepts pairs of blocks and expects the first of each pair to be a predicate.

load std
if {~ $i 1} {echo '$i was one.'}
if {~ $i 1} {
	echo '$i was one.'
} {! ~ $i 2} {
	echo '$i was something other than one or two.'
} {
	echo '$i was two.'
}

The ~ is the comparison predicate (so as not to clash with =), and the ! negates the status of the following command. (~, as the name suggests, can also match patterns, following the shell’s globbing rules.)
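
To make the cond comparison concrete, here is a rough Python model of how such an “if” could walk its arguments. It is only a sketch under stated assumptions: sh_if and describe are made-up names, blocks are modeled as zero-argument callables, and the real builtin works on shell blocks and exit statuses rather than Python booleans.

```python
def sh_if(*blocks):
    """Model of a cond-style if: (predicate, action) pairs, optional trailing else."""
    blocks = list(blocks)
    while len(blocks) >= 2:
        predicate, action = blocks[0], blocks[1]
        blocks = blocks[2:]
        if predicate():
            return action()
    if blocks:                  # one block left over: the else branch
        return blocks[0]()
    return None                 # nothing matched and there was no else


def describe(i):
    # Mirrors the three-branch shell example (hypothetical helper).
    return sh_if(
        lambda: i == 1, lambda: "$i was one.",
        lambda: i != 2, lambda: "$i was something other than one or two.",
        lambda: "$i was two.",
    )


print(describe(1))
print(describe(2))
print(describe(7))
```

The odd block left over at the end plays the role of “else”, which is why the last branch in the shell example has no predicate in front of it.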

The looping constructs (for, while, and apply) act as expected:

; for i in `{ns} {echo $i | sed 's/.* //'} | sort | uniq
; i = Hello, World!; while {! ~ $#i 0} {(cur i) = $i; echo $cur}
; apply {man $1 intro} 1 2 3 4 5 6 7 8 9 10

In the for example, the backtick followed by braces (`{}) does what a pair of backticks or $() do in bash: substitutes the output of a command into the arguments for another command. "{} is similar, but does no parsing of tokens; that is, the output is substituted as a single string. Try changing the backtick to a double quote and you will see what I mean.

The while example uses a couple of other new things: $#i returns the number of elements in $i, and the (cur i) = $i splits the first item out of the list $i, assigning it to $cur, and then assigns the rest to $i. I don’t know of an equivalent in bash that is as simple, but it behaves roughly like the following assignment in Ruby: cur, *i = i.
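
Python’s starred assignment does the same head/tail split, so the while loop can be modeled as below. This is a sketch only: walk is a hypothetical name, and a Python list stands in for the shell’s list-valued variable.

```python
def walk(items):
    """Consume a list head-first, the way the while example does."""
    printed = []
    while len(items) != 0:        # {! ~ $#i 0}
        cur, *items = items       # (cur i) = $i
        printed.append(cur)       # echo $cur
    return printed


print(walk(["Hello,", "World!"]))
```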

apply is roughly equivalent to for, but without an explicit variable and with the block preceding the arguments to apply the block to. You can use one to simulate the other, in fact:

for i in 1 2 3 { {echo $1} $i }
apply { i = $1; {echo $i} } 1 2 3

Functions, substitution functions

Functions are syntactically nice and are simple due to the semantics of blocks in Inferno:

; fn greet {
	name = $1
	if {~ $#name 0} { name = World }
	echo Hello, $name!
}
; greet
Hello, World!
; greet everybody
Hello, everybody!
; whatis greet
load std; fn greet {name=$1;if {~ $#name 0} {name=World};echo Hello, $name^!}

As with any other command, you can capture its output:

; greeting = "{greet}
; echo The greeting is '"'^$greeting^'"'
The greeting is "Hello, World!
"

The carets (^) are a join operator of sorts; they are used to separate tokens without inserting a space. The only quoting operator for literal strings is the humble tick (or “single-quote”, but anyway it’s this guy: '). Since the double-quote is used elsewhere in the syntax of the shell, you’ll need to escape it where it is used.

But that’s a little clunky if the function’s main purpose is to generate a string for you to keep around, and this is where substitution functions come in. Substitution functions are declared similarly to functions but are called differently (with ${}) and their purpose is to be substituted into a command rather than to act as standalone commands:

; subfn greeting {
	name=$1
	and {~ $#name 0} {name=World}
	result = Hello, $name!
} 
; echo ${greeting}
Hello, World!
; echo ${greeting Charles Forsyth}
Hello, Charles!
; echo ${greeting 'Charles Forsyth'}
Hello, Charles Forsyth!

There are a few substitution functions built into the shell and several that come from the std library.

Functions and substitution functions are stored as environment variables and are available to subshells as a result. Functions get the “fn-” prefix, and substitution functions get “sfn-”. And since /env is a filesystem:

; cat /env/fn-greet
'{name=$1;if {~ $#name 0} {name=World};echo Hello, $name^!}'

Environment variables

As mentioned above, /env is a filesystem representing the current process’s environment variables. No more big block of NUL-terminated strings in a NULL-terminated extern char **environ; as in C, thus the shell’s job in managing environment variables gets easier. In Unix, even if you wanted to, you’d have a hard time getting a different process’s environment, the environment being a chunk of memory at some location inside the process. However, in Inferno, it’s a filesystem, mounted on /env by default.

Environment variables can be enumerated with ls -p /env. Not only that, but propagating changes is a different game altogether. Remember listen(1) and export(1)? If you felt like it, you could export /env and give outside processes access to the current one’s environment variables.

Math

Mathematical expressions in the shell are taken care of by means of the expr(1) and mpexpr(1) libraries for 64-bit integer math and arbitrary-precision integer math, respectively. They both provide the same interface: a substitution builtin ${expr ...} and a regular builtin function ntest which returns false for zero and true otherwise. ${expr} gives you a simple, stack-based calculator that anyone familiar with dc(1) or Forth or HP calculators should be comfortable with. It lacks some stack manipulation operations but is otherwise a standard RPN math language:

; load expr
; echo ${expr 5 2 3 + '*'}
25
; load std
; i = 10; while {ntest ${expr $i 5 -}} {echo -n $i^' '; i = ${expr $i 1 -}}; echo
10 9 8 7 6

Note the quoting around the multiplication operator, to prevent the asterisk from being expanded. If you prefer to not have to quote, there are equivalents that are not special characters for anything that needs quoting (e.g., ‘x’ is equivalent to ‘*’).
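
If RPN is new to you, a toy evaluator shows the stack discipline. This is a Python sketch, not how ${expr} is implemented: rpn and OPS are hypothetical names, only a handful of operators are covered, Python’s unbounded integers stand in for 64-bit ones, and having comparisons push 1 or 0 is my assumption.

```python
import operator

# A few binary operators; 'x' is accepted as an alias for '*', as in the text.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "x": operator.mul,
    "%": operator.mod,
    "<": lambda a, b: int(a < b),   # assumption: comparisons push 1 or 0
}


def rpn(*tokens):
    """Evaluate tokens in reverse Polish order and return the top of the stack."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            right = stack.pop()     # operands come off in reverse order
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(int(tok))
    return stack[-1]


print(rpn("5", "2", "3", "+", "*"))   # 5 * (2 + 3)
print(rpn("10", "5", "-"))            # 10 - 5
```

Numbers are pushed as they arrive, and each operator pops its two operands and pushes the result, so no parentheses are ever needed.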

Here’s a naive primality test:

; fn isprime {
	n = $1
	rescue composite {status false} {
		i = 2
		while {ntest ${expr $i $n '<'}} {
			! ntest ${expr $n $i %} && raise composite
			i = ${expr $i 1 +}
		}
	}
}
; subfn primestr {
	n = $1
	result = $n is composite
	isprime $n && result = $n is prime
}
; echo ${primestr 20}
20 is composite
; echo ${primestr 23}
23 is prime

If you look at it closely, there is a bug in the above. I’ll avoid spoiling it in this paragraph if you want to try to spot it.

First, though, you may have noticed the use of rescue and raise. The std library implements exceptions, thinly wrapping the Limbo facilities for exception-handling. They behave as expected, although it’s recommended they be used sparingly.
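
If rescue and raise are unfamiliar, the control flow of isprime can be modeled in Python with an ordinary exception. This is only a sketch: the Composite class and the Python isprime are stand-ins made up for illustration, and like the shell version it is naive (it does not special-case 0 or 1).

```python
class Composite(Exception):
    """Stand-in for the 'composite' exception raised in the shell version."""


def isprime(n):
    try:                            # rescue composite {status false} {...}
        i = 2
        while i < n:                # ntest ${expr $i $n '<'}
            if n % i == 0:          # ! ntest ${expr $n $i %}
                raise Composite     # raise composite
            i += 1
    except Composite:
        return False                # the handler block: status false
    return True                     # loop finished without raising


print(isprime(20), isprime(23))
```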

The small bug: n is an environment variable, so primestr and isprime will be sharing the same n! Only by luck have we avoided a bug here. Ouch. Luckily, blocks can have local variables:

; n = 5; { n = 6 }; echo $n
6
; n = 5; { n := 6 }; echo $n
5

So, you can recurse to your heart’s content:

; subfn factorial {
	(n r) := $1 1
	and {! ~ $n 0} {r = ${expr ${factorial ${expr $n 1 -}} $n '*'}}
	result = $r
}
; echo ${factorial 15}
1307674368000
; echo ${factorial 21}
-4249290049419214848

Oops. 21! is a 66-bit number. But no problem!

; unload expr
; load mpexpr
; echo ${factorial 21}
51090942171709440000
; for i in ${expr 15 26 seq} { echo ${factorial $i} } | mc
1307674368000               51090942171709440000
20922789888000              1124000727777607680000
355687428096000             25852016738884976640000
6402373705728000            620448401733239439360000
121645100408832000          15511210043330985984000000
2432902008176640000         403291461126605635584000000

mpexpr is slightly slower, of course, but not noticeably so for any math you might want to do in the shell.
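
The negative answer that expr gave for 21! is what two’s-complement wraparound would produce. The Python sketch below reproduces it under that assumption; wrap_int64 is a hypothetical helper that keeps only the low 64 bits, which is my model of what a 64-bit calculator is left with, not a description of expr’s internals.

```python
import math


def wrap_int64(value):
    """Reduce an integer to the signed 64-bit value with the same low 64 bits."""
    value &= (1 << 64) - 1          # keep the low 64 bits
    if value >= 1 << 63:            # top bit set: negative in two's complement
        value -= 1 << 64
    return value


exact = math.factorial(21)
print(exact.bit_length())           # bits needed to hold the exact value
print(wrap_int64(exact))            # what survives in 64 bits
print(wrap_int64(math.factorial(20)) == math.factorial(20))  # 20! still fits
```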

As a side note, mc(1) in Inferno and in Plan 9 columnates its input and prints it to standard output. This was a feature singled out in Rob Pike and Brian Kernighan’s USENIX presentation “UNIX Style, or cat -v Considered Harmful” and in Program Design in the UNIX Environment.

Fetching web pages

What’s the fun in a computer that doesn’t talk to the internet at large? And the web is a pretty vast slice of it. Now that we have a good grip on the semantics of blocks, let’s play with a command that makes good use of them. If you don’t have the connection server or the DNS server running, you’ll need to run ndb/cs and ndb/dns to start them up. They power /net/cs and /net/dns, respectively, and you’ll want those if you have the urge to talk to the network. There are a few HTTP clients for Inferno, but we’re going to build one in the shell, if you won’t mind a bit of a digression first.

A brief digression about carriage returns

HTTP requires CRLF (“\r\n”, “0x0d0a”, etc.) to terminate lines. To get this in Inferno, you just need to type ^M.

We’ve got to diverge here, somewhat, depending on whether you have started the window manager and a shell inside it or are using a bare shell (no graphical environment, just ‘emu’ running inside a VT100). We’ll set up a $cr variable to hold a carriage return, and then the next section will work the same no matter what you’re doing.

In the Inferno shell running under a window manager (recommended), this is easy:

; cr = '^M'

Note that that’s control+M, not a caret followed by a capital M. The shell will print a special character indicating that it’s a CR, and you’ll have to surround it by quotes, since it’s a whitespace character.

If you’re using the shell in a hosted environment without starting up a window manager, you’ll feel some pain here:

; echo '^M' | xd
0000000 0a0a0000
0000002

Oops. In hosted Inferno running in a VT100 emulator (xterm, rxvt, etc., I like urxvt), we’re going to have to be a little more clever:

; cr = "{os awk 'BEGIN{printf("\r")}'}

You could also write a brief Limbo program that prints a single \r if you don’t want to use os(1) or start the window manager. os, as you may have suspected (or read if you clicked the link to the man page), runs a command on the host OS. There are some caveats, but it does roughly what you expect for the most part.

Back to the web

Now that we have $cr set up, we can do this:

; fn bcurl {
	(h p) = $1 $2
	dial -A 'tcp!'^$h^'!80' {
		echo Connected >[1=2]
		echo 'GET '^$p^' HTTP/1.0'^$cr
		echo 'Host: '^$h^$cr
		echo 'User-Agent: Inferno Shell Barbarian Curl'^$cr
		echo $cr
		echo Request made >[1=2]
		cat >[1=3]
		echo Done >[1=2]
	} >[3=1]
}
; bcurl reverso.be /

That’ll fetch a page and send it to standard output (including headers and everything), with diagnostics sent to standard error, in a sort of barbaric version of curl(1) that only accounts for the most common use case. dial(1) connects to the specified address, and then executes the provided block, with the block’s standard input and standard output wired up to the connection.
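
For comparison, here is roughly what bcurl puts on the wire, modeled in Python. It is a sketch under assumptions: build_request and fetch are hypothetical names, example.org is a placeholder host, and each echo is assumed to emit its arguments followed by a newline, so that appending $cr yields a CRLF line ending.

```python
import socket

CRLF = "\r\n"


def build_request(host, path):
    """Return the request bytes that bcurl's echo lines add up to."""
    lines = [
        "GET " + path + " HTTP/1.0",
        "Host: " + host,
        "User-Agent: Inferno Shell Barbarian Curl",
        "",                      # the bare `echo $cr`: blank line ending the headers
    ]
    return "".join(line + CRLF for line in lines).encode("ascii")


def fetch(host, path, port=80):
    """Send the request and return the raw response, headers included."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall(build_request(host, path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:         # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


# Only the request is built here; fetch() needs a network connection.
print(build_request("example.org", "/"))
```

The blank line after the headers is the part that is easy to forget, and it is exactly what the lone echo $cr supplies in the shell version.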

There are two things that won’t make sense here if you are used to, e.g., bash: the fact that $cr goes unescaped when used (causing, no doubt, the seasoned bash user to cringe in chilling apprehension, recalling the destruction of the root filesystem on a production machine after running a script that didn’t quote its variables), and the redirection operators.

In Inferno’s shell, you escape things once, and never have to worry again. Everything is parsed once (unless you use the ${parse} substitution function), so there’s no need to worry. For example, in bash:

$ fourspaces='    '
$ echo a $fourspaces b
a b

echo only sees ‘a’ and ‘b’, and prints them separated by a space. However, in Inferno:

; fourspaces = '    '
; echo a $fourspaces b
a      b

That is, echo receives three arguments, ‘a’, ‘    ’, and ‘b’, and prints them with spaces separating them, resulting in six spaces between a and b.
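
The two behaviors can be modeled in a few lines of Python. This is an approximation for illustration only: bash’s real splitting is governed by IFS and applies to unquoted expansions, and the variable names are made up.

```python
fourspaces = "    "

# bash-style: the expanded text is split on whitespace again before echo sees it.
bash_args = ("a " + fourspaces + " b").split()

# Inferno-style: the variable is already a one-element list, spliced in as-is.
inferno_args = ["a"] + [fourspaces] + ["b"]

# echo joins its arguments with single spaces.
print(bash_args, repr(" ".join(bash_args)))
print(inferno_args, repr(" ".join(inferno_args)))
```

In bash you would get the Inferno result by writing echo a "$fourspaces" b; the point is that Inferno never re-splits, so there is nothing to remember to quote.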

The second is a somewhat nicer syntax for FD manipulation. >[1=2] redirects standard output to standard error, as with bash’s >&2. >[1=3] sends standard output to the newly opened FD 3, which we then send back to standard output by means of >[3=1]. The same works for pipes, of course.

; {echo Hello >[1=2]; echo World} |[2] tr l b
World
Hebbo

That’s technically possible in bash with some amount of FD swapping, but it’s somewhat ugly and if there’s a way to do it without subshells, I don’t know:

$ ((echo Hello >&2; echo World >&3) |& tr l b) 3>&1
World
Hebbo

I can’t see the example getting much more complex without running past my capacity to manage it. The extra FD being required (as far as I can tell, unless you invoke another sub-shell; |& redirects stdout and stderr, so stdout has to be saved elsewhere) is a bigger problem than one might expect, as bash warns on its man page that redirections using file descriptors greater than 9 should be used with care. I wouldn’t attempt something like the above bcurl function in bash, but if anyone can get something like that working (using bash’s weird network support), it would be interesting to see.

Loadable as a module

The Inferno shell is also a loadable module for Limbo programs. Where in most Unix systems exec()ing sh -c is somewhat popular, in Inferno, it’s unnecessary. You just load the sh(2) module into your Limbo program and run programs, parse commands, glob patterns, etc., as much as you like.

Not only does this make rewriting shell functionality unnecessary, it lets you use the shell as a Lua-style scripting engine for your application. It also allows you the freedom to slap whatever GUI you like on top of the shell without much problem. The one that ships with Inferno is wm/sh, although wmrun is a good example of a different UI. Based on how much I love typing :%! in vim and | in sam(1), I’ve been thinking about playing with an interface that treats the shell and commands’ output similarly, but that’s a later project.

I told you it was different

I think this covers enough of the Inferno shell to illustrate how powerful it is compared to more common shells, and exactly what is so different when compared to bash. Hopefully by this point you believe me.

Next up

I’ve another article or two planned to cover more of the shell. There is so much you can do with it that it’s hard to cover all the ground in one post. I’ve de-scoped, so to speak, sh-tk(1) and the reason Inferno doesn’t have a version of awk: you can do it in the shell via getline, sh-regex(1), and sh-string(1) (although awk’s syntax would be nice to have).

I’m also very interested in explaining the plumber(8). This will likely be tied into the Tk article, since the example I have in mind involves plumbing.

The auth stuff is slightly easier to wrap your head around than the shell. I’m still tinkering with it, but I plan an entry on getting machines to authenticate with each other, centralizing your home directory, and then building a couple of simple distributed applications (like, for example, a simple application of the filter/map/reduce pattern, a prime sieve, or something along those lines). A step-by-step guide to using it seems a little difficult to find, though, so I’d like to cover it.

For the reasons mentioned in the previous articles, I kind of want to avoid the Inferno versus BEAM comparison article, but it’s getting more tempting all the time.

So, following this, in order of probability: an article on auth and building a cluster, one on text processing in the shell, one on Tk in the shell, and one comparing and contrasting Inferno and Erlang’s BEAM.

Small update

I’ve submitted this article to Hacker News, although my suspicion is that submitting it after-hours on a Friday will not get it much attention.

Errata (2012/06/07)

I had not encountered the unicode(1) command before writing this entry. Rather than os awk to get a CR, it suffices to use "{unicode D}. Much nicer.


