It's user error.

Debu.gs


Inferno Part 2: Let's Make a Cluster!

Now that the shell’s mostly covered in the previous entry, we’re going to have some fun!

“But the shell was pretty fun!”

But the network is the computer! So let’s build a cluster. After some introductions, we’ll cover the initial setup, and finally show how to use it with a few simple examples: interacting with remote namespaces, using cpu(1) to run processes remotely, sharing home directories across the network, sharing a clipboard across several machines, and parallelizing computations across machines with cpu(1) and file2chan(1).

Where Inferno Gets Really Interesting

It’s here. The shell is the best shell ever, Limbo is a great language, the Dis VM is lightweight and speedy, but the really interesting part is a design feature: networking and distributed computing are built into the OS at a fundamental level.

It’s a bit difficult to convey exactly how this affects everything you do unless you use the operating system, but if you can imagine the time and effort that went into scp, for example, you’ll have an idea. In Inferno, networking is built in (and the API is nicer than BSD sockets, but that’s a bit subjective), so you don’t need to write any special networking code. The OS handles authentication and encryption for 9P connections, so no need to add that to your code. Inferno handles, in fact, everything you need to handle besides the file copying, which is neatly handled by cp without additional arguments.

Not just scp, either, but no software needs to be particularly aware of networks, authentication, encryption, or nearly anything related to distributed computing. scp, sftp, ssh, rsync, MapReduce, Hadoop, nearly every specialized client and several servers go away, because external resources are decoupled from the clients that make use of them, and the network is decoupled from servers. That is to say, clients and servers need not integrate code to deal with the network, and often do not need to know that there is a network.

You have to use it for any of that to sink in, but if you can recall the first time you really “got” Forth/Lisp/Unix/FP, then you have an idea what it’s like. (There is, incidentally, a Scheme implementation for Inferno if you are a fan of Scheme.) So, like before, I’ll try to make things example-heavy, and I recommend you install Inferno in a couple of places so you can try out the examples yourself and play with them.

Just a note about architecture and a couple of conventions

The traditional Inferno network (as well as Plan 9) keeps resources separated. You put big CPUs and lots of RAM in the CPU servers, the large disks in the fileserver, and the nice I/O devices on the terminals. The CPU servers and terminals are often diskless and boot from the network, mounting their root filesystems from the file server. Of course, this is less likely for hosted Inferno, but this is roughly the configuration for which Inferno was designed.

This isn’t the only configuration, but it does tend to be the optimal one for Serious Business. As I am running my web/mail/DNS/etc. servers inside my house and I tend to find uses for old computers rather than tossing or selling them, my house is a bit overrun with machines, more than a few of them franken-PCs. The “main” one is the desktop, with all the RAM/CPU/disk, the other ones are usually either clients that talk to the main one over ssh, or use it as a file server for music and movies. This is the home equivalent of the traditional Unix architecture: one big machine, several “terminals”.

Luckily, Inferno is flexible enough that it can handle all of that, but the OS itself makes no assumptions about how the network is architected.

For the sake of making sense, I’m going to write as if these were all separate, and by convention name machines after their function (e.g., auth1, terminal, etc.). They may be different machines, different instances of Inferno on the same machine, or even the same instance of Inferno, depending on how you set things up. To avoid having to make distinctions without differences, it’s clearer and less confusing to talk about “logical machines” named after their function.

You’ll also need to know about a couple of conventional pathnames. /n is usually used for arbitrary remote services, and /mnt for services exported by local programs. Thanks to the magic of mntgen(4)‌, the mountpoints themselves need not exist. You can set it up and play with it like this:

; mount -a {mntgen} /n
; mount -a {mntgen} /mnt
; ls -ld /n/arbitrary-dirname
d-r-xr-xr-x M 2 me me 0 Dec 31  1969 /n/arbitrary-dirname

The ls(1)‌ command (or any other process that tries to read /n/arbitrary-dirname) will see an empty, read-only directory. If you mount something on it, then the directory will persist in memory. I’m going to assume that you either have set up mntgen for those directories or you know how to operate mkdir(1)‌ when presented with a complaint from the OS that a mountpoint doesn’t exist.

Usually, you set up mntgen in your initialization scripts: preferably in the initialization script you pass to emu(1) when starting Inferno, otherwise in /lib/sh/profile, /lib/wmsetup, or $home/lib/wmsetup. Inferno is very sparing with its defaults.

You’ll also need to start up the networking services if you want any of this to work:

; ndb/cs
; ndb/dns

It’s recommended you add those to the startup, too. Since this is the sort of thing that need occur only once, you might prefer to pass it in the startup script, since presumably you will spawn more than one shell and you can spawn more than one window manager.

Authentication in Inferno is designed very differently than in Unix; it’s even different than in Plan 9. There are plenty of papers on that topic for the interested. In the interests of staying on topic (practical usage) and only presenting things that I am reasonably certain I will present competently, I’ll keep my discussion of the innards minimal.

Trivially

You may remember this from Part 0.

On disk1:

; mount -c {memfs -s} /n/mem
; echo something > /n/mem/asdf
; listen -A 'tcp!*!1234' { export /n/mem & }

On terminal:

; mount -Ac 'tcp!disk1!1234' /n/remotememfs
; cat /n/remotememfs/asdf
something

This ignores authentication (which we’ll get to), but what you have there is two lines of shell script to expose a key-value store to the network, considerably less code than memcached. NoSQL is easier to implement than most people think! (I am only sort of joking here.) If you have a Linux system handy, you can access the (unauthenticated) Inferno filesystems with 9mount:

$ mkdir /tmp/memfs
$ 9mount -i 'tcp!disk1!1234' /tmp/memfs

If you’re not on Linux and don’t mind using a client library rather than mounting the filesystem directly, there are several 9P implementations (like libixp).

Authentication

Let’s get the computers to talk to each other securely!

Processes that want to securely use services provided by other processes use certificates to authenticate with those services. The services will ask an authentication server to verify the certificate, and the authentication server will (provided the certificate is valid) tell the service who the user is. You can have these arbitrarily split up; see /lib/ndb/local for the local setup, although these examples will work without modifying that file.

Initial setup

You’ll need an auth server first. We’ll call it auth1. Any instance of Inferno can be an auth server, any user can run an auth server, etc. Here’s how you set it up:

; auth/createsignerkey `{cat /dev/sysname}
; svc/auth
Key: 
Confirm key: 

createsignerkey(8)‌ does exactly what the name suggests: It creates a signer key. The key ends up in /keydb/signerkey. This command, by the way, can take a while.

The next step, svc/auth(8), is actually a brief script (to which I have added a few newlines for presentation here):

; cat /dis/svc/auth
#!/dis/sh.dis -n
load std
or {ftest -e /net/dns} {ftest -e /env/emuhost} {ndb/dns}
or {ftest -e /net/cs} {ndb/cs}
or {ftest -f /keydb/signerkey} {
    echo 'auth: need to use createsignerkey(8)' >[1=2]
    raise nosignerkey
}
or {ftest -f /keydb/keys} {
    echo 'auth: need to create /keydb/keys' >[1=2]
    raise nokeys
}
and {auth/keyfs} {
    listen -v -t -A 'tcp!*!inflogin' {auth/logind&}
    listen -v -t -A 'tcp!*!infkey' {auth/keysrv&}
    listen -v -t -A 'tcp!*!infsigner' {auth/signer&}
    listen -v -t -A 'tcp!*!infcsigner' {auth/countersigner&}
}
# run svc/registry separately if desired

keyfs(4)‌ will prompt for a password, and the first time for a confirmation. You can modify the script to load that from a file (with the usual caveats that relate to keeping things like this in a file) by changing the line with keyfs on it to this:

and {auth/keyfs -n /some-file-with-crazy-permissions} {

The block full of listen(1)‌s is the important bit. When listen gets a connection on the appropriate port, it will spawn a server to handle the request. The numeric values for the ports can be found in /lib/ndb/services. Of course, the services run without authentication, since in order to authenticate, you will need to contact those services.

Next, we’ll create an account on the local machine, and then test it out by getting a key authorizing the current user to access that account:

; auth/changelogin $user
new account
secret: 
confirm: 
expires [DDMMYYYY/permanent, return = 11062013]: permanent
change written
; getauthinfo default
use signer [$SIGNER]: localhost
remote user name [pete]: 
password: 
listen: got connection on tcp!*!inflogin from 127.0.0.1!58005
save in file [yes]: 

changelogin(8)‌ creates or modifies the keys for an account. The password (the secret: prompt) must be at least eight characters long, but otherwise there are no restrictions. You can set an expiration (the default is one year in the future) or no expiration (“permanent”).

getauthinfo(8)‌ downloads a key, which you can use to authenticate. If you provide the correct password, you’ll have a key in $home/keyring named default, which is the default key for any random service you might encounter. The default is to save it into a file, but if you elect not to do so, getauthinfo will act as a fileserver, serving a one-file directory that is mounted on top of your keydir. If you recall how namespaces work, this means that only the current process and its descendants will be able to see the key. In Unix terms, this is like the difference between appending your public key to .ssh/authorized_keys and simply logging in, but with significantly less code.

After you complete the initial setup, you can start services using just svc/auth from now on.

Let’s test it out! Pop open two shells, and try this:

; listen -v 'tcp!localhost!9001' { echo $user & }

And on the other shell,

; dial 'tcp!localhost!9001' { cat >[1=2] }
pete

Whereupon the first shell (because we passed the -v flag) will print the following diagnostics:

listen: got connection on tcp!localhost!9001 from 127.0.0.1!37298
listen: authenticated on tcp!localhost!9001 as pete

Or something like them; it is likely that your username is not “pete”, and the outbound port may not be 37298.

Accessing Remote Resources

If you have two different machines running Inferno, get to the second one however you like: ssh to it, open a VNC or remote desktop connection, walk over to the physical machine, bring the physical machine over next to the first one, whatever. Get an Inferno shell on the other machine (which we’ll call the terminal) and do this:

; getauthinfo tcp!auth1
use signer [$SIGNER]: auth1
remote user name [pete]:
password:
save in file [yes]: no

By default, if you want to authenticate from the terminal with the machine auth1, commands will often use $home/keyring/tcp!auth1, so that’s what we call the key. If you set up auth1 as the only authentication server for your network, you can just add it as SIGNER in /lib/ndb/local and use default instead of tcp!auth1. If you like your text-entry to go into boxes instead of a shell, you can also do wm/getauthinfo, which pops up a friendly window.

Of course, you can change your mind about saving the key into a file or not; as long as getauthinfo is running, you can copy the key somewhere else.

If you still have the listen command running on auth1, you can test authentication with the dial command (except, of course, that you’ll want to modify localhost to be whatever the hostname or IP address of auth1 is). You can specify the key file manually with -k, and you have your choice of encryption and MAC algorithms with -a. (See dial(1)‌ for more information.)

Now It Gets Interesting

Back on auth1, in the same shell used to start svc/auth, try these out:

; svc/styx
; svc/rstyx

svc/styx runs a file server and svc/rstyx runs a CPU server. They’re both shell scripts and both very brief:

; cat /dis/svc/styx
#!/dis/sh.dis -n
load std
listen -v 'tcp!*!styx' {export /&}    # -n?

; cat /dis/svc/rstyx
#!/dis/sh.dis -n
load std
listen 'tcp!*!rstyx'  {runas $user auxi/rstyxd&}

styx

svc/styx exposes the whole (almost; listen(1)‌ is running without the -t option for “trusted” connections) namespace to anyone who can authenticate, so if you have permissions on auth1 for a file, you can do things with that file on terminal:

; mount -k tcp!auth1 tcp!auth1!styx /n/auth1
; cat /n/auth1/dev/sysname
auth1

scp, sftp, etc. are all unneeded in such an environment. Whatever file manager you use locally can handle things nicely. In fact, anything that interacts with files (everything in Inferno) can now talk to files on both auth1 and terminal. grep(1)‌, wc(1)‌, diff(1)‌, the whole family. You can open up an editor and edit two files on two different machines simultaneously, without adding code to the editor itself. Since you can arbitrarily mangle your namespace, you can do things like this:

; ps
[Elided:  all the processes running locally]
; bind /n/auth1/prog /prog
; ps
[Elided:  the processes that are running on auth1!]

After you’ve done that, you can even fire up wm/deb(1)‌ to debug a remote process. wm/deb doesn’t know or care if it’s debugging a local or a remote process; it just interacts with processes via the /prog filesystem.

rstyx

svc/rstyx is even more fun. It listens on the rstyx port and when it gets an authenticated connection, it runs the rstyxd(8)‌ command as that user. This enables you to use cpu(1)‌ to connect to it.

You can run whatever command you like (and, as with mount, dial, etc., you can specify arbitrary combinations of encryption algorithms and digests); by default it just gives you a shell. It might look a lot like ssh at first glance:

; cpu auth1
% lc keyring
default
% cat /dev/sysname
terminal

Sure enough, that is exactly what’s in keyring/ on auth1, but why is /dev/sysname telling us the shell is still running on terminal? It’s not, and I’m going to ruin the surprise for you by using ns(1)‌:

; ns | grep /dev
bind /dev /dev
bind -b '#^' /dev
bind -b '#m' /dev
bind -b '#c' /dev
bind -a /dev /dev
bind -b '#i' /dev
bind /chan/cons.48 /dev/cons
bind /chan/cons.48ctl /dev/consctl
bind -b /n/client/dev /dev

See the last line? /n/client/dev is mounted atop /dev, which means that we can’t see the host’s /dev, since it’s been replaced with the client’s. You can unmount /n/client/dev /dev, of course, but if you’re being astute and this OS is new to you, you might have jumped out of your chair and pointed at /n/client, your jaw now making contact with your chest.

It is exactly what it looks like: the namespace on terminal has automatically been exported for our process on auth1 and bound to /n/client. In fact, if you’ve run cpu from the same shell that you ran the earlier mount on, you can do this:

; cat /n/client/n/auth1/dev/sysname
auth1

A Plague of Flies Files!

Now that we have the building blocks in place, you can probably already imagine a number of uses.

Exporting the host’s filesystem.

Inferno, you may recall, has access to the host’s filesystem via the special device #U*. You can export /home (or, for OSX, /Users) like this:

; listen 'tcp!*!1337' { export '#U*/home' & }

Or even the whole filesystem:

; listen 'tcp!*!31337' { export '#U*/' & }

The connection is securely authenticated, messages are digested, encryption can be used, and the whole thing is secure. Perhaps this still makes you uncomfortable, though. No worries, you can turn that off using pctl:

; ls -ld '#U*/'
d-rwxr-xr-x U 5 root root 0 Apr 24 15:37 #U*/
; load std
; pctl nodevs
; ls -ld '#U*/'
ls: stat #U*/: mount/attach disallowed

This prevents mounting or binding new special devices. Devices that are already mounted and in the namespace will still be accessible, of course, but you have fine-grained access to the namespace, so you can expose only what you want, from an empty read-only directory all the way up.

Shared $home

Once terminal can authenticate with auth1, you can have some initialization scripts run as part of the startup to mount home from auth1. If an appropriate key for auth1 isn’t found, run getauthinfo to get it, and then mount the home directory.

The server to serve each user’s home might look like this:

listen 'tcp!*!12345' {export /usr/^$user &}

Note that every user authenticating with auth1 will see only their own home directory, thanks to the block running with $user set to the appropriate user. (If you are curious about why home directories are in /usr in Inferno and Plan 9, it’s explained in an interesting thread on the Busybox mailing list: users used to live in /usr, but they got moved out for reasons that are no longer relevant.)

And the fragment of the login script might look like this:

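# Fetch a key for auth1 only if none is in the keyring yet.
or {ftest -f $home^/keyring/tcp!auth1} {getauthinfo tcp!auth1}
# Mount the exported home over $home, then run the profile that lives there.
mount -bc -k tcp!auth1 tcp!auth1!12345 $home
run $home^/lib/profile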

You could make sure the key isn’t accidentally saved on the terminal by mounting memfs(4)‌ on top of $home^/keyring. Since you’ll be sharing a home, you’ll already have access to the keys that you’ve saved on auth1, so once $home is mounted, you’ll have all of your keys accessible on the terminal.

Shared $home/$HOME

If you don’t want to have a host OS $HOME separate from your Inferno $home, this is simple, too. Assuming Inferno usernames and Unix usernames to be the same (which they need not be, necessarily):

bind -bc '#U*/home' /usr
listen 'tcp!*!12345' {export /usr/^$user &}

Now, any Inferno instance that you log into can see the same $home, which will also be the same as the $HOME you see when using the host operating system.

Never Email Yourself a Link Again

But since everything is a file, you can get more creative than those examples. When you are looking at a website, for example, and wish to look at it on another machine, there are a few solutions, none of them especially great: emailing the link to yourself, sending it to yourself via IM, pasting it into a screen session, ssh’ing and remotely opening the browser (if DBUS doesn’t try to outsmart you…don’t laugh, try firefox $some_url over ssh -X between two Linux hosts; it makes Firefox open the link on the client’s machine for some unholy reason), using VNC’s clipboard, xclip -o | ssh somewhere-else sh -c "'DISPLAY=:0 xclip'", or what have you.

The clipboard is called by the unfortunate name “snarf buffer” in Inferno and Plan 9. Naturally, it’s available as a file, and located in /chan/snarf, so cat /chan/snarf will get you the contents, and echo asdf >/chan/snarf will put “asdf\n” into the buffer. In hosted Inferno, it’s synchronized with the host OS’s clipboard as well.

Assuming all the hosts you might want to connect to have exported / with svc/styx or by some other means:

mount -bc {memfs -s} /n/snarfs
for host in `{ls -p $home^/keyring/* | grep -v '^default$'} {
    mount -k $host $host^!styx /n/^$host
    sb := /n/^$host/chan/snarf
    and {ftest -f $sb} {
        echo -n >/n/snarfs/^$host
        bind $sb /n/snarfs/^$host
    }
}

With that, you can now read and write snarf buffers at will, and since it syncs with the host OS’s clipboard, you can copy on one machine and paste on another. You can even do this automatically:

psbc = ''
while {} {
    sbc = "{cat /chan/snarf}
    or {cmp <{echo -n $sbc} <{echo -n $psbc} >/dev/null} {
        psbc = $sbc
        for sb in /n/snarfs/* {echo -n $sbc > $sb}
    }
    sleep 1
}

This will read the contents of the snarf buffer once a second, check it against the previous contents, and, if it has changed, write it to all of the remote buffers. So you have the clipboard/snarf buffer synchronized everywhere!
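For readers more at home outside the Inferno shell, the loop above boils down to "poll, compare, fan out". Here is a minimal Python sketch of one polling step, assuming each snarf buffer can be treated as an ordinary file path. The name `sync_once` and its parameters are invented for illustration and are not part of Inferno.

```python
from pathlib import Path
from typing import Sequence


def sync_once(local: Path, remotes: Sequence[Path], previous: str) -> str:
    """Run one polling step and return the contents to remember for next time.

    The remote files are rewritten only when the local contents differ from
    what was seen on the previous step, mirroring the cmp test in the shell loop.
    """
    current = local.read_text()
    if current != previous:
        for remote in remotes:
            remote.write_text(current)
    return current
```

Calling it in a loop with a one-second sleep, feeding each return value back in as `previous`, gives the same behaviour as the shell version.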

Of interest is the syntax used for cmp(1)‌ (which reads files and reports whether or not they differ). In the shell, <{command} redirects the block’s standard output to a new file descriptor, and then replaces the argument passed with /fd/N, where N is the number of the file descriptor.
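Unix hosts have a close cousin of this in /dev/fd, which makes the idea easy to poke at from outside Inferno. A minimal Python sketch, assuming a Unix-like host that exposes /dev/fd (Linux and macOS do). The helper names `path_for_output` and `read_via_path` are made up here, and the code only imitates what the Inferno shell arranges for you:

```python
import subprocess
import sys


def path_for_output(argv):
    """Start argv and return (process, path), where path names the read end
    of the pipe carrying the command's standard output."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    return proc, "/dev/fd/%d" % proc.stdout.fileno()


def read_via_path(argv):
    """Read a command's output by opening the path, the way a program that
    was handed a filename argument would."""
    proc, path = path_for_output(argv)
    with open(path, "rb") as f:   # the consumer only ever sees a filename
        data = f.read()
    proc.stdout.close()
    proc.wait()
    return data


if __name__ == "__main__":
    print(read_via_path([sys.executable, "-c", "print('asdf')"]))
```

The point is the same as in the shell: the consumer (cmp, in the example above) needs no special support, because all it receives is a name it can open.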

It’s left as an exercise for the reader to figure out how to make it synchronize the remote clipboards locally without running the code on all of them. (As I am having difficulty resisting a hint, recall that sh-std(1)‌ does a good job with lists, and changing the cmp line to loop over all of the snarf buffers would be simple.)

Distributed Processing

Now that you can see everything from anywhere (securely even!), let’s play with building a little compute cluster. We’re going to write a painfully slow algorithm to handle a (semi-)trivial math problem: Project Euler #53. Unless you’re a math hobbyist, you don’t need to worry too much about what we’re doing or what that problem even means, because the way to implement it is simple: we make a function c(n,r) = n!/(r!*(n-r)!), where r<=n, and we count how many pairs (n, r), with n between 1 and 100, give a value of c(n,r) that exceeds one million. The important parts are that it’s simple arithmetic and that the naive version of the algorithm is slow, easy to understand, and easy to parallelize.

The non-parallel version of the program

It’s very, very slow:

#!/dis/sh
load std mpexpr
 
subfn fac {
    (n s) := $1 1
    while {! ~ $n 0} {
        s = ${expr $s $n '*'}
        n = ${expr $n 1 -}
    }
    result = $s
}
 
subfn c {
    (n r) := $1 $2
    nf := ${fac $n}
    rf := ${fac $r}
    n_rf := ${fac ${expr $n $r -}}
    result = ${expr $nf $rf $n_rf '*' /}
}
 
subfn rs_for_n {
    n := $1
    t := 0
    for r in ${expr 1 $n seq} {
        c = ${c $n $r}
        and {~ ${expr $c 1000000 '>'} 1} { t = ${expr $t 1 +} }
    }
    result = $t
}
 
total = 0
 
for i in ${expr 23 100 seq} {
    total = ${expr ${rs_for_n $i} $total +}
}
echo $total

It takes the straightforward approach of looping from 23 to 100 for values of n (23 is the smallest n for which any c(n,r) exceeds a million, so nothing below it can contribute), and for each of those, it loops from 1 to n for values of r. For each of those pairs, it checks if c(n,r) is greater than a million, and increments a counter if so. At the end it spits out a total.
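If you want to sanity-check the arithmetic outside Inferno, the same brute-force loops fit in a few lines of Python. This is a sketch for comparison rather than a port of the script: it leans on `math.comb` (Python 3.8+) instead of computing three factorials, and `count_large` is a name invented here.

```python
from math import comb


def count_large(n_lo=23, n_hi=100, limit=1_000_000):
    """Count the (n, r) pairs with n_lo <= n <= n_hi and 1 <= r <= n
    for which c(n, r) = n!/(r!(n-r)!) exceeds limit."""
    total = 0
    for n in range(n_lo, n_hi + 1):
        for r in range(1, n + 1):
            if comb(n, r) > limit:
                total += 1
    return total
```

Calling `count_large(1, 22)` returns 0, which is why starting the outer loop at 23 loses nothing.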

As I said before, the main relevant parts of this are that it is slow and, in the words of Mike Muuss (requiescat in pace), “embarrassingly parallel”. (An aside: Mike Muuss invented ping. Maybe you have heard of ping. His story about it is great.)

I’ve saved it as 053.sh. Let’s try it out:

% time sh 053.sh >/tmp/the_answer_to_the_problem
0l 57.09r 57.09t

Ouch. Not at all fast. Mission accomplished.

The parallel version

The parallel version is split into two pieces: one script for the workers to run, and one to manage them. Workers pull an integer out of the queue, calculate how many values of r exceed 1,000,000 for that value of n, and send that back to the host. We could do this a number of ways. For example, we could use listen(1)‌ and pass numbers through there, but (via /n/client) the local namespace will be available to the remote CPU servers, so why not take advantage of that? One way we could do that is by mounting a memfs(4)‌ filesystem over it, and coming up with a small protocol for communicating using files (e.g., the worker makes a file named n.started, writes a value to it, and then moves it to n.finished).
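The file-based protocol mentioned in passing (write to n.started, then move it to n.finished) is simple to sketch. The following Python version is an illustration only, assuming a POSIX filesystem where a rename within one directory is atomic. The function names `publish_result` and `collect_results` are invented, and nothing here is specific to memfs or Inferno.

```python
import os
from pathlib import Path


def publish_result(workdir, n, value):
    """Write the result for job n, then rename the file so that readers
    never observe a half-written result."""
    started = Path(workdir) / ("%d.started" % n)
    finished = Path(workdir) / ("%d.finished" % n)
    started.write_text("%d\n" % value)
    os.rename(started, finished)   # atomic on POSIX within one filesystem
    return finished


def collect_results(workdir):
    """Sum every finished result; files still named *.started are ignored."""
    return sum(int(p.read_text()) for p in Path(workdir).glob("*.finished"))
```

Even this small version has to invent naming rules and a way to tell finished work from unfinished work, which is the bookkeeping a small protocol like this brings with it.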

But why do either of those, when we could be a little more explicit about our intentions, and save ourselves some code (and bugs!) while we’re at it? As it turns out, there’s a shell module called file2chan(1)‌ that gives us a basic building block for creating exactly that: synthetic files that call a shell block for reads and writes. The worker would be simple to write.

The worker script

It just needs to read jobs from the queue, and then write the answers back somewhere. We’ll have it take the two files as arguments:

#!/dis/sh -n
load std mpexpr

subfn fac {
    (n s) := $1 1
    while {! ~ $n 0} {
        s = ${expr $s $n '*'}
        n = ${expr $n 1 -}
    }
    result = $s
}    

subfn c {
    (n r) := $1 $2
    nf := ${fac $n}
    rf := ${fac $r}
    n_rf := ${fac ${expr $n $r -}}
    result = ${expr $nf $rf $n_rf '*' /}
}

subfn rs_for_n {
    n := $1
    t := 0
    for r in ${expr 1 $n seq} {
        c = ${c $n $r}
        and {~ ${expr $c 1000000 '>'} 1} { t = ${expr $t 1 +} }
    }
    result = $t
}

i = -1
fn nx {
    rescue '*' {echo '['^$h^'] Hit an error after' $i 'jobs.'; exit} {
        n = `{sed 1q < $qc}
        i = ${expr $i 1 +}
    }
}

(qc rc) = $1 $2
h = `{os hostname}
echo 'Started up on' $h^'!  Using queue' $qc 'and result' $rc^'.'
nx; while {! ~ $n done} {
    rs = ${rs_for_n $n}
    echo '['^$h^' #'^$i^']' $n '->' $rs
    echo $rs > $rc
    nx
}
echo '['^$h^']' $i 'jobs completed.'

The math from the beginning remains completely unchanged (in particular, we’re still using the same incredibly slow algorithm), and the new code starts with the fn nx line, which defines a function to fetch the next n from the file using sed(1) and maintain a count of the number of jobs completed. You might notice the rescue block, which is executed should we get a read error from a flaky net connection.

After that, it is mostly self-explanatory: we save the arguments for the queue channel in $qc and for the results channel in $rc, and run the calculations, pausing on occasion to spit out some diagnostics. (It does get noisy, and the echo‍s can be commented out if you like, but take care not to comment out the one that reports results back!) It stops when it reads “done” from the file.

Overall, the worker script is pretty boring. It reads jobs from a file and writes results back to another file. But the boringness was exactly the goal here: we don’t want to have to write something complicated just to split work up!

The master script

#!/dis/sh -n
load std mpexpr file2chan
 
hosts = $*
and {no $hosts} {
	echo 'You must supply at least one host!' >[1=2]; raise args
}
 
tocheck = ${expr 23 100 seq}
file2chan /tmp/053-queue {
    if {~ ${rget offset} 0} {
        (r tocheck) = $tocheck
        if {no $r} {rread done} {rread $r}
    } {rread ''}
} {}
 
count = $#tocheck
total = 0
file2chan /tmp/053-result {
    if {~ ${rget offset} 0} {echo $count $total | putrdata} {rread ''}
} {
    total = ${expr $total `{fetchwdata} +}
    count = ${expr $count 1 -}
}
 
hpids = ()
w = /n/client^`{pwd}^/053-worker.sh
qc = /n/client/tmp/053-queue
rc = /n/client/tmp/053-result
for host in $hosts {
    cpu $host sh $w $qc $rc &
    hpids = $hpids $apid
}
 
echo 'Waiting for results from' $hpids
while {! ~ $count 0} {
    (count result) = `{cat /tmp/053-result}
    sleep 1
}
echo $result

Now this one is where the interesting bits live. First off, we will need to load the file2chan(1)‌ module along with the others. Next, the argument list is read so that we know which hosts to send the work to.

We use the ${expr} builtin to get a list of the numbers from 23 to 100, and then we create the first channel: the job queue. file2chan takes the file to add to the namespace followed by two blocks: one to execute for reads and one for writes. Inside the read block, we check if we’re at offset zero, and we send the head of the list back across the wire if so, and return an empty read if not. When we’re out of numbers, we send “done”. The write block is empty; conceivably you could allow jobs to be put back into the queue in a straightforward manner. So, when a process attempts to read that file, it’ll get a number while we have numbers to give out. Simple enough. (Hopefully!)
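The offset check is easy to skim past. A reader such as cat(1) keeps issuing reads until one comes back empty, so the block answers the first read of each open (offset zero) with a job and every later read with nothing, which the reader takes as end-of-file. A rough Python model of that behaviour follows, with `JobQueue` and `read_whole_file` as invented stand-ins. file2chan itself is not involved here.

```python
class JobQueue:
    """Model of the read block: one job per read at offset 0."""

    def __init__(self, jobs):
        self.jobs = list(jobs)

    def read(self, offset):
        if offset != 0:
            return ""              # empty read: the client sees end-of-file
        if not self.jobs:
            return "done"          # queue exhausted: tell the worker to stop
        return str(self.jobs.pop(0))


def read_whole_file(queue):
    """Model of a client that opens the file and reads until an empty read."""
    offset, chunks = 0, []
    while True:
        data = queue.read(offset)
        if data == "":
            return "".join(chunks)
        chunks.append(data)
        offset += len(data)
```

Each call to `read_whole_file` stands for one open-read-close cycle by a worker, and each cycle consumes exactly one job. Without the offset test, a single cat would drain the whole queue.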

Next we set up a total at zero and then a simple counter that starts with the size of the list. The read block simply returns the values for $count and $total. The write block updates the values based on the data written. Still simple, right?

There’s no real requirement that we send data when a process attempts to read the channel or read the data when one attempts to write it. The blocks are regular shell blocks, and can do anything (although your programs will likely make much more sense if they don’t do just anything).

After that, we fill some values into some variables, and we spawn one cpu(1)‌ for each host passed in. We tell cpu to run the script and provide the queue and result channels, and then keep the pids around.

Finally, we just cat the result channel until all the jobs have finished, and then we spit out the result.
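To see the whole arrangement in one place without a cluster, here is a small in-process Python simulation: threads stand in for the CPU servers, and two locked methods stand in for the queue and result files. All names (`Master`, `next_job`, `report`, `run`) are invented for this sketch, and `math.comb` replaces the slow factorial code, so it shows the shape of the protocol rather than its performance.

```python
import threading
from math import comb


class Master:
    """Holds the job list, the running total and the outstanding-job count."""

    def __init__(self, jobs):
        self.lock = threading.Lock()
        self.jobs = list(jobs)
        self.count = len(self.jobs)   # results still outstanding
        self.total = 0

    def next_job(self):
        """Queue read: hand out the head of the list, or 'done' when empty."""
        with self.lock:
            return self.jobs.pop(0) if self.jobs else "done"

    def report(self, value):
        """Result write: add to the total and tick off one job."""
        with self.lock:
            self.total += value
            self.count -= 1


def rs_for_n(n, limit=1_000_000):
    """How many r in 1..n give c(n, r) greater than limit."""
    return sum(1 for r in range(1, n + 1) if comb(n, r) > limit)


def worker(master):
    """Worker loop: fetch, compute, report, until told we are done."""
    n = master.next_job()
    while n != "done":
        master.report(rs_for_n(n))
        n = master.next_job()


def run(n_workers=3, jobs=range(23, 101)):
    master = Master(jobs)
    threads = [threading.Thread(target=worker, args=(master,))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    assert master.count == 0          # every job was reported exactly once
    return master.total
```

The simulation joins its threads where the real master polls the result file once a second, but the division of labour is the same: workers never talk to each other, only to the two shared endpoints.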

Let’s run it!

I’m running the program on three machines, including localhost. (You’ll need to use getauthinfo and start up rstyxd as described above for each host you plan to use.)

Let’s see how it looks:

; time sh 053-master.sh localhost tcp!cpu1 tcp!cpu2
[So much--SO MUCH--output elided!]
0.001l 27.35r 27.35t

Not too shabby for a house with flaky wifi! You can run any number of CPU servers on the same machine (it may be expedient to run one per core), but you’ll have to specify the ports as well (e.g., tcp!cpu1!9001, tcp!cpu1!9002, etc.)—and remember them! The registry(4)‌ and suite of grid computing tools handle this problem. They’ll be covered in a future post.

“Pete, are you going to write love poems about Inferno for your next entry, or what?”

This entry being both long and overdue, it’ll have to stop here. The building blocks of distributed computing are simple enough to use in Inferno that you should be able to start playing with them right away. If you look over the registry and grid documentation mentioned above, you can get an idea of how a framework built with these blocks looks.

There are two entries in the works: the first is a guide to using the grid tools that come with Inferno with examples (I have a couple of applications in mind, somewhat more practical than Project Euler solutions), and the second covers more of the shell, including GUIs. I have a few ARM systems (a couple of small dev boards, a netbook, some mini desktops; they’ve all gotten very cheap) and will probably have a few things to say about running Inferno on those. Intermittently, I’ll probably be posting small but interesting hacks.

I blog like UDP packets, so these entries are not guaranteed to arrive in order or at all.

Meanwhile, elsewhere on the web…

I’ve also put this entry on Hacker News.

Alex Efros has produced a translation into Russian for Part 0. (I’ve put a link to the translation in the original article.) Thanks, Alex!

Update (2012-06-22)

Corrected a typo, /lib/sh/profile is the location of the profile script that runs for shells started with -l.

Tags: code inferno

