KSH93 Custom Builtins 1

The majority of GNU/Linux and UNIX shells are not designed for extensibility or embeddability. The current exception is the 1993 version of the Korn Shell (ksh93), which supports runtime linking of libraries, custom builtins, and access to shell internals.

It is very difficult, however, to find good information or examples of how to implement ksh93 custom builtins. The source code to ksh93 has virtually no comments and the supplied documentation is extremely terse and often conflicts with other sections of the documentation or the source code itself.

This post is an attempt to show by example how to write your own simple ksh93 custom builtins. You are expected to be reasonably proficient in the C language and the use of the gcc compiler/linker. However, before we start, it is important to note that custom builtins can only be implemented on operating systems that support dynamic loading of shared objects into the current running process since, internally, a custom builtin is invoked as a C routine by ksh93. Fortunately most modern operating systems provide this feature via the dlopen(), dlsym(), dlerror() and dlclose() APIs.
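To make the dynamic loading requirement concrete, here is a minimal sketch of how a host program might open a shared library and look up a b_xxxx function with these APIs. It is not taken from the ksh93 sources; the names load_builtin and builtin_fn are hypothetical and exist only for this illustration.

```c
/* load_builtin.c: illustrative only, not how ksh93 itself is written. */
#include <dlfcn.h>
#include <stdio.h>

/* Same three-argument shape as the custom builtins in this post. */
typedef int (*builtin_fn)(int argc, char *argv[], void *extra);

/*
 * Open the shared library at libpath and look up the symbol "b_<name>".
 * Returns the function pointer, or NULL on failure. On success,
 * *handle_out holds the library handle, which the caller eventually
 * releases with dlclose().
 */
builtin_fn
load_builtin(const char *libpath, const char *name, void **handle_out)
{
   char symbol[256];
   builtin_fn fn = NULL;
   void *handle;
   const char *err;
   int n;

   *handle_out = NULL;

   /* Build the symbol name, e.g. "hello" becomes "b_hello". */
   n = snprintf(symbol, sizeof(symbol), "b_%s", name);
   if (n < 0 || (size_t)n >= sizeof(symbol))
      return NULL;                      /* name too long for the buffer */

   handle = dlopen(libpath, RTLD_NOW);
   if (handle == NULL) {
      err = dlerror();
      fprintf(stderr, "dlopen: %s\n", err ? err : "unknown error");
      return NULL;
   }

   /* POSIX idiom for storing a dlsym() result in a function pointer. */
   *(void **)(&fn) = dlsym(handle, symbol);
   if (fn == NULL) {
      err = dlerror();
      fprintf(stderr, "dlsym: %s\n", err ? err : "symbol not found");
      dlclose(handle);
      return NULL;
   }

   *handle_out = handle;
   return fn;
}
```

A caller would invoke the returned pointer like any other C function, which is why no new process is needed. On some systems this file must be linked with -ldl.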

Why bother implementing ksh93 custom builtins? The answer lies in the fact that custom builtins are inherently much faster and require fewer system resources than an equivalent routine which uses standalone commands and utilities. A custom builtin executes in the same process as the shell, i.e. it does not create a separate sub-process using fork() and exec(). Thus a significant improvement in performance can occur since the process creation overhead is eliminated. The author of ksh93, Dave Korn, reported that on SunOS 4.1 the time to run wc on a file of about 1000 bytes was about 50 times less when using the ksh93 wc built-in command.

There are two ways to create and install ksh93 custom builtins. In both cases, the custom builtin is loaded into ksh93 using the ksh93 builtin command. Which method you use is entirely up to you. The easiest way is to write a shared library containing one or more functions whose names are b_xxxx where xxxx is the name of the custom builtin. The function b_xxxx takes three arguments. The first two are the same as for the main() function in a C program. The third argument is a pointer to the current shell context. The second way is to write a shared library containing a function named lib_init(). This function is called with an argument of 0 when the shared library is loaded. This function can add custom builtins with the sh_addbuiltin() function.

I believe that the best way to learn about a new feature is to actually write code which uses the new feature. Following are two relatively simple examples which demonstrate the basics of how to write custom builtins. These examples were written and tested using ksh93 version M 93s+ 2008-01-31 and CentOS 5.0 but should compile and work on any modern UNIX or GNU/Linux operating system.

Example 1  Write a simple custom builtin called hello which takes one argument and outputs “Hello there” followed by that argument to stdout.
/* hello.c */
#include <stdio.h>
int
b_hello(int argc, char *argv[], void *extra)
{
   if (argc != 2) {
      fprintf(stderr,"Usage: hello arg\n");
      return(2);
   }

   printf("Hello there %s\n",argv[1]);
   return(0);
}
Next compile hello.c and create a shared library libhello.so containing the hello builtin.
$ gcc -fPIC -g -c hello.c
$ gcc -shared -Wl,-soname,libhello.so -o libhello.so hello.o
Some operating systems (Solaris Intel for example) do not require you to build a shared library and support the direct loading of hello.o. However the majority of operating systems require you to create a shared library as we have done for this example. Note the use of the -fPIC flag to indicate position independent code should be produced. Unlike relocatable code, position independent code can be copied to any memory location without modification and executed.

To actually use the hello custom builtin, you must make it available to ksh93 using the ksh93 builtin command.
$ builtin -f ./libhello.so hello
If you are unfamiliar with the builtin command, you can type builtin --man or builtin --help for more information or read the ksh93 man page.

You can then use the hello custom builtin just like you would use any other command or shell feature:
$ hello joe
Hello there joe
$ hello "joe smith"
Hello there joe smith
$ hello
Usage: hello arg
$
Note that the hello custom builtin will show up when you list builtins using the builtin command.
$ builtin
....
hello
....
but not when you list special builtins using the builtin -s option.

To remove the hello builtin, use the builtin -d option.
$ builtin -d hello
$ hello joe
/bin/ksh93: hello: not found [No such file or directory]
$
Removing a custom builtin does not necessarily release the associated shared library.

Internally, hello is implemented by the C function b_hello(), which takes three arguments. As previously discussed, the names of functions implementing custom builtins are generally required to start with “b_”. (There is an exception, which will be discussed in a later example.) The arguments argc and argv behave just as they do in a main() function. The third argument is the current context of ksh93; it is generally not used because another mechanism, sh_getinterp(), is provided to access the current context.

Instead of exit(), use return to terminate a custom builtin. Because the builtin runs in the shell's own process, calling exit() would terminate the shell itself. The return value becomes the exit status of the builtin and can be queried using $?. A return value of 0 indicates success and a value greater than 0 indicates failure. If you allocate any resources such as memory, you must free all of them before the custom builtin returns, since the shell process keeps running afterwards.
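As a minimal sketch of these rules, the following builtin allocates a buffer, reports failures through its return value, and frees the buffer before returning. The builtin name rev and its behavior (printing its argument reversed) are hypothetical and chosen only for illustration.

```c
/* rev.c: hypothetical builtin that prints its argument reversed. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
b_rev(int argc, char *argv[], void *extra)
{
   size_t len, i;
   char *buf;

   (void)extra;                  /* the shell context is not needed here */

   if (argc != 2) {
      fprintf(stderr, "Usage: rev arg\n");
      return 2;                  /* usage error; $? will be 2 */
   }

   len = strlen(argv[1]);
   buf = malloc(len + 1);
   if (buf == NULL) {
      fprintf(stderr, "rev: out of memory\n");
      return 1;                  /* failure; nothing has been allocated */
   }

   /* Copy the argument into buf back to front. */
   for (i = 0; i < len; i++)
      buf[i] = argv[1][len - 1 - i];
   buf[len] = '\0';

   printf("%s\n", buf);
   fflush(stdout);               /* do not leave output buffered in the shell process */

   free(buf);                    /* release memory before returning to the shell */
   return 0;                     /* success; $? will be 0 */
}
```

Every path out of the function uses return, and the only allocation is released on the one path where it exists. A leak here would accumulate each time the builtin is invoked, because the process that owns the memory is the shell.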

Custom builtins can call functions from the standard C library, the AST (Advanced Software Technology) libast library, interface functions provided by ksh93, and your own C libraries. You should avoid using any global symbols beginning with sh_, nv_, ed_, or BSH_, since these are reserved for use by ksh93 itself.
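For instance, a builtin can rely on the standard C library to validate its arguments. The sketch below uses strtol() to add up integer arguments; the builtin name sum is hypothetical, and the code uses only standard C library calls, nothing from libast or the ksh93 interfaces.

```c
/* sum.c: hypothetical builtin that adds its integer arguments. */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

int
b_sum(int argc, char *argv[], void *extra)
{
   long total = 0;
   int i;

   (void)extra;                  /* the shell context is not needed here */

   if (argc < 2) {
      fprintf(stderr, "Usage: sum number...\n");
      return 2;
   }

   for (i = 1; i < argc; i++) {
      char *end;
      long value;

      errno = 0;
      value = strtol(argv[i], &end, 10);

      /* Reject out-of-range values, empty strings, and trailing junk. */
      if (errno != 0 || end == argv[i] || *end != '\0') {
         fprintf(stderr, "sum: invalid number: %s\n", argv[i]);
         return 1;
      }

      total += value;            /* overflow of the total is not checked in this sketch */
   }

   printf("%ld\n", total);
   fflush(stdout);               /* do not leave output buffered in the shell process */
   return 0;
}
```

It would be built and loaded the same way as hello, with the library and builtin names changed accordingly.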

If you move libhello.so to where the shared libraries normally reside for your particular operating system, typically /usr/lib, you can load the hello custom builtin as follows
$ builtin -f hello hello
as ksh93 automatically adds a lib prefix and .so suffix to the name of the library specified using the builtin -f option.

It is often desirable to automatically load a custom builtin the first time that it is referenced. For example, the first time the custom builtin hello is invoked, ksh93 should load and execute it, whereas for subsequent invocations ksh93 should just execute the hello custom builtin. This can be done by creating a file named hello as follows:
function hello
{
   unset -f hello
   builtin -f /home/joe/libhello.so hello
   hello "$@"
}
This file must be placed in a directory that is listed in your FPATH environment variable. In addition, the full pathname of the shared library containing the hello custom builtin should be specified so that the run-time loader can find this shared library no matter where hello is invoked.

There are alternative ways of locating and invoking builtins using a .paths file. See the ksh93 man page for further information.

Example 2  Uppercase the first character of a string.
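#include <stdio.h>
#include <ctype.h>

int
b_firstcap(int argc, char *argv[], void *extra)
{
   const char *s;

   if (argc != 2) {
      fprintf(stderr,"Usage: firstcap arg\n");
      return(2);
   }

   s = argv[1];

   /* An empty argument has no first character; just print an empty line. */
   if (*s == '\0') {
      printf("\n");
      return(0);
   }

   /* Cast to unsigned char: toupper() is undefined for negative values. */
   printf("%c%s\n", toupper((unsigned char)*s), s + 1);

   return(0);
}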
Assuming you created a library called libfirstcap.so and placed this library in the default directory for shared libraries you can load and use this custom builtin as follows.
$ builtin -f firstcap firstcap
$ firstcap joe
Joe
$ firstcap united
United
$
Custom builtins can be used to extend ksh93 in many useful ways, just as Perl modules are used to extend Perl and Python modules are used to extend Python. To date this has not happened with ksh93. I believe that this is mainly due to the lack of good documentation on how to write custom builtins.

This post is but a brief introduction to the subject of ksh93 custom builtins. To really learn how to write custom builtins, you should download the ksh93 sources and study them. Also read "Guidelines for writing ksh-93 built-in commands" (builtins.mm), which is located in the top-level directory of the ksh93 source tree.
