Open Testware Reviews

Scripting Language Survey

Copyright 2004 by Tejas Software Consulting - All rights reserved.

Reviewed: 2004-January-7

Scripting languages are well-suited for many testing tasks. They are used to "glue" different programs together that need a little help integrating. They are also used to rapidly develop test harnesses, test data generators, and test scripts. In this survey I attempt to characterize what a scripting language is, and I present what I think are the languages and shells that you most need to be aware of.
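As a rough illustration of the "glue" and test data generator roles described above, here is a minimal sketch in Python, one of the languages surveyed below. The function names and the "program under test" are hypothetical; the stand-in program is just a second Python process that sorts the numbers it reads from standard input.

```python
import random
import subprocess
import sys


def generate_test_lines(count, seed=0):
    """Return `count` pseudo-random integers as strings; a fixed seed keeps runs repeatable."""
    rng = random.Random(seed)
    return [str(rng.randint(-1000, 1000)) for _ in range(count)]


def run_program_under_test(lines):
    """Feed the lines to an external program on stdin and return its output as a list.

    The program here is a hypothetical stand-in: a child Python process
    that sorts its input numerically and prints one value per line.
    """
    stand_in = "import sys; print('\\n'.join(sorted(sys.stdin.read().split(), key=int)))"
    result = subprocess.run(
        [sys.executable, "-c", stand_in],
        input="\n".join(lines),
        capture_output=True,
        text=True,
        check=True,  # raise if the external program exits with an error
    )
    return result.stdout.split()


# Tiny harness: generate data, run the program, check the result.
data = generate_test_lines(20)
output = run_program_under_test(data)
assert output == sorted(data, key=int), "program under test mis-sorted its input"
print("checked", len(output), "values")
```

In a real harness the stand-in command would be replaced by the program being tested, and the check would compare against whatever that program is supposed to produce.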

There are a few scripting languages that are bundled with certain operating systems, and thus widely accessible, but do not qualify as freeware. So popular languages like VBScript and AppleScript are not included here. All of the languages listed here may be used on a multitude of operating systems.

About scripting languages

What is a scripting language? A precise definition is hard to pin down, but I'll give it a try. Scripting languages allow you to accomplish simple tasks with a minimum amount of code. You can usually write a program that prints "Hello, world" to a terminal with a single line of code in a scripting language, while it generally takes three to ten times as much code in a language like C, C++, or Java, and you have to know much more about the language in order to complete your first simple program. Part of the reason for this is the fact that scripting languages tend to be dynamically typed, and often weakly typed as well, meaning you don't have to work as hard to declare and match up your data types. Also, script code generally doesn't have to be wrapped inside a "main" function, so that reduces some superfluous syntax.
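For example, in Python the complete "Hello, world" program is the first line of the sketch below. The rest shows what the absence of type declarations looks like in practice; the `describe` helper is a hypothetical name used only for illustration.

```python
print("Hello, world")  # the whole program: no "main" wrapper, no declarations


def describe(value):
    """Return the run-time type name of a value; the parameter has no declared type."""
    return type(value).__name__


item = 42             # no declaration needed
print(describe(item))  # int
item = "forty-two"    # the same name can later hold a different type
print(describe(item))  # str
```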

Scripting languages are usually interpreted rather than compiled, though it's not unusual to find a compiler for a scripting language or an interpreter for a language that is usually compiled. To further confuse things, some languages, whether scripting languages or otherwise, are compiled into an intermediate byte code, and then the byte code is interpreted. This includes the Perl scripting language and Java, a compiled language. However, Perl is almost always compiled "on the fly," while Java is compiled before the program is run. When using a compiled language, I've been known to create a shell alias that compiles and runs a program with a single command. So for small programs, the interpreted vs. compiled debate is moot, but for larger programs that require significant compile time, a pure interpreter will start up faster, shortening the edit-compile-debug cycle significantly.
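Python, in its standard implementation, follows the same on-the-fly pattern described above for Perl: source is compiled to byte code when the program starts, and the byte code is then interpreted. The sketch below makes the two steps visible using the built-in `compile` and `exec` functions; the variable names are illustrative only.

```python
source = "total = sum(range(10))"

# Step 1: compile the source text into a code object holding byte code.
code_object = compile(source, "<example>", "exec")
print(type(code_object.co_code).__name__)  # bytes

# Step 2: have the interpreter execute the byte code.
namespace = {}
exec(code_object, namespace)
print(namespace["total"])  # 45
```

Normally both steps happen invisibly when you run a script, which is why the language feels purely interpreted.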

Scripting languages are also likely to have features that make them dynamic. For example, this might mean being able to generate code at run time and execute the new code within the program that generated it. Also, it's easy to make variables spring into existence when needed, and in the more sophisticated scripting languages, they can easily grow into large dynamically allocated data structures.
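Both dynamic features mentioned above can be sketched in a few lines of Python. The generated `square` function and the word-count data are made up for illustration.

```python
# Generate source code at run time and execute it within the running program.
source = "def square(n):\n    return n * n\n"
namespace = {}
exec(source, namespace)
square = namespace["square"]
print(square(7))  # 49

# Variables come into existence on first assignment, and a data structure
# grows as needed without any up-front sizing or declaration.
word_counts = {}
for word in "to be or not to be".split():
    word_counts[word] = word_counts.get(word, 0) + 1
print(word_counts)  # {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```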

With some expert-level programming, scripting languages can be embedded into other applications in order to give them some programmability without having to invent a new language. Also, you can usually link code written in other languages into a scripting language.

One of Perl's mantras is telling: "Perl makes easy things easy, and hard things possible." Whether you'd want to tackle some of the larger programming challenges with a scripting language is a matter for debate. Some scripting languages have optional features that make them safer for programming "in the large," but compiled languages are still the best choice for the most ambitious projects, if for no other reason than performance.

Here are some of the types of resources you may be able to find for a scripting language -
For more information, see Hyperdictionary's definition of scripting language and Ousterhout's dichotomy, and the Wikipedia entry on scripting programming languages, all of which give somewhat different perspectives than the one I give here. My article The Shell Game gives more background about the landscape of shell scripts in 1996, which hasn't changed considerably since then.

About the matrices

There are two matrices here, one for general-purpose scripting languages, and one for shells. All of these languages are available with an open source license. Many are also available from commercial vendors. There are often additional distributions of a language, especially when they have been ported to a platform that isn't directly supported by the main source base. I did not attempt to catalog all the distributions of each language. In fact, there sometimes isn't any central authority for a language. The URLs I give in the links below are the best starting points for learning more.

The first language that we could label as a scripting language was JCL, released as part of OS/360 in 1964. I did find a port of JCL to a shell-like beast on Unix, but I'll spare you the potential agony by not telling you where I found it. The list of general-purpose scripting languages below is a subset of a large field of such languages. The ones I chose to list are here because they're popular, or because they represent a particular class of languages. The range of platforms supported by these languages varies from "most systems that you're likely to use" to "almost every system under the sun."

The language that seems to garner the most respect is my favorite language, Perl. Two newer languages that also are generating quite a bit of buzz are Python and Ruby. There are seasoned experts who have written about Python or Ruby, giving good reasons why one or the other is a better choice than Perl. Given no constraints, these three would be my top choices when choosing a scripting language. Honorable mention goes to tcl and PHP.

Shells, the command-line portals into your computer, also give us a basis for script writing. Sometimes, because of constraints on whether all the computers that run your code will have a particular language interpreter installed, or simply because of storage space constraints on embedded systems, the shell offers your only scripting option. Shells give you the basic syntax for a language, but only a few of the commands that you would invoke from your scripts. Most of the commands you would use are each separate programs installed on your computer. Some older shells can't even evaluate simple math expressions without running a separate program. Because of variations in how these external programs work, shell scripts do not tend to be as portable as scripts written in a general-purpose language. Shell programming can be a bit of a challenge because of esoteric syntax, having to carefully manage the interface between the shell and the utilities you call, and dealing with non-uniform interfaces of the utilities you use. Note that it's common to call other interpreters such as awk to aid in the operation of a shell script.
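To make the point about external programs concrete: an older Bourne shell script would add two numbers with something like sum=`expr 2 + 3`, running the separate `expr` utility. The sketch below performs the same delegation from Python (used here only to keep the examples in one language) and shows the kind of interface detail the script author has to manage. The `external_add` function is a hypothetical name, and the sketch assumes a Unix-like system; it returns None where `expr` is not installed.

```python
import shutil
import subprocess


def external_add(a, b):
    """Add two non-negative integers by running the external `expr` utility.

    Returns None if `expr` is not available on this system.
    """
    if shutil.which("expr") is None:
        return None
    result = subprocess.run(
        ["expr", str(a), "+", str(b)],
        capture_output=True,
        text=True,
    )
    # Interface quirk: POSIX expr exits with status 1 when the result is 0,
    # so only statuses 2 and above indicate a real error.
    if result.returncode not in (0, 1):
        raise RuntimeError(result.stderr.strip())
    return int(result.stdout.strip())


print(external_add(2, 3))
```

A script that naively treats every nonzero exit status as a failure would break on `expr 0 + 0`, which is typical of the non-uniform interfaces mentioned above.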

I used to be a tcsh devotee for command-line use, though I used the Bourne shell for shell programming when I wasn't using Perl. Then I installed Linux and found that the default shell was bash. It behaves just like tcsh for all the important command line features I used, and it's also useful for programming since it's compatible with the Bourne shell. So I tend to use the bash shell lately. I've had occasion to use all the shells on the list at one time or another.

General-purpose scripting languages

Tool
Notes
awk
A venerable and fairly simple language with many implementations and a POSIX standard. Commercial Unix variants ship with closed-source versions of awk. GNU's gawk is a common open source implementation.
Expect
Tcl with extensions for testing interactive command-line interfaces. Expectk also incorporates the Tk graphics library.
Guile
An implementation of Scheme designed for ease of use as a scripting language. Some people consider LISP and its cousin Scheme to be scripting languages, because they exhibit many of the common attributes of a scripting language.
Javascript
Usually embedded within a web browser, and rarely used for standalone scripting. Now governed by the ECMAScript standard, but still generally called Javascript. Microsoft calls their implementation JScript. Not closely related to Java.
Perl
Very popular language for web CGI scripts, text processing, system administration, and many other tasks. Starting to lose mindshare to newer languages like Python and Ruby.
PHP
A project of the Apache Software Foundation, usually used for server-side web scripting.
Python
A relative newcomer that appeared after Perl and has been winning over some Perl programmers because they claim that the Python code they write is more understandable.
Rexx
An ANSI-standard procedural language that was also designed to be a general-purpose macro language for applications. Seems to get most of its support from users of mainframes, OS/2, and AmigaDOS.
Ruby
An even newer newcomer that's winning over some Perl and Python programmers. Still not nearly as well-known as Python and Perl. Seems to have strong support in the agile development community.
tcl
General-purpose language originally designed to be embedded into applications. Uses a fairly simplistic interpreter and a rigid syntax. Has a noticeable following within a subset of the testing community.

Shells

Shell name
Notes
ash (Almquist shell)
A Bourne-compatible shell designed to have a very small footprint. I haven't found an authoritative web page for it.
bash (Bourne Again Shell)
The shell of choice for the GNU project and very popular on Linux and many Unix variants. Based on the Bourne shell.
busybox
Contains an implementation of a simplistic shell and compact versions of dozens of the commands that are commonly called from the shell. Intended for use on space-constrained systems.
csh (C shell)
Developed shortly after the Bourne shell, and included improvements over the Bourne shell, especially in its command-line usage. No longer commonly used for scripting because of the widely distributed essay "Csh Programming Considered Harmful." Often now implemented within the same executable as tcsh and invoked as a link to tcsh.
ksh (KornShell)
Incorporates some of the C shell's enhancements into the more solid programming base of the Bourne shell. Commonly used for programming today.
rc
Originally written for the Plan 9 operating system. Defines a category of its own. The little-known es shell is based on rc.
sh (Bourne shell)
One of the oldest Unix shells, and the antecedent of many other shells. Newer versions of sh are POSIX compliant.
tcsh (TENEX C shell)
An enhancement of csh, and now used much more than csh.
zsh
Yet another shell alternative, drawing features from bash, ksh, and tcsh. Part of the Bourne shell family.