ECS 15AT Lecture Notes 

Lecture 15: Programming Languages

 

In this class you have been studying an interpretive high level programming language (MUMPS), learning about basic concepts of programming. Although your introduction to programming is brief, you have had experience in each of the major types of program components, execution flow control, and program design.

 

In many ways, MUMPS is a unique language, since few others are interpretive and no other high-level language has a built-in database feature (global variables in MUMPS) and shared variables between multiple users. The characteristics of several other languages are discussed in this lecture.

 

High-level programming languages can be divided into three major types. The first type consists of sequential programming languages. These languages (of which MUMPS is one) by default execute instructions in sequential order, using execution flow control to branch from normal sequential mode. This was the first type of programming language to be used with modern computers, and sequential languages still dominate the field, although two other types are becoming more widely used.
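
To make the idea of sequential execution with flow control concrete, here is a minimal sketch. It is written in Python rather than MUMPS purely for illustration, and the function name `classify` and the sample readings are invented for this example. Statements run from top to bottom, and the `if` and `continue` statements are the only places where execution leaves the normal sequence.

```python
def classify(readings, limit):
    """Run statements top to bottom, branching only where flow control says so."""
    log = []
    for value in readings:          # loop: repeat the same sequence of steps
        if value > limit:           # branch: leave the default straight-line path
            log.append(f"{value}: over limit")
            continue                # skip the rest of this pass through the loop
        log.append(f"{value}: ok")  # reached only when the branch was not taken
    return log


# Hypothetical sample data, chosen only to exercise both paths.
trace = classify([3, 12, 7], limit=10)
```

Reading `trace` afterwards shows the order in which the statements actually ran, one entry per reading.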

 

The second type of programming language takes advantage of new hardware technology, which makes it possible to have many processing units in a single computer system. This has led to parallel processing languages. These programming languages identify segments of code that can be executed independently of each other in parallel, and they assign such tasks to different processors in a parallel processor system. We will not go into these languages in this review, but you should know that they are beginning to appear.

 

The third type of programming language uses a completely different concept: "objects," program components that are independent and communicate by messages. A message invokes an action in the object that receives it, and that action may lead to other messages. Although each object performs its tasks sequentially, there is no assumption that the messages between objects are synchronized, so there is no overall sequential program in this approach. Although we will not discuss these languages further, you should know that two well-known representatives of this approach are C++, an extension of C that accommodates objects, and Smalltalk, the language that gave this group of languages its big start (developed by Adele Goldberg and Alan Kay at the Xerox Palo Alto Research Center, Xerox PARC).

 

Examples of these three approaches to programming are briefly summarized below.

 

Sequential High-Level Languages

Since sequential programs were the first kind used and have dominated the industry for over forty years, they are by far the most common form of high-level programming language. One of the first such languages was developed at the end of the 1950s and given the name COBOL, for COmmon Business-Oriented Language. As its name implies, COBOL was used in business, especially database types of applications, where it was the dominant language for many years, and in fact it probably still is the most widely used. A great many applications in business are written in COBOL, and the cost of converting those programs to newer techniques is extremely high. As a result, there is still a need for programmers who understand how to program in COBOL, even though the need continues to decline and may become insignificant by the turn of the century. COBOL is wordy, designed to run on large mainframes, and too large to run effectively on small PCs, though it probably would run on the newer versions of PCs. It is gradually becoming obsolete, principally because the areas in which it has been most useful are now being served by special purpose application packages (especially Database Management Systems). COBOL is not taught at Davis, although it is still offered in some business-oriented programs (CSU Sacramento offers courses in COBOL, but only in the business school, not in the Computer Science Department).

 

Another major programming language to gain prominence was FORTRAN, which first appeared in 1957 and became established in the 1960s as the language for scientists. FORTRAN stands for FORmula TRANslation, and it is especially well suited to numeric applications. FORTRAN is taught in the College of Engineering at Davis (E5 is a FORTRAN course), and many of the engineering faculty use it heavily. It has been updated a couple of times, the most recent standard being FORTRAN 77 (approved as an ANSI standard in 1978), and it continues to have many good features for problems requiring numeric calculations. It is weak in handling text, and it is more commonly used on large mainframes and minicomputers than on personal computers, partly for historical reasons and partly because personal computer programmers have moved on to another language (C, described next).

 

The C programming language was originally developed at Bell Laboratories in New Jersey as a part of their minicomputer environment (it ran under Unix, and in fact most of the Unix operating system was rewritten in C early in its history). C is rapidly becoming the most widely used computer language for general purpose programming in the world today. It is more compact than FORTRAN and allows more hardware-specific actions, the kind required to get into details of operating system interfaces, image manipulation, communication protocols and many other applications that FORTRAN programs traditionally handle by calling Assembly language subroutines. C is the language preferred by almost all computer scientists and taught more than any other in computer science programs. It is much more difficult to learn than FORTRAN or MUMPS, but good programmers usually say that it is the most effective language for a large variety of applications. One of the difficulties with C is that it relies heavily on libraries of built-in functions; these libraries differ markedly from one operating system to another, so C programs lose a great deal of their portability as a result. However, anyone who plans to go into programming today should plan on learning C.

 

Pascal was designed by a computer scientist who wanted to develop a language specifically to teach about computers and computer programming. It does that job well, but it was never intended to be used for serious programming, although it is sometimes used in that way. Computer Science classes ECS 10 and ECS 30 use Pascal as an introductory programming language. Pascal is in many respects similar to C (with some annoying differences if you learn Pascal before C), but it lacks some of the more important features that make C such a useful language.

 

There are other sequential languages that are important. Ada, developed by the U.S. Department of Defense, is a newer language that was designed to serve all proposed needs of the military. As such, it grew to be quite cumbersome, and the computing community does not appear to have accepted Ada as a major language, despite a considerable amount of support given it by the D.O.D.

 

There are two languages frequently used in the world of Artificial Intelligence (AI): LISP and PROLOG. Students in some sections of ECS 15 will learn a version of LISP called Scheme, which has some features much like MUMPS. LISP stands for LISt Processing, and it is intended to work with symbols and lists rather than numbers. I do not know enough about it at this point to summarize differences between LISP (or Scheme) and MUMPS, but the basic structure of data in Scheme is a list, whereas in MUMPS one deals with strings and hierarchical subscripted nodes. Scheme has fewer built-in text manipulation functions of the kind MUMPS provides in $Piece and $Extract, but it is more readily self-extending than MUMPS, which makes it flexible but also less portable from one system to another. PROLOG is also used in AI applications, but its features are very different from those of MUMPS (there is a version of MUMPS which has been extended to include PROLOG-like elements, but this is not a standard system at present). You are unlikely to come into contact with PROLOG unless you go further into the area of AI, so we will not describe it here.
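
The contrast between string-based and list-based data can be sketched as follows. This is an illustration in Python, not MUMPS or Scheme code: the helper functions `piece` and `extract` are hypothetical approximations of what $Piece and $Extract do (1-based positions, empty string when nothing matches), and the sample record is invented. The last line shows the same data held as a list, which is closer to the Scheme way of working.

```python
def piece(text, delimiter, n=1):
    """Approximate $Piece: return the n-th delimited field, counting from 1."""
    if not delimiter:
        return ""
    parts = text.split(delimiter)
    return parts[n - 1] if 1 <= n <= len(parts) else ""


def extract(text, start=1, end=None):
    """Approximate $Extract: return characters start..end, counting from 1."""
    if end is None:
        end = start                   # one position given: a single character
    if end < start:
        return ""
    return text[max(start, 1) - 1:end]


# Hypothetical record stored as one delimited string.
record = "Smith^John^Davis"
first_name = piece(record, "^", 2)
city_prefix = extract(piece(record, "^", 3), 1, 3)

# The same data as a list, where fields are reached by position.
fields = record.split("^")
```

In the string form every access re-scans the text for delimiters, whereas in the list form the fields are already separate items.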

 

PL/I (pronounced P-L-1) is a rather complex language developed for large IBM mainframes which was used extensively as a replacement for FORTRAN and COBOL in the 1970s, but it seems to be much less used today. It has added features that made it more attractive for non-numeric applications, but the complexity and size of the compiler restricted its use to mainframes, and since one can do in C what one might have done in PL/I, the incentive to develop PL/I compilers for personal computers is minimal (I do not know of any available, though some may exist).

 

BASIC (Beginner's All-purpose Symbolic Instruction Code) was developed at Dartmouth for students like yourselves. It became popular on early microcomputer systems (Bill Gates of Microsoft wrote a BASIC interpreter early in his career), and there are many programs written in BASIC. It is often implemented as an interpretive language (like MUMPS and Scheme), but there are compiled versions also. Although there are many features of BASIC that are similar to MUMPS, there are also a few differences that caused us not to use it. It is not standard between computers; there are said to be over 80 different versions, and the differences are sometimes significant. It is less powerful for string handling. It has no permanent data, and no hierarchical structure to its variables. It would be very easy for you to learn, now that you have worked with MUMPS.

 

Other Sequential Programming Languages

Although you are unlikely to run into other programming languages unless you get serious about wanting to learn about programming, it doesn't hurt to mention a few other languages just so that you will have heard their names. FORTH is a language which has a small but devoted following; it has some similarities to LISP in that it is easily extended and hence less portable. There are FORTH users' groups in the Bay Area and elsewhere. ALGOL is a language somewhere between FORTRAN and C which is especially associated with Burroughs (now UNISYS) computers. MODULA is a computer science-oriented language, as is ICON; you may study them if you major in computer science, but you are unlikely to see them in other fields. That more or less completes the list of better-known general purpose programming languages.

 

Special Purpose Languages

Up to now we have discussed only general purpose languages, designed to solve large classes of programming problems. There are, however, a great many special purpose languages aimed at solving a much smaller set of problems. You may well be called on to work with one or more of these "languages," so it is well to introduce them in general terms.

 

Many application packages have their own set of commands that enable users to solve more complicated problems than are covered by the novice mode of interaction. Depending on the complexity of these extended commands, they may be considered separate languages. For instance, both Excel and WordPerfect offer extended command capabilities. The Macro feature in WordPerfect comes close to being a special purpose language, as does the formula generation aspect of Excel. There are many shades of sophistication to these types of extensions, and the dividing line between a command structure and a programming language is fairly thin. In general, if you have to run a separate compiler or pre-processor before commands are executed, those command sets would be considered languages.

 

One of the most common classes of computer applications to have its own set of programming languages is Database Management Systems: programs such as dBASE, Paradox, Oracle, Ingres, and many others. Each of these packages has its own set of commands, and most of them involve compilation prior to execution. In recent years, there has been a trend toward standardizing on a single database language called SQL (for Structured Query Language). Most database packages now offer variations of SQL.

 

A second major set of languages is designed to solve statistical problems. Among the better known are SPSS (Statistical Package for the Social Sciences), SAS (which formerly stood for Statistical Analysis System, but now stands for nothing), and BMDP (BioMedical Data Package). Of these, SAS is probably the most widely used on this campus, although we have a special support group for SPSS in the College of Letters and Science.

 

Yet another important family of special purpose computer languages is aimed at simulating real world events. CSMP (Continuous System Modeling Program) and GPSS (General Purpose Simulation System) are probably the two best known in this area. CSMP models continuous systems, whereas GPSS deals with discrete event simulation, as does SIMULA.

 

 

Non-Sequential Languages

 

All the languages covered thus far are sequential, meaning that their instructions are executed in sequence (with exceptions controlled by branching). However, there are other ways that one might instruct computers, and these options are beginning to be explored today.

 

The first departure from sequential programming relates to new computer architectures in which there may be more than one CPU. When more than one CPU operates on a single problem at the same time, we refer to the system as a parallel processor. There are quite a few different computers today that are designed in this way. The first well-known parallel computer was the ILLIAC IV (ILLIAC stands for Illinois Automatic Computer), developed at the University of Illinois and later moved to the NASA Ames research facility at Moffett Field in the Bay Area. One of the better known current systems is the Sequent computer: the UCD campus uses these systems in the RSVP/Banner Student Information System, and we have one in the Department of Computer Science. Another well-known parallel processor system is called the Intel Hypercube, which may have a number of 386 or 486 processors running in a single box. We have an older version, which has 16 Intel 286 CPUs in a single box. Parallel processors can divide programming tasks into multiple smaller steps that can execute independently, which means that the program must be smart enough to figure out how to divide the task among the processors available. Some computer manufacturers have adapted versions of C to run on their parallel processors in a form that is transparent to the programmer. Other researchers have developed special purpose languages for these new architectures. One such language, SRL, developed in part by a faculty member at UC Davis, takes a longer range view of the opportunities afforded by parallel processor systems.
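
The idea of dividing a task into independent pieces can be sketched as follows. This is an illustration in Python using its standard `concurrent.futures` module, not code for any of the machines or languages named above. The function names `split` and `parallel_sum` are invented for this example, and worker threads stand in for separate processors, so the sketch shows how the work is divided and recombined rather than any real gain in speed.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Divide data into `parts` nearly equal chunks that need nothing from each other."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)  # spread any remainder
        chunks.append(data[start:end])
        start = end
    return chunks


def parallel_sum(data, workers=4):
    """Sum each chunk on its own worker, then combine the partial results."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(sum, chunks))         # one independent task per chunk
    return sum(partial)                               # final sequential combining step


total = parallel_sum(list(range(1, 101)))
```

Notice that the programmer, not the machine, decided here how to cut the problem into pieces; that decision is the part a parallel language or compiler tries to take over.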

 

The second major departure from sequential languages is called Object-Oriented Programming. This is a radical departure from concepts of sequential languages: it treats program segments as objects that can send messages to each other. There might be an object that calculates a square root, or one that performs all printing tasks. Such an object would receive a message with some information to be processed (a number whose square root is desired, or the name of a file to be printed), and it would act accordingly. Object-oriented approaches have also been extended to database systems, and there are some languages designed specifically to support this kind of system. The first well-known Object-Oriented language was called Smalltalk, and it was developed by Alan Kay and his associates at Xerox PARC (Palo Alto Research Center). Smalltalk was originally used to teach small children (ages 4-12) to program computers, hence its name. Since then, there have been a number of Object-Oriented languages, or Object-Oriented extensions to other languages. C++ is one such example, and though it does not have all the features desirable in a true object-oriented language, it is a good approximation and likely to be one of the best known. There are some people who are working on adding object-oriented features to the MUMPS language (we have one version of MUMPS that offers that capability), but it will probably take several years for this capability to be incorporated into the M language standard.
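
The square root and printing objects described above can be sketched as follows. This is an illustration in Python, not Smalltalk or C++ code: the class names, the `receive` method, and the message names "sqrt" and "print" are all invented for this example. Each object reacts only to the messages it understands, and the printing object records its output in a list instead of driving a real printer.

```python
import math


class SquareRootObject:
    """Answers 'sqrt' messages; knows nothing about printing."""

    def receive(self, message, payload):
        if message == "sqrt":
            return math.sqrt(payload)
        raise ValueError(f"unknown message: {message}")


class PrinterObject:
    """Handles 'print' messages by recording what it was asked to print."""

    def __init__(self):
        self.printed = []

    def receive(self, message, payload):
        if message == "print":
            self.printed.append(str(payload))
            return len(self.printed)       # number of items printed so far
        raise ValueError(f"unknown message: {message}")


roots = SquareRootObject()
printer = PrinterObject()

answer = roots.receive("sqrt", 16)         # message with a number to process
printer.receive("print", answer)           # the result travels on as a new message
```

Neither object needs to know how the other does its work; each one only needs to know which messages the other accepts.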

 

These two examples (parallel languages and object-oriented languages) serve to illustrate the fact that, as research in computer science continues, new ways may be found to instruct computers to perform tasks. It seems likely that a number of such new approaches will develop over the next few years, but this lecture is designed only to alert you to the possibility, rather than to speculate on different likely scenarios.

 

Summary

 

From this overview, you will realize that programming computers is so common in so many areas today that there are literally hundreds of programming languages in existence. This review covers only a few, but it gives you the flavor of most families of languages that you are likely to run into.