Programs

A 1970s punched card containing one line from a FORTRAN program. The card reads: “Z(1) = Y + W(1)” and is labeled “PROJ039” for identification purposes.

In practical terms, a computer program might include anywhere from a dozen instructions to many millions of instructions for something like a word processor or a web browser. A typical modern computer can execute billions of instructions every second and rarely makes a mistake over years of operation. Large computer programs may take teams of computer programmers years to write, and given the complexity of the task it is unlikely that every part of such a program behaves exactly as its authors intended.

Errors in computer programs are called bugs. Some bugs are benign and do not affect the usefulness of the program; others cause the program to fail completely (crash); still others produce subtle problems. Sometimes otherwise benign bugs may be used for malicious intent, creating a security exploit. Bugs are usually not the fault of the computer. Since computers merely execute the instructions they are given, bugs are nearly always the result of a programmer’s error or an oversight made in the program’s design. (It is not universally true that bugs are solely due to programmer oversight. Computer hardware may fail or may itself have a fundamental problem that produces unexpected results in certain situations. For instance, the Pentium FDIV bug caused some Intel microprocessors in the early 1990s to produce inaccurate results for certain floating point division operations. This was caused by a flaw in the microprocessor design and resulted in a partial recall of the affected devices.)

In most computers, individual instructions are stored as machine code with each instruction being given a unique number (its operation code or opcode for short). The command to add two numbers together would have one opcode, the command to multiply them would have a different opcode, and so on. The simplest computers are able to perform any of a handful of different instructions; the more complex computers have several hundred to choose from, each with a unique numerical code. Since the computer’s memory is able to store numbers, it can also store the instruction codes. This leads to the important fact that entire programs (which are just lists of instructions) can be represented as lists of numbers and can themselves be manipulated inside the computer just as if they were numeric data. The fundamental concept of storing programs in the computer’s memory alongside the data they operate on is the crux of the von Neumann, or stored program, architecture. In some cases, a computer might store some or all of its programs in memory that is kept separate from the data it operates on. This is called the Harvard architecture after the Harvard Mark I computer. Modern von Neumann computers display some traits of the Harvard architecture in their designs, such as in CPU caches.
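The stored-program idea above can be illustrated with a toy machine, sketched here in Python. The opcodes, addresses, and instruction format are all invented for illustration and do not match any real processor; the point is only that the program and the data sit side by side in the same memory as plain numbers.

```python
# Hypothetical opcodes for a toy machine (not any real instruction set).
LOAD, ADD, STORE, HALT = 1, 2, 3, 0

# One flat memory holds both the program (addresses 0..8)
# and the data it operates on (addresses 20..22).
memory = [0] * 32
memory[0:9] = [
    LOAD, 20,   # copy the number at address 20 into the accumulator
    ADD, 21,    # add the number at address 21 to the accumulator
    STORE, 22,  # store the result at address 22
    HALT, 0, 0,
]
memory[20] = 7
memory[21] = 35

def run(memory):
    acc = 0  # accumulator register
    pc = 0   # program counter: address of the next instruction
    while True:
        opcode, operand = memory[pc], memory[pc + 1]
        pc += 2
        if opcode == LOAD:
            acc = memory[operand]
        elif opcode == ADD:
            acc += memory[operand]
        elif opcode == STORE:
            memory[operand] = acc
        elif opcode == HALT:
            return

run(memory)
print(memory[22])  # → 42
```

Because the program is just a list of numbers, it could itself be read, copied, or modified by another program in the same memory, which is exactly the property the von Neumann architecture exploits.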

While it is possible to write computer programs as long lists of numbers (machine language), and this technique was used with many early computers, it is extremely tedious to do so in practice, especially for complicated programs. Instead, each basic instruction can be given a short name that is indicative of its function and easy to remember—a mnemonic such as ADD, SUB, MULT, or JUMP. These mnemonics are collectively known as a computer’s assembly language. Converting programs written in assembly language into something the computer can actually understand (machine language) is usually done by a computer program called an assembler. Machine languages and the assembly languages that represent them (collectively termed low-level programming languages) tend to be unique to a particular type of computer. This means that an ARM architecture computer (such as may be found in a PDA or a hand-held video game) cannot understand the machine language of an Intel Pentium or the AMD Athlon 64 computer that might be in a PC. (However, there is sometimes some form of machine language compatibility between different computers. An x86-64 compatible microprocessor like the AMD Athlon 64 can run most of the same programs that an Intel Core 2 microprocessor can, as well as programs designed for earlier microprocessors like the Intel Pentiums and Intel 80486. This contrasts with very early commercial computers, which were often one-of-a-kind and totally incompatible with other computers.)
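The translation an assembler performs can be sketched in a few lines. The mnemonics and numeric opcodes below belong to the same hypothetical toy machine as above, not to any real assembly language; a real assembler additionally handles labels, symbols, and addressing modes.

```python
# Mapping from mnemonics to the numeric opcodes of a hypothetical machine.
OPCODES = {"LOAD": 1, "ADD": 2, "STORE": 3, "HALT": 0}

def assemble(source):
    """Translate lines like 'ADD 21' into a flat list of numbers."""
    machine_code = []
    for line in source.strip().splitlines():
        parts = line.split()
        mnemonic = parts[0]
        operand = int(parts[1]) if len(parts) > 1 else 0
        machine_code += [OPCODES[mnemonic], operand]
    return machine_code

program = """
LOAD 20
ADD 21
STORE 22
HALT
"""
print(assemble(program))  # → [1, 20, 2, 21, 3, 22, 0, 0]
```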

Though considerably easier than in machine language, writing long programs in assembly language is often difficult and error-prone. Therefore, most complicated programs are written in more abstract high-level programming languages that can express the needs of the computer programmer more conveniently (and thereby help reduce programmer error). High-level languages are usually “compiled” into machine language (or sometimes into assembly language and then into machine language) using another computer program called a compiler. (High-level languages are also often interpreted rather than compiled. Interpreted languages are translated into machine code on the fly by another program called an interpreter.) Since high-level languages are more abstract than assembly language, it is possible to use different compilers to translate the same high-level language program into the machine language of many different types of computers. This is part of the means by which software like video games may be made available for different computer architectures such as personal computers and various video game consoles.
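The interpreter idea can be sketched for a one-statement “high-level language” modelled loosely on the FORTRAN line from the punched card pictured above. The `interpret` function and its statement format are inventions for illustration: the statement is translated and executed on the fly rather than being compiled to machine code first.

```python
def interpret(statement, env):
    """Execute one assignment of the form 'name = name + name'."""
    target, expression = statement.split("=")
    left, right = expression.split("+")
    env[target.strip()] = env[left.strip()] + env[right.strip()]

# The variables' current values live in a dictionary (the "environment").
env = {"Y": 10, "W": 32}
interpret("Z = Y + W", env)
print(env["Z"])  # → 42
```

A compiler for the same language would instead emit machine instructions (such as the LOAD/ADD/STORE sequence of the toy machine above) to be run later, trading translation-time effort for faster execution.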

The task of developing large software systems is an immense intellectual effort. It has proven, historically, to be very difficult to produce software with an acceptably high reliability, on a predictable schedule and budget. The academic and professional discipline of software engineering concentrates specifically on this problem.

Example

Suppose a computer is being employed to control a traffic light. A simple stored program might say:

  1. Turn off all of the lights
  2. Turn on the red light
  3. Wait for sixty seconds
  4. Turn off the red light
  5. Turn on the green light
  6. Wait for sixty seconds
  7. Turn off the green light
  8. Turn on the yellow light
  9. Wait for two seconds
  10. Turn off the yellow light
  11. Jump to instruction number (2)

With this set of instructions, the computer would cycle the light continually through red, green, yellow, and back to red again until told to stop running the program.
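The eleven steps above can be written out as a sketch in Python. The `turn_on` and `turn_off` helpers are hypothetical stand-ins for real lamp hardware; here they simply record each change so the sequence can be inspected. A real controller would loop forever, so for illustration the cycle count is a parameter and the waits can be skipped.

```python
import time

history = []  # record of every switching action, in order

def turn_on(color):
    history.append(("on", color))

def turn_off(color):
    history.append(("off", color))

def traffic_light(cycles, wait=time.sleep):
    for color in ("red", "green", "yellow"):  # step 1: all lights off
        turn_off(color)
    for _ in range(cycles):                   # step 11: jump back to step 2
        turn_on("red")
        wait(60)                              # step 3
        turn_off("red")
        turn_on("green")
        wait(60)                              # step 6
        turn_off("green")
        turn_on("yellow")
        wait(2)                               # step 9
        turn_off("yellow")

# Run one cycle with the waits skipped, for illustration.
traffic_light(1, wait=lambda seconds: None)
```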

However, suppose there is a simple on/off switch connected to the computer that is intended to be used to make the light flash red while some maintenance operation is being performed. The program might then instruct the computer to:

  1. Turn off all of the lights
  2. Turn on the red light
  3. Wait for sixty seconds
  4. Turn off the red light
  5. Turn on the green light
  6. Wait for sixty seconds
  7. Turn off the green light
  8. Turn on the yellow light
  9. Wait for two seconds
  10. Turn off the yellow light
  11. If the maintenance switch is NOT turned on then jump to instruction number 2
  12. Turn on the red light
  13. Wait for one second
  14. Turn off the red light
  15. Wait for one second
  16. Jump to instruction number 11

In this manner, the computer is either running the instructions from number (2) to (11) over and over or it is running the instructions from (11) to (16) over and over, depending on the position of the switch. Although this is a simple program, it contains a software bug. If the traffic signal is showing red when someone flips the “flash red” switch, the program will cycle through green once more before starting to flash red as instructed. This bug is quite easy to fix by changing the program to repeatedly test the switch throughout each “wait” period—but writing large programs that have no bugs is exceedingly difficult.
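The fix described above can be sketched as follows. The `switch_on` argument is a hypothetical stand-in for reading the real maintenance switch: each long “wait” is replaced by a loop that also polls the switch, so the program can abandon the normal cycle as soon as the switch flips rather than only checking once per cycle.

```python
import time

def interruptible_wait(seconds, switch_on, sleep=time.sleep):
    """Wait up to `seconds`, but return True early if the switch turns on."""
    for _ in range(int(seconds * 10)):  # poll ten times per second
        if switch_on():
            return True
        sleep(0.1)
    return False

# Illustration: a fake switch that flips on at the third poll.
polls = iter([False, False, True])
aborted = interruptible_wait(60, lambda: next(polls), sleep=lambda s: None)
print(aborted)  # → True
```

With this change, each `wait` in the traffic-light program becomes an `interruptible_wait`, and the main loop jumps straight to the flashing-red instructions whenever the call returns `True`.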