Genetic Programming
Should we create computer programs that can modify themselves?
Genetic programming is a form of artificial intelligence programming that deals specifically with the ability of computer programs to modify themselves to more effectively solve the problems they address. John Koza of Stanford University is widely considered the inventor of genetic programming. In a review of Koza's latest book on the topic, Peter Nordin of Chalmers University notes that Koza's "first book actually spawned the creation of the research field…"(Nordin). To explain the goal of genetic programming, Koza turns to artificial intelligence pioneer Arthur Samuel's 1959 statement that one of the central challenges of computer science concerns "How can computers be made to do what needs to be done, without being told exactly how to do it?" (qtd in Koza Genetic Programming 15).
Genetic programming incorporates and expands the concept of
genetic algorithms. Describing genetic algorithms, Koza states,
"John Holland's pioneering book Adaptation in Natural and
Artificial Systems (1975) described a domain-independent
algorithm, called the genetic algorithm, based on an evolutionary
process involving natural selection, recombination, and
mutation." (Genetic Programming 16). However, genetic programs
differ from genetic algorithms in that the output from a genetic
program is another computer program rather just a string of
numbers that represent a problem's solution. Koza leaves no doubt
about this difference. In his explanation of how genetic
programming differs from other approaches to artificial
intelligence, the first difference he mentions is the form of
representation. He says, "Genetic programming overtly conducts it
search for a solution to the given problem in program space."
(Koza Seven Differences 1)
An Example of a Genetic Algorithm Used for a Classic Problem
Before we discuss genetic programs, let's first examine how genetic techniques can be applied to problem solving. In an article written for the Artificial Intelligence website on "www.generation5.com", Andy Thomas explains how he applied a genetic algorithm to the classic problem of the traveling salesman. The salesman must travel to a large number of cities with the least miles and least number of backtracks in his route. This problem is much more understandable than the electrical circuit examples used in the genetic programming literature I surveyed.
The city visiting order is represented as an array of city numbers, the total distance as the sum of the distances from occurrence to occurrence all the way through the array. The fitness value rewards results based on lower total distance traveled, in-range city numbers, and the non-repetition of destinations. After each trip, the fitness value is calculated and the evolutionary algorithm is applied. Results of the lowest mileage are stored and the program terminates when an established acceptable total mileage value is reached.
Thomas relates that for a 100-city route, the initial (generation
0) distance was 12,224.6 miles, but after 10,099 generations, the
distance was reduced to 4,989.4 miles. He also notes an important
point: to analyze the optimal route for 100 cities by a "brute
force" algorithm with a computer that could examine one million
routes per second would take 3e144 years. He states: "In a
real-world route planning problem, the determination of the
shortest possible route may not be required. It may suffice to
simply find an acceptably short route. A genetic algorithm is
ideally suited such to cases where the optimum solution is not
required, provided a near-optimum solution can be delivered in a
short time scale." (Thomas). As I mentioned, this is an example
of a genetic algorithm in a program. In actual genetic
programming, the output would be a computer program rather than
an improved algorithm.
A Summary of the Genetic Programming Process
To demonstrate some the basics of genetic programming, I've paraphrased the program and cycle description below directly from Koza et al."Genetic Programming…" Genetic computer programs are combinations of functions and terminals. The functions can be arithmetic operations, conditional operators, problem-specific functions or such. The terminals are external inputs, constants, or zero-argument functions. The programs can be thought of as "trees whose points are labeled with the functions and whose leaves are labeled with the terminals." (17). Genetic programs build new programs by executing three main steps.
- 1. Randomly create an initial population of individual
computer programs that contain the necessary functions and
terminals to solve the problem. This commonly referred to as the
zero generation.
- 2. Repeatedly perform these sub-steps:
- Assign a fitness value to each individual program in the population. The fitness function measures the accuracy of the program's result. For example, if a program were designed to simulate shooting an arrow into a target under varying wind conditions, the fitness measurement would be the distance of the arrow from the bulls-eye.
- Create a new population of individual programs by
applying any of three genetic operations to one or two
individuals in the old generation selected based on fitness.
- Reproduction: copy an existing individual into the new population (asexual).
- Crossover (sexual reproduction): create two new individuals from two existing individuals (parents) by genetically recombining sub-trees from each program using operations at randomly chosen crossover points in the parent programs.
- Mutation: Create a new individual from an existing parent program by randomly mutating one randomly chosen sub-tree of the parent.
- 3. Find the computer program that has performed the best so far from all generations. When this performance meets defined criteria , the result may represent either the exact or an approximate solution to the problem (Koza Genetic Programming).
A flowchart for a typical genetic program appears below

Note that probability plays a vital role in selecting individuals, determining the crossover points and creating the mutations. Koza mentions in one experiment summary that "The probability of crossover was approximately 89%; reproduction 10%; and mutation 1%" (Genetic Programming 30). It's also important to note that all genetic operations are governed by the requirement that the program's syntactic structure is preserved in all offspring. For example, if a sub-tree is replaced by the mutation operation, the new program's sub-tree will still have the same functional capability, variable scheme and terminals as the parent program.
Returning to Koza's discussion of differences between genetic
programming and other machine learning methodologies, an
important one should now be clear: "Genetic programming does not
rely exclusively on greedy hill climbing to conduct its search,
but instead allocates a certain number of trials, in a principled
way, to choices that are known to be inferior." (Seven
Differences 1). Koza is saying here that methodologies that
always select the best performer to create the next generation
are likely to find a localized optimum that might not necessarily
be the global optimum. By allowing the program to mutate, for
instance, it might eventually find a less obvious path through
the possible solutions that will create a more optimized result.
Koza also points out that the "toy problems" used to test many
other forms of machine learning are simple enough that they can
be solved by hill climbing. Genetic programming thus seems to
have the potential to solve more sophisticated problems where the
answer might not be accessible to point-to point hill climbing
algorithms (Seven Differences 3).
What Has Genetic Programming Accomplished?
Although much of the available research describes experiments where genetic programming is applied to classic optimization problems or game theory, Koza and his colleagues claim that their genetic programming experiments have actually achieved results similar to what human effort could create. In a 1999 paper relating the results of a genetic programming experiment with circuit design, they state:, "We illustrated genetic programming by applying it to a non-trivial problem, namely the synthesis of a design for a lowpass filter circuit. The results were competitive with human-produced solutions to the problem. The results exhibited creativity and inventiveness and correspond to four inventions that were patented between 1917 and 1936." (Koza Genetic Programming 42).
If this seems like a remarkable claim, I saw no indication that
anyone has refuted it in the literature I reviewed. Subsequent
characterization of genetic programming as an "invention machine"
seems to reinforce their findings. The success of their
program-creating experiments is also consistent with the
successful use of genetic algorithms in a number of real-world
applications where complex optimization is required, such as
multi-dimensional scheduling and optimizing satellite orbits
(Knight).
Conclusions
As a business programmer who has worked primarily with high-level procedural languages and relational databases, genetic programming sounded exotic when I started to research it. As it turned out, even though I didn't have time to develop an in-depth understanding of the languages and technical programming methods used in this field, I understand its core concepts quite well. It seems as if the time is close when genetic programming techniques will be employed in a wider variety of practical applications. While I'm not sure it will ever be practical for business applications, I would certainly welcome any business programming methodology that allows programs to mutate to solve problems more effectively.
One thought I have about genetic programming I didn't see explicitly expressed in the material I surveyed: combined with advanced robotic technology, these techniques might allow creation of machines that will approach or rival human ability not only to perform tasks but also to adapt to changes in the environments in which they function. I can envision machines being sent into environments inhospitable to humans that could, by using genetic programming, modify their approach to the problem if the conditions change or differ from what was anticipated. My final thought is that even though the creation of machines that truly rival human intelligence will not occur in the near future, the adaptive capability associated with genetic programming will probably be an important quality of the "brains" that will help push such devices up to new levels of sophistication.
Works Cited
Fernandez, Jamie. "The Genetic Programming Notebook"
[geneticprogramming.com] 24 March, 20002.
Koza, John R. et al. "Genetic Programming: Biologically Inspired
Computation that Creatively Solves Non-Trivial Problems".
In Landweber, Laura F. and Winfree, Erik (Editors). 2001.
Evolution as Computation, DIMACS Workshop, Princeton, January
1999. Heidelberg: Springer-Verlag: 15-44.
---. "Seven Differences Between Genetic Programming and Other
Approaches to Machine Learning and Artificial
Intelligence".
[genetic-programming.com]. 24 March 2002.
---. "What is Genetic Programming?".
"http://www.genetic-programming.com." 24 March 2002.
Knight, Will. "Genetic algorithms evolve optimum satellite
orbits".
16 October 2001. [newscientist.com/news.jsp?id=ns99991437]. 24
March 2002.
Nordin, Peter. "Book Review: Genetic Programming III - Darwinian
Invention and Problem Solving".
Evolutionary Computation 7.4 (1999): 451-453.
Thomas, Andy. "Solving the Travelling Sales Man Problem using A
Genetic Algorithm".
6 July 2001. [generation5.org]. 24 March, 2002.
Images
Flowchart from “The GP Tutorial” page of
The
Genetic Programming Notebook by Jamie Fernandez.
Genetic Programming, by by Dean Ennis 2002-03-25
for "Computer Science Senior Seminar", Chestnut Hill College
