Concordia University's DEC LAB: Research at the DEC LAB

1 Introduction

Typical applications in Evolutionary Computation often involve a direct and simple relation between genotype and phenotype; Commonly, values from the genome are simply slotted into a fitness function in a bijective mapping. While this approach is sufficient for most practitioners, the dimensionality of the solution space (phenotypes) is directly translated into the dimensionality of the space of genotypes, potentially exceeding the size of space capable of being searched by a Genetic Algorithm. For larger and more complex problems direct relations between genotype and phenotype may be insufficient.

In a field inspired by Biology, it is often useful to re-examine the source: the human genome may be viewed as a tremendous compression of phenotypic complexity; The approximately 3 billion chemical base pairs of the genome map to approximately 100 trillion cells [8]. It is clear that the process of development plays a significant role in the addition of information to the phenotype; Indeed, models of biology often attempt to re-create the hierarchical structure inherently formed by the differentiation process, as in [5].

An emerging trend in Evolutionary Computation is to create a (relatively) small and simple genotype, and to increase the complexity of the phenotype through a developmental process. The field of Artificial Morphogenesis spans a wide array of approaches, varying on a gradient between total information being contained in the genotype, to a bare minimum, where the genotype is a program designed to spawn the phenotype. Some approaches use simple techniques, such as the use of repeated structure when interpreting the genome. Another more complex approach is the use of Grammars to develop agents, where the grammar and initial conditions form the genotype. Such systems have enjoyed much success, as in [11], where they were used in a theoretical experiment to study the development of modular forms, or in [7], who used the model to develop a neural network controlling a foveating retina. However, these systems are far separated from their original biological metaphors.

Other equally complex examples include developmental models, inspired by Embryogenesis. Examples which attempt to model actual biological embryogenesis have also enjoyed much success, as in the case of the particularly successful modeling of the embryogenesis of a drosophilae, as found in [6]. In between direct models of biology and models which are entirely divorced, there exists a class of developmental models which seeks to abstract the developmental process to high-level systems capable of artificial evolution, hopefully retaining some of the high-level features of biological development. Bentley and Kumar [1], and Eggenberger [3] both propose highly-simplified models which are used to demonstrate the development of geometric and aesthetic principles. As is noted in a review by Stanley and Miikulainen, simpler solutions may outperform biologically plausible ones, and a need exists for abstraction [14]. Perhaps most closely related to the subject of this paper, however, is work undertaken by Dellaert and Beer [2]; This experiment aimed at the creation of a computationally-tractable but biologically-defensible model of development, aimed towards the evolution of agents capable of a locomotive task. Dellaert and Beer's model consists of a conglomerate of cellular material, which performed a series of differentiations. Differentiations begin as a single symmetry-breaking division, followed by divisions controlled by a series of Genetic Regulatory Networks. Drawbacks to this model resulted chiefly from the size of the search space associated with their system. For example, they were unable to evolve fit agents from scratch and hence began their experiments with a hand-coded agent, from there obtaining results. Following this, their model was simplified, using Random Boolean Networks ([9]).

This viewpoint, that of increasing the complexity of the phenotype through the developmental process, forms an interesting starting point for another emerging line of thought: It has been postulated (most notably by Wolfram [15]) that natural selection serves not to increase the complexity of agents through time, but instead to limit the complexity inherent in a complex and unwieldy developmental process. If this suspicion is correct, then the current typical use of Evolutionary Computation may not utilize the metaphor of natural selection to its fullest potential.

Wolfram suggests the use of Cellular Automata as a model for the developmental process; The problem with this approach is that the space of all CAs is notoriously large and difficult to search. Attempts to evolve Cellular Automata to perform tasks may be found in a review by Mitchell et al [12], where CAs were evolved to solve problems of density classification and synchronization. While some of the burden was alleviated through the use of stochastic approximations, searching through the space of CAs proves to be tremendously computationally expensive.

In this paper, we present the Bluenome[1] model of development. Bluenome is a highly abstracted model for developing application-neutral agents composed of an arbitrarily large network of components chosen from a finite set of types. Bluenome uses a subset of the space of two-dimensional CAs, evolved in a genetic algorithm, starting from a single neutral cell. In Phase One, Bluenome is applied to a series of application-neutral experiments, designed to show that the system is capable of producing interesting agents in a reasonable amount of time. In Phase Two, it is applied to the problem of generating agents for an artificial but non-trivial problem, contrasted against a bijective technique. It is our hope to demonstrate that the Bluenome technique is a viable option for the design of high-dimensional agents, by utilizing the features of the developmental process, while abstracting enough so as to make evolution feasible.

2 The Bluenome Model[2]

The Bluenome Developmental Model is a highly simplified version of biological embryogenesis. It involves the inclusion of a single component (Cell) into an array of spaces (Grid Cells), and a methodology for that cell to grow into an agent, utilizing only local information. The Cell contains a single piece of DNA, which it interprets to decide its next action - the complexity of a piece of DNA is governed by a system parameter, numRules, which limits its precision. The number of possible cells is governed by a system parameter, numColours, the number of types of components which might be included in an agent. This process is limited by a system parameter numTel (number of telomeres) which acts as a counter in each cell, decrementing with each action that a cell undertakes.

. An agent's genome is comprised of a series of numRules rules, numRules Î N⁺. Each rule is (numColours+2) integers long, leading to a total genome length of (numColours+2)*numRules. Each rule is structured as:

colour

hormone₁

hormone_numColours

action

where colour Î [1,numColours], hormone_i Î [1,12], and action Î [1,numColours+3].

Initially, an agent begins as a single neutral cell, centred in an environment (a square matrix of Grid Cells). When activated, a cell (currCell) in the environment will collect hormones from its twelve neighbourhood, storing the number of occurrences of each cell colour in a list. The exception is in the case of cells on the periphery; These cells are only included in the count if the cells in between are empty[3] - hence the cell on the far, far left will be included in the count only if the cell on the left is empty. What results is a list of numbers of length numColours, each of value between zero and twelve.

Once any particular cell has collected information regarding its neighbours, it searches the genome to find the closest matching rule: First, it collects all rules in the genome such that the current colour of the cell and the first argument in the rule match (If no such rule is found, the cell takes no further action this time step). Next, it searches that collection of rules for the one most closely matching its list of current hormone levels (Euclidean distance). Finally, the action is extracted from that rule. The action is executed by first decrementing the cell's internal telomere counter (hence a cell may execute only numTel actions), then executing the action corresponding to theRule_action. Possible actions include: Die, where a cell is removed, leaving an empty Grid Cell; Specialize(colour), where a cell changes its colour to colour; Divide, where a copy of the cell is placed in the best free location; and Move, where the cell is relocated to the best free location (if there are no free locations, no action is taken). It should be noted that the best free location is defined as the Grid Cell in currCell's four-neighbourhood furthest away from the largest mass of cells in the eight-neighbourhood. In the case of equal distribution, the best free location includes a directional bias - left, then counter-clockwise.

In this manner, a "good" genome will allow a single initial cell to grow to a robust agent. The process terminates when all telomeres are expended, or when there is no change from one developmental time step to the next. No other mechanisms are included - there are no special parameters for symmetry breaking, beyond that inherent in the directional bias.

One view of this process is as a subset of all 2-dimentional Cellular Automata with radius 3. The key differences between CAs and Bluenome are: (1) Bluenome begins with a single cell in the centre of a finite grid. Empty (white) cells cannot change their colour without a non-white neighbour[1] ; (2) As the hormone collection process does not note direction, the rules instantiated by the Bluenome genome map to symmetrical patterns in a CA rule set; (3) Bluenome utilizes a measure to compute distance to a rule, unlike CAs, which are precise. This may be viewed as collapsing several similar rules into a single outcome; and (4) The lack of consideration of peripheral cells in the twelve-neighbourhood may be viewed as a further grouping of CA rules. Hence, the space of all Bluenome genotypes is significantly smaller than that of two-dimensional CAs. Phase One aims to demonstrate that these simplifications do not significantly limit the robustness of the possible phenotypes, but, due to the decreased dimensionality of the genotypic space, does allow the space of all Bluenome genotypes to be more readily pressured by selection.

Fig. 2.3 shows the development of an interesting agent taken from Phase One, shown at various times in the development. An unfortunate point is that the majority of genotypes generate trivial agents - nearly 80% of random samples. However, it will be shown that selection quickly improves matters.

Fig. 2.2. Development of an interesting agent.

We can now make some estimates involving size: Firstly, we note that the maximum size of an agent with numTel telomeres will be 2*(numTel+1)² - this is the size of a diamond with sides of length (numTel+1). Hence, an agent with numTel = 6 will have a maximum phenotypic size of 98 cells; 882 cells for an agent with numTel = 20. In contrast, and agent with numColours = 5 and numRules = 25 will have a genotypic size of 175, regardless of phenotypic size.

The complexity of the developmental process is O(numTel³*numRules).

3 Phase One: Application-Neutral Experiments

The evolution of cellular automata is a notoriously difficult problem: the highly non-linear nature of the space of CAs, as well as the unpredictability of the forecasting problem [15], [13] makes the prospect of the evolution of complex patterns using GAs seem grim. As we have recognized the Bluenome model as a subset of the space of two-dimensional CAs, it is not obvious that evolution is possible in any reasonable measure of time. Phase One was a set of experiments utilizing an array of fitness functions to determine which high-level axis could potentially be evolved.

Phase one utilized a neutral view of phenotypes - an organism was treated as an image, fitness function being based on techniques from Image Processing. In all cases, grids of size 100x100 were used, and the agents were allowed to grow to fit the grid. A genetic algorithm was used to evolve phenotypes, typically running for 100 generations with a population size of 30.

Successful experiments showed promising results; Typically, highly fit population members were discovered prior to generation 100, with steady increase in mean fitness. These fitness functions included: selection for complexity, selection for similar but disconnected regions, selection for highly distinctive regions, and selection for a transport system (a minimal network of cells which neighbours as many cells in the grid as possible).

An example of one such successful run was the attempt to generate images of increasing complexity. Here, fitness was a function of the size of the grown agent and the number of different cell types included in the phenotype. By generation 100 a robust agent utilizing all colours can be seen. Members of the population can be seen in Fig. 3.1. From generation 100 of this same experiment, a visual continuity between population members can be seen, as illustrated in Fig. 3.2

Fig. 3.1 Images from a Phase One experiment using complexity as a fitness. Members are shown from generations 0, 10, 40 and 100.

Fig. 3.2 Visual examples of continuity from generation 100.

In addition to the successful attempts at evolution, there were some unsuccessful examples as well. One such example was the attempt to generate images using an entropy function as fitness - that is, the attempt to grow images consisting of pure noise. Although images near optimal fitness were found, following 300 generations of evolution, there was no significant increase in mean fitness relative to generation zero. This is not surprising, given previous arguments regarding developmental model's inherent structured growth. A second example was the attempt to produce agents which terminate their own growth prior to filling the grid[1] ;It is for this reason that the numTel parameter is included, modeling biological telomeres.

Fig. 3.3 shows a set of exemplar members chosen from the various experiments. In all cases, these members were found in less than 100 generations of evolution, utilizing population sizes of less than 30.

Fig. 3.3 Exemplar members from Phase One experiments, identified by generating fitness: (far left) maximal thin coverage; (left) disconnected regions of similarity; (right) disconnected regions of similarity; (far right) highly distinctive regions.

Another interesting result from Phase One was the attempt to re-grow a genotype in differing environments. This attempt is illustrated in Fig. 3.4, where again visual

similarities may be seen most clearly, Additionally, under the fitness used in the experiment (the complexity function), fitness was nearly identical for each re-growth.

Fig. 3.4 Attempt to re-grow the same genotype is differing environments: ( top far left) original 100x100 grid; (right) 200x200 grid; (top left) a square of non-interacting foreign cells; (bottom left) 100x200 grid.

Page 1, Page 2, Page 3

DEC Lab Home

Last Updated April 6, 2004