Sunday, September 28, 2008

Memetic Algorithms, Convergence and Pre-existing Fitness Landscapes

Memetic Algorithms

Memetic Algorithms (MAs) are search techniques that solve problems by mimicking processes of biological evolution, including selection, recombination, mutation and inheritance, and by refining candidate solutions through local search.

A few important aspects of MAs (Figure 1):

  • The fitness landscape needs to be finite.
  • The search space of the MA is limited to the fitness landscape.
  • There is at least one solution in the fitness landscape (Figure 2).
  • A fitness function determines the relationship between the fitness of the genotype (or phenotype) and the fitness landscape.
  • Selection is based on fitness.

Figure 1: Basic layout of memetic algorithms. A population of individuals is randomly seeded with regard to fitness (initialized). The individuals are randomly mutated and their fitness is measured. Individuals with the best fitness are mutated further until they converge on a local optimum. The process is carried out for the entire initialized population. The global optimum is selected from the various local optima.


Figure 2: Fitness landscape with local optima (A, B and D) and a global optimum (C). In a memetic algorithm, the individuals of the initial population are randomly seeded and can start at any of the arrows indicated in the figure.
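The steps in Figures 1 and 2 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the algorithm used by any particular program: the one-dimensional landscape, its peak positions and heights, the population size and the mutation settings are all invented for the example.

```python
import math
import random

# Hypothetical 1-D fitness landscape loosely mirroring Figure 2:
# local optima near x = 1, 3 and 7 (A, B, D), global optimum near x = 5 (C).
# Peak positions and heights are invented for illustration.
PEAKS = [(1.0, 0.5), (3.0, 0.7), (5.0, 1.0), (7.0, 0.6)]  # (centre, height)
LOW, HIGH = 0.0, 8.0  # the finite search space


def fitness(x):
    """Sum of Gaussian bumps; higher is fitter."""
    return sum(h * math.exp(-((x - c) ** 2) / 0.18) for c, h in PEAKS)


def clamp(x):
    """Keep an individual inside the finite landscape."""
    return max(LOW, min(HIGH, x))


def local_search(x, rng, steps=30, step_size=0.05):
    """Hill climb: accept a small random move only if fitness improves."""
    best = x
    for _ in range(steps):
        candidate = clamp(best + rng.uniform(-step_size, step_size))
        if fitness(candidate) > fitness(best):
            best = candidate
    return best


def memetic_search(pop_size=50, generations=40, seed=0):
    rng = random.Random(seed)
    # Initialization: individuals are seeded at random positions.
    population = [rng.uniform(LOW, HIGH) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: the fitter half survives and becomes the parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = rng.sample(parents, 2)
            child = (a + b) / 2                        # recombination
            child = clamp(child + rng.gauss(0, 0.3))   # mutation
            children.append(local_search(child, rng))  # local refinement
        population = parents + children
    return max(population, key=fitness)


best = memetic_search()
print(f"best x = {best:.2f}, fitness = {fitness(best):.3f}")
```

The local refinement step is what distinguishes this sketch from a plain genetic algorithm: each child is pushed uphill before it competes for selection.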


Various molecular docking programs employ genetic algorithms to predict the orientation of a ligand within a protein receptor. Autodock employs an MA for this purpose. A good docking program is one that can reproduce an existing crystallographic pose with reasonable success. The root mean square deviation (RMSD) of a docked ligand from the crystallographic pose is generally used as the indicator: an RMSD below 2 Å is considered a success. In the case of the Autodock software, the global optimum is supposed to correlate with the crystallographic pose (RMSD < 2 Å).
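To make the success criterion concrete, here is a minimal sketch of an RMSD calculation. It assumes both poses list the same atoms in the same order and already share the receptor's coordinate frame, so no superposition is done; ligand symmetry is ignored. The function name and the three-atom coordinates are invented for illustration.

```python
import math


def rmsd(pose_a, pose_b):
    """Root mean square deviation between two equally ordered atom lists."""
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and the same length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / len(pose_a))


# Hypothetical three-atom poses (coordinates in angstroms, invented).
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]

value = rmsd(crystal, docked)
print(f"RMSD = {value:.2f}; success = {value < 2.0}")
```

Here every atom of the docked pose is shifted by 1 Å along z, so the RMSD is 1 Å and the pose would count as a success under the 2 Å threshold.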

As an illustration, colchicine binds to tubulin and interferes with tubulin dynamics by inhibiting tubulin polymerization. Colchicine binds at a position between the alpha and beta subunits of the tubulin dimer (Figures 3 and 4).



Figure 3: Colchicine binding site.


Figure 4: Colchicine binding cavity.


A docking run with Autodock can be characterized by the following:

Finite fitness landscape: The physical properties of the protein receptor (e.g. electrostatic properties, Van der Waals interactions and desolvation energies). This is the pre-existing fitness landscape.

Search space: Confined to the protein receptor.

At least one solution: Crystallographic pose.

Fitness function: Estimated free energy of binding of the pose. This is determined through a combination of various interactions, including Van der Waals, electrostatic, desolvation, hydrogen bond and torsional free energy.
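Schematically, the fitness function can be written as a sum of the listed contributions. The symbols below are shorthand introduced here, not Autodock's own notation, and the sketch leaves out the empirical weights that a real scoring function attaches to each term.

```latex
\Delta G_{\text{bind}} \approx
    \Delta G_{\text{vdW}}
  + \Delta G_{\text{elec}}
  + \Delta G_{\text{desolv}}
  + \Delta G_{\text{hbond}}
  + \Delta G_{\text{tor}}
```

The lower (more negative) the estimated free energy of binding, the fitter the pose.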

Selection (guiding function): Selection is based on fitness.


Using Autodock, colchicine was "docked" four times into the tubulin receptor. Each time the ligand is docked, 30 populations of 250 individuals (ligands) are randomly placed within the receptor. The local optimum of each population is determined (blue bar graph). The results are shown in Figure 5.

Figure 5a: Run 1

Figure 5b: Run 2

Figure 5c: Run 3

Figure 5d: Run 4

All four runs converged on the same global optimum, which also corresponded reasonably well to the crystallographic pose (RMSD < 1.8 Å).
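The same kind of replay can be imitated on a toy problem. The sketch below is not a docking calculation: it assumes an invented one-dimensional landscape as a stand-in for the receptor, seeds each population around a random position, and uses a simple hill climb as the refinement step. Only the counts (4 runs, 30 populations, 250 individuals) are taken from the text; every function name and parameter is hypothetical.

```python
import math
import random

# Invented 1-D landscape standing in for the receptor: three local optima
# and one global optimum near x = 5. All numbers are illustrative only.
PEAKS = [(1.0, 0.5), (3.0, 0.7), (5.0, 1.0), (7.0, 0.6)]  # (centre, height)
LOW, HIGH = 0.0, 8.0


def fitness(x):
    """Sum of Gaussian bumps; higher is fitter."""
    return sum(h * math.exp(-((x - c) ** 2) / 0.18) for c, h in PEAKS)


def clamp(x):
    return max(LOW, min(HIGH, x))


def local_optimum(rng, size=250, steps=200, step=0.1):
    """Seed one population around a random spot, then refine its fittest member."""
    centre = rng.uniform(LOW, HIGH)
    population = [clamp(rng.gauss(centre, 0.3)) for _ in range(size)]
    best = max(population, key=fitness)
    for _ in range(steps):
        candidate = clamp(best + rng.uniform(-step, step))
        if fitness(candidate) > fitness(best):
            best = candidate
    return best


def docking_run(seed, populations=30):
    """One run: collect a local optimum per population, return the best one."""
    rng = random.Random(seed)
    optima = [local_optimum(rng) for _ in range(populations)]
    return max(optima, key=fitness)


# Four independent runs, each started from a different random seed.
bests = [docking_run(seed) for seed in range(4)]
for run, x in enumerate(bests, start=1):
    print(f"run {run}: best x = {x:.2f}, fitness = {fitness(x):.3f}")
```

Individual populations end up on different local optima, but because the landscape itself does not change between runs, the best optimum of each run lands on the same peak.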

Is this process analogous to the evolution of life?


The Memetic Algorithms of life:
A) A genetic code that is optimized for random searches.
B) Quality control systems (DNA repair, protein quality, programmed cell death).
C) Variation inducers (Cytosine deaminases, Low vs High fidelity polymerases, gene conversion and homologous recombination).

Examples of convergence in the evolution of life:
Running MAs in pre-existing fitness landscapes results in convergence on various local optima, with the global optimum being the best of the local optima. Evolutionary history is filled with examples of convergence (local optima).

A) The spectacular convergence of abiogenesis into a universal optimized genetic code and life's memetic algorithms.
B) Structural convergence
Nice article showing various examples of convergent evolution.
C) Molecular convergence
Carbonic anhydrases
Prestin
More examples

Pre-existing fitness landscapes and the evolution of life:
The fitness of the docking pose of the ligand in the above example is dependent on the pre-existing properties of the receptor protein. These properties include:

Van der Waals energy
Electrostatic energy
Desolvation energy
Hydrogen bond energy
Torsional free energy
These are all combined to determine the fitness (binding energy) of the ligand.

Figure 6: Convergence of local optima of colchicine in the pre-existing fitness landscape of the tubulin protein receptor. Fitness (binding energy) is measured by Van der Waals, electrostatic, desolvation, hydrogen bond and torsional free energy. Replaying the docking run yields similar results every time.


Standard evolutionary theory describes fitness as the capability of an individual of a certain genotype to reproduce (self-replicate). What are the properties of the pre-existing fitness landscape of life that determine the fitness (self-replication) of life forms?

Should these properties include the following?

Reproduction success (self-replication)
Intelligence (Ability to process information - genetics, proteomics, metabolomics)
Agency (Ability to manipulate information)
Complexity (Emergence of complexity seems to be the first rule of evolution)


What are these properties composed of?
Perhaps elemental proto-experiences (PEs), phenomenal aspects that are (superimposed) properties of elementary particles, as described in this paper? Can this connect quantum physics, consciousness (article) and evolution?


A "docking" run (replaying the tape of life) with such a simulation can be characterized by the following:

Finite fitness landscape: The physical properties of the universe (mass, spin, charge and proto-experiences superimposed as elementary particles). This is the pre-existing fitness landscape.

Search space: Confined to the universe.

At least one solution: Self-replication.

Fitness function: Reproduction success. This is determined through a combination of various interactions including self-replication, intelligence, agency and emergence of complexity.

Selection (guiding function): Selection is based on fitness.


What would a "docking" run of life look like if we ran it over and over with a pre-existing fitness landscape and universal memetic algorithms (Figure 7)?

Figure 7: Convergence of local optima in a fitness landscape whereby fitness is measured by reproduction, intelligence, agency and complexity. If life's memetic algorithms are comparable to a "docking" run, it should yield similar local optima in pre-existing fitness landscapes every time the simulation is run.

