Mathematical model of cancer cell growth and chemotherapy drug scheduling using genetic optimization.

View With Charts And Images

Mathematical model of cancer cell growth and chemotherapy drug scheduling using genetic optimization.

Chapter 1

Introduction

Cancer poses serious health problems both in developed and developing countries. The prevention and control of cancer in developing countries deserve urgent attention because cancer is the sixth leading cause of death in Bangladesh. In coming 20 to 25 years cancer is expected to double in Bangladesh [Banglapedia]. International Agency for Research on Cancer (IARC) has been projected that death from cancer in Bangladesh is 7.5 % in 2005 and it would be increased up to 13 % in 2030. A recent WHO study has been estimated that there are 49,000 oral cancer, 71,000 laryngeal cancer and 196,000 lung cancer cases in Bangladesh, among those aged 30 years or above. It is estimated that there are 1,000,000 people in Bangladesh who already have cancer and 200,000 new patients are added each year [Ahsania mission cancer and general hospital].

As per above description, it is really urgent to do something against cancer growth in Bangladesh. To prevent cancer in Bangladesh, lots of fascinating researches should be conducted into medical, pharmacy, microbiology, and bio-physics. So, I devote myself to design drug dose for chemotherapy using mathematical model and advanced computing techniques.

1.1 What is cancer?

Cancer is known medically as a malignant neoplasm. It is a large group of different diseases which involving unregulated cell growth. In cancer, cells divide and grow uncontrollably, forming malignant tumors and invade nearby parts of the body. The cancer may also spread to more distant parts of the body through the lymphatic system or bloodstream. Not all tumors are cancerous. Benign tumors do not grow uncontrollably [Kinzler, 2002]. They do not invade neighboring tissues, and do not spread throughout the body.

Healthy cells control their own growth and will destroy themselves if they become unhealthy. Cell division is a complex process that is tightly regulated. Cancer occurs when problems in the genes of a cell prevent these controls from functioning properly. These problems may come from damage to the gene or may be inherited and can be caused by various sources inside or outside of the cell. Faults in two types of genes are especially important: oncogene which drives the growth of cancer cells and tumor suppressor genes, which prevent cancer from developing.

1.2 Cell life cycle and formation of cancer

The growth of a cancerous tumor is a complicated process involving multiple biological interactions. The response of such tumors to active treatment such as chemotherapy and radiotherapy is also complex. The physiological condition of the patient, the stage of the diseases, scheduling of the therapy and interaction of the drugs are the main factors in chemotherapy. The cell cycling mechanisms is needed to understand the actions of the chemotherapy treatment. The cell cycle is a chain of phases that both normal and cancer cells undergo from their birth to death. The cycle comprises of five stages as show in Figure 1.1. A brief description of different stages is given below [Alam et al. 2010, Burns and Newbury, 2008]:

G1: Pre-synthetic interface stages. Parent cells grows, organelles are reproduced. Longest phase, can last up to 48 hours. Post mitotic gap, the cell prepares for DNA synthesis.

S: Synthetic interface stage. DNA synthesis takes place in preparation for cell division and lasts 8-20 hours (many anticancer drugs act by interfering with DNA at this stage, causing cell death).

G2: Post-synthetic interface stage. Pre-mitotic gap, specialized proteins and RNA are synthesized in preparation for cell division and lasts up to 4 hours.

M: Mitotic phase, cell division takes place and DNA is distributed evenly among daughter cells and lasts up to 30 minutes.

Go: Resting phase, cell is quiescent, able to work as intended but unable to divide.

Fig. 1.1. The five phases of the cell-cycle: G0 (quiescent), G1 (growth), S (DNA replication),G2 (mitotic preparation), and M (mitosis). [Florian et al.]

1.3 Genetic algorithm

Genetic algorithm (GA) is a way to randomly search for the best answers [Holland, 1975]. Over the last few years, it is becoming important to solve a wide range of search, optimization and machine learning problems. It attempts to solve problems in a fashion similar to the way in which human genetic process seem to operate. It’s fundamental principle is the fittest member of population has the highest probability for survival.

A GA (multi path search scheme) is an iterative procedure which maintains a constant size population p(t) of candidate solutions. The initial population p(0) can be chosen at random. The structures of the populatioin p(t+1) (i.e. for nest iteration called generation) are chosen from p(t) randomize selection procedure that ensures that the expected number of times a structure is chosen is approximately proportional to that structure’s performance relative to the rest of the population. In order to search other points in a search space, some variation is introduced into the new population by means of genetic operators (crossover and mutation) [D.Goldberg, 1989]. While it may seem to be a random search, in fact, the improvement in each generation indicates that the algorithm produces an effective directed search technique. The power of GAs derives largely from their ability to exploit efficiently this vast amount of accumulating knowledge by means of relatively simple selection mechanisms. Termination of the may be triggered by finding an acceptable approximate solution by fixing the total number of structure evaluations or some other application dependent criterion.

GA have been used to find optimal dose for chemotherapy. The objective of my thesis work is to design optimal drug doses to minimize the number of cancer cells consequently to keep the other healthy cell at its required value.

1.4 Mathematical modeling, simulation and optimization

In this thesis, de Pillis et al. [2006] mathematical model is implemented that consists of six steps, up to three controls and 29 parameters in the parameter sets that are mouse, human 9, human 10. This model is based on a set of ordinary differential equations. In this model, mouse is given only chemotherapy and human 9 & human 10, both are given chemotherapy and immunotherapy. Experiments with human body are not so easily reproducible in simulation environment. Scientific computing that is mathematical modeling, simulation and optimization processes are complementary to theory and experiments. It is true that the gap between simulated and real world data will still be large because the currently available models in differential equations are not in good match to clinical reality since cancer vary from patient to patient.

The aim of a therapy, no matter what kind, is to minimize the tumor size that is tumor volume or the number of tumor cells. Doses have been applied to the model with a fixed amount of drugs on a certain time period. The result, which makes the tumor grow as big under these constraints, can be considered the worst treatment. Together with the optimal control for a minimal tumor, the best treatment, these results are compared to standard treatments. The differences between worst and best treatments are small, some scenarious are shown where the same amount of durgs leads to a growing tumor on the one hand and a total disappearance on the other hand. In De Pillis et al. [2006] model, there are four objective functions. The objectives are to minimize the cencer cells and to maximize NK cells, CD8+ T cells and Circulating Lymphocytes. The formulation of objective functions is crucial for the optimal control result [Engelhart, 2009].

1.5 Aim of this work

The aim of this thesis work is to design effective drug doses for chemotherapy to minimize the growth of cancer cells with tolerable side effect. To do this, a mathematical models of cancer cell growth developed by de Pillis et at. [2006] investigated. At first, three models, namely, “mouse”, “human 9” and “human 10” are implemented and simulated in software invironment. These models are then tested with predefined drug doses of chemotherapy. After successful investigation of these models, chemotherapy drug scheduling for cancer treatment is designed with genetic algorithm. In this work several cases for chemotherapy drug scheduling are investigated. A comparative assesment is also made to find acceptable drug doses for chemotherapy.

1.6 Organization of the thesis

The work is structured as follows. In the next chapter, I discuss about cancer growth and its treatment with chemotherapy. In chapter 3, I introduce a optimization approach called genetic algorithm. The mathematical model of cancer cell growth developed by de Pillis et al.[2006] is explained and implemented in chapter 4. These models have been tested with predefined drug doses. The outputs of these tests have shown in this chapter too. in chapter 4. Chemotherapy drug dose scheduling can be found in chapter 5. Chapter 6 presents conclusion and future work.

Chapter 2

Cancer Growth and Chemotherapy

2.1 Cancer

Cancer is known medically as a malignant neoplasm. It is a large group of different diseases which involving unregulated cell growth. In cancer, cells divide and grow uncontrollably, forming malignant tumors and invade nearby parts of the body. The cancer may also spread to more distant parts of the body through the lymphatic system or bloodstream. Not all tumors are cancerous. Benign tumors do not grow uncontrollably. They do not invade neighboring tissues, and do not spread throughout the body [Sherr, 2004].

Healthy cells control their own growth and will destroy themselves if they become unhealthy. Cell division is a complex process that is normally tightly regulated. Cancer occurs when problems in the genes of a cell prevent these controls from functioning properly. These problems may come from damage to the gene or may be inherited and can be caused by various sources inside or outside of the cell. Faults in two types of genes are especially important: oncogens, which drive the growth of cancer cells and tumor suppressor genes, which prevent cancer from developing.

Determining what causes cancer is complex and it is often impossible to assign a specific cause for a specific cancer. Many things are known to increase the risk of cancer, including tobacco use, infection, radiation, lack of physical activity, poor diet and obesity, and environmental pollutants [Anand et al, 2008]. These can directly damage genes or combine with existing genetic faults within cells to cause the disease [Kinzler, 2002]. A small percentage of cancers, approximately five to ten percent, are entirely hereditary.

2.1.1 Oncogene

An oncogene is a gene that has the potential to cause cancer [Wilbur, 2009] In tumor cells, they are often mutated or expressed at high levels [Kimball’s biology pages]. An oncogene is a gene found in the chromosomes of tumor cells whose activation is associated with the initial and continuing conversion of normal cells into cancer cells.

Most normal cells undergo a programmed form of death (apoptosis). Activated oncogenes can cause those cells that ought to die to survive and proliferate instead. Most oncogenes require an additional step, such as mutations in another gene, or environmental factors, such as viral infection, to cause cancer. Since the 1970s, dozens of oncogenes have been identified in human cancer. Many cancer drugs target the proteins encoded by oncogenes [Croce, 2008 and Yokota, 2000].

A proto-oncogene is a normal gene that can become an oncogene due to mutations or increased expression. The resultant protein may be termed an oncoprotein [Mitchell et al.]. Proto-oncogenes code for proteins that help to regulate cell growthand differentiation.

The proto-oncogene can become an oncogene by a relatively small modification of its original function. There are three basic activation types:

§ A mutation within a proto-oncogene can cause a change in the protein structure, causing

§ an increase in protein (enzyme) activity

§ a loss of regulation

§ An increase in protein concentration, caused by

§ an increase of protein expression (through misregulation)

§ an increase of protein (mRNA) stability, prolonging its existence and thus its activity in the cell

§ a gene duplication (one type of chromosome abnormality), resulting in an increased amount of protein in the cell

§ A chromosomal translocation (another type of chromosome abnormality), causing

§ an increased gene expression in the wrong cell type or at wrong times

§ the expression of a constitutively active hybrid protein. This type of aberration in a dividing stem cell in the bone marrow leads to adult leukemia

2.1.2 Tumor suppressor gene

A tumor suppressor gene, or anti-oncogene, is a gene that protects a cell from one step on the path to cancer. When this gene is mutated to cause a loss or reduction in its function, the cell can progress to cancer, usually in combination with other genetic changes.

Tumor-suppressor genes, or more precisely, the proteins for which they code, either have a dampening or repressive effect on the regulation of the cell cycle or promote apoptosis, and sometimes do both. The functions of tumor-suppressor proteins fall into several categories including the following [Sherr, 2004].

  • Repression of genes that is essential for the continuing of the cell cycle. If these genes are not expressed, the cell cycle does not continue, effectively inhibiting cell division.
  • Coupling the cell cycle to DNA damage. As long as there is damaged DNA in the cell, it should not divide. If the damage can be repaired, the cell cycle can continue.
  • If the damage cannot be repaired, the cell should initiate apoptosis (programmed cell death) to remove the threat it poses for the greater good of the organism.
  • Some proteins involved in cell adhesion prevent tumor cells from dispersing, block loss of contact inhibition, and inhibit metastasis. These proteins are known as metastasis suppressors [Yoshida et al., 2000].
  • DNA repair proteins are usually classified as tumor suppressors as well, as mutations in such their genes increase the risk of cancer, for example mutations in HNPCC, MEN1 and BRCA. Furthermore, increased mutation rate from decreased DNA repair leads to increased inactivation of other tumor suppressors and activation of oncogenes [Markowitzs, 2000].

The expression of oncogenes can be regulated by microRNAs (miRNAs), small RNAs 21-25 nucleotides in length that control gene expression by downregulating them. Mutations in such microRNAs (known as oncomirs) can lead to activation of oncogenes [Hirohashi, 2003]. Antisense messenger RNAs could theoretically be used to block the effects of oncogenes.

2.2 Biological cell life cycle and formation of cancer

The growth of a cancerous tumor is a complicated process involving multiple biological interactions. The response of such tumors to active treatment such as chemotherapy and radiotherapy is also complex. The physiological condition of the patient, the stage of the diseases, scheduling of the therapy and interaction of the drugs are the main factors in chemotherapy. The cell cycling mechanisms is needed to understand the actions of the chemotherapy treatment. The cell cycle is a chain of phases that both normal and cancer cells undergo from their birth to death. The cycle comprises of five stages as show in Figure 2.1. A brief description of different stages is given below [Alam et al., 2010 & Newbury, 2008]:

G1: Pre-synthetic interface stages. Parent cells grows, organelles are reproduced. Longest phase, can last up to 48 hours. Post mitotic gap, the cell prepares for DNA synthesis.

S: Synthetic interface stage. DNA synthesis takes place in preparation for cell division and lasts 8-20 hours (many anticancer drugs act by interfering with DNA at this stage, causing cell death).

G2: Post-synthetic interface stage. Pre-mitotic gap, specialized proteins and RNA are synthesized in preparation for cell division and lasts up to 4 hours.

M: Mitotic phase, cell division takes place and DNA is distributed evenly among daughter cells and lasts up to 30 minutes.

Go: Resting phase, cell is quiescent, able to work as intended but unable to divide.

2.2.1 Explanation of cell life cycle and formation of cancer

The explanation of cell life cycle and cancer growth are given here and a more detailed discussion can be found in [Calman et al., 1980 and Cridland, 1978]. The cell life cycle begins in the G1 gap phase in which protein and RNA syntheses are active. Once these processes are complete, the cell then enters the synthetic S phase where new DNA is generated. After the new DNA is formed, the duplicated chromosomes condense in the G2 gap phase. In the mean time cyclin-mediated activities create organelles and molecules necessary for the next and final phase called mitosis M. Mitosis consists of a set of steps which leads to the development of two daughter cells. In all of these phases there are a number of check points that the cell must pass through to insure the integrity of the DNA. If the cell fails to meet the necessary requirements at the various checkpoints, the cell either repairs or destroys itself and this process is referred to as apoptosis or programmed cell death. If one of these checkpoint mechanisms fails, the result can be a malignant cell that has mutated from a normal cell and the cell will be a germ or initial proliferating cancerous cell which may potentially develop into cancer [Westman et al., 2001].

Fig. 2.1. The five phases of the cell-cycle: G0 (quiescent), G1 (growth), S (DNA replication),

G2 (mitotic preparation), and M (mitosis). [Florian et al.]

The cancerous cells have acquired chromosomal abnormalities through multi-step process for the molecular basis of cancer [Volker, 2001]. One cancerous cell has the potential to form cancer that can be used to model whether the germ cell forms a colony of cancer. Cancerous cells can be compartmentalized according to their intrinsic proliferation capabilities: those actively engaged in the cell division form the proliferating or growth fraction, those in the G0 quiescent phase form the clonogenic or cancer stem cell fraction, and those incapable of further cell division to propagate the cancer form the end cell fraction. The allowable transitions among these three compartments are shown in Figure 2.3. After a cell completes mitosis, either daughter cell may go to any of the three compartments with a given probability that can depend on the current population sizes of the compartments as well as other factors.

Figure 2.2: The cell life cycle. The active cells undergo a series of steps that finally produce two daughter cells by the division of the parent cell. The quiescent or stem cells are inactive and are in the gap phase G0, which can be activated by appropriate stimulus [Westman et al., 2001].

The probabilistic transition rates of a daughter cell to the growth and clonogenic fractions, in the absence of treatment, depend on the aggressiveness of the cancer. For aggressive or fast growing cancers the transition rate of the daughter cells remaining in the proliferating fraction is high whereas slow developing cancers have high rates entering the clonogenic fraction. Cells in the G0 phase are in a resting or quiescent state and naturally will become active again entering the proliferating fraction which is called a natural back migration. A probabilistic equilibrium exists between the compartments that depend on the number of cells present. The bulk dynamics without treatments, the daughter cells that enter the clonogenic fraction from the proliferating compartment after mitosis is called drift. Large migrations or recruitment of cells from the clonogenic to the growth fraction require an external stimulus. An example of such a stimulus would be the application of a cytotoxic agent, a chemotherapeutic treatment, which would cause an abnormally large number of cells in either or both compartments to undergo apoptosis [Westman et al., 2001].

Figure 2.3: Cancer cell types and the allowable transitions among the compartments. Note that once a cell is in the end cell compartment it may never leave. [Westman et al., 2001].

The stimulus causes the recruitment of cells from the clonogenic to the growth fraction. In turn, filling of the growth fraction tries to ensure the survival of the cancer. This source of cells for the growth fraction in conjunction with intrinsic drug resistance normally means that chemotherapy is a palliative treatment that cannot lead to a cure for cancer by itself. The end cell compartment is a terminating compartment, due to lack of potential cancer proliferation. The model presented in this thesis paper considers cells in the proliferating (P) and clonogenic (C) compartments only because these are the compartments which contain cells capable of proliferating the cancer. This is not to say that the members of the end cell compartment do not contribute to tumor evolution. On the contrary, the end cells play an important role in the development of the cancer, particularly in the size that the cancer can attain.

2.3 Chemotherapy and drug resistance

The chemotherapy treatment which can affect the cells in the end cell fraction such as biomodulators (e.g. interferons and interleukins) [Thompson, 2001]. This thesis paper focuses on treatment with cytotoxic agents which can only affect cells capable of proliferation. Thus, explicit dependence on the end cell compartment is neglected throughout the rest of this paper and model parameters are adjusted accordingly. Traditional chemotherapeutic treatment schedules normally consist of a fixed number of treatments, called a treatment cycle. The number of and the time between treatments is bounded by the cumulative effects of toxicity to normal tissue. For example, healthy cells such as NK cells, CD8+ T cells, circulating lymphocytes are affected due to chemotherapeutic treatments. These treatments are given in order to drive the number of cancer cells to a lower level. The cancer is then referred to as being in remission. After the initial cycle, if the cancer is in remission no further treatments are given and the patient is monitored for recurrence. Upon recurrence, the first time the cancer is clinically detected, the treatment cycle is repeated. If the cancer does not recur after a sufficient period of time, then the claim is that the patient is cured. The goal of chemotherapy in the framework of the three cellular compartments is to drive the cells in the proliferating and clonogenic fractions into the end cell fraction. The compartmental view of cancer also allows for the distinction of different types of cytotoxic agents, in particular cycle specific and nonspecific. Cycle specific drugs can only destroy cells in certain phases of the cell cycle and thus acts only on the proliferating fraction. Cycle nonspecific cytotoxic agents can destroy cells in either proliferating or clonogenic compartments. This formulation allows for different rates for the growth and clonogenic fractions since uptake of the agents in an active cell should be greater than that of a quiescent one. One of the key difficulties encountered in treating cancers with chemotherapy is drug resistance. Every biological system exhibits drug resistance, and cancer is no exception [Goldie and Coldman, 1998]. In either case, drug resistance hinders the effectiveness of a given cytotoxic agent. Thus, there are a finite number of treatments that can be given with significant impact. In order to combat drug resistance and yield a more effective treatment, multiple drugs typically are used that are not cross resistant. Thus, cells resistant to one of the cytotoxic agents can be killed by the others [Kuczek and Chan, 1988]. If a single cycle of treatment does not lead to remission, additional applications of the cytotoxic agent(s) may be administered. Whether single or multiple cytotoxic agents are used, the resistance will be developed among the cancerous cells. Therefore the benefits of treatments will be diminished until they are negligible and should not be used [Murray, 1997].

2.4 Cell growth

The term cell growth is used in the contexts of cell development and cell division (reproduction). When used in the context of cell division, it refers to growth of cell populations, where one cell (the “mother cell”) grows and divides to produce two “daughter cells” (M phase). When used in the context of cell development, the term refers to increase in cytoplasmic and organelle volume (G1 phase), as well as increase in genetic material before replication (G2 phase) [Biology online dictionary]. The exponential growth model, Gompertz growth model and logistic growth model are described below,

2.4.1 Exponential growth model

In exponential growth the doubling time is constant and the cell number N at any time is a function of the initial size N(0), the time of growth t, and a constant . This exponential growth model can be expressed by the formula:

N

This equation has an analytical solution:

N (t) = N(0).exp( t)

Hence, the time td required for N (t1) to be double i.e., for N(t1+td)/N(t1) =2; is always constant and given by loge(2)/b. This model was developed principally to yield a valuable relationship among tumor size, growth characteristics, and therapeutic response [Larry Norton, 1988]. This model also has a significant role in the development of more recent models. The exponential pattern of growth can be useful when it fits actual data, but it is now known not to be universally appropriate. It has been shown for some cancers that, as the mass of a tumor increases, its growth slows due to a reduction in its growth fraction [Martin]. In particular, in many types of growth- normal and neoplastic-td is not constant, but increases as the population size increases [Larry Norton, 1988]. For example a fivefold increase in mean tumor doubling times between early and late breast cancer has been observed [Shackney et al., 1978]. This cannot be reproduced using an exponential growth model.

2.4.2 Gompertz growth model

In a living body as the tumor size increases, the growth of tumor slows down when the tumor mass approaches a plateau population. The Gompertz model is a modification of exponential growth, with the addition of a decreasing growth rate over time [http://www.biology-online.org]. Gompertz’ equation, was originally developed for actuarial analysis, but later proposed as a growth curve. In Gompertzian growth N(t) is not only a function of N(0), and t but also a limiting size N(?),the equation for this model is given by [Larry Norton, 1988]-

N (t) = N (0).exp {k [1-exp (- t)]}

Where k = loge [N(?)/N(0)].

This model has been most broadly and successfully applied to fit with experimental and clinical data [Anita and Francesca, 2008] and uncovers the growth-regulatory mechanisms of animal and human cancers. In addition, when used to relate a tumor’s size to its rate of regression in response to therapy [Norton and Simon, 1977] Gompertz equation has aided in the design of successful clinical trials. Figure-2.4 shows the Gompertz growth of tumor.

2.4.3 Logistic growth model

The logistic growth model has been used in many cases as a basic model of both cell growth and, more particularly, tumor cell growth [Bao-Quan Ai et al., 2003]. Logistic growth is an example of sigmoidal growth function since as time increases the tumor mass approaches a stable level called plateau population. For any living being the plateau population is above the level that is lethal to the host (Martin). The logistic growth function is given by-

=

Where, is a positive constant and is the plateau population of tumor. The exponential and logistic growth functions are close in value until the tumor burden comes within an order of magnitude of the plateau population.

Chapter 3

Genetic Algorithm

3.1 Introduction

Genetic algorithms (GAs) are becoming popular to solve the optimization problems now a days in different fields of application mainly because of its robustness in finding optimal solution and ability to provide near optimal solution close to global minimum. Genetic algorithms employ search procedures based on the mechanism of natural selection and survival of the fittest. The GAs, which uses multiple point searches instead of single point search and works with the coded structure of variables. The information required is the objective function. The Gas has proven successful in obtaining a solution to the travelling salesman problem, optimal control of an aircraft auto pilot system, multivariate curve-fitting, and game playing. A number of these achievements suggest the potential utility of the Gas as a method for control system design.

3.2 Genetic algorithm

Genetic algorithm (GA) is a way to randomly search for the best answers [Holland, 1975]. Over the last few years, it is becoming important to solve a wide range of search, optimization and machine learning problems. It attempts to solve problems in a fashion similar to the way in which human genetic process seem to operate. It’s fundamental principle is the fittest member of population has the highest probability for survival.

A GA (multi path search scheme) is an iterative procedure which maintains a constant size population p(t) of candidate solutions. The initial population p(0) can be chosen at random. The structures of the populatioin p(t+1) (i.e. for next iteration called generation) are chosen from p(t) randomize selection procedure that ensures that the expected number of times a structure is chosen is approximately proportional to that structure’s performance relative to the rest of the population. In order to search other points in a search space, some variation is introduced into the new population by means of genetic operators (crossover and mutation) [Goldberg, 1989].

While it may seem to be a random search, in fact, the improvement in each generation indicates that the algorithm produces an effective directed search technique. The power of GAs derives largely from their ability to exploit efficiently this vast amount of accumulating knowledge by means of relatively simple selection mechanisms. Termination of the may be triggered by finding an acceptable approximate solution by fixing the total number of structure evaluations or some other application dependent criterion.

3.3 Representation

GAs is derived from biological systems. In biological systems evolution takes place on chromosomes – organic devices for encoding the structure of living beings. A living being is only a decoded structure of the chromosomes. Natural selection is the link between chromosomes and the performance of their decoded structure. In GAs the design variables or features that characterize an individual are represented in ordered list called string. Each design variable corresponds to gene and the strings of design variables correspond to chromosome in biological systems.

Bit strings lists of 0’s and 1’s have been found to be effective in representing a wide variety of information in the problem domains involving function optimization. The length of genes (bit size) depends on the precision required. If the variables are integer then the number of bits in a gene may be determined by the largest integer in the search range.

3.4 Definition of some important genetic codes

Population Size: It is defined as number of multiple paths used in search space. It affects both the ultimate performance and the efficiency of GAs.

Bit Size: Also called gene length. It is basically the number of bits chosen to represent the given design variables.

String Length: Also called chromosome length. It is combined length of all the genes (i.e. bit size* no. of variables).

No. of Generation: This is the number of new populations that is wished to be created during the search.

Crossover Rate : It controls the frequency with which the crossover operation is applied. In each new population ‘crossover rate* population size’ structure undergo crossover.

Mutation Rate: It is secondary search operator, which increases the variability of the population. Approximately ‘mutation rate*string rate*population size’ number of bit position of each structure in new population undergoes a random change with a probability equal to the mutation rate.

Fitness: It is defined as a non-negative figure of merit to be maximized. It is associated directly with the objective function value in the optimization.

3.5 Methodology

A GA is an iterative procedure which maintains a constant size population p(t) of candidate solutions. The algorithm begins with a randomly selected population of function inputs represented by strings of bits. During each iteration step called generation, the structure in the current population is evaluated and on the basis of that evaluation, a new population of candidate solution is formed. That is, GA uses the current population of string to create a new population such that the strings in the new population are on average ‘better’ than those in the current population. The idea is to use the best elements from the current population will on average be “better” than the old population. Three processes- selection, mating and mutation are used to make the transition from one population generation to the next [Goldberg, 1989]. The basic genetic algorithm cycle based on these is shown in Fig-3.1. These three steps are repeated to create each new generation. And it continues in this fashion until some stopping condition is reached (such as maximum number of generations or resulting new population not improving fast enough).

Evaluation

Crossover

Selection
Mutation
Mating
Old Population
New Population

Fig-3.1. The basic genetic algorithm cycle.

Initialization

GAs operates with a set of strings instead of a single string. This set or population of string goes through the process of evaluation to produce new individual strings. To begin with the initial population could be seeded with heuristically chosen string or at random. In either case, the initial population should contain a wide variety of structures.

Evaluation Function

The evaluation function is a procedure to determine the fitness of each string in the population and is very much application oriented. The performance of each structures of string is evaluated according to its fitness, which is defined as a non-negative figure of merit to be maximized [Holland, 1975]. It is directly associated with the objective function value in the optimization. GA treats the problem, as a black box in which the input is the structure of the string and the output is its fitness. Because GA proceeds only according to the fitness of the strings and not to other information, the properties of the fitness will influence the GA’s performance. The evaluation function, Fc for binary vector C is equivalent to the function F.

Fc = F(x).

Where the chromosome C corresponds to real value (x). The evaluation function plays the role of the environment rating potential solutions in terms of their fitness. In many problems, the objective is more naturally stated as the minimization of some cost function J rather than the maximization of some utility or profit function u(x). Hence, it is often necessary to map the objective junction to a fitness function form.

Selection

This is the first step of the three genetic operations. This determines which strings in the current generation will be used to create the next generation. This is done by using a biased random selection methodology. That is, parents are randomly selected from the current population is such a way that the “best” strings in the population have the greatest chance of being selected [Goldberg, 1989]. There are many ways to do this. One commonly used technique is Roulette Wheel selection. The following steps are carried out in Roulette Wheel parent selection algorithm:

(1) Sum the fitness of all the population members. Call it as total fitness.

(2) Generate a random numbers (rj) between 0 and the total fitness.

(3) Select a population member whose cumulative fitness obtained from adding its fitness to the fitness of the proceeding population members is greater than or equal to rj.

This algorithm is referred to as Roulette Wheel selection because it can be viewed as allocation pi-shaped slice on Roulette Wheel to population members with each slice proportional to the member’s fitness. Selection of a population member to be a parent can then be viewed as a spin of the wheel. This parent selection technique has the advantage that it directly promotes reproduction of the fittest population member by biasing each member’s chance of selection in accordance with its evaluation. Table -3.1 shows the fitness of five chromosomes, running total of their fitness and percentage of fitness. Figure – 3.2 shows the Roulette Wheel and Table – 3.2 shows the chromosome that would be chosen by the Roulette Wheel algorithm using these fitness values for each 5 randomly generated numbers. This technique can be used for only those chromosomes whose fitness is positive numbers [Goldberg et al., 1989].

Chromosome [j] 1 2 3 4 5
Fitness (fj) 0.1538 0.5352 0.0739 0.4805 0.3851
Running Total(Rfj) 0.1538 0.6890 0.7629 1.2434 1.6285
Percentage (fj/? fj)×100 9.44 32.86 4.54 29.51 23.46

Table-3.1: Roulette Wheel Parent Selection

Random Number (rj) 0.250 1.173 0.567 1.189 0.294
Chromosome Chosen 2 4 2 4 2
9.44

29.51

4.54

23.46

32.86

Table-3.2 : Selection of Chromosomes

Fig-3.2: Roulette Wheel

Remark

Although this parent selection procedure is random, each parent’s chance of being selected is directly proportional to its fitness. It is possible that the worst population member could be selected by this algorithm. Such an occurrence would inhibit the performance of genetic algorithm using this selection technique. But the odds of this happening in a population of any size are negligible.

Crossover

Crossover is a randomized yet structured recombination operation [ Srinivas and Patnaik, 1994]. Simple crossover may proceed in two steps. Firstly, the newly reproduced strings in the mating pool are mated at random. Secondly, crossover of each pair of strings is done as follows:

1. An integer position p along string is selected at random in the intervals [L, L-1], where L is the string length.

2. Two new strings are created by swapping all characters between

Let us consider the crossover operator on chromosomes C2 and C3. Assume that the crossover point was (randomly) selected after the 5th gene

C2 = 00000011100000

C3 = 11100000001111

The two resulting offspring will be

C2’ = 00000000001111

C3’ = 11100011100000

Crossover exchanges corresponding genetic material from two parent chromosomes, allowing beneficial genes of different parents to be combined in their offspring.

Mutation

Reproduction and crossover effectively search and recombine the existing chromosomes. However, they do not create any new genetic material in the population. Mutation is capable of overcoming this shortcoming. It is an occasional random alteration of a string position. In the binary sting representation, this simply means changing a 1 to 0 or vice versa. This random mutation provides background variation and occasionally introduces beneficial materials into the population [ Srinivas and Patnaik, 1994].

Consider the mutation operation on a chromosome C3 and say that the 3rd gene is selected for mutation then

C3” = 11000000001111

Remark

Nothing in the classical approach to mutation or crossover requires that only those methods be used in a genetic algorithm. One could try any procedure one can think of to mix bits in a population to create a new population. Some of the ideas may not produce a new population better than the previous one for a specific application on the other hand some may work very well.

3.6 Genetic control parameter selection

Genetic parameters (namely population size, crossover rate and mutation rate) are the entities that help to tune the performance of the Genetic Algorithms. The selection of values for these parameters plays an important role in obtaining an optimal solution. There are no deterministic rules to decide these values. There are some general guidelines which could be followed to arrive at an optimal values for these parameters.

Population size:

GAs generally does poorly with very small population, because the population provides an insufficient sample size for most hyper planes. A large population is more likely contain representative from a large number of hyper planes. Hence the GAs can perform a more informed search. As a result, a large population discourages premature convergence to optimal solution. On the other hand, a large population requires some evaluations per generation, possibly resulting in an unacceptably slow rate of convergence.

Crossover rate:

The higher the crossovers rate, the more quickly new structures are introduced into the population. If the crossover rate is too high, high performance structures are discarded faster than the selection can produce improvements. If the crossover rate is low the search may stagnate due to the lower exploration rate.

Mutation rate:

A low level of mutation serves to prevent any given bit position from remaining forever converged to a single value in the entire population. A high level of mutation yields an essentially random search.

Fitness function

In GAs, the value of fitness represents the ‘performance’, which is used to rank the string, and the ranking is then used to determine how to allocate reproductive opportunities. This means that individuals with higher fitness value will have higher probability of being selected as a parent. Fitness thus is some measure of goodness to be maximized [Srinivas and Patnaik, 1994]. The fitness function is essentially the objective function for the problem.

3.7 Flow chart [Goldberg, 1989] for the basic genetic algorithm

Stop
Is generation< max generation?
Is offspring< popsize*Pc?
Generation = 1
Generation= generation+1
Combine offspring & old population to create a single old population containing best of both
Print the solution vector
Copy the string with highest fitness to the solution vector
Compare the fitness of each offspring, i.e. call subroutine FITTNESS
Offspring = offspring + 2
Mutate each offspring based on their probability of selection
Perform the crossover to produce two offspring
Select the two parent string from old population based on their probability of selection
Offspring = 0
Compute the selection probability for each string in old population
Randomly generate the initial population i.e. old population & compute the fitness of each string, i.e. call subroutine FITNESS
Start

No

Chapter 4

Model Explanation and Implementation

4.1 De Pillis et al. 2006

The model by de Pillis et al. consists of six states, up to three controls and 29 parameters in three parameter sets [de Pillis et al. 2006].

Table-4.1: Overview of states and controls in de Pillis et al. model.

The work is based on this model [de Pillis et al. 2001, 2003, 2005]. This model contains a combination of chemotherapy and immunotherapy. Thus, one of the six states is still the tumor population x0, but in absolute cell count. Instead of the blood vessel volume, this model features three types of immune cell populations (all in absolute cell count):

• NK cells x1—unspecific immune cells which are also present in a healthy body, “natural killer” cells

• CD8+ T cells x2—tumor-specific immune cells

• circulating lymphocytes x3—white blood cells

The fifth and the sixth state represent the chemotherapeutic drug concentration x4 and interleukin-2 concentration x5 respectively. Interleukin-2, which is one of the two immunotherapeutic controls, is a cytokine that stimulates the production of CD8+ T cells and is used “to boost immune system function” [L. de Pillis et al. 2006]. In addition, there is a control for a classic chemotherapeutic drug u0 and one for a tumor infiltrating lymphocyte injection (TIL) u2. The latter means an injection of CD8+ T cells that have been highly stimulated against tumor cells outside the body.

4.2 Parameter sets

There are three parameter sets given: mouse, human 9, and human 10. As the name suggests, the mouse parameter set contains parameters derived from murine experiments. In fact, the parameters come from different papers treating different types of cancer and different types of mice. For example the tumor growth parameter has been fitted to data from a paper by Diefenbach et al. [2001]. On the other hand, e.g. the death rate of NK cells is derived from Kuznetsov et al. [Kuznetsov et al. 1994], where a mathematical model is used to describe the kinetics of growth and regression of a BCL1 lymphoma. For the parameters r2 and v, there is no source. For the human parameter set, De Pillis et al refer to an article by Dudley et al. [2002]. The data of the ones numbered 9 and 10 has been used in de Pillis et al. 2006 for fitting seven of the parameters. Again, for r2 and v, there is no source. Actually, there are eight more references for the human parameter set, e.g. the two murine papers from the mouse parameter set. Another source [Kirschner and Panetta, 1998] refers among others to Kuznetsov et al. Note that this means that about one third of the human parameters comes from murine experiments. In [Kirschner and Panetta, 1998] it is stated that the value of a certain parameter in their model varies greatly from patient to patient and cancer to cancer. So this model may not be adequate for generating exact treatment schedules 1—in particular for humans since there is only little human data used and the values may highly depend on the patient and the type of cancer. But it still may be useful for making general qualitative statements. Table 4.2 gives an overview of the different parameter sets.

4.3 Model equations

The model features a logistic growth term in the tumor equation, here

(4.1)

As already mentioned above, NK cells x1 and CD8+ T cells x2 both kill tumor cells. The interaction between NK and tumor cells is modeled with a bilinear term, while the effect of CD8+ T cells introduces a quotient also used in other equations:

(4.2)

Both terms are taken from de Pillis et al. [2006] The last summand in the tumor equation represents the effect of chemotherapy on the tumor where a saturation term (1 ? e?x4(t)) is used. This term with different parameters is contained in the immune cell equations, too:

(4.3a)

(4.3b)

(4.3c)

(4.3d)

The equation for NK cells contains a recruitment term of circulating lymphocytes e x3(t) and an exponential decay term ?f x1(t) representing the limited natural lifespan of each cell. Additionally, there is a term for recruitment of NK cells,

(4.4)

and one for the inactivation of NK cells after interaction with tumor cells (4.5)

Table 4.2: Parameter sets of de Pillis et al. [2006] [Engelhart, 2009]

Matching the assumption that there are no CD8+ T cells present in the absence of a tumor, there is no growth term but an exponential decay ?m x2 in the corresponding equation. However, there are several recruitment terms. One for the recruitment by tumor-CD8+ T cell interaction, one for the recruitment by tumor cells killed by NK cells (tumor debris), and one for the recruitment by encounters between tumor cells and circulating lymphocytes:

(4.5)

Similar to the inactivation of NK cells there is a summand ?q x2 x0 for the inactivation of CD8+ T cells after interacting with tumor cells. In addition, there is an inactivation term ?v x1 x22 describing the regulation and suppression of CD8+ T cell activity at high cell levels. In the CD8+ T cell equation there is also a Michaelis-Menten-term for the influence of IL-2 and a simple input of the TIL control,

(4.6)

Circulating lymphocytes are expected to have a constant source and again a limited natural lifespan which leads—together with chemotherapeutic influence—to

(4.7)

The last two equations—the ones for chemo- and immunotherapeutic drug concentration— are of simple exponential decay type with the corresponding controls as inputs:

(4.8)

(4.9)

Again for reasonable results, it is necessary that all states and controls are ? 0. Therefore De Pillis add the constraints

(4.10)

(4.11)

Finally the whole model by the following set of ODEs. In favor of readability the time dependence of states and controls is omitted. For more details the equations are referred to [de Pillis et al., 2006, 2003].

(4.12)

(4.13)

(4.14)

(4.15)

(4.16)

(4.17)

(4.18) (4.19)

(4.20)

(4.21)

4.4 Initial values and optimal control

There are nine different initial value sets: one for the mousesetting and eight for human 9 and human 10respectively. They are listed in table 4.3. In de Pillis et al. 2006, there is no optimal control problem given and no optimization done. Though there are many different scenarios investigated, e.g. big initial tumor size vs. small initial tumor size, combination therapy vs. monotherapy and so on. The setting with nonlinear ODEs, multiple controls, three different parameter sets and lots of initial values helps for any applications of optimal control techniques. Numerical results for