All individuals in a population share the same genotypic properties such as number of chromosomes, number and position of loci, names of markers, chromosomes, and information fields. These properties are stored in this GenoStruTrait class and are accessible from both Individual and Population classes. Currently, a genotypic structure consists of
In addition to basic property access functions, this class provides some utility functions such as locusByName, which looks up a locus by its name.
A Population consists of individuals with the same genotypic structure. An Individual object cannot be created independently, but references to individuals can be retrieved using member functions of a Population object. In addition to structural information shared by all individuals in a population (provided by class GenoStruTrait), the Individual class provides member functions to get and set the genotype, sex, affection status and information fields of an individual.
Genotypes of an individual are stored sequentially and can be accessed locus by locus, or in batch. The alleles are arranged by position, chromosome and ploidy. That is to say, the first allele on the first chromosome of the first homologous set is followed by alleles at other loci on the same chromosome, then alleles on the second and later chromosomes, followed by alleles on the second homologous set of the chromosomes for a diploid individual. A consequence of this memory layout is that alleles at the same locus of a non-haploid individual are separated by Individual::totNumLoci() loci. It is worth noting that access to invalid chromosomes, such as the Y chromosomes of female individuals, is not restricted.
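The layout described above can be sketched as an index computation in plain Python. This is an illustration only, not simuPOP code: the function name allele_index and its parameters are hypothetical, and the sketch assumes that every homologous set carries the same loci.

```python
from itertools import accumulate


def allele_index(ploidy_index, chrom, locus, loci_per_chrom):
    """Return the position of an allele in a flat, sequentially stored genotype.

    ploidy_index   -- index of the homologous set (0 or 1 for a diploid)
    chrom          -- chromosome index
    locus          -- locus index relative to the start of the chromosome
    loci_per_chrom -- number of loci on each chromosome (hypothetical input)
    """
    if not 0 <= locus < loci_per_chrom[chrom]:
        raise IndexError("locus out of range for this chromosome")
    tot_num_loci = sum(loci_per_chrom)
    # Offset of the first locus of each chromosome: [0, n0, n0 + n1, ...]
    chrom_begin = [0] + list(accumulate(loci_per_chrom))[:-1]
    return ploidy_index * tot_num_loci + chrom_begin[chrom] + locus


loci_per_chrom = [3, 2]  # two chromosomes with 3 and 2 loci
first_copy = allele_index(0, 1, 0, loci_per_chrom)   # first homologous set
second_copy = allele_index(1, 1, 0, loci_per_chrom)  # second homologous set
```

In this sketch the two copies of a locus differ in position by sum(loci_per_chrom), which plays the role of Individual::totNumLoci() in the text.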
A simuPOP population consists of individuals of the same genotypic structure, organized by generations, subpopulations and virtual subpopulations. It also contains a Python dictionary that is used to store arbitrary population variables.
In addition to genotypic structure related functions provided by the GenoStruTrait class, the population class provides a large number of member functions that can be used to
The following parameters are used to create a population object:
The pedigree class is derived from the population class. Unlike the population class, which emphasizes individual properties, the pedigree class emphasizes relationships between individuals. A unique ID for each individual is needed to create a pedigree object from a population object. Compared to the Population class, a Pedigree object is optimized for accessing individuals by their IDs, regardless of population structure and ancestral generations. Note that the implementation of some algorithms relies on the fact that parental IDs are smaller than those of their offspring, because individual IDs are assigned sequentially during evolution. Pedigrees with manually assigned IDs should try to obey this rule.
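For pedigrees with manually assigned IDs, the ordering rule can be checked before use. The following is a minimal sketch in plain Python, not part of simuPOP: the function name parents_precede_offspring is hypothetical, the pedigree is represented as a plain dictionary, and the use of 0 to mark an unknown parent is an assumption made for this example.

```python
def parents_precede_offspring(pedigree):
    """Check that every known parental ID is smaller than the offspring ID.

    pedigree maps an individual ID to a (father ID, mother ID) pair.
    A parental ID of 0 means "unknown parent" (assumption of this sketch).
    """
    return all(
        parent_id == 0 or parent_id < ind_id
        for ind_id, parents in pedigree.items()
        for parent_id in parents
    )


# IDs assigned sequentially: founders 1 and 2, then their offspring 3.
sequential = {1: (0, 0), 2: (0, 0), 3: (1, 2)}
# Manually assigned IDs in which a parent (5) has a larger ID than its child (4).
manual = {5: (0, 0), 2: (0, 0), 4: (5, 2)}

sequential_ok = parents_precede_offspring(sequential)
manual_ok = parents_precede_offspring(manual)
```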
This function locates relatives (of type relType) of each individual and stores their IDs in information fields relFields. The length of relFields determines how many relatives an individual can have.
Parameter relType specifies what type of relative to locate, which can be
Optionally, you can specify the sex and affection status of relatives you would like to locate, using parameters sex and affectionStatus. sex can be ANY_SEX (default), MALE_ONLY, FEMALE_ONLY, SAME_SEX or OPPOSITE_SEX, and affectionStatus can be AFFECTED, UNAFFECTED or ANY_AFFECTION_STATUS (default). Only relatives with specified properties will be located.
This function will by default go through all ancestral generations and locate relatives for all individuals. This can be changed by setting parameter ancGens to certain ancestral generations you would like to process.
Trace a relative path in a population and record the result in the given information fields resultFields. This function is used to locate more distant relatives based on the relatives located by function locateRelatives. For example, after siblings and offspring of all individuals are located, you can locate mother’s sibling’s offspring using a relative path, and save their indexes in each individual’s information fields resultFields.
A relative path consists of a fieldPath that specifies which information fields to look for at each step, a sex that specifies sex choices at each generation, and an affectionStatus that specifies affection status at each generation. fieldPath should be a list of information fields; sex and affectionStatus are optional. If specified, they should be a list of ANY_SEX, MALE_ONLY, FEMALE_ONLY, SAME_SEX and OPPOSITE_SEX for parameter sex, and a list of UNAFFECTED, AFFECTED and ANY_AFFECTION_STATUS for parameter affectionStatus.
For example, if fieldPath = [['father_id', 'mother_id'], ['sib1', 'sib2'], ['off1', 'off2']], and sex = [ANY_SEX, MALE_ONLY, FEMALE_ONLY], this function will locate father_id and mother_id for each individual, find all individuals referred to by father_id and mother_id, find information fields sib1 and sib2 from these parents and locate male individuals referred to by these two information fields. Finally, the information fields off1 and off2 from these siblings are located and are used to locate their female offspring. The results are the daughters of the brothers of the father or mother. Their indexes will be saved in each individual’s information fields resultFields. If a list of ancestral generations is given in parameter ancGens, only individuals in these ancestral generations will be processed.
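The stepwise lookup in this example can be sketched in plain Python. This is an illustration of the idea rather than simuPOP code: individuals are plain dictionaries keyed by ID, the names trace_path and sex_matches are hypothetical, the constants are local stand-ins for the simuPOP ones, and only ANY_SEX, MALE_ONLY and FEMALE_ONLY are handled (SAME_SEX, OPPOSITE_SEX and affectionStatus are left out).

```python
MALE, FEMALE = "M", "F"
ANY_SEX, MALE_ONLY, FEMALE_ONLY = "any", "male_only", "female_only"


def sex_matches(individual, choice):
    """Return True if the individual satisfies the sex choice of this step."""
    if choice == MALE_ONLY:
        return individual["sex"] == MALE
    if choice == FEMALE_ONLY:
        return individual["sex"] == FEMALE
    return True  # ANY_SEX


def trace_path(start_id, individuals, field_path, sex):
    """Follow field_path from start_id and return the IDs reached at the end."""
    current = [start_id]
    for fields, choice in zip(field_path, sex):
        reached = []
        for ind_id in current:
            for field in fields:
                rel_id = individuals[ind_id].get(field)
                if rel_id is None or rel_id not in individuals:
                    continue  # empty field or unknown relative
                if sex_matches(individuals[rel_id], choice) and rel_id not in reached:
                    reached.append(rel_id)
        current = reached
    return current


individuals = {
    1: {"sex": MALE, "sib1": 3, "sib2": 4},    # father
    2: {"sex": FEMALE, "sib1": 5},             # mother
    3: {"sex": MALE, "off1": 11, "off2": 12},  # father's brother
    4: {"sex": FEMALE, "off1": 14},            # father's sister
    5: {"sex": MALE, "off1": 13},              # mother's brother
    10: {"sex": FEMALE, "father_id": 1, "mother_id": 2},
    11: {"sex": FEMALE}, 12: {"sex": MALE},
    13: {"sex": FEMALE}, 14: {"sex": FEMALE},
}
field_path = [["father_id", "mother_id"], ["sib1", "sib2"], ["off1", "off2"]]
cousins = trace_path(10, individuals, field_path, [ANY_SEX, MALE_ONLY, FEMALE_ONLY])
```

For individual 10, the sketch reaches individuals 11 and 13, the daughters of the two uncles. Individual 14 is skipped because her parent is an aunt, and individual 12 is skipped because he is male.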
A simuPOP simulator is responsible for evolving one or more populations forward in time, subject to various operators. Populations in a simulator are created from one or more replicates of specified populations. A number of functions are provided to access and manipulate populations, and most importantly, to evolve them.
Evolve all populations gen generations, subject to several lists of operators which are applied at different stages of an evolutionary process. Operators initOps are applied to all populations (subject to applicability restrictions of the operators, imposed by the rep parameter of these operators) before evolution. They are used to initialize populations before evolution. Operators finalOps are applied to all populations after the evolution.
Operators preOps and postOps are applied during the life cycle of each generation. These operators can be applied at all or some of the generations, to all or some of the evolving populations, depending on the begin, end, step, at and reps parameters of these operators. These operators are applied in the order in which they are specified. Populations in a simulator are evolved one by one. At each generation, operators preOps are applied to the parental generation. A mating scheme is then used to populate an offspring generation. For each offspring, his or her sex is determined before during-mating operators of the mating scheme are used to transmit parental genotypes. After an offspring generation is successfully generated and becomes the current generation, operators postOps are applied to the offspring generation. If any of the preOps and postOps fails (returns False), the evolution of a population will be stopped. The generation number of a population, which is the variable "gen" in each population’s local namespace, is increased by one if an offspring generation has been successfully populated, even if a post-mating operator fails. Another variable "rep" will also be set to indicate the index of each population in the simulator. Note that populations in a simulator do not have to have the same generation number. You could reset a population’s generation number by changing this variable.
Parameter gen can be set to a non-negative number, which is the number of generations to evolve. If a simulator starts at the beginning of a generation g (for example 0), the simulator will stop at the beginning (instead of the end) of generation g + gen (for example gen). If gen is negative (default), the evolution will continue indefinitely, until all replicates are stopped by operators that return False at some point (these operators are called terminators). At the end of the evolution, the number of generations that each replicate has evolved is returned. Note that finalOps are applied to all applicable populations, including those that have stopped before others.
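The stopping rules described above can be summarized with a simplified loop in plain Python. This is a sketch under stated assumptions, not the simuPOP implementation: populations are plain dictionaries, operators are callables that return True or False, the mating scheme is a single hypothetical callable mate, initOps, finalOps and the begin, end, step, at and reps parameters are omitted, and the order in which replicates are interleaved is an assumption.

```python
def evolve(populations, pre_ops=(), mate=None, post_ops=(), gen=-1):
    """Evolve each population and return the generations each one evolved."""
    evolved = [0] * len(populations)
    active = [True] * len(populations)
    for rep, pop in enumerate(populations):
        pop["rep"] = rep  # index of the population in the simulator
    step = 0
    # A negative gen means: continue until every replicate has been stopped.
    while any(active) and (gen < 0 or step < gen):
        for rep, pop in enumerate(populations):
            if not active[rep]:
                continue
            # A failing pre-mating operator stops this replicate before mating.
            if not all(op(pop) for op in pre_ops):
                active[rep] = False
                continue
            mate(pop)  # populate the offspring generation
            pop["gen"] += 1  # counted even if a post-mating operator fails
            evolved[rep] += 1
            if not all(op(pop) for op in post_ops):
                active[rep] = False
        step += 1
    return evolved


def double_size(pop):
    """Hypothetical mating scheme: the offspring generation is twice as large."""
    pop["size"] *= 2


def stop_when_large(pop):
    """Hypothetical terminator: returns False once the population reaches 16."""
    return pop["size"] < 16


pops = [{"gen": 0, "size": 4}, {"gen": 0, "size": 1}]
result = evolve(pops, mate=double_size, post_ops=[stop_when_large])
```

In this run the first replicate is stopped by the terminator after two generations and the second after four, so the replicates end with different values of "gen".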
If parameter dryrun is set to True, this function will print a description of the evolutionary process generated by function describeEvolProcess() and exit.