abtools.phylogeny: Phylogenetic analysis of antibody lineages
abtools._phylogeny.run(**kwargs)
Builds a phylogenetic representation of antibody sequences.
output is required, as well as one of input, alignment or newick.
Parameters:
- input (str) –
  Can be one of three things:
  - Path to a FASTA-formatted file containing input sequences.
  - A list of AbTools Sequence objects.
  - A list of dictionaries, containing at minimum name_key and seq_key.
- output (str) – Path to the output directory, into which tree images and all intermediate files will be deposited.
- root (str) – Path to a FASTA-formatted file containing a single sequence which will be used to root the tree. If not provided, tree will be unrooted.
- mabs (str) – Path to a FASTA-formatted file containing mAb sequences. If supplying both mAb sequences and NGS sequences, passing the mAb sequences separately allows you to modify their representation separately (for example, show sequence IDs for just the mAb sequences).
- alignment (str) – Path to a multiple sequence alignment, in FASTA format. If sequences are already aligned, this will save some computational time since the alignment will not be redone.
- newick (str) – Path to a tree file, in Newick format. As with alignment, this is primarily to save computational time if the tree file has already been generated.
- name_key (str) – If input is a list of Sequence objects or dicts, this key will be used to find the sequence ID. Default is seq_id.
- sequence_key (str) – If input is a list of Sequence objects or dicts, this key will be used to find the sequence. Default is vdj_nt.
- timepoints (str) –
  Path to a tab-delimited file of the following format (one timepoint per line):
  TimepointName TimepointOrder TimepointColor
  TimepointName should be prepended to the sequence IDs in the input file (separated by delimiter). TimepointOrder is an integer that indicates the order in which the timepoints should be sorted. TimepointColor is a hex value that will be used to color the phylogenetic tree. If mAb sequences are provided, the ‘mab’ TimepointName will be used to sort/color the mAb sequences. If timepoints is not provided, colors will be automatically selected and timepoints will be determined by a simple sort of the raw timepoint values parsed from the input file.
- is_aa (bool) – If True, input sequences will be assumed to be amino acid sequences. Default is False, which assumes nucleotide sequences.
- delimiter (str) – The delimiter used in sequence IDs to separate the timepoint from the sequence name. Default is _.
- scale (int) – Horizontal scale of the phylogeny. Default is None, which uses the default ete2 value.
- branch_vertical_margin (float) – Vertical scale of the phylogeny. Default is None, which uses the default ete2 value.
- label_nodes (str) – Type of nodes to be labeled. Options are: all, none, no-root, mab, input, and root.
- label_fontsize (float) – Font size for the node labels.
- tree_orientation (int) – If 0, the tree is drawn from left to right. If 1, the tree is drawn from right to left (mirrored). Default is 0.
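The timepoints file format described above can be sketched in a few lines of standard-library Python. This is only an illustration: the timepoint names, colors, and the helpers write_timepoints and split_timepoint are hypothetical and not part of AbTools, and splitting at the first occurrence of the delimiter is an assumption about how IDs are parsed.

```python
import csv
import os
import tempfile

# Hypothetical timepoint definitions: (name, sort order, hex color).
# The 'mab' row follows the convention described above for mAb sequences.
TIMEPOINTS = [
    ("wk12", 1, "#1f77b4"),
    ("wk24", 2, "#ff7f0e"),
    ("mab", 3, "#d62728"),
]


def write_timepoints(path, timepoints):
    """Write one tab-delimited 'name<TAB>order<TAB>color' row per timepoint."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for name, order, color in timepoints:
            writer.writerow([name, int(order), color])


def split_timepoint(seq_id, delimiter="_"):
    """Split a sequence ID into (timepoint, sequence name).

    Assumption: the split happens at the first occurrence of the delimiter.
    """
    timepoint, _, name = seq_id.partition(delimiter)
    return timepoint, name


# Write the file to a temporary directory and read it back.
tp_path = os.path.join(tempfile.mkdtemp(), "timepoints.txt")
write_timepoints(tp_path, TIMEPOINTS)

with open(tp_path) as handle:
    rows = [line.rstrip("\n").split("\t") for line in handle]
```

The resulting path is what would be passed as timepoints, and sequence IDs in the input would carry the matching prefix, for example wk12_seq001 with the default delimiter.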
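As a rough sketch of how the arguments fit together, the snippet below builds a list of dictionaries using the default keys (seq_id and vdj_nt) and checks the stated rule that output is required along with one of input, alignment or newick. The check_run_kwargs helper, the placeholder sequences, and the output directory name are all hypothetical, and the import path in the final comment is assumed from the name abtools._phylogeny.run rather than confirmed.

```python
# Hypothetical records: short placeholder sequences, not real antibody data.
# IDs carry a timepoint prefix separated by the default delimiter "_".
records = [
    {"seq_id": "wk12_seq001", "vdj_nt": "CAGGTGCAGCTGGTGCAG"},
    {"seq_id": "wk24_seq002", "vdj_nt": "CAGGTGCAGCTGGTGGAG"},
]


def check_run_kwargs(**kwargs):
    """Hypothetical pre-flight check mirroring the documented requirements."""
    if not kwargs.get("output"):
        raise ValueError("output is required")
    if not any(kwargs.get(k) for k in ("input", "alignment", "newick")):
        raise ValueError("one of input, alignment or newick is required")

    # Defaults taken from the parameter descriptions above.
    name_key = kwargs.get("name_key", "seq_id")
    sequence_key = kwargs.get("sequence_key", "vdj_nt")

    # Only dictionaries are checked here; Sequence objects are left alone.
    seqs = kwargs.get("input")
    if isinstance(seqs, list):
        for rec in seqs:
            if isinstance(rec, dict) and not {name_key, sequence_key} <= rec.keys():
                raise ValueError("each dict needs %s and %s" % (name_key, sequence_key))
    return kwargs


kwargs = check_run_kwargs(
    input=records,
    output="phylogeny_output",  # hypothetical output directory
    delimiter="_",
    label_nodes="mab",
)

# With AbTools installed, the documented call would then look like this
# (import path assumed, not verified):
#     from abtools import _phylogeny
#     _phylogeny.run(**kwargs)
```

Validating up front is a design convenience only; it surfaces a missing key or argument before any alignment or tree building starts.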