CRISPRessoWGS Parameters

Amplicon Min Alignment Score

-amas, --amplicon_min_alignment_score

Help: Amplicon Minimum Alignment Score; score between 0 and 100; sequences must have at least this homology score with the amplicon to be aligned (can be comma-separated list of multiple scores, corresponding to amplicon sequences given in --amplicon_seq)

Type: str


Default Minimum Alignment Score

--default_min_aln_score, --min_identity_score

Help: Default minimum homology score for a read to align to a reference amplicon

Type: int

Default: 60
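
The interaction between the per-amplicon scores and the default score can be pictured with a small sketch. This is an illustration only, not CRISPResso's code: the function name is hypothetical, and raising an error when the list length does not match the number of amplicons is an assumption made here for clarity.

```python
def resolve_min_alignment_scores(amplicon_names, amas=None, default_min_aln_score=60):
    """Return one minimum alignment score per amplicon (illustrative sketch)."""
    if not amas:
        # No per-amplicon list given: every amplicon uses the default score.
        return {name: float(default_min_aln_score) for name in amplicon_names}
    scores = [float(s) for s in amas.split(",")]
    if any(s < 0 or s > 100 for s in scores):
        raise ValueError("scores must be between 0 and 100")
    if len(scores) != len(amplicon_names):
        # Assumption of this sketch: require exactly one score per amplicon.
        raise ValueError("expected one score per amplicon")
    return dict(zip(amplicon_names, scores))


print(resolve_min_alignment_scores(["Reference", "HDR"], amas="60,80"))
```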


Expand Ambiguous Alignments

--expand_ambiguous_alignments

Help: If more than one reference amplicon is given, reads that align to multiple reference amplicons will count equally toward each amplicon. Default behavior is to exclude ambiguous alignments.

Type: bool

Default: False


Assign Ambiguous Alignments To First Reference

--assign_ambiguous_alignments_to_first_reference

Help: If more than one reference amplicon is given, ambiguous reads that align with the same score to multiple amplicons will be assigned to the first amplicon. Default behavior is to exclude ambiguous alignments.

Type: bool

Default: False


Guide Seq

-g, --guide_seq, --sgRNA

Help: sgRNA sequence, if more than one, please separate by commas. Note that the sgRNA needs to be input as the guide RNA sequence (usually 20 nt) immediately adjacent to but not including the PAM sequence (5' of NGG for SpCas9). If the PAM is found on the opposite strand with respect to the Amplicon Sequence, ensure the sgRNA sequence is also found on the opposite strand. The CRISPResso convention is to depict the expected cleavage position using the value of the parameter '--quantification_window_center' nucleotides from the 3' end of the guide. In addition, the use of alternate nucleases besides SpCas9 is supported. For example, if using the Cpf1 system, enter the sequence (usually 20 nt) immediately 3' of the PAM sequence and explicitly set the '--cleavage_offset' parameter to 1, since the default setting of -3 is suitable only for SpCas9.

Type: str
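
To check which strand a guide lies on before passing it to --guide_seq, a lookup along the following lines may help. It is a minimal sketch and not part of CRISPResso: the function names are hypothetical, and only exact matches of the guide (without PAM) are considered.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_guide(amplicon, guide):
    """Return (0-based start, strand) for every exact guide match in the amplicon."""
    amplicon = amplicon.upper()
    hits = []
    for strand, query in (("+", guide.upper()), ("-", reverse_complement(guide))):
        start = amplicon.find(query)
        while start != -1:
            hits.append((start, strand))
            start = amplicon.find(query, start + 1)
    return hits


# Hypothetical amplicon: 4 bp flank, a 20 nt guide, then a TGG PAM and 4 bp flank.
example_guide = "GGAATCCCTTCTGCAGCACC"
example_amplicon = "ACGT" + example_guide + "TGGACGT"
print(find_guide(example_amplicon, example_guide))
```

A hit on the "-" strand means the guide (and its PAM) sits on the strand opposite to the amplicon sequence as written.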


Guide Name

-gn, --guide_name

Help: sgRNA names, if more than one, please separate by commas.

Type: str


Flexiguide Seq

-fg, --flexiguide_seq

Help: sgRNA sequence (flexible) (can be comma-separated list of multiple flexiguides). The flexiguide sequence will be aligned to the amplicon sequence(s), as long as the guide sequence has homology as set by --flexiguide_homology.

Type: str

Default: None


Flexiguide Homology

-fh, --flexiguide_homology

Help: flexiguides will yield guides in amplicons with at least this homology to the flexiguide sequence.

Type: int

Default: 80


Flexiguide Name

-fgn, --flexiguide_name

Help: flexiguide name

Type: str


Discard Guide Positions Overhanging Amplicon Edge

--discard_guide_positions_overhanging_amplicon_edge

Help: If set, for guides that align to multiple positions, guide positions will be discarded if plotting around those regions would include bp that extend beyond the end of the amplicon.

Type: bool

Default: False


Expected HDR Amplicon Sequence

-e, --expected_hdr_amplicon_seq

Help: Amplicon sequence expected after HDR

Type: str


Exon Specification Coding Sequence/s

-c, --coding_seq

Help: Subsequence/s of the amplicon sequence covering one or more coding sequences for frameshift analysis. If more than one (for example, split by intron/s), please separate by commas.

Type: str


Config File

--config_file

Help: File path to JSON file with config elements

Type: str

Default: None


Minimum Average Read Quality (phred33 Scale)

-q, --min_average_read_quality

Help: Minimum average quality score (phred33) to keep a read

Type: int


Minimum Single bp Quality (phred33 Scale)

-s, --min_single_bp_quality

Help: Minimum single bp score (phred33) to keep a read

Type: int
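
The two read-level quality filters above can be sketched as follows. This is an assumption-laden illustration rather than the tool's implementation: the function names are hypothetical, and treating a score equal to the threshold as passing is a choice made for this sketch. Phred33 means each quality character encodes a score of ord(character) - 33.

```python
def phred33_scores(quality_string):
    """Decode a FASTQ quality string into integer Phred scores (offset 33)."""
    return [ord(ch) - 33 for ch in quality_string]


def keep_read(quality_string, min_average_read_quality=0, min_single_bp_quality=0):
    """Return True if the read passes both the average and the single-bp filter."""
    scores = phred33_scores(quality_string)
    average = sum(scores) / len(scores)
    return average >= min_average_read_quality and min(scores) >= min_single_bp_quality


# 'I' encodes Q40 and '!' encodes Q0, so this read averages Q20.
print(keep_read("II!!", min_average_read_quality=30))
```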


Minimum bp Quality or N (phred33 Scale)

--min_bp_quality_or_N

Help: Bases with a quality score (phred33) less than this value will be set to 'N'

Type: int
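
Masking low-quality bases, as opposed to discarding the whole read, could look like the sketch below. The function name is hypothetical and this is not CRISPResso's own code; it only illustrates the "less than this value" rule stated in the help text.

```python
def mask_low_quality_bases(sequence, quality_string, min_bp_quality_or_N):
    """Replace every base whose Phred33 score is below the threshold with 'N'."""
    if len(sequence) != len(quality_string):
        raise ValueError("sequence and quality string must have the same length")
    return "".join(
        base if ord(qual) - 33 >= min_bp_quality_or_N else "N"
        for base, qual in zip(sequence, quality_string)
    )


# '#' encodes Q2, so the third base is masked when the threshold is 20.
print(mask_low_quality_bases("ACGT", "II#I", 20))
```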


File Prefix

--file_prefix

Help: File prefix for output plots and tables

Type: str


Sample Name

-n, --name

Help: Output name of the report (default: the name is obtained from the filename of the fastq file/s used in input)

Type: str


Suppress Amplicon Name Truncation

--suppress_amplicon_name_truncation

Help: If set, amplicon names will not be truncated when creating output filename prefixes. If not set, amplicon names longer than 21 characters will be truncated when creating filename prefixes.

Type: bool

Default: False


Output Folder

-o, --output_folder

Help: Output folder to use for the analysis (default: current folder)

Type: str


Verbosity

-v, --verbosity

Help: Verbosity level of output to the console (1-4); 4 is the most verbose

Type: int

Default: 3


Trimming Adapter

--trim_sequences

Help: Enable adapter trimming with fastp

Type: bool

Default: False


Trimmomatic Command

--trimmomatic_command

Help: DEPRECATED in v2.3.0, use --fastp_command

Type: str

Default: None


Trimmomatic Options String

--trimmomatic_options_string

Help: DEPRECATED in v2.3.0, use --fastp_options_string

Type: str


Flash Command

--flash_command

Help: DEPRECATED in v2.3.0, use --fastp_command

Type: str

Default: None


Fastp Command

--fastp_command

Help: Command to run fastp

Type: str

Default: fastp


Fastp Options String

--fastp_options_string

Help: Override options for fastp, e.g. --length_required 70 --umi

Type: str


Min Paired End Reads Overlap

--min_paired_end_reads_overlap

Help: Parameter for the fastp read merging step. Minimum required overlap length between two reads to provide a confident overlap

Type: int

Default: 10


Max Paired End Reads Overlap

--max_paired_end_reads_overlap

Help: DEPRECATED in v2.3.0

Type: str

Default: None


Stringent Flash Merging

--stringent_flash_merging

Help: DEPRECATED in v2.3.0

Type: bool

Default: False


Quantification Window Size

-w, --quantification_window_size, --window_around_sgrna

Help: Defines the size (in bp) of the quantification window extending from the position specified by the '--cleavage_offset' or '--quantification_window_center' parameter in relation to the provided guide RNA sequence(s) (--sgRNA). Mutations within this number of bp from the quantification window center are used in classifying reads as modified or unmodified. A value of 0 disables this window and indels in the entire amplicon are considered. Default is 1, 1bp on each side of the cleavage position for a total length of 2bp. Multiple quantification window sizes (corresponding to each guide specified by --guide_seq) can be specified with a comma-separated list.

Type: str

Default: 1


Quantification Window Center

-wc, --quantification_window_center, --cleavage_offset

Help: Center of quantification window to use with respect to the 3' end of the provided sgRNA sequence. Remember that the sgRNA sequence must be entered without the PAM. For cleaving nucleases, this is the predicted cleavage position. The default is -3 and is suitable for the Cas9 system. For alternate nucleases, other cleavage offsets may be appropriate, for example, if using Cpf1 this parameter would be set to 1. For base editors, this could be set to -17 to only include mutations near the 5' end of the sgRNA. Multiple quantification window centers (corresponding to each guide specified by --guide_seq) can be specified with a comma-separated list.

Type: str

Default: -3
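
How the window center and window size combine can be sketched for a guide on the forward strand. This is a simplified illustration under stated assumptions, not the tool's implementation: the function name is hypothetical, only the first exact forward-strand match is used, and the cut is taken to fall immediately after the returned cut_point index.

```python
def quantification_window(amplicon, guide, window_center=-3, window_size=1):
    """Return (cut_point, positions) as 0-based amplicon indexes.

    cut_point is the index of the base just 5' of the expected cut, counted
    window_center bases from the 3'-most base of the guide.
    """
    amplicon = amplicon.upper()
    guide = guide.upper()
    start = amplicon.find(guide)
    if start == -1:
        raise ValueError("guide not found on the forward strand of the amplicon")
    guide_end = start + len(guide) - 1      # index of the guide's 3'-most base
    cut_point = guide_end + window_center   # window_center is usually negative
    if window_size == 0:
        # A size of 0 disables the window: the entire amplicon is considered.
        return cut_point, list(range(len(amplicon)))
    first = max(0, cut_point - window_size + 1)
    last = min(len(amplicon) - 1, cut_point + window_size)
    return cut_point, list(range(first, last + 1))


# Hypothetical amplicon with a 20 nt guide starting at index 4.
guide = "GGAATCCCTTCTGCAGCACC"
amplicon = "ACGT" + guide + "TGGACGT"
print(quantification_window(amplicon, guide))
```

With the defaults, the window covers one base on each side of the cut, for a total of 2 bp, which matches the description of --quantification_window_size.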


Exclude bp From Left

--exclude_bp_from_left

Help: Exclude bp from the left side of the amplicon sequence for the quantification of the indels

Type: int

Default: 15


Exclude bp From Right

--exclude_bp_from_right

Help: Exclude bp from the right side of the amplicon sequence for the quantification of the indels

Type: int

Default: 15


Use Legacy Insertion Quantification

--use_legacy_insertion_quantification

Help: If set, the legacy insertion quantification method will be used (i.e. with a 1bp quantification window, indels at the cut site and 1bp away from the cut site would be quantified). By default (if this parameter is not set) with a 1bp quantification window, only insertions at the cut site will be quantified.

Type: bool

Default: False


Ignore Substitutions

--ignore_substitutions

Help: Ignore substitution events for the quantification and visualization

Type: bool

Default: False


Ignore Insertions

--ignore_insertions

Help: Ignore insertion events for the quantification and visualization

Type: bool

Default: False


Ignore Deletions

--ignore_deletions

Help: Ignore deletion events for the quantification and visualization

Type: bool

Default: False


Discard Indel Reads

--discard_indel_reads

Help: Discard reads with indels in the quantification window from analysis

Type: bool

Default: False


Needleman Wunsch Gap Open

--needleman_wunsch_gap_open

Help: Gap open option for Needleman-Wunsch alignment

Type: int

Default: -20


Needleman Wunsch Gap Extend

--needleman_wunsch_gap_extend

Help: Gap extend option for Needleman-Wunsch alignment

Type: int

Default: -2


Needleman Wunsch Gap Incentive

--needleman_wunsch_gap_incentive

Help: Gap incentive value for inserting indels at cut sites

Type: int

Default: 1


Needleman Wunsch Alignment Matrix Location

--needleman_wunsch_aln_matrix_loc

Help: Location of the matrix specifying substitution scores in the NCBI format (see ftp://ftp.ncbi.nih.gov/blast/matrices/)

Type: str

Default: EDNAFULL


Plot Histogram Outliers

--plot_histogram_outliers

Help: If set, all values will be shown on histograms. By default (if unset), histogram ranges are limited to plotting data within the 99th percentile.

Type: bool

Default: False


Plot Window Size

--plot_window_size, --offset_around_cut_to_plot

Help: Defines the size of the window extending from the quantification window center to plot. Nucleotides within plot_window_size of the quantification_window_center for each guide are plotted.

Type: int

Default: 20


Min Frequency Alleles Around Cut To Plot

--min_frequency_alleles_around_cut_to_plot

Help: Minimum percentage of reads required to report an allele in the alleles table plot.

Type: float

Default: 0.2


Expand Allele Plots By Quantification

--expand_allele_plots_by_quantification

Help: If set, alleles with different modifications in the quantification window (but not necessarily in the plotting window (e.g. for another sgRNA)) are plotted on separate lines, even though they may have the same apparent sequence. To force the allele plot and the allele table to be the same, set this parameter. If unset, all alleles with the same sequence will be collapsed into one row.

Type: bool

Default: False


Allele Plot Percentages Only for Assigned Reference

--allele_plot_pcts_only_for_assigned_reference

Help: If set, in the allele plots, the percentages will show the percentage as a percent of reads aligned to the assigned reference. Default behavior is to show percentage as a percent of all reads.

Type: bool

Default: False


Quantification Window Coordinates

-qwc, --quantification_window_coordinates

Help: Bp positions in the amplicon sequence specifying the quantification window. This parameter overrides the values of the '--quantification_window_center', '--cleavage_offset', '--quantification_window_size' and '--window_around_sgrna' parameters. Any indels/substitutions outside this window are excluded. Indexes are 0-based, meaning that the first nucleotide is position 0. Ranges are separated by the dash sign (e.g. 'start-stop'), and multiple ranges can be separated by the underscore (_). A comma-separated list of values can be given, corresponding to the amplicon sequences given in --amplicon_seq (e.g. 5-10,5-10_20-30 would specify the 6th-11th bp in the first reference and the 6th-11th and 21st-31st bp in the second reference). A value of 0 disables this filter for a particular amplicon (e.g. 0,90-110 would disable the quantification window for the first amplicon and specify a quantification window of 90-110 for the second). Note that if multiple amplicons are provided and only one quantification window coordinate is provided, the same quantification window will be used for all amplicons and be adjusted to account for insertions/deletions. (default: None)

Type: str
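
The coordinate syntax described above (commas between amplicons, underscores between ranges, dashes inside a range, 0 to disable) can be unpacked with a short parser. This is a sketch for checking a value before running the tool, with hypothetical function names; it is not CRISPResso's own parser.

```python
def parse_quantification_window_coordinates(spec):
    """Parse e.g. '5-10,5-10_20-30' into one list of (start, stop) per amplicon.

    A value of '0' for an amplicon is returned as None (window disabled).
    """
    per_amplicon = []
    for amplicon_spec in spec.split(","):
        if amplicon_spec == "0":
            per_amplicon.append(None)
            continue
        ranges = []
        for range_spec in amplicon_spec.split("_"):
            start, stop = range_spec.split("-")
            ranges.append((int(start), int(stop)))
        per_amplicon.append(ranges)
    return per_amplicon


def included_positions(ranges):
    """Expand (start, stop) pairs into sorted 0-based positions, stop inclusive."""
    return sorted({pos for start, stop in ranges for pos in range(start, stop + 1)})


parsed = parse_quantification_window_coordinates("5-10,5-10_20-30")
print(parsed)
print(included_positions(parsed[0]))
```

Because indexes are 0-based and both ends are included, the range 5-10 covers six positions, the 6th through 11th bp.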


Annotate Wildtype Allele

--annotate_wildtype_allele

Help: Wildtype alleles in the allele table plots will be marked with this string (e.g. **).

Type: str


Keep Intermediate

--keep_intermediate

Help: Keep all the intermediate files

Type: bool

Default: False


Dump

--dump

Help: Dump numpy arrays and pandas dataframes to file for debugging purposes

Type: bool

Default: False


Write Detailed Allele Table

--write_detailed_allele_table

Help: If set, a detailed allele table will be written including alignment scores for each read sequence.

Type: bool

Default: False


Fastq Output

--fastq_output

Help: If set, a fastq file with annotations for each read will be produced.

Type: bool

Default: False


Bam Output

--bam_output

Help: If set, a bam file with alignments for each read will be produced.

Type: bool

Default: False


Bowtie2 Index

-x, --bowtie2_index

Help: Basename of Bowtie2 index for the reference genome

Type: str


Zip Output

--zip_output

Help: If set, the output will be placed in a zip folder.

Type: bool

Default: False


Max Rows Alleles Around Cut To Plot

--max_rows_alleles_around_cut_to_plot

Help: Maximum number of rows to report in the alleles table plot.

Type: int

Default: 50


Suppress Report

--suppress_report

Help: Suppress output report

Type: bool

Default: False


Place Report In Output Folder

--place_report_in_output_folder

Help: If true, report will be written inside the CRISPResso output folder. By default, the report will be written one directory up from the report output.

Type: bool

Default: False


Suppress Plots

--suppress_plots

Help: Suppress output plots

Type: bool

Default: False


Base Editor Output

--base_editor_output

Help: Outputs plots and tables to aid in analysis of base editor studies.

Type: bool

Default: False


Conversion Nuc From

--conversion_nuc_from

Help: For base editor plots, this is the nucleotide targeted by the base editor

Type: str

Default: C


Conversion Nuc To

--conversion_nuc_to

Help: For base editor plots, this is the nucleotide produced by the base editor

Type: str

Default: T


Prime Editing Spacer Sequence

--prime_editing_pegRNA_spacer_seq

Help: pegRNA spacer sgRNA sequence used in prime editing. The spacer should not include the PAM sequence. The sequence should be given in the RNA 5'->3' order, so for Cas9, the PAM would be on the right side of the given sequence.

Type: str


Prime Editing Extension Sequence

--prime_editing_pegRNA_extension_seq

Help: Extension sequence used in prime editing. The sequence should be given in the RNA 5'->3' order, such that the sequence starts with the RT template including the edit, followed by the Primer-binding site (PBS).

Type: str


Prime Editing pegRNA Extension Quantification Window Size

--prime_editing_pegRNA_extension_quantification_window_size

Help: Quantification window size (in bp) at flap site for measuring modifications anchored at the right side of the extension sequence. Similar to the --quantification_window_size parameter, the total length of the quantification window will be 2x this parameter. Default: 5bp (10bp total window size)

Type: int

Default: 5


Prime Editing pegRNA Scaffold Sequence

--prime_editing_pegRNA_scaffold_seq

Help: If given, reads containing any of this scaffold sequence before the extension sequence (provided by --prime_editing_pegRNA_extension_seq) will be classified as 'Scaffold-incorporated'. The sequence should be given in the 5'->3' order such that the RT template directly follows this sequence. A common value is 'GGCACCGAGUCGGUGC'.

Type: str


Prime Editing pegRNA Scaffold Min Match Length

--prime_editing_pegRNA_scaffold_min_match_length

Help: Minimum number of bases matching scaffold sequence for the read to be counted as 'Scaffold-incorporated'. If the scaffold sequence matches the reference sequence at the incorporation site, the minimum number of bases to match will be minimally increased (beyond this parameter) to disambiguate between prime-edited and scaffold-incorporated sequences.

Type: int

Default: 1


Prime Editing Nicking Guide Sequence

--prime_editing_nicking_guide_seq

Help: Nicking sgRNA sequence used in prime editing. The sgRNA should not include the PAM sequence. The sequence should be given in the RNA 5'->3' order, so for Cas9, the PAM would be on the right side of the sequence

Type: str


Prime Editing Override Prime Edited Reference Sequence

--prime_editing_override_prime_edited_ref_seq

Help: If given, this sequence will be used as the prime-edited reference sequence. This may be useful if the prime-edited reference sequence has large indels or the algorithm cannot otherwise infer the correct reference sequence.

Type: str


Prime Editing Override Sequence Checks

--prime_editing_override_sequence_checks

Help: If set, checks to assert that the prime editing guides and extension sequence are in the proper orientation are not performed. This may be useful if the checks are failing inappropriately, but the user is confident that the sequences are correct.

Type: bool

Default: False


CRISPResso 1 Mode

--crispresso1_mode

Help: Parameter usage as in CRISPResso 1

Type: bool

Default: False


dsODN

--dsODN

Help: Label reads with the dsODN sequence provided

Type: str


Auto

--auto

Help: Infer amplicon sequence from most common reads

Type: bool

Default: False


Debug

--debug

Help: Show debug messages

Type: bool

Default: False


No Rerun

--no_rerun

Help: Don't rerun CRISPResso2 if a run using the same parameters has already been finished.

Type: bool

Default: False


Number of Processes

-p, --n_processes

Help: Specify the number of processes to use for analysis. Please use with caution since increasing this parameter will significantly increase the memory required to run CRISPResso. Can be set to 'max'.

Type: str

Default: 1


Skip Failed

--skip_failed

Help: Continue with batch analysis even if one sample fails

Type: bool

Default: False


CRISPResso Command

--crispresso_command

Help: CRISPResso command to call

Type: str

Default: CRISPResso


Gene Annotations

--gene_annotations

Help: Gene Annotation Table from UCSC Genome Browser Tables (http://genome.ucsc.edu/cgi-bin/hgTables?command=start), please select as table 'knownGene', as output format 'all fields from selected table' and as file returned 'gzip compressed'

Type: str


Bam File

-b, --bam_file

Help: WGS aligned bam file

Type: str

Default: bam filename


Region File

-f, --region_file

Help: Regions description file. A BED format file containing the regions to analyze, one per line. The REQUIRED columns are:

  • chr_id (chromosome name)
  • bpstart (start position)
  • bpend (end position)

The optional columns are:

  • name (a unique identifier for the region)
  • guide_seq
  • expected_hdr_amplicon_seq
  • coding_seq

See CRISPResso --help for more details on these last 3 parameters.

Type: str
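
A region file with the columns listed above could be generated along these lines. The sketch makes several assumptions that are not confirmed by this reference: columns are written tab-separated (the usual BED convention), optional columns are emitted in the order listed, and trailing empty optional columns are simply dropped. The function name, coordinates and guide sequence are hypothetical.

```python
def format_region_line(chr_id, bpstart, bpend, name="", guide_seq="",
                       expected_hdr_amplicon_seq="", coding_seq=""):
    """Build one tab-separated line of a region file (illustrative sketch)."""
    if int(bpend) <= int(bpstart):
        raise ValueError("bpend must be greater than bpstart")
    fields = [chr_id, str(bpstart), str(bpend),
              name, guide_seq, expected_hdr_amplicon_seq, coding_seq]
    # Drop trailing empty optional columns, but always keep the 3 required ones.
    while len(fields) > 3 and fields[-1] == "":
        fields.pop()
    return "\t".join(fields)


# Hypothetical region with a name and a guide sequence.
print(format_region_line("chr1", 1000, 1200, name="region_1",
                         guide_seq="GGAATCCCTTCTGCAGCACC"))
```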


Reference File

-r, --reference_file

Help: A FASTA format reference file (for example hg19.fa for the human genome)

Type: str
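
The three input files (-b, -f, -r) are typically supplied together. The sketch below only assembles and prints a command line from the flags documented in this reference; it does not run anything. The function name and the file names are hypothetical, and the executable name CRISPRessoWGS is assumed from the title of this reference.

```python
import shlex


def build_wgs_command(bam_file, region_file, reference_file,
                      name=None, output_folder=None, executable="CRISPRessoWGS"):
    """Assemble an argument list from flags documented in this reference."""
    args = [executable, "-b", bam_file, "-f", region_file, "-r", reference_file]
    if name:
        args += ["-n", name]
    if output_folder:
        args += ["-o", output_folder]
    return args


# Hypothetical file names; shlex.join quotes arguments safely for a shell.
command = build_wgs_command("sample.bam", "regions.bed", "hg19.fa",
                            name="sample_1", output_folder="results")
print(shlex.join(command))
```

Keeping the arguments as a list makes it straightforward to pass them to subprocess.run without shell quoting problems.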


Minimum Reads to Use Region

--min_reads_to_use_region

Help: Minimum number of reads that align to a region to perform the CRISPResso analysis for WGS

Type: float

Default: 10


Disable Guardrails

--disable_guardrails

Help: Disable guardrail warnings

Type: bool

Default: False


Use Matplotlib

--use_matplotlib

Help: Use matplotlib for plotting instead of plotly/d3 when CRISPRessoPro is installed

Type: bool

Default: False