CRISPRessoWGS Parameters

Amplicon Min Alignment Score

-amas, --amplicon_min_alignment_score

Help: Amplicon Minimum Alignment Score; score between 0 and 100; sequences must have at least this homology score with the amplicon to be aligned (can be comma-separated list of multiple scores, corresponding to amplicon sequences given in --amplicon_seq)

Type: str

Default Minimum Alignment Score

--default_min_aln_score, --min_identity_score

Help: Default minimum homology score for a read to align to a reference amplicon

Type: int

Default: 60
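
As a rough illustration of how the per-amplicon scores and the default score could combine, the sketch below resolves one minimum alignment score per amplicon. The function name `resolve_min_alignment_scores` is hypothetical, and falling back to the default for missing or empty entries is an assumption, not confirmed CRISPResso behavior.

```python
def resolve_min_alignment_scores(amas, n_amplicons, default_min_aln_score=60):
    """Return one minimum alignment score per amplicon.

    amas: the comma-separated string given to --amplicon_min_alignment_score
          (or None/empty if the parameter was not given).
    Assumption: entries that are missing or empty fall back to the default.
    """
    scores = []
    if amas:
        for item in amas.split(","):
            item = item.strip()
            scores.append(float(item) if item else default_min_aln_score)
    # Pad with the default so every amplicon has a threshold.
    scores += [default_min_aln_score] * (n_amplicons - len(scores))
    scores = scores[:n_amplicons]
    for score in scores:
        if not 0 <= score <= 100:
            raise ValueError(f"alignment score out of range 0-100: {score}")
    return scores
```

For example, `resolve_min_alignment_scores("80,,70", 3)` would give thresholds of 80, 60 and 70 for the three amplicons under these assumptions.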

Expand Ambiguous Alignments

--expand_ambiguous_alignments

Help: If more than one reference amplicon is given, reads that align to multiple reference amplicons will count equally toward each amplicon. Default behavior is to exclude ambiguous alignments.

Type: bool

Default: False

Assign Ambiguous Alignments To First Reference

--assign_ambiguous_alignments_to_first_reference

Help: If more than one reference amplicon is given, ambiguous reads that align with the same score to multiple amplicons will be assigned to the first amplicon. Default behavior is to exclude ambiguous alignments.

Type: bool

Default: False
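
The three behaviors for ambiguous reads (exclude, count toward every amplicon, or assign to the first) can be sketched as follows. This is a minimal illustration, not the CRISPResso implementation: the function `assign_read` is hypothetical, and treating a tie on the best passing score as "ambiguous" is an assumption.

```python
def assign_read(scores, min_scores, expand_ambiguous=False, assign_to_first=False):
    """Return the indices of the amplicons a read is counted toward.

    scores:     alignment score of the read against each amplicon (0-100).
    min_scores: minimum alignment score required for each amplicon.
    Assumption: a read is ambiguous when two or more amplicons tie for the
    best score among those that pass their minimum score.
    """
    passing = [i for i, (s, m) in enumerate(zip(scores, min_scores)) if s >= m]
    if not passing:
        return []  # read does not align to any amplicon
    best = max(scores[i] for i in passing)
    tied = [i for i in passing if scores[i] == best]
    if len(tied) == 1:
        return tied  # unambiguous
    if expand_ambiguous:
        return tied  # counts equally toward each tied amplicon
    if assign_to_first:
        return tied[:1]  # first amplicon among the tied ones
    return []  # default: ambiguous reads are excluded
```

With two amplicons and scores of 90 and 90, the default returns no assignment, `expand_ambiguous=True` returns both amplicons, and `assign_to_first=True` returns only the first.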

Guide Seq

-g, --guide_seq, --sgRNA

Help: sgRNA sequence, if more than one, please separate by commas. Note that the sgRNA needs to be input as the guide RNA sequence (usually 20 nt) immediately adjacent to but not including the PAM sequence (5' of NGG for SpCas9). If the PAM is found on the opposite strand with respect to the Amplicon Sequence, ensure the sgRNA sequence is also found on the opposite strand. The CRISPResso convention is to depict the expected cleavage position using the value of the parameter '--quantification_window_center' nucleotides from the 3' end of the guide. In addition, the use of alternate nucleases besides SpCas9 is supported. For example, if using the Cpf1 system, enter the sequence (usually 20 nt) immediately 3' of the PAM sequence and explicitly set the '--cleavage_offset' parameter to 1, since the default setting of -3 is suitable only for SpCas9.

Type: str

Guide Name

-gn, --guide_name

Help: sgRNA names, if more than one, please separate by commas.

Type: str

Flexiguide Seq

-fg, --flexiguide_seq

Help: sgRNA sequence (flexible) (can be comma-separated list of multiple flexiguides). The flexiguide sequence will be aligned to the amplicon sequence(s), as long as the guide sequence has homology as set by --flexiguide_homology.

Type: str

Default: None

Flexiguide Homology

-fh, --flexiguide_homology

Help: flexiguides will yield guides in amplicons with at least this homology to the flexiguide sequence.

Type: int

Default: 80

Flexiguide Name

-fgn, --flexiguide_name

Help: flexiguide name

Type: str

Flexiguide Gap Open Penalty

--flexiguide_gap_open_penalty

Help:

Type: int

Default: -20

Flexiguide Gap Extend Penalty

--flexiguide_gap_extend_penalty

Help:

Type: int

Default: -2

Discard Guide Positions Overhanging Amplicon Edge

--discard_guide_positions_overhanging_amplicon_edge

Help: If set, for guides that align to multiple positions, guide positions will be discarded if plotting around those regions would include bp that extend beyond the end of the amplicon.

Type: bool

Default: False

Expected HDR Amplicon Sequence

-e, --expected_hdr_amplicon_seq

Help: Amplicon sequence expected after HDR

Type: str

Exon Specification Coding Sequence/s

-c, --coding_seq

Help: Subsequence/s of the amplicon sequence covering one or more coding sequences for frameshift analysis. If more than one (for example, split by intron/s), please separate by commas.

Type: str

Config File

--config_file

Help: File path to JSON file with config elements

Type: str

Default: None

Minimum Average Read Quality (phred33 Scale)

-q, --min_average_read_quality

Help: Minimum average quality score (phred33) to keep a read

Type: int

Minimum Single bp Quality (phred33 Scale)

-s, --min_single_bp_quality

Help: Minimum single bp score (phred33) to keep a read

Type: int

Minimum bp Quality or N (phred33 Scale)

--min_bp_quality_or_N

Help: Bases with a quality score (phred33) less than this value will be set to 'N'

Type: int
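
The three quality parameters above act on phred33-encoded quality strings, where each character's score is its ASCII code minus 33. The sketch below shows one way the filters could be applied to a single read. The function names, the zero defaults, and the order in which the filters are applied are assumptions made for illustration.

```python
def phred33_scores(quality_string):
    """Decode a phred33 quality string into integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


def filter_read(seq, qual, min_average_read_quality=0,
                min_single_bp_quality=0, min_bp_quality_or_N=0):
    """Return the (possibly N-masked) sequence, or None if the read is dropped.

    Assumption: the defaults of 0 simply disable each filter in this sketch.
    """
    scores = phred33_scores(qual)
    if len(scores) != len(seq):
        raise ValueError("sequence and quality string differ in length")
    if not scores:
        return None
    # -q: drop the read if its average quality is too low.
    if sum(scores) / len(scores) < min_average_read_quality:
        return None
    # -s: drop the read if any single base is below the threshold.
    if min(scores) < min_single_bp_quality:
        return None
    # --min_bp_quality_or_N: keep the read but mask low-quality bases.
    return "".join(
        "N" if score < min_bp_quality_or_N else base
        for base, score in zip(seq, scores)
    )
```

For a read `ACGT` with qualities `II#I` (scores 40, 40, 2, 40), a single-bp threshold of 10 drops the read, while `min_bp_quality_or_N=20` keeps it as `ACNT`.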

File Prefix

--file_prefix

Help: File prefix for output plots and tables

Type: str

Sample Name

-n, --name

Help: Output name of the report (default: the name is obtained from the filename of the fastq file/s used in input)

Type: str

Suppress Amplicon Name Truncation

--suppress_amplicon_name_truncation

Help: If set, amplicon names will not be truncated when creating output filename prefixes. If not set, amplicon names longer than 21 characters will be truncated when creating filename prefixes.

Type: bool

Default: False

Output Folder

-o, --output_folder

Help: Output folder to use for the analysis (default: current folder)

Type: str

Verbosity

-v, --verbosity

Help: Verbosity level of output to the console (1-4); 4 is the most verbose

Type: int

Default: 3

Trimming Adapter

--trim_sequences

Help: Enable trimming with fastp

Type: bool

Default: False

Trimmomatic Command

--trimmomatic_command

Help: DEPRECATED in v2.3.0, use --fastp_command

Type: str

Default: None

Trimmomatic Options String

--trimmomatic_options_string

Help: DEPRECATED in v2.3.0, use --fastp_options_string

Type: str

Flash Command

--flash_command

Help: DEPRECATED in v2.3.0, use --fastp_command

Type: str

Default: None

Fastp Command

--fastp_command

Help: Command to run fastp

Type: str

Default: fastp

Fastp Options String

--fastp_options_string

Help: Override options for fastp, e.g. --length_required 70 --umi

Type: str

Min Paired End Reads Overlap

--min_paired_end_reads_overlap

Help: Parameter for the fastp read merging step. Minimum required overlap length between two reads to provide a confident overlap

Type: int

Default: 10

Max Paired End Reads Overlap

--max_paired_end_reads_overlap

Help: DEPRECATED in v2.3.0

Type: str

Default: None

Samtools Exclude Flags

--samtools_exclude_flags

Help: Exclude reads with any of the specified flags set in the SAM/BAM file. Flags can be specified in either base 16 (hex) or base 10. Default is 4 (read unmapped).

Type: str

Default: 4
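
Excluding reads "with any of the specified flags set" is a bitwise test on the SAM FLAG field. The following sketch shows that test, accepting the flag value in decimal or `0x`-prefixed hex. The helper names are hypothetical, and decimal values with leading zeros are not handled here.

```python
def parse_flags(value):
    """Parse a flag string given in base 10 ("4") or base 16 ("0x4")."""
    # Base 0 lets int() pick the base from the prefix.
    return int(value, 0)


def is_excluded(read_flag, exclude_flags="4"):
    """True if the read has ANY of the bits in exclude_flags set."""
    return (read_flag & parse_flags(exclude_flags)) != 0
```

With the default of 4, a read whose FLAG is 77 (paired, unmapped, mate unmapped, first in pair) is excluded because the "read unmapped" bit is set, while a read with FLAG 0 or 16 is kept.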

Stringent Flash Merging

--stringent_flash_merging

Help: DEPRECATED in v2.3.0

Type: bool

Default: False

Quantification Window Size

-w, --quantification_window_size, --window_around_sgrna

Help: Defines the size (in bp) of the quantification window extending from the position specified by the '--cleavage_offset' or '--quantification_window_center' parameter in relation to the provided guide RNA sequence(s) (--sgRNA). Mutations within this number of bp from the quantification window center are used in classifying reads as modified or unmodified. A value of 0 disables this window and indels in the entire amplicon are considered. Default is 1, 1bp on each side of the cleavage position for a total length of 2bp. Multiple quantification window sizes (corresponding to each guide specified by --guide_seq) can be specified with a comma-separated list.

Type: str

Default: 1

Quantification Window Center

-wc, --quantification_window_center, --cleavage_offset

Help: Center of quantification window to use with respect to the 3' end of the provided sgRNA sequence. Remember that the sgRNA sequence must be entered without the PAM. For cleaving nucleases, this is the predicted cleavage position. The default is -3 and is suitable for the Cas9 system. For alternate nucleases, other cleavage offsets may be appropriate, for example, if using Cpf1 this parameter would be set to 1. For base editors, this could be set to -17 to only include mutations near the 5' end of the sgRNA. Multiple quantification window centers (corresponding to each guide specified by --guide_seq) can be specified with a comma-separated list.

Type: str

Default: -3
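
To make the interaction between the window center and the window size concrete, the sketch below computes the 0-based amplicon positions in the quantification window for a guide on the forward strand. It is an illustration under stated assumptions, not the CRISPResso code: the function name is hypothetical, reverse-strand guides are not handled, and the cut point is taken to be the index of the base immediately to the left of the predicted cut.

```python
def quantification_window(amplicon, guide, window_center=-3, window_size=1):
    """Return (cut_point, positions) for a forward-strand guide.

    cut_point: 0-based index of the base just left of the predicted cut
               (an assumed convention for this sketch).
    positions: 0-based amplicon indices inside the quantification window.
    """
    start = amplicon.find(guide)
    if start < 0:
        raise ValueError("guide not found on the forward strand of the amplicon")
    # The last base of the guide is at start + len(guide) - 1; the center is
    # counted from there (negative values move toward the 5' end).
    cut_point = start + len(guide) - 1 + window_center
    if window_size == 0:
        # A size of 0 disables the window: the entire amplicon is considered.
        return cut_point, list(range(len(amplicon)))
    first = max(0, cut_point - window_size + 1)
    last = min(len(amplicon) - 1, cut_point + window_size)
    return cut_point, list(range(first, last + 1))
```

With the defaults (center -3, size 1) and a 20 nt guide starting at position 10, the window covers positions 26 and 27: one bp on each side of the cut, 2 bp in total.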

Exclude bp From Left

--exclude_bp_from_left

Help: Exclude bp from the left side of the amplicon sequence for the quantification of the indels

Type: int

Default: 15

Exclude bp From Right

--exclude_bp_from_right

Help: Exclude bp from the right side of the amplicon sequence for the quantification of the indels

Type: int

Default: 15

Use Legacy Insertion Quantification

--use_legacy_insertion_quantification

Help: If set, the legacy insertion quantification method will be used (i.e. with a 1bp quantification window, indels at the cut site and 1bp away from the cut site would be quantified). By default (if this parameter is not set) with a 1bp quantification window, only insertions at the cut site will be quantified.

Type: bool

Default: False

Ignore Substitutions

--ignore_substitutions

Help: Ignore substitution events for the quantification and visualization

Type: bool

Default: False

Ignore Insertions

--ignore_insertions

Help: Ignore insertion events for the quantification and visualization

Type: bool

Default: False

Ignore Deletions

--ignore_deletions

Help: Ignore deletion events for the quantification and visualization

Type: bool

Default: False

Discard Indel Reads

--discard_indel_reads

Help: Discard reads with indels in the quantification window from analysis

Type: bool

Default: False

Needleman Wunsch Gap Open

--needleman_wunsch_gap_open

Help: Gap open option for Needleman-Wunsch alignment

Type: int

Default: -20

Needleman Wunsch Gap Extend

--needleman_wunsch_gap_extend

Help: Gap extend option for Needleman-Wunsch alignment

Type: int

Default: -2

Needleman Wunsch Gap Incentive

--needleman_wunsch_gap_incentive

Help: Gap incentive value for inserting indels at cut sites

Type: int

Default: 1

Needleman Wunsch Alignment Matrix Location

--needleman_wunsch_aln_matrix_loc

Help: Location of the matrix specifying substitution scores in the NCBI format (see ftp://ftp.ncbi.nih.gov/blast/matrices/)

Type: str

Default: EDNAFULL

Plot Histogram Outliers

--plot_histogram_outliers

Help: If set, all values will be shown on histograms. By default (if unset), histogram ranges are limited to plotting data within the 99th percentile.

Type: bool

Default: False

Plot Window Size

--plot_window_size, --offset_around_cut_to_plot

Help: Defines the size of the window extending from the quantification window center to plot. Nucleotides within plot_window_size of the quantification_window_center for each guide are plotted.

Type: int

Default: 20

Min Frequency Alleles Around Cut To Plot

--min_frequency_alleles_around_cut_to_plot

Help: Minimum % reads required to report an allele in the alleles table plot.

Type: float

Default: 0.2

Expand Allele Plots By Quantification

--expand_allele_plots_by_quantification

Help: If set, alleles with different modifications in the quantification window (but not necessarily in the plotting window (e.g. for another sgRNA)) are plotted on separate lines, even though they may have the same apparent sequence. To force the allele plot and the allele table to be the same, set this parameter. If unset, all alleles with the same sequence will be collapsed into one row.

Type: bool

Default: False

Allele Plot Percentages Only for Assigned Reference

--allele_plot_pcts_only_for_assigned_reference

Help: If set, in the allele plots, the percentages will show the percentage as a percent of reads aligned to the assigned reference. Default behavior is to show percentage as a percent of all reads.

Type: bool

Default: False

Quantification Window Coordinates

-qwc, --quantification_window_coordinates

Help: Bp positions in the amplicon sequence specifying the quantification window. This parameter overrides the values of '--quantification_window_center', '--cleavage_offset', '--quantification_window_size' and '--window_around_sgrna'. Any indels/substitutions outside this window are excluded. Indexes are 0-based, meaning that the first nucleotide is position 0. A range is written with a dash (e.g. 'start-stop'), and multiple ranges can be separated by the underscore (_). A comma-separated list of values can be given, corresponding to the amplicon sequences given in --amplicon_seq (e.g. 5-10,5-10_20-30 would specify the 6th-11th bp in the first reference and the 6th-11th and 21st-31st bp in the second reference). A value of 0 disables this filter for a particular amplicon (e.g. 0,90-110 would disable the quantification window for the first amplicon and specify a quantification window of 90-110 for the second). Note that if multiple amplicons are provided and only one quantification window coordinate is provided, the same quantification window will be used for all amplicons and will be adjusted to account for insertions/deletions. (default: None)

Type: str
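
The coordinate syntax described above (commas between amplicons, underscores between ranges, dashes within a range, 0 to disable) can be sketched as a small parser. The function name is hypothetical. Ranges are treated as inclusive at both ends, matching the "5-10 is the 6th-11th bp" example. The adjustment for insertions/deletions between amplicons is not modeled here.

```python
def parse_quantification_window_coordinates(qwc, n_amplicons):
    """Return one entry per amplicon: a sorted list of 0-based positions,
    or None where the window is disabled with a value of 0."""
    specs = qwc.split(",")
    if len(specs) == 1:
        # A single value is reused for every amplicon (no indel adjustment
        # is attempted in this sketch).
        specs = specs * n_amplicons
    if len(specs) != n_amplicons:
        raise ValueError("expected one coordinate spec per amplicon")
    windows = []
    for spec in specs:
        if spec.strip() == "0":
            windows.append(None)
            continue
        positions = set()
        for rng in spec.split("_"):
            start, stop = (int(part) for part in rng.split("-"))
            if stop < start:
                raise ValueError(f"range stop before start: {rng}")
            positions.update(range(start, stop + 1))  # inclusive stop
        windows.append(sorted(positions))
    return windows
```

For `5-10,5-10_20-30` with two amplicons, the first window is positions 5 through 10 and the second is positions 5 through 10 plus 20 through 30.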

Annotate Wildtype Allele

--annotate_wildtype_allele

Help: Wildtype alleles in the allele table plots will be marked with this string (e.g. **).

Type: str

Keep Intermediate

--keep_intermediate

Help: Keep all the intermediate files

Type: bool

Default: False

Dump

--dump

Help: Dump numpy arrays and pandas dataframes to file for debugging purposes

Type: bool

Default: False

Write Detailed Allele Table

--write_detailed_allele_table

Help: If set, a detailed allele table will be written with the following columns:

#Reads: the number of reads this allele represents.
Aligned_Sequence: the alignment of the read sequence.
Reference_Sequence: the alignment of the amplicon sequence.
n_inserted: the number of insertions within the quantification window.
n_deleted: the number of deletions within the quantification window.
n_mutated: the number of substitutions within the quantification window.
Reference_Name: the amplicon name to which this allele is assigned.
Read_Status: the bin to which this allele is classified.
Aligned_Reference_Names: if there are multiple amplicons, this lists the amplicon names. The order corresponds to the alignment scores in Aligned_Reference_Scores.
Aligned_Reference_Scores: the alignment score (out of 100) for each amplicon.
ref_positions: this represents the indices in the Aligned_Sequence that map back to the original sequence. Negative values represent places that don't map back to the original reference.
all_insertion_positions: all of the indices where there is an insertion regardless of the quantification window.
all_insertion_left_positions: for all insertions, the left most index (e.g. where each insertion starts).
insertion_positions: the insertion positions within the quantification window.
insertion_coordinates: the start and end indices of the insertions within the quantification window.
insertion_sizes: the size of each insertion within the quantification window.
all_deletion_positions: all of the indices where there is a deletion regardless of the quantification window.
deletion_positions: the indices where there is a deletion within the quantification window.
deletion_coordinates: the start and end indices of the deletions within the quantification window.
deletion_sizes: the size of the deletions within the quantification window.
all_substitution_positions: all of the indices where there is a substitution.
substitution_positions: the indices where there is a substitution within the quantification window.
substitution_values: the nucleotide to which it is substituted within the quantification window.
%Reads: the percentage of reads this allele represents.

Type: bool

Default: False

Fastq Output

--fastq_output

Help: If set, a fastq file with annotations for each read will be produced.

Type: bool

Default: False

Bam Output

--bam_output

Help: If set, a bam file with alignments for each read will be produced.

Type: bool

Default: False

Bowtie2 Index

-x, --bowtie2_index

Help: Basename of Bowtie2 index for the reference genome

Type: str

Zip Output

--zip_output

Help: If set, the output will be placed in a zip folder.

Type: bool

Default: False

Max Rows Alleles Around Cut To Plot

--max_rows_alleles_around_cut_to_plot

Help: Maximum number of rows to report in the alleles table plot.

Type: int

Default: 50
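
The row limit above works together with --min_frequency_alleles_around_cut_to_plot to decide which alleles appear in the alleles table plot. A minimal sketch of that selection follows. The function name and the input shape are hypothetical. Using all reads as the percentage denominator is also an assumption, since --allele_plot_pcts_only_for_assigned_reference can change it.

```python
def alleles_to_plot(alleles, min_frequency=0.2, max_rows=50):
    """Select alleles for the alleles table plot.

    alleles: list of (sequence, read_count) tuples.
    Returns (sequence, read_count, percent) rows, most frequent first.
    """
    total = sum(count for _, count in alleles)
    if total == 0:
        return []
    rows = [(seq, count, 100.0 * count / total) for seq, count in alleles]
    # Keep alleles at or above the minimum % of reads.
    rows = [row for row in rows if row[2] >= min_frequency]
    # Most frequent alleles first, then cap the number of rows.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:max_rows]
```

With 1000 reads split 990 / 9 / 1 across three alleles, the defaults keep the first two alleles (99% and 0.9%) and drop the third (0.1%).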

Suppress Report

--suppress_report

Help: Suppress output report

Type: bool

Default: False

Place Report In Output Folder

--place_report_in_output_folder

Help: If true, report will be written inside the CRISPResso output folder. By default, the report will be written one directory up from the report output.

Type: bool

Default: False

Suppress Plots

--suppress_plots

Help: Suppress output plots

Type: bool

Default: False

Base Editor Output

--base_editor_output

Help: Outputs plots and tables to aid in analysis of base editor studies.

Type: bool

Default: False

Conversion Nuc From

--conversion_nuc_from

Help: For base editor plots, this is the nucleotide targeted by the base editor

Type: str

Default: C

Conversion Nuc To

--conversion_nuc_to

Help: For base editor plots, this is the nucleotide produced by the base editor

Type: str

Default: T

Prime Editing Spacer Sequence

--prime_editing_pegRNA_spacer_seq

Help: pegRNA spacer sgRNA sequence used in prime editing. The spacer should not include the PAM sequence. The sequence should be given in the RNA 5'->3' order, so for Cas9, the PAM would be on the right side of the given sequence.

Type: str

Prime Editing Extension Sequence

--prime_editing_pegRNA_extension_seq

Help: Extension sequence used in prime editing. The sequence should be given in the RNA 5'->3' order, such that the sequence starts with the RT template including the edit, followed by the Primer-binding site (PBS).

Type: str

Prime Editing pegRNA Extension Quantification Window Size

--prime_editing_pegRNA_extension_quantification_window_size

Help: Quantification window size (in bp) at flap site for measuring modifications anchored at the right side of the extension sequence. Similar to the --quantification_window_size parameter, the total length of the quantification window will be 2x this parameter. Default: 5bp (10bp total window size)

Type: int

Default: 5

Prime Editing pegRNA Scaffold Sequence

--prime_editing_pegRNA_scaffold_seq

Help: If given, reads containing any of this scaffold sequence before the extension sequence (provided by --prime_editing_pegRNA_extension_seq) will be classified as 'Scaffold-incorporated'. The sequence should be given in the 5'->3' order such that the RT template directly follows this sequence. A common value is 'GGCACCGAGUCGGUGC'.

Type: str

Prime Editing pegRNA Scaffold Min Match Length

--prime_editing_pegRNA_scaffold_min_match_length

Help: Minimum number of bases matching scaffold sequence for the read to be counted as 'Scaffold-incorporated'. If the scaffold sequence matches the reference sequence at the incorporation site, the minimum number of bases to match will be minimally increased (beyond this parameter) to disambiguate between prime-edited and scaffold-incorporated sequences.

Type: int

Default: 1

Prime Editing Nicking Guide Sequence

--prime_editing_nicking_guide_seq

Help: Nicking sgRNA sequence used in prime editing. The sgRNA should not include the PAM sequence. The sequence should be given in the RNA 5'->3' order, so for Cas9, the PAM would be on the right side of the sequence.

Type: str

Prime Editing Override Prime Edited Reference Sequence

--prime_editing_override_prime_edited_ref_seq

Help: If given, this sequence will be used as the prime-edited reference sequence. This may be useful if the prime-edited reference sequence has large indels or the algorithm cannot otherwise infer the correct reference sequence.

Type: str

Prime Editing Override Sequence Checks

--prime_editing_override_sequence_checks

Help: If set, checks to assert that the prime editing guides and extension sequence are in the proper orientation are not performed. This may be useful if the checks are failing inappropriately, but the user is confident that the sequences are correct.

Type: bool

Default: False

CRISPResso 1 Mode

--crispresso1_mode

Help: Parameter usage as in CRISPResso 1

Type: bool

Default: False

dsODN

--dsODN

Help: Label reads with the dsODN sequence provided

Type: str

Auto

--auto

Help: Infer amplicon sequence from most common reads

Type: bool

Default: False

Debug

--debug

Help: Show debug messages

Type: bool

Default: False

No Rerun

--no_rerun

Help: Don't rerun CRISPResso2 if a run using the same parameters has already been finished.

Type: bool

Default: False

Number of Processes

-p, --n_processes

Help: Specify the number of processes to use for analysis. Please use with caution since increasing this parameter will significantly increase the memory required to run CRISPResso. Can be set to 'max'.

Type: str

Default: 1
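
Because this parameter is a string that accepts either a number or 'max', a caller has to resolve it to an integer before starting workers. The sketch below shows one plausible way to do so. Mapping 'max' to the machine's CPU count is an assumption, and the function name is hypothetical.

```python
import os


def resolve_n_processes(value="1"):
    """Turn the --n_processes string into a positive integer.

    Assumption: 'max' means the number of CPUs reported by the OS.
    """
    if str(value).strip().lower() == "max":
        return os.cpu_count() or 1  # cpu_count() may return None
    n = int(value)
    if n < 1:
        raise ValueError("n_processes must be at least 1 or 'max'")
    return n
```

Since memory use grows with the number of processes, it may be safer to pass an explicit small number than 'max' on shared machines.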