CRISPRessoWGS

CRISPRessoWGS is a utility for analyzing genome editing experiments from whole genome sequencing (WGS) data. It allows exploring any region of the genome to quantify targeted editing or potential off-target effects. The intended use case is the analysis of targeted regions: WGS reads from those regions are realigned using CRISPResso's alignment algorithm for more accurate quantification of genome editing. To scan the entire genome for mutations, tools such as VarScan or MuTect are more suitable; the regions they identify can then be analyzed and visualized using CRISPRessoWGS.

CRISPRessoWGS Inputs

To run CRISPRessoWGS you must provide:

A genome-aligned BAM file. Many tools can align reads from a WGS experiment to the genome; we suggest using either Bowtie2 or BWA.

A FASTA file containing the reference sequence used to align the reads and create the BAM file. Reference files for the most common organisms can be downloaded from UCSC: http://hgdownload.soe.ucsc.edu/downloads.html. Download and uncompress only the file ending with .fa.gz; for example, for the latest version of the human genome, download and uncompress the file hg38.fa.gz.

A description file (--region_file) containing the coordinates of the regions to analyze and some additional information. This is a tab-delimited text file with up to 7 columns (the first 4 are required):

  • chr_id: chromosome of the region in the reference genome.

  • bpstart: start coordinate of the region in the reference genome.

  • bpend: end coordinate of the region in the reference genome.

  • REGION_NAME: an identifier for the region (must be unique).

  • sgRNA_SEQUENCE (OPTIONAL): sgRNA sequence used for this genomic segment without the PAM sequence. If not available, enter NA.

  • EXPECTED_SEGMENT_AFTER_HDR (OPTIONAL): expected genomic segment sequence in case of HDR. If more than one, separate by commas and not spaces. If not available, enter NA.

  • CODING_SEQUENCE (OPTIONAL): Subsequence(s) of the genomic segment corresponding to coding sequences. If more than one, separate by commas and not spaces. If not available, enter NA.
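As a rough illustration of the layout described above, the following Python sketch writes a region file with the seven columns in the listed order, fills missing optional values with NA, joins multiple sequences with commas, and rejects duplicate region names. The helper name write_region_file is hypothetical and not part of CRISPResso, and the start/end sanity check is an assumption rather than a documented rule. Whether the file may carry a header row is not stated here, so the sketch writes none.

```python
import csv

# Column order as listed above; the last three columns are optional.
COLUMNS = [
    "chr_id", "bpstart", "bpend", "REGION_NAME",
    "sgRNA_SEQUENCE", "EXPECTED_SEGMENT_AFTER_HDR", "CODING_SEQUENCE",
]


def write_region_file(path, regions):
    """Write regions (dicts keyed by COLUMNS) as tab-delimited text.

    Hypothetical helper for illustration only.
    """
    seen_names = set()
    rows = []
    for region in regions:
        name = region["REGION_NAME"]
        if name in seen_names:
            raise ValueError(f"duplicate region name: {name}")
        seen_names.add(name)

        start, end = int(region["bpstart"]), int(region["bpend"])
        if start >= end:  # assumed sanity check, not a documented rule
            raise ValueError(f"{name}: bpstart must be smaller than bpend")

        row = [region["chr_id"], start, end, name]
        for column in COLUMNS[4:]:
            value = region.get(column) or "NA"  # NA when not available
            if isinstance(value, (list, tuple)):
                value = ",".join(value)  # commas, no spaces
            row.append(value)
        rows.append(row)

    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)
```

Calling write_region_file("regions.txt", [...]) with one dict per region produces one line per region; the coordinates and sequences passed in are entirely up to the caller.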

CRISPRessoWGS Parameters

CRISPRessoWGS accepts the following parameters:

--amplicon_min_alignment_score --default_min_aln_score --expand_ambiguous_alignments --assign_ambiguous_alignments_to_first_reference --guide_seq --guide_name --flexiguide_seq --flexiguide_homology --flexiguide_name --discard_guide_positions_overhanging_amplicon_edge --expected_hdr_amplicon_seq --coding_seq --config_file --min_average_read_quality --min_single_bp_quality --min_bp_quality_or_N --file_prefix --name --suppress_amplicon_name_truncation --output_folder --verbosity --trim_sequences --trimmomatic_command --trimmomatic_options_string --flash_command --fastp_command --fastp_options_string --min_paired_end_reads_overlap --max_paired_end_reads_overlap --stringent_flash_merging --quantification_window_size --quantification_window_center --exclude_bp_from_left --exclude_bp_from_right --use_legacy_insertion_quantification --ignore_substitutions --ignore_insertions --ignore_deletions --discard_indel_reads --needleman_wunsch_gap_open --needleman_wunsch_gap_extend --needleman_wunsch_gap_incentive --needleman_wunsch_aln_matrix_loc --plot_histogram_outliers --plot_window_size --min_frequency_alleles_around_cut_to_plot --expand_allele_plots_by_quantification --allele_plot_pcts_only_for_assigned_reference --quantification_window_coordinates --annotate_wildtype_allele --keep_intermediate --dump --write_detailed_allele_table --fastq_output --bam_output --bowtie2_index --zip_output --max_rows_alleles_around_cut_to_plot --suppress_report --place_report_in_output_folder --suppress_plots --base_editor_output --conversion_nuc_from --conversion_nuc_to --prime_editing_pegRNA_spacer_seq --prime_editing_pegRNA_extension_seq --prime_editing_pegRNA_extension_quantification_window_size --prime_editing_pegRNA_scaffold_seq --prime_editing_pegRNA_scaffold_min_match_length --prime_editing_nicking_guide_seq --prime_editing_override_prime_edited_ref_seq --prime_editing_override_sequence_checks --crispresso1_mode --dsODN --auto --debug --no_rerun --n_processes --skip_failed --crispresso_command --gene_annotations --bam_file --region_file --reference_file --min_reads_to_use_region_wgs --disable_guardrails --use_matplotlib

CRISPRessoWGS Examples
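A minimal invocation can be sketched from the three required inputs and the flags listed above. The Python snippet below only assembles and prints the command line; it does not run the tool. The flag names are taken from the parameter list, but pairing --bam_file and --reference_file with the BAM and FASTA inputs is inferred from their names, the helper build_wgs_command is hypothetical, and the file names are placeholders.

```python
import shlex


def build_wgs_command(bam_file, reference_file, region_file,
                      name=None, output_folder=None):
    """Assemble a CRISPRessoWGS command as an argument list.

    Hypothetical helper for illustration only.
    """
    cmd = [
        "CRISPRessoWGS",
        "--bam_file", bam_file,              # genome-aligned BAM (assumed mapping)
        "--reference_file", reference_file,  # FASTA used for alignment (assumed mapping)
        "--region_file", region_file,        # tab-delimited region description file
    ]
    if name is not None:
        cmd += ["--name", name]
    if output_folder is not None:
        cmd += ["--output_folder", output_folder]
    return cmd


# Placeholder file names; replace with real paths.
cmd = build_wgs_command("sample.bam", "hg38.fa", "regions.txt", name="demo")
print(shlex.join(cmd))

# To launch the analysis, the list could be passed to
# subprocess.run(cmd, check=True) once CRISPRessoWGS is installed.
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.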