Demo Data is available on BSSH with the Project name "NovaSeq6000/NovaSeqX: Illumina FFPE DNA Prep with Exome 2.5 Enrichment - Demo Data" and on ICA with the Bundle name "Illumina FFPE DNA Prep with Exome 2.5 Enrichment - Demo Data".
Data includes 14 samples, or seven Tumor/Normal pairs:
SNV-tumor-AF10-NovaSeq6K & SNV-normal-NovaSeq6K: Seracare Seraseq Tumor Mutation DNA Mix v2 AF10 and Seraseq WT (DNA/RNA) Reference Material. Samples with known truth variants at an expected variant allele frequency (VAF) of 10%, sequenced on the NovaSeq6000 Sequencing System.
SNV-tumor-AF5-NovaSeqX & SNV-normal-NovaSeqX: Seracare Seraseq Tumor Mutation DNA Mix v2 AF5 and Seraseq WT (DNA/RNA) Reference Material. Seracare Reference Material diluted to an expected VAF of 5%, sequenced on the NovaSeqX Sequencing System.
CNV-tumor-NovaSeq6K & CNV-normal-NovaSeq6K: Seracare Seraseq Solid Tumor CNV Mix +3 Copies and Seraseq WT (DNA/RNA) Reference Material. Sample with known duplication (CNV) events at an expected fold change of 3x, sequenced on the NovaSeq6000 Sequencing System.
FFPE-tumor-21706-NovaSeq6K & FFPE-normal-21707-NovaSeq6K: clinical formalin-fixed paraffin-embedded (FFPE) tumor sample 21706 and benign adjacent tissue sample 21707. FFPE samples sequenced on the NovaSeq6000 Sequencing System.
FFPE-tumor-21706-NovaSeqX & FFPE-normal-21707-NovaSeqX: clinical formalin-fixed paraffin-embedded (FFPE) tumor sample 21706 and benign adjacent tissue sample 21707. FFPE samples sequenced on the NovaSeqX Sequencing System.
FFPE-tumor-12293-exome-NovaSeq6K & FFPE-normal-12294-exome-NovaSeq6K: FFPE tumor sample 12293 and benign adjacent tissue sample 12294, enriched with the Twist Exome v2.5 panel. FFPE samples sequenced on the NovaSeq6000 Sequencing System.
FFPE-tumor-12293-exome-plus-spike-NovaSeq6K & FFPE-normal-12294-exome-plus-spike-NovaSeq6K: FFPE tumor sample 12293 and benign adjacent tissue sample 12294, enriched with the Twist Exome v2.5 panel and a custom spike-in panel. FFPE samples sequenced on the NovaSeq6000 Sequencing System.
Sample pair #7 (FFPE 12293/12294 enriched with the exome panel plus a custom spike-in panel) was not included in variant calling because a systematic noise file and a Panel of Normals were not available for the combined covered regions. The pre-built WES systematic noise file based on the exome region (without spike-in) can be used, but precision in the custom spike-in areas may be reduced. The pre-built PON target counts based on the exome region (without spike-in) can be used, but CNV events may be inaccurate. Differences in coverage can be seen by viewing the BAM files.
Detailed instructions on how to configure the DRAGEN command-line for analyzing samples processed with the Illumina FFPE DNA Prep with Exome 2.5 Enrichment.
The workflow is a multi-phase process. Before DRAGEN v4.4, the somatic UMI Tumor/Normal pipeline could only process pre-collapsed BAM files. To use this pipeline with raw FASTQ files, map/align and read-collapse the tumor and normal samples individually. The resulting BAM files are then fed into the variant calling step.
Run mapping and UMI collapsing on each of the samples:
Run variant calling on a tumor/normal pair:
For DRAGEN somatic runs it is recommended to use the linear (non-graph) hashtable.
FQ list Input, FQ Input, BAM Input, or CRAM Input
Microsatellite Instability (MSI) and Homologous Recombination Deficiency (HRD) have also been evaluated for samples with the Illumina FFPE DNA Prep with Exome 2.5 Enrichment protocol. See Additional Information on determining these biomarkers.
Product files to support the analysis of samples prepared with the Illumina FFPE DNA Prep with Exome 2.5 Enrichment are distributed as a BaseSpace Project and an ICA Bundle. Please reach out to your local sales representative to get access. Note that these product files are considered early access and will be superseded by official resource files published on the page.
Target BED
If no spike-in probes were included in enrichment, download from for hg19 or hg38 reference genomes.
If other spike-in probes were included in enrichment, use a BED file of the combined coverage areas, i.e., Exome 2.5 Plus Panel and the custom panel.
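One way to think about a "combined coverage" BED file is as the sorted, merged union of the intervals from both panels. The following Python sketch illustrates that idea only. It assumes simple BED input where the first three columns are chromosome, start, and end; the function name `merge_bed_intervals` and the example coordinates are hypothetical and not part of any Illumina tool. Chromosomes are sorted by name here, which may differ from the order expected by a given reference.

```python
from collections import defaultdict


def merge_bed_intervals(*bed_texts):
    """Return the merged union of intervals from several BED-formatted strings."""
    by_chrom = defaultdict(list)
    for text in bed_texts:
        for line in text.splitlines():
            line = line.strip()
            # Skip blank lines and common BED header lines.
            if not line or line.startswith(("#", "track", "browser")):
                continue
            chrom, start, end = line.split()[:3]
            by_chrom[chrom].append((int(start), int(end)))

    merged = []
    for chrom in sorted(by_chrom):  # name order, an assumption of this sketch
        cur_start, cur_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if cur_end is None or start > cur_end:
                # Gap before this interval: close the previous one, if any.
                if cur_end is not None:
                    merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
            else:
                # Overlapping or touching interval: extend the current one.
                cur_end = max(cur_end, end)
        merged.append((chrom, cur_start, cur_end))
    return merged


# Hypothetical coordinates, for illustration only.
exome = "chr1\t100\t200\nchr1\t300\t400\n"
spike_in = "chr1\t150\t250\nchr2\t10\t50\n"

for chrom, start, end in merge_bed_intervals(exome, spike_in):
    print(f"{chrom}\t{start}\t{end}")
```

In this toy input, the overlapping chr1 intervals 100-200 and 150-250 collapse into a single 100-250 interval, while the non-overlapping intervals are kept unchanged.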
Systematic noise files: considered essential for reducing false positive calls in Tumor-Only workflows, and they are also effective at improving precision in Tumor-Normal workflows.
Prebuilt systematic noise BED files are available for download. The WES_*_v2.0.0_systematic_noise.snv.bed.gz noise files are built from a mixture of fresh-frozen (FF) and FFPE samples, with a mix of TruSeq PCR prep and Nextera prep. Reference genome builds (the * in the file name) include hg19 and hg38.
For instructions on how to build your own systematic noise file using internally-sequenced normal samples, see . The normal samples used to generate the systematic noise file should match as closely as possible the sequencer, sample type, and library prep of the tumor samples. Also available is the DRAGEN Baseline Builder App on BSSH.
CNV - Somatic pipelines: A Panel of Normals (PON) is used for calling gene amplification in tumor samples.
Provided in the BSSH project/ICA bundle are the individual target.counts files (i.e., one per normal sample) and combined target.counts files. The PON was generated from 45 FFPE benign adjacent samples from different tissue types and from male and female donors with the Illumina FFPE DNA Prep with Exome 2.5 Enrichment protocol. Libraries were sequenced on the NovaSeq 6000 sequencing system. Currently supported builds include hg19 and hg38, each with both target.counts and gc-corrected.target.counts files.
As with the systematic noise file, internally-sequenced normal samples can be used to . The input should be the UMI-collapsed BAM files of the normal samples, and the reference genome and target BED need to match those used throughout the workflow. Also available are the DRAGEN Baseline Builder App on BSSH and the DRAGEN CNV Baseline Builder 4-3-6 Pipeline on ICA.
MSI - Reference directory of normals: To be used when enabling the MSI biomarker in Tumor-only mode. It contains the microsatellite repeat distribution in a collection of normals.
A collection of 44 .dist files from benign adjacent samples from different tissue types and from male and female donors with the Illumina FFPE DNA Prep with Exome 2.5 Enrichment protocol. Libraries were sequenced on the NovaSeq 6000 sequencing system. Current supported builds include hg19 and hg38.
See the reference directory from internally-sequenced normal samples. Also available are the DRAGEN Baseline Builder App on BSSH and the DRAGEN MSI Baseline Builder 4-3-6 Pipeline on ICA.
MSI - Microsatellite sites: list of microsatellite sites from which to calculate instability
An updated list of WES microsatellite sites, with DRAGEN v4.4.
See the
The following DRAGEN recipes are specific for Illumina FFPE DNA Prep with Exome 2.5 Enrichment libraries. A general recipe for Somatic analysis of Tumor Normal samples with UMI is provided on .
Please see:
DRAGEN input sources include FASTQ list, FASTQ, BAM, or CRAM. For preprocessing, there is no need to distinguish whether the sample is tumor or normal (although it can be done) because the output is simply a collapsed BAM file.
HRD is under development.
Performance improvements are planned with upcoming software releases. In the meantime, HRD results should be considered preliminary. HRD can be enabled by adding --enable-hrd true to the command line or by checking Enable HRD in the BaseSpace and ICA applications. Note that CNV calling must be enabled in order to calculate HRD.
Microsatellite Instability (MSI): If MSI values are needed ahead of software updates, it is recommended to calculate MSI on the tumor FASTQ files in tumor-only mode in a separate DRAGEN analysis.
MSI is under active development.
In upcoming software releases, it will be possible to calculate MSI in Tumor-Only mode even while running a Tumor/Normal workflow, thus eliminating the need for an additional analysis.
Somatic CNV Calling—Select Tumor-only
Input BAM—Select the tumor BAM and set the Sample Sex to Unknown. Click Add a New Row to input another tumor sample. Each tumor BAM is entered as a new row.
Systematic Noise Filter—uncheck Enable Systematic Noise Filter.
CNV Baseline—None
View the Biomarkers options by clicking on + Biomarkers
Microsatellite Instability—check Enable MSI calling
MSI References—navigate to the Project containing the reference directory of *microsat_normal.dist files that match the reference genome used.
Custom MSI Regions—navigate to the *.tsv file containing the microsatellite list file that matches the reference genome used.
View the Advanced Settings options by clicking on + Advanced Settings
Enable Duplicate Marking—Uncheck this setting
Enable Variant Calling—Uncheck this setting
Enable Common Germline Variant Tagging—Uncheck this setting
Enable Multi-allelic Filtering—Uncheck this setting
Check the acknowledge and agree box
In Additional DRAGEN Command-line Arguments, add --enable-cnv false
Enter the Tumor BAM File for tumor sample(s)
Enter the Tumor BAM Index for the tumor sample(s)
Systematic Noise BED File—Do not select
CNV Files—Do not select
Microsatellites File—navigate to the *.list file containing the microsatellite list file that matches the reference genome used.
Microsatellites Normal References Directory—navigate to the directory of *microsat_normal.dist files that match the reference genome used.
Enable Small Variant Caller—false
Enable Germline Tagging—false
MSI Command—tumor-only
MSI Coverage Threshold—60
Enable Tumor Mutational Burden—false
Enable Variant Annotation—false
If MSI values are needed ahead of software updates, it is recommended to calculate MSI on the tumor BAM files in Tumor-only mode in a separate DRAGEN analysis. Tumor-only mode relies on a reference directory of files (.dist) containing the microsatellite repeat distribution in a panel of normals. Both the reference directory and microsatellite list are provided in a BSSH project/ICA bundle - see the for more information - and need to match the reference genome used to create the collapsed BAM in the preprocessing step.
Follow the (Phase 2) with the following changes:
Follow the (Phase 2) with the following changes:
Potential changes to the analysis settings to improve performance when analyzing Illumina FFPE DNA Prep with Exome 2.5 Enrichment.
Illumina FFPE DNA Prep with Exome 2.5 Enrichment is part of an integrated whole-exome sequencing (WES) Tumor-Normal workflow to deliver variant calling and biomarker analysis in low-input formalin-fixed paraffin-embedded (FFPE) samples. This page provides software user guides to Illumina's cloud-based and on-premise solutions for the data analysis of this library preparation kit.
The software performs the following workflows to analyze sequencing data.
DNA Mapping and Aligning
Somatic Small Variant (SNV) Caller
Copy Number Variant (CNV) Caller
Tumor Mutational Burden (TMB)
Variant annotation
Notes on additional biomarkers:
Homologous Recombination Deficiency (HRD) is also under development, and performance improvements are expected with the release of upcoming software versions.
Structural Variant (SV) Caller results have not been evaluated for accuracy. Calculations for SV can add significant runtime to the analysis.
Microsatellite Instability (MSI) is under development. With current software versions, it is recommended to run the tumor sample in Tumor-only mode for highest performance of this biomarker. See the
Only those variants in the somatic VCF that match a known allele from a database are tagged as germline. In T/N mode, where the VCF usually includes only somatic variants, some of these somatic variants may match a known allele from a database and then be tagged as germline.
Germline variants can be labeled in the somatic VCF by adding a flag to the command line recipe or by checking Enable Common Germline Variant Tagging in the BSSH/ICA applications. Note that this flag has implications for other settings. First, annotation must be enabled. Second, if TMB and germline tagging are both enabled, then the TMB database filter needs to be set to false. With these two flags, germline variants are tagged in the VCF but ignored during TMB calculation.
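The dependencies between these settings can be written down as a small consistency check. The Python sketch below is only an illustration of the rules stated in this guide (germline tagging and TMB require annotation, the TMB database filter must be false when TMB and germline tagging are combined, and HRD requires CNV calling). The dictionary keys are hypothetical names chosen for this example and are not DRAGEN options; missing keys are treated as false, which is an assumption of the sketch.

```python
def settings_conflicts(settings):
    """Return a list of conflicts between analysis settings; empty if none found.

    `settings` maps hypothetical setting names to booleans. Missing keys are
    treated as False (an assumption of this sketch).
    """
    def on(key):
        return bool(settings.get(key, False))

    problems = []
    if on("germline_tagging") and not on("variant_annotation"):
        problems.append("germline tagging requires variant annotation")
    if on("tmb") and not on("variant_annotation"):
        problems.append("TMB requires variant annotation")
    if on("tmb") and on("germline_tagging") and on("tmb_db_filter"):
        problems.append(
            "set the TMB database filter to false when TMB and "
            "germline tagging are both enabled"
        )
    if on("hrd") and not on("cnv"):
        problems.append("HRD requires CNV calling")
    return problems


# Hypothetical run configuration, for illustration only.
example = {
    "variant_annotation": False,
    "germline_tagging": True,
    "tmb": True,
    "tmb_db_filter": True,
    "hrd": True,
    "cnv": True,
}
for problem in settings_conflicts(example):
    print(problem)
```

A check like this can be run before submitting an analysis so that conflicting selections are caught early rather than after a failed or misleading run.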
The matched normal provides a more reliable indicator of somatic variants than a database. Database germline tagging for SNV incorrectly tags ~2% of germline variants as somatic and falsely classifies some somatic variants as germline (i.e. false positives and false negatives for somatic SNVs). Additionally, there is limited support for differentiating between germline and somatic CNVs. If a paired normal sample is available, the highest performance for the detection of both somatic and germline variants is achieved by implementing the Tumor/Normal Somatic Variant Calling workflow (as described in this user guide) and also the Germline Caller on the Normal sample.
In somatic T/N analysis, DRAGEN uses a subtraction model for SNVs. The VCF only includes variants for which there are no (or very few) alternate reads in the normal sample; in other words, variants that are found in both the normal and tumor samples will NOT be emitted in the SNV/INDEL VCF. This means that most germline variants will be excluded from the somatic VCF. To extract germline variants, it is recommended to run an additional germline analysis on the normal sample. In BSSH, there is an option to Enable Small Variant Calling on Normal Samples while running Somatic in Tumor-Normal mode. Because the germline and somatic variants are output from separate analyses, the variants need to be either combined before further exploration or explored separately.
Detailed instructions on how to configure the BSSH Somatic App for analyzing samples processed with the Illumina FFPE DNA Prep with Exome 2.5 Enrichment.
The workflow is a 2-phase process. Tumor and normal samples need to be run as individual samples for the Map/Align stage, and then Variant Calling is started from tumor and normal UMI collapsed BAM as tumor-normal pairs.
The goal of this phase is to generate UMI collapsed BAM files, and these steps are done using the DRAGEN Enrichment app.
Analysis name—Name of the analysis
Save Result To—Project to store analysis results in.
Input FASTQs—Select the tumor and normal biosample FASTQs.
Sample Sex—Select Unknown.
Variant Caller Mode—Select Germline.
Reference—The reference genome to use for alignment. A linear genome is recommended for somatic variant calling, so to keep the reference consistent, use Homo sapiens [UCSC] hg19 v4 or Homo sapiens [1000 Genomes] hg38 v4 here for mapping.
View the UMI Settings options by clicking on + UMI Settings
Check the box next to Enable UMI
UMI Library Type—Select Nonrandom-duplex, the UMI type used with the Illumina FFPE DNA Prep with Exome 2.5 Enrichment kit.
UMI-Aware Variant Calling—select Germline
UMI Min Supporting Reads—enter 1. Change from the default of 2.
Analysis name—Name of the analysis
Save Result To—Project to store analysis results in.
Pipeline Configuration—Select Somatic Small Variant Caller (Aligned BAM or CRAM input).
Somatic CNV Calling—Select Tumor-Normal
Input BAM—Select the tumor BAM, the paired normal BAM, and the Sample Sex to Unknown. Click Add a New Row to input another tumor-normal pair. Each pair is entered as a new row.
Reference—The reference genome to use for alignment, and the same Reference used above in the Map/Align step. Currently Homo sapiens [UCSC] hg19 v4 and Homo sapiens [1000 Genomes] hg38 v4 are supported for somatic variant calling.
Targeted Regions—The genomic regions targeted for enrichment, and the same Targeted Regions used above in the Map/Align step. If no spike-in probes were included in enrichment, select Twist Bioscience for Illumina Exome 2.5 Plus Panel. If other spike-in probes were included in enrichment, select Custom BED and under Target BED File select Select Dataset File(s). This must be a BED file of the combined coverage areas, i.e., Exome 2.5 Plus Panel and the custom panel.
Systematic Noise Filter—Leave Enable Systematic Noise Filter checked. Select Dataset File(s) and choose the systematic noise file that matches the Reference genome used for alignment.
CNV Baseline—Select Custom (Select CNV Baseline Files below). Select Dataset File(s). Files must be of the same type (e.g., all .target.counts or all .target.counts.gc-corrected). Navigate to the Project containing the target.counts files and select those that match the Reference genome selected above and have the desired handling of GC correction.
GC Bias Correction—The type of CNV Baseline Files must match the checkbox here. For instance, if .target.counts.gc-corrected files are used for the CNV Baseline Files, then keep GC Bias Correction enabled.
View the Advanced Settings options by clicking on + Advanced Settings
Enable Duplicate Marking—Uncheck this setting.
Enable Common Germline Variant Tagging—Uncheck this setting.
Somatic Quality Filtering—Check Enable Setting Somatic Quality Filtering Thresholds. Set Somatic variant quality call threshold to 3 and Somatic variant quality filter threshold to 15.
Nirvana Annotation—Check Enable Nirvana variant annotation.
The Illumina FFPE DNA Prep with Exome 2.5 Enrichment supports the addition of probes during the enrichment process. Spike-in probes are intended to increase coverage in Exome v2.5 regions or add coverage in areas not covered by the Exome 2.5 panel. The analysis workflow will be similar, but attention should be paid to the resource files.
Systematic noise file. If the spike-in panel adds coverage to new regions of the genome, it is recommended to generate a new systematic noise file from normal samples enriched with Exome v2.5 plus spike-in. The systematic noise file drives a somatic filter that removes noise appearing consistently at specific locations in the reference genome. Without it, variant calls in the new regions are more vulnerable to false positives than calls in regions the file covers.
Panel of Normals (PON) for CNV calls. Whether the spike-in probes add coverage to new regions of the genome or increase coverage of already covered regions, it is always recommended to generate a custom PON for your specific conditions if calling CNVs. CNVs are called in regions characterized in the PON, so if the case sample has coverage in additional regions compared to the PON because of a spike-in, analysis will not call CNV events in those additional regions. However, the addition of spike-in probes can change enrichment broadly and may cause coverage fluctuations in previously characterized regions. Thus, CNV calls in regions that are shared by the PON and the case sample could be impacted by the addition of spike-in probes.
DRAGEN Enrichment and Somatic apps in BaseSpace Sequence Hub are used to analyze sequencing data from Illumina FFPE DNA Prep with Exome 2.5 Enrichment libraries. Refer to online support for general information about using the and the . The following DRAGEN Enrichment and Somatic app settings are specific for Illumina FFPE DNA Prep with Exome 2.5 Enrichment libraries.
Targeted Regions—The genomic regions targeted for enrichment. If no spike-in probes were included in enrichment, select Twist Bioscience for Illumina Exome 2.5 Plus Panel from the drop-down list. If other spike-in probes were included in enrichment, select Custom BED and under Target BED File select Select Dataset File(s). This must be a BED file of the combined coverage areas, i.e., Exome 2.5 Plus Panel and the custom panel. Refer to .
Targeted Regions. This must be a BED file of the combined coverage areas, i.e., Exome 2.5 Plus Panel and the custom panel. Refer to this.
Detailed instructions on how to configure ICA for analyzing samples processed with the Illumina FFPE DNA Prep with Exome 2.5 Enrichment.
Use the DRAGEN Germline Enrichment and the DRAGEN Somatic Enrichment Pipelines in ICA to analyze sequencing data from Illumina FFPE DNA Prep with Exome 2.5 Enrichment libraries. The following settings are specific for this library preparation kit.
The workflow is a 2-phase process. Tumor and normal samples need to be run as individual samples for the Map/Align stage, and then Variant Calling is started from tumor and normal UMI collapsed BAM as tumor-normal pairs.
The goal of this phase is to generate UMI collapsed BAM files, and these steps are done using the DRAGEN Germline Enrichment app.
Input the fastq (or ORA) files for both the tumor and normal samples.
Reference
Target BED file
BED file that contains targeted regions. If no spike-in probes were included in enrichment, select the Twist_ILMN_Exome_2.5_Panel bed file that matches the reference genome selected above, stored in the Illumina Enrichment BEDs directory.
If other spike-in probes were included in enrichment, generate a BED file of the combined coverage areas (i.e., Exome 2.5 Plus Panel with the custom panel), upload it to ICA, and select it as the target BED file. See the Resource Files for information on format requirements.
Additional Files
Select the same Target BED file selected in Step 6.
Map Align Options
Enable Map/Align—true
Enable Map/Align Output—true
Map/Align Output—bam
Enable Duplicate Marking—false
Variant Calling Options
Enable Small Variant Caller—false
Enable CNV calling—false
Enable SV calling—false
Variant Annotation Options
Enable Variant Annotation—false
Additional Options
Additional DRAGEN Args—Paste the following arguments into the box, making sure to update the name of the umi-metrics-interval-file
--enable-umi true --umi-library-type nonrandom-duplex --umi-metrics-interval-file <name_of_file_in_Additional_Files> --umi-min-supporting-reads 1 --umi-start-mask-length 1 --umi-end-mask-length 3 --qc-coverage-ignore-overlaps true
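When many runs are configured, it can help to generate this argument string rather than edit it by hand. The Python sketch below assembles the same options listed above, assuming every option uses the double-dash form, and substitutes the interval file name. The function name `umi_collapse_args` and the example file name are hypothetical; only the option names and values come from this guide.

```python
import shlex


def umi_collapse_args(interval_file):
    """Build the Additional DRAGEN Args string for the UMI collapsing phase.

    `interval_file` is the name of the BED file selected under Additional Files.
    """
    options = [
        ("--enable-umi", "true"),
        ("--umi-library-type", "nonrandom-duplex"),
        ("--umi-metrics-interval-file", interval_file),
        ("--umi-min-supporting-reads", "1"),
        ("--umi-start-mask-length", "1"),
        ("--umi-end-mask-length", "3"),
        ("--qc-coverage-ignore-overlaps", "true"),
    ]
    tokens = [token for pair in options for token in pair]
    # shlex.join quotes any token that contains shell-unsafe characters.
    return shlex.join(tokens)


# Hypothetical file name, for illustration only.
print(umi_collapse_args("my_combined_targets.bed"))
```

Keeping the options in one list makes it easy to see that only the interval file name changes between runs.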
The goal of this phase is to perform variant calling, CNV calling, and, if desired, analysis of certain biomarkers. The inputs are the UMI-collapsed BAM files from Phase 1.
Input files
Enter the input BAM files for a tumor-normal pair. Each tumor-normal sample pair requires a separate run of the pipeline. Because analysis is being done on a tumor-normal pair, realignment needs to be turned off, so both the Tumor and Normal BAM Index (.bai) files need to be selected along with the Tumor and Normal BAM files.
Reference
Select the same reference genome used above for alignment.
Target BED file
Select the same target bed file used above for alignment.
Systematic Noise BED File
CNV Panel of Normals
General Options
Sample Sex—Choose none from the drop-down list
Map Align Options
Enable Map/Align—false
Enable Map/Align Output—false
Enable Duplicate Marking—false
Variant Calling Options
Enable Small Variant Caller—true
Leave Emit Ref Confidence and VCF File Output blank
Enable CNV calling—true
CNV Use Somatic VC BAF—true
Enable SV calling—false
Targeted Callers
If desired, set Enable Tumor Mutational Burden—true. Note that this requires variant annotation to be enabled.
UMI Options
Enable UMI—false
UMI Library Type—leave blank
UMI Aware Variant Calling—Select Low depth from the drop-down list
Minimum Supporting UMI Reads—remove
Variant Annotation Options
This must be enabled if calculating TMB or adding germline tagging.
Enable Variant Annotation—true
Variant Annotation Assembly—reference corresponding to the genome used for variant calling
Advanced Options
Additional DRAGEN Args—Paste the following text in the box
--vc-sq-call-threshold 3 --vc-sq-filter-threshold 15
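As a rough mental model of these two thresholds, a candidate variant's somatic quality (SQ) score decides whether it is written to the VCF at all and, if written, whether it is marked as filtered. The Python sketch below encodes that reading using the values above. It is an illustration under stated assumptions, not DRAGEN's implementation: the exact boundary handling (inclusive versus exclusive) and the interaction with other filters are assumptions, and the function name `classify_by_sq` is hypothetical.

```python
SQ_CALL_THRESHOLD = 3.0     # value passed to --vc-sq-call-threshold above
SQ_FILTER_THRESHOLD = 15.0  # value passed to --vc-sq-filter-threshold above


def classify_by_sq(sq, call_threshold=SQ_CALL_THRESHOLD,
                   filter_threshold=SQ_FILTER_THRESHOLD):
    """Classify a candidate variant by its SQ score (simplified model).

    Assumes scores below the call threshold are not written to the VCF,
    scores below the filter threshold are written but marked as filtered,
    and all other filters pass.
    """
    if sq < call_threshold:
        return "not emitted"
    if sq < filter_threshold:
        return "emitted, filtered"
    return "emitted, PASS"


# Hypothetical SQ scores, for illustration only.
for score in (1.2, 7.5, 42.0):
    print(score, classify_by_sq(score))
```

Under this model, lowering the filter threshold converts some filtered records to passing records, which trades precision for sensitivity.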
The reference genome to use for alignment. Reference genome files are located with the Illumina DRAGEN v10 Reference directory. The provided support the use of hg19-alt_masked.cnv.hla.methylation_combined.rna_v4.tar.gz and hg38-alt_masked.cnv.hla.methylation_combined.rna_v4.tar.gz.
Select the systematic noise file that matches the Reference genome used for alignment. See the for information on pre-built and generating your own file.
Select the files that make up the target.counts baseline. Files must be of the same type (e.g., all .target.counts or all .target.counts.gc-corrected). Select those that match the Reference genome selected above and have the desired handling of GC correction. See the for information on pre-built target.counts files.
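The "same type" requirement can be checked before launching an analysis. The Python sketch below classifies baseline files by name and verifies that they are all of one kind and that the kind agrees with the GC bias correction setting. It assumes the file type can be recognized from the substrings `target.counts` and `gc-corrected` in the file name; the function names and example file names are hypothetical.

```python
def baseline_kind(filename):
    """Classify a CNV baseline file by name; None if it is not recognized."""
    name = filename.lower()
    if "target.counts" not in name:
        return None
    return "gc-corrected" if "gc-corrected" in name else "raw"


def check_baseline_files(filenames, gc_bias_correction_enabled):
    """Return the common baseline kind, or raise ValueError on a mismatch."""
    kinds = {baseline_kind(name) for name in filenames}
    if not kinds:
        raise ValueError("no baseline files selected")
    if None in kinds:
        raise ValueError("at least one file is not a target.counts file")
    if len(kinds) != 1:
        raise ValueError("baseline files mix raw and GC-corrected counts")
    kind = kinds.pop()
    expected = "gc-corrected" if gc_bias_correction_enabled else "raw"
    if kind != expected:
        raise ValueError(
            f"baseline files are {kind} but GC bias correction implies {expected}"
        )
    return kind


# Hypothetical file names, for illustration only.
selected = [
    "normal_01.target.counts.gc-corrected.gz",
    "normal_02.target.counts.gc-corrected.gz",
]
print(check_baseline_files(selected, gc_bias_correction_enabled=True))
```

Raising an error on a mixed selection mirrors the guidance that all baseline files must share one type.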