Illumina FFPE DNA Prep with Exome 2.5

Illumina FFPE DNA Prep with Exome 2.5 Enrichment

Illumina FFPE DNA Prep with Exome 2.5 Enrichment is part of an integrated whole-exome sequencing (WES) Tumor-Normal workflow to deliver variant calling and biomarker analysis in low-input formalin-fixed paraffin-embedded (FFPE) samples. This page provides software user guides to Illumina's cloud-based and on-premise solutions for the data analysis of this library preparation kit.

The software performs the following workflows to analyze sequencing data.

  • DNA Mapping and Aligning

  • Somatic Small Variant (SNV) Caller

  • Copy Number Variant (CNV) Caller

  • Tumor Mutational Burden (TMB)

  • Microsatellite Instability (MSI)

  • Homologous Recombination Deficiency (HRD)

  • Variant annotation


Structural Variant (SV) caller results have not been evaluated for accuracy. Calculations for SV can add significant runtime to the analysis.

Demo Data

Demo Data is available on BSSH with the Project name "NovaSeq6000/NovaSeqX: Illumina FFPE DNA Prep with Exome 2.5 Enrichment - Demo Data" and on ICA with the Bundle name "Illumina FFPE DNA Prep with Exome 2.5 Enrichment - Demo Data".

Data includes 14 samples, or seven Tumor/Normal pairs:

  1. SNV-tumor-AF10-NovaSeq6K & SNV-normal-NovaSeq6K: Seracare Seraseq Tumor Mutation DNA Mix v2 AF10 and Seraseq WT (DNA/RNA) Reference Material. Samples with known truth variants at expected Variant Allele frequency at 10%, sequenced on the NovaSeq6000 Sequencing System.

  2. SNV-tumor-AF5-NovaSeqX & SNV-normal-NovaSeqX: Seracare Seraseq Tumor Mutation DNA Mix v2 AF5 and Seraseq WT (DNA/RNA) Reference Material. Seracare Reference Material diluted to expected Variant Allele frequency at 5%, sequenced on the NovaSeqX Sequencing System.

  3. CNV-tumor-NovaSeq6K & CNV-normal-NovaSeq6K: Seracare Seraseq Solid Tumor CNV Mix +3 Copies and Seraseq WT (DNA/RNA) Reference Material. Sample with known CNV events at expected duplication events with fold change at 3x, sequenced on the NovaSeq6000 Sequencing System.

  4. FFPE-tumor-21706-NovaSeq6K & FFPE-normal-21707-NovaSeq6K: clinical formalin-fixed paraffin-embedded (FFPE) tumor sample 21706 and benign adjacent tissue sample 21707. FFPE samples sequenced on the NovaSeq6000 Sequencing System.

  5. FFPE-tumor-21706-NovaSeqX & FFPE-normal-21707-NovaSeqX: clinical formalin-fixed paraffin-embedded (FFPE) tumor sample 21706 and benign adjacent tissue sample 21707. FFPE samples sequenced on the NovaSeqX Sequencing System.

  6. FFPE-tumor-12293-exome-NovaSeq6K & FFPE-normal-12294-exome-NovaSeq6K: FFPE tumor sample 12293 and benign adjacent tissue sample 12294, enriched with the Twist Exome v2.5 panel. FFPE samples sequenced on the NovaSeq6000 Sequencing System.

  7. FFPE-tumor-12293-exome-plus-spike-NovaSeq6K & FFPE-normal-12294-exome-plus-spike-NovaSeq6K: FFPE tumor sample 12293 and benign adjacent tissue sample 12294, enriched with the Twist Exome v2.5 panel and a custom spike-in panel. FFPE samples sequenced on the NovaSeq6000 Sequencing System.

Sample pair #7 (FFPE 12293/12294 enriched with and without a custom spike-in panel) was not included in variant calling because a systematic noise file and a Panel of Normals were not available for these combined covered regions. The pre-built WES systematic noise file based on the exome region (without spike-in) can be used, but precision in the custom spike-in areas may be reduced. The pre-built PON target counts based on the exome region (without spike-in) can be used, but CNV events may be inaccurate. Differences in coverage can be seen by viewing the BAM files.

Demo Data in BaseSpace Sequence Hub

Resource Files

Distributed as DRAGEN secondary analysis Product Files

It is recommended that reference files are based on normal (non-tumorous) samples processed with the same methods used for clinical samples, matching library prep, sequencing instrument, etc. When that is not possible, product files to support the analysis of samples prepared with the Illumina FFPE DNA Prep with Exome 2.5 Enrichment are available mostly on the DRAGEN Resources page and detailed here. Generally, CNV and MSI resource files show good performance across sequencing instruments, while systematic noise files are highly dependent on the sequencing instrument and are therefore important to match with clinical samples.

Each resource below is listed with its Prebuilt option and its Custom (build your own) option.

Target BED

Prebuilt: If no spike-in probes were included in enrichment, download from Illumina DNA Prep with Exome 2.5 BED files for hg19 or hg38 reference genomes.

Custom (build your own): If spike-in probes were included in enrichment, use a BED file of the combined coverage areas, i.e., Exome 2.5 Plus Panel plus the custom panel. The requirements of a BED file are defined here.

Systematic noise files: considered essential for reducing false positive calls in Tumor-Only workflows, and they are also effective at improving precision in Tumor-Normal workflows.

Prebuilt: DRAGEN Resources contains prebuilt systematic noise BED files. The directory SNV Somatic Systematic Noise v2.0.0 contains WES_*_v2.0.0_systematic_noise.snv.bed.gz noise files, which are built from a mixture of FF (fresh-frozen) and FFPE samples, with a mix of TruSeq PCR prep and Nextera prep, and sequenced on the NovaSeq 6000. Reference genome builds (the * in the file name) include hg19 and hg38. What is downloaded from the Resource Page is a tarred directory. Extract the tarball, and within the untarred directory, find the systematic noise file. The gzipped file can be uploaded to BSSH/ICA or used in the command-line execution of DRAGEN.

Custom (build your own): For instructions on how to build your own systematic noise file using internally-sequenced normal samples, see Custom Systematic Noise Files. Building your own noise file will be necessary if using spike-in probes. The normal samples used to generate the systematic noise file should match as closely as possible the sequencer, sample type, and library prep of the tumor samples. Also available are the DRAGEN Baseline Builder App on BSSH and the Systematic Noise File Builder on ICA.

CNV - Somatic pipelines: A Panel of Normals (PON) is used for calling gene amplification in tumor samples.

Prebuilt: DRAGEN Resources contains CNV Panel of Normals for Twist Bioscience for Illumina Exome FFPE 2.5 - DRAGEN 4.4 v1.0. The PON was generated from 45 FFPE benign adjacent samples from different tissue types and from male and female donors with the Illumina FFPE DNA Prep with Exome 2.5 Enrichment protocol. Libraries were sequenced on the NovaSeq 6000 sequencing system. Current supported builds include hg19 and hg38, both target.counts and gc-corrected.target.counts.

Custom (build your own): As with the systematic noise file, internally-sequenced normal samples can be used to generate a PON. The input should be the UMI-collapsed BAM files of the normal samples, and the reference genome and target BED need to match those used throughout the workflow. Also available are the DRAGEN Baseline Builder App on BSSH and the DRAGEN CNV Baseline Builder Pipeline on ICA.

MSI - Resource directory: To be used when enabling the biomarker MSI in Tumor-only mode.

Prebuilt: DRAGEN Resources contains Microsatellite Files v1.1.1. Within the WES directory are *.MSI_baselines_v1.1.0.combined.dist and *microsatellites.list files. The normal samples combined into these baseline files are benign adjacent samples from different tissue types and from male and female donors. The normal samples were prepared with the Illumina FFPE DNA Prep with Exome 2.5 Enrichment protocol. Libraries were sequenced on the NovaSeq 6000 sequencing system. Current supported builds include hg19 and hg38. The list of microsatellite sites from which to calculate instability is included in the *microsatellites.list files. Note that these microsatellite files do not contain sites on chrY, and that the same genome reference build should be used for the baseline combined dist and microsatellite list files. What is downloaded from the Resource Page is a tarred directory. Extract the tarball, and within the untarred directory, find the WES directory. Within that directory are *.combined.dist files based on different reference genome builds. Combined .dist files are input for DRAGEN v4.4.

Custom (build your own): See the DRAGEN manual for instructions on how to generate your own reference directory from internally-sequenced normal samples. Also available are the DRAGEN Baseline Builder App on BSSH and the DRAGEN MSI Baseline Builder 4-3-6 Pipeline on ICA. See the DRAGEN manual for how to generate custom microsatellite lists.
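The extraction step described for the prebuilt systematic noise files can be sketched in shell. This is a minimal illustration, not an official procedure: the helper names `extract_resource` and `find_noise_file`, the tarball name, and the destination directory are hypothetical, and the file name pattern is taken from the `WES_*_v2.0.0_systematic_noise.snv.bed.gz` naming given above.

```shell
#!/bin/sh
# Hypothetical helpers; tarball and directory names are placeholders.

# extract_resource TARBALL DEST: unpack the downloaded tarred directory into DEST.
extract_resource() (
  mkdir -p "$2" && tar -xf "$1" -C "$2"
)

# find_noise_file DIR BUILD: print the path of the systematic noise file
# for one reference build (hg19 or hg38) found anywhere under DIR.
find_noise_file() (
  find "$1" -type f -name "WES_${2}_v2.0.0_systematic_noise.snv.bed.gz" | sort
)

# Example usage (placeholder paths):
#   extract_resource downloaded_noise_resource.tar ./resources
#   NOISE=$(find_noise_file ./resources hg38)
```

The printed path could then be passed to the `--vc-systematic-noise` option used in the recipes, or the file uploaded to BSSH/ICA.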

Quick check of run quality

Checking coverage on the target regions is a quick way to determine whether the map/align step ran as expected. Following the enrichment and sequencing guidelines, the Mean Target Coverage (MTC) depth is expected to be ≥130x for tumor samples and ≥50x for normal samples. This metric is called "Average alignment coverage over target region" and can be found in the <sample>.target_bed_coverage_metrics_<tumor/normal>.csv file. Note that results for "Target bed" should be evaluated, because DRAGEN Report defaults to genome while this is a target-based WES assay. Another metric to check is "Aligned reads in target region", which gives the percentage of reads in the target region relative to the entire genome and can be thought of as Percent Read Enrichment. Other quality metrics to check include the percentage of Mapped Reads (aim for ≥ 90%) and the Insert length (aim for ≥ 125 bp) in the <sample>.mapping_metrics.csv file.
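The coverage check can be scripted. The sketch below assumes that each row of the metrics CSV has the form `section,sample-or-blank,metric name,value[,percent]`, so that the metric name is in the third column and its value in the fourth; verify this against your own output files before relying on it. The helper names `metric_value` and `meets_min` are hypothetical.

```shell
#!/bin/sh
# Assumption: metrics rows look like "COVERAGE SUMMARY,,<metric name>,<value>".

# metric_value FILE NAME: print the value column for an exact metric name.
metric_value() (
  awk -F',' -v name="$2" '$3 == name { print $4; exit }' "$1"
)

# meets_min VALUE MINIMUM: exit 0 when VALUE >= MINIMUM (numeric comparison).
meets_min() (
  awk -v v="$1" -v m="$2" 'BEGIN { exit !(v + 0 >= m + 0) }'
)

# Example usage (file name follows <sample>.target_bed_coverage_metrics_tumor.csv):
#   cov=$(metric_value sample.target_bed_coverage_metrics_tumor.csv \
#         "Average alignment coverage over target region")
#   meets_min "$cov" 130 || echo "Tumor mean target coverage below 130x: $cov"
```

The same pair of helpers can be pointed at the normal sample's file with a minimum of 50.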

Spike-in panels

The Illumina FFPE DNA Prep with Exome 2.5 Enrichment supports the addition of probes during the enrichment process. Spike-in probes are intended to increase coverage in Exome v2.5 regions or to add coverage in areas not covered by the Exome 2.5 panel. The analysis workflow will be similar, but attention should be paid to the resource files.

Targeted Regions. This must be a BED file of the combined coverage areas, i.e., Exome 2.5 Plus Panel and the custom panel. Refer to this guide on the requirements of the combined BED file.

Systematic noise file. If the spike-in panel adds coverage to new regions of the genome, it is recommended to generate a new systematic noise file from normal samples enriched with Exome v2.5 plus spike-in. Otherwise, variant calls in the new regions are more vulnerable to false positives than calls in regions covered by the systematic noise file, where a somatic filter tackles noise that consistently appears at specific locations in the reference genome.

Panel of Normals (PON) for CNV calls. Whether the spike-in probes add coverage to new regions of the genome or increase coverage of already covered regions, it is always recommended to generate a custom PON for your specific conditions if calling CNVs. CNVs are called in regions characterized in the PON, so if the case sample has coverage in additional regions compared to the PON because of a spike-in, analysis will not call CNV events in those additional regions. However, the addition of spike-in probes can change enrichment broadly and may cause coverage fluctuations in previously characterized regions. Thus, CNV calls in regions that are shared by the PON and the case sample could be impacted by the addition of spike-in probes.
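One way to build a combined BED file is to concatenate the exome and custom panel BED files, sort them, and merge overlapping intervals. The sketch below uses only standard tools; the function name `merge_beds` and the file names are hypothetical. It keeps only the first three columns and sorts chromosomes in byte order, which are simplifying assumptions, so check the result against the BED requirements in the guide referenced above.

```shell
#!/bin/sh
# merge_beds FILE...: print the union of the input intervals as a 3-column BED.
merge_beds() (
  cat "$@" |
    grep -v -e '^#' -e '^track' -e '^browser' |   # drop comment and header lines
    cut -f1-3 |                                   # keep chrom, start, end
    LC_ALL=C sort -k1,1 -k2,2n |                  # sort by chrom, then start
    awk 'BEGIN { OFS = "\t" }
         NR == 1 { c = $1; s = $2 + 0; e = $3 + 0; next }
         # same chromosome and overlapping or abutting: extend the open interval
         $1 == c && $2 + 0 <= e { if ($3 + 0 > e) e = $3 + 0; next }
         { print c, s, e; c = $1; s = $2 + 0; e = $3 + 0 }
         END { if (NR > 0) print c, s, e }'
)

# Example usage (placeholder file names):
#   merge_beds exome_2.5_hg38.bed custom_spike_in_hg38.bed > combined_hg38.bed
```

Both inputs must be on the same reference build for the merged file to be meaningful.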

Germline variants

Exploring germline variants that might play a role in tumor analysis

In somatic T/N analysis, DRAGEN uses a subtraction model for SNVs. The VCF will only emit variants where there are no (or very limited) alternate reads in the normal sample; in other words, variants that are found in both the normal and tumor sample will NOT be emitted in the SNV/INDEL VCF. This means that most germline variants will be excluded from the somatic VCF. To extract germline variants, it is recommended to run an additional analysis, the DRAGEN Germline pipeline, on the normal sample. In BSSH, there is an option to Enable Small Variant Calling on Normal Samples while running Somatic in Tumor-Normal mode. As the germline and somatic variants are output from separate analyses, the variants will need to be combined before further exploration or explored separately.

Will enabling germline tagging in T/N workflows provide germline variants in the SNV/INDEL VCF?

Only those somatic variants that match a known allele from a database. In T/N mode, the VCF usually includes only somatic variants, so it is possible that some of these somatic variants match a known allele from a database and are then tagged as germline.

Germline variants can be labeled in the somatic VCF by adding the flag --vc-enable-germline-tagging true to the command-line recipe or by checking Enable Common Germline Variant Tagging in the BSSH/ICA applications. Note that this flag has implications for other settings. First, annotation must be enabled. Second, if TMB and germline tagging are both enabled, then the TMB database filter flag needs to be set to false: --tmb-skip-db-filter false. With these two flags, germline variants are tagged in the VCF but ignored during TMB calculation.

If I want both germline and somatic variants, should I run the Tumor sample in the Tumor-Only workflow, because that produces a VCF that contains both germline and somatic variants?

The matched normal provides a more reliable indicator of somatic variants than a database. Database germline tagging for SNV incorrectly tags ~2% of germline variants as somatic and falsely classifies some somatic variants as germline (i.e. false positives and false negatives for somatic SNVs). Additionally, there is limited support for differentiating between germline and somatic CNVs. If a paired normal sample is available, the highest performance for the detection of both somatic and germline variants is achieved by implementing the Tumor/Normal Somatic Variant Calling workflow (as described in this user guide) and also the Germline Caller on the Normal sample.

Microsatellite Instability (MSI)

If MSI values are needed ahead of software updates, it is recommended to calculate MSI on the tumor FASTQ files in tumor-only mode in a separate DRAGEN analysis.

It is recommended to calculate MSI in tumor-only mode. Tumor-only mode relies on a reference directory of files (.dist) containing the microsatellite repeat distribution in a panel of normals. Both the reference directory and microsatellite list are provided (see the Resource Files page for more information) and need to match the reference genome used in the alignment step.
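Because the MSI baseline, the microsatellite list, and the alignment reference must all use the same genome build, a quick file-name check can catch mismatches before launching an analysis. This sketch assumes the build appears as an `hg19` or `hg38` token in each file name, as in the prebuilt resource files; the helper names `genome_build` and `same_build` are hypothetical.

```shell
#!/bin/sh
# genome_build PATH: print hg19 or hg38 if the file name contains that token.
genome_build() (
  case $(basename "$1") in
    *hg19*) echo hg19 ;;
    *hg38*) echo hg38 ;;
  esac
)

# same_build PATH...: exit 0 only if every file names the same, recognizable build.
same_build() (
  first=$(genome_build "$1")
  [ -n "$first" ] || exit 1
  for f in "$@"; do
    [ "$(genome_build "$f")" = "$first" ] || exit 1
  done
)

# Example usage:
#   same_build WES_FFPE_hg38_MSI_baselines_v1.1.0.combined.dist \
#              WES_v1.1.0_hg38_microsatellites.list || echo "Build mismatch"
```

A name-based check is only a convenience; it does not inspect file contents.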

Tumor-only analysis

The performance of the Illumina FFPE DNA Prep with Exome 2.5 Enrichment kit with tumor-only samples is currently being evaluated. The DRAGEN v4.4 user guide provides a recipe for the DNA Somatic Tumor-Only Solid WES UMI workflow. Settings specific for this kit are included here.

/opt/dragen/$VERSION/bin/dragen        #DRAGEN install path 
--ref-dir $REF_DIR                     #path to DRAGEN linear hashtable 
--output-directory $OUTPUT 
--intermediate-results-dir /staging    #tmp dir on fast SSD
--output-file-prefix $PREFIX 
# Inputs
--tumor-fastq-list $PATH                #see 'Input Options' for FQ, BAM or CRAM 
--tumor-fastq-list-sample-id $STRING 
--fastq-list $PATH                      #see 'Input Options' for FQ, BAM or CRAM 
--fastq-list-sample-id $STRING 
# Mapper
--enable-map-align true
--enable-map-align-output true         #save the output BAM (default=false)
--qc-coverage-ignore-overlaps true     #do not double-count overlapping mates
--validate-pangenome-reference false   #currently the linear reference is recommended for somatic analysis
# UMI
--umi-enable true 
--umi-library-type nonrandom-duplex
--umi-min-supporting-reads 1
--umi-start-mask-length 1
--umi-end-mask-length 3
# Small variant caller 
--enable-variant-caller true 
--vc-target-bed $VC_TARGET_BED         #see Resource Files page
--vc-systematic-noise $PATH            #see Resource Files page
--vc-enable-umi-solid true 
--vc-enable-germline-tagging true 
--vc-enable-non-primary-allelic-filter true
--vc-enable-triallelic-filter false       
# CNV 
--enable-cnv true
--cnv-population-b-allele-vcf $POP_SNPs  #Path to population SNP VCF; see https://help.dragen.illumina.com/product-guide/dragen-v4.4/dragen-dna-pipeline/cnv-calling/additional-documentation/cnv-preprocessing#b-allele-counts-ascn-callers
--cnv-target-bed $VC_TARGET_BED 
--cnv-combined-counts $PATH            #see Resource Files page
# Annotation                           #annotation is required if enabling TMB
--enable-variant-annotation true
--variant-annotation-data $PATH        #see notes below
--variant-annotation-assembly GRCh37/8
# TMB 
--enable-tmb true
--tmb-enable-proxi-filter true
# HRD Scoring 
--enable-hrd true                       #requires CNV 
# Microsatellite Instability (MSI) 
--msi-command tumor-only
--max-base-quality 63                                    #default if UMI is enabled
--msi-coverage-threshold 40                              
--msi-microsatellites-file ${microsatellite_file}        #see Resource Files page
--msi-ref-normal-dir ${normal_reference_directory}       #see Resource Files page

DRAGEN v4.4.6 T/N Recipe

The following DRAGEN recipes are specific for Illumina FFPE DNA Prep with Exome 2.5 Enrichment tumor and normal libraries (do not use for tumor-only analysis or for a different library prep method). A general recipe for Somatic analysis of Tumor Normal samples with UMI is provided on the DRAGEN v4.4 support pages.

/opt/dragen/$VERSION/bin/dragen        #DRAGEN install path
--ref-dir $REF_DIR                     #path to DRAGEN linear hashtable
--output-directory $OUTPUT
--intermediate-results-dir /staging    #tmp dir on fast SSD
--output-file-prefix $PREFIX
# Inputs
--tumor-fastq-list $PATH                #see 'Input Options' for FQ, BAM or CRAM
--tumor-fastq-list-sample-id $STRING
--fastq-list $PATH                      #see 'Input Options' for FQ, BAM or CRAM
--fastq-list-sample-id $STRING
# Mapper
--enable-map-align true
--enable-map-align-output true         #save the output BAM (default=false)
--qc-coverage-ignore-overlaps true     #do not double-count overlapping mates
# UMI
--umi-enable true
--umi-library-type nonrandom-duplex
--umi-min-supporting-reads 1
--umi-start-mask-length 1
--umi-end-mask-length 3
--tumor-normal-has-umi both
# Small variant caller
--enable-variant-caller true
--vc-target-bed $VC_TARGET_BED         #see Resource Files page
--vc-systematic-noise $PATH            #see Resource Files page
--vc-enable-umi-solid true
--vc-sq-call-threshold 3
--vc-sq-filter-threshold 15
--vc-enable-non-primary-allelic-filter true
--vc-enable-triallelic-filter false
--vc-skip-germline-tagging true
# CNV
--enable-cnv true
--cnv-use-somatic-vc-baf true
--cnv-target-bed $VC_TARGET_BED
--cnv-combined-counts $PATH            #see Resource Files page
# Annotation                           #annotation is required if enabling TMB
--enable-variant-annotation true
--variant-annotation-data $PATH        #see notes below
--variant-annotation-assembly GRCh37/8
# TMB
--enable-tmb true
# HRD Scoring
--enable-hrd true                       #requires CNV
# Microsatellite Instability (MSI)
--msi-command tumor-only
--max-base-quality 63                                    #default if UMI is enabled
--msi-coverage-threshold 40                              #see notes below
--msi-microsatellites-file ${microsatellite_file}        #see Resource Files page
--msi-ref-normal-dir ${normal_reference_directory}       #see Resource Files page

Notes and additional options

Hashtable

For DRAGEN somatic runs it is recommended to use the linear (non-graph) hashtable.

Please see: DRAGEN references

Inputs

Here are support pages for the formats of different input file types.

FQ list Input

--fastq-list $PATH
--fastq-list-sample-id $STRING
--tumor-fastq-list $PATH
--tumor-fastq-list-sample-id $STRING

or

FQ Input

--fastq-file1 $PATH
--fastq-file2 $PATH
--RGSM $STRING
--RGID $STRING
--tumor-fastq1 $PATH
--tumor-fastq2 $PATH
--RGSM-tumor $STRING
--RGID-tumor $STRING

or

BAM Input

--bam-input $PATH
--tumor-bam-input $PATH

or

CRAM Input

--cram-input $PATH
--tumor-cram-input $PATH

Mapper

Option: --qc-coverage-ignore-overlaps true
Description: Resolve all of the alignments for each fragment and avoid double-counting any overlapping bases.

UMI

The above recipe details UMI options specific for the Illumina FFPE DNA Prep with Exome 2.5 Enrichment Library kit.

SNV

The vc-sq-filter-threshold flag can be used to fine-tune the "somatic quality" value at which variants are called in order to balance sensitivity and specificity. Variants with values between the vc-sq-call-threshold and vc-sq-filter-threshold values are labeled as "weak_evidence" in the output VCF file. The default SQ Filter threshold in DRAGEN for tumor-normal analysis is 17.5.

Annotation

The annotation data path (--variant-annotation-data) is the top directory containing the Nirvana data files. Instructions on how to download the resource are at https://support.illumina.com/content/dam/illumina-support/help/Illumina_DRAGEN_Bio_IT_Platform_v3_7_1000000141465/Content/SW/Informatics/Dragen/Nirvana_DownloadData_fDG.htm. DRAGEN expects a top directory containing the sub-directories Cache, References, and Supplementary Annotation.

Microsatellite Instability (MSI)

Microsatellite Instability (MSI) is run in tumor-only mode, but other variants and biomarkers are calculated in tumor-normal mode. msi-coverage-threshold is a required parameter for both tumor-only and tumor-normal, and should be set according to the sample coverage. The sites in the msi-microsatellites-file need to match those in the dist files within the msi-ref-normal-dir.
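The alternative input options (FASTQ list, BAM, or CRAM) can be selected programmatically when scripting a run. The sketch below picks the tumor and normal input flags from the tumor file's extension. The function name `tn_input_flags` is hypothetical, and treating a `.csv` file as a FASTQ list is an assumption of this example; only flags that appear in the recipe and input options of this guide are emitted.

```shell
#!/bin/sh
# tn_input_flags TUMOR NORMAL [TUMOR_ID NORMAL_ID]
# Print one DRAGEN input option per line for a tumor-normal pair.
tn_input_flags() (
  tumor=$1; normal=$2; tumor_id=${3:-}; normal_id=${4:-}
  case $tumor in
    *.csv)   # assumed to be a FASTQ list
      printf '%s\n' "--tumor-fastq-list $tumor" \
                    "--tumor-fastq-list-sample-id $tumor_id" \
                    "--fastq-list $normal" \
                    "--fastq-list-sample-id $normal_id" ;;
    *.bam)
      printf '%s\n' "--tumor-bam-input $tumor" "--bam-input $normal" ;;
    *.cram)
      printf '%s\n' "--tumor-cram-input $tumor" "--cram-input $normal" ;;
    *)
      echo "unsupported input type: $tumor" >&2
      exit 1 ;;
  esac
)

# Example usage:
#   tn_input_flags tumor.bam normal.bam
```

The printed options would replace the "# Inputs" lines of the recipe; the remaining options stay unchanged.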

Troubleshooting the analysis

Potential changes to the analysis settings to improve performance when analyzing Illumina FFPE DNA Prep with Exome 2.5 Enrichment.

Why are the Tumor Mutation Burden (TMB) values not what I expected?

Low TMB values

Variants that do not meet the minimum depth (DP) threshold are excluded from the TMB calculation. The --vc-callability-tumor-thresh command-line option specifies the threshold value. The default value is 50, and this value assumes at least 100x coverage. If raw coverage is lower than 100x and tumor purity is >= 80%, the threshold value can be decreased to 30 by specifying --vc-callability-tumor-thresh 30 as an additional argument.
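The low-TMB rule of thumb can be written as a small helper. This is only a sketch of the guidance in this section: the function name `callability_thresh` is hypothetical, coverage is given in x, and tumor purity is given as a fraction between 0 and 1.

```shell
#!/bin/sh
# callability_thresh COVERAGE PURITY
# Print 30 when raw coverage is below 100x and purity is at least 0.8,
# otherwise print the default of 50.
callability_thresh() (
  awk -v cov="$1" -v pur="$2" 'BEGIN {
    if (cov + 0 < 100 && pur + 0 >= 0.8) print 30; else print 50
  }'
)

# Example usage:
#   echo "--vc-callability-tumor-thresh $(callability_thresh 85 0.9)"
```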

High TMB values

DRAGEN outputs both TMB and NonSyn TMB metrics, where TMB includes all Filtered Variants in the calculation and NonSyn TMB includes Filtered Nonsynonymous Variants. One metric may be more appropriate than the other for a given comparison. For example, Seracare TMB reference samples include only nonsynonymous variants in their TMB metric.

Why do I have too many/too few PASSing somatic variants?

Somatic variants with a quality score less than the threshold value [0..30] are marked as filtered in the output VCF. The --vc-sq-filter-threshold command-line option specifies the threshold value. The default value for Tumor/Normal mode is 17.5. We observed that with this library preparation kit, the threshold needed to be lowered slightly to 15 to achieve high sensitivity. Raise this value to improve specificity at the cost of sensitivity, or lower it to improve sensitivity at the cost of specificity.

Why do some qc-coverage metrics report different values between command line and BSSH?

There are some rare instances where DRAGEN and BSSH apps have different default settings that might lead to slight differences in coverage calculations. For instance, the BSSH Enrichment App outputs an Aggregate Summary Metrics file that includes all reads with MAPQ>=0 in the coverage calculations, while in DRAGEN command-line, MAPQ=0 reads are filtered. To include MAPQ=0 reads in DRAGEN command-line coverage, the flag --qc-coverage-filters-1 "mapq<0" can be added to the recipe.

Why is the CNV VCF file empty?

The first thing to check is the coverage of the normal sample. Normal sample counts are added to the PON, and if the counts from the normal sample are nearly empty, the GC correction stage will fail. In that case, DRAGEN will output an empty VCF.
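To confirm that a CNV VCF is empty, count its non-header records. This sketch treats a VCF with header lines but no variant records as "empty", which is an assumption about what the output looks like; the function name `vcf_record_count` and the file name in the usage line are hypothetical.

```shell
#!/bin/sh
# vcf_record_count FILE: print the number of non-header lines in a plain or
# gzip-compressed VCF (header lines start with '#').
vcf_record_count() (
  case $1 in
    *.gz) gzip -dc "$1" ;;
    *)    cat "$1" ;;
  esac | grep -c -v '^#' || true   # grep exits non-zero when the count is 0
)

# Example usage (placeholder file name):
#   [ "$(vcf_record_count sample.cnv.vcf.gz)" -eq 0 ] && echo "CNV VCF has no records"
```

If the count is zero, the normal sample's "Average alignment coverage over target region" metric is the next thing to inspect.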

BaseSpace Sequence Hub (BSSH) T/N

Detailed instructions on how to configure the DRAGEN Somatic App in BaseSpace Sequence Hub for analyzing paired tumor-normal samples processed with the Illumina FFPE DNA Prep with Exome 2.5 Enrichment.

DRAGEN Somatic App Version 4.4.6 in BaseSpace Sequence Hub is used to analyze sequencing data from Illumina FFPE DNA Prep with Exome 2.5 Enrichment tumor and normal libraries. Refer to online support for general information about using the DRAGEN Somatic app. The settings defined in this user guide are specific for this library preparation kit and for paired tumor-normal analysis (do not use for tumor-only analysis).

Access to Resource Files

The analysis workflow uses resource files. See the Resource Files page for details on where to find those files, which should be manually uploaded to BSSH and available for analyses.

Analysis Settings

1. Open the DRAGEN Somatic App

Version 4.4.6

2. Input Data

Select Biosamples or Samples as the input type to launch the analysis.

3. Configuration

  • Analysis name—Name of the analysis

  • Save Result To—Project to store analysis results in.

  • Pipeline Configuration—Select Map/Align + Somatic Small Variant Caller (FASTQ or BAM or CRAM input).

  • Somatic CNV Calling—Select Tumor-Normal

  • Input FASTQ—Select the TUMOR FASTQ and the paired NORMAL FASTQ. The Sample Sex can remain as Auto-Detect. Click Add a New Row to input another tumor-normal pair. Each pair is entered as a new row.

  • Reference—The reference genome to use for alignment. Currently Homo sapiens [UCSC] hg19 v5 and Homo sapiens [1000 Genomes] hg38 v5 are supported for somatic variant calling. For DRAGEN somatic runs it is recommended to use the linear (non-graph) hashtable.

  • Library Specific Settings—Select Illumina FFPE DNA Prep with Exome 2.5 Enrichment. Selecting this option presets settings optimized for this sample prep.

  • Targeted Regions—If no spike-in probes were included in enrichment, select Twist Bioscience for Illumina Exome 2.5 Plus Panel. If the mitochondrial probes were included, select Illumina Exome 2.5 Plus Panel with Mitochondrial Panel. If other spike-in probes were included in enrichment, select Custom BED. This must be a BED file of the combined coverage areas, i.e., Exome 2.5 Plus Panel and the custom panel.

  • Systematic Noise Filter—Leave Enable Systematic Noise Filter checked. The noise file will automatically be selected based on the hg19 or hg38 reference genome specified above.

4. Biomarkers

  • View the Biomarkers options by clicking on + Biomarkers

  • If HRD calculation is desired, check Enable HRD scoring.


CNV calling must be enabled in order to calculate HRD.

  • If Tumor Mutational Burden calculation is desired, check Enable TMB calculation


TMB will be calculated using WES coding regions for the selected genome reference, so Custom TMB Regions do not need to be supplied.

5
  • If Microsatellite Instability calculation is desired, check Enable MSI calling.

  • MSI References—Select Dataset File(s) and select the combined distance files of the normal reference samples (e.g. WES_FFPE_hg*_MSI_baselines_v1.1.0.combined.dist). The reference genome must be hg19 or hg38.

  • Custom MSI Regions—Select Dataset File(s) and select the microsatellite list that matches the hg19 or hg38 reference genome used for alignment (e.g. WES_v1.1.0_hg*_microsatellites.tsv)

CNV

  • GC Bias Correction—The correct CNV Combined Counts File will be selected based on the hg19 or hg38 reference genome selected above.


Structural Variant (SV) caller results have not been evaluated for accuracy. Calculations for SV can add significant runtime to the analysis.

UMI Settings

These settings will be automatically applied when using the Library Specific Settings.

6. Advanced Settings

  • View the Advanced Settings options by clicking on + Advanced Settings

  • Germline Variant Calling: Uncheck Enable Small Variant Calling on Normal Samples.

  • Germline Tagging: Uncheck Enable Common Germline Variant Tagging.

  • Variant Annotation—Check Enable variant annotation.


Variant Annotation must be enabled if calculating TMB.

7. Additional Arguments

  • View the Additional Arguments options by clicking on + Additional Arguments

  • Check the box acknowledging the warning around using command-line arguments

  • Include this flag under Additional DRAGEN Command-line Arguments: --qc-coverage-ignore-overlaps true

    • This specifies that overlapping mates should not be double-counted.

8. Click Launch Application

Illumina Connected Analytics (ICA) T/N

Use the DRAGEN_Somatic_Enrichment_4-3-17 Pipeline as part of the DRAGEN 4.3 Bundle in ICA to analyze sequencing data of Illumina FFPE DNA Prep with Exome 2.5 Enrichment libraries. The settings defined in this user guide are specific for this library preparation kit and should be used for paired tumor-normal samples.

Access to Resource Files

The analysis workflow uses resource files released with DRAGEN 4.4, so link both the DRAGEN 4.3 and DRAGEN 4.4 Bundles to your project. Additionally, Microsatellite Resource Files need to be manually uploaded and linked to your project if calculating MSI. See the Resource Files page for details on where to find those files. Note that the input for the Microsatellite Normal References can be a directory of individual distance files or a combined distance file.

Analysis Settings

Start the DRAGEN Somatic Enrichment analysis

Under Pipelines, select DRAGEN_Somatic_Enrichment_4-3-17. Click Start Analysis on the top right of the page.

General

User Reference—A run name meaningful to the user

User tags, email notifications, and output folders are optional. If no output folder is selected, the output folder will be located in the root of the project.

Pricing

Select an Entitlement Bundle from the drop-down menu following Subscription.

Input files

Input the fastq (or ORA) files for the tumor and normal sample in a sample pair.

Reference

The reference genome to use for alignment. Reference genome files are located in the Illumina DRAGEN v10 Reference directory. The provided resource files support the use of hg19-alt_masked.cnv.hla.methylation_combined.rna_v4.tar.gz and hg38-alt_masked.cnv.hla.methylation_combined.rna_v4.tar.gz.

Target BED file

BED file that contains targeted regions. If no spike-in probes were included in enrichment, select the Twist_ILMN_Exome_2.5_Panel BED file that matches the reference genome selected above, stored in the Illumina Enrichment BEDs directory.

If other spike-in probes were included in enrichment, generate a BED file of the combined coverage areas (i.e., Exome 2.5 Plus Panel with the custom panel), upload it to ICA, and select it as the target BED file. See the Resource Files page for information on format requirements.

Systematic Noise BED File

Select the systematic noise file that matches the panel used for enrichment and the reference genome used for alignment. The Illumina Systematic Noise directory within the DRAGEN 4.4 Bundle contains WES_FFPE_NovaSeq_TwistV2.5_hg*_v1.0_systematic_noise.bed.gz. See the Resource Files page for information on pre-built files and on generating your own file.

CNV Panel of Normals/CNV Combined Counts

Select the files that make up the target.counts baseline. All files must be of the same type (eg, all .target.counts or all .target.counts.gc-corrected). Select files that match the panel used for enrichment, match the reference genome used for alignment, and have the desired handling of GC correction. The Illumina DRAGEN CNV Baseline Files/DRAGEN 4.4/ directory contains combined counts files for both hg19 and hg38. See the Resource Files page for information on pre-built target.counts files.
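Before selecting baseline files in the interface, it can help to confirm that they are all of one type. The sketch below is illustrative only: the helper names and file names are hypothetical, and the optional trailing .gz extension is an assumption rather than something stated on this page.

```python
def counts_file_type(name):
    """Classify a baseline file name as 'gc-corrected', 'raw', or None."""
    # Assumption: files may carry an optional .gz extension.
    base = name[:-3] if name.endswith(".gz") else name
    if base.endswith(".target.counts.gc-corrected"):
        return "gc-corrected"
    if base.endswith(".target.counts"):
        return "raw"
    return None


def check_same_type(names):
    """Return the shared type of all files, or raise ValueError if they differ."""
    if not names:
        raise ValueError("no baseline files selected")
    types = {counts_file_type(name) for name in names}
    if None in types:
        raise ValueError("at least one file is not a target.counts file")
    if len(types) > 1:
        raise ValueError("baseline mixes raw and GC-corrected counts files")
    return types.pop()


# Hypothetical file names used only to exercise the check.
selected = ["normal01.target.counts", "normal02.target.counts"]
print(check_same_type(selected))
```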

MSI - Microsatellites File and Microsatellites Normal References Directory

As noted above, resource files for calculating MSI need to be uploaded to ICA and made available to the Project for inclusion in analysis.

Microsatellites File

Select the microsatellite list that matches the reference genome used for alignment: WES_v1.1.0_hg*_microsatellites.list

Microsatellites Normal References Directory/Combined Microsatellites Normal References File

Input the directory containing the individual distance files of the normal reference samples (hg*-WES-FFPE-*.microsat_normal.dist), or input a single merged file of the distances for each normal sample.
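When preparing the directory of individual distance files, a quick filter on the naming pattern above can catch files built for the wrong reference genome. This is a minimal sketch with hypothetical file names and a hypothetical helper name; it only matches names and does not inspect or merge file contents.

```python
from fnmatch import fnmatchcase


def select_distance_files(names, genome):
    """Return the distance files whose names match the given genome build."""
    pattern = f"{genome}-WES-FFPE-*.microsat_normal.dist"
    return sorted(name for name in names if fnmatchcase(name, pattern))


# Hypothetical directory listing.
listing = [
    "hg38-WES-FFPE-a.microsat_normal.dist",
    "hg19-WES-FFPE-b.microsat_normal.dist",
    "notes.txt",
]
print(select_distance_files(listing, "hg38"))
```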

Settings

General Options

Output File Prefix—If something other than tumor is desired for the output file prefix, enter that here.

Sample Sex—Choose none from the drop-down list

Map Align Options

Enable Map/Align—true

Enable Map/Align Output—true

Enable Duplicate Marking—false

Map/Align Output—bam

Variant Calling Options

Enable Small Variant Caller—true

Leave Emit Ref Confidence and VCF File Output blank

Enable Germline Tagging—false

Enable CNV calling—true

CNV Use Somatic VC BAF—true

Enable SV calling—false

circle-info

Structural Variant (SV) caller results have not been evaluated for accuracy. Calculations for SV can add significant runtime to the analysis.

Targeted Callers

MSI command—tumor-only

circle-info

It is recommended to run the tumor sample in tumor-only mode for highest performance of this biomarker. Note that this setting allows MSI to be calculated in tumor-only mode while other analyses are calculated in tumor-normal mode. This biomarker is under active development.

MSI Coverage Threshold—60

If desired, set Enable Tumor Mutational Burden—true. Note that this requires variant annotation to be enabled.

Enable HLA—false

Enable HRD—false

circle-info

CNV calling must be enabled in order to calculate HRD.

Enable Tumor Mutational Burden—true

circle-info

TMB will be calculated using WES coding regions for the selected genome reference, so Custom TMB Regions do not need to be supplied.

UMI Options

Enable UMI—true

UMI Library Type—nonrandom-duplex

UMI Aware Variant Calling—Low depth

Minimum Supporting UMI Reads—1

Variant Annotation Options

circle-info

Variant Annotation must be enabled if calculating TMB.

Enable Variant Annotation—true

Variant Annotation Assembly—reference corresponding to the genome used for variant calling
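The notes in this section describe two dependencies between settings: TMB requires Variant Annotation, and HRD requires CNV calling. The sketch below records them as a simple pre-flight check. The dictionary keys are hypothetical labels for illustration, not ICA or DRAGEN parameter names.

```python
def check_settings(settings):
    """Return a list of dependency problems found in the chosen settings."""
    problems = []
    if settings.get("enable_tmb") and not settings.get("enable_variant_annotation"):
        problems.append("TMB requires Variant Annotation to be enabled")
    if settings.get("enable_hrd") and not settings.get("enable_cnv"):
        problems.append("HRD requires CNV calling to be enabled")
    return problems


# Values mirror the settings listed on this page; key names are illustrative.
recommended = {
    "enable_cnv": True,
    "enable_hrd": False,
    "enable_tmb": True,
    "enable_variant_annotation": True,
}
print(check_settings(recommended))
```

An empty list means neither dependency is violated.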

Additional Options

Additional DRAGEN Args—Paste the following commands into the box, making sure to update the name of the umi-metrics-interval-file

--umi-start-mask-length 1 --umi-end-mask-length 3 --qc-coverage-ignore-overlaps true --vc-sq-call-threshold 3 --vc-sq-filter-threshold 15

circle-info

Setting --qc-coverage-ignore-overlaps true avoids double-counting overlapping mates. This provides a more realistic overview of coverage, as reported in the Mean Region Coverage within the Coverage section of DRAGEN Reports.
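Because the arguments are pasted as one long string, a dropped value or a duplicated flag is easy to miss. The following sketch splits the string from this page into flag and value pairs for review before pasting. The helper name is hypothetical, and the check assumes every flag takes exactly one value, which holds for the five flags listed here.

```python
import shlex

ARGS = (
    "--umi-start-mask-length 1 --umi-end-mask-length 3 "
    "--qc-coverage-ignore-overlaps true "
    "--vc-sq-call-threshold 3 --vc-sq-filter-threshold 15"
)


def parse_flag_pairs(text):
    """Split an argument string into a {flag: value} dict, rejecting malformed input."""
    tokens = shlex.split(text)
    if len(tokens) % 2 != 0:
        raise ValueError("expected an even number of tokens (flag followed by value)")
    pairs = {}
    for flag, value in zip(tokens[0::2], tokens[1::2]):
        if not flag.startswith("--") or value.startswith("--"):
            raise ValueError(f"unexpected token order near {flag!r} {value!r}")
        if flag in pairs:
            raise ValueError(f"flag given more than once: {flag}")
        pairs[flag] = value
    return pairs


for flag, value in parse_flag_pairs(ARGS).items():
    print(f"{flag} = {value}")
```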

Resources

Select the resources setting

Start analysis

Once all parameters have been set, click Start analysis on the top right of the page.

Rerun analysis settings with additional tumor-normal pairs

These settings can be rerun on additional tumor-normal sample pairs by selecting the particular analysis and clicking Rerun on the top right corner. Update the User reference and swap out the Tumor and Normal input files for the new pair.

Screenshot highlighting the Rerun feature