Option 1: View demo data

After clicking, please wait a couple of seconds to be redirected to the Visualize and Explore tab.

Option 2: Upload your own data files

(Required) Please upload the aggregate report file. Note that this will be the data displayed in the main table in the Explore tab.

Does this aggregate report file correspond to Class I or Class II prediction data?

Class I data (e.g. HLA-A*02:01)

Class II data (e.g. DPA1*01:03)

(Required) Please upload the corresponding metrics file for the main file that you have chosen.

(Optional) If you would like, you can upload an additional aggregate report file generated with either Class I or Class II results to supplement your main table. (E.g. if you uploaded Class I data as the main table, you can upload your Class II report here as supplemental data)

Please provide a label for the additional file uploaded (e.g. Class I data or Class II data)

(Optional) Additionally, you can upload a gene-of-interest list in a tsv format, where each row is a single gene name. These genes (if in your aggregate report) will be highlighted in the Gene Name column.

4. Gene-of-interest List (tsv required)

Basic Instructions: How to explore your data using pVACview?

Step 1: Upload your own data / Load demo data

You can either choose to explore a demo dataset that we have prepared from the HCC1395 cell line, or choose to upload your own datasets.

If you are uploading your own datasets, the two required inputs are output files you obtain after running the pVACseq pipeline. The aggregated tsv file is a list of all predicted epitopes and their binding affinity scores with additional variant information, and the metrics json file contains additional transcript and peptide level information.

You also have the option of uploading additional files to supplement the data you are exploring: an additional Class I or Class II aggregate report and a gene-of-interest tsv file.
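As a minimal sketch of the gene-of-interest file, the following writes a tsv with one gene name per row. The gene names and the helper name `write_gene_list` are placeholders chosen for illustration, and the absence of a header row is an assumption.

```python
import os
import tempfile


def write_gene_list(genes, path):
    """Write one gene name per row, with no header row (assumed layout)."""
    with open(path, "w", newline="") as handle:
        for gene in genes:
            handle.write(f"{gene}\n")


# Placeholder gene names for illustration only.
genes_of_interest = ["TP53", "KRAS", "BRCA1"]
gene_list_path = os.path.join(tempfile.mkdtemp(), "genes_of_interest.tsv")
write_gene_list(genes_of_interest, gene_list_path)

# Read the file back to confirm that each row holds a single gene name.
with open(gene_list_path) as handle:
    written_rows = handle.read().splitlines()
```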

Step 2: Exploring your data

To explore the different aspects of your neoantigen candidates, you will need to navigate to the Aggregate Report of Best Candidates by Variant on the Visualize and Explore tab. For detailed variant, transcript and peptide information for each candidate listed, click on the Investigate button for the specific row of interest. This will prompt both the transcript and peptide tables to reload with the matching information.

By hovering over each column header, you will be able to see a brief description of the corresponding column. For more details, you can click on the tooltip located at the top right of the aggregate report table.
After investigating each candidate, you can label it using the dropdown menu in the second-to-last column of the table. Choices include: Accept, Reject or Review.

Step 3: Exporting your data

When you have either finished ranking your neoantigen candidates or need to pause and would like to save your current evaluations, you can export the current main aggregate report using the export page.

Navigate to the export tab, where you can name your file before downloading it in either tsv or excel format. The excel format is user-friendly for downstream visualization and manipulation. However, if you plan to continue editing the aggregate report and would like to load it back into pVACview with the previous evaluations preloaded, you will need to download the file in tsv format. This serves as a way to save your progress, as your evaluations are cleared upon closing or refreshing the pVACview app.
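An exported tsv can also be inspected outside the app. The sketch below tallies the Evaluation column of an exported report. The four-row table is made-up stand-in data (real exports contain many more columns), and `count_evaluations` is a hypothetical helper; only the ID, Gene and Evaluation column names come from the main table column descriptions.

```python
import csv
import io
from collections import Counter

# Made-up stand-in for an exported aggregate report.
exported_tsv = (
    "ID\tGene\tEvaluation\n"
    "variant_1\tGENE_A\tAccept\n"
    "variant_2\tGENE_B\tReject\n"
    "variant_3\tGENE_C\tAccept\n"
    "variant_4\tGENE_D\tReview\n"
)


def count_evaluations(handle):
    """Count how many rows carry each value of the Evaluation column."""
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(row["Evaluation"] for row in reader)


evaluation_counts = count_evaluations(io.StringIO(exported_tsv))
```

To run this on a real export, pass an open file handle instead of the `io.StringIO` wrapper.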

Advanced Options: Regenerate Tiering with different parameters

*Please note that the metrics file is required in order to regenerate tiering information with different parameters.
The current version of pVACseq defaults to positions 1, 2, n-1 and n (for an n-mer peptide) when determining anchor positions. If you would like to use our allele-specific anchor results and regenerate the tiering results for your variants, please specify your contribution cutoff and submit for recalculation. More details can be found here.

For further explanations on these inputs, please refer to the pVACview documentation.

Original Parameters for Tiering

These are the original parameters used in the tiering calculations extracted from the metrics data file given as input.

Current Parameters for Tiering

These are the current parameters used in the tiering calculations, which may differ from the original parameters if candidates were re-tiered.

Add Comments for selected variant

Please add/update your comments for the variant you are currently examining

Comment:

Aggregate Report of Best Candidates by Variant

Currently investigating row:

Variant Information

Best Peptide Data

Best Peptide:

AA Change:

Pos:

Gene:

Query Data

Query Sequence:

Hits:

Hits

Additional Data Type:

Median MT IC50:

Median MT Percentile:

Best Peptide:

Corresponding HLA allele:

Best Transcript:

Variant & Gene Info

DNA VAF

RNA VAF

Gene Expression

Genomic Information (chromosome - start - stop - ref - alt)

Additional variant information:

Peptide Evaluation Overview

Transcript and Peptide Set Data

Allele specific anchor prediction heatmap for the candidates in peptide table.

HLA allele specific anchor predictions overlaying candidate peptide sequences for selected transcript set.

Anchor vs Mutation position Scenario Guide

Anchor Positions

Anchor Weights

Additional Peptide Information

Violin Plots showing distribution of MHC IC50 predictions for selected peptide pair (MT and WT).

Showcases individual binding prediction scores from each algorithm used. A solid line is used to represent the median score.

Violin Plots showing distribution of MHC percentile predictions for selected peptide pair (MT and WT).

Showcases individual percentile scores from each algorithm used. A solid line is used to represent the median percentile score.

Prediction score table showing exact MHC binding values for IC50 and percentile calculations.

Prediction score table showing exact MHC scores for elution, immunogenicity, and percentile calculations.

BigMHC_EL / BigMHC_IM : A deep learning tool for predicting MHC-I (neo)epitope presentation and immunogenicity. ( Citation )
DeepImmuno : Deep-learning empowered prediction of immunogenic epitopes for T cell immunity. ( Citation )
MHCflurryEL Processing : An "antigen processing" predictor that attempts to model MHC allele-independent effects such as proteasomal cleavage. ( Citation )
MHCflurryEL Presentation : A predictor that integrates processing predictions with binding affinity predictions to give a composite "presentation score." ( Citation )
NetMHCpanEL / NetMHCIIpanEL : A predictor trained on eluted ligand data. ( Citation )

Error: Missing required files (both aggregate report and metrics files are required to properly visualize and explore candidates).

Export filename:

Main table full column descriptions

If using pVACview with pVACtools output, the user is required to provide at least the following two files: all_epitopes.aggregated.tsv and all_epitopes.aggregated.metrics.json

The all_epitopes.aggregated.tsv file is an aggregated version of the all_epitopes TSV. It presents the best-scoring (lowest binding affinity) epitope for each variant, along with additional binding affinity, expression, and coverage information for that epitope. It also gives information about the total number of well-scoring epitopes for each variant, the number of transcripts covered by those epitopes, and the HLA alleles that those epitopes are well-binding to. Here, a well-binding or well-scoring epitope is any epitope that has a stronger binding affinity than the aggregate_inclusion_binding_threshold described below. The report then bins variants into tiers that offer suggestions about the suitability of variants for use in vaccines.

The all_epitopes.aggregated.metrics.json complements the all_epitopes.aggregated.tsv and is required for the tool's proper functioning.

Column Names : Description

ID : A unique identifier for the variant

Index : A unique identifier for the variant and best neoantigen candidate

HLA Alleles : For each HLA allele in the run, the number of this variant’s epitopes that bound well to the HLA allele (with lowest or median mutant binding affinity < aggregate_inclusion_binding_threshold )

Gene : The Ensembl gene name of the affected gene

AA Change : The amino acid change for the mutation

Num Passing Transcripts : The number of transcripts for this mutation that resulted in at least one well-binding peptide ( lowest or median mutant binding affinity < aggregate_inclusion_binding_threshold )

Best Peptide : The best-binding mutant epitope sequence (lowest binding affinity) prioritizing epitope sequences that resulted from a protein_coding transcript with a TSL below the maximum transcript support level and having no problematic positions.

Best Transcript : Transcript corresponding to the best peptide with the lowest TSL and shortest length.

TSL : Transcript support level of the best peptide

Pos : The one-based position of the start of the mutation within the epitope sequence. 0 if the start of the mutation is before the epitope (as can occur downstream of frameshift mutations)

Prob Pos : If you specify problematic amino acids when running pVACseq, the number of problematic peptides within the best peptide.

Num Passing Peptides : The number of unique well-binding peptides for this mutation.

IC50 MT : Lowest or Median IC50 binding affinity of the best-binding mutant epitope across all prediction algorithms used.

IC50 WT : Lowest or Median IC50 binding affinity of the corresponding wildtype epitope across all prediction algorithms used.

%ile MT : Lowest or Median binding affinity percentile rank of the best-binding mutant epitope across all prediction algorithms used (those that provide percentile output)

%ile WT : Lowest or Median binding affinity percentile rank of the corresponding wildtype epitope across all prediction algorithms used (those that provide percentile output)

RNA Expr : Gene expression value for the annotated gene containing the variant.

RNA VAF : Tumor RNA variant allele frequency (VAF) at this position.

Allele Expr : RNA Expr * RNA VAF

RNA Depth : Tumor RNA depth at this position.

DNA VAF : Tumor DNA variant allele frequency (VAF) at this position.

Tier : A tier suggesting the suitability of variants for use in vaccines.

Evaluation : Column to store the evaluation of each variant when evaluating the run in pVACview. Can be Accept, Reject or Review.

How is the Tiering column determined / How are the Tiers assigned?

Tier : Criteria

Pass : (MT binding < binding threshold) AND allele expr filter pass AND vaf clonal filter pass AND tsl filter pass AND anchor residue filter pass

Anchor : (MT binding < binding threshold) AND allele expr filter pass AND vaf clonal filter pass AND tsl filter pass AND anchor residue filter fail

Subclonal : (MT binding < binding threshold) AND allele expr filter pass AND vaf clonal filter fail AND tsl filter pass AND anchor residue filter pass

LowExpr : (MT binding < binding threshold) AND low expression criteria met AND allele expr filter pass AND vaf clonal filter pass AND tsl filter pass AND anchor residue filter pass

Poor : Best peptide for current variant FAILS in two or more categories

NoExpr : ((gene expr == 0) OR (RNA VAF == 0)) AND low expression criteria not met

Here we list out the exact criteria for passing each respective filter:

Allele Expr Filter: (allele expr >= allele expr cutoff) OR (rna_vaf == 'NA') OR (gene_expr == 'NA')

VAF Clonal Filter: (dna vaf >= vaf subclonal) OR (dna_vaf == 'NA')

TSL Filter: (TSL != 'NA') AND (TSL < maximum transcript support level)

Anchor Residue Filter:
1. (Mutation(s) is at anchor(s)) AND ((WT binding > binding threshold) AND (WT percentile > percentile threshold))
OR
2. Mutation(s) not or not entirely at anchor(s)

Low Expression Criteria: (allele expr > 0) OR ((gene expr == 0) AND (RNA Depth > RNA Coverage Cutoff) AND (RNA VAF > RNA vaf cutoff))

Note that if a percentile threshold has been provided, then the %ile MT column is also required to be lower than the given threshold to qualify for tiers, including Pass, Anchor, Subclonal and LowExpr.
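The allele expression filter and the low expression criteria listed above translate fairly directly into code. The following is a minimal sketch rather than the pVACtools implementation: the function and argument names are hypothetical, and `None` stands in for an 'NA' value.

```python
from typing import Optional


def allele_expr_filter(gene_expr: Optional[float],
                       rna_vaf: Optional[float],
                       allele_expr_cutoff: float) -> bool:
    """Pass when allele expr >= cutoff, or when either input is NA (None)."""
    if gene_expr is None or rna_vaf is None:
        return True
    # Allele expression is gene expression multiplied by RNA VAF.
    return gene_expr * rna_vaf >= allele_expr_cutoff


def low_expression_criteria(gene_expr: float,
                            rna_vaf: float,
                            rna_depth: float,
                            rna_coverage_cutoff: float,
                            rna_vaf_cutoff: float) -> bool:
    """(allele expr > 0) OR (gene expr == 0 AND depth and RNA VAF above cutoffs)."""
    allele_expr = gene_expr * rna_vaf
    return allele_expr > 0 or (
        gene_expr == 0
        and rna_depth > rna_coverage_cutoff
        and rna_vaf > rna_vaf_cutoff
    )
```

For example, `allele_expr_filter(10.0, 0.5, 2.5)` passes because the allele expression of 5.0 is at least the (made-up) cutoff of 2.5.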

Transcript Set Table

Upon selecting a variant for investigation, you may have multiple transcripts covering the region.

These transcripts are grouped into Transcript Sets, based on the good-binding peptides produced. (Transcripts that produce the exact same set of peptides are grouped together.)

The table also lists the number of transcripts and corresponding peptides in each set (each pair of WT and MT peptides is counted as one).
A sum of the total expression across all transcripts in each set is also shown.
A light green color is used to highlight the Transcript Set producing the Best Peptide for the variant in question.
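The grouping described above can be sketched as follows. This is an illustration under assumptions, not the pVACview implementation: the transcript IDs, peptide sequences and expression values are invented, and `group_transcript_sets` is a hypothetical helper.

```python
from collections import defaultdict


def group_transcript_sets(transcript_peptides, transcript_expression):
    """Group transcripts that produce exactly the same set of peptides."""
    groups = defaultdict(list)
    for transcript, peptides in transcript_peptides.items():
        # A frozenset is hashable and ignores peptide order.
        groups[frozenset(peptides)].append(transcript)

    transcript_sets = []
    for peptides, transcripts in groups.items():
        transcript_sets.append({
            "transcripts": sorted(transcripts),
            "num_peptides": len(peptides),
            "total_expression": sum(transcript_expression[t] for t in transcripts),
        })
    return transcript_sets


# Invented example data.
transcript_peptides = {
    "tx_1": {"KLDETGHLV", "LDETGHLVA"},
    "tx_2": {"LDETGHLVA", "KLDETGHLV"},
    "tx_3": {"ETGHLVAKR"},
}
transcript_expression = {"tx_1": 3.0, "tx_2": 1.5, "tx_3": 2.0}

transcript_sets = group_transcript_sets(transcript_peptides, transcript_expression)
```

Here `tx_1` and `tx_2` fall into one set because they yield identical peptides, while `tx_3` forms a set of its own.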

Transcript Set Detailed Data

Upon selecting a specific transcript set, you can see more details about the exact transcripts that are included.

The Transcripts in Set table lists all information regarding each transcript including:

Transcript ID, Gene Name, Amino Acid Change, Mutation Position, individual transcript expression, transcript support level, biotype and transcript length.

A light green color is used to highlight the specific Transcript in Selected Set that produced the Best Peptide for the variant in question.

Reference Match

When the --run-reference-proteome-similarity option is chosen, pVACseq will output a file of found matches of the epitope candidates in the reference proteome.

The Reference Matches tab will display the subsequent matches for the candidate currently being investigated:

The tab shows the best peptide with the variant highlighted in red, the query data (which includes the flanking sequence, with the best peptide highlighted in yellow), and a table of reference proteome hits.

The Hits table will display the peptides that match the query sequence and the genes, transcripts, and Hit IDs of the found matches.

Peptide Table

Upon selecting a specific transcript set, you can also visualize which well-binding peptides are produced from this set. The best peptide is highlighted in light green.

Both mutant ( MT ) and wildtype ( WT ) sequences are shown, along with either the lowest or median binding affinities, depending on how you generated the aggregate report.

An X is marked for binding affinities higher than the aggregate_inclusion_binding_threshold set when generating the aggregate report.

We also include three extra columns, one specifying the mutated position(s) in the peptide, one providing information on any problematic amino acids in the mutant sequence, and one identifying whether the peptide failed the anchor criteria for any of the HLA alleles.
Note that if users wish to utilize the problematic positions feature, they should run the standalone command pvacseq identify_problematic_amino_acids or run pVACseq with the --problematic-amino-acids option enabled to generate the needed information.
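For MT/WT pairs of equal length, such as those arising from missense mutations, the mutated positions can be found by direct comparison. This is a minimal sketch with invented sequences; `mutated_positions` is a hypothetical helper, and positions are one-based to match the Pos column.

```python
def mutated_positions(wt_peptide, mt_peptide):
    """Return the one-based positions at which the MT differs from the WT."""
    if len(wt_peptide) != len(mt_peptide):
        raise ValueError("this sketch only handles peptides of equal length")
    return [
        position
        for position, (wt_residue, mt_residue)
        in enumerate(zip(wt_peptide, mt_peptide), start=1)
        if wt_residue != mt_residue
    ]


# Invented 9-mer pair differing at a single residue.
positions = mutated_positions("KLDETGHWV", "KLDETGHLV")
```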

Anchor Heatmap

The Anchor Heatmap tab shows the top 30 MT/WT peptide pairs from the peptide table with anchor probabilities overlaid as a heatmap. The anchor probabilities shown are both allele and peptide length specific. The mutated amino acid is marked in red (for missense mutations) and each MT/WT pair is separated from the others by a dotted line.
For peptide sequences with no overlaying heatmap, we currently do not have allele-specific predictions in our database.

The Anchor Weights section shows a table of the per-allele per-length anchor weights for each peptide position.

For more details and explanations regarding anchor positions and their influence on neoantigen prediction and prioritization, please refer to the next section: Advanced Options: Anchor Contribution

Additional Information

IC50 Plot

By clicking on each MT/WT peptide pair, you can then assess the peptides in more detail by navigating to the Additional Peptide Information tab.

There are five different tabs in this section of the app, providing peptide-level details on the MT/WT peptide pair that you have selected.
The IC50 Plot tab shows violin plots of the individual IC50-based binding affinity predictions of the MT and WT peptides for HLA alleles that the MT binds well to. These peptides each have up to 8 binding algorithm scores for Class I alleles or up to 4 algorithm scores for Class II alleles.
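The aggregate IC50 value reported for a peptide is either the lowest or the median of these per-algorithm predictions. The following small sketch uses made-up scores and placeholder algorithm names; `summarize_ic50` is a hypothetical helper.

```python
from statistics import median

# Made-up per-algorithm IC50 predictions (nM) for one MT peptide.
mt_ic50_by_algorithm = {
    "algorithm_a": 120.0,
    "algorithm_b": 85.0,
    "algorithm_c": 410.0,
    "algorithm_d": 95.0,
}


def summarize_ic50(scores, metric="median"):
    """Collapse per-algorithm IC50 values into a single summary value."""
    values = list(scores.values())
    if metric == "median":
        return median(values)
    if metric == "lowest":
        return min(values)
    raise ValueError(f"unknown metric: {metric}")


median_ic50 = summarize_ic50(mt_ic50_by_algorithm, "median")
lowest_ic50 = summarize_ic50(mt_ic50_by_algorithm, "lowest")
```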

%ile Plot

The %ile Plot tab shows violin plots of the individual percentile-based binding affinity predictions of the MT and WT peptides for HLA alleles that the MT binds well to.

Binding Data

The Binding Data tab shows the specific IC50 and percentile binding affinity predictions generated from each individual algorithm. Each cell shows the IC50 prediction followed by the percentile predictions in parenthesis.

Elution and Immunogenicity Table

The Elution and Immunogenicity Table tab shows prediction results based on algorithms trained on peptide elution data. This includes algorithms such as NetMHCpanEL/NetMHCIIpanEL, MHCflurryELProcessing and MHCflurryELPresentation.

Anchor vs Mutation Positions

Neoantigen identification and prioritization relies on correctly predicting whether the presented peptide sequence can successfully induce an immune response. As the majority of somatic mutations are single nucleotide variants, changes between wildtype and mutated peptides are typically subtle and require cautious interpretation.

In the context of neoantigen presentation by specific MHC alleles, researchers have noted that a subset of peptide positions are presented to the T-cell receptor for recognition, while others are responsible for anchoring to the MHC, making these positional considerations critical for predicting T-cell responses.

Multiple factors should be considered when prioritizing neoantigens, including mutation location, anchor position, predicted MT and WT binding affinities, and WT/MT fold change, also known as agretopicity.

Examples of the four distinct possible scenarios for a predicted strong MHC binding peptide involving these factors are illustrated in the figure on the right. There are other possible scenarios where the MT is a poor binder; however, those are not listed as they do not pertain to our goal of neoantigen identification.

Scenario 1 shows the case where the WT is a poor binder and the MT peptide is a strong binder, containing a mutation at an anchor location. Here, the mutation results in a tighter binding of the MHC and allows for better presentation and potential for recognition by the TCR. As the WT does not bind (or is a poor binder), this neoantigen remains a good candidate since the sequence presented to the TCR is novel.

Scenario 2 and Scenario 3 both have strong binding WT and MT peptides. In Scenario 2 , the mutation of the peptide is located at a non-anchor location, creating a difference in the sequence participating in TCR recognition compared to the WT sequence. In this case, although the WT is a strong binder, the neoantigen remains a good candidate that should not be subject to central tolerance.

However, as shown in Scenario 3 , there are neoantigen candidates where the mutation is located at the anchor position and both peptides are strong binders. Although anchor positions can themselves influence TCR recognition, a mutation at a strong anchor location generally implies that both WT and MT peptides will present the same residues for TCR recognition. As the WT peptide is a strong binder, the MT neoantigen, while also a strong binder, will likely be subject to central tolerance and should not be considered for prioritization.

Scenario 4 is similar to the first scenario where the WT is a poor binder. However, in this case, the mutation is located at a non-anchor position, likely resulting in a different set of residues presented to the TCR and thus making the neoantigen a good candidate.

Anchor Guidance

To summarize, here are the specific criteria for prioritizing (accept) and not prioritizing (reject) a neoantigen candidate:
Note that in all four cases, we are assuming a strong MT binder which means (MT IC50 < binding threshold) OR (MT percentile < percentile threshold)

I: WT Weak binder : (WT IC50 > binding threshold) AND (WT percentile > percentile threshold)

II: WT Strong binder : (WT IC50 < binding threshold) OR (WT percentile < percentile threshold)

III: Mutation at Anchor : set(All mutated positions) is a subset of set(Anchor Positions of corresponding HLA allele)

IV: Mutation not at Anchor : There is at least one mutated position between the WT and MT that is not at an anchor position

Scenario 1 : I + III -> Accept

Scenario 2 : II + IV -> Accept

Scenario 3 : II + III -> Reject

Scenario 4 : I + IV -> Accept
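The guidance above can be condensed into a small decision function. This sketch assumes the MT peptide has already been established as a strong binder. It treats the WT peptide as a strong binder when its IC50 or percentile falls below the respective threshold (mirroring the MT definition), and rejects only when the WT is a strong binder and every mutated position is an anchor. The function name, argument names and example values are hypothetical.

```python
def anchor_evaluation(wt_ic50, wt_percentile, mutated_positions,
                      anchor_positions, binding_threshold, percentile_threshold):
    """Suggest Accept or Reject for a candidate whose MT is a strong binder."""
    wt_strong_binder = (wt_ic50 < binding_threshold
                        or wt_percentile < percentile_threshold)
    # "Mutation at anchor" means all mutated positions are anchor positions.
    mutation_at_anchor = set(mutated_positions) <= set(anchor_positions)
    if wt_strong_binder and mutation_at_anchor:
        return "Reject"
    return "Accept"


# Made-up thresholds and anchors for a 9-mer.
anchors = [1, 2, 8, 9]
scenario_1 = anchor_evaluation(5000.0, 10.0, [2], anchors, 500.0, 2.0)
scenario_2 = anchor_evaluation(50.0, 0.2, [5], anchors, 500.0, 2.0)
scenario_3 = anchor_evaluation(50.0, 0.2, [2], anchors, 500.0, 2.0)
scenario_4 = anchor_evaluation(5000.0, 10.0, [5], anchors, 500.0, 2.0)
```

Only the third call, with a strong WT binder and a mutation at an anchor, yields a rejection.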

Reassigning Tiers for all variants after adjusting parameters

The Tier column generated by pVACtools is aimed at helping users group and prioritize neoantigens in a more efficient manner. For details on how Tiering is done, please refer to the Variant Level tutorial tab where we break down each specific Tier and its criteria.

While we try to provide a set of reasonable default parameters, we fully understand the need for flexible changes to the parameters used in the underlying Tiering algorithm. Thus, we provide an Advanced Options tab in our app where users can change the following cutoffs custom to their individual analysis:

Binding Threshold

IC50 cutoff for a peptide to be considered a strong binder. Note that if allele-specific binding thresholds are in place, those will stay the same and will not be overwritten by this parameter value change.

Percentile Threshold

Percentile cutoff for a peptide to be considered a strong binder.

Clonal DNA VAF

VAF cutoff that is taken into account when deciding subclonal variants. Note that variants with a DNA VAF lower than half of the clonal VAF cutoff will be considered subclonal (e.g. setting a 0.6 clonal VAF cutoff means anything under 0.3 VAF is subclonal).
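The halving rule can be written out as a short check. `is_subclonal` is a hypothetical helper, and the handling of missing (NA) VAF values is left out of this sketch.

```python
def is_subclonal(dna_vaf, clonal_vaf_cutoff):
    """A variant is subclonal when its DNA VAF is below half the clonal cutoff."""
    return dna_vaf < clonal_vaf_cutoff / 2


# With a 0.6 clonal VAF cutoff, the subclonal boundary sits at 0.3.
low_vaf_is_subclonal = is_subclonal(0.25, 0.6)
high_vaf_is_subclonal = is_subclonal(0.45, 0.6)
```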

Allele Expr

Allele expression cutoff for a peptide to be considered expressed. Note for each variant, the allele expression is calculated by multiplying gene expression and RNA VAF.

Default Anchors vs Allele-specific Anchors
By default, pVACtools considers positions 1, 2, n-1, and n to be anchors for an n-mer peptide. However, a recent study has shown that anchors should be considered on an allele-specific basis, as different anchor patterns exist among HLA alleles. Here, we provide users with the option to use allele-specific anchors when generating the Anchor tier. To objectively determine which positions are anchors for each individual allele, users need to set a contribution percentage threshold (X).

In the anchor calculation results from the computational workflow described in the cited paper, each position of the n-mer peptide is assigned a score based on how strongly mutations at that position influenced binding to a given HLA allele. These scores can then be used to calculate the relative contribution of each position to the overall binding affinity of the peptide. Given the contribution threshold X, we rank the positions by normalized score in descending order (e.g. [2,9,1,3,4,8,7,6,5] for a 9-mer peptide) and sum the scores from the top down. Positions that together account for X% of the overall binding affinity change (e.g. 2,9,1) are assigned as anchor locations for tiering purposes.
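The ranking and summing procedure can be sketched as follows. This is an illustration under assumptions rather than the pVACview implementation: the per-position scores are invented, `allele_specific_anchors` is a hypothetical helper, the threshold is given as a fraction instead of a percentage, and the position whose score carries the running sum to or past the threshold is assumed to be included.

```python
def allele_specific_anchors(position_scores, contribution_threshold):
    """Return one-based anchor positions, highest-scoring first.

    position_scores: per-position scores, index 0 holding position 1.
    contribution_threshold: fraction of the total score (0 to 1) to cover.
    """
    total = sum(position_scores)
    if total <= 0:
        return []
    # Rank position indices by score, highest first.
    ranked = sorted(range(len(position_scores)),
                    key=lambda index: position_scores[index],
                    reverse=True)
    anchors, covered = [], 0.0
    for index in ranked:
        if covered >= contribution_threshold * total:
            break
        anchors.append(index + 1)
        covered += position_scores[index]
    return anchors


# Invented scores for a 9-mer in which positions 2, 9 and 1 dominate.
example_scores = [10, 40, 5, 3, 2, 4, 3, 3, 30]
anchors = allele_specific_anchors(example_scores, 0.75)
```

With these invented scores and a threshold of 0.75, positions 2, 9 and 1 together cover 80 of the 100 total, so they are the ones assigned as anchors.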

However, we recommend that users also navigate to the Anchor Heatmap Tab in the peptide level description for a less binary approach.

Original Parameters

In this box, we provide users with the original parameters they had used to generate the currently loaded aggregate report and metrics file.

Note that the app will keep track of your peptide evaluations and comments even when you change or reset the parameters.

If you see a parameter in the original parameter box but did not see an option to change it in the advanced options section, it is likely that you will be required to rerun the pvacseq generate-aggregate-report command. This is likely due to the current metrics file not having the necessary peptide information to perform this request.

Current Parameters

In this box, we provide users with the tiering parameters that are currently applied to the aggregate report. This allows users to compare their current parameters (if changed) with the original parameters.

Resetting Parameters

The reset button allows the user to restore the original tiering when desired.

Module for Exploring NeoFox Annotated Neoantigens

The one required file should end with the suffix "_neoantigen_candidates_annotated.tsv". The module expects all NeoFox annotated features to be in the file and can handle input with other annotations you might append to the neoantigen candidates.

Features

Annotated Neoantigen Table

The annotated neoantigen table is generated as output from NeoFox and includes many annotations based on published neoantigen features. You can page through the candidates, sort by any feature, and select one or more candidates for further investigation. We have marked the features we find most informative with an asterisk. These columns are selected by default, but additional columns can be selected using the "Column Visibility" dropdown.

Colored heatmap cell backgrounds on binding affinity and rank columns indicate where the value falls in comparison to the default 500 nM binding affinity and 0.5 percentile thresholds, respectively. Green background colors indicate a value below the threshold while yellow to red colors indicate a progressively higher value above the threshold. Horizontal barplot backgrounds on the expression and VAF columns reflect how close the values are to the "ideal" values of 50 and 1, respectively.

Comparative Violin Plots

You can understand how selected candidates relate to the rest of the dataset using the comparative violin plots. You can select as many candidates as you would like, which will then be highlighted in red in the violin plots. You can also select up to six features to view at a time. We have pre-selected five features which we found informative.

Dynamic Scatter Plot

You can also further investigate the data using the dynamic scatter plot, where you can choose any feature to be the X-axis, Y-axis, color, or size variable. The X and Y scales can be transformed and a range of values subsetted. The colors representing the minimum and maximum values can also be changed to any HEX value.

To view information about different points on the plot simply mouse over individual points. You can also export the current scatter plot by using the camera icon at the top right corner of the plot.

Evaluation and Commenting

The evaluation buttons at the right of each candidate row can be used to capture the final decision on whether to accept, reject, or further review the candidate. The total counts for each type of evaluation are displayed in the "Peptide Evaluation Overview" panel.

You are able to leave a comment on all selected candidates by using the form in the panel on the top right of the page. This panel also displays the comments left on each selected candidate. Both the selected evaluation and comment are included in the exported table.

Exporting

After investigating and evaluating your candidates, you can export the main table, including the final evaluation and comment for each candidate. After browsing to the "Export Data" tab, click the "Download as TSV" or "Download as excel" button to download the table in your desired file format.

Module for Exploring Any Annotated Neoantigens

The custom module boasts the most flexibility for viewing your data, since there are no required features that are expected to be in the file.

We provide three examples of neoantigen prediction pipeline output data: Vaxrank, NeoPredPipe, and antigen.garnish.2

When you upload your file, you can then choose how to visualize the data by selecting which feature from your input you would like to group and sort candidates by. The feature you choose to group by will allow you to explore candidates that are similar to one another in a separate table. For example, to mimic the pVACseq Module grouping you could select to group by variant. The order of the candidates in each grouping is determined by the numeric feature you choose to sort by. Candidates within the pVACseq Module are sorted by best binders. Finally, you can select which features to display for each group of peptides; the default selection is all features.
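The group-by and sort-by choices can be pictured with a small sketch. The candidate records and feature names below are invented, `group_and_sort` and `top_candidates` are hypothetical helpers, and sorting in ascending order (so that the lowest IC50 comes first) is an assumption made for this example.

```python
from collections import defaultdict


def group_and_sort(candidates, group_by, sort_by):
    """Group candidates by one feature and sort each group by a numeric feature."""
    groups = defaultdict(list)
    for candidate in candidates:
        groups[candidate[group_by]].append(candidate)
    return {
        key: sorted(members, key=lambda candidate: candidate[sort_by])
        for key, members in groups.items()
    }


def top_candidates(grouped):
    """Pick the first (best-sorted) candidate of every group."""
    return {key: members[0] for key, members in grouped.items()}


# Invented candidate records.
candidates = [
    {"variant": "variant_1", "peptide": "KLDETGHLV", "ic50": 250.0},
    {"variant": "variant_1", "peptide": "LDETGHLVA", "ic50": 40.0},
    {"variant": "variant_2", "peptide": "ETGHLVAKR", "ic50": 310.0},
]

grouped = group_and_sort(candidates, group_by="variant", sort_by="ic50")
best_per_variant = top_candidates(grouped)
```

The top candidate of each group corresponds to the row shown in the overview table, while the full sorted group corresponds to what the detailed table displays.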

Features

Overview of Neoantigen Features

The Overview of Neoantigen Features table displays the groups of candidates as designated by the feature you specify. The top candidate of the group according to the sort by feature is shown in the table. To investigate the candidates within the group, click the Investigate button.

Detailed Data

The Detailed Data table shows you all the candidates within the group so that you can compare them to one another. This table will only display the features that you selected on the upload page.

Dynamic Scatter Plot

To view information about different points on the plot simply mouse over individual points. You can also export the current scatter plot by using the camera icon at the top right corner of the plot.

Option 1: View NeoFox demo data

Please wait a couple seconds after clicking for the data to load.

Option 2: Upload your own NeoFox data files

(Required) Please upload your NeoFox output file. This file should be a table generated by NeoFox with the suffix "_neoantigen_candidates_annotated.tsv"

NeoFox (NEOantigen Feature toolbOX)

NeoFox (NEOantigen Feature toolbOX) is a Python package that annotates a given set of neoantigen candidate sequences with relevant neoantigen features.

The tool covers neoepitope prediction by MHC binding and ligand prediction, similarity/foreignness of a neoepitope candidate sequence, combinatorial features and machine learning approaches by running a wide range of published toolsets on the given input data. For more detailed information on the specific neoantigen-related algorithms and how to generate your own NeoFox results, please refer to the link below:

Peptide Evaluation Overview