VCF Quality Check

Assessing quality of exome variant data using a standardized similarity score

Upload your exome sequence variants in VCF format for quality assessment. For testing, you can also use the following sample from the 1000 Genomes Project: NA06986.

I hereby certify that I have the following consent for the analysis of electronic data:
We care about privacy and confidentiality of genetic information!
There is a lively debate in the bioethics community about what consent is needed to analyze and share genetic data. Please learn more about GeneTalk’s position on this important issue and join the discussion in this blog article!

Note that your VCF file will be removed from our server after the analysis.

Download the QC software.

How to interpret your results

We use a genotype-weighted metric to measure the similarity between your uploaded variant set and the appropriate reference data from the 1000 Genomes Project. The resulting distance matrix is then visualized in two dimensions using multidimensional scaling (MDS). In the figure on the left, individuals of European descent that were analyzed by the 1000 Genomes Project form a homogeneous cluster. Test sample 1 falls within this reference cluster and is therefore of high quality, while test sample 2 lies apart from the cluster, indicating lower quality. The exome-wide genotyping accuracy can be estimated by intersecting the computed standardized dissimilarity score with the reference curve (right), which indicates higher accuracy for sample 1 than for sample 2.
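To make the idea of a standardized dissimilarity score more concrete, here is a minimal Python sketch. It is not the metric used by the QC software, whose exact weighting is not described on this page. As stand-in assumptions, genotypes are encoded as alternate-allele counts (0, 1, 2, or None for a missing call), the distance between two samples is the mean absolute difference in allele counts, and the score is a z-score of the test sample's mean distance to the references against a leave-one-out baseline. All function names and the toy genotypes are hypothetical.

```python
from statistics import mean, stdev

def genotype_distance(a, b):
    """Mean absolute difference in alternate-allele counts, scaled to [0, 1].

    Assumption: genotypes are encoded as 0, 1 or 2 alternate alleles,
    with None marking a site that was not called in a sample.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        raise ValueError("no sites genotyped in both samples")
    # The largest per-site difference is 2, so dividing by 2 bounds the result.
    return sum(abs(x - y) for x, y in shared) / (2 * len(shared))

def mean_distance_to(sample, others):
    """Average distance from one sample to a list of other samples."""
    return mean(genotype_distance(sample, other) for other in others)

def standardized_dissimilarity(test, references):
    """Z-score of the test sample's mean distance to the reference panel.

    The baseline is each reference's mean distance to the remaining
    references (leave-one-out), an illustrative choice for this sketch.
    """
    baseline = [
        mean_distance_to(ref, references[:i] + references[i + 1:])
        for i, ref in enumerate(references)
    ]
    spread = stdev(baseline)
    if spread == 0:
        raise ValueError("reference panel has no variation in distance")
    return (mean_distance_to(test, references) - mean(baseline)) / spread

# Toy reference panel: four samples genotyped at six sites (made-up values).
references = [
    [0, 1, 2, 0, 1, 0],
    [0, 1, 2, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 2, 2, 0, 1, 0],
]
sample_1 = [0, 1, 2, 0, 1, 0]  # agrees closely with the panel
sample_2 = [2, 0, 0, 1, 2, 2]  # many discordant genotypes

score_1 = standardized_dissimilarity(sample_1, references)
score_2 = standardized_dissimilarity(sample_2, references)
```

In this toy setup, the sample that agrees with the panel gets a lower score than the discordant one, which mirrors how sample 1 sits inside the reference cluster and sample 2 outside it. Projecting the full distance matrix with MDS and reading accuracy off the reference curve are separate steps that this sketch leaves out.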

