Documentation
LDlink is designed to be an intuitive and simple tool for investigating patterns of linkage disequilibrium across a variety of ancestral population groups. This help documentation page gives a detailed description of the metrics calculated by LDlink modules and helps users understand all aspects of the required input and returned output. This application's source code can be viewed on GitHub. The documentation is divided into the following sections: Understanding Linkage Disequilibrium, Data Sources, Calculations, Modules, and Frequently Asked Questions.
Understanding Linkage Disequilibrium
Linkage Equilibrium exists when alleles from two different genetic variants occur independently of each other. The inheritance of such variants follows probabilistic patterns governed by population allele frequencies. The vast majority of genetic variants on a chromosome are in linkage equilibrium. Variants in linkage equilibrium are not considered linked.
Linkage Disequilibrium is present when alleles from two nearby genetic variants commonly occur together in a non-random, linked fashion. This linked mode of inheritance results from genetic variants in close proximity being less likely to be separated by a recombination event, and thus alleles of the variants are more commonly inherited together than expected. Alleles of variants in linkage disequilibrium are correlated, with the degree of correlation generally greater the closer the variants are in physical distance. Measures of linkage disequilibrium include D prime (D') and R squared (R2).
Haplotype is a cluster of genetic variants that are inherited together. Humans are diploid, having maternal and paternal copies of each autosomal chromosome. Each chromosomal copy is organized into segments of high linkage disequilibrium, called haplotype "blocks". Due to unique population histories and differences in variant allele frequencies, haplotype structure tends to be population specific. Although haplotypes are essential for calculating measures of linkage disequilibrium, haplotypes are seldom directly observed. Statistical chromosome phasing techniques are often necessary to infer individual haplotypes.
Data Sources
dbSNP (source: GRCh37 and GRCh38) - To investigate patterns of linkage disequilibrium, LDlink focuses on two main classes of genetic variation: single nucleotide polymorphisms (SNPs) and insertions/deletions (indels). Every module of LDlink requires the entry of at least one variant as identified by a RefSNP number (RS number) or genomic position (chr#:position). RS numbers have been assigned by dbSNP and are well-curated identifiers that follow the format "rs" followed by 1 to 8 digits. The current implementation of LDlink references dbSNP and only accepts input for bi-allelic variants.
1000 Genomes Project (source: GRCh37, GRCh38, and GRCh38 High Coverage) - Publicly available reference haplotypes from the 1000 Genomes Project are used by LDlink to calculate population-specific measures of linkage disequilibrium. Haplotypes are available for continental populations (ex: European, African, and Admixed American) and sub-populations (ex: Finnish, Gambian, and Peruvian). All LDlink modules require the selection of at least one 1000 Genomes Project sub-population, but several sub-populations can be selected simultaneously. Available haplotypes vary by sub-population based on sample size.
UCSC RefSeq (source: GRCh37 and GRCh38) - Publicly available gene transcripts from the UCSC Table Browser are used by LDlink's LDassoc, LDmatrix, and LDproxy modules to display genes within the genomic window of interest.
RegulomeDB (source: GRCh37) - Publicly available scores from RegulomeDB are used by LDlink's LDassoc and LDproxy modules to rank available datatypes for a single coordinate. GRCh38 support is added via liftOver.
Genetic Map (source: GRCh37) - Publicly available combined recombination rates (cM/Mb) from the 1000 Genomes Project are used by LDlink's LDassoc and LDproxy modules to show recombination at specific coordinates. GRCh38 support is added via liftOver.
GTEx Portal (source: GRCh38) - Publicly available single-tissue cis-QTL data from the GTEx Portal is used by LDlink's LDexpress module to show significant variant-gene associations in multiple tissue types. GRCh37 support is added via GTEx lookup table.
GWAS Catalog (source: GRCh38) - Publicly available NHGRI-EBI Catalog of human genome-wide association studies from GWAS Catalog is used by LDlink's LDtrait module to search if variants have previously been associated with a trait or disease. GRCh37 support is added via dbSNP.
FORGEdb (source: FORGEdb) - Publicly available scores from FORGEdb are used by LDlink's LDassoc and LDproxy modules to rank putative functionality for a single variant by RSID.
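The two accepted identifier formats described for dbSNP above (an RS number, or chr#:position) can be checked before submitting a query. The following is a minimal sketch, not LDlink's own validation: the function name `classify_variant` is hypothetical, and the set of chromosome names (1 to 22, X, Y) is an assumption, since this page only gives the general "chr#:position" shape.

```python
import re

# "rs" followed by 1 to 8 digits, as described in the dbSNP entry above.
RS_PATTERN = re.compile(r"^rs\d{1,8}$")
# Assumed chromosome names: 1-22, X, Y (the page only shows "chr#:position").
COORD_PATTERN = re.compile(r"^chr(\d{1,2}|X|Y):(\d+)$")


def classify_variant(query: str) -> str:
    """Return 'rs', 'coord', or 'invalid' for a single query string."""
    text = query.strip()
    if RS_PATTERN.match(text.lower()):
        return "rs"
    match = COORD_PATTERN.match(text)
    if match:
        chrom = match.group(1)
        if chrom in ("X", "Y") or 1 <= int(chrom) <= 22:
            return "coord"
    return "invalid"


if __name__ == "__main__":
    for query in ["rs3", "chr7:24966446", "7:24966446"]:
        print(query, classify_variant(query))
```

A format check of this kind cannot tell whether a variant is bi-allelic; that still has to be confirmed against dbSNP.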
Calculations
LDlink modules report the following measures of linkage disequilibrium: D prime, R squared, and goodness-of-fit statistics. Below is a brief description of each measure.
D prime (D') - an indicator of allelic segregation for two genetic variants. D' values range from 0 to 1 with higher values indicating tight linkage of alleles. A D' value of 0 indicates no linkage of alleles. A D' value of 1 indicates at least one expected haplotype combination is not observed.
R squared (R2) - a measure of correlation of alleles for two genetic variants. R2 values range from 0 to 1 with higher values indicating a higher degree of correlation. An R2 value of 0 indicates alleles are independent, whereas an R2 value of 1 indicates an allele of one variant perfectly predicts an allele of another variant. R2 is sensitive to allele frequency.
Goodness of Fit (Χ2 and p-value) - a statistical test of whether observed haplotype counts follow the frequencies expected from variant allele frequencies. High chi-square statistics and low p-values are evidence that haplotype counts deviate from expected values and suggest linkage disequilibrium may be present.
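The three measures above can be illustrated from a two-by-two table of haplotype counts. The sketch below uses the standard textbook definitions (D' as |D| divided by its maximum possible value, R2 as D squared over the product of the four allele frequencies, and a 1-degree-of-freedom chi-square equal to n times R2); this page does not show LDlink's own code, so the function name `ld_stats` and its argument layout are assumptions made for illustration.

```python
import math


def ld_stats(n11: int, n12: int, n21: int, n22: int) -> dict:
    """LD measures from haplotype counts.

    n11 = haplotypes carrying allele A (variant 1) and allele B (variant 2),
    n12 = A with b, n21 = a with B, n22 = a with b.
    """
    n = n11 + n12 + n21 + n22
    p_a = (n11 + n12) / n  # frequency of allele A at variant 1
    p_b = (n11 + n21) / n  # frequency of allele B at variant 2
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both variants must be polymorphic")

    d = n11 / n - p_a * p_b  # observed minus expected AB frequency
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))

    r2 = d * d / denom
    chi2 = n * r2  # Pearson chi-square for a 2x2 table
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail, 1 degree of freedom
    return {"d_prime": abs(d) / d_max, "r2": r2, "chi2": chi2, "p": p_value}


if __name__ == "__main__":
    # One haplotype combination is never observed: D' is 1 while R2 is below 1.
    print(ld_stats(40, 10, 0, 50))
```

The example counts are made up. They show the point made in the D' description: when one haplotype combination is absent, D' equals 1 even though R2, which is sensitive to allele frequency, stays below 1.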
Modules
LDlink consists of eleven modules: LDassoc, LDexpress, LDhap, LDmatrix, LDpair, LDpop, LDproxy, LDtrait, LDscore, SNPchip, and SNPclip. Choose between GRCh37 (hg19), GRCh38 (hg38), and GRCh38 High Coverage (hg38) 1000 Genomes Project datasets with the Genome Build (1000G) dropdown menu on the top left. Each module can be accessed by clicking on the LD Tools dropdown at the top of all LDlink pages. Below is a description of each module, the required user input, and an explanation of the returned output.
LDassoc
- Interactively visualize association p-value results and linkage disequilibrium patterns for a genomic region of interest. Based on the file size, number of variants and selected query populations, this module may take some time to run.
Input:
- Association data file - upload a space or tab delimited file containing chromosome, position and p-value for each variant. A file header, while not required, will be useful when selecting input columns. Once the file is uploaded the user needs to specify which columns are the respective chromosome, position, and association p-values. Uploaded files will be stored on our secure server for use during LDassoc sessions and will be deleted after 4 hours.
- Select region - choices are Gene, Region, or Variant. Gene requires a RefSeq gene name and base pair window and allows for an optional index variant. Region requires genomic start and end coordinates and allows for an optional index variant. Variant requires an index variant and allows for an optional base pair window (500,000bp is the default). The index variant will be plotted in blue and is used to calculate all pairwise LD statistics. When no index variant is specified for Gene or Region, LDassoc designates the variant with the lowest p-value as the index variant.
- Reference population(s) - selected from the drop down menu. At least one 1000 Genomes Project sub-population is required, but more than one may be selected.
- LD measure - select if desired output is based on estimated R2 or D'.
- Collapse transcripts - choose to combine gene transcripts with the same name in the gene plot.
- RegulomeDB annotation - choose to display RegulomeDB scores in the interactive plot.
Output:
- Interactive plot - interactive plot of index variant and p-values of all bi-allelic variants in the specified plotting region. X axis is the chromosomal coordinates and the Y axis is the -log10 p-value as well as the combined recombination rate. Each point represents a variant and is colored based on D' or R2, sized based on minor allele frequency, and labeled based on regulatory potential. Hovering over the point will display detailed information on the index and proxy variants.
- UCSC link - external link to the plotted region in the UCSC Genome Browser. This is useful for exploring nearby genes and regulatory elements in the region.
- Table of Association Results - by default, the ten variants with the lowest p-values and closest distance to the index variant are displayed. External links lead to the variant RS number in dbSNP, coordinates in the UCSC Genome Browser, and regulatory information (if any) in RegulomeDB.
- Download association data for all variants - download a file with information on all variants within the plotting region.
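The association data file and the default index variant rule described above can be sketched as follows. This is an illustration under stated assumptions, not LDassoc's code: the function names, the column names in the sample, and the sample rows are all hypothetical.

```python
import math


def load_association(lines, chr_col, pos_col, p_col):
    """Parse space- or tab-delimited rows (first line is the header)."""
    header = lines[0].split()
    idx = {name: header.index(name) for name in (chr_col, pos_col, p_col)}
    records = []
    for line in lines[1:]:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        p = float(fields[idx[p_col]])
        records.append({
            "chr": fields[idx[chr_col]],
            "pos": int(fields[idx[pos_col]]),
            "p": p,
            # Y axis value of the plot; guard against p == 0.
            "neg_log10_p": -math.log10(p) if p > 0 else math.inf,
        })
    return records


def default_index_variant(records):
    """Variant with the lowest p-value, used when no index variant is given."""
    return min(records, key=lambda r: r["p"])


if __name__ == "__main__":
    sample = [
        "CHR POS P",                # hypothetical header names
        "8 128413305 0.002",
        "8\t128414000\t1e-8",       # tabs and spaces are both accepted
        "8 128500000 0.5",
    ]
    rows = load_association(sample, "CHR", "POS", "P")
    print(default_index_variant(rows))
```

Passing the column names explicitly mirrors the step where the user specifies which uploaded columns hold chromosome, position, and p-value.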
LDexpress
- Search if a list of variants (or variants in LD with those variants) is associated with gene expression in tissues of interest. Quantitative trait loci data is downloaded from the GTEx Portal (GTEx v8).
Input:
- List of RS numbers - this can either be entered one per line in the text entry box or uploaded as a file that contains a list of RS numbers in the first column. A maximum of 10 variant RS numbers or chromosomal coordinates (GRCh37) are permitted. All input variants must match a bi-allelic variant. The text entry field is automatically filled if a file of SNPs or genomic coordinates is uploaded.
- Reference population(s) - selected from the drop-down menu. At least one 1000 Genomes Project sub-population is required, but more than one may be selected.
- R2/D' toggle - select if desired output is filtered from a threshold based on estimated R2 or D'.
- R2/D' threshold - set threshold for LD filtering. Any variants within the specified genomic window (-/+) that have an R2 or D' less than the threshold will be removed. Value needs to be in the range 0 to 1. Default value is 0.1.
- Tissue(s) - select the GTEx tissue or tissues of interest for searching for eQTLs.
- P-value threshold - define the eQTL significance threshold used for returning query results. Default value is 0.1 which returns all GTEx eQTL associations with P-value less than 0.1. Values can be entered in scientific notation (e.g. "1e-5").
- Window size - set genomic window size for LD calculation. Specify a value greater than or equal to zero and less than or equal to 1,000,000bp. Default value is -/+ 500,000bp.
Output:
- Variants in LD with variants in GTEx QTL - a list of queried variants in LD with a variant reported in the GTEx QTL data. Each variant in the list is a clickable link that brings up a detailed table showing genes with expression associated variants that are in linkage disequilibrium with the input variant(s).
- Details Tables - output table that appears when a variant in the variant list is clicked. Details are provided on SNP rsID, genomic position, R2, D', Gene Symbol of associated gene expression, Gencode ID of the expressed gene, tissue where the association was observed, Effect Size, and P-value. External links lead to the QTL page in the GTEx portal.
- Variants with Warnings - output table that appears when a variant produces a warning in LDexpress. Common reasons for a variant to be on the warning list are if the variant is not found in dbSNP/GTEx or if no variants in LD are found within window of the 1000G reference VCF file. Details are provided on genomic position and why the variant was excluded from the variant list.
- Download GTEx QTL list - clickable link to download a text file in tab-delimited format that lists all query variant RS numbers, respective QTL which are in LD with query variant, and associated gene expression.
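The window, LD threshold, and P-value threshold inputs described above act together as a filter on candidate associations. A minimal sketch of that filtering logic follows; the function name, the record keys (`pos`, `r2`, `dprime`, `p`), and the sample records are hypothetical, and the real module works against GTEx and 1000 Genomes data rather than an in-memory list.

```python
def filter_eqtl_hits(hits, query_pos, window=500_000,
                     ld_threshold=0.1, p_threshold=0.1, measure="r2"):
    """Keep hits inside the window that pass the LD and P-value thresholds."""
    if not 0 <= window <= 1_000_000:
        raise ValueError("window must be between 0 and 1,000,000 bp")
    if not 0 <= ld_threshold <= 1:
        raise ValueError("LD threshold must be between 0 and 1")
    kept = []
    for hit in hits:
        in_window = abs(hit["pos"] - query_pos) <= window
        passes_ld = hit[measure] >= ld_threshold  # below threshold is removed
        passes_p = hit["p"] < p_threshold          # P-value less than threshold
        if in_window and passes_ld and passes_p:
            kept.append(hit)
    return kept


if __name__ == "__main__":
    hits = [  # illustrative values only
        {"pos": 1_000, "r2": 0.9, "dprime": 1.0, "p": 1e-6},
        {"pos": 900_000, "r2": 0.9, "dprime": 1.0, "p": 1e-6},  # outside window
        {"pos": 2_000, "r2": 0.05, "dprime": 0.4, "p": 1e-6},   # low R2
        {"pos": 3_000, "r2": 0.5, "dprime": 0.8, "p": 0.2},     # P too high
    ]
    print(filter_eqtl_hits(hits, query_pos=1_000))
```

Setting `measure="dprime"` corresponds to switching the R2/D' toggle.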
LDhap
- Calculate haplotype frequencies for a list of up to 30 bi-allelic variants in a selected population. Results include a table of haplotype frequencies and a downloadable file.
Input:
- List of RS numbers - this can either be entered one per line in the text entry box or uploaded as a file that contains a list of RS numbers in the first column. A maximum of 30 variant RS numbers are permitted. All input variants must be on the same chromosome and match a bi-allelic variant. The text entry field is automatically filled with the contents of the uploaded file.
- Reference population(s) - selected from the drop down menu. At least one 1000 Genomes Project sub-population is required, but more than one may be selected.
Output:
- Table of observed haplotypes - haplotypes with frequencies greater than 1 percent are displayed vertically and ordered by observed frequency in the selected query sub-population. Variant genotypes are reported in rows and are sorted by genomic position. Queries should be limited to nearby variants, since haplotype switch rate errors become more common as genomic distance increases. Links are available to dbSNP RS numbers and coordinates in the UCSC Genome Browser.
- Variant file download - download a file listing the order of the queried variants. Variants are ordered by genomic location. This file is used as a key for the order of genotypes in the haplotype file. Only bi-allelic variants with matching dbSNP RS numbers are included. All others are filtered out.
- Haplotype file download - download a file of all observed haplotypes. Haplotype genotypes are in the order of the above variant file.
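The table of observed haplotypes described above amounts to counting phased haplotypes and keeping those above 1 percent. A minimal sketch, assuming each haplotype is given as a string of alleles in genomic order (one string per chromosome copy); the function name and the sample data are hypothetical.

```python
from collections import Counter


def haplotype_frequencies(haplotypes, min_freq=0.01):
    """Return (haplotype, count, frequency) rows, most frequent first.

    Only haplotypes with frequency strictly greater than min_freq are kept.
    """
    total = len(haplotypes)
    counts = Counter(haplotypes)
    rows = [(hap, n, n / total) for hap, n in counts.most_common()]
    return [row for row in rows if row[2] > min_freq]


if __name__ == "__main__":
    # 200 made-up chromosome copies over two variants.
    observed = ["AC"] * 120 + ["GT"] * 78 + ["AT"] * 2
    for hap, n, freq in haplotype_frequencies(observed):
        print(hap, n, freq)
```

In this made-up sample the "AT" haplotype sits at exactly 1 percent and is therefore not displayed, since the cutoff is "greater than 1 percent".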
LDmatrix
- Create an interactive heatmap matrix of pairwise linkage disequilibrium statistics.
Input:
- List of RS numbers - this can either be entered one per line in the text entry box or uploaded as a file that contains a list of RS numbers in the first column. A maximum of 300 variant RS numbers are permitted. All input variants must be on the same chromosome and match a bi-allelic variant. The text entry field is automatically filled with the contents of the uploaded file.
- Reference population(s) - selected from the drop down menu. At least one 1000 Genomes Project sub-population is required, but more than one may be selected.
- Collapse transcripts - choose to combine gene transcripts with the same name in the gene plot.
Output:
- Interactive heat map - a square plot with dimensions equal to the number of query variants that match a dbSNP RS number. Hovering over the plot with the mouse will display pairwise LD metrics for the respective row and column variants. Variants are ordered by genomic coordinates.
- D prime download - download a file of all pairwise D' statistics.
- R squared download - download a file of all pairwise R2 statistics.
LDpair
- Investigate correlated alleles for a pair of variants in high LD.
Input:
- Variant1 RS number - RS number for query variant 1. RS number must match a bi-allelic variant.
- Variant2 RS number - RS number for query variant 2. RS number must match a bi-allelic variant.
- Reference population(s) - selected from the drop down menu. At least one 1000 Genomes Project sub-population is required, but more than one may be selected.
Output:
- Two-by-two table - contingency table displaying haplotype counts and allele frequencies for the two query variants. External links are available to dbSNP RS numbers and coordinates in the UCSC Genome Browser.
- Haplotypes - haplotype genotypes, counts, and frequencies.
- Statistics - calculated metrics of linkage disequilibrium including: D prime (D'), R square (R2), and goodness-of-fit (Chi-square and p-value). Goodness-of-fit tests for deviations of expected haplotype frequencies based on allele frequencies.
- Correlated Alleles - alleles that are correlated if linkage disequilibrium is present (R2 > 0.1). If the variants are in linkage equilibrium, no alleles are reported.
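The Correlated Alleles output can be sketched from the two-by-two haplotype table: when R2 exceeds 0.1, the sign of D indicates which alleles travel together. This is an illustration using the standard definition of D, not LDpair's own code; the function name, argument layout, and example counts are assumptions.

```python
def correlated_alleles(alleles_1, alleles_2, counts, r2_min=0.1):
    """Return pairs of correlated alleles, or [] if R2 is not above r2_min.

    alleles_1 = (a1, a2) for variant 1, alleles_2 = (b1, b2) for variant 2,
    counts = haplotype counts in the order (a1b1, a1b2, a2b1, a2b2).
    """
    (a1, a2), (b1, b2) = alleles_1, alleles_2
    n11, n12, n21, n22 = counts
    n = n11 + n12 + n21 + n22
    p_a1 = (n11 + n12) / n
    p_b1 = (n11 + n21) / n
    denom = p_a1 * (1 - p_a1) * p_b1 * (1 - p_b1)
    if denom == 0:
        return []  # a monomorphic variant carries no LD information
    d = n11 / n - p_a1 * p_b1
    if d * d / denom <= r2_min:
        return []  # treated as linkage equilibrium: nothing reported
    if d > 0:
        return [(a1, b1), (a2, b2)]
    return [(a1, b2), (a2, b1)]


if __name__ == "__main__":
    print(correlated_alleles(("A", "G"), ("C", "T"), (45, 5, 5, 45)))
```

With the made-up counts shown, A pairs with C and G pairs with T; swapping the counts to (5, 45, 45, 5) flips the pairing.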
LDpop
- Investigate allele frequencies and linkage disequilibrium patterns across 1000G populations.
Input:
- Variant1 RS number - RS number for query variant 1. RS number must match a bi-allelic variant.
- Variant2 RS number - RS number for query variant 2. RS number must match a bi-allelic variant.
- Reference population(s) - selected from the drop down menu. At least one 1000 Genomes Project sub-population is required, but more than one may be selected.
- R2/D' toggle - select if desired output is based on estimated R2 or D'.
Output:
- Interactive maps - three interactive maps will be returned as tabbed output. The first tab displays a map with the chosen linkage disequilibrium measure between query variants 1 and 2 for the selected 1000G populations. The second and third tabs will display maps of allele frequency distribution for query variants 1 and 2, respectively, for the selected 1000G populations. Map pins are color coded based on LD/allele frequency and when clicked display additional information for the population.
- Table of populations - a searchable and sortable table is generated showing allele frequency and LD values for the query variants and selected 1000G populations.
- Download table - link to download data in the table of populations.
LDproxy
- Interactively explore proxy and putatively functional variants for a query variant. Based on the selected query populations, this module may take some time to run.
Input:
- Variant RS number - RS number for query variant. RS number must match a bi-allelic variant.
- Reference population(s) - selected from the drop down menu. At least one 1000 Genomes Project sub-population is required, but more than one may be selected.
- LD measure - select if desired output is based on estimated R2 or D'.
- Collapse transcripts - choose to combine gene transcripts with the same name in the gene plot.
Output:
- Interactive plot - interactive plot of query variant and all bi-allelic dbSNP variants plus or minus 500 kilobases (Kb) of the query variant. X axis is the chromosomal coordinates and the Y axis is the pairwise R2 value with the query variant as well as the combined recombination rate. Each point represents a proxy variant and is colored based on function, sized based on minor allele frequency, and labeled based on regulatory potential. Hovering over the point will display detailed information on the query and proxy variants.
- UCSC link - external link to the plotted region (query variant -/+ 500 Kb) in the UCSC Genome Browser. This is useful for exploring nearby genes and regulatory elements in the region.
- Table of proxy variants - by default, the ten variants with the highest R2 values and closest distance to the query variant are displayed. External links lead to the variant RS number in dbSNP, coordinates in the UCSC Genome Browser, and regulatory information (if any) in RegulomeDB.
- Download all proxy variants - download a file with information on all variants -/+ 500 Kb of the query variant with a pairwise R2 value greater than 0.01.
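The default table view and the R2 cutoff described above can be sketched as a filter followed by a sort. The ordering used here (R2 descending, then distance to the query variant ascending) is one reading of "highest R2 values and closest distance" and is an assumption, as are the function name and record keys.

```python
def top_proxies(proxies, query_pos, r2_min=0.01, limit=10):
    """Rank proxy variants for display.

    Drops variants with R2 at or below r2_min, then orders by R2 (highest
    first) and, for ties, by distance to the query variant (closest first).
    """
    kept = [p for p in proxies if p["r2"] > r2_min]
    kept.sort(key=lambda p: (-p["r2"], abs(p["pos"] - query_pos)))
    return kept[:limit]


if __name__ == "__main__":
    proxies = [  # illustrative values only
        {"id": "proxy_far", "pos": 150_000, "r2": 1.0},
        {"id": "proxy_near", "pos": 100_500, "r2": 1.0},
        {"id": "proxy_weak", "pos": 100_100, "r2": 0.005},
        {"id": "proxy_mid", "pos": 99_000, "r2": 0.6},
    ]
    for row in top_proxies(proxies, query_pos=100_000):
        print(row["id"], row["r2"])
```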
LDtrait
- Search if a list of variants (or variants in LD with those variants) has previously been associated with a trait or disease. Trait and disease data is updated nightly from the GWAS Catalog. Problematic data entries in the GWAS Catalog can be browsed here.
Input:
- List of RS numbers - this can either be entered one per line in the text entry box or uploaded as a file that contains a list of RS numbers in the first column. A maximum of 50 variant RS numbers or chromosomal coordinates (GRCh37) are permitted. All input variants must match a bi-allelic variant. The text entry field is automatically filled with the contents of the uploaded file.
- Reference population(s) - selected from the drop-down menu. At least one 1000 Genomes Project sub-population is required, but more than one may be selected.
- R2/D' toggle - select if desired output is filtered from a threshold based on estimated R2 or D'.
- R2/D' threshold - set threshold for LD filtering. Any variants within the specified genomic window (-/+) that have an R2 or D' less than the threshold will be removed. Value needs to be in the range 0 to 1. Default value is 0.1.
- Window size - set genomic window size for LD calculation. Specify a value greater than or equal to zero and less than or equal to 1,000,000bp. Default value is -/+ 500,000bp.
Output:
- Variants in LD with GWAS Catalog - a list of queried variants in LD with a variant reported in the GWAS Catalog. Each variant in the list is a clickable link that brings up a detailed table showing disease-associated variants that are in linkage disequilibrium with the variant.
- Details Tables - output table that appears when a variant in the variant list is clicked. Details are provided on GWAS trait, genomic position, alleles, R2, D', Effect Size (95% CI), Beta or OR, and P-value. External links lead to the result variant in the GWAS Catalog.
- Variants with Warnings - output table that appears when a variant produces a warning in LDtrait. Common reasons for a variant to be on the warning list are if the variant is not found in dbSNP or if no variants in LD are found within window of the GWAS Catalog. Details are provided on genomic position and why the variant was excluded from the variant list.
- Download GWAS Catalog annotated variant list - clickable link to download a text file in tab-delimited format that lists all query variant RS numbers and their respective GWAS Catalog results.
LDscore
- Perform LD score regression analysis online utilizing the ldsc tool developed in https://github.com/bulik/ldsc.
Input:
- For LD score calculation - genotype information as .bim/.bed/.fam files, as well as window in centiMorgan (cM) or kilobasepairs (kb) selected from the drop-down menu.
- For heritability analysis - GWAS summary statistics (format: tab-separated files with columns accepted by https://github.com/bulik/ldsc/blob/master/munge_sumstats.py, e.g. with effect size, p-value, SNP identifiers, and allelic information, note header naming format is important). For example: SNP (Variant ID e.g., rs number), A1 (Allele 1, interpreted as reference allele for signed sumstat), A2 (Allele 2, interpreted as non-reference allele for signed sumstat), Frq (Allele frequency), BETA or [linear/logistic] regression coefficient (0 --> no effect; above 0 --> A1 is trait/risk increasing), P (p-Value), N (Sample size). In addition, a reference population needs to be selected from the drop down menu.
- For genetic correlation - GWAS summary statistics (format: tab-separated files with columns accepted by https://github.com/bulik/ldsc/blob/master/munge_sumstats.py, e.g. with effect size, p-value, SNP identifiers, and allelic information, note header naming format is important). For example: SNP (Variant ID e.g., rs number), A1 (Allele 1, interpreted as reference allele for signed sumstat), A2 (Allele 2, interpreted as non-reference allele for signed sumstat), Frq (Allele frequency), BETA or [linear/logistic] regression coefficient (0 --> no effect; above 0 --> A1 is trait/risk increasing), P (p-Value), N (Sample size). In addition, a reference population needs to be selected from the drop down menu.
- Reference population - One 1000 Genomes Project/Genome Aggregation Database (gnomAD) sub-population is required for heritability analysis or genetic correlation.
Output:
- For LD Score calculation - A summary of LD Scores, the MAF/LD Score Correlation Matrix, full LD Scores file, as well as the option to download input data, in addition to a code prompt output summary.
- For heritability analysis - Total Observed scale heritability, Lambda GC, Mean Chi^2, Intercept and LDSC Ratio, as well as a code prompt output summary.
- For genetic correlation - tables with heritability of phenotype 1, heritability of phenotype 2, Genetic Covariance, Genetic Correlation and a Summary of Genetic Correlation Results, as well as a code prompt output summary.
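Because header naming matters for the GWAS summary statistics files described above, it can help to check a file's header before uploading. The sketch below only compares a tab-separated header against the example column names listed on this page; it does not reproduce the full set of names that munge_sumstats.py accepts, and the function name and constant are hypothetical.

```python
# Example column names taken from the input description above.
EXAMPLE_COLUMNS = ["SNP", "A1", "A2", "Frq", "BETA", "P", "N"]


def missing_columns(header_line, required=EXAMPLE_COLUMNS):
    """Return the required column names absent from a tab-separated header."""
    present = header_line.rstrip("\r\n").split("\t")
    return [name for name in required if name not in present]


if __name__ == "__main__":
    header = "SNP\tA1\tA2\tBETA\tP\n"
    print(missing_columns(header))
```

An empty list means every example column is present; a header using other accepted names would be flagged by this sketch even though the tool might take it.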
SNPchip
- Find commercial genotyping platforms for variants.
Input:
- List of RS numbers - this can either be entered one per line in the text entry box or uploaded as a file that contains a list of RS numbers in the first column. A maximum of 5,000 variant RS numbers are permitted. All input variants do not need to be on the same chromosome. The text entry field is automatically filled with the contents of the uploaded file.
- Genotyping Arrays - selected commercial genotyping arrays from the drop down list. At least one array must be selected, but more than one may be selected as input.
Output:
- Array Table - output table of variant rows and array columns. The presence of an "X" designates the variant is present on the respective commercial genotyping array. Variants are ordered by genomic location. External links lead to the variant RS number in dbSNP and coordinates in the UCSC Genome Browser.
- Download Chip Details - download a file with information on all variants and commercial genotyping array membership.
SNPclip
- Prune a list of variants for linkage disequilibrium.
Input:
- List of RS numbers - this can either be entered one per line in the text entry box or uploaded as a file that contains a list of RS numbers in the first column. A maximum of 5,000 variant RS numbers are permitted. All input variants must be on the same chromosome and match a bi-allelic variant. The text entry field is automatically filled with the contents of the uploaded file.
- Reference population(s) - selected from the drop down menu. At least one 1000 Genomes Project sub-population is required, but more than one may be selected.
- R2 threshold - set R2 threshold for LD pruning. One of each pair of variants with an R2 greater than the threshold is removed. Value needs to be in the range 0 to 1. Default value is 0.1.
- MAF threshold - set MAF threshold for LD pruning. Variants with a MAF less than or equal to the threshold are removed. Value needs to be in the range 0 to 1. Default value is 0.01.
Output:
- LD Thinned Variant List - a list of LD pruned variants based on the specified SNPclip parameters. Each variant in the list is a clickable link that brings up a detailed table showing other variants that are in linkage disequilibrium with the variant. Variants are ordered by input order, so to force a variant to be in the LD thinned list it should be at the top of the input list.
- Details Tables - output table that appears when a variant in the thinned variant list is clicked. Details are provided on genomic position, alleles, and why the variant was included or excluded from the thinned variant list. External links lead to the variant RS number in dbSNP and coordinates in the UCSC Genome Browser.
- Variants with Warnings - output table that appears when a variant produces a warning in SNPclip. Common reasons for a variant to be on the warning list include a variant with an MAF below the MAF threshold, a query that is not an RS number, and an RS number not found in the 1000G reference VCF file. Details are provided on genomic position, alleles, and why the variant was included or excluded from the thinned variant list. External links lead to the variant RS number in dbSNP and coordinates in the UCSC Genome Browser.
- Download Thinned Variant List - download a text file that lists all thinned variant RS numbers, one per line.
- Download Thinned Variant List with Details - download a text file that lists all query variant RS numbers, genomic locations, alleles and details of whether the variant was kept or removed.
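The pruning behavior described above (MAF filter, R2 threshold, and priority given to variants earlier in the input) is consistent with a greedy pass over the input list. The sketch below shows that idea; it is an assumption about the general approach rather than SNPclip's actual code, and the function name, the `pair_r2` lookup, and the example values are hypothetical.

```python
def clip_variants(variants, maf, pair_r2, r2_threshold=0.1, maf_threshold=0.01):
    """Greedy LD pruning in input order.

    variants: RS numbers in input order.
    maf: dict mapping RS number -> minor allele frequency.
    pair_r2: dict mapping frozenset({rs_a, rs_b}) -> R2 (missing pairs = 0).
    Returns (kept list, dict of removed variant -> reason).
    """
    kept, removed = [], {}
    for rs in variants:
        if maf[rs] <= maf_threshold:
            removed[rs] = "MAF at or below threshold"
            continue
        # Compare only against variants already kept, so earlier input wins.
        blocker = next(
            (k for k in kept
             if pair_r2.get(frozenset((k, rs)), 0.0) > r2_threshold),
            None,
        )
        if blocker is not None:
            removed[rs] = f"in LD with {blocker}"
            continue
        kept.append(rs)
    return kept, removed


if __name__ == "__main__":
    # Placeholder identifiers and made-up values, not real LD data.
    maf = {"rs1": 0.30, "rs2": 0.25, "rs3": 0.005, "rs4": 0.40}
    pair_r2 = {
        frozenset(("rs1", "rs2")): 0.8,
        frozenset(("rs1", "rs4")): 0.02,
        frozenset(("rs2", "rs4")): 0.5,
    }
    print(clip_variants(["rs1", "rs2", "rs3", "rs4"], maf, pair_r2))
    print(clip_variants(["rs2", "rs1", "rs3", "rs4"], maf, pair_r2))
```

Running the two calls with the same variants in a different order yields different thinned lists, which is why a variant that must be retained should be placed at the top of the input.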
Frequently Asked Questions
Why is my variant RS number bringing up an error?
LDlink modules only accept input for variant RS numbers that are bi-allelic. Ensure your query SNP has A/C, A/G, A/T, C/G, C/T, or G/T alleles. RS numbers for insertions or deletions (i.e. "indels") are also now accepted as input. If a variant you believe is bi-allelic is not accepted, check dbSNP to ensure there are only two alleles for the variant. Even if there is only one report of a variant being tri- or multi-allelic in dbSNP, the variant is not considered valid input under the current implementation of LDlink.
What 1000 Genomes population should I select for my LDlink query?
Choosing the correct reference 1000 Genomes Project population is essential for comparability of results from LDlink to your population of interest. In general, try to select the sub-population that best matches the ancestry of your study population. While LDlink allows for multiple 1000 Genomes sub-populations to be selected simultaneously, it is recommended to first investigate the patterns of linkage disequilibrium in the query region of each sub-population before considering whether to combine sub-populations in a query.
What is the estimated running time for an LDproxy query?
Running time varies greatly for LDproxy ranging from a few seconds to approximately one minute. For each query, LDproxy actively calculates all LD metrics based on user input. A variety of factors affect overall running time including: (1) the number of haplotypes available for the sub-population(s), (2) the number of dbSNP variants in the region queried, (3) the current utilization of the LDlink server, and (4) the download speed of your internet connection. In general, the best way to speed up an LDlink query is to only select one sub-population per query. Queries that include multiple sub-populations or ancestral groups take much longer to complete than single sub-population queries and tie up limited system resources.
How do I save output files?
Output from a variety of LDlink modules (LDhap, LDmatrix, and LDproxy) is available for download. To save output either (1) right click or long press on the link and select the "Save Link As…" option or (2) open the link in a new browser window and copy and paste the contents to a new file.
Can I save the plots generated in LDmatrix or LDproxy?
Yes. Click the Save icon on the top right of the plot. A preview window will appear. Right click or long press on the preview image, select "Save As…", and choose a filename for the plot image.
Does LDproxy display all variants in the window around the query variant?
For plotting and performance reasons, LDproxy plots only include variants with R2 values greater than 0.01. The LDproxy download likewise only includes variants with R2 values greater than 0.01.
What is the maximum number of variants LDhap and LDmatrix can accept as input?
LDhap has a limit of 30 variants and LDmatrix has a limit of 300 variants. LDmatrix supports up to 1,000 variants via API call. These limits were selected to optimize query speed and visualization of results.
Why are the variants in LDhap and LDmatrix output not in the same order as the input?
Variants in LDhap and LDmatrix output are not ordered based on user input. Rather, results are ordered based on chromosomal order.
Why does my computer seem slow when using LDlink?
The interactive plots produced by LDmatrix and LDproxy require a sizeable amount of data and JavaScript to be loaded into the web browser. On some computers this results in a noticeable lag in performance. To improve performance, simply refresh the LDlink webpage to clear the temporary data from memory. Refreshing the LDlink webpage will remove all interactive plots.
What genome build does LDlink use for genomic coordinates?
LDlink supports GRCh37 (hg19) and GRCh38 (hg38). Choose between GRCh37 (hg19), GRCh38 (hg38), and GRCh38 High Coverage (hg38) 1000 Genomes Project datasets with the Genome Build (1000G) dropdown menu on the top left.
Why do variant coordinates not always match dbSNP or UCSC coordinates?
All coordinates in LDlink are based on variant positions in the 1000 Genomes VCF files. Slight differences in positions may be observed due to updates in mapping and alignment. Please refer to dbSNP for the most up-to-date information on variant coordinates.
What browsers are supported by LDlink?
LDlink has been tested to work with Internet Explorer 10+, Chrome, Firefox 36+ and Safari. LDlink will not display correctly with Internet Explorer 9 and below or Firefox 35 and below.