Id	pipeline/hg38_autism_annotation
Type	annotation_pipeline
Version	0
Summary	Autism Annotation Pipeline for hg38
Description	This pipeline annotates variants in the hg38 assembly with autism-related attributes.
Input reference genome	hg38/genomes/ucsc-hg38

worst_effect

Type:

Worst effect across all transcripts.

source: worst_effect

gene_effects

Type:

Effects per gene. Format: <gene_1>:<effect_1>|... A gene can be repeated.

source: gene_effects

effect_details

Type:

Effect details for each affected transcript. Format: <transcript 1>:<gene 1>:<effect 1>:<details 1>|...

source: effect_details

gene_list

Type: (Internal)

List of all genes

source: gene_list

Annotator type: effect_annotator

Annotator that identifies the effect of a variant on protein-coding sequences.

Resource

Id: hg38/genomes/GRCh38.p14

Type: genome

Summary:

Nucleotide sequence of the GRCh38.p14 genome assembly

Resource

Id: hg38/gene_models/GENCODE/49/basic/PRI

Type: gene_models

Summary:

GENCODE 49, basic gene annotation on the primary assembly (chromosomes and scaffolds) sequence regions
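
The gene_effects and effect_details formats described for this annotator can be unpacked with a few lines of Python. This is a minimal sketch, not part of the pipeline: the function names and the sample values (tx1, tx2, GENE_A, d1, d2) are hypothetical, and it assumes that "|" never occurs inside a field.

```python
def parse_gene_effects(value):
    """Split '<gene_1>:<effect_1>|...' into (gene, effect) tuples."""
    if not value:
        return []
    return [tuple(item.split(":", 1)) for item in value.split("|")]


def parse_effect_details(value):
    """Split '<transcript>:<gene>:<effect>:<details>|...' into dicts."""
    if not value:
        return []
    records = []
    for item in value.split("|"):
        # maxsplit=3 keeps any ':' inside the details field intact.
        transcript, gene, effect, details = item.split(":", 3)
        records.append(
            {
                "transcript": transcript,
                "gene": gene,
                "effect": effect,
                "details": details,
            }
        )
    return records


# Hypothetical values, for illustration only.
pairs = parse_gene_effects("GENE_A:missense|GENE_A:synonymous")
details = parse_effect_details("tx1:GENE_A:missense:d1|tx2:GENE_A:synonymous:d2")
```

Because a gene can be repeated, the result is kept as a list of pairs rather than a dictionary keyed by gene.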

worst_effect_MANE_1.5

Type:

Worst effect across all transcripts.

source: worst_effect

effect_details_MANE_1.5

Type:

Effect details for each affected transcript. Format: <transcript 1>:<gene 1>:<effect 1>:<details 1>|...

source: effect_details

gene_effects_MANE_1.5

Type:

Effects per gene. Format: <gene_1>:<effect_1>|... A gene can be repeated.

source: gene_effects

Annotator type: effect_annotator

Annotator that identifies the effect of a variant on protein-coding sequences.

Resource

Id: hg38/genomes/GRCh38.p14

Type: genome

Summary:

Nucleotide sequence of the GRCh38.p14 genome assembly

Resource

Id: hg38/gene_models/MANE/1.5

Type: gene_models

Summary:

MANE gene model version 1.5

normalized_allele

Type: (Internal)

Normalized allele.

source: normalized_allele

Annotator type: normalize_allele_annotator

No description

Resource

Id: hg38/genomes/ucsc-hg38

Type: genome

Summary:

Nucleotide sequence of the GRCh38/hg38 human genome assembly from UCSC
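
The page does not describe what normalize_allele_annotator does internally. As an illustration only, the sketch below shows the widely used "trim and left-align" normalization of a VCF-style allele against a reference sequence. Whether the annotator follows exactly these steps is an assumption; the function name and the toy sequence are hypothetical.

```python
def normalize_allele(ref_seq, pos, ref, alt):
    """Trim and left-align an allele; pos is 1-based on ref_seq."""
    if ref == alt:
        raise ValueError("ref and alt are identical")
    while True:
        if ref and alt and ref[-1] == alt[-1]:
            # Drop a trailing base shared by both alleles.
            ref, alt = ref[:-1], alt[:-1]
        elif ref and alt:
            # Both alleles are non-empty and end differently: done.
            break
        if not ref or not alt:
            # An allele became empty: extend both by one base to the left.
            if pos <= 1:
                raise ValueError("cannot extend left of the sequence start")
            pos -= 1
            base = ref_seq[pos - 1]
            ref, alt = base + ref, base + alt
    # Drop shared leading bases while both alleles keep at least one base.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# Toy reference with a CA repeat (hypothetical data).
toy = "GGGCACACACAGGG"
deletion = normalize_allele(toy, 8, "CAC", "C")
padded_snp = normalize_allele(toy, 4, "CA", "CT")
```

Normalizing first means that the allele-based lookups later in the pipeline, which take normalized_allele as their input_annotatable, compare alleles in one canonical representation.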

dbSNP_RS

Type:

dbSNP ID (i.e. rs number)

allele_aggregator: list

source: RS

Annotator type: allele_score_annotator

Annotator for scores that depend on the allele, such as variant frequencies.

Mode (mode parameter, applies to VCFAllele inputs only):

allele (default): exact chrom/pos/ref/alt match.
region: aggregates scores for all allele lines overlapping the annotatable's span.

Non-VCFAllele annotatables always use region aggregation.

input_annotatable: normalized_allele

Resource

Id: hg38/scores/dbSNP

Type: allele_score

Summary:

dbSNP: A public database of genetic variations for research and clinical use.
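
The difference between the allele and region modes described for this annotator can be sketched as follows. This is a simplified model under stated assumptions, not the annotator's implementation: ScoreLine, match_allele, match_region, and the rs_a/rs_b/rs_c identifiers are hypothetical, and a score line is assumed to span its reference allele.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreLine:
    chrom: str
    pos: int  # 1-based position of the first reference base
    ref: str
    alt: str
    value: object


def match_allele(lines, chrom, pos, ref, alt):
    """allele mode: keep only exact chrom/pos/ref/alt matches."""
    key = (chrom, pos, ref, alt)
    return [ln.value for ln in lines if (ln.chrom, ln.pos, ln.ref, ln.alt) == key]


def match_region(lines, chrom, start, end):
    """region mode: keep every line overlapping [start, end] (inclusive)."""
    hits = []
    for ln in lines:
        line_end = ln.pos + len(ln.ref) - 1
        if ln.chrom == chrom and ln.pos <= end and line_end >= start:
            hits.append(ln.value)
    return hits


# Hypothetical score lines.
lines = [
    ScoreLine("chr1", 100, "A", "G", "rs_a"),
    ScoreLine("chr1", 100, "A", "T", "rs_b"),
    ScoreLine("chr1", 105, "C", "T", "rs_c"),
]

exact = match_allele(lines, "chr1", 100, "A", "G")
span = match_region(lines, "chr1", 100, 102)
```

With the list aggregator used for dbSNP_RS, the matched values are returned as they are; other attributes in this pipeline reduce them with max instead.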

gnomad_v4_exomes_ALL_af

Type:

Alternate allele frequency

allele_aggregator: max

source: AF

gnomad_v4_exomes_ALL_af_percent

Type:

Alternate allele frequency as percent

allele_aggregator: max

source: AF_percent

Annotator type: allele_score_annotator

Annotator for scores that depend on the allele, such as variant frequencies.

Mode (mode parameter, applies to VCFAllele inputs only):

allele (default): exact chrom/pos/ref/alt match.
region: aggregates scores for all allele lines overlapping the annotatable's span.

Non-VCFAllele annotatables always use region aggregation.

input_annotatable: normalized_allele

Resource

Id: hg38/variant_frequencies/gnomAD_4.1.0/exomes/ALL

Type: allele_score

Summary:

gnomAD v4.1.0 exome variants (ALL)
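
As a small illustration of the two attributes above, the sketch below reduces several matched frequencies with max and expresses a frequency as a percent. The helper names are hypothetical, and skipping missing values before taking the maximum is an assumption.

```python
def aggregate_max(values):
    """Return the largest non-missing value, or None if there is none."""
    present = [v for v in values if v is not None]
    return max(present) if present else None


def as_percent(frequency):
    """Express an allele frequency (0..1) as a percent (0..100)."""
    return None if frequency is None else frequency * 100.0


# Hypothetical frequencies from several matching score lines.
af = aggregate_max([0.125, None, 0.25])
af_percent = as_percent(af)
```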

gnomad_v4_genomes_ALL_af

Type:

Alternate allele frequency

allele_aggregator: max

source: AF

gnomad_v4_genomes_ALL_af_percent

Type:

Alternate allele frequency as percent

allele_aggregator: max

source: AF_percent

Annotator type: allele_score_annotator

Annotator for scores that depend on the allele, such as variant frequencies.

Mode (mode parameter, applies to VCFAllele inputs only):

allele (default): exact chrom/pos/ref/alt match.
region: aggregates scores for all allele lines overlapping the annotatable's span.

Non-VCFAllele annotatables always use region aggregation.

input_annotatable: normalized_allele

Resource

Id: hg38/variant_frequencies/gnomAD_4.1.0/genomes/ALL

Type: allele_score

Summary:

gnomAD v4.1.0 genome variants (ALL)

CLNDN

Type:

ClinVar's preferred disease name for the concept specified by disease identifiers in CLNDISDB

allele_aggregator: list

source: CLNDN

CLNSIG

Type:

Aggregate germline classification for this single variant; multiple values are separated by a vertical bar

allele_aggregator: list

source: CLNSIG

Annotator type: allele_score_annotator

Annotator for scores that depend on the allele, such as variant frequencies.

Mode (mode parameter, applies to VCFAllele inputs only):

allele (default): exact chrom/pos/ref/alt match.
region: aggregates scores for all allele lines overlapping the annotatable's span.

Non-VCFAllele annotatables always use region aggregation.

input_annotatable: normalized_allele

Resource

Id: hg38/scores/ClinVar_20251019

Type: allele_score

Summary:

Measure used to assess the clinical significance of genetic variants
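
Since CLNSIG may hold several values separated by a vertical bar, and the list aggregator may return more than one such string, downstream code often needs a flat list of terms. The sketch below is one way to do that; the function name is hypothetical and the sample strings are for illustration only.

```python
def split_clnsig(values):
    """Flatten '|'-separated CLNSIG strings into unique terms, in order."""
    terms = []
    for value in values:
        for term in value.split("|"):
            if term and term not in terms:
                terms.append(term)
    return terms


# Illustrative input: two aggregated CLNSIG strings.
terms = split_clnsig(["Pathogenic|risk_factor", "Pathogenic"])
```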

phyloP100way

Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

source: phyloP100way

Annotator type: position_score_annotator

Annotator for genomic scores that depend on genomic position, such as phastCons, phyloP, and FitCons2.

Resource

Id: hg38/scores/phyloP100way

Type: position_score

Summary:

Conservation score based on the multiple alignment of 100 species
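
The mean position_aggregator used by the conservation attributes can be pictured as averaging the per-position scores over the span of the annotatable. The sketch below assumes an inclusive 1-based span and assumes that positions without a score are skipped; the function name and the score values are hypothetical.

```python
def mean_over_span(scores, chrom, start, end):
    """Average the scores found at positions start..end (inclusive)."""
    values = [
        scores[(chrom, pos)]
        for pos in range(start, end + 1)
        if (chrom, pos) in scores
    ]
    return sum(values) / len(values) if values else None


# Hypothetical per-position scores; position 101 has no score.
scores = {("chr1", 100): 1.0, ("chr1", 102): 3.0}
mean_score = mean_over_span(scores, "chr1", 100, 102)
```

For a single-nucleotide variant the span is one position, so the mean equals the score at that position.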

phyloP30way

Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

source: phyloP30way

Annotator type: position_score_annotator

Annotator for genomic scores that depend on genomic position, such as phastCons, phyloP, and FitCons2.

Resource

Id: hg38/scores/phyloP30way

Type: position_score

Summary:

Conservation score based on the multiple alignment of 30 species

phyloP20way

Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

source: phyloP20way

Annotator type: position_score_annotator

Annotator for genomic scores that depend on genomic position, such as phastCons, phyloP, and FitCons2.

Resource

Id: hg38/scores/phyloP20way

Type: position_score

Summary:

Conservation score based on the multiple alignment of 20 species

phyloP7way

Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

source: phyloP7way

Annotator type: position_score_annotator

Annotator for genomic scores that depend on genomic position, such as phastCons, phyloP, and FitCons2.

Resource

Id: hg38/scores/phyloP7way

Type: position_score

Summary:

Conservation score based on the multiple alignment of 7 species

phastCons100way

Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

source: phastCons100way

Annotator type: position_score_annotator

Annotator for genomic scores that depend on genomic position, such as phastCons, phyloP, and FitCons2.

Resource

Id: hg38/scores/phastCons100way

Type: position_score

Summary:

Conservation score based on the multiple alignment of 100 species

phastCons30way

Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

source: phastCons30way

Annotator type: position_score_annotator

Annotator for genomic scores that depend on genomic position, such as phastCons, phyloP, and FitCons2.

Resource

Id: hg38/scores/phastCons30way

Type: position_score

Summary:

Conservation score based on the multiple alignment of 30 species

phastCons20way

Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

source: phastCons20way

Annotator type: position_score_annotator

Annotator for genomic scores that depend on genomic position, such as phastCons, phyloP, and FitCons2.

Resource

Id: hg38/scores/phastCons20way

Type: position_score

Summary:

Conservation score based on the multiple alignment of 20 species

phastCons7way

Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

source: phastCons7way

Annotator type: position_score_annotator

Annotator for genomic scores that depend on genomic position, such as phastCons, phyloP, and FitCons2.

Resource

Id: hg38/scores/phastCons7way

Type: position_score

Summary:

Conservation score based on the multiple alignment of 7 species

cadd_raw

Type:

CADD raw score for functional prediction of a SNP. The larger the score, the more likely the SNP has a damaging effect.

allele_aggregator: max

source: cadd_raw

cadd_phred

Type:

CADD phred-like score. This is a phred-like rank score based on whole-genome CADD raw scores. The larger the score, the more likely the SNP has a damaging effect.

allele_aggregator: max

source: cadd_phred

Annotator type: allele_score_annotator

Annotator for scores that depend on the allele, such as variant frequencies.

Mode (mode parameter, applies to VCFAllele inputs only):

allele (default): exact chrom/pos/ref/alt match.
region: aggregates scores for all allele lines overlapping the annotatable's span.

Non-VCFAllele annotatables always use region aggregation.

Resource

Id: hg38/scores/CADD_v1.7

Type: allele_score

Summary:

CADD (Combined Annotation Dependent Depletion score) predicts the potential impact of a SNP

am_pathogenicity

Type:

AlphaMissense Pathogenicity score is a deleteriousness score for missense variants

allele_aggregator: max

source: am_pathogenicity

Annotator type: allele_score_annotator

Annotator for scores that depend on the allele, such as variant frequencies.

Mode (mode parameter, applies to VCFAllele inputs only):

allele (default): exact chrom/pos/ref/alt match.
region: aggregates scores for all allele lines overlapping the annotatable's span.

Non-VCFAllele annotatables always use region aggregation.

Resource

Id: hg38/scores/AlphaMissense

Type: allele_score

Summary:

Functional impact of mutations on protein function

hg19_annotatable

Type: (Internal)

The lifted over annotatable

source: liftover_annotatable

Annotator type: liftover_annotator

Annotator to lift over a variant from one reference genome to another.

Resource

Id: liftover/hg38_to_hg19

Type: liftover_chain

Summary:

Liftover Chain hg38 to hg19

Resource

Id: hg38/genomes/ucsc-hg38

Type: genome

Summary:

Nucleotide sequence of the GRCh38/hg38 human genome assembly from UCSC

Resource

Id: hg19/genomes/ucsc-hg19

Type: genome

Summary:

Nucleotide sequence of the GRCh37/hg19 genome assembly from UCSC

mpc

Type:

Missense badness, PolyPhen-2, and Constraint. A deleteriousness prediction score for missense variants

allele_aggregator: max

source: MPC

Annotator type: allele_score_annotator

Annotator for scores that depend on the allele, such as variant frequencies.

Mode (mode parameter, applies to VCFAllele inputs only):

allele (default): exact chrom/pos/ref/alt match.
region: aggregates scores for all allele lines overlapping the annotatable's span.

Non-VCFAllele annotatables always use region aggregation.

input_annotatable: hg19_annotatable

Resource

Id: hg19/scores/MPC

Type: allele_score

Summary:

MPC (Missense badness, PolyPhen-2, and Constraint) is a composite score that predicts the impact of missense variants.
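
The MPC resource is indexed in hg19 coordinates, so the mpc attribute reads hg19_annotatable, the output of the liftover annotator, rather than the original hg38 variant. The sketch below models that two-step lookup in a deliberately simplified way: it treats the chain as a list of ungapped, same-strand blocks and ignores strand flips and reference-base changes, which a real liftover has to handle. All names, coordinates, and scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    src_chrom: str
    src_start: int  # first source position covered by the block
    dst_chrom: str
    dst_start: int  # target position aligned to src_start
    size: int


def lift_position(blocks, chrom, pos):
    """Map a source position to the target assembly, or return None."""
    for block in blocks:
        if block.src_chrom == chrom and 0 <= pos - block.src_start < block.size:
            return block.dst_chrom, block.dst_start + (pos - block.src_start)
    return None


def annotate_mpc(blocks, mpc_by_allele, chrom, pos, ref, alt):
    """Lift the variant first, then look up the score by lifted allele."""
    lifted = lift_position(blocks, chrom, pos)
    if lifted is None:
        return None  # no liftover, so no score
    return mpc_by_allele.get((*lifted, ref, alt))


# Hypothetical chain block and hg19-indexed score table.
blocks = [Block("chr1", 1000, "chr1", 900, 50)]
mpc_by_allele = {("chr1", 910, "A", "G"): 2.5}

score = annotate_mpc(blocks, mpc_by_allele, "chr1", 1010, "A", "G")
unlifted = annotate_mpc(blocks, mpc_by_allele, "chr1", 2000, "A", "G")
```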

pLI_rank_all

Type:

Gene rank after sorting by pLI intolerance score

source: pLI_rank

pLI_rank_min

Type:

Gene rank after sorting by pLI intolerance score

source: pLI_rank

Annotator type: gene_score_annotator

No description

Resource

Id: gene_properties/gene_scores/pLI

Type: gene_score

Summary:

Probability of Loss-of-Function Intolerance

LOEUF_rank_all

Type:

Gene ranks after sorting by LOEUF scores

source: LOEUF_rank

LOEUF_rank_min

Type:

Gene ranks after sorting by LOEUF scores

source: LOEUF_rank

Annotator type: gene_score_annotator

No description

Resource

Id: gene_properties/gene_scores/LOEUF

Type: gene_score

Summary:

Degree of intolerance to predicted Loss-of-Function (pLoF) variation
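
The page lists the _all and _min attributes with the same description and does not state their aggregators. Judging by the names alone, one plausible reading is that _all keeps the rank of every gene in gene_list while _min keeps the smallest rank. The sketch below illustrates that reading as an assumption; the function name, gene names, and ranks are hypothetical.

```python
def annotate_gene_rank(gene_list, rank_by_gene):
    """Return all known ranks for the genes and the smallest of them."""
    ranks = [rank_by_gene[gene] for gene in gene_list if gene in rank_by_gene]
    return {"all": ranks, "min": min(ranks) if ranks else None}


# Hypothetical ranks; GENE_C has no rank in the table.
rank_by_gene = {"GENE_A": 120, "GENE_B": 15}
result = annotate_gene_rank(["GENE_A", "GENE_B", "GENE_C"], rank_by_gene)
```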

Satterstrom_Buxbaum_Cell_2020_qval

Type:

Multiple hypothesis adjusted p-value for gene-autism association

source: Satterstrom_Buxbaum_Cell_2020_qval

Annotator type: gene_score_annotator

No description

Resource

Id: gene_properties/gene_scores/Satterstrom_Buxbaum_Cell_2020

Type: gene_score

Summary:

TADA derived gene-autism association score

Iossifov_Wigler_PNAS_2015_post_noaut

Type:

Probability that a gene is associated with autism

source: Iossifov_Wigler_PNAS_2015_post_noaut

Annotator type: gene_score_annotator

No description

Resource

Id: gene_properties/gene_scores/Iossifov_Wigler_PNAS_2015

Type: gene_score

Summary:

Probability that a gene is associated with autism

SFARI_gene_score

Type:

Evidence strength supporting a gene's association with autism

source: SFARI Gene Score

Annotator type: gene_score_annotator

No description

Resource

Id: gene_properties/gene_scores/SFARI_gene_score_2024_Q1

Type: gene_score

Summary:

SFARI gene score 2024 Q1 release

autism candidates from Iossifov PNAS 2015

Type:

(239) Iossifov I., et al. Low load for disruptive mutations in autism genes and their biased transmission. PNAS (2015)

source: autism candidates from Iossifov PNAS 2015

autism candidates from Sanders Neuron 2015

Type:

(65) Sanders S., et al. Insights into Autism Spectrum Disorder Genomic Architecture and Biology from 71 Risk Loci. Neuron (2015)

source: autism candidates from Sanders Neuron 2015

Yuen Scherer Nature 2017

Type:

(55) List of genes (unique) from paper YuenSchererNature2017NEW, Supplementary Table 5. All damaging variants in ASD-risk genes.

source: Yuen Scherer Nature 2017

Turner Eichler ajhg 2019

Type:

(52) List of genes from paper TurnerEichlerajhg2019, Supplementary mmc2 Excel file, Sheet: TableS5

source: Turner Eichler ajhg 2019

Satterstrom Buxbaum Cell 2020 top

Type:

(102) List of genes (unique) from paper SatterstromBuxbaumCell_2020, Supplementary mmc2 Excel file, Sheet: 102 ASD

source: Satterstrom Buxbaum Cell 2020 top

Annotator type: gene_set_annotator

This gene set collection annotator uses the autism gene set collection.

Resource

Id: gene_properties/gene_sets/autism

Type: gene_set_collection

Summary:

Autism gene sets derived from publications
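
Conceptually, a gene set attribute answers whether any gene affected by the variant belongs to the named set. The sketch below shows that membership test. The boolean output is an assumption, since the page leaves the attribute types blank, and the function name, set names, and gene names are hypothetical.

```python
def annotate_gene_sets(gene_list, gene_sets):
    """Flag each gene set that contains at least one of the genes."""
    genes = set(gene_list)
    return {name: bool(genes & members) for name, members in gene_sets.items()}


# Hypothetical gene sets.
gene_sets = {
    "set_1": {"GENE_A", "GENE_B"},
    "set_2": {"GENE_X"},
}
flags = annotate_gene_sets(["GENE_B", "GENE_C"], gene_sets)
```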

number_of_deletions_in_SSC_affected

Type:

The number of CNVs overlapping with the annotatable.

source: count

Annotator type: cnv_collection

No description

Resource

Id: hg38/cnv_collections/Iossifov_Lab_SSC_AGRE_2021

Type: cnv_collection

Summary:

De novo CNVs from SSC and AGRE WGS

number_of_duplications_in_SSC_affected

Type:

The number of CNVs overlapping with the annotatable.

source: count

Annotator type: cnv_collection

No description

Resource

Id: hg38/cnv_collections/Iossifov_Lab_SSC_AGRE_2021

Type: cnv_collection

Summary:

De novo CNVs from SSC and AGRE WGS

in_a_SFARI_gene_CNV

Type:

The number of CNVs overlapping with the annotatable.

source: count

Annotator type: cnv_collection

No description

Resource

Id: hg38/cnv_collections/SFARI_gene_CNV

Type: cnv_collection

Summary:

SFARI_Gene CNV collection
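
The count source used by the three CNV attributes is the number of CNVs in the collection that overlap the annotatable. The attribute names suggest that the first two also restrict the collection to deletions or duplications in affected individuals, but the page does not show how that filter is configured. The sketch below models both the overlap test and such a filter as assumptions; the CNV fields, function name, and data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CNV:
    chrom: str
    start: int  # inclusive
    end: int  # inclusive
    kind: str  # "deletion" or "duplication"
    affected: bool


def count_overlapping(cnvs, chrom, start, end, keep=lambda cnv: True):
    """Count CNVs that overlap [start, end] and pass the keep filter."""
    return sum(
        1
        for cnv in cnvs
        if cnv.chrom == chrom
        and cnv.start <= end
        and cnv.end >= start
        and keep(cnv)
    )


# Hypothetical CNV collection.
cnvs = [
    CNV("chr1", 100, 500, "deletion", True),
    CNV("chr1", 400, 900, "duplication", True),
    CNV("chr1", 450, 460, "deletion", False),
]

total = count_overlapping(cnvs, "chr1", 455, 455)
deletions_affected = count_overlapping(
    cnvs, "chr1", 455, 455,
    keep=lambda cnv: cnv.kind == "deletion" and cnv.affected,
)
```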

Filename	Size	md5
genomic_resource.yaml	204.0 B	d2798b17ff4fb927fe4528f2ce48eeec
hg38_autism_annotation.yaml	4.29 KB	deb9a282613b6b66ea54e64cfb26f29e
statistics/
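
The md5 column makes it possible to check a downloaded copy of a file against the listing. A minimal sketch using Python's standard hashlib module follows; the helper name is hypothetical, and the in-memory stream only stands in for a real file opened in binary mode.

```python
import hashlib
import io


def md5_hex(stream, chunk_size=1 << 20):
    """Return the hex MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# Stand-in for: with open("genomic_resource.yaml", "rb") as stream: ...
demo_digest = md5_hex(io.BytesIO(b"hello"))
```

Comparing the returned string with the value in the md5 column shows whether the local file matches the published one.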