International standards and resources

Standards for bacterial gene nomenclature were proposed by Demerec et al in 1966 (A proposal for a uniform nomenclature in bacterial genetics. Genetics 54:61–76) and are still in use.

Nomenclature committees for other organisms set guidelines, including approved gene names, symbols, case and type style.

Plant gene nomenclature is set on a species basis by various organisations (see examples in list below).

Mouse and rat gene nomenclature is set by the International Committee on Standardized Genetic Nomenclature for Mice.

Guidelines for human gene nomenclature are published by the HUGO Gene Nomenclature Committee (HGNC).

Other nomenclature committees and databases include:

The Vertebrate Gene Nomenclature Committee (VGNC), an extension of the HGNC, assigns gene names to vertebrate species that do not have a gene nomenclature committee. Gene names are available for the complete gene set of some species, but only for selected gene families in a number of other species (see the VGNC species list).

For those writing scholarly articles, the instructions to authors for ASM journals provide further advice and references.

There is no international standard for naming gene products, but the HUGO Gene Nomenclature Committee makes recommendations about how proteins should be referred to. In addition, many journals specify nomenclature for gene products.

Australian conventions and resources

In Australia, gene and gene product terminology follows international standards.

Gene names and symbols

Caution! Gene naming is very complex. Consult a specialist!

Gene names usually describe the function of the gene. Use lower case, apart from proper nouns and acronyms:

haemoglobin gene     alpha 1 gene

Gene symbols vary in different organisms. International nomenclature committees for specific organisms set guidelines, including approved gene names, symbols, case and type style.

Bacteria

Bacterial genes are named using a 3-letter designation, usually an abbreviation for the pathway or for the phenotype of mutants. The 3-letter designation is written in lower case and italics. Different genes that affect the same pathway are distinguished by a capital letter following the 3-letter designation (without a space). An allele number can also be added (also without a space) to designate a particular mutation:

lac     lacZ     lacZ19
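The structure of a bacterial gene designation can be sketched in code. The following is a minimal illustration only, assuming plain-text input (italics cannot be checked in a plain string). The pattern and the function name `parse_bacterial_gene` are hypothetical, not part of any nomenclature standard, and the sketch also accepts an allele number without a gene letter, which the guidance above does not address:

```python
import re

# Assumed pattern: 3 lower-case letters, then an optional capital letter
# (the individual gene), then an optional allele number, with no spaces.
BACTERIAL_GENE = re.compile(r"([a-z]{3})([A-Z])?(\d+)?")


def parse_bacterial_gene(designation):
    """Split a designation such as 'lacZ19' into its parts.

    Returns None if the text does not fit the assumed pattern.
    """
    match = BACTERIAL_GENE.fullmatch(designation)
    if match is None:
        return None
    locus, gene_letter, allele = match.groups()
    return {"locus": locus, "gene_letter": gene_letter, "allele": allele}


for example in ("lac", "lacZ", "lacZ19", "Lac"):
    print(example, parse_bacterial_gene(example))
```

A designation that begins with a capital letter, such as a phenotype, does not fit the pattern, so the function returns None for it.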

It is important to distinguish the phenotype of a bacterial strain from its genotype. The phenotype is usually indicated with the same 3-letter designation as the genotype, but phenotypes begin with a capital letter and are not italicised. Wild-type alleles can be designated by superscript ‘+’ or ‘–’:

Lac⁺     Cys⁻

Other designations can also be added using superscripts, but must be defined:

Strʳ [streptomycin resistance]

Plants

Plant gene nomenclature follows the same basic guidelines as animal gene nomenclature.

Animals, including humans

A gene symbol should be no more than 6 characters, and most guidelines now call for 3-letter symbols. Most guidelines specify capital letters and italics, but there are a number of exceptions (e.g. mice and rats). Symbols should start with the first letter of, and reflect, the gene name. Greek symbols and roman numerals should be avoided (arabic numbers are acceptable), as should commas, hyphens and superscripted or subscripted characters:

HBA1 [hemoglobin, alpha 1 gene]
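As a rough illustration, the general rules above could be checked with a short script. This is a sketch under stated assumptions, not an official validator: the function name `check_general_symbol` is hypothetical, the check covers only the capital-letter convention (so it will flag valid mouse and rat symbols), and italics cannot be tested in plain text:

```python
import re


def check_general_symbol(symbol, max_length=6):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if len(symbol) > max_length:
        problems.append("longer than the maximum length")
    # The symbol should start with a (capital) letter of the Latin alphabet.
    if re.fullmatch(r"[A-Z]", symbol[:1]) is None:
        problems.append("does not start with a capital letter")
    # Greek letters, commas, hyphens and other characters are rejected here.
    if re.fullmatch(r"[A-Z0-9]*", symbol) is None:
        problems.append("contains characters other than capital letters and arabic digits")
    return problems


for example in ("HBA1", "CREBBP", "HB-A1", "HBα1"):
    print(example, check_general_symbol(example))
```

The function reports problems rather than returning a simple yes or no, so that a writer can see which rule a symbol appears to break before consulting the relevant nomenclature committee.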

Human gene symbols follow these general rules and are italicised:

CREBBP [CREB binding protein gene]

HTT [Huntington disease gene]

Mouse and rat gene names follow the same basic rules as for other species, but gene symbols have a slightly different format. The main difference is that mouse and rat gene symbols have only the initial letter capitalised. Hyphens, superscripts and subscripts are also allowed when referring to alleles or pseudogenes:

Tlr2 [toll-like receptor 2 gene]

Hba-ps3 [hemoglobin alpha pseudogene 3]

For flies, the gene name and symbol are in sentence case if the gene is named after the protein or if the gene was first named for a mutant phenotype that is dominant to the wild-type phenotype. A gene name and symbol start with a lower-case letter if the gene was first named for a mutant phenotype that is recessive to the wild-type phenotype. Gene symbols are italicised:

Actin 5C [gene name]     Act5C [gene symbol]

will die slowly [gene name]     wds [gene symbol]

Zebrafish gene names are in lower case and italicised. The gene symbols are also in lower case and italicised, and use 3 or more letters:

engrailed 1a [gene name]     eng1a [gene symbol]

If there is a need to distinguish between species (i.e. for homologues), include the species in brackets after the gene name:

LFNG (Drosophila) [lunatic fringe homologue of Drosophila]

Gene products

To refer to an mRNA or cDNA gene product, use the gene symbol followed by the type of gene product:

HTT mRNA     CREBBP cDNA

The HUGO Gene Nomenclature Committee recommends that the protein symbol is the same as the gene symbol, but without italics:

CREBBP [gene symbol]     CREBBP [protein symbol]

For mouse and rat proteins, designations follow the same rules as gene symbols, but protein symbols use all capital letters and are not italicised:

TLR2 [protein symbol for toll-like receptor 2 (gene symbol Tlr2)]
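The relationship between gene symbols and protein symbols described above can be sketched as follows. This is an illustration only: the function name `protein_symbol` and the organism labels are hypothetical, only the human and mouse or rat conventions mentioned in this section are covered, and the removal of italics is a typesetting step that a plain string cannot show:

```python
def protein_symbol(gene_symbol, organism):
    """Derive the plain-text protein symbol from a gene symbol.

    Only the conventions described in this section are covered; any other
    organism raises an error rather than guessing.
    """
    if organism in ("mouse", "rat"):
        # Mouse and rat protein symbols use all capital letters.
        return gene_symbol.upper()
    if organism == "human":
        # HGNC recommendation: same characters as the gene symbol.
        return gene_symbol
    raise ValueError(f"no convention recorded for organism: {organism}")


print(protein_symbol("Tlr2", "mouse"))    # TLR2
print(protein_symbol("CREBBP", "human"))  # CREBBP
```

Raising an error for unlisted organisms is a deliberate choice in this sketch, because conventions differ between species and should be checked against the relevant guidelines.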