Discovering Enzyme Substrates and Functions

There are four research areas in the Department of Pharmaceutical Chemistry. Discovering enzyme substrates and functions is a research challenge within chemical biology and medicinal chemistry.

The challenge

Before researchers can intervene in biological signaling or metabolic pathways gone awry in a disease, they need to discover the roles of key players—the enzymes that catalyze chemical reactions at each step. But the biological functions of many enzymes remain unknown.

One way to discover an enzyme’s function is to determine its substrate specificity: which particular proteins or small molecules are recognized and bound by its catalytic cavities, known as active sites.

Determining substrates can also be a first step toward creating molecules that will inhibit enzymes, as drugs based on substrate analogs or mimics may competitively bind to the active site and thus block its catalytic action.

Examples of our research and methods include:

Determining protease targets via synthetic peptides

Proteases are the largest class of enzymes performing post-translational modifications in the human proteome. (There are about 550 types of human proteases.) These enzymes cleave the peptide bonds that link amino acids in other proteins, irreversibly changing them, yet the function and substrates of most are not known.

Proteases act directly and indirectly (activating enzyme precursors called proenzymes) in processes such as immune response, blood coagulation, and programmed cell death (apoptosis). Their dysregulation is associated with myriad diseases, making them prime therapeutic targets. Proteases also play key biological roles in pathogens such as HIV, where they cleave proteins for viral replication, again making them potential drug targets.

HIV-1 protease enzyme

An HIV-1 protease enzyme. Ball-and-stick figures at center represent the side chains of the enzyme’s catalytic aspartic acid residues. Molecular image made with UCSF Chimera, developed by the UCSF Resource for Biocomputing, Visualization, and Informatics.


One way to identify potential protease substrates is to determine which peptide sequences a protease cleaves in vitro, that is, which amino acids span the cleavage site and are recognized by the enzyme’s active site. These sequences are then used, like partial license plate numbers, to search the proteome for substrates.
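The "partial license plate" search can be pictured as a pattern match over protein sequences. The following is a minimal sketch, not the department's actual pipeline: the motif, the toy proteome, and the function names are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical cleavage-site preference: for each position, the residues a
# protease tolerated in vitro. "X" means any residue is accepted.
# These values are illustrative, not measured data.
MOTIF = ["ILV", "E", "X", "D"]


def motif_to_regex(motif):
    """Turn a per-position residue list into a compiled regular expression."""
    parts = ["." if allowed == "X" else f"[{allowed}]" for allowed in motif]
    # The lookahead lets overlapping matches be reported as well.
    return re.compile("(?=(" + "".join(parts) + "))")


def find_candidate_sites(proteome, motif):
    """Yield (protein_id, 1-based start, matched sequence) for every hit."""
    pattern = motif_to_regex(motif)
    for protein_id, sequence in proteome.items():
        for match in pattern.finditer(sequence):
            yield protein_id, match.start() + 1, match.group(1)


# Toy "proteome" with made-up sequences, for demonstration only.
toy_proteome = {
    "protA": "MKTIEADGGSLEPDAA",
    "protB": "MSSGGAAKK",
}

hits = list(find_candidate_sites(toy_proteome, MOTIF))
print(hits)  # [('protA', 4, 'IEAD'), ('protA', 11, 'LEPD')]
```

A hit from such a search is only a candidate: whether the site is accessible and actually cleaved in the folded protein still has to be tested experimentally.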

To this end, department researchers develop and apply synthetic peptide libraries that sample all 20 amino acid possibilities at each position (positional scanning synthetic combinatorial libraries), and then use various methods to detect peptide cleavage.

Positional scanning libraries

Positional scanning libraries of synthetic tetrapeptides combine each of the 20 amino acids at three positions (“Mix”) while one position remains fixed. A fluorogenic leaving group (ACC) reveals bond cleavage activity by a protease.

Fluorescence detection

One approach uses tetrapeptides (four linked amino acids) linked to a fluorogenic leaving group—a molecular fragment that departs and indicates a bond cleavage by glowing. One of the four positions contains the same amino acid residue, while the other three are cycled through the 20 possibilities. Thus, during assays against a given protease, the role of the fixed position residue can be analyzed independently of the randomized ones.

Researchers here improved this method by synthesizing a leaving group that is much brighter, increasing assay sensitivity and allowing for full randomization of all four residues (160,000 different amino acid combinations).
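The library sizes mentioned above follow from simple counting, which the sketch below makes explicit. It assumes that each of the four positions is scanned in turn with each of the 20 standard amino acids; the function names are hypothetical, and the code describes the combinatorics only, not the synthesis.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PEPTIDE_LENGTH = 4                     # tetrapeptides


def positional_scanning_sublibraries(length=PEPTIDE_LENGTH):
    """List every (fixed position, fixed residue) sublibrary."""
    return [(pos, aa) for pos in range(1, length + 1) for aa in AMINO_ACIDS]


def sublibrary_size(length=PEPTIDE_LENGTH):
    """Number of distinct peptides in one sublibrary (one position fixed)."""
    return len(AMINO_ACIDS) ** (length - 1)


def fully_randomized_size(length=PEPTIDE_LENGTH):
    """Number of distinct peptides when every position is varied."""
    return len(AMINO_ACIDS) ** length


def expand_sublibrary(fixed_pos, fixed_aa, length=PEPTIDE_LENGTH):
    """Yield every sequence in one sublibrary (fixed_pos is 1-based)."""
    for combo in product(AMINO_ACIDS, repeat=length - 1):
        residues = list(combo)
        residues.insert(fixed_pos - 1, fixed_aa)
        yield "".join(residues)


sublibraries = positional_scanning_sublibraries()
print(len(sublibraries))        # 80
print(sublibrary_size())        # 8000
print(fully_randomized_size())  # 160000
```

Under these assumptions, a positional scan reads out 80 pooled sublibraries of 8,000 peptides each, while full randomization covers all 20 × 20 × 20 × 20 = 160,000 sequences.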

Multiplex mass spectrometry profiling

Department researchers have also developed a substrate-profiling assay that combines advances in mass spectrometry sensitivity with a hypothesis drawn from many experiments employing substrate libraries: that recognition (and ultimately cleavage) of a substrate by a protease requires only two amino acids suitably positioned in the substrate sequence. In some cases these amino acid pairs sit on either side of the bond that gets cleaved (e.g., HIV protease), while other pairs are distant from each other and from the cleaved bond (e.g., granzyme B).

This reporter-free method of detecting protease products was tested using five protease families against a diverse, synthesized library of tetradecapeptides (14 amino acids). The results matched or expanded upon the enzymes’ known specificity sequences and, via mass spectrometry quantification of proteolysis products, also measured their relative activity.
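Because no reporter group is attached, a cleavage event is recognized from the masses of the two fragments it produces. The sketch below lists the possible single-cleavage products of one peptide together with their monoisotopic masses. It assumes unmodified peptides built from the 20 standard residues; the example sequence and function names are made up, and real assays must also account for charge states, modifications, and instrument tolerance.

```python
# Monoisotopic residue masses in daltons (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O, added once per free peptide


def peptide_mass(sequence):
    """Monoisotopic mass of a linear, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def cleavage_products(peptide):
    """Yield (bond index, N-terminal product, C-terminal product).

    Bond index i means the bond between residue i and residue i + 1
    (1-based), so a 14-mer has 13 possible single-cleavage events.
    """
    for i in range(1, len(peptide)):
        yield i, peptide[:i], peptide[i:]


example = "AGKLIEPDSGTWAR"  # made-up 14-residue peptide, illustration only

for bond, n_term, c_term in cleavage_products(example):
    print(bond, n_term, f"{peptide_mass(n_term):.4f}",
          c_term, f"{peptide_mass(c_term):.4f}")
```

Hydrolysis adds one water molecule, so the two product masses always sum to the parent mass plus the mass of water, which gives a quick consistency check on any assignment.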

Library of peptides

A library of peptides that contains all neighbor (XY) and near-neighbor pairs (X*Y), with a defined X and Y from among 20 amino acids and the asterisk representing a random choice.


Proteases are added and cleavage products are detected in samples at different time intervals via mass spectrometry.
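The pair notation in the library description above (XY for neighbors, X*Y for near-neighbors) can be made concrete with a short sketch that extracts both kinds of pair from a peptide and measures how much of the pair space a library covers. The function names and the toy library are hypothetical; this illustrates the counting idea and is not the design procedure used for the actual library.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def pairs_in_peptide(peptide):
    """Return (neighbor pairs XY, near-neighbor pairs X*Y) found in a peptide."""
    neighbor = {(peptide[i], peptide[i + 1]) for i in range(len(peptide) - 1)}
    near = {(peptide[i], peptide[i + 2]) for i in range(len(peptide) - 2)}
    return neighbor, near


def pair_coverage(library):
    """Fraction of all ordered pairs seen as XY and as X*Y across a library."""
    all_pairs = set(product(AMINO_ACIDS, repeat=2))  # 20 x 20 = 400 pairs
    seen_neighbor, seen_near = set(), set()
    for peptide in library:
        neighbor, near = pairs_in_peptide(peptide)
        seen_neighbor |= neighbor
        seen_near |= near
    return (len(seen_neighbor & all_pairs) / len(all_pairs),
            len(seen_near & all_pairs) / len(all_pairs))


# Toy library of made-up peptides; a real library would be far larger.
toy_library = ["ACDE", "EDCA"]
print(pairs_in_peptide("ACDE"))
print(pair_coverage(toy_library))
```

With 20 amino acids there are 400 ordered XY pairs and 400 ordered X*Y pairs to cover. A 14-residue peptide contributes at most 13 neighbor and 12 near-neighbor pairs, so full coverage needs a set of well-chosen sequences rather than a handful.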

Substrate specificities of enzymes

Both methods have been used by department scientists, in collaboration with others, to help profile the substrate specificity of enzymes such as:

  • Cathepsin K, a cysteine protease implicated in osteoporotic bone breakdown—revealing sequences used to make a selective inhibitor.
  • The major polyprotein processing protease of the coronavirus associated with SARS, leading to the identification of inhibitor leads.
  • Cysteine proteases of malaria parasites, including those that degrade the host hemoglobin needed for the parasite’s protein synthesis.
  • Elastase, secreted by the worm that causes schistosomiasis (a parasitic disease second to malaria in overall morbidity), finding it likely to be the primary protease providing for parasitic penetration of the skin.


Cleavage sites in a human hemoglobin molecule for two protease enzymes used by an organism that causes malaria. Red marks cleavage sites within the molecule’s helices, and blue marks sites within loop regions.

  • Red = cut within helix
  • Blue = cut within loop