Discovering Enzyme Substrates and Functions
Examples of our research and methods include:
Determining protease targets via synthetic peptides
Proteases are the largest class of enzymes performing post-translational modifications in the human proteome. (There are about 550 types of human proteases.) These enzymes cleave the peptide bonds that link amino acids in other proteins, irreversibly changing them, yet the function and substrates of most are not known.
Proteases act directly and indirectly (activating enzyme precursors called proenzymes) in processes such as immune response, blood coagulation, and programmed cell death (apoptosis). Their dysregulation is associated with myriad diseases, making them prime therapeutic targets. Proteases also play key biological roles in pathogens such as HIV, where they cleave proteins for viral replication, again making them potential drug targets.

An HIV-1 protease enzyme. Ball-and-stick figures at center represent the side chains of the enzyme’s catalytic aspartic acid residues. Molecular image made with UCSF Chimera, developed by the UCSF Resource for Biocomputing, Visualization, and Informatics.
One way to identify potential protease substrates is to determine which peptide sequences they cleave in vitro, that is, which amino acids span the cleavage site and are recognized by the enzyme’s active site. These sequences are then used, like partial license plate numbers, to search the proteome for substrates.
To this end, department researchers develop synthetic peptide libraries that sample all 20 amino acid possibilities at each position (positional scanning synthetic combinatorial libraries) and apply them with various methods for detecting peptide cleavage.
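The "partial license plate" search described above can be sketched in a few lines. This is only a minimal illustration, not the department's pipeline: the motif `D.VD`, the function name `find_candidate_sites`, and the toy sequences are all assumptions made up for the example, and a real search would run against a full proteome database and weigh many other factors.

```python
import re

def find_candidate_sites(motif, proteome):
    """Return {protein_name: [0-based start indices]} for every motif match.

    `motif` is a regular expression over one-letter amino acid codes, where
    "." stands for a position at which any residue is tolerated.
    """
    # A lookahead lets overlapping matches be reported as well.
    pattern = re.compile(f"(?=({motif}))")
    hits = {}
    for name, sequence in proteome.items():
        starts = [m.start() for m in pattern.finditer(sequence)]
        if starts:
            hits[name] = starts
    return hits

# Toy sequences, invented for illustration only.
toy_proteome = {
    "protein_A": "MKDEVDGSLLDEVDA",
    "protein_B": "MSTNPKPQRK",
    "protein_C": "AADQVDGG",
}

# Hypothetical four-residue recognition motif with one tolerant position.
hits = find_candidate_sites("D.VD", toy_proteome)
print(hits)
```

Each hit is only a candidate: a sequence match says the site could be recognized, not that the protein is cleaved in a cell.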

Positional scanning libraries of synthetic tetrapeptides combine each of the 20 amino acids at three positions (“Mix”) while one position is held fixed. A fluorogenic leaving group (ACC) reveals bond cleavage activity by a protease.
Fluorescence detection
One approach uses tetrapeptides (four linked amino acids) attached to a fluorogenic leaving group, a molecular fragment that departs when a bond is cleaved and signals the cleavage by fluorescing. One of the four positions holds a fixed amino acid residue, while the other three are cycled through the 20 possibilities. Thus, during assays against a given protease, the role of the fixed-position residue can be analyzed independently of the randomized ones.
Researchers here improved this method by synthesizing a leaving group that is much brighter, increasing assay sensitivity and allowing for full randomization of all four residues (160,000 different amino acid combinations).
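The library sizes mentioned above follow from simple counting, which the sketch below works through. It assumes the 20 standard amino acids and a layout with one sub-library per combination of fixed position and fixed residue; that layout and the function names are illustrative assumptions, not a description of how the libraries are actually synthesized or pooled.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
PEPTIDE_LENGTH = 4                    # tetrapeptides

def positional_scanning_sublibraries(length=PEPTIDE_LENGTH):
    """Yield (fixed_position, fixed_residue) for each assumed sub-library."""
    for position in range(length):
        for residue in AMINO_ACIDS:
            yield position, residue

def sublibrary_members(position, residue, length=PEPTIDE_LENGTH):
    """Yield every peptide with `residue` fixed at `position` and the rest varied."""
    choices = [residue if i == position else AMINO_ACIDS for i in range(length)]
    for combo in product(*choices):
        yield "".join(combo)

n_sublibraries = sum(1 for _ in positional_scanning_sublibraries())
n_per_sublibrary = sum(1 for _ in sublibrary_members(0, "A"))
n_fully_randomized = len(AMINO_ACIDS) ** PEPTIDE_LENGTH

print(n_sublibraries)      # 4 positions x 20 residues = 80
print(n_per_sublibrary)    # 20^3 = 8000 peptides share one fixed residue
print(n_fully_randomized)  # 20^4 = 160000 distinct tetrapeptides
```

The last figure matches the 160,000 combinations quoted for full randomization of all four residues.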
Multiplex mass spectrometry profiling
Department researchers have also developed a substrate-profiling assay that combines advances in mass spectrometry sensitivity with a hypothesis drawn from many experiments employing substrate libraries: that recognition (and ultimately cleavage) of a substrate by a protease requires only two amino acids suitably positioned in the substrate sequence. In some cases these amino acid pairs sit on either side of the bond that gets cleaved (e.g. HIV protease), while in others the pair members are distant from each other and from the cleaved bond (e.g. granzyme B).
This reporter-free method of detecting protease products was tested using five protease families against a diverse, synthesized library of tetradecapeptides (14 amino acids). The results matched or expanded upon the enzymes’ known specificity sequences and, via mass spectrometry quantification of proteolysis products, also measured their relative activity.

A library of peptides that contains all neighbor (XY) and near-neighbor pairs (X*Y), with a defined X and Y from among 20 amino acids and the asterisk representing a random choice.
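A small sketch can make the pair-coverage idea in this caption concrete. It treats X and Y as an ordered pair, giving 20 × 20 = 400 neighbor pairs (XY) and 400 near-neighbor pairs (X*Y); the ordered-pair reading, the function names, and the two toy 14-mers are assumptions for illustration and are not taken from the actual library design.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def pairs_in_peptide(peptide):
    """Return (neighbor, near_neighbor) sets of ordered residue pairs in a peptide."""
    # XY: residues at adjacent positions i and i+1.
    neighbor = {(peptide[i], peptide[i + 1]) for i in range(len(peptide) - 1)}
    # X*Y: residues at positions i and i+2, with any residue in between.
    near = {(peptide[i], peptide[i + 2]) for i in range(len(peptide) - 2)}
    return neighbor, near

def missing_pairs(library):
    """Return the XY and X*Y pairs that no peptide in `library` contains."""
    all_pairs = set(product(AMINO_ACIDS, repeat=2))  # 400 ordered pairs
    seen_neighbor, seen_near = set(), set()
    for peptide in library:
        neighbor, near = pairs_in_peptide(peptide)
        seen_neighbor |= neighbor
        seen_near |= near
    return all_pairs - seen_neighbor, all_pairs - seen_near

# Two invented tetradecapeptides; a real library would need far more.
toy_library = ["ACDEFGHIKLMNPQ", "RSTVWYACDEFGHI"]
missing_xy, missing_x_y = missing_pairs(toy_library)
print(len(missing_xy), len(missing_x_y))
```

A check like this shows how many pairs a candidate peptide set still leaves uncovered, which is the property the caption attributes to the finished library.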

Proteases are added and cleavage products are detected in samples at different time intervals via mass spectrometry.
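As a rough illustration of how time-course product detection can be read out, the sketch below maps each detected product back to the bond that must have been cut and records the earliest time point at which each site appears. The data, the function name `cleavage_position`, and the restriction to single-cut products are all assumptions for the example; the actual assay identifies and quantifies products from mass spectra, which is not modeled here.

```python
def cleavage_position(substrate, product):
    """Return the index of the bond cut to release `product`, or None.

    A return value of k means the bond between substrate[k-1] and substrate[k].
    Only single-cut products (a proper prefix or suffix of the substrate) are
    handled; internal fragments from two cuts are ignored in this sketch.
    """
    if not product or len(product) >= len(substrate):
        return None
    if substrate.startswith(product):
        return len(product)
    if substrate.endswith(product):
        return len(substrate) - len(product)
    return None

substrate = "ACDEFGHIKLMNPQ"  # invented 14-residue peptide

# Invented observations: time point in minutes -> product sequences detected.
observations = {
    15: ["ACDEF"],
    60: ["ACDEF", "GHIKLMNPQ"],
    240: ["ACDEF", "GHIKLMNPQ", "NPQ"],
}

first_seen = {}
for minutes in sorted(observations):
    for product in observations[minutes]:
        site = cleavage_position(substrate, product)
        if site is not None:
            # Keep only the earliest time point at which each site shows up.
            first_seen.setdefault(site, minutes)

print(first_seen)
```

Sites that appear at earlier time points are cut more readily under these toy conditions, a crude stand-in for the relative activity that quantitative mass spectrometry measures.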
Substrate specificities of enzymes
Both methods have been used by department scientists, in collaboration with others, to help profile the substrate specificity of enzymes such as:
- Cathepsin K, a cysteine protease implicated in osteoporotic bone breakdown—revealing sequences used to make a selective inhibitor.
- The major polyprotein processing protease of the coronavirus associated with SARS, leading to the identification of inhibitor leads.
- Cysteine proteases of malaria parasites, including those that degrade the host hemoglobin needed for the parasite’s protein synthesis.
- Elastase, secreted by the worm that causes schistosomiasis (a parasitic disease second to malaria in overall morbidity), finding it likely to be the primary protease that enables the parasite to penetrate the skin.

Cleavage sites in a human hemoglobin molecule for two protease enzymes used by an organism that causes malaria. Red marks sites within molecule helices and blue within loop regions.