
Extended sialylated O-glycan repertoire of human urinary glycoproteins discovered and characterized using EThcD


Academic year: 2022


AUTHOR NAMES

Zsuzsanna Darula1*, Ádám Pap1,2, Katalin F. Medzihradszky1*

CORRESPONDING AUTHORS

Zsuzsanna Darula

e-mail: darula.zsuzsanna@brc.mta.hu

Katalin F. Medzihradszky

e-mail: medzihradszky.katalin@brc.mta.hu

ADDRESS

1Biological Research Centre of the Hungarian Academy of Sciences, Temesvari krt. 62., H-6726 Szeged, Hungary

2Doctoral School in Biology, Faculty of Science and Informatics, University of Szeged, Kozep fasor 52., H-6726 Szeged, Hungary


ABSTRACT

EThcD, a relatively novel activation technique, was used in the LC-MS/MS analysis of tryptic glycopeptides enriched with wheat germ agglutinin from human urine samples. We focused on the characterization of mucin-type O-glycopeptides. A single EThcD spectrum provided information on both the modified peptide and the glycan carried. Unexpectedly, glycan oxonium ions indicated the presence of O-acetyl- and even O-diacetyl-sialic acids. B and Y fragment ions revealed that i) in core 1 structures containing a single O-acetyl-sialic acid, this residue was linked to the Gal; ii) several glycopeptides featured core 1 glycans with disialic acids, in certain instances O-acetylated; iii) the disialic acid was linked to the GalNAc residue regardless of the degree of O-acetylation; iv) core 2 isomers with a single O-acetyl-sialic acid were chromatographically resolved. Glycan fragmentation also helped to decipher additional core 2 oligosaccharides: a LacdiNAc-like structure, glycans carrying sialyl LewisX/A at different stages of O-acetylation, and blood antigens. A sialo core 3 structure was also identified. We believe this is the first study in which such structures were characterized from a very complex mixture and were linked not only to specific proteins but also to specific modification sites.

KEYWORDS

affinity chromatography, blood-type antigen, disialic acid, EThcD, mass spectrometry, O-acetyl-sialic acid, O-glycopeptides, urinary glycoproteins, wheat germ agglutinin


INTRODUCTION

Protein glycosylation is a common post-translational modification that is responsible for a wide range of biochemical interactions. Glycans can be C-, N-, O- or S-linked, and hence a range of amino acid side-chains harbor these linkages1. O-glycosylation may modify Ser and Thr side-chains within the cytosol as well as within the Golgi1,2. While intracellular proteins feature a single GlcNAc as modification, the Golgi-derived glycans are much more complex, with alternative cores that are usually elongated1,2. In mammals the GalNAc-core (mucin-type) glycosylation is the most common1,2. Mucin-type O-glycosylation of Tyr residues has also been reported recently3. It was observed early on that a given sequence position is not necessarily always glycosylated (macroheterogeneity) and may feature several different glycans (microheterogeneity)1. For technical reasons, the glycans decorating glycoproteins are frequently studied after being released4-7. This permits the fractionation and in-depth characterization of these oligosaccharides; however, the site-specific information is lost in the process. With the advent of very sensitive, high resolution mass spectrometers and new activation techniques, such as ETD (electron transfer dissociation8) and EThcD (electron-transfer/higher-energy collision dissociation9), the focus is shifting towards intact glycopeptide or even intact glycoprotein (or at least middle-down) analysis10. While mass spectrometry is suitable for identifying most PTM-bearing sequences in an unbiased manner, collision-induced dissociation (CID), the most commonly used MS/MS activation method for obtaining structural information on peptides, performs poorly for O-glycopeptide characterization. The culprit is the glycosidic bond, which is more prone to fragmentation than the peptide bond; thus, fragment ions resulting from sugar cleavages dominate the spectra. The glycan is usually completely released from the peptide, without leaving any clue behind on the originally modified amino acid. As a result, little to no information is available to assign the glycosylation site(s)10. Hence ion trap CID, i.e. resonance activation, only delivers information about the glycan size and composition and the peptide size, while beam-type CID (HCD) may produce peptide sequence information, but information on the glycan(s) may be more limited than in ion trap CID because of their extensive fragmentation10. Electron transfer dissociation usually results solely in peptide backbone fragmentation; thus, both the amino acid sequence of the modified peptide and the modification site(s) can be assigned8. However, ETD spectra frequently yield incomplete sequence ion series, and thus only limited information is available about the glycans present (i.e. it cannot be decided whether the glycopeptide features two small glycans or a larger oligosaccharide of the same combined sugar composition) and their location10. In the newest activation method, EThcD, electron transfer is performed first, then the newly generated fragments and the intact (activated) precursor ions are subjected to beam-type collisional activation9. In this process only some of the ETD-generated fragments undergo further breakdown (such as w ion formation11), but the precursor ions are definitely cleaved9. Most of this fragmentation results from glycosidic bond cleavages; however, the activation is gentler than in normal HCD, so the resulting glycan fragments are more abundant and larger structures also survive12. Thus, we gain information about both the peptides modified and the oligosaccharides featured.

In this manuscript we demonstrate that the information content of EThcD spectra enabled the characterization of unexpected/unusual sialylated structures on intact glycopeptides that were enriched from the urine of healthy individuals. As far as we know this is the first time that such structures have been identified from a complex biological matrix, in a high throughput manner, with protein site specificity. The presented method opens new possibilities for the discovery of novel glycan structures with site specificity on proteins to reveal their biological function and potential as diagnostic biomarkers.


EXPERIMENTAL

Here we utilized a glycopeptide enrichment strategy as previously described12. We would like to emphasize that no acetate buffer or acetic acid-containing solution was used in any part of the protocol. Briefly, tryptic digests of urine from 4 healthy donors (collected with appropriate consents approved by the Hungarian Scientific and Research Ethics Committee, approval number: 1011/16) were subjected to a 2-round glycopeptide enrichment using a wheat germ agglutinin affinity column. Two glycopeptide fractions were collected: one representing the shoulder of the flow-through peak and one eluted by GlcNAc. Fractions were analyzed separately by LC-MS/MS using a Waters M-Class nanoUPLC coupled on-line to an Orbitrap Fusion Lumos mass spectrometer operated in positive ion mode. Peptides loaded onto a trap column (Waters Acquity UPLC M-Class Symmetry C18 180 μm × 20 mm column, 5-μm particle size, 100-Å pore size; flow rate 10 μl/min) were separated by a linear gradient of 10-30% B in 60 min (Waters Acquity UPLC M-Class BEH C18 75 μm × 250 mm column, 1.7-μm particle size, 130-Å pore size; solvent A: 0.1% formic acid/water; solvent B: 0.1% formic acid/ACN; flow rate: 300 nl/min). MS/MS data were acquired using HCD product-ion dependent EThcD data acquisition mode: the HexNAc-specific oxonium ion, m/z 204.0867, among the 20 most abundant HCD fragments triggered EThcD acquisition. HCD spectra were acquired at 28% NCE, while supplemental activation in EThcD was set at 15% NCE. Precursors with z=3-5 were considered (intensity threshold: 10^6) according to a decision tree with decreasing charge state and increasing precursor m/z, in a total cycle time of 3 s. Some samples were also analyzed using direct injection (i.e. without trapping) and allowing z=2 precursors. All measurements were performed in the Orbitrap, with a resolution of 60000 and 15000 for MS1 and MS/MS, respectively.

Separate HCD and EThcD peaklists were generated using Proteome Discoverer (Thermo Scientific, v2.2.0.388) requiring at least 40 peaks per spectrum. EThcD data were filtered for the presence of m/z 292.1027 within the top 20 most abundant fragment ions (required mass accuracy: 10 ppm) using the MS-Filter software of Protein Prospector13, and retained spectra were searched using the Protein Prospector search engine with the following settings: enzyme: semitrypsin with maximum 1 missed cleavage site; database: human subset of the Swissprot database (version 2017.9.19, 20219 sequences) concatenated with the same number of random sequences; mass accuracy: 5 ppm for precursor ions, 10 ppm for fragment ions; fixed modification: carbamidomethylation (Cys); variable modifications: acetylation (protein N-terminus), cyclization (N-terminal Gln), oxidation (Met), and a set of 28 glycan structures representing core 1 and core 2 O-glycans with up to 4 sialic acids/glycan (Supporting Information 1, Table S-1) on Ser/Thr. All variable modifications were defined as “common”, allowing 2 variable modifications per peptide. The 80 most abundant peaks were considered from each MS/MS spectrum. Glycopeptide identifications were accepted applying 5% and 1% FDR on the protein and peptide level, respectively, and setting a SLIP-score14 threshold of 6. All identifications presented were validated manually.
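The two diagnostic-ion criteria described above (the HexNAc oxonium ion triggering EThcD, and the sialic acid oxonium ion filter applied to the peaklists) share the same logic: a target m/z must occur among the N most abundant fragments within a ppm tolerance. A minimal Python sketch of that check follows. It is an illustration only, not the instrument method or the MS-Filter implementation; the function names and the example peak lists are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def has_diagnostic_ion(peaks, target_mz, top_n=20, tol_ppm=10.0):
    """Return True if target_mz is matched among the top_n most intense peaks.

    peaks: list of (mz, intensity) tuples (hypothetical input format).
    """
    top = sorted(peaks, key=lambda p: p[1], reverse=True)[:top_n]
    return any(abs(ppm_error(mz, target_mz)) <= tol_ppm for mz, _ in top)


HEXNAC_OXONIUM = 204.0867  # value quoted in the text

# Hypothetical spectra used only to exercise the function.
glyco_like = [(204.0865, 5.0e5), (366.1390, 1.0e5), (138.0550, 8.0e4)]
off_mass = [(204.0900, 1.0e5)]  # about 16 ppm away from the target
buried = [(100.0 + i, 1000.0) for i in range(20)] + [(204.0867, 10.0)]

print(has_diagnostic_ion(glyco_like, HEXNAC_OXONIUM))
print(has_diagnostic_ion(off_mass, HEXNAC_OXONIUM))
print(has_diagnostic_ion(buried, HEXNAC_OXONIUM))
```

The same function could be called with a sialic acid oxonium m/z as `target_mz` to mimic the peaklist filter; the rank cut-off matters because a low-abundance ion outside the top N does not count as a match.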

The raw data files have been uploaded to MassIVE (https://massive.ucsd.edu/ProteoSAFe/static/massive.jsp); the project identifier is MSV000083070.

RESULTS

The workflow

We intend to gain information on the site-specific O-glycosylation of human proteins, and for this purpose the analysis of intact glycopeptides from different sources is necessary. In this study tryptic glycopeptides were enriched from the urine of healthy donors. Previous studies on urinary glycoproteins, applying the sialic acid release and capture method15 or the glycoblotting-assisted O-glycomics approach16,17, indicated that core 2 and disialylated core 1 O-glycans are present in considerable amounts. As glycopeptides bearing these structures would not be enriched by the lectins frequently used for O-glycopeptide analysis, including Jacalin and peanut agglutinin, we used wheat germ agglutinin (WGA), which binds a wide array of glycan structures18-20 including both N- and O-glycopeptides. The glycopeptide mixtures were analyzed applying HCD product ion-dependent EThcD with mild supplemental activation (15% NCE), and EThcD data were interpreted using Protein Prospector12. As deduced from the number of MS/MS spectra acquired, the two-round lectin enrichment was highly efficient: ≥ 94% of the acquired HCD spectra featured the diagnostic glycan oxonium ion m/z 204 in all collected WGA fractions12. N- and O-glycopeptides were identified in separate searches (for details see 12 and Online Resources 3-4 therein). As the vast majority of the glycopeptides represented sialylated glycans12, we used peak lists filtered for the presence of the diagnostic m/z 292. For O-glycopeptide identification asialo, mono- and disialo core 1 and the disialo core 2 O-glycan structures were considered. Approximately 2/3 of the O-glycosylated sequences reported earlier for urinary proteins15 were identified along with several new, mostly multiply glycosylated peptides, indicating that WGA affords efficient enrichment of O-glycopeptides. However, the overall identification rate was rather low (~5%)12. One of the obvious reasons is that, in order to keep false discovery rates at acceptable levels, only 2 variable modifications per peptide were permitted in these searches. Thus, multiply modified peptides could be (mis)assigned only if the combined sugar composition of their glycans equaled a permitted combination. Obviously, glycopeptides with glycans other than the specified structures were also missed.

Iterative search for unexpected/unusual structures

In order to learn what additional structures occur on the enriched glycopeptides, a non-specified modification search18,19 was performed on proteins whose presence was already ascertained: any modification up to 3000 Da was allowed on Asn, Ser and Thr residues. From the recurring modification masses and manual inspection of some corresponding MS/MS spectra we concluded that structures containing disialic acids or O,N-diacetylneuraminic acid(s) were also present and needed to be considered.

Thus, a more complete glycan database was designed (Table S-1) and a new database search was performed with these accurately defined structures as variable modifications. Then the assigned spectra were carefully inspected. We observed earlier with O-glycopeptides bearing sialyl core 1 and core 2 structures that the mild supplemental activation during EThcD enables the detection of higher-mass oxonium ions, sometimes even the intact glycan12. We hypothesized that such ions together with Y-type fragments (for nomenclature see 21) could be used to decipher the glycan structures, for example, the linkage position of the unusual disialic acid moiety and the O,N-diacetylneuraminic acids. In addition, our observations could be used to establish fragmentation rules for the identification of novel structures.

The final list of unexpected sialic acid-containing glycoforms derived from the above searches is presented in Table 1. Table S-2 and Figures S-1-29 contain additional information about the identifications and display the supporting MS/MS data. As the focus of this manuscript is to demonstrate how EThcD data can be used for glycan structure elucidation, only fragmentation relevant to the glycan structures is detailed below (although we would like to point out that peptide side-chain w ions, which may aid peptide identification, were observed in several spectra). Here we would like to emphasize that our data do not distinguish isomeric sugar units, nor do they resolve the position or the stereochemistry of the glycan linkages. These assignments (if presented) are based on assumptions and on our general knowledge of mucin-type glycosylation. However, as described below, the MS/MS spectra usually provided sufficient information on the direct connection of the sugar units or on the branching of the glycans.

Table 1. Glycopeptides carrying O-glycans with O-acetyl- and/or disialic acid(s)1

Glycoform columns of the table: Tetra3 (1, 2), Penta3 (0, 1, 2, 3), Hexa3 (1, 2). The '+' marks are listed per peptide as extracted; their alignment to the individual glycoform columns is not preserved.

Uniprot Acc # | Protein Name | Peptide2 | Glycoforms marked
------------- | ------------ | -------- | -----------------
Q9GZM5 | Protein YIPF3 | 342AVAVTLQSH350 | + + + + + + + +
P01344 | Insulin-like growth factor II | 93DVSTPPTVLPDNFPR107 | + +
P78423 | Fractalkine | 175AQDGGPVGTELF186 | +
P78423 | Fractalkine | 175AQDGGPVGTELFR187 | +
P78423 | Fractalkine | 247VWGQGQSPRPE257 | +
P78423 | Fractalkine | 247VWGQGQSPRPENSLER262 | + +
P78423 | Fractalkine | 263EEMGPVPAHTDAF275 | + +
Q9BZL3 | Small integral membrane protein 3 | 1MDAVSQVPMEVVLPK15 | + +
P98160 | Basement membrane-specific heparan sulfate proteoglycan core protein | 33SLPEDIETVTASQMR47 | +
P98160 | Basement membrane-specific heparan sulfate proteoglycan core protein | 48WTHSYL53 | +
Q13508 | Ecto-ADP-ribosyltransferase 3 | 316ILEPTQIPGMK326 | +
Q9H2B2 | Synaptotagmin-4 | 9EEFDEIPTVVGIF21 | +
Q8TBP5 | Membrane protein FAM174A | 50PRTLPPLPPGPTPAQQPGR68 | +
P09603 | Macrophage colony-stimulating factor 1 | 370VGPVRPTGQDWNHTPQK386 | +
P14209 | CD99 antigen | 29SDALPDNENKKPTAIPK45 | +
P60022 | Beta-defensin 1 | 22GNFLTGLGHR31 | +

1 Identified by Protein Prospector and validated manually.

2 Ser/Thr residues printed in red denote the site of glycosylation.

3 Tetra, penta and hexa stand for the total number of monosaccharides within the glycan structure. Numbers in the next row specify the number of NeuAcAc residues in the glycan. Proposed structures are shown in Table S-1 and in Figures S-1-29.

EThcD fragmentation ‘rules’ at low normalized collision energy

As a general observation, the gentle supplemental energy (NCE: 15%) applied in EThcD typically seemed to favor single bond cleavages. Glycan fragmentation yielded non-reducing end B ions, frequently even larger oligosaccharides (up to 6 residues) were detected12, and internal fragments were less prominent than in HCD (acquired with NCE: 28%). Figures S-1-29 display the corresponding HCD and EThcD spectra for each assigned glycopeptide; in certain instances we highlighted the fragmentation differences. These B ions together with reducing end Y fragments usually provided insight into the glycan structure, as detailed below.

O-acetyl-sialic acid-containing glycans

Fragments at m/z 334.113 and 316.103 indicated the presence of 9-O-acetyl,N-acetylneuraminic acid (Neu5,9Ac2 according to convention, but defined as NeuAcAc in the database searches) within some glycan structures (m/z 316 represents water-loss from Neu5,9Ac2). Obviously, from these data it cannot be determined which position was acetylated; however, this structure is the most likely1. Screening for these fragments with MS-Filter of Protein Prospector13 showed that ~5% of all EThcD spectra featured at least one of these oxonium ions. Database searches allowing Neu5,9Ac2 (NeuAcAc) containing glycans yielded confident identifications (Table 1, Table S-2, Figures S-1-29).
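The oxonium ion values quoted above can be reproduced from monosaccharide residue masses. The following Python sketch assumes singly protonated B-type ions (sum of residue masses plus one proton) and standard monoisotopic residue masses; the dictionary keys and function names are illustrative and are not taken from Protein Prospector.

```python
# Monoisotopic residue masses in Da (standard values, stated here as assumptions).
RESIDUE_MASS = {
    "Hex": 162.052824,
    "HexNAc": 203.079373,
    "Fuc": 146.057909,
    "NeuAc": 291.095417,
    "Ac": 42.010565,  # one additional O-acetyl group (C2H2O)
}
PROTON = 1.007276
WATER = 18.010565


def glycan_mass(composition):
    """Additive (residue) mass of a glycan composition, e.g. {"NeuAc": 1, "Ac": 1}."""
    return sum(RESIDUE_MASS[unit] * n for unit, n in composition.items())


def oxonium_mz(composition, water_losses=0):
    """m/z of the singly charged oxonium (B-type) ion, optionally after water loss."""
    return glycan_mass(composition) + PROTON - water_losses * WATER


neu59ac2 = {"NeuAc": 1, "Ac": 1}  # NeuAcAc in the database searches
print(round(oxonium_mz(neu59ac2), 3))                  # 334.113
print(round(oxonium_mz(neu59ac2, water_losses=1), 3))  # 316.103
```

Treating the O-acetyl group as a separate "Ac" unit keeps the table small and lets the same function describe sialic acids with one or more extra acetyl groups.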

Neu5,9Ac2 was always found in terminal position, as evidenced by the Y ions formed via Neu5,9Ac2-loss from the charge-reduced precursors. Confidently assigned glycopeptides featured tetra-, penta- and hexasaccharides with up to three Neu5,9Ac2 residues/glycan (Table 1). However, manual evaluation altered the one assignment that had indicated the presence of an ‘internal’ Neu5,9Ac2 (see below).

In the core 1 tetrasaccharides containing one Neu5,9Ac2, this sugar was Gal-linked as evidenced by the B2 ion representing GalNeu5,9Ac2 (at m/z 496.166) and the corresponding Y1α ion, at m/z 710.345(2+) (Figure 1). This B2 ion in EThcD is usually of higher abundance than the GalNAcNeuAc internal fragment (at m/z 495.182) that is more prominent in HCD (see Figure S-1). Actually, the GalNeu5,9Ac2 B2 ion may be overlooked in HCD as the isotopic peak of the internal fragment. The ion at m/z 990.34 represents the intact oligosaccharide.
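To illustrate why the B2 ion can be mistaken for an isotope peak, the short Python sketch below compares the expected first isotope peak of the internal fragment (m/z 495.182) with the B2 ion (m/z 496.166). It assumes singly charged ions and the usual 13C-12C spacing of 1.003355 Da; the function name is hypothetical.

```python
C13_SPACING = 1.003355  # mass difference between 13C and 12C in Da


def isotope_gap(mono_mz, other_mz, charge=1):
    """Distance between the first isotope peak of mono_mz and other_mz.

    Returns (gap in mDa, gap in ppm relative to other_mz).
    """
    first_isotope = mono_mz + C13_SPACING / charge
    gap = first_isotope - other_mz
    return gap * 1000.0, gap / other_mz * 1e6


gap_mda, gap_ppm = isotope_gap(495.182, 496.166)
print(f"{gap_mda:.1f} mDa, {gap_ppm:.0f} ppm")
```

Under these assumptions the two signals lie roughly 19 mDa (about 39 ppm) apart, so whether they appear as separate peaks depends on the resolving power at that m/z and on their relative abundance.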

Figure 1. EThcD spectrum of m/z 638.954(3+), identified as 342AVAVT(HexNAcHexNeuAcNeuAcAc)LQSH350 of Protein YIPF3. The Gal of the mucin-type core 1 structure is capped with a Neu5,9Ac2 moiety. (This sugar is listed as NeuAcAc in the glycan library for database searches.) Oxonium and related ions are labeled according to the CFG recommendations; for the reducing end fragments, printed in red, the Domon-Costello nomenclature is followed21. The sialic acid loss from the core GalNAc yields the Y fragment. z+1 peptide ions are distinguished with asterisks. Peptide fragments in blue indicate that some glycan fragmentation occurred in EThcD: the loss of the NeuAc and GalNeu5,9Ac2 from the c8 fragment, respectively. The asterisk-labeled ion is the charge-reduced form of a doubly charged coeluting molecule.

In a disialo core 2 hexasaccharide with one Neu5,9Ac2, this capping residue can be linked to either of the galactoses. Protein Prospector delivered the same identification for EThcD data acquired at different time points from precursor ion m/z 760.664(3+): 342AVAVT(HexNAc2Hex2NeuAcNeuAcAc)LQSH350 of Protein YIPF3 (Q9GZM5) (Figures S-7 & S-8). Both spectra display the intact glycan fragment (at m/z 1355.46), confirming the presence of a hexasaccharide as opposed to two smaller structures, and in both the Neu5,9Ac2 is Gal-linked (see m/z 496.166). However, as Figure 2 shows in detail, two isomeric glycoforms were present in this sample. In the earlier eluting, more abundant glycoform, the Neu5,9Ac2 is capping the GlcNAc-Gal arm (designated as ‘β-chain’, Figure 2, lower panel). This arm can readily be removed by a single cleavage yielding an oxonium ion (B) of HexNAcHexNeuAcAc composition at m/z 699.245, and the corresponding Y fragment at m/z 791.372(2+). These ions are missing from the other isoform, where the Neu5,9Ac2 is capping the core GalNAc-linked Gal (designated as ‘α-chain’, Figure 2, upper panel). Thus, when the bond between the core GalNAc and its modifying GlcNAc is cleaved, an abundant B ion can be observed at m/z 657.234, and the corresponding Y fragment was recorded at m/z 812.378(2+). The other fragments are of identical composition, but they are the products of different bond cleavages as indicated in their assignments. In addition, the relative intensities of the two Y fragments are different; it seems that the cleavage between the core GalNAc and Gal is preferred to that between the GlcNAc and the Gal linked to it. Here again, we would like to point out that the internal fragmentation of the glycan is less extensive than in HCD: m/z 366.139 representing HexNAcHex is more abundant in the HCD spectra, and the HexNAc2 fragment (m/z 407.166) was not detected in EThcD (see Figures S-7-9).
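The isomer assignment described above rests on whether the HexNAcHexNeuAcAc B ion (m/z 699.245) and its Y counterpart (m/z 791.372, 2+) are present. A minimal Python sketch of that decision follows. It is a simplified illustration that ignores intensities; the function names and the two example peak lists are hypothetical, built only from m/z values quoted in the text.

```python
def has_peak(peaks_mz, target_mz, tol_ppm=10.0):
    """True if any observed m/z lies within tol_ppm of target_mz."""
    return any(abs(mz - target_mz) / target_mz * 1e6 <= tol_ppm for mz in peaks_mz)


# Diagnostic ions quoted in the text for the disialo core 2 hexasaccharide.
BETA_CHAIN_IONS = (699.245, 791.372)   # B (HexNAcHexNeuAcAc) and its Y(2+)
ALPHA_CHAIN_IONS = (657.234, 812.378)  # B (HexNAcHexNeuAc) and its Y(2+)


def classify_core2_isomer(peaks_mz, tol_ppm=10.0):
    """Assign the arm carrying Neu5,9Ac2 from the presence of diagnostic ions."""
    if all(has_peak(peaks_mz, mz, tol_ppm) for mz in BETA_CHAIN_IONS):
        return "beta-chain"
    if all(has_peak(peaks_mz, mz, tol_ppm) for mz in ALPHA_CHAIN_IONS):
        return "alpha-chain"
    return "undetermined"


# Hypothetical peak lists for the early- and late-eluting glycoforms.
early = [657.231, 699.245, 747.366, 791.372, 812.380, 925.508]
late = [657.234, 747.363, 812.378, 925.512]

print(classify_core2_isomer(early))  # beta-chain
print(classify_core2_isomer(late))   # alpha-chain
```

The β-chain ions are tested first because ions at m/z 657.23 and 812.38 may also appear in the β-chain spectrum as products of other cleavages, so only the m/z 699.245 and 791.372 pair discriminates between the two forms.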

Figure 2. Part of the EThcD spectra acquired from m/z 760.664(3+) at RT = 14.95 min (lower panel) and at RT = 18.29 min (upper panel). Both spectra were assigned as 342AVAVT(HexNAc2Hex2NeuAcNeuAcAc)LQSH350 of Protein YIPF3 (for full spectra see Figures S-7 & S-8). (NeuAcAc stands for Neu5,9Ac2 in the glycan library for database searches.) However, the glycan fragments revealed isomeric glycoforms. The diagnostic fragment ions and the underlying cleavages are indicated in matching colors.

Finally, an identification indicated the presence of a pentasaccharide structure with three NeuAcAc residues (Figure 3). Although the additive mass of the glycan (1364.45 Da) is correct, we found this identification highly unlikely, as it suggests the presence of a non-terminal Neu5,9Ac2 considering the structure of the pentasaccharide glycans (see below). Careful investigation of the EThcD spectra revealed that three different sialic acids modified the core 1 glycan: the Gal is capped with a Neu5,9Ac2, while the GalNAc is linked directly to a NeuAc that is further elongated with a sialic acid bearing 2 ‘extra’ acetyl groups. This is likely an O-4,9-diacetyl-N-acetylneuraminic acid1. The presence of such a neuraminic acid derivative, perhaps Neu4,5,9Ac3, was confirmed by its proper oxonium ion and the corresponding water-loss fragment at m/z 376.124 and 358.112, respectively (both within 4 ppm of the theoretical m/z values). The GalNeu5,9Ac2 linkage was validated by the Y fragment representing the loss of the hexose along with the terminal sialic acid, at m/z 897.899(2+). The GalNAc-linked NeuAcNeu4,5,9Ac3 structure is supported by the observation of the above described Y fragment, and by the consecutive losses of Neu4,5,9Ac3 and NeuAcNeu4,5,9Ac3 at m/z 957.924(2+) and 1623.747, respectively.
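The Y fragment values quoted above can be checked by subtracting the lost sugar residues from the neutral mass of the precursor. The Python sketch below does this for the m/z 763.992(3+) precursor. It assumes that the charge-reduced fragments can be treated as even-electron, fully protonated species, and it uses standard monoisotopic residue masses; the function names are illustrative.

```python
PROTON = 1.007276
HEX = 162.052824
NEUAC = 291.095417
ACETYL = 42.010565  # one additional O-acetyl group


def neutral_mass(mz, charge):
    """Neutral monoisotopic mass from a protonated precursor m/z."""
    return (mz - PROTON) * charge


def y_ion_mz(precursor_mz, precursor_charge, lost_mass, fragment_charge):
    """m/z of a Y fragment formed by neutral loss of sugar residues."""
    remaining = neutral_mass(precursor_mz, precursor_charge) - lost_mass
    return (remaining + fragment_charge * PROTON) / fragment_charge


neu59ac2 = NEUAC + ACETYL
neu459ac3 = NEUAC + 2 * ACETYL

# Losses discussed in the text for the pentasaccharide-bearing glycopeptide.
loss_gal_sia = y_ion_mz(763.992, 3, HEX + neu59ac2, 2)    # reported: 897.899(2+)
loss_triacetyl = y_ion_mz(763.992, 3, neu459ac3, 2)       # reported: 957.924(2+)
loss_disialo = y_ion_mz(763.992, 3, NEUAC + neu459ac3, 1)  # reported: 1623.747

for value in (loss_gal_sia, loss_triacetyl, loss_disialo):
    print(round(value, 3))
```

Note that three Neu5,9Ac2 residues and the combination NeuAc + Neu5,9Ac2 + Neu4,5,9Ac3 have the same elemental composition, so the glycan mass alone cannot distinguish them; only the fragment ions can.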

Figure 3. EThcD spectrum of m/z 763.992(3+) manually assigned as 342AVAVTLQSH350, modified at Thr-346 with a core 1 glycan that features a Neu5,9Ac2 on the Gal residue, and a NeuAc-Neu4,5,9Ac3 disialo-unit on the core GalNAc. ‘Pr’ indicates the different forms of the precursor ion. z+1 peptide fragments are distinguished with asterisks. The oxonium ions identifying the three different sialic acids are labeled with the CFG symbols; for the reducing end fragments, printed in red, the Domon-Costello nomenclature is followed21. The GalNAc modifications were considered the β-arm. The asterisk-labeled ion is the charge-reduced form of a doubly charged coeluting molecule.

Scheme 1. Fragmentation of the glycan structure assigned from the spectrum in Figure 3.

Disialic acid-containing glycans

Obviously, there is an overlap between the unusual glycopeptides featuring O-acetyl-sialic acids and the ones bearing core 1 structures modified with a disialic acid unit (Table 1). The terminal residue of this disialic unit may be NeuAc or may bear additional O-acetyl groups, i.e. it may be Neu5,9Ac2 or Neu4,5,9Ac3 (as described above). The disialic acid unit usually produces a relatively abundant oxonium ion at m/z 583.197 or 625.210 for (NeuAc)2 or NeuAcNeu5,9Ac2, respectively. Filtering for the presence of these ions with MS-Filter13 showed that ~2% of all EThcD spectra represented glycopeptides with disialic acid within the glycan structure. The composition of the modifying glycan was frequently confirmed by an oxonium ion representing the intact glycan, at m/z 1239.43 or 1281.44.

EThcD fragmentation of a glycopeptide with a pentasaccharide glycan is illustrated in Figure 4. The glycan carries three “simple” NeuAc units. Y ions representing neutral losses of (NeuAc)2 at m/z 1156.04(2+) and of HexNeuAc at m/z 1220.56(2+) prove that the glycan is branched and that the disialic acid is linked to the core GalNAc. The inter-sialic acid bond is confirmed by the (NeuAc)2 oxonium ion (B) at m/z 583.197. The oxonium ion representing the intact glycan at m/z 1239.41 confirms that a single glycan modifies the peptide. In a similar fashion we could decipher that (whenever there was conclusive evidence) the disialic acid was always GalNAc-linked. Limited internal fragmentation of the glycan was observed for this class of glycopeptides as well. While in HCD the internal fragment representing GalNAcNeuAc (m/z 495.182) was usually abundant, in EThcD the B2 ion representing GalNeuAc (m/z 454.155) was dominant (for example, see Figures S-10-12).

Figure 4. EThcD spectrum of m/z 965.092(3+), identified as 93DVSTPPT(HexNAcHexNeuAc3)VLPDNFPR107 of Insulin-like growth factor II. The precursor ion and its charge-reduced form are labeled with ‘pr’. The nonreducing end fragments are labeled with cartoons according to the CFG recommendations; the reducing end fragments, printed in red, follow the Domon-Costello nomenclature21. The inset shows the intact glycan, and its oxonium ion is labeled by a red asterisk. The other asterisk-labeled ion is the charge-reduced form of a doubly charged coeluting molecule. For full peaklist see Figure S-10.

Interestingly, if only one Neu5,9Ac2 was present in the glycan structure, it was always Gal-linked in the tetrasaccharides, while in pentasaccharides the GalNAc-linked disialic acid was terminated with it.


Additional unusual glycans

Finally, the database search permitting unspecified modifications delivered two intriguing suggestions. First, it identified a glycopeptide with a glycan mass of 1062.39 Da (Figure 5). This mass corresponds to a HexNAc3HexNeuAc glycan composition, which is most unusual for an O-glycan (the calculated mass is 1062.386, within 2 ppm). The observed non-reducing end B and reducing end Y carbohydrate fragments suggest that this unusual glycan might be a Galβ3(GalNAcβ4GlcNAcβ6)GalNAc, the LacdiNAc-like structure described by Jin et al.22 from a human gastric mucin sample. We detected this structure modified with a sialic acid on the galactose (Figure 5, Scheme 2). More specifically, the fragment ion at m/z 407.165 confirms the inter-HexNAc linkage and also indicates that this is a terminal HexNAc-HexNAc structure. Although m/z 407 can, in principle, also be formed as an internal fragment of core 2 glycans, it was not detected in their respective EThcD spectra (see Figures S-7-9). Similarly, m/z 454.154 indicates a terminal HexNeuAc group. These ions together with reducing end Y ions unambiguously show that the glycan is branched and that a HexNeuAc and a HexNAc2 are linked to the core GalNAc. The B3 fragment representing the intact glycan was detected at m/z 1063.398.
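Translating an unassigned modification mass into a sugar composition, as done above for the 1062.39 Da modification, amounts to a small combinatorial search. The Python sketch below enumerates compositions within a ppm tolerance. It is an illustration under stated assumptions: the residue masses are standard monoisotopic values, the count limits are arbitrary, and the function name is hypothetical rather than a Protein Prospector feature.

```python
from itertools import product

# Monoisotopic residue masses in Da (standard values, stated here as assumptions).
RESIDUE_MASS = {
    "Hex": 162.052824,
    "HexNAc": 203.079373,
    "Fuc": 146.057909,
    "NeuAc": 291.095417,
    "Ac": 42.010565,  # extra O-acetyl group; only meaningful together with NeuAc
}
# Arbitrary upper limits per unit, chosen only to keep the search small.
MAX_COUNT = {"Hex": 4, "HexNAc": 4, "Fuc": 2, "NeuAc": 4, "Ac": 3}


def candidate_compositions(target_mass, tol_ppm=5.0):
    """List all compositions whose additive mass matches target_mass within tol_ppm."""
    units = list(RESIDUE_MASS)
    hits = []
    for counts in product(*(range(MAX_COUNT[u] + 1) for u in units)):
        mass = sum(RESIDUE_MASS[u] * n for u, n in zip(units, counts))
        if mass == 0:
            continue
        if abs(mass - target_mass) / target_mass * 1e6 <= tol_ppm:
            hits.append(dict(zip(units, counts)))
    return hits


# Modification mass reported by the search engine for the peptide in Figure 5.
hits = candidate_compositions(1062.388)
for composition in hits:
    print({unit: n for unit, n in composition.items() if n})
```

A mass match only proposes a composition; as the text stresses, the branching and linkages still have to come from the B and Y fragments.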

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(17)

[Figure 5: EThcD spectrum, intensity (×10^4) versus m/z, shown in two panels covering approximately m/z 200 to 1000 and m/z 1000 to 1900; peptide c, z and w ions, glycan B and Y ions, and the precursor (pr) are labeled.]

Figure 5. EThcD spectrum of m/z 627.634(3+). The database search permitting unspecified modifications identified it as 45VATT(1062.388)VISK52 of Plasma protease C1 inhibitor. The intact glycan oxonium ion confirms that the peptide is modified by a single oligosaccharide of HexNAc3HexNeuAc composition. The glycan fragments helped to decipher the branching and linkages within the oligosaccharide, a core 2 based LacdiNAc-like structure22. Scheme 2 illustrates which bonds are cleaved. z* indicates a z+1 peptide fragment. The asterisk-labeled ion is the charge-reduced form of a doubly charged coeluting molecule.

Scheme 2. Fragmentation of the glycan structure assigned from the spectrum in Figure 5.

The second fascinating identification was 345SLTVS(1370.497)LGPVSKT(HexNAcHexNeuAc)EGFPK361 from Protein HEG homolog 1 (Q9ULI3), where the modification mass suggested by the search engine corresponded to a HexNAc3Hex2FucNeuAc composition, which could represent a core 2 structure bearing an A-type antigen (see later). However, an abundant oxonium ion at m/z 674.252 (HexNAcHex2Fuc) and the similarly abundant corresponding Y fragment at m/z 1034.16 (3+) strongly suggested the presence of a core 2 glycan carrying a blood-type B antigen. The assigned modifications and the fragments observed contradicted each other. Our investigation revealed that a series of unassigned doubly charged fragment ions could be identified as z ions (z7-z12) from the mass differences between them. The mass shift of these fragments (697 Da) clearly indicated that Thr-356 carries a HexNAc2NeuAc oligosaccharide, most likely a sialylated core 3 structure, i.e. a GlcNAc3(NeuAc)GalNAc, instead of the core 1 trisaccharide. The oxonium ion of this glycan was subsequently identified at m/z 698.266. Thus, the ‘leftover’ glycan on Ser-349 features a HexNAc2Hex3FucNeuAc composition that is indeed a core 2 structure decorated with the B blood-type antigen (Figure 6, Scheme 3). The donor’s blood type was AB.
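The reassignment works because both interpretations distribute the same set of sugar units over the two sites, so the precursor mass cannot distinguish between them; only the fragments can. A minimal sketch of this bookkeeping is given below, assuming standard monoisotopic residue masses (not quoted in the text) and hypothetical variable names.

```python
from collections import Counter

# Monoisotopic residue masses in Da; standard values assumed for illustration.
RESIDUE_MASS = {
    "HexNAc": 203.079373,
    "Hex": 162.052824,
    "Fuc": 146.057909,
    "NeuAc": 291.095417,
}
PROTON = 1.007276


def glycan_mass(composition):
    """Sum of residue masses for a {unit: count} composition."""
    return sum(RESIDUE_MASS[unit] * n for unit, n in composition.items())


# Search engine suggestion: glycans on Ser-349 and Thr-356, respectively.
search_engine = [
    Counter(HexNAc=3, Hex=2, Fuc=1, NeuAc=1),
    Counter(HexNAc=1, Hex=1, NeuAc=1),
]
# Manual assignment based on the z-ion series and the oxonium ions.
manual = [
    Counter(HexNAc=2, Hex=3, Fuc=1, NeuAc=1),
    Counter(HexNAc=2, NeuAc=1),
]

total_search = sum(search_engine, Counter())
total_manual = sum(manual, Counter())
print("same overall composition:", total_search == total_manual)

thr_glycan_mass = glycan_mass(manual[1])          # mass shift on the z ions
thr_oxonium = thr_glycan_mass + PROTON            # reported at m/z 698.266
ser_glycan_mass = glycan_mass(manual[0])          # 'leftover' glycan
b_antigen_oxonium = glycan_mass(Counter(HexNAc=1, Hex=2, Fuc=1)) + PROTON

print(f"Thr-356 glycan: {thr_glycan_mass:.3f} Da, oxonium {thr_oxonium:.3f}")
print(f"Ser-349 glycan: {ser_glycan_mass:.3f} Da")
print(f"HexNAcHex2Fuc oxonium: {b_antigen_oxonium:.3f}")
```

The two site-specific masses of the manual assignment add up to the same total as the search engine's pair, which is why the error only became apparent from the oxonium ions and the z-ion mass shifts.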

Figure 6. EThcD spectrum of m/z 944.178 (4+) manually assigned as 345SLTVS*LGPVSKT*EGFPK361 from Protein HEG homolog 1 (Q9ULI3), with a core 2 structure carrying a B-antigen on Ser-349 and a sialyl core 3, GlcNAc3(NeuAc)GalNAc, on Thr-356. The intact sugar oxonium of this glycan was detected at m/z 698.266. Its modification site was determined from the series of doubly charged z7-z12 fragments. Scheme 3 illustrates bond cleavages within the larger glycan structure. The precursor ion and its charge-reduced form are labeled with ‘pr’; the charge-reduced form of a coeluting doubly charged component is indicated with an asterisk. Y ions reflect the fragmentation of the core 2 glycan carrying the B antigen.

Scheme 3. Fragmentation of the mucin core 2 glycan carrying blood-type antigen B structure assigned from the spectrum in Figure 6.

Targeted glycoform identification for a specific peptide

These unusual structures prompted us to investigate further whether additional glycoforms were present for the peptide that already featured the highest microheterogeneity: 342AVAVTLQSH350 of Protein YIPF3. Using the MS-filter option of Protein Prospector13 with one of the data files representing the earlier eluting glycopeptides, we identified all HCD spectra that displayed the gas-phase deglycosylated Y0 fragment (m/z 925.510) among the 5 most abundant ions, with a mass measurement error below 10 ppm. The potential glycan masses were then calculated for each such component, and 4 new glycan structures emerged with compositions of HexNAc3Hex2FucNeuAc, HexNAc2Hex2FucNeuAc2, HexNAc2Hex2FucNeuAcNeuAcAc and HexNAc2Hex2Fuc(NeuAcAc)2, represented by incremental monoisotopic masses of 1370.499, 1458.514, 1500.516 and 1542.537 Da, respectively (Table S-3). When these values were entered as modifications within the sequence and the MS-product predictions (Protein Prospector) were compared with the EThcD data, it became clear that Thr-346 was always the glycosylation site.
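The filtering logic described above can be sketched in a few lines. This is not Protein Prospector's MS-filter implementation: the spectrum representation, the function names and the three example spectra are hypothetical and synthetic, and the amino acid residue masses are standard monoisotopic values assumed for illustration.

```python
PROTON = 1.007276
WATER = 18.010565
# Monoisotopic amino acid residue masses (Da) for the residues needed here.
AA_MASS = {"A": 71.037114, "V": 99.068414, "T": 101.047679, "L": 113.084064,
           "Q": 128.058578, "S": 87.032028, "H": 137.058912}


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(AA_MASS[aa] for aa in sequence) + WATER


def has_y0(peaks, y0_mz, top_n=5, tol_ppm=10.0):
    """True if Y0 is among the top_n most intense peaks within tol_ppm."""
    top = sorted(peaks, key=lambda p: p[1], reverse=True)[:top_n]
    return any(abs(mz - y0_mz) / y0_mz * 1e6 <= tol_ppm for mz, _ in top)


def glycan_increment(precursor_mz, charge, pep_mass):
    """Mass added to the peptide, from the precursor m/z and charge."""
    return precursor_mz * charge - charge * PROTON - pep_mass


pep = peptide_mass("AVAVTLQSH")
y0 = pep + PROTON  # singly charged, gas-phase deglycosylated peptide

# Synthetic HCD spectra: peaks are (m/z, intensity) pairs.
spectra = [
    {"id": "a", "precursor_mz": 766.0073, "charge": 3,   # passes the filter
     "peaks": [(204.087, 9e4), (274.092, 5e4), (292.103, 6e4),
               (657.235, 3e4), (925.5105, 4e4), (1128.59, 1e4)]},
    {"id": "b", "precursor_mz": 760.664, "charge": 3,    # Y0 too weak
     "peaks": [(204.087, 9e4), (274.092, 5e4), (292.103, 6e4),
               (366.140, 4e4), (657.235, 3e4), (925.5105, 1e3)]},
    {"id": "c", "precursor_mz": 700.100, "charge": 3,    # about 21 ppm off
     "peaks": [(204.087, 9e4), (925.530, 8e4)]},
]

hits = [(s["id"], glycan_increment(s["precursor_mz"], s["charge"], pep))
        for s in spectra if has_y0(s["peaks"], y0)]
for spectrum_id, increment in hits:
    print(f"spectrum {spectrum_id}: glycan increment {increment:.3f} Da")
```

Only the first synthetic spectrum survives the filter, and its mass increment corresponds to the HexNAc3Hex2FucNeuAc composition. In practice, each candidate mass would still have to be checked against the EThcD data, as described in the text.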

The first structure is most likely a core 2 glycan carrying the A blood antigen, as shown in Figure 7 and Scheme 4. While the identity of the sugar units and their linkage positions obviously could not be determined from these data, the connections between the units could be deciphered from the EThcD spectrum (Figure 7). An oxonium ion at m/z 512.202 and the corresponding Y fragment indicated a terminal HexNAcHexFuc structure. Other reducing end fragments identified a hexose-linked sialic acid as well as a terminal HexNAc.

Since single bond cleavages are preferred in EThcD, the presence of an abundant B fragment at m/z 657.237 and its reducing end counterpart at m/z 820.399(2+) suggests that the core GalNAc is modified with a HexNAcHexNeuAc structure and that the other extension might be the A-antigen, because of its composition and terminal HexNAc. These conclusions were further strengthened by the Y2β,1α fragment detected in HCD at m/z 1437.715 (Figure S-30). This A-antigen-bearing peptide was identified from the same donor as the B-antigen-carrying molecule described above.
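The complementarity of such a B/Y pair can be checked numerically: after a single glycosidic bond cleavage, the Y ion retains the peptide plus whatever part of the glycan the B ion did not take. The sketch below assumes standard monoisotopic residue masses, derives the peptide mass from the Y0 value quoted earlier (m/z 925.510), and uses hypothetical helper names.

```python
# Monoisotopic residue masses in Da; standard values assumed for illustration.
RESIDUE_MASS = {
    "HexNAc": 203.079373,
    "Hex": 162.052824,
    "Fuc": 146.057909,
    "NeuAc": 291.095417,
}
PROTON = 1.007276
# Neutral peptide mass implied by the singly charged Y0 ion (m/z 925.510).
PEPTIDE_MASS = 925.510 - PROTON


def glycan_mass(composition):
    """Sum of residue masses for a {unit: count} composition."""
    return sum(RESIDUE_MASS[unit] * n for unit, n in composition.items())


def subtract(whole, part):
    """Units left on the peptide after the B piece has been cleaved off."""
    return {unit: n - part.get(unit, 0) for unit, n in whole.items()}


def ppm_error(observed, calculated):
    return (observed - calculated) / calculated * 1e6


glycan = {"HexNAc": 3, "Hex": 2, "Fuc": 1, "NeuAc": 1}
b_piece = {"HexNAc": 1, "Hex": 1, "NeuAc": 1}
remaining = subtract(glycan, b_piece)

b_mz = glycan_mass(b_piece) + PROTON                       # singly charged B
y_mz = (PEPTIDE_MASS + glycan_mass(remaining) + 2 * PROTON) / 2  # 2+ Y ion

print(f"B: calc {b_mz:.3f}, obs 657.237, {ppm_error(657.237, b_mz):+.1f} ppm")
print(f"Y(2+): calc {y_mz:.3f}, obs 820.399, "
      f"{ppm_error(820.399, y_mz):+.1f} ppm")
print("units remaining on the peptide:", remaining)
```

With these assumptions the remaining units are HexNAc2HexFuc, consistent with the A-antigen-bearing extension plus the core GalNAc proposed in the text.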

The other sugar compositions suggest the presence of sialyl LewisX/A in different stages of O-acetylation (see Table S-3; Figures S-31-34). The isomeric structures containing one O-acetyl-sialic acid were chromatographically resolved, and the earlier eluting isoform carried this modification on the GlcNAc-containing arm, just like in the hexasaccharide isomers presented in Figure 2. Interestingly, all fucose-containing glycoforms produced a few ions that indicated fucose transfer between the extensions (Figures 6-7, Figures S-31-35). Fucose ‘migration’ has been reported in glycan analyses23 as well as for N-linked glycopeptides24,25.
