GENERATING AND MANAGING LARGE SCALE PROTEOGENOMIC DATA FOR ENCODE CELL LINES

Morgan Corinne Giddings, Associate Professor
University of North Carolina Chapel Hill, Office of Sponsored Research, Chapel Hill, NC 27599

Grant 1RC2HG005591-01 from National Human Genome Research Institute

Abstract: The first human genome sequence was published in 2001, yet as of now, eight years later, major questions remain, such as how many genes are encoded by the genome, and of those genes, how many functional products are encoded due to phenomena like alternative splicing. The Encyclopedia of DNA Elements (ENCODE) project has been coordinated by National Human Genome Research Institute (NHGRI) to answer these questions by comprehensively classifying functional elements on the human genome. The pilot phase of the project studied 1% of the genome in detail, revealing extensive transcription well beyond that predicted by classical gene models. The biological function of a significant portion of the discovered transcripts is unclear. The ENCODE project is now scaling up to examine the whole human genome. It is likely that results will echo the pilot project, revealing extensive transcription, a significant fraction of which has unexplained function. Proteomic technologies can be applied, in a process called proteogenomic mapping, to determine which of the myriad transcripts encode proteins. This approach has been used to reveal new genes, new alternative splice variants, new start sites, and upstream open reading frames (ORFs). While substantive progress has been made in developing proteogenomic mapping technologies, a significant hurdle in using proteogenomics to assist with the ENCODE project is the lack of proteomic data sets that are coordinated with the ENCODE transcription mapping efforts. Here we propose to generate large-scale proteomic data sets directly from the same tier I ENCODE cell lines studied by the transcription efforts, coordinating the results with the transcription mapping efforts to determine which of the pervasive transcripts are translated. 
Our specific aims are to 1) produce large-scale proteomic data sets on ENCODE cell lines using the most advanced mass spectrometry methods, 2) use our database technologies to store, manage, and make accessible to the community all results of the project, and 3) use our software pipeline to map the results to the latest human genome drafts, producing a UCSC (University of California, Santa Cruz) genome browser track with the results. We believe the result will be a significant advancement in knowledge about our genomes and the functional products they encode. The human genome is the blueprint for human life and human health, but we do not yet understand its language - the language of genes. The ENCODE project is deciphering that language systematically, and the goal of this proposal is to accelerate that effort by revealing which parts of the blueprint contain instructions to build proteins.

Keywords: Affect; Alternate Splicing; Alternative Splicing; Assay; Bioassay; Biochemical; Biologic Assays; Biological; Biological Assay; Biological Function; Biological Process; California; Cell Line; Cell Lines, Strains; CellLine; Code; Coding System; Communities; Computer Programs; Computer software; Crossmatching, Tissue; DNA; DNA Sequence; Data; Data Banks; Data Bases; Data Coordinating Center; Data Coordination Center; Data Element; Data Set; Databank, Electronic; Databanks; Database, Electronic; Databases; Dataset; Deoxyribonucleic Acid; Elements; Exons; Faculty; Functional RNA; Gene Products, RNA; Gene Transcription; Genes; Genetic Transcription; Genome; Genome, Human; Goals; Grant; Health; Histocompatibility Testing; Human; Human Cell Line; Human Genome; Human, General; Immunology; Immunology (Including BRMP); Immunology (NCI Program); In element; Indium; Instruction; Investigators; Knowledge; Language; Life; Man (Taxonomy); Man, Modern; Management Information Systems; Maps; Mass Spectrum; Mass Spectrum Analysis; Methods; Modeling; Molecular and Cellular Biology; National Human Genome Research Institute; Nature; Non-Coding; Non-Coding RNA; ORFs; Open Reading Frames; Peptides; Phase; Photometry/Spectrum Analysis, Mass; Pilot Projects; Policies; Process; Protein Coding Region; Protein Splicing; Proteins; Proteome; Proteomics; Publishing; RNA; RNA Expression; RNA Splicing; RNA Splicing, Alternative; RNA, Non-Polyadenylated; Research Personnel; Researchers; Ribonucleic Acid; Site; Software; Spectrometry, Mass; Spectroscopy, Mass; Spectrum Analyses, Mass; Spectrum Analysis, Mass; Splicing; Structure; Technology; Tissue Crossmatchings; Tissue Typing; Transcript; Transcription; Transcription, Genetic; Translating; Translatings; Universities; Variant; Variation; Work; base; clinical data repository; clinical data warehouse; computer program/software; cultured cell line; data repository; develop software; developing computer software; experience; gene product; 
genome sequencing; histocompatibility typing; improved; insight; language translation; pilot study; public health relevance; relational database; repository; scale up; software development

Relevance: The human genome is the blueprint for human life and human health, but we do not yet understand its language - the language of genes. The ENCODE project is deciphering that language systematically, and the goal of this proposal is to accelerate that effort by revealing which parts of the blueprint contain instructions to build proteins.

Project start date: 2009-09-26

Project end date: 2011-06-30

Budget start date: 2009-09-26

Budget end date: 2010-06-30

PFA/PA: RFA-OD-09-004

1RC2HG005591-01 (2009): $800,000


Grants awarded to Morgan Corinne Giddings

SOFTWARE TO IDENTIFY POST-TRANSLATIONAL MODIFICATIONS FROM PROTEOMIC DATA SETS

Morgan Corinne Giddings, Associate Professor
University of North Carolina Chapel Hill, Office of Sponsored Research, Chapel Hill, NC 27599

Abstract: This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator. The aim of this subproject is to accelerate and enhance related studies by producing an easy-to-use and fully validated software package for automatically finding modifications on proteins.

Keywords: CRISP; Computer Programs; Computer Retrieval of Information on Scientific Projects Database; Computer software; Data Set; Dataset; Funding; Grant; Institution; Investigators; Modification; NIH; National Institutes of Health; National Institutes of Health (U.S.); Post-Translational Modifications; Post-Translational Protein Processing; Posttranslational Modifications; Protein Modification; Protein Modification, Post-Translational; Protein Processing, Post-Translational; Protein Processing, Posttranslational; Protein/Amino Acid Biochemistry, Post-Translational Modification; Proteins; Proteomics; Research; Research Personnel; Research Resources; Researchers; Resources; Software; Source; United States National Institutes of Health; computer program/software; gene product

Project start date: 2009-08-31

Project end date: 2011-08-30

Budget start date: 2009-08-31

Budget end date: 2011-08-30

PFA/PA: PA-07-070

3R01RR020823-05S1_8128 (2009): $249,709


GENERATING AND MANAGING LARGE SCALE PROTEOGENOMIC DATA FOR ENCODE CELL LINES

Morgan Corinne Giddings
University of North Carolina Chapel Hill, Office of Sponsored Research, Chapel Hill, NC 27599

Grant 5RC2HG005591-02 from National Human Genome Research Institute

Abstract: The first human genome sequence was published in 2001, yet as of now, eight years later, major questions remain, such as how many genes are encoded by the genome, and of those genes, how many functional products are encoded due to phenomena like alternative splicing. The Encyclopedia of DNA Elements (ENCODE) project has been coordinated by National Human Genome Research Institute (NHGRI) to answer these questions by comprehensively classifying functional elements on the human genome. The pilot phase of the project studied 1% of the genome in detail, revealing extensive transcription well beyond that predicted by classical gene models. The biological function of a significant portion of the discovered transcripts is unclear. The ENCODE project is now scaling up to examine the whole human genome. It is likely that results will echo the pilot project, revealing extensive transcription, a significant fraction of which has unexplained function. Proteomic technologies can be applied, in a process called proteogenomic mapping, to determine which of the myriad transcripts encode proteins. This approach has been used to reveal new genes, new alternative splice variants, new start sites, and upstream open reading frames (ORFs). While substantive progress has been made in developing proteogenomic mapping technologies, a significant hurdle in using proteogenomics to assist with the ENCODE project is the lack of proteomic data sets that are coordinated with the ENCODE transcription mapping efforts. Here we propose to generate large-scale proteomic data sets directly from the same tier I ENCODE cell lines studied by the transcription efforts, coordinating the results with the transcription mapping efforts to determine which of the pervasive transcripts are translated. 
Our specific aims are to 1) produce large-scale proteomic data sets on ENCODE cell lines using the most advanced mass spectrometry methods, 2) use our database technologies to store, manage, and make accessible to the community all results of the project, and 3) use our software pipeline to map the results to the latest human genome drafts, producing a UCSC (University of California, Santa Cruz) genome browser track with the results. We believe the result will be a significant advancement in knowledge about our genomes and the functional products they encode. The human genome is the blueprint for human life and human health, but we do not yet understand its language - the language of genes. The ENCODE project is deciphering that language systematically, and the goal of this proposal is to accelerate that effort by revealing which parts of the blueprint contain instructions to build proteins.

Keywords: Affect; Alternate Splicing; Alternative Splicing; Assay; Bioassay; Biochemical; Biologic Assays; Biological; Biological Assay; Biological Function; Biological Process; California; Cell Line; Cell Lines, Strains; CellLine; Code; Coding System; Communities; Computer Programs; Computer software; Crossmatching, Tissue; DNA; DNA Sequence; Data; Data Banks; Data Bases; Data Coordinating Center; Data Coordination Center; Data Element; Data Set; Databank, Electronic; Databanks; Database, Electronic; Databases; Dataset; Deoxyribonucleic Acid; Elements; Exons; Faculty; Functional RNA; Gene Products, RNA; Gene Transcription; Genes; Genetic Transcription; Genome; Goals; Grant; Health; Histocompatibility Testing; Human; Human Cell Line; Human Genome; Human, General; Immunology; Immunology (Including BRMP); Immunology (NCI Program); In element; Indium; Instruction; Investigators; Knowledge; Language; Life; Man (Taxonomy); Man, Modern; Management Information Systems; Maps; Mass Spectrum; Mass Spectrum Analysis; Methods; Modeling; Molecular and Cellular Biology; National Human Genome Research Institute; Nature; Non-Coding; Non-Coding RNA; ORFs; Open Reading Frames; Peptides; Phase; Photometry/Spectrum Analysis, Mass; Pilot Projects; Policies; Process; Protein Coding Region; Protein Splicing; Proteins; Proteome; Proteomics; Publishing; RNA; RNA Expression; RNA Splicing; RNA Splicing, Alternative; RNA, Non-Polyadenylated; Research Personnel; Researchers; Ribonucleic Acid; Site; Software; Spectrometry, Mass; Spectroscopy, Mass; Spectrum Analyses, Mass; Spectrum Analysis, Mass; Splicing; Structure; Technology; Tissue Crossmatchings; Tissue Typing; Transcript; Transcription; Transcription, Genetic; Translating; Translatings; Universities; Variant; Variation; Work; base; clinical data repository; clinical data warehouse; computer program/software; cultured cell line; data repository; develop software; developing computer software; experience; gene product; genome sequencing; 
histocompatibility typing; improved; insight; language translation; pilot study; public health relevance; relational database; repository; scale up; software development

Relevance: The human genome is the blueprint for human life and human health, but we do not yet understand its language - the language of genes. The ENCODE project is deciphering that language systematically, and the goal of this proposal is to accelerate that effort by revealing which parts of the blueprint contain instructions to build proteins.

Project start date: 2009-09-26

Project end date: 2011-06-30

Budget start date: 2010-07-01

Budget end date: 2011-06-30

PFA/PA: RFA-OD-09-004

5RC2HG005591-02 (2010): $800,000


SOFTWARE TO IDENTIFY POST-TRANSLATIONAL MODIFICATIONS FROM PROTEOMIC DATA SETS

Morgan Corinne Giddings, Associate Professor
University of North Carolina Chapel Hill, Office of Sponsored Research, Chapel Hill, NC 27599

Grant 3R01RR020823-05S1 from National Center For Research Resources

Abstract: Proteins are the workhorses of cells, comprising much of the machinery of life. Chemical changes due to co- or post-translational modifications, or amino acid substitutions resulting from genetic variation, can alter protein function and have significant consequences on the functioning of a cell. Pinpointing chemical changes in proteins in an automated manner remains an elusive goal. Mass spectrometry (MS) based methodologies are promising for examining such alterations, since they are exquisitely sensitive to the resulting shifts in mass. There are two main approaches that can be used for examining proteins by MS, one which measures the intact masses of proteins to detect shifts indicative of modifications (called top-down), and the other which enzymatically digests proteins into short peptides, then analyzes their chemical structure by tandem mass spectrometry (called bottom-up). Each of the existing MS methods has limitations, such as lack of complete protein coverage for bottom-up, and the inability to use top-down data to uniquely identify modifications; these drawbacks have motivated the development of hybrid combinations such as "top-down bottom-up" (TDBU) proteomics. Though these are seeing a surge of interest, there is an acute lack of comprehensive, automated software for combining measurements from the distinct MS approaches; thus, studies to date have relied upon extensive manual analysis and/or ad hoc program scripts, inhibiting progress in the field. We propose to address this issue using our two existing programs, PROCLAME for analyzing top-down data, and GFS for analyzing bottom-up data, to develop integrated, open-source software that combines data from multiple MS methodologies to pinpoint posttranslational modifications and amino acid substitutions in proteins. 
Our aims are 1) to integrate multiple MS data sources for determining the type and location of modifications on proteins, by adding a Markov chain Monte Carlo (MCMC) based engine to PROCLAME; 2) to improve the ability to analyze bottom-up data by enhancing GFS for the automatic determination of posttranslational modifications; 3) to manage and integrate results from multiple MS measurements and search engines, by developing a database system and scripts to tie the programs together; and 4) to assure program reliability and suitability through both alpha testing in-house and beta testing at external sites.

Health Relevance: Both amino acid substitutions and misregulation of enzymes that modify proteins play roles in human diseases such as cancer, diabetes, sickle cell anemia, and many others. This proposal is to build generalized software that can be used by a broad base of researchers to pinpoint the chemical changes/modifications to proteins that perturb regulatory networks in cells to cause disease.

Narrative: Both amino acid substitutions and misregulation of enzymes that modify proteins play roles in human diseases such as cancer, diabetes, sickle cell anemia, and many others. This proposal is to build generalized software that can be used by a broad base of researchers to pinpoint the chemical changes and modifications to proteins that perturb regulatory networks in cells to cause disease, by integrating data from the latest proteomic technologies.
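
The top-down idea in the abstract, reading a modification off the shift in a protein's intact mass, can be illustrated with a minimal Python sketch. This is not PROCLAME or GFS: the function names, the short modification list, and the 0.01 Da tolerance are all assumptions chosen for illustration. The masses are standard monoisotopic values.

```python
# Illustrative sketch only; not the PROCLAME or GFS software.
# Monoisotopic residue masses in daltons (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added for the free N- and C-termini

# A deliberately tiny, hypothetical lookup of common modification mass shifts.
MOD_DELTAS = {
    "phosphorylation": 79.96633,
    "acetylation": 42.01057,
    "methylation": 14.01565,
    "oxidation": 15.99491,
}


def intact_mass(sequence):
    """Monoisotopic mass of an unmodified protein or peptide sequence."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def candidate_modifications(sequence, observed_mass, tolerance=0.01):
    """Names of single modifications whose mass shift explains the observation."""
    shift = observed_mass - intact_mass(sequence)
    return [name for name, delta in MOD_DELTAS.items()
            if abs(shift - delta) <= tolerance]
```

Note that a match names the type of modification but not the residue that carries it, which is the top-down limitation the abstract describes and the reason for combining it with bottom-up fragment data.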

Keywords: No Project Terms available

Project start date: 2009-08-31

Project end date: 2011-08-30

Budget start date: 2009-08-31

Budget end date: 2011-08-30

PFA/PA: PA-07-070

3R01RR020823-05S1 (2009): $249,709


DEVELOPING PROTEOGENOMIC MAPPING FOR HUMAN GENOME ANNOTATION

Morgan Corinne Giddings, Associate Professor
University of North Carolina Chapel Hill, Office of Sponsored Research, Chapel Hill, NC 27599

Grant 5R01HG003700-05 from National Human Genome Research Institute

Abstract: Genome sequencing efforts are producing ever greater quantities of raw DNA sequence, but the annotation process for locating and determining the function of genetic elements has not kept up. While many aspects of annotation are difficult, it is particularly challenging to determine which parts of a genome sequence encode proteins, and therefore how the processes leading to protein translation are regulated. Not only are technologies for examining proteins more limited than those for studying RNA transcription, but an extensive study of transcription by the Encyclopedia of DNA Elements (ENCODE) consortium also revealed a picture of great complexity. The project uncovered many novel exons, alternative splice forms, and novel regulatory elements. These results indicate that nearly nine-tenths of human genes undergo alternative splicing, and the average gene produces approximately 6 splice variants. Rather than solidify knowledge regarding the location and function of genes, these results question whether we accurately know what constitutes a gene, and how the products encoded by genes determine the function of cells. The results particularly obfuscate determination of which transcripts are selected for translation to protein, further complicating annotation efforts. To address that gap, our project will determine which transcripts encode proteins, and how these are affected in several tissue types and disease conditions. We will use large tandem mass spectrometry-based proteomic data sets, mapping the analyzed protein data directly to several available human genome sequences, along with sets of predicted transcripts produced by the N-SCAN and CONTRAST gene finders, to reveal which parts of transcripts are translated into proteins, and in which types of cells this translation occurs.
To accomplish this, our project has three specific aims: 1) to develop high-accuracy methods and software for mapping proteomic data from mass spectrometry-analyzed proteins directly to the genome locus encoding them; 2) to develop an analysis pipeline software system using a novel rule-based information management approach; and 3) to apply these developments for the high-throughput analysis of large proteomic data sets, identifying the transcripts that encode proteins in distinct tissue types and disease conditions, and placing the results in a publicly accessible track in the UCSC genome browser. We believe this project will yield significant knowledge about the location and timing of protein translation in cells, which will potentiate further investigation of how misregulation of the path from transcription to translation leads to human disease conditions. Sequencing of the human genome is complete, but figuring out where genes are located, how they function, and how they cause or prevent human diseases like cancer has only just begun. Genes act as blueprints for RNA and proteins, the workhorses of the cell. We are developing technologies to address the key challenges of determining which genes specify the building of which proteins and how this process is orchestrated to ultimately unravel how disease processes occur.
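
The first aim, mapping a peptide identified by mass spectrometry back to the genome locus that encodes it, can be illustrated with a minimal Python sketch. It is not the project's software: it assumes exact string matching against a six-frame translation of an unspliced sequence, and the function names are hypothetical. Real proteogenomic mapping also has to handle splice junctions, sequence variants, and scoring of spectra, none of which is shown here.

```python
# Illustrative sketch only; not the project's mapping pipeline.
from itertools import product

# Standard genetic code, built in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an uppercase ACGT string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. containing N) become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def map_peptide(peptide, genome):
    """Return (strand, frame, start) for every exact match in six frames.

    start is the 0-based position on the forward strand of the leftmost
    base covered by the match.
    """
    hits = []
    n = len(genome)
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            protein = translate(seq[frame:])
            pos = protein.find(peptide)
            while pos != -1:
                offset = frame + 3 * pos  # start on the strand being scanned
                if strand == "+":
                    start = offset
                else:
                    start = n - offset - 3 * len(peptide)
                hits.append((strand, frame, start))
                pos = protein.find(peptide, pos + 1)
    return hits
```

For example, `map_peptide("MAK", "CCATGGCTAAGTGAG")` reports a single forward-strand hit in frame 2 starting at position 2, where the codons ATG GCT AAG encode the peptide.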

Keywords: Address; Affect; Algorithms; Alternate Splicing; Alternative Splicing; Biochemical; Body Tissues; Cancers; Cell Function; Cell Process; Cell physiology; Cells; Cellular Function; Cellular Physiology; Cellular Process; Code; Coding System; Collaborations; Communities; Complex; Computer Programs; Computer Software Tools; Computer software; Crossmatching, Tissue; Custom; DNA; DNA Sequence; Data; Data Banks; Data Bases; Data Set; Databank, Electronic; Databanks; Database, Electronic; Databases; Dataset; Deoxyribonucleic Acid; Development; Disease; Disorder; Elements; Exons; Foundations; Funding; Gene Products, RNA; Gene Targeting; Gene Transcription; Genes; Genetic Transcription; Genome; Genome, Human; Goals; Grant; Histocompatibility Testing; Human; Human Genome; Human, General; Imagery; Information Management; Investigation; Investigators; Isotope Labeling; Knowledge; Learning, Machine; Link; Location; Machine Learning; Malignant Neoplasms; Malignant Tumor; Man (Taxonomy); Man, Modern; Maps; Mass Spectrum; Mass Spectrum Analysis; Measures; Methods; Mining; Minings; Modeling; Nature; Paint; Peptides; Photometry/Spectrum Analysis, Mass; Play; Procedures; Process; Protein Analysis; Proteins; Proteomics; Quality Control; RNA; RNA Expression; RNA Splicing; RNA Splicing, Alternative; RNA, Non-Polyadenylated; Regulation; Regulatory Element; RegulatoryElement; Research Personnel; Researchers; Ribonucleic Acid; Role; Sampling; Scanning; Software; Software Tools; Source; Specific qualifier value; Specified; Spectrometry, Mass; Spectroscopy, Mass; Spectrum Analyses, Mass; Spectrum Analysis, Mass; Speed; Speed (motion); Splicing; Structure; Subcellular Process; System; System, LOINC Axis 4; Targetings, Gene; Technology; Time; Tissue Crossmatchings; Tissue Typing; Tissues; Tools, Software; Transcript; Transcription; Transcription, Genetic; Translating; Translatings; Translations; Variant; Variation; Visualization; base; cell type; clinical data repository; clinical data 
warehouse; computer program/software; data repository; design; designing; disease/disorder; experience; flexibility; gene function; gene product; genetic element; genome sequencing; high throughput analysis; histocompatibility typing; human disease; improved; kernel methods; language translation; malignancy; neoplasm/cancer; new technology; novel; prevent; preventing; public health relevance; relational database; social role; software systems; statistical learning; support vector machine; tandem mass spectrometry; web interface

Relevance: Sequencing of the human genome is complete, but figuring out where genes are located, how they function, and how they cause or prevent human diseases like cancer has only just begun. Genes act as blueprints for RNA and proteins, the workhorses of the cell. We are developing technologies to address the key challenges of determining which genes specify the building of which proteins and how this process is orchestrated to ultimately unravel how disease processes occur.

Project start date: 2005-09-16

Project end date: 2012-03-31

Budget start date: 2010-04-01

Budget end date: 2011-03-31

PFA/PA: PA-07-070

5R01HG003700-05 (2010): $435,435


SOFTWARE TO IDENTIFY POST-TRANSLATIONAL MODIFICATIONS FROM PROTEOMIC DATA SETS

Morgan Corinne Giddings, Associate Professor
University of North Carolina Chapel Hill, Office of Sponsored Research, Chapel Hill, NC 27599

Grant 5R01RR020823-06 from National Center For Research Resources

Keywords: Acute; Address; Amino Acid Substitution; Animal Welfare; Bibliography; Cancers; Cells; Chemical Structure; Chemicals; Computer Programs; Computer software; Country; Data; Data Banks; Data Bases; Data Set; Data Sources; Databank, Electronic; Databanks; Database, Electronic; Databases; Dataset; Development; Diabetes Mellitus; Disease; Disorder; Ecological impact; Environment; Environmental Impact; Enzymes; Equipment; Ethics Committees, Research; Gene variant; Genetic Diversity; Genetic Variation; Goals; Hb SS disease; HbSS disease; Health; Hemoglobin S Disease; Hemoglobin sickle cell disease; Hemoglobin sickle cell disorder; Housing; Hybrids; IACUC; IRBs; Impact, Environmental; Institutional Animal Care and Use Committee; Institutional Review Boards; International; Investigators; Life; Location; Malignant Neoplasms; Malignant Tumor; Manuals; Markov Chains; Markov Process; Mass Spectrum; Mass Spectrum Analysis; Measurement; Measures; Method LOINC Axis 6; Methodology; Methods; Modification; Names; Peptides; Photometry/Spectrum Analysis, Mass; Play; Post-Translational Modifications; Post-Translational Protein Processing; Posttranslational Modifications; Principal Investigator; Programs (PT); Programs [Publication Type]; Protein Modification; Protein Modification, Post-Translational; Protein Processing, Post-Translational; Protein Processing, Posttranslational; Protein/Amino Acid Biochemistry, Post-Translational Modification; Proteins; Proteomics; Research; Research Ethics Committees; Research Personnel; Research Resources; Researchers; Resources; Role; Sickle Cell Anemia; Site; Software; Sources, Data; Spectrometry, Mass; Spectroscopy, Mass; Spectrum Analyses, Mass; Spectrum Analysis, Mass; System; System, LOINC Axis 4; Testing; Variation (Genetics); Vertebrate Animals; Vertebrates; ing; allelic variant; base; clinical data repository; clinical data warehouse; computer program/software; data repository; diabetes; disease/disorder; expiration; gene product; 
human disease; human subject; improved; interest; malignancy; neoplasm/cancer; open source; programs; protein function; relational database; sickle cell disease; sickle disease; sicklemia; social role; tandem mass spectrometry; vertebrata

Project start date: 2004-09-24

Project end date: 2011-11-30

Budget start date: 2009-12-01

Budget end date: 2010-11-30

PFA/PA: PA-07-070

5R01RR020823-06 (2010): $324,220