Alex Barteau ’13

Project: Development of protein sequence analysis software
Duration: Summer 2011 – Spring 2012
Funding: University of Nebraska Medical Center
Collaboration: Dr. Chittibabu Guda

ABSTRACT

Proteins are the essence of every living organism. Every protein has a well-defined function and must localize within the cell in order to carry out that function. This information about the protein is encoded in the protein sequence itself. Alex will be working on a project, started by Professor King, that analyzes protein sequences for recurring patterns related to protein localization and function. These observed patterns can then be used to suggest information about new protein sequences. The aim of this project is to make the software publicly available to the biological and biomedical research community.
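As a rough illustration of what "looking for recurring patterns" in sequences can mean, the sketch below counts short overlapping subsequences (n-grams) and reports those shared by several sequences. This is only a minimal example under stated assumptions, not the project's actual method: the function names, the choice of n = 3, and the toy sequences are all hypothetical.

```python
from collections import Counter


def ngram_counts(sequence: str, n: int = 3) -> Counter:
    """Count every overlapping length-n substring of an amino-acid sequence."""
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))


def shared_ngrams(sequences: list[str], n: int = 3, min_sequences: int = 2) -> dict[str, int]:
    """Return the n-grams that occur in at least `min_sequences` of the inputs,
    mapped to the number of sequences that contain them."""
    presence = Counter()
    for seq in sequences:
        # Use a set so each sequence contributes at most once per n-gram.
        presence.update(set(ngram_counts(seq, n)))
    return {gram: k for gram, k in presence.items() if k >= min_sequences}


# Made-up toy sequences, purely for illustration (not real proteins).
toy = ["MKKLLA", "AKKLLG", "MSTNPK"]
recurring = shared_ngrams(toy, n=3)
print(recurring)
```

In a real setting, the sequences would be grouped by known localization, and patterns that are frequent in one group but rare in others would be the interesting ones.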

ACHIEVEMENTS

  • Poster – Sigma Xi Summer Research Symposium, July 27, 2011, Bucknell University, Lewisburg, PA
  • King BR, Vural S, Pandey S, Barteau A, Guda C. ngLOC: software and web server for predicting protein subcellular localization in prokaryotes and eukaryotes. BMC Research Notes; 2012; 5(351)

Marc Burian ’12

Project: Twitter sentiment and stock market performance
Duration: Spring 2012 – Summer 2012
Funding: None

ABSTRACT

This project continues Marc’s excellent work in data mining. We are investigating an algorithm we developed that clusters similar tweets in real time and derives sentiment about a specific company of interest from incoming tweets. By plotting this sentiment against the company’s stock price, we are gauging the potential of Twitter as a descriptor and predictor of the stock market.

ACHIEVEMENTS

Marc started this project toward the end of my data mining class, and there was insufficient time to bring it into a publishable or presentable form. However, Marc gained the satisfaction of learning the Twitter API and of learning first-hand that “tweets” are an extremely noisy source of information for stock prediction! We uncovered several instances with minute predictive power, but none was statistically significant. More work would be needed to filter and process tweets to discard meaningless and irrelevant data.
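One simple way to ask whether sentiment has any predictive power over price is to correlate each day's sentiment score with a later day's stock return. The sketch below does this with a plain Pearson correlation. It is an illustration only, not the analysis performed in this project: the function names, the one-day lag, and the numbers are hypothetical, and a real study would also need a significance test and far more data.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(sentiment: list[float], returns: list[float], lag: int = 1) -> float:
    """Correlate sentiment on day t with the return on day t + lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(sentiment, returns)
    # Drop the last `lag` sentiment values and the first `lag` returns
    # so that each sentiment value is paired with a later return.
    return pearson(sentiment[:-lag], returns[lag:])


# Hypothetical daily values, invented for illustration only.
daily_sentiment = [1.0, 2.0, 3.0, 4.0, 5.0]
daily_returns = [0.0, 2.0, 4.0, 6.0, 8.0]
r = lagged_correlation(daily_sentiment, daily_returns, lag=1)
print(round(r, 3))
```

With noisy tweet data, a coefficient near zero, or one that changes sign between samples, would be consistent with the weak and non-significant effects described above.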

Phil Stahlfeld ’13

Project: Installation and deployment of caBIG – an information management network for cancer research
Duration: Summer 2011 – Spring 2012
Funding: Geisinger Research
Collaboration: Dr. Gerardus Tromp

ABSTRACT

Phil is working with Dr. Gerardus Tromp, a collaborator at Geisinger Medical Research Center in Danville, on deploying a complex software system called the Cancer Biomedical Informatics Grid, or caBIG. A product of the National Cancer Institute, caBIG was conceived to share data and knowledge among researchers, clinicians, and patients in order to simplify collaboration and speed research, getting diagnostics and therapeutics from bench to bedside faster and more cost-effectively. The system is multi-tiered, with the front end mostly implemented as Java applications running on JBoss and Tomcat, and the back-end data management provided by PostgreSQL and MySQL. The entire suite is open software and will require some modification to meet the local needs of Geisinger. Phil will be an integral part of the installation, configuration, modification, and testing of the system.