My joy with ILTM

I had the joy of serving as a core faculty member of the Institute of Leadership in Technology and Management for the past two summers (2017 and 2018). I found this to be one of the most transformative experiences available to Bucknell students since I’ve been here. I was honored to be part of this program. I worked with some absolutely wonderful students in ILTM! However, as a result of this opportunity, my scholarship was substantially halted for the last two summers. Thus, I have not taken on any new students for quite some time.

I was also on sabbatical during the entire 2017-18 academic year. During this time, I continued to work on interesting projects collaboratively with Dr. Vanessa Troiani at Geisinger Autism and Developmental Medicine Institute. As much as I’ve enjoyed working in various areas of bioinformatics, I decided it was time for me to explore other areas of sequential data analysis. Dr. Troiani and her lab members have invigorated me with new opportunities in pattern mining mass quantities of eye-tracking data. This ultimately led to another collaborative project involving Dr. Troiani and our own Prof. Evan Peck. Slowly, the research agenda is ramping back up again. I applied to five different grant opportunities; to date, one has been awarded, and a much larger one is currently under review.

I’ve also become more involved in interdisciplinary teaching and research opportunities across the university. Bucknell is at a point now where we can truly provide some very interesting transformative experiences to our students – rare opportunities that very few colleges can offer. To do so, however, we must leverage the opportunities that exist across disciplines. Thus, I’ve been intentional in my pursuits to identify new opportunities outside of my own department, and my own home – the College of Engineering. For instance, I’ve had great joy working with my colleague, Prof. Abby Flynt, on both teaching and research projects. (We both recently received the Presidential Award for Teaching Excellence for 2018, and co-mentored a wonderful student, Alexander Murph, who completed an honors thesis and is now at UNC Chapel Hill pursuing his PhD in Statistics!)

Speaking of new, unique opportunities for interdisciplinary work, I’m looking forward to seeing new things happen with our new College of Management, where I expect some interesting collaborations with new faculty who will be part of their new Analytics and Operations Management program. I’ve been spending time with them recently, serving on a committee to help them hire new faculty for this exciting program.

Of course, I can’t forget our wonderful friends in Biology, who were so instrumental in collaborating on my bioinformatics projects very early on during my pre-tenure days here. Needless to say, there are great colleagues across this university, with lots of data! It’s a rich place for a data scientist!

Sequential data mining and analysis – it will always remain my primary area of focus, and with tenure, it’s exciting to be able to afford the risk of stretching my core interests toward new areas. Fortunately, sequential data are ubiquitous. Thus, I’ve branched away from biological sequence analysis and delved into numerous other areas of sequential data. I will update soon.

My post tenure feelings

So, is tenure all it’s cracked up to be? Well, I’m now in the midst of my third year post tenure. Or is it my second? I don’t even know. I’m burnt out, thanks to the vicious downside of tenure – SERVICE! Once you receive tenure, it seems as though you are put on a list by the administration throughout the college and university. This list is special. I believe the title of the list is, “PEOPLE WHO WE CAN GUILT INTO SERVING ON COMMITTEES NOW THAT THEY HAVE TENURE.” This semester, I have honestly lost count of the committees and other opportunities where I have said “yes” to helping my colleagues. The result: I regularly have a minimum of 10-15 additional hours per week dedicated to service obligations, and that has recently reached 20+. Those hours are on top of my normal teaching hours in and out of the classroom, which are easily 40+ hours, and that doesn’t include my normal teaching/service duties, such as academic advising, department meetings, mandatory caffeine pursuits, and so on. (An academic has no concept of a 40-hour work week. It doesn’t exist.)

This is a huge challenge that I’m struggling with. It is due, in part, to a very young, vibrant department of faculty who are going through the tenure process. Thus, the relatively few of us who have tenure take on a lot of the service obligations to protect them as they work through tenure. And, of course, I know, I know… the real reason? Because I often find it difficult to say, “NO!” Like I said, I work at a great place with wonderful colleagues. I believe it’s important to pay it forward. I had people senior to me who were once in my shoes and protected me from excessive service obligations, and I will do the same. The challenge is the imbalance in the department. We’ve had a lot of people retiring in recent years. In time, the balance should return to normal as others get through the tenure review process and can share the service burdens.

Anyway, the most important thing that has me excited? First, I’m teaching BOTH a data mining AND a data science course this Spring! Second – this summer 2019 is mine! All mine! [Insert-evil-laugh-here]. I have not had a summer for research since 2016. So, I have several projects that are ramping up, and I am looking for new students to work with me this summer. Funding is available. Send me an e-mail if you are interested.

Summer 2016

It has been quite some time since I’ve updated current events. Thanks to our students, we have had a pretty active summer…

  • Robert Cowen is continuing his work with me on word prediction models. We have good results and are writing our first paper. The first draft should be complete by the beginning of September.
  • Morgan Eckenroth has started work on the development of a virtual reality app (using Google Cardboard) that will be used by autistic children to help assess (and hopefully retrain) biases in their visual processing.
  • Khai Nguyen is working on a collaborative project, jointly funded by the College of Engineering, Chemical Engineering, and Computer Science. The aim of the project is to develop a new application for aerosol researchers in Chem Eng.
  • Ryan Stecher is working on a collaborative project with Dr. Aaron Mitchel in Psychology to develop and finalize a web-based series of perception tests.
  • Tongyu Yang has been investigating the use of deep learning to help autism researchers better understand why autistic children have substantial interest in certain types of images.

Son Pham, ’17

Project: Using Deep Learning to Automatically Learn Feature Representation and Build a Better Classification Model on Protein Sequential Data
Started: Summer 2015
Funding: Bucknell University PUR

ABSTRACT

In theory, deep learning is not new. However, it has recently become one of the most exciting directions that machine learning has witnessed in years. It has had a tremendous impact on image classification. However, very few studies have investigated its use on strictly sequential data, such as biological sequences. This study aims to investigate the use of deep learning to induce a protein sequence classifier that can outperform existing methods.

ACHIEVEMENTS

  • Poster Presentation – Sigma Xi 2015 Summer Research Symposium
  • Poster Presentation – Fifth Annual Susquehanna Valley Undergraduate Research Symposium, SVURS 2015, August 4, Bucknell University, Lewisburg, PA
  • Poster Presentation – 15th Annual Kalman Research Symposium, April 2, 2016, Bucknell University, Lewisburg, PA

POST GRADUATION UPDATES

Son graduated with his degrees in Computer Science and Engineering, together with Digital Studio Arts. He went on to work for Amazon as a Software Engineering Intern, then took a position at Google working with machine learning. Son graduated with the aim of going back to graduate school in 1-2 years.

Chuqiao Ren, ’15

Project: A novel ensemble classifier for protein contact map prediction
Duration: Summer 2013 – Spring 2015
Funding: Bucknell University Program for Undergraduate Research, BRK Startup Fund, Geisinger BGRI Grant, CS Dept. Fund

ABSTRACT

One of the greatest challenges in bioinformatics is how to predict the 3-D structure of a protein by understanding the relationship between its amino acid sequence and its structure. A protein contact map is a useful way of representing protein 3-D conformations. It is based on a distance matrix, which is a symmetric matrix that contains the Euclidean distance between the C-alpha atoms of each pair of residues in the folded protein.

Our goal is to improve existing machine learning algorithms for predicting a protein contact map from protein sequence, and develop a novel algorithm that improves the performance of existing contact map predictors.
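
As a rough illustration of the contact map representation described above, here is a minimal Python sketch. It is not the project's code: the function names, the toy coordinates, and the 8 Å cutoff used to turn distances into contacts are all assumptions made for the example.

```python
import math

def distance_matrix(ca_coords):
    """Symmetric matrix of Euclidean distances between C-alpha atoms."""
    n = len(ca_coords)
    return [[math.dist(ca_coords[i], ca_coords[j]) for j in range(n)]
            for i in range(n)]

def contact_map(ca_coords, threshold=8.0):
    """Binary map: 1 where two residues are within `threshold` angstroms.

    The 8.0 default is an assumed cutoff for illustration only.
    """
    dists = distance_matrix(ca_coords)
    return [[1 if d <= threshold else 0 for d in row] for row in dists]

# Hypothetical C-alpha coordinates for four residues laid out on a line,
# spaced 3.8 angstroms apart (toy data, not a real protein).
ca_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
cmap = contact_map(ca_coords)
for row in cmap:
    print(row)
```

In this toy chain, the first and fourth residues are 11.4 Å apart, so they are not marked as a contact, while every closer pair is. A contact map predictor tries to recover a matrix like this from the sequence alone, without access to the coordinates.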

ACHIEVEMENTS

  • Honors Thesis – Successfully defended, April 2015
  • Short paper and poster – ACM BCB ’14 – ACM International Conference on Bioinformatics, Computational Biology and Biomedicine, Sept 20-23, Newport Beach, CA [link]
  • Poster Presentation – Fourth Annual Susquehanna Valley Undergraduate Research Symposium, SVURS 2014, August 5, Geisinger Research, Danville, PA
  • Poster – Kalman Research Symposium 2014, March 29, Bucknell University, Lewisburg, PA.

POST GRADUATION UPDATES

Chuqiao successfully defended her honors thesis in April 2015. She is staying a bit longer this summer to help finish and submit a journal publication before she departs. She is currently planning to pursue her graduate degree in computer science at Columbia University, starting Fall 2015. Congratulations, Chuqiao!

Summer 2015

We have an active summer in store. Three students are working on entirely different research projects, while Rachel Ren is wrapping up her work.

  • Son Pham is working on investigating the use of Deep Learning for protein sequence classification. Deep Learning has recently gained substantial recognition due to its success with automated image recognition and speech classification. Very few have examined its use in bioinformatics. Son will help me explore this untapped area in bioinformatics.
  • Jason Hammett will be applying data mining techniques to years of regional climate data, including local stats for the Susquehanna River, to develop explanatory and predictive models for anomalous weather events around the Susquehanna River Valley.
  • Robert Cowen will be continuing the wonderful work that I started with Bucknell student Stephanie Gonthier last year on word prediction. Robert will be collaborating with me and speech pathologists at the Geisinger-Bucknell Autism and Developmental Medicine Institute (ADMI) to develop a preliminary version of a new augmentative and alternative communication (AAC) app that will utilize my word prediction model. This first version will be developed to run on Android tablets.
  • Rachel Ren is graciously staying for a month after graduating to help submit a paper based on her extensive work completed for her honors thesis. Stay tuned!

Spring 2015

Rachel Ren successfully defended her honors thesis, titled “Predicting Protein Contact Maps by Bagging Decision Trees”. Additionally, Rachel will be attending graduate school starting in the fall at Columbia University, where she will pursue a Master’s in Computer Science. Rachel intends to focus on research in machine learning.

Congratulations, Rachel! Bucknell is proud of you! We wish you the very best as you pursue your graduate work.

Stephanie Gonthier, ’15

Project: Using statistical learning to improve word prediction for augmentative and alternative communication
Duration: Summer 2014
Funding: Bucknell University Program for Undergraduate Research, Geisinger BGRI Grant

ABSTRACT

There are a multitude of reasons why people may be unable to communicate effectively through verbal speech, including disorders like ALS, MS, Cerebral Palsy and Autism. Some people use augmentative and alternative communication (AAC), which is simply any mode of communication besides verbal speech, including gestures, writing, facial expressions, pointing to pictures and so on. In recent decades, the field of AAC has been flooded by electronic devices which generate speech for these people based on combinations of pictures, symbols and/or words that are stored on the device. Unfortunately, these devices do present problems; notably, the communication rate with a device is reduced to a fraction of the communication rate of normal speakers. The average user of a device is only able to communicate 10 words per minute, compared to the 130-200 words per minute of an average speaker [ref]. This stark contrast can leave users frustrated, reducing the utility of such devices. The aim of this research is to develop a novel algorithm that would increase the communication rate for users of AAC devices.

ACHIEVEMENTS

  • Oral Presentation – Fourth Annual Susquehanna Valley Undergraduate Research Symposium, SVURS 2014, August 5, Geisinger Research, Danville, PA
    Winner for oral presentation – One of three chosen out of 86 submissions!
  • Poster Presentation – 2014 Sigma Xi Summer Student Research Symposium, July 24, Bucknell University, Lewisburg, PA

POST GRADUATION UPDATES

Elizabeth Dwornik, ’14

Project: Named-Entity Recognition
Duration: Summer 2013 – Spring 2014
Funding: Bucknell University Program for Undergraduate Research

ABSTRACT

Liz is working on a system that can annotate all of the named entities within a text. There are good systems that can identify named entities; however, identifying the type of named entity is a more challenging problem. Many successful systems use simple database lookup techniques and identify entities from a master gazetteer. We are working on a system that can distinguish among different types of named entities without a gazetteer. Our initial efforts will focus on distinguishing among location, organization, and person entities. We plan to start by developing a large set of regular expressions that can be used to classify the different types of entities.
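
To give a flavor of what regular-expression-based entity typing might look like, here is a small Python sketch. The patterns, the cue words, and the `classify_entity` helper are hypothetical examples written for this page, not the rules developed in the project, and a real system would need a far larger and more carefully ordered rule set.

```python
import re

# Hypothetical cue patterns, checked in order; the first match wins.
PATTERNS = [
    ("organization",
     re.compile(r"\b(Inc|Corp|LLC|Ltd|University|Institute|Company)\b")),
    ("person",
     re.compile(r"^(Mr|Mrs|Ms|Dr|Prof)\.?\s")),
    ("location",
     re.compile(r"\b(City|County|River|Valley|Street|Avenue)\b|,\s[A-Z]{2}$")),
]

def classify_entity(text):
    """Return a coarse entity type for an already-extracted entity string."""
    for label, pattern in PATTERNS:
        if pattern.search(text):
            return label
    return "unknown"

for entity in ["Bucknell University", "Dr. Aaron Mitchel",
               "Lewisburg, PA", "Susquehanna River Valley"]:
    print(entity, "->", classify_entity(entity))
```

Because no gazetteer is consulted, the rules rely only on surface cues such as titles, suffixes, and state abbreviations. An entity with none of these cues falls through to "unknown", which is exactly the kind of case a larger rule set would have to address.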

ACHIEVEMENTS

  • Poster: Kalman Research Symposium 2013, April 13, Bucknell University, Lewisburg, PA

POST GRADUATION UPDATES

Liz pursued graduate school studies at Carnegie Mellon University, starting Fall 2014. She enrolled in the Software Management program in the Information Networking Institute. Congratulations, Liz!

Matthew Rogge, ’17

Project: Analysis of Spike Timing Dependent Neural Networks for More Efficient Starting State Learning
Duration: Summer 2014
Funding: Bucknell University Program for Undergraduate Research

ABSTRACT

Artificial Neural Networks (ANN) have been a popular machine learning method for decades. They aim to simulate the behavior of the neurons in the biological brain. One particular type of ANN that is an especially accurate representation of biological neurons is the Spike-Timing Dependent ANN. These ANNs differ from traditional back propagation ANNs in that they rely on the timing and frequency of signals, rather than their strength, to learn and process information. This type of ANN is often ignored for many reasons, mostly due to the computational complexity of learning using these models. One substantial challenge lies in the difficulty of determining the initial configuration of the network. The time required to train the network is also a formidable challenge. My research seeks to eliminate one of these hurdles by deriving an efficient algorithm that can determine the proper starting configuration for the ANN.

ACHIEVEMENTS

  • Poster: 2014 Sigma Xi Summer Student Research Symposium, July 24, Bucknell University, Lewisburg, PA

POST GRADUATION UPDATES

Charles Cole, ’14

Project: Using Machine Learning to Predict the Health of HIV-Infected Patients
Duration: Summer 2012 – Spring 2014
Funding: Bucknell University PUR, Biology Dept. and CS Dept. Funding

ABSTRACT

HIV is one of the most devastating viruses to hit mankind in modern history. About half of people infected will acquire AIDS. For some, however, the virus will remain in a stage known as “clinical latency” for 10, perhaps up to 20 years; in this stage, the symptoms are mild, sometimes even nonexistent. This study aims to investigate whether specific patterns in the genome of HIV are associated with the prognosis of the infected patient. Discovery of such patterns could improve researchers’ understanding of the genetics of HIV and identify patterns that could help doctors predict patient prognosis more accurately. Moreover, the identification of specific mutations or recurring patterns that are highly deleterious to the infected patient could aid in the development of drugs to target those genes containing the deleterious mutations.

ACHIEVEMENTS

  • Honors thesis defense passed – April 25, 2014
  • Short paper and poster: ACM BCB ’13 – ACM International Conference on Bioinformatics, Computational Biology and Biomedicine, Sept 22-25, Washington DC
  • Oral presentation: Third Annual Susquehanna Valley Undergraduate Research Symposium, SVURS 2013, August 6, Geisinger Research, Danville, PA
    • Winner for oral presentation – One of three chosen out of 67 submissions!
  • Poster: Kalman Research Symposium 2013, April 13, Bucknell University, Lewisburg, PA.

POST GRADUATION UPDATES

Charles was accepted into a pre-med program at Temple University, and will be starting medical school immediately thereafter.