5 Questions with Amy Apon
Welcome to the third installment of our “5 Questions” interview series. This month we chatted with Amy Apon, Ph.D., professor and chair of the Division of Computer Science at Clemson University.
Dr. Apon maintains an active research program at Clemson, where she works on cloud computing, performance modeling and analysis of parallel and distributed systems, and data-intensive computing, among other areas. Her research program is currently supported by a number of organizations, including the National Science Foundation, the Department of Education, BMW, HPCC Systems, LexisNexis, Elsevier Scopus, RELX Group, and Amazon.
We recently spoke with Dr. Apon to discuss her career in education and research and where she hopes to take her program in the future.
Why did you pursue a career in higher education and research?
When I was a little girl, I always thought I would be a school teacher, and I had that in my head through my undergraduate degree. Then I had the opportunity to go to graduate school and found that I also enjoyed teaching at the college level. I would say that it’s not been one big decision but a number of small steps, and at each point you take the best option that you have.
I had the opportunity to teach at a couple of different institutions, and then I went back to school to pursue a Ph.D. in Computer Science. That really opened doors for me. It gave me the opportunity to think about doing research projects and mentor other people who were interested in pursuing academic careers. Anyone who wants to go into academics should love working with students because that’s a big part of the job.
You can love research and maybe not like teaching, or you can love teaching and maybe not like research, but to do the job that I’m doing at a place like Clemson, you have to actually love both to be effective.
If there was one thing you could tell the world, as it relates to your research, what would it be?
I would tell prospective researchers to have fun and be open to the possibilities. I started out in performance analysis of parallel and distributed computing systems, but through the years, I’ve had more collaborations in different areas. Today, I’m actually doing a lot of collaboration in areas of natural language understanding. It’s important to be open to the possibilities of new research areas as they present themselves.
Another thing I would tell prospective researchers is that it’s okay to fail, because one of the things I’ve found is that the chances of making the right hypothesis the first time are not too high. Most often, you’re going to find the thing you tried didn’t work exactly the way you thought it would. That is why we enjoy working with HPCC Systems, because it’s such an interesting distributed and parallel environment. The open source platform provides a lot of opportunity for investigation into the performance and efficiency aspects of parallel systems.
What are the strengths of HPCC Systems compared to other platforms you may have used?
HPCC Systems is an all-in-one integration of data storage, transfer, preparation, and analysis tools. You can get this type of environment with other platforms, but you often have to put the pieces together yourself. HPCC Systems brings them into a single, integrated environment. We also found the ECL language to be useful in data preparation and analysis. We’ve had some messy data sets that we’ve tried to work with, and ECL is a great tool for data cleaning and curating, so that we can do the analysis that we’re actually trying to do.
How has HPCC Systems increased your team’s research capabilities?
We have moved into areas of text analysis as a part of natural language understanding, and we have found that the ability of machine learning libraries to analyze and understand the sentiment or topics of the text has been helpful in doing this analysis. The corpus of journal articles we have is pretty complex. For example, the way people speak and write journal articles changes over a period of decades, and we’ve been able to analyze that complex data set using HPCC Systems.
What do you consider to be your most promising area of research in the near future?
As you pointed out, in the past, I had worked in areas of performance analysis. We’ve done some experiments in HPCC Systems where we looked at some of the performance monitoring and profiling tools. For the future, we are focusing on natural language understanding. I’m doing this with collaborators here at Clemson, who have expertise in text analysis, recommendation systems and various kinds of word and text analysis tools. I think it’s fascinating because there are so many ways that text and language can be used as a part of our lives.
We want to understand not just written language, but also spoken language. Thinking about the way people use social media, we want to be able to analyze some of those trends and see how they can be both predictive and descriptive of things that are happening in certain areas. There’s an abundance of ways that natural language understanding can be useful in impacting our lives. I’m very excited about that as a research area going forward.
Want to hear more from Amy’s interview with Flavio Villanustre, VP of Technology, LexisNexis Risk Solutions? Listen to the webcast where Amy continues her discussion with Flavio, diving into her research and work with HPCC Systems as well as what she hopes to accomplish in the future.