Hadoop: It's not just for MapReduce
Tyler A. Young
Dr. Alan Garvey, Faculty Mentor
Hadoop MapReduce is a popular framework for performing computationally intensive tasks by harnessing computing clusters. At present, though, some common use cases are much easier to implement in MapReduce than others. In this presentation, I describe a design pattern implemented at the University of Illinois at Urbana-Champaign for using Hadoop to distribute non-parallelizable natural language processing (NLP) tasks across a cluster. This pattern allowed us to avoid time-consuming modifications to the existing NLP tools while still benefiting from massive data parallelism.
Keywords: distributed computing, natural language processing
Topic(s): Computer Science
Presentation Type: Oral Paper
Session: 402-1
Location: VH 1010
Time: 2:30