Curious about techniques and methods for applying data science to unstructured text? Join us at the DC NLP November Meetup!
This month's event features an overview of Latent Dirichlet Allocation and probabilistic topic modeling.
Topic models are a family of models that estimate the distribution of abstract concepts (topics) across a collection of documents. Topic modeling has grown rapidly in popularity over the last several years, and one model, Latent Dirichlet Allocation (LDA), is especially widely used.
Tommy Jones, Research Associate Statistician at the Institute for Defense Analyses - Science and Technology Policy Institute, will describe a range of topic modeling algorithms and how they fit into the topic modeling taxonomy. He will then focus on LDA, explaining how to tune its parameters and giving tips for building better LDA models.
Finally, Tommy will present several open statistical questions in topic modeling, particularly LDA. Examples include LDA's inconsistency, how sample selection affects estimates, and how to best present results. Researchers have begun to tackle some of these issues, but others remain. Still, LDA and other topic models are becoming invaluable resources for researchers in many disciplines.
DC NLP November Meetup
----------------------
Wednesday, November 12, 2014
6:30 PM to 8:30 PM
Stetsons Famous Bar & Grill
1610 U Street Northwest, Washington, DC
The DC NLP meetup group is for anyone in the Washington, D.C. area working in (or interested in) Natural Language Processing. Our meetings provide an opportunity for folks to network, give presentations about their work or research projects, learn about the latest advancements in our field, and exchange ideas or brainstorm. Topics include computational linguistics, machine learning, text analytics, data mining, information extraction, speech processing, sentiment analysis, and much more.