DC NLP May 2014 Meetup Announcement: Corpus Linguistics and Geolocating Tweets

Curious about techniques and methods for applying data science to unstructured text? Join us at the DC NLP May Meetup!


This May, look forward to talks on narrative schemas across corpora and techniques to geolocate Twitter users.

Dan Simonson is a third-year PhD candidate in the Department of Linguistics at Georgetown University. His presentation addresses whether narrative knowledge is helpful in the cross-comparison of corpora. It builds on the work of Chambers and Jurafsky (2008, 2009), which extracts narrative schemas, and devises a measure for the cross-comparison of sets of schemas.

Ryan McKeown has a PhD in theoretical physics from Penn State University and is currently employed as a data scientist at Booz Allen Hamilton. He'll be presenting his recent work on determining the geolocation of Twitter users using multiple classifiers that combine NLP techniques and social network analysis.

DC NLP May Meetup

Wednesday, May 14, 2014

6:30 PM to 8:30 PM

Stetson's Famous Bar & Grill

1610 U Street Northwest, Washington, DC

The DC NLP meetup group is for anyone in the Washington, D.C. area working in (or interested in) Natural Language Processing. Our meetings will be an opportunity for folks to network, give presentations about their work or research projects, learn about the latest advancements in our field, and exchange ideas or brainstorm. Topics may include computational linguistics, machine learning, text analytics, data mining, information extraction, speech processing, sentiment analysis, and much more.

For more information and to RSVP, please visit:


Data Science MD Unveils YouTube Channel

Data Science MD, in an effort to provide additional value to its members, has started a YouTube channel, DataScienceMD, to host videos of talks presented at Meetup events. Now, when members can't attend an event because of a scheduling conflict or being out of town, they can watch the videos afterward to stay in the loop. However, we know the more likely scenario: seeing the talks in person will not be enough, and you will want to watch them again and again. (It's OK, we won't tell the presenters how often you are watching them.)

The presentations are available in two formats: individual video entries, each covering one specific presentation, and playlists, which group all presentations from an event into one package so you can relive it in its entirety. The default view when first visiting the channel shows the most recent activity. Clicking the Videos link just below the channel title shows the individual presentations. To see the playlists, simply change the Uploads box to Playlists.

The playlist above is from our May Meetup, which featured Cloudera consultants Joey Echeverria and Sean Busbey discussing an infrastructure option that can make analyzing Twitter data quick and simple, as well as an introduction to one of the many features of Apache Mahout. These were not just static presentations; they also included live demonstrations and queries against data stored within the infrastructure, and it was all captured in the videos. Check them out!