Saturday 5:10 p.m.–5:40 p.m.

MTO On Blast: Using Python's Natural Language Toolkit to Model Gossip Blogs

Robert Elwell

Audience level: Novice
Category: Other

Description

This talk describes a project that uses the Natural Language Toolkit to build a language model from a gossip blog. The tone is light-hearted, but the talk still introduces some core concepts of Python's most popular NLP library, along with some basics of computational linguistics and programming in Python.

Abstract

This talk shows how to use Python to programmatically retrieve content from a website and build a language model from that content. We apply the Natural Language Toolkit (NLTK) library to posts from the gossip blog MediaTakeOut. Along the way, we introduce Python as a tool for data-driven approaches to natural language processing. The talk is intended for beginners with an interest in natural language processing. It is based on a post from my personal blog; for more details, please see http://robertelwell.info/blog/mto-on-blast-a-language-model-for-a-gossip-blog/
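
To give a feel for what "building a language model from that content" can mean, here is a minimal sketch of a bigram model. It is not the code from the talk or the blog post: it uses only the Python standard library instead of NLTK so that it runs without extra dependencies, and the sample text, along with the function names `tokenize`, `build_bigram_model`, and `generate`, are hypothetical stand-ins chosen for illustration.

```python
import random
import re
from collections import Counter, defaultdict

# Hypothetical stand-in for text retrieved from a blog; not real site content.
SAMPLE_TEXT = (
    "the star was seen at the club . "
    "the star was not happy . "
    "the club was packed ."
)


def tokenize(text):
    """Lowercase the text and split it into word and punctuation tokens."""
    return re.findall(r"[a-z']+|[.!?]", text.lower())


def build_bigram_model(tokens):
    """Map each token to a Counter of the tokens observed right after it."""
    model = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        model[current][following] += 1
    return model


def generate(model, start, length=8, seed=0):
    """Walk the model from `start`, sampling each next token by its count."""
    rng = random.Random(seed)  # fixed seed keeps the output repeatable
    words = [start]
    for _ in range(length - 1):
        followers = model.get(words[-1])
        if not followers:  # dead end: this token was never followed by anything
            break
        choices, weights = zip(*followers.items())
        words.append(rng.choices(choices, weights=weights)[0])
    return " ".join(words)


tokens = tokenize(SAMPLE_TEXT)
model = build_bigram_model(tokens)
print(model["the"].most_common())  # which words follow "the", and how often
print(generate(model, "the"))      # a short string sampled from the model
```

In a real project, the sample string would be replaced by text extracted from downloaded pages, and NLTK's tokenizers and frequency distributions could take the place of the hand-written helpers.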