CSE507


Course

CSE507

Title

Introduction to Computational Linguistics

Description

An introduction to the techniques, literature, technologies, and current challenges of computational linguistics. The survey covers four areas of the field, using English as the example language: mathematical foundations, syntax, semantics, and discourse.

Prerequisite

CSE537 for Computer Science Students.

Credit Information

3 credits

Course Topics
  1. Mathematical Foundations and Words
    1. Regular expressions and finite-state automata
    2. Finite-state transducers and morphology
    3. Review of basic probability and statistics
    4. N-gram language models
    5. Part-of-speech tagging
  2. Syntax
    1. Context-free grammars
    2. Parsing using context-free grammars
    3. Features and unification
    4. Grammars of English
    5. Lexicalized and probabilistic parsing
  3. Semantics
    1. Propositional and first-order logic, with extensions to handle time, default reasoning, and probabilistic reasoning
    2. Representing meaning
    3. Semantic analysis
    4. Lexical semantics
  4. Discourse
    1. Reference resolution
    2. Discourse structure: text and dialogue
    3. Dialogue systems

Special topics, which are interspersed with the regular course topics, may include information extraction, alternative grammar formalisms (e.g. tree-adjoining grammars), machine translation, and natural language generation.

Textbook(s)

D. Jurafsky and J. Martin (2000), Speech and Language Processing, Prentice Hall, NJ.
The textbook is augmented with selected chapters from other books:

Topic 1.3 -- C. Manning and H. Schütze (1999), Foundations of Statistical Natural Language Processing, Chapter 2.
Topic 2.4 -- J. Allen, Natural Language Understanding, Chapter 5.

As the course progresses, students are increasingly directed to original research papers on the topics under discussion. A selection of these papers is placed on reserve each semester.

Prerequisites by topic

None.

Course Objectives
  1. Make students familiar with the range of tools and techniques used in computational linguistics. Show them where to find the literature in this field, and give them confidence in reading it. In short, improve their abilities as computational linguists.
  2. Give students the ability to look at a computational linguistics task and identify useful techniques for solving it. Encourage them to think creatively and give them experience in planning and conducting experiments in this field. In short, improve their abilities as researchers.
  3. Encourage students to collaborate with others in different fields. Because this course is interdisciplinary, computer scientists end up working with linguists and psycholinguists, which is what routinely happens in computational linguistics.

Computer Usage

Perl, Prolog. Students may also choose other languages and tools for projects, e.g. Java, HTK, C/C++. There are no labs for this course, but heavy use is made of the Blackboard system.

Course Webpage

http://www.cs.sunysb.edu/~cse507

Course Coordinator

Dr. Amanda Stent