This introduction to computational linguistics stresses the processing of written language, with supplementary discussion of topics related to spoken language. The course is based on the textbook Speech and Language Processing (Daniel Jurafsky and James H. Martin, Prentice Hall, 1999). The course covers finite state automata and finite state techniques for processing words, language models, part-of-speech tagging of corpora, context-free grammars, parsing techniques, unification grammars and unification-based parsing, probabilistic parsing, semantics, discourse modeling, word sense disambiguation and information retrieval, natural language generation, and machine translation.
Additional requirements for doctoral students include:
Read six research papers from a list provided by the instructor
Write a one-page reaction to each paper
Submit a research project at the end of the semester, developed under the instructor's direct guidance. A research project differs from a regular project in that it addresses an open research problem for which no optimal solution is currently known. The final report on the project will take the form of a conference paper and must be of sufficient quality for submission to a first-tier conference or workshop in NLP or IR. The final project will be evaluated using a realistic benchmark.