CS 359: Topics in Artificial Intelligence

    Introduction to Discourse and Dialogue

 
Assignment 1
Due: Tuesday, October 2nd
Theme: Collect and examine spoken discourse

Procedure:
Collect a sample of naturally occurring spoken discourse. "Naturally occurring" can be broadly construed to include radio or talk shows, children's play, radio or TV news items, spontaneous or scripted storytelling, classroom interactions, task-oriented conversations, classroom lectures, etc. By "collect" we mean you should make either an audio or a video recording. You should collect a minimum of 10 minutes, and then transcribe at least 5 continuous minutes (the middle of the discourse is usually the most natural). By "transcribe" we mean you should make a record on paper of what you saw and heard, one good enough that when we read the transcript, we know what went on. Please explain any special annotations you use to indicate speaker overlap, pausing, intonation, gesture, etc.

Discussion:
One point of this assignment is to push you to think about what discourse is and what makes it hard to model in a computational system. You may want to have an interactive system in mind when you choose your sample. Think about how a computer could replace a participant in the discourse. Supposing that you had perfect word recognition, what would be the most challenging issues in processing the discourse? Are some of these challenges specific to the sample domain you chose? A second point is to think about what makes a sufficient record of discourse: how do you turn a speech event into an on-paper transcript? What parameters need to be transcribed (the words, the pronunciation of words, the intonation, the facial expressions, the gestures, fidgeting, pauses, etc.)? We ask you to turn in the printed transcription and a discussion of the points listed above. That is, at a minimum, discuss what makes an adequate transcription and what challenges a computer might face in participating in the discourse you collected.