Crossroads The ACM Magazine for Students

Association for Computing Machinery

Articles Tagged: Language resources

Articles & Features

National Centre for Text Mining (NaCTeM)


By Georgios Kontonatsios, Matt Shardlow, October 2014

An approach for detecting prosodic phrase boundaries in spoken English

Prosodic phrasing is the means by which speakers of any given language break up an utterance into meaningful chunks. The term "prosody" itself refers to the tune or intonation of an utterance, and therefore prosodic phrases literally signal the end of one tune and the beginning of another. This study uses phrase break annotations in the Aix-MARSEC corpus of spoken English as a "gold standard" for measuring how closely prosodic phrases correspond to one discrete syntactic grouping, the prepositional phrase. Prepositional phrases are identified by a chunk parsing rule written for nltk_lite's regular expression chunk parser.
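To make the idea of a chunk parsing rule concrete, here is a minimal sketch of a regular-expression chunker over part-of-speech tags. It uses only Python's standard `re` module rather than nltk_lite, and the rule itself (`PP_RULE`), the function name `chunk_pps`, and the Penn Treebank style tags are assumptions chosen for illustration; the article's actual rule is not given in the abstract.

```python
import re

# Hypothetical rule (not the authors'): a preposition (IN), an optional
# determiner (DT), any number of adjectives (JJ), then one or more nouns
# (NN, NNS, NNP, NNPS).
PP_RULE = re.compile(r"<IN>(?:<DT>)?(?:<JJ>)*(?:<NN[A-Z]*>)+")


def chunk_pps(tagged):
    """Return (start, end) token spans, end exclusive, for each matched chunk."""
    starts, ends, parts, pos = {}, {}, [], 0
    for i, (_, tag) in enumerate(tagged):
        piece = f"<{tag}>"
        starts[pos] = i          # character offset where token i begins
        pos += len(piece)
        ends[pos] = i + 1        # character offset where token i ends
        parts.append(piece)
    tag_string = "".join(parts)  # e.g. "<DT><NN><VBD><IN>..."
    return [(starts[m.start()], ends[m.end()])
            for m in PP_RULE.finditer(tag_string)]


# Invented example sentence, already tagged by hand.
sentence = [("the", "DT"), ("cat", "NN"), ("sat", "VBD"), ("on", "IN"),
            ("the", "DT"), ("old", "JJ"), ("mat", "NN"), ("in", "IN"),
            ("London", "NNP")]
print(chunk_pps(sentence))  # [(3, 7), (7, 9)]
```

Each span marks a candidate prepositional phrase ("on the old mat", "in London"), whose edges can then be compared with the corpus phrase break annotations.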

A three-way comparison is also introduced between the "gold standard," the chunk parsing rule, and human judgment in the form of intuitive predictions about phrasing. Results show that even with a discrete syntactic grouping and a small sample of text, problems may arise for this rule-based method due to uncategorical behavior in parts of speech. Lack of correspondence between intuitive prosodic phrases and corpus annotations highlights the optional nature of certain boundary types. Finally, there are clear indications, supported by corpus annotations, that significant prosodic phrase boundaries occur within sentences and not just at full stops.
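One common way to quantify correspondence between two sets of boundary positions is precision and recall. The sketch below is an assumption about how such a comparison could be scored, not necessarily the measure used in the study; the function name `boundary_agreement` and all boundary positions are invented for illustration.

```python
def boundary_agreement(predicted, reference):
    """Return (precision, recall) of predicted boundaries against a reference set."""
    predicted, reference = set(predicted), set(reference)
    hits = len(predicted & reference)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall


# Invented token positions after which a phrase break is placed.
gold = {3, 9, 12, 15}       # stand-in for corpus annotations
rule = {3, 7, 9}            # stand-in for chunk rule output
intuition = {3, 9, 15, 20}  # stand-in for a human prediction

# Three pairings give the three-way comparison.
for name, (a, b) in {"rule vs gold": (rule, gold),
                     "intuition vs gold": (intuition, gold),
                     "rule vs intuition": (rule, intuition)}.items():
    precision, recall = boundary_agreement(a, b)
    print(f"{name}: precision={precision:.2f} recall={recall:.2f}")
```

Low recall against the corpus annotations would be consistent with boundaries that the rule cannot predict, while low precision would point to optional boundaries that speakers did not realize.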

By Claire Brierley, Eric Atwell, December 2007
