Language and Computation Seminar, 2003
The Acquisition of Lexical & Ontological Knowledge
This seminar has now ended; this page is kept as a pointer to the literature.
This page: http://cswww.essex.ac.uk/staff/poesio/LAC/LAC03-04/lexical_seminar_syllabus.html
- Background I: The lexicon
- Cruse, D.A. Lexical Semantics. Cambridge
University Press, 1986.
- J. Pustejovsky, The
Generative Lexicon. MIT Press, 1995.
- Marconi, D. Lexical Competence. MIT Press,
- Murphy, G. L. The Big Book of Concepts. MIT Press,
- Background II: Hand-coded lexical resources
- WordNet: the standard reference is the book edited by C.
Fellbaum, WordNet, MIT Press, 1998. A number of papers and manuals about WordNet,
as well as the system itself, can be downloaded from the
- COMLEX and NOMLEX: C. MacLeod, R. Grishman, A. Meyers,
L. Barrett, and R. Reeves. NOMLEX: A lexicon of nominalizations. Proc. of
- Oxford Dictionary of English: we obtained the ODE as part of
a joint project with Oxford. The latest extensions are described in McCracken's
paper at EACL 03.
- Background III: Ontologies
Research on the lexicon is closely connected to research on ontologies. Here are some of the
many web sites dedicated to ontologies.
- Background IV: Information Theory
- Background V: Acquiring lexical information
about verbs (verb classes)
Much work in lexical acquisition concerns the lexical properties of verbs, particularly subcategorization
and selectional restrictions.
The classic work in this area is covered in Manning and Schuetze, chapter 8, which we read last year.
This area remains very active. A lot of new work has come up in connection with the FrameNet project.
- Michael Brent (1993):
From grammar to lexicon: unsupervised learning of lexical syntax. Computational
- Christopher D. Manning (1993): Automatic acquisition
of a large subcategorization dictionary from corpora. Proceedings
of the 31st Meeting of the ACL, pp. 235-242. Columbus,
- Phil Resnik (1993). Selection
and Information: A Class-Based Approach to Lexical Relationships - Cognition
- Paola Merlo and Suzanne Stevenson (2001). Automatic verb classification based on
statistical distributions of argument structure. Computational
Linguistics, 27(3), 373-408.
- Sabine Schulte im Walde (2003). Experiments on the choice of features for
learning verb classes. Proc. of EACL.
- Old work we didn't get a chance to read:
- October 8th: Adam Kilgarriff (CS Seminar)
Some references for those who want to read more about Kilgarriff's work on thesauri and
- Adam Kilgarriff and David Tugwell (2001). "WORD
SKETCH: Extraction and Display of Significant Collocations for
Lexicography". In Proc. workshop "COLLOCATION: Computational
Extraction, Analysis and Exploitation", pp. 32-38. 39th ACL & 10th
EACL, Toulouse, July
- A. Kilgarriff & C. Yallop (2000). "What's in a
Thesaurus". Proc. Second Conf. on Language Resources and Evaluation,
Athens, May/June, pp. 1371--1379.
- Vector Space Representations from Psychology (and
their application in NLP)
- October 15th (Massimo): HAL (Lund and Burgess,
1996; Burgess, 1998)
- October 30th (Massimo) Latent Semantic Analysis. Landauer, T. K., Foltz, P. W.,
& Laham, D. (1998). Introduction to Latent Semantic Analysis. Discourse
Processes, 25, 259-284.
- November 6th (Mijail): Applications of LSA to segmentation. F.Y.Y.
Choi, P. Wiemer-Hastings and J. Moore. "Latent semantic analysis for text segmentation". In
Proceedings of the 6th Conference on Empirical Methods in Natural Language
Processing, pp. 109-117, 2001.
- Thesaurus Acquisition
- November 13th (Massimo): James R. Curran and Marc
Moens (2002). Improvements in automatic thesaurus extraction.
In Proceedings of the ACL-02 Workshop on Unsupervised Lexical Acquisition.
pages 59-66.
- November 19th (Hala): Dekang Lin (1998). Automatic
Retrieval and Clustering of Similar Words. COLING-ACL98, Montreal,
- November 27th (Olivia): G. Grefenstette's work. (E.g.,
this rather useful TR, thanks to G.
- Other interesting papers we didn't have the time to
- December 2nd: Lexical acquisition in children (Sonja)
- Christmas Holidays!!
- January 15th (Abdulrahman / Massimo): Manning and Schuetze, chapter 14
- January 22nd (Abdulrahman / Massimo): Manning and
Schuetze, chapter 14, continued
- February 5th (Massimo): First meeting on Distributional
Clustering and Lillian Lee's work (the papers,
including her dissertation, are available from her home
Fernando Pereira, Naftali Tishby, and Lillian Lee. Distributional Clustering of English Words.
Proceedings of the 31st ACL, pp 183--190, 1993
- February 12th (Massimo): Baker and McCallum,
Clustering of Words for Text Classification, SIGIR 1998.
- March 4th (Massimo): More Distributional Clustering: Ido Dagan, Lillian Lee, and Fernando Pereira (1997).
Similarity-Based Methods for Word Sense Disambiguation.
Proceedings of the 35th ACL/8th EACL, pp 56--63.
- March 11th (Abdulrahman): Clustering adjectives.
Hatzivassiloglou & McKeown,
Clustering adjectives according to meaning, ACL 1993.
- March 25th (Massimo): Clustering senses: Schuetze's work.
Schuetze, H. 1998. Automatic
word sense discrimination.
- More wordsense clustering: McCarthy & Carroll CL
- See also: Adam Berger's dissertation on applying
similarity to IR
- Acquisition of taxonomic knowledge using syntactic
- April 15th (Massimo): Hearst,
M.A. (1998). "Automated Discovery of WordNet Relations", in WordNet: An
Electronic Lexical Database, Christiane Fellbaum (ed.), MIT Press, Cambridge MA.
- Some useful older references cited by Hearst:
- Hiyan Alshawi (1987). Processing dictionary definitions
with phrasal pattern hierarchies. Computational Linguistics, 13,
- M. Chodorow, R. Byrd, and G. Heidorn (1985). Extracting
semantic hierarchies from a large on-line dictionary. Proc. of the ACL,
- J. Markowitz, T. Ahlswede, and M. Evens (1986).
Semantically significant patterns in dictionary definitions. Proc. of the
24th ACL, 112-119.
- May 12th: Sharon A. Caraballo (1999)
Automatic construction of a hypernym-labeled noun hierarchy from text.
In Proceedings of the 37th Annual Meeting of the Association for
Computational Linguistics,
- Also: Pantel & Ravichandran, Automatic Labeling of
Semantic Classes, Proc. NAACL 2004
- See also: Sharon A. Caraballo (2001).
Automatic Construction of a Hypernym-Labeled Noun Hierarchy from Text.
Ph.D. dissertation, Computer Science Department, Brown University.
- More interesting papers by Pantel available from his
- Patrick Pantel and Dekang Lin. 2002. Discovering
Word Senses from Text. In Proceedings of ACM SIGKDD Conference on
Knowledge Discovery and Data Mining (KDD-02).
pp. 613-619. Edmonton, Canada.
- Dekang Lin and Patrick Pantel. 2002. Concept
Discovery from Text. In Proceedings of Conference on Computational
pp. 577-583. Taipei, Taiwan.
- Dekang Lin and Patrick Pantel. 2001. DIRT
- Discovery of Inference Rules from Text. In Proceedings of ACM
SIGKDD Conference on Knowledge Discovery and Data Mining (KDD-01).
pp. 323-328. San Francisco, CA.
- May 19th (Abdulrahman): Dominic Widdows - e.g., Unsupervised
methods for developing taxonomies by combining syntactic and statistical
information, NAACL 2003. Other papers by Widdows are available
from his homepage. Of
particular interest: "A Graph Model for Unsupervised Lexical
Acquisition" (COLING 2002, with Beata Dorow - building on Riloff
and Shepherd's 1997 work); and "Using LSA and Noun Coordination to
Improve the Precision and Recall of Automatic Hyponymy Construction"
(with Scott Cederberg; CoNLL 2003; building on the Hearst 1992/1998
- See also: Steffen
Staab's page and the many papers on ontology acquisition there by the
Karlsruhe group - e.g.,
- P. Cimiano, A. Hotho, S. Staab. Comparing conceptual, partitional and
agglomerative clustering for learning taxonomies from text. In Proc of
ECAI-2004, Valencia, August 2004
- A. Mädche, S. Staab. Measuring
Similarity between Ontologies. In: Proc. of the European Conference on
Knowledge Acquisition and Management - EKAW-2002. Madrid, Spain, October
1-4, 2002. LNCS, Springer, 2002.
- May 26th: No meeting (LREC)
- The acquisition of ontological information and concept hierarchies
(Udo and Hala)
- June 16th: no meeting (Massimo away)
- Extracting part-of relations using Syntactic
June 24th (Massimo):
- M. Berland and E. Charniak. (1999) Finding
parts in very large corpora. In Proceedings of the ACL, pages
57--64, College Park, MD,
- Massimo Poesio, Tomonori Ishikawa, Sabine Schulte im Walde and Renata
Vieira (2002). "Acquiring lexical knowledge for anaphora
resolution", Proc. of LREC, Las Palmas, May.
- R. Girju, A. Badulescu, and D. Moldovan (2003)
Semantic Constraints for the Automatic Discovery of Part-Whole Relations.
In the Proceedings of the Human Language Technology Conference,
Edmonton, Canada, May-June
- The acquisition of causal and propositional knowledge
- June 29th (Olivia): R. Girju and D. Moldovan (2002), Mining
Answers for Causation Questions.
In the Proceedings of the American Association for Artificial Intelligence
(AAAI) - Spring Symposium, Stanford University, California, March
- Other interesting readings:
Conferences in the area
Other useful Web links: