Language and Computation Seminar, 2003/2004:
The Acquisition of Lexical & Ontological Knowledge
This seminar has now ended; this page will be kept
as a
pointer to the literature.
This page: http://cswww.essex.ac.uk/staff/poesio/LAC/LAC03-04/lexical_seminar_syllabus.html
- Background I: The lexicon
- Cruse, D.A. Lexical Semantics. Cambridge
University Press, 1986.
- J. Pustejovsky, The
Generative Lexicon. MIT Press, 1995.
- Marconi, D. Lexical Competence. MIT Press,
1997.
- Murphy, G. L. The Big Book of Concepts. MIT Press,
2002.
- Background II: Hand-coded lexical resources
- WordNet: the standard reference is the book edited by C.
Fellbaum, WordNet, MIT Press, 1998. A number of papers and manuals about WordNet,
as well as the system itself, can be downloaded from the
project's website
- COMLEX and NOMLEX: C. MacLeod, R. Grishman, A. Meyers,
L. Barrett, and R. Reeves. NOMLEX: A lexicon of nominalizations. Proc. of
EURALEX, 1998.
- Oxford Dictionary of English: we got the ODE as part of
a joint project with Oxford. A description of the latest extensions is in McCracken's
paper at EACL 03.
- Background III: Ontologies
Research on the lexicon is closely connected to research on ontologies. Here are some of the
many web sites dedicated to ontologies.
- Background IV: Information Theory
- Background V: Acquiring lexical information
about verbs (verb classes)
Much work in lexical acquisition concerns lexical properties of verbs, particularly subcategorization
and selectional restrictions.
The classic work in this area is covered in Manning and Schuetze, chapter 8, which we read last year.
- Michael Brent (1993):
From grammar to lexicon: unsupervised learning of lexical syntax. Computational
Linguistics, 19(3):243-262.
- Christopher D. Manning (1993): Automatic acquisition
of a large subcategorization dictionary from corpora. Proceedings
of the 31st Meeting of the ACL, pp. 235-242. Columbus,
Ohio.
- Phil Resnik (1993). Selection
and Information: A Class-Based Approach to Lexical Relationships - Cognition
- Paola Merlo and Suzanne Stevenson (2001). Automatic verb classification based on
statistical distributions of argument structure. Computational
Linguistics, 27(3), 373-408.
- Sabine Schulte im Walde (2003). Experiments on the choice of features for
learning verb classes. Proc. of EACL.
This area remains very active. Much new work has appeared in connection with the FrameNet project.
- Old work we didn't get a chance to read:
- October 8th: Adam Kilgarriff (CS Seminar)
Some references for those who want to read more about Kilgarriff's work on thesauri and
about WASPS:
- Adam Kilgarriff and David Tugwell (2001). "WORD
SKETCH: Extraction and Display of Significant Collocations for
Lexicography". In Proc. workshop "COLLOCATION: Computational
Extraction, Analysis and Exploitation", pp. 32-38. 39th ACL & 10th
EACL, Toulouse, July.
- A. Kilgarriff & C. Yallop (2000). "What's in a
Thesaurus". Proc. Second Conf. on Language Resources and Evaluation,
Athens, May/June, pp. 1371-1379.
- Vector Space Representations from Psychology (and
their application in NLP)
- October 15th (Massimo): HAL (Lund and Burgess,
1996; Burgess, 1998)
- October 30th (Massimo) Latent Semantic Analysis. Landauer, T. K., Foltz, P. W.,
& Laham, D. (1998). Introduction to Latent Semantic Analysis. Discourse
Processes, 25, 259-284. (pdf)
- November 6th (Mijail): Applications of LSA to segmentation. F.Y.Y.
Choi, P. Wiemer-Hastings and J. Moore. "Latent semantic analysis for text segmentation". In
Proceedings of the 6th Conference on Empirical Methods in Natural Language
Processing, pp. 109-117, 2001. (pdf)
- Thesaurus Acquisition
- November 13th (Massimo): James R. Curran and Marc
Moens (2002). Improvements in automatic thesaurus extraction.
In Proceedings of the ACL-02 Workshop on Unsupervised Lexical Acquisition.
pages 59-66 (pdf).
- November 19th (Hala): Dekang Lin (1998). Automatic
Retrieval and Clustering of Similar Words. COLING-ACL98, Montreal,
Canada, August.
- November 27th (Olivia): G. Grefenstette's work. (E.g.,
this rather useful TR, thanks to G.
Grefenstette.)
- Other interesting papers we didn't have the time to
cover:
- December 2nd: Lexical acquisition in children (Sonja)
- Christmas Holidays!!
- Clustering:
- January 15th (Abdulrahman / Massimo): Manning and Schuetze, chapter 14
- January 22nd (Abdulrahman / Massimo): Manning and
Schuetze, chapter 14, continued
- February 5th (Massimo): First meeting on Distributional
Clustering and Lillian Lee's work (the papers,
including her dissertation, are available from her home
page)
- Fernando Pereira, Naftali Tishby, and Lillian Lee. Distributional Clustering of English Words.
Proceedings of the 31st ACL, pp. 183-190, 1993.
- February 12th (Massimo): Baker and McCallum,
Distributional
Clustering of Words for Text Classification, SIGIR 1998.
- March 4th (Massimo): More Distributional Clustering: Ido Dagan, Lillian Lee, and Fernando Pereira,
Similarity-Based Methods for Word Sense Disambiguation.
Proceedings of the 35th ACL/8th EACL, pp. 56-63, 1997.
- March 11th (Abdulrahman): Clustering adjectives.
Hatzivassiloglou & McKeown,
Clustering adjectives according to meaning, ACL 1993.
- March 25th (Massimo): Clustering senses: Schuetze's work.
Schuetze, H. 1998. Automatic
word sense discrimination.
Computational Linguistics.
http://citeseer.nj.nec.com/schutze98automatic.html
- More wordsense clustering: McCarthy & Carroll CL
2003?
- See also: Adam Berger's dissertation on applying
similarity to IR
- Acquisition of taxonomic knowledge using syntactic
patterns
- April 15th: (Massimo): Hearst,
M.A. (1998). "Automated Discovery of WordNet Relations" in WordNet: an
Electronic Lexical Database, Christiane Fellbaum Ed, MIT Press, Cambridge MA, 1998.
- Some useful older references cited by Hearst:
- Hiyan Alshawi (1987). Processing dictionary definitions
with phrasal pattern hierarchies. Computational Linguistics, 13,
195-202.
- M. Chodorow, R. Byrd, and G. Heidorn (1985). Extracting
semantic hierarchies from a large on-line dictionary. Proc. of the ACL,
299-304.
- J. Markowitz, T. Ahlswede, and M. Evens (1986).
Semantically significant patterns in dictionary definitions. Proc. of the
24th ACL, 112-119.
- May 12th: Sharon A. Caraballo (1999).
Automatic construction of a hypernym-labeled noun hierarchy from text.
In Proceedings of the 37th Annual Meeting of the Association for
Computational Linguistics,
pages 120-126.
- Also: Pantel & Ravichandran, Automatic Labeling of
Semantic Classes, Proc. NAACL 2004
- See also: Sharon A. Caraballo (2001).
Automatic Construction of a Hypernym-Labeled Noun Hierarchy from Text.
Ph.D. dissertation, Computer Science Department, Brown University.
- More interesting papers by Pantel available from his
homepage:
- Patrick Pantel and Dekang Lin. 2002. Discovering
Word Senses from Text. In Proceedings of ACM SIGKDD Conference on
Knowledge Discovery and Data Mining (KDD-02).
pp. 613-619. Edmonton, Canada.
- Dekang Lin and Patrick Pantel. 2002. Concept
Discovery from Text. In Proceedings of Conference on Computational
Linguistics (COLING-02).
pp. 577-583. Taipei, Taiwan.
- Dekang Lin and Patrick Pantel. 2001. DIRT
- Discovery of Inference Rules from Text. In Proceedings of ACM
SIGKDD Conference on Knowledge Discovery and Data Mining (KDD-01).
pp. 323-328. San Francisco, CA.
- May 19th (Abdulrahman): Dominic Widdows - e.g., Unsupervised
methods for developing taxonomies by combining syntactic and statistical
information, NAACL 2003. Other papers by Widdows are available
from his homepage. Of
particular interest: "A Graph Model for Unsupervised Lexical
Acquisition" (COLING 2002, with Beate Dorow - building on Riloff
and Shepherd's 1997 work); and "Using LSA and Noun Coordination Information to
Improve the Precision and Recall of Automatic Hyponymy Extraction"
(with Scott Cederberg; CoNLL 2003; building on the Hearst 1992/1998
papers).
- See also: Steffen
Staab's page and the many papers on ontology acquisition there by the
Karlsruhe group - e.g.,
- P. Cimiano, A. Hotho, S. Staab. Comparing conceptual, partitional and
agglomerative clustering for learning taxonomies from text. In Proc of
ECAI-2004, Valencia, August 2004
- A. Mädche, S. Staab. Measuring
Similarity between Ontologies. In: Proc. of the European Conference on
Knowledge Acquisition and Management - EKAW-2002. Madrid, Spain, October
1-4, 2002. LNCS, Springer, 2002.
- May 26th: No meeting (LREC)
- The acquisition of ontological information and concept hierarchies
(Udo and Hala)
- June 16th: no meeting (Massimo away)
- Extracting part-of relations using Syntactic
Constructions
- June 24th (Massimo):
- M. Berland and E. Charniak (1999). Finding
parts in very large corpora. In Proceedings of the ACL, pages
57-64, College Park, MD.
- Massimo Poesio, Tomonori Ishikawa, Sabine Schulte im Walde and Renata
Vieira (2002). "Acquiring lexical knowledge for anaphora
resolution", Proc. of LREC, Las Palmas, May. (pdf)
- R. Girju, A. Badulescu, and D. Moldovan (2003)
Learning
Semantic Constraints for the Automatic Discovery of Part-Whole Relations.
In the Proceedings of the Human Language Technology Conference,
Edmonton, Canada, May-June
- The acquisition of causal and propositional knowledge
- June 29th (Olivia): R. Girju and D. Moldovan (2002). Mining
Answers for Causation Questions.
In the Proceedings of the American Association for Artificial Intelligence
(AAAI) - Spring Symposium, Stanford University, California, March
- Other interesting readings:
Industry
Conferences in the area
Other useful Web links: