Predicate Argument Structures for Information Extraction from Dependency Representations: Null Elements are Missing

Delmonte, Rodolfo

doi:10.1007/978-3-642-40621-8_2

Part of the book series: Studies in Computational Intelligence (SCI, volume 515)

Abstract

State-of-the-art parsers are currently trained on versions of the Penn Treebank converted into dependency representations, which, however, do not include null elements. This is done to facilitate structural learning and to prevent the probabilistic engine from postulating the existence of deprecated null elements everywhere (see [15]). As a result, however, the semantics of the representation used and produced at runtime is inconsistent, which dramatically reduces its usefulness in real-life applications such as Information Extraction, Q/A and other semantically driven fields, because it hampers the mapping of a complete logical form. What systems have come up with instead are “Quasi”-logical forms, or partial logical forms, mapped directly from the surface representation in dependency structure. We show the most common problems deriving from the conversion and then describe an algorithm that we have implemented for our converted Italian Treebank and that can be used on any CoNLL-style treebank or representation to produce an “almost complete”, semantically consistent dependency treebank.
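
The gap described in the abstract can be illustrated with a minimal sketch. It is not the algorithm of the chapter: the four-column row layout, the sentence, the relation labels (nsubj, xcomp) and the function names are all assumptions made for this example. Reading predicate-argument structures directly off a dependency tree leaves the embedded verb without a subject, and a deliberately naive subject-control heuristic then adds a null subject co-indexed with the subject of the governing verb.

```python
# Hypothetical 4-column CoNLL-style rows: ID, FORM, HEAD, DEPREL.
# Sentence and label names are illustrative assumptions, not the chapter's data.
CONLL = """\
1\tJohn\t2\tnsubj
2\twants\t0\troot
3\tto\t4\taux
4\tleave\t2\txcomp
"""


def parse(text):
    """Return a list of token dicts from tab-separated CoNLL-style rows."""
    tokens = []
    for line in text.strip().splitlines():
        tok_id, form, head, deprel = line.split("\t")
        tokens.append({"id": int(tok_id), "form": form,
                       "head": int(head), "deprel": deprel})
    return tokens


def predicate_arguments(tokens):
    """Map each predicate form to a {relation: argument form} dict."""
    forms = {t["id"]: t["form"] for t in tokens}
    preds = {t["id"] for t in tokens if t["deprel"] in ("root", "xcomp")}
    args = {forms[p]: {} for p in preds}
    for t in tokens:
        if t["head"] in preds and t["deprel"] in ("nsubj", "dobj", "xcomp"):
            args[forms[t["head"]]][t["deprel"]] = t["form"]
    return args


def add_null_subjects(tokens):
    """Toy heuristic (an assumption): give each subjectless xcomp a null
    subject that copies the subject of its governing verb."""
    out = [dict(t) for t in tokens]
    next_id = max(t["id"] for t in tokens) + 1
    for t in tokens:
        if t["deprel"] != "xcomp":
            continue
        has_subj = any(d["head"] == t["id"] and d["deprel"] == "nsubj"
                       for d in tokens)
        controller = next((d for d in tokens
                           if d["head"] == t["head"] and d["deprel"] == "nsubj"),
                          None)
        if not has_subj and controller is not None:
            # The added token records its antecedent so it stays
            # distinguishable from overt material.
            out.append({"id": next_id, "form": controller["form"],
                        "head": t["id"], "deprel": "nsubj",
                        "antecedent": controller["id"]})
            next_id += 1
    return out


tokens = parse(CONLL)
print(predicate_arguments(tokens))                     # "leave" has no subject
print(predicate_arguments(add_null_subjects(tokens)))  # "leave" gets "John"
```

The heuristic only covers subject control; object-control verbs, relative clauses, and long-distance dependencies would each need their own treatment, which is the kind of coverage a complete logical form requires.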

References

Bies, A., Ferguson, M., Katz, K., MacIntyre, R., Tredinnick, V., Kim, G., Ann Marcinkiewicz, M., Schasberger, B.: Bracketing guidelines for Treebank II style Penn treebank.uni-tuebingen.de/~dm/07/autumn/795.10/ptb-annotation-guide/root.html (1995)
Cahill, A., McCarthy, M., van Genabith, J., Way, A.: Automatic annotation of the Penn-Treebank with LFG f-structure information. In: LREC: Workshop on Linguistic Knowledge Acquisition and Representation: Bootstrapping Annotated Language Data. Las Palmas (2002)
Cahill, A., McCarthy, M., van Genabith, J., Way, A.: Quasi-logical forms for the Penn Treebank. In: Bunt H., van der Sluis I., Morante R. (eds.) Proceedings of the Fifth International Workshop on Computational Semantics, IWCS-05, pp. 55–71. Tilburg (2003)
Cai, S., Chiang, D., Goldberg, Y.: Language-independent parsing with empty elements. In: Proceedings of the 49th Annual Meeting of the ACL, pp. 212–216 (2011)
Campbell, R.: Using linguistic principles to recover empty categories. In: Proceedings of ACL (2004)
Chung, T., Gildea, D.: Effects of empty categories on machine translation. In: Proceedings of EMNLP (2010)
Choi, J.D., Palmer, M.: Robust constituent-to-dependency conversion for English. In: Proceedings of the 9th International Workshop on Treebanks and Linguistic Theories (TLT’9), pp. 55–66. Tartu (2010)
Clark, S., Curran, J.R.: Comparing the accuracy of CCG and Penn Treebank parsers. In: Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, pp. 53–56. Suntec, Singapore (2009)
De Marneffe, M.C., MacCartney, B., Manning, C.D.: Generating typed dependency parses from phrase structure parses. In: Proceedings of LREC, pp. 449–454 (2006/5)
Delmonte, R., Bristot, A., Tonelli, S.: VIT - Venice Italian Treebank: Syntactic and Quantitative Features. In: De Smedt, K., Hajic, J., Kübler, S. (eds.) Proceedings of the Sixth International Workshop on TLT, vol. 1, pp. 43–54. NEALT Proceedings Series (2007)
Delmonte, R., Bianchi, D.: Semantic web, RDFs and NLP for QA. In: Calzolari N., Magnini B. (eds.) Proceedings of the Workshop on “Topics and Perspectives of NLP in Italy”, Università di Pisa, AI*IA, pp. 67–75 (2003)
Dienes, P., Dubey, A.: Antecedent recovery: experiments with a trace tagger. In: Proceedings of EMNLP (2003a)
Dienes, P., Dubey, A.: Deep processing by combining shallow methods. In: Proceedings of ACL (2003b)
Gabbard, R., Marcus, M., Kulick, S.: Fully parsing the Penn Treebank. In: Proceedings of the HLT Conference of the North American Chapter of the ACL, pp. 184–191 (2006)
Gaizauskas, R.: Investigations into the Grammar Underlying the Penn Treebank II, Technical Report CS-95-25. University of Sheffield, Department of Computer Science (1995)
Guo, Y., van Genabith, J., Wang, H.: Treebank-based acquisition of LFG resources for Chinese. In: Lexical Functional Grammar, pp. 28–30. California (2007)
Johnson, M.: A simple pattern-matching algorithm for recovering empty nodes and their antecedents. In: Proceedings of the 39th Annual Meeting of the ACL, pp. 136–143. Toulouse, France (2001)
Johansson, R., Nugues, P.: Extended constituent-to-dependency conversion for English. In: Proceedings of NODALIDA 2007, Tartu (2007)
Katz, B.: Annotating the World Wide Web using natural language. In: RIAO ’97 (1997)
Liakata, M., Pulman, S.: From Trees to Predicate-Argument Structures. In: Proceedings of the 19th International Conference on Computational Linguistics (COLING 2002), pp. 563–569. Taipei (2002)
Litkowski, K.C.: Syntactic clues and Lexical resources in question-answering. In: Voorhees E.M., Harman D.K. (eds.) The Ninth Text Retrieval Conference (TREC-9). NIST Special Publication 500–249, Gaithersburg, pp. 157–166 (2001)
Marcus, M., Kim, G., Ann Marcinkiewicz, M., Macintyre, R., Bies, A., Ferguson, M., Katz, K., Schasberger, B.: The Penn Treebank: annotating predicate argument structure. In: ARPA Human Language Technology Workshop, pp. 114–119 (1994)
Sagae, K., Tsujii, J.: Shift-reduce dependency DAG parsing. In: Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), Manchester (2008)
Schmid, H.: Trace prediction and recovery with unlexicalized PCFGs and slash features. In: Proceedings of COLING-ACL (2006)
Tonelli, S., Delmonte, R., Bristot, A.: Enriching the Venice Italian Treebank with dependency and grammatical relations, LREC 2008 (2008)
Xue, N., Xia, F., Chiou, F.-D., Palmer, M.: The Penn Chinese TreeBank: phrase structure annotation of a large corpus. Nat. Lang. Eng. 11(2), 207–238 (2005)
Yang, Y., Xue, N.: Chasing the ghost: recovering empty categories in the Chinese Treebank. In: Proceedings of COLING (2010)
http://nlp.stanford.edu:8080/parser/
http://www.connexor.com/nlplib/?q=demo/syntax

Acknowledgments

This work has been partially funded by the PARLI Project (Portale per l’Accesso alle Risorse Linguistiche per l’Italiano—MIUR—PRIN 2008).

Author information

Authors and Affiliations

Department of Linguistic Studies and Comparative Cultures and Department of Computer Science, Ca’ Foscari University, Venice, Italy
Rodolfo Delmonte

Corresponding author

Correspondence to Rodolfo Delmonte.

Editor information

Editors and Affiliations

Center for Advanced Studies, Research and Development in Sardinia, CRS4, Pula, Italy
Cristian Lai
Department of Electrical and Electronic Engineering, University of Cagliari, Cagliari, Italy
Alessandro Giuliani
Department of Informatics, University of Bari Aldo Moro, Bari, Italy
Giovanni Semeraro

About this chapter

Cite this chapter

Delmonte, R. (2014). Predicate Argument Structures for Information Extraction from Dependency Representations: Null Elements are Missing. In: Lai, C., Giuliani, A., Semeraro, G. (eds) Distributed Systems and Applications of Information Filtering and Retrieval. Studies in Computational Intelligence, vol 515. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40621-8_2

DOI: https://doi.org/10.1007/978-3-642-40621-8_2
Published: 08 November 2013
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40620-1
Online ISBN: 978-3-642-40621-8
eBook Packages: Engineering, Engineering (R0)