A Cross-Lingual ILP Solution to Zero Anaphora Resolution
Iida, Ryu and Poesio, Massimo

Article Structure

Abstract

We present an ILP-based model of zero anaphora detection and resolution that builds on the joint determination of anaphoricity and coreference model proposed by Denis and Baldridge (2007), but revises it and extends it into a three-way ILP problem also incorporating subject detection.

Introduction

In so-called ‘pro-drop’ languages such as Japanese and many Romance languages, including Italian, anaphoric references need not be phonetically realized in contexts in which English uses non-contrastive pronouns: e.g., the subjects of the Italian and Japanese translations of buy in (1b) and (1c) are not explicitly realized.

Topics

coreference

Appears in 23 sentences as: Coreference (1) coreference (20) coreferent (3) COREF|i (1)
In A Cross-Lingual ILP Solution to Zero Anaphora Resolution
  1. We present an ILP-based model of zero anaphora detection and resolution that builds on the joint determination of anaphoricity and coreference model proposed by Denis and Baldridge (2007), but revises it and extends it into a three-way ILP problem also incorporating subject detection.
    Page 1, “Abstract”
  2. The felicitousness of zero anaphoric reference depends on the referred entity being sufficiently salient, hence this type of data—particularly in Japanese and Italian—played a key role in early work in coreference resolution, e.g., in the development of Centering (Kameyama, 1985; Walker et al., 1994; Di Eugenio, 1998).
    Page 1, “Introduction”
  3. (2010)), and their use in competitions such as SEMEVAL 2010 Task 1 on Multilingual Coreference (Recasens et al., 2010), is leading to a renewed interest in zero anaphora resolution, particularly in light of the mediocre results obtained on zero anaphors by most systems participating in SEMEVAL.
    Page 1, “Introduction”
  4. We integrate the zero anaphora resolver with a coreference resolver and demonstrate that the approach leads to improved results for both Italian and Japanese.
    Page 2, “Introduction”
  5. In Section 5 we discuss experiments testing whether adding our zero anaphora detector and resolver to a full coreference resolver results in an overall increase in performance.
    Page 2, “Introduction”
  6. 2 Using ILP for joint anaphoricity and coreference determination
    Page 2, “Introduction”
  7. Denis and Baldridge (2007) defined the following objective function for the joint anaphoricity and coreference determination problem.
    Page 2, “Introduction”
  8. M stands for the set of mentions in the document, and P for the set of possible coreference links over these mentions.
    Page 2, “Introduction”
  9. x_{i,j} is an indicator variable that is set to 1 if mentions i and j are coreferent, and 0 otherwise.
    Page 2, “Introduction”
  10. c_{i,j} = -log(P(COREF|i,j)) are the (negative log) probabilities produced by an antecedent identification classifier, whereas c_j = -log(P(ANAPH|j)) are the (negative log) probabilities produced by an anaphoricity determination classifier.
    Page 2, “Introduction”
  11. Resolve only anaphors: if a pair of mentions (i, j) is coreferent (x_{i,j} = 1), then mention j must be anaphoric (y_j = 1).
    Page 2, “Introduction”
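To make the objective and the "resolve only anaphors" constraint concrete, here is a minimal brute-force sketch of the joint formulation; the probabilities below are toy values (hypothetical), and a real system would hand the problem to an ILP solver rather than enumerate assignments:

```python
# Brute-force sketch of Denis and Baldridge's (2007) joint objective with the
# "resolve only anaphors" constraint (x[i,j] = 1 implies y[j] = 1).
# All probabilities here are hypothetical toy values for illustration only.
from itertools import product
from math import log

p_coref = {(0, 1): 0.9, (0, 2): 0.2, (1, 2): 0.3}  # P(COREF|i,j), hypothetical
p_anaph = {0: 0.1, 1: 0.8, 2: 0.3}                 # P(ANAPH|j), hypothetical

def cost(p, v):
    # -log P if the binary variable is on, -log (1 - P) if it is off
    return -log(p) if v else -log(1.0 - p)

def solve(p_coref, p_anaph):
    pairs, mentions = list(p_coref), list(p_anaph)
    best, best_x, best_y = float("inf"), None, None
    # Enumerate every assignment of the binary variables x and y
    for xs in product([0, 1], repeat=len(pairs)):
        x = dict(zip(pairs, xs))
        for ys in product([0, 1], repeat=len(mentions)):
            y = dict(zip(mentions, ys))
            # Resolve only anaphors: a resolved mention j must be anaphoric
            if any(x[i, j] and not y[j] for i, j in pairs):
                continue
            total = (sum(cost(p_coref[p], x[p]) for p in pairs)
                     + sum(cost(p_anaph[m], y[m]) for m in mentions))
            if total < best:
                best, best_x, best_y = total, x, y
    return best_x, best_y

x, y = solve(p_coref, p_anaph)
```

With these toy values the minimum-cost solution links mentions 0 and 1 and, as the constraint requires, marks mention 1 as anaphoric.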

See all papers in Proc. ACL 2011 that mention coreference.


ILP

Appears in 12 sentences as: ILP (13)
In A Cross-Lingual ILP Solution to Zero Anaphora Resolution
  1. We present an ILP-based model of zero anaphora detection and resolution that builds on the joint determination of anaphoricity and coreference model proposed by Denis and Baldridge (2007), but revises it and extends it into a three-way ILP problem also incorporating subject detection.
    Page 1, “Abstract”
  2. task, for which Integer Linear Programming (ILP)—introduced to NLP by Roth and Yih (2004) and successfully applied by Denis and Baldridge (2007) to the task of jointly inferring anaphoricity and determining the antecedent—would be appropriate.
    Page 2, “Introduction”
  3. In this work we developed, starting from the ILP system proposed by Denis and Baldridge, an ILP approach to zero anaphora detection and resolution that integrates (revised) versions of Denis and Baldridge’s constraints with additional constraints between the values of three distinct classifiers, one of which is a novel classifier for subject prediction.
    Page 2, “Introduction”
  4. We next present our new ILP formulation in Section 3.
    Page 2, “Introduction”
  5. 2 Using ILP for joint anaphoricity and coreference determination
    Page 2, “Introduction”
  6. Integer Linear Programming (ILP) is a method for constraint-based inference aimed at finding the values for a set of variables that maximize a (linear) objective function while satisfying a number of constraints.
    Page 2, “Introduction”
  7. Roth and Yih (2004) advocated ILP as a general solution for a number of NLP tasks that require combining multiple classifiers and for which the traditional pipeline architecture is not appropriate, such as entity disambiguation and relation extraction.
    Page 2, “Introduction”
  8. As can be seen from Table 4, the ILP version with Do-Not-Resolve-Non-Anaphors performs no better than the baselines for either language, but in both languages replacing that constraint with Best-First results in performance above the baselines; adding Subject Detection yields further improvement for both languages.
    Page 6, “Introduction”
  9. For the separated classifier, we use the ILP+BF model for explicitly realized NPs, and different ILP models for zeros.
    Page 7, “Introduction”
  10. their model into the ILP formulation proposed here looks like a promising further extension.
    Page 8, “Introduction”
  11. As a next step, we also need to take into account ways of incorporating a mention detection model into the ILP formulation.
    Page 8, “Introduction”


coreference resolution

Appears in 6 sentences as: coreference resolution (4) coreference resolver (2)
In A Cross-Lingual ILP Solution to Zero Anaphora Resolution
  1. The felicitousness of zero anaphoric reference depends on the referred entity being sufficiently salient, hence this type of data—particularly in Japanese and Italian—played a key role in early work in coreference resolution, e.g., in the development of Centering (Kameyama, 1985; Walker et al., 1994; Di Eugenio, 1998).
    Page 1, “Introduction”
  2. We integrate the zero anaphora resolver with a coreference resolver and demonstrate that the approach leads to improved results for both Italian and Japanese.
    Page 2, “Introduction”
  3. In Section 5 we discuss experiments testing whether adding our zero anaphora detector and resolver to a full coreference resolver results in an overall increase in performance.
    Page 2, “Introduction”
  4. (In contrast, in the experiments on coreference resolution discussed in the following section, all mentions are considered as both candidate anaphors and candidate antecedents.)
    Page 4, “Introduction”
  5. 5 Experiment 2: coreference resolution for all anaphors
    Page 6, “Introduction”
  6. In this paper, we developed a new ILP-based model of zero anaphora detection and resolution that extends the coreference resolution model proposed by Denis and Baldridge (2007) by introducing modified constraints and a subject detection model.
    Page 8, “Introduction”


dependency relations

Appears in 3 sentences as: dependency relations (3)
In A Cross-Lingual ILP Solution to Zero Anaphora Resolution
  1. We also used the Kyoto University Text Corpus4 that provides dependency relations information for the same articles as the NAIST Text Corpus.
    Page 4, “Introduction”
  2. To create a subject detection model for Italian, we used the TUT corpus9 (Bosco et al., 2010), which contains manually annotated dependency relations and their labels, consisting of 80,878 tokens in CoNLL format.
    Page 5, “Introduction”
  3. We induced a maximum entropy classifier by using as instances all dependency arcs, each of which is treated as a positive instance if its label is subject, and as a negative instance otherwise.
    Page 5, “Introduction”
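The positive/negative instance construction for the subject detection classifier can be sketched as follows; the arc tuples and feature names here are illustrative assumptions, not the TUT corpus's actual CoNLL format:

```python
# Sketch of turning labelled dependency arcs into training instances for a
# subject-detection classifier: an arc is a positive instance iff its
# dependency label is "subject". Arc format and features are hypothetical.

def arcs_to_instances(arcs):
    """Each arc is a (head_lemma, dep_lemma, dep_pos, label) tuple.
    Returns (feature_dict, label) pairs with label 1 for subject arcs."""
    instances = []
    for head, dep, pos, label in arcs:
        features = {"head": head, "dep": dep, "dep_pos": pos}
        instances.append((features, 1 if label == "subject" else 0))
    return instances

# Toy Italian example (hypothetical lemmas and labels)
arcs = [
    ("comprare", "Maria", "NOUN", "subject"),
    ("comprare", "libro", "NOUN", "object"),
]
instances = arcs_to_instances(arcs)
```

The resulting pairs can then be fed to any maximum entropy (logistic regression) trainer.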


part-of-speech

Appears in 3 sentences as: part-of-speech (3)
In A Cross-Lingual ILP Solution to Zero Anaphora Resolution
  1. base phrases in Japanese) whose head part-of-speech was automatically tagged by the Japanese morphological analyser Chasen6 as either ‘noun’ or ‘unknown word’ according to the NAIST-jdic dictionary.7
    Page 4, “Introduction”
  2. POS / DEP_LABEL / LEMMA: part-of-speech / dependency label / lemma of the predicate which has ZERO.
    Page 5, “Introduction”
  3. D_POS / D_DEP_LABEL / D_LEMMA: part-of-speech / dependency label / lemma of the dependents of the predicate which has ZERO.
    Page 5, “Introduction”
