Margin-based Decomposed Amortized Inference
Kundu, Gourab and Srikumar, Vivek and Roth, Dan

Article Structure

Abstract

Given that structured output prediction is typically performed over entire datasets, one natural question is whether it is possible to reuse computation from earlier inference instances to speed up inference for future instances.

Introduction

A wide variety of NLP problems can be naturally cast as structured prediction problems.

Problem Definition and Notation

Structured output prediction encompasses a wide variety of NLP problems like part-of-speech tagging, parsing and machine translation.

Margin-based Amortization

In this section, we will introduce a new method for amortizing inference costs over time.

Decomposed Amortized Inference

One limitation in previously considered approaches for amortized inference stems from the…

Experiments and Results

Our experiments show two results: 1. …

Discussion

Lagrangian Relaxation in the literature: In applications of the Lagrangian relaxation technique (such as (Rush and Collins, 2011; Chang and Collins, 2011; Reichart and Barzilay, 2012) and others), the relaxed problems are solved using specialized algorithms.

Conclusion

Amortized inference takes advantage of the regularities in structured output to reuse previous computation and improve running time over the lifetime of a structured output predictor.

Topics

ILP

Appears in 18 sentences as: ILP (15) ILPs (3)
In Margin-based Decomposed Amortized Inference
  1. In these problems, the inference problem has been framed as an integer linear program (ILP).
    Page 2, “Introduction”
  2. The language of 0-1 integer linear programs (ILP) provides a convenient analytical tool for representing structured prediction problems.
    Page 2, “Problem Definition and Notation”
  3. One approach to deal with the computational complexity of inference is to use an off-the-shelf ILP solver for solving the inference problem.
    Page 2, “Problem Definition and Notation”
  4. Let the set P = {p1, p2, …} denote previously solved inference problems, along with their respective solutions {y*p1, y*p2, …}. An equivalence class of integer linear programs, denoted by [P], consists of ILPs which have the same number of inference variables and the same feasible set.
    Page 3, “Problem Definition and Notation”
  5. (Srikumar et al., 2012) introduced a set of amortized inference schemes, each of which provides a condition for a new ILP to have the same solution as a previously seen problem.
    Page 3, “Problem Definition and Notation”
  6. Suppose q belongs to the same equivalence class of ILPs as p. Then the solution to q will be the same as that of p if the following condition holds for all inference variables:
    Page 3, “Problem Definition and Notation”
  7. If no such problem exists, then we make a call to an ILP solver.
    Page 4, “Margin-based Amortization”
… problem cannot be solved using the procedure, then we can either solve the subproblem using a different approach (effectively giving us the standard Lagrangian relaxation algorithm for inference), or we can treat the full instance as a cache miss and make a call to an ILP solver.
    Page 6, “Decomposed Amortized Inference”
  9. We used a database engine to cache ILPs and their solutions along with identifiers for the equivalence class and the value of δ.
    Page 7, “Experiments and Results”
  10. For the margin-based algorithm and Theorem 1 from (Srikumar et al., 2012), for a new inference problem p ∈ [P], we retrieve all inference problems from the database that belong to the same equivalence class [P] as the test problem p and find the cached assignment y that has the highest score according to the coefficients of p. We only consider cached ILPs whose solution is y for checking the conditions of the theorem.
    Page 7, “Experiments and Results”
  11. We compare our approach to a state-of-the-art ILP solver2 and also to Theorem 1 from (Srikumar et al., 2012).
    Page 7, “Experiments and Results”
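The reuse condition referenced in the excerpts above (Theorem 1 of Srikumar et al., 2012) can be stated as: if a new ILP q lies in the same equivalence class as a cached ILP p with solution y_p, and (2·y_p[i] − 1)·(c_q[i] − c_p[i]) ≥ 0 for every inference variable i, then q has the same solution as p. A minimal sketch of the check, with illustrative function and variable names:

```python
def same_solution(c_p, c_q, y_p):
    """Check the reuse condition of Srikumar et al. (2012), Theorem 1:
    a new ILP q (objective coefficients c_q) can reuse the cached
    solution y_p of an ILP p (coefficients c_p) from the same
    equivalence class if, for every 0-1 variable i,
        (2*y_p[i] - 1) * (c_q[i] - c_p[i]) >= 0.
    Intuitively, each coefficient is only allowed to move in the
    direction that favours the already-optimal assignment."""
    return all((2 * y - 1) * (cq - cp) >= 0
               for y, cq, cp in zip(y_p, c_q, c_p))

# Cached problem p: maximize 3*x0 + 1*x1, solved with y_p = (1, 0).
c_p, y_p = [3.0, 1.0], [1, 0]
# q1: x0's coefficient grew and x1's shrank -- condition holds, reuse y_p.
print(same_solution(c_p, [4.0, 0.5], y_p))  # True: no solver call needed
# q2: x0's coefficient dropped -- condition fails, cache miss.
print(same_solution(c_p, [2.0, 5.0], y_p))  # False: call the ILP solver
```

This is the per-variable test that makes a cache hit cheap: it is linear in the number of variables, versus solving the ILP from scratch.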

See all papers in Proc. ACL 2013 that mention ILP.

See all papers in Proc. ACL that mention ILP.

Back to top.

linear program

Appears in 6 sentences as: linear program (4) linear programs (3)
In Margin-based Decomposed Amortized Inference
  1. In these problems, the inference problem has been framed as an integer linear program (ILP).
    Page 2, “Introduction”
  2. The language of 0-1 integer linear programs (ILP) provides a convenient analytical tool for representing structured prediction problems.
    Page 2, “Problem Definition and Notation”
  3. Let the set P = {p1, p2, …} denote previously solved inference problems, along with their respective solutions {y*p1, y*p2, …}. An equivalence class of integer linear programs, denoted by [P], consists of ILPs which have the same number of inference variables and the same feasible set.
    Page 3, “Problem Definition and Notation”
  4. Let p denote an inference problem posed as an integer linear program belonging to an equivalence class [P] with optimal solution yp.
    Page 3, “Margin-based Amortization”
  5. Even though the theorem provides a condition for two integer linear programs to have the same solution, checking the validity of the condition requires the computation of Δ, which in itself is another integer linear program.
    Page 4, “Margin-based Amortization”
  6. The goal is to solve an integer linear program q, which is defined as
    Page 4, “Decomposed Amortized Inference”
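The excerpts above describe inference as maximizing a linear objective over 0-1 variables subject to linear inequality constraints. A brute-force sketch of that formulation (illustrative only; it enumerates all 2^n assignments and stands in for the off-the-shelf ILP solver the paper uses):

```python
from itertools import product

def solve_ilp(c, A, b):
    """Exhaustively solve: max c.x  s.t.  A.x <= b,  x in {0,1}^n.
    Exponential in n, so usable only for toy instances; a real system
    would hand this to an ILP solver."""
    best_score, best_x = float("-inf"), None
    for x in product((0, 1), repeat=len(c)):
        # Keep only assignments inside the feasible set A.x <= b.
        if all(sum(a_i * x_i for a_i, x_i in zip(row, x)) <= rhs
               for row, rhs in zip(A, b)):
            score = sum(c_i * x_i for c_i, x_i in zip(c, x))
            if score > best_score:
                best_score, best_x = score, x
    return best_x, best_score

# Toy instance: two variables, constraint x0 + x1 <= 1 (pick at most one).
x, score = solve_ilp(c=[2.0, 3.0], A=[[1, 1]], b=[1])
print(x, score)  # (0, 1) 3.0
```

The feasible set here is exactly what defines an equivalence class in the excerpts: two ILPs with the same (A, b) and the same number of variables belong to the same class, differing only in c.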


role labeling

Appears in 6 sentences as: Role Labeling (1) role labeling (5)
In Margin-based Decomposed Amortized Inference
  1. In our experiments, using the NLP tasks of semantic role labeling and entity-relation extraction, we demonstrate that with the margin-based algorithm, we need to call the inference engine only for a third of the test examples.
    Page 1, “Abstract”
  2. We evaluate the two schemes and their combination on two NLP tasks where the output is encoded as a structure: PropBank semantic role labeling (Punyakanok et al., 2008) and the problem of recognizing entities and relations in text (Roth and Yih, 2007; Kate and Mooney, 2010).
    Page 2, “Introduction”
  3. We report the performance of inference on two NLP tasks: semantic role labeling and the task of extracting entities and relations from text.
    Page 6, “Experiments and Results”
  4. Semantic Role Labeling (SRL) Our first task is that of identifying arguments of verbs in a sentence and annotating them with semantic roles (Gildea and Jurafsky, 2002; Palmer et al., 2010).
    Page 6, “Experiments and Results”
  5. For the semantic role labeling task, we need to call the solver only for one in six examples while for the entity-relations task, only one in four examples require a solver call.
    Page 7, “Experiments and Results”
  6. We show via experiments that these methods individually give a reduction in the number of calls made to an inference engine for semantic role labeling and entity-relation extraction.
    Page 8, “Conclusion”


semantic role

Appears in 6 sentences as: Semantic Role (1) semantic role (5) semantic roles (1)
In Margin-based Decomposed Amortized Inference
  1. In our experiments, using the NLP tasks of semantic role labeling and entity-relation extraction, we demonstrate that with the margin-based algorithm, we need to call the inference engine only for a third of the test examples.
    Page 1, “Abstract”
  2. We evaluate the two schemes and their combination on two NLP tasks where the output is encoded as a structure: PropBank semantic role labeling (Punyakanok et al., 2008) and the problem of recognizing entities and relations in text (Roth and Yih, 2007; Kate and Mooney, 2010).
    Page 2, “Introduction”
  3. We report the performance of inference on two NLP tasks: semantic role labeling and the task of extracting entities and relations from text.
    Page 6, “Experiments and Results”
  4. Semantic Role Labeling (SRL) Our first task is that of identifying arguments of verbs in a sentence and annotating them with semantic roles (Gildea and Jurafsky, 2002; Palmer et al., 2010).
    Page 6, “Experiments and Results”
  5. For the semantic role labeling task, we need to call the solver only for one in six examples while for the entity-relations task, only one in four examples require a solver call.
    Page 7, “Experiments and Results”
  6. We show via experiments that these methods individually give a reduction in the number of calls made to an inference engine for semantic role labeling and entity-relation extraction.
    Page 8, “Conclusion”


semantic role labeling

Appears in 6 sentences as: Semantic Role Labeling (1) semantic role labeling (5)
In Margin-based Decomposed Amortized Inference
  1. In our experiments, using the NLP tasks of semantic role labeling and entity-relation extraction, we demonstrate that with the margin-based algorithm, we need to call the inference engine only for a third of the test examples.
    Page 1, “Abstract”
  2. We evaluate the two schemes and their combination on two NLP tasks where the output is encoded as a structure: PropBank semantic role labeling (Punyakanok et al., 2008) and the problem of recognizing entities and relations in text (Roth and Yih, 2007; Kate and Mooney, 2010).
    Page 2, “Introduction”
  3. We report the performance of inference on two NLP tasks: semantic role labeling and the task of extracting entities and relations from text.
    Page 6, “Experiments and Results”
  4. Semantic Role Labeling (SRL) Our first task is that of identifying arguments of verbs in a sentence and annotating them with semantic roles (Gildea and Jurafsky, 2002; Palmer et al., 2010).
    Page 6, “Experiments and Results”
  5. For the semantic role labeling task, we need to call the solver only for one in six examples while for the entity-relations task, only one in four examples require a solver call.
    Page 7, “Experiments and Results”
  6. We show via experiments that these methods individually give a reduction in the number of calls made to an inference engine for semantic role labeling and entity-relation extraction.
    Page 8, “Conclusion”


highest scoring

Appears in 3 sentences as: highest score (1) highest scoring (2)
In Margin-based Decomposed Amortized Inference
  1. For a new inference problem, if this margin is larger than the sum of the decrease in the score of the previous prediction and any increase in the score of the second best one, then the previous solution will be the highest scoring one for the new problem.
    Page 1, “Introduction”
  2. The goal of inference is to find the highest scoring global assignment of the variables from a feasible set of assignments, which is defined by linear inequalities.
    Page 2, “Problem Definition and Notation”
  3. For the margin-based algorithm and Theorem 1 from (Srikumar et al., 2012), for a new inference problem p ∈ [P], we retrieve all inference problems from the database that belong to the same equivalence class [P] as the test problem p and find the cached assignment y that has the highest score according to the coefficients of p. We only consider cached ILPs whose solution is y for checking the conditions of the theorem.
    Page 7, “Experiments and Results”
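The margin-based test described in excerpt 1 can be sketched directly from the prose: reuse the cached solution if the old margin exceeds the drop in the cached solution's score plus the largest rise in any competitor's score. A toy sketch over an enumerable feasible set (real instances would bound these quantities rather than enumerate; names are illustrative):

```python
def reuse_by_margin(c_p, c_q, y_p, feasible):
    """Margin test from the prose above: y_p (optimal for the cached
    objective c_p) stays optimal for the new objective c_q if
        margin > drop + rise,
    where margin is y_p's lead over the second-best assignment under
    c_p, drop is the decrease in y_p's score, and rise is the largest
    increase in any other assignment's score."""
    score = lambda c, y: sum(ci * yi for ci, yi in zip(c, y))
    others = [y for y in feasible if y != y_p]
    margin = score(c_p, y_p) - max(score(c_p, y) for y in others)
    drop = score(c_p, y_p) - score(c_q, y_p)
    rise = max(score(c_q, y) - score(c_p, y) for y in others)
    return margin > drop + rise

feasible = [(1, 0), (0, 1), (0, 0)]
# Cached: c_p = [3, 1] with solution (1, 0), margin 2 over second best.
print(reuse_by_margin([3, 1], [2.5, 1.2], (1, 0), feasible))  # True
print(reuse_by_margin([3, 1], [0.5, 1.2], (1, 0), feasible))  # False
```

The test is sound: for any competitor y, the new gap score(c_q, y_p) − score(c_q, y) is at least margin − drop − rise, so a positive right-hand side guarantees y_p is still the highest-scoring assignment.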


part-of-speech

Appears in 3 sentences as: part-of-speech (3)
In Margin-based Decomposed Amortized Inference
  1. Structured output prediction encompasses a wide variety of NLP problems like part-of-speech tagging, parsing and machine translation.
    Page 2, “Problem Definition and Notation”
  2. Figure 1 illustrates this observation in the context of part-of-speech tagging.
    Page 2, “Problem Definition and Notation”
  3. Figure 1: Comparison of the number of instances and the number of unique observed part-of-speech structures in the Gigaword corpus.
    Page 2, “Problem Definition and Notation”
