Bootstrapping Semantic Analyzers from Non-Contradictory Texts
Titov, Ivan and Kozhevnikov, Mikhail

Article Structure

Abstract

We argue that groups of unannotated texts with overlapping and non-contradictory semantics represent a valuable source of information for learning semantic representations.

Introduction

In recent years, there has been increasing interest in statistical approaches to semantic parsing.

Inference with Non-Contradictory Documents

In this section we will describe our inference method on a higher conceptual level, not specifying the underlying meaning representation and the probabilistic model.

A Model of Semantics

In this section we redescribe the semantics-text correspondence model (Liang et al., 2009) with an extension needed to model examples with latent states, and also explain how the inference algorithm defined in section 2 can be applied to this model.

Empirical Evaluation

In this section, we consider the semi-supervised setup, and present an evaluation of our approach on the problem of aligning weather forecast reports to the formal representation of weather.

Related Work

Probably the most relevant prior work is an approach to bootstrapping lexical choice of a generation system using a corpus of alternative pas-…

Summary and Future Work

In this work we studied the use of weak supervision in the form of non-contradictory relations between documents in learning semantic representations.

Topics

semantic representations

Appears in 12 sentences as: semantic representation (4) semantic representations (9) semantics represent (1)
In Bootstrapping Semantic Analyzers from Non-Contradictory Texts
  1. We argue that groups of unannotated texts with overlapping and non-contradictory semantics represent a valuable source of information for learning semantic representations.
    Page 1, “Abstract”
  2. A simple and efficient inference method recursively induces joint semantic representations for each group and discovers correspondence between lexical entries and latent semantic concepts.
    Page 1, “Abstract”
  3. Alternatively, if such groupings are not available, it may still be easier to give each semantic representation (or a state) to multiple annotators and ask each of them to provide a textual description, instead of annotating texts with semantic expressions.
    Page 2, “Introduction”
  4. Unsupervised learning with shared latent semantic representations presents its own challenges, as exact inference requires marginalization over possible assignments of the latent semantic state, consequently introducing nonlocal statistical dependencies between the decisions about the semantic structure of each text.
    Page 2, “Introduction”
  5. Even though the dependencies are only conveyed via {m_j : j ≠ k}, the space of possible meanings m is very large even for relatively simple semantic representations, and, therefore, we need to resort to efficient approximations.
    Page 4, “Inference with Non-Contradictory Documents”
  6. However, a major weakness of this algorithm is that decisions about components of the composite semantic representation (e.g., argument values) are made only on the basis of a single text, which first mentions the corresponding aspects, without consulting any future texts k' > k, and these decisions cannot be revised later.
    Page 4, “Inference with Non-Contradictory Documents”
  7. Though the most likely alignment â_j for a fixed semantic representation m̂_j can be found efficiently using a Viterbi algorithm, computing the most probable pair (â_j, m̂_j) is still intractable.
    Page 7, “A Model of Semantics”
  8. We use a modification of the beam search algorithm, where we keep a set of candidate meanings (partial semantic representations) and compute an alignment for each of them using a form of the Viterbi algorithm.
    Page 7, “A Model of Semantics” (see the sketch after this list)
  9. Sentence and text alignment has also been considered in the related context of paraphrase extraction (see, e.g., (Dolan et al., 2004; Barzilay and Lee, 2003)) but this prior work did not focus on inducing or learning semantic representations.
    Page 9, “Related Work”
  10. In this work we studied the use of weak supervision in the form of non-contradictory relations between documents in learning semantic representations.
    Page 9, “Summary and Future Work”
  11. However, exact inference for groups of documents with overlapping semantic representation is generally prohibitively expensive, as the shared latent semantics introduces nonlocal dependencies between semantic representations of individual documents.
    Page 9, “Summary and Future Work”
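
Sentence 8 above describes the decoding procedure only in words. The Python sketch below illustrates the idea of keeping a beam of partial meanings and scoring each of them with an alignment pass; it is a minimal illustration rather than the authors' implementation, and the record-style meaning (a dict of field values), the fields/values inventories and the score_alignment callable (standing in for the Viterbi alignment pass) are all assumptions made here.

def beam_search_meanings(text, fields, values, score_alignment, beam_size=5):
    # Beam search over partial meaning representations (cf. sentence 8 above).
    # A meaning is modelled as a dict mapping fields to values;
    # score_alignment(text, meaning) stands in for the Viterbi pass returning
    # the log-probability of the best alignment of `text` to `meaning`.
    beam = [({}, score_alignment(text, {}))]            # start from the empty meaning
    for field in fields:                                 # extend meanings one field at a time
        candidates = []
        for meaning, _ in beam:
            for value in values[field]:
                extended = {**meaning, field: value}
                candidates.append((extended, score_alignment(text, extended)))
        # keep only the beam_size highest-scoring partial meanings
        beam = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beam

With a toy scorer such as lambda text, m: sum(str(v) in text for v in m.values()), the beam simply prefers field values mentioned verbatim in the text; the real model instead scores a full segmentation and utterance-field alignment.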


semi-supervised

Appears in 11 sentences as: semi-supervised (11)
In Bootstrapping Semantic Analyzers from Non-Contradictory Texts
  1. Such annotated resources are scarce and expensive to create, motivating the need for unsupervised or semi-supervised techniques (Poon and Domingos, 2009).
    Page 1, “Introduction”
  2. This compares favorably with 69.1% shown by a semi-supervised learning approach, though, as expected, it does not reach the score of the model which, in training, observed semantic states for all the 750 documents (77.7% F1).
    Page 3, “Introduction”
  3. However, in a semi-supervised or unsupervised case, variational techniques, such as the EM algorithm (Dempster et al., 1977), are often used to estimate the model.
    Page 3, “Inference with Non-Contradictory Documents” (see the training-schedule sketch after this list)
  4. In this section, we consider the semi-supervised setup, and present an evaluation of our approach on the problem of aligning weather forecast reports to the formal representation of weather.
    Page 7, “Empirical Evaluation”
  5. Only then, in the semi-supervised learning scenarios, we added unlabeled data and ran 5 additional iterations of EM.
    Page 7, “Empirical Evaluation”
  6. We compare our approach (Semi-superv, non-contr) with two baselines: the basic supervised training on 100 labeled forecasts (Supervised BL) and the semi-supervised training which disregards the non-contradiction relations (Semi-superv BL).
    Page 8, “Empirical Evaluation”
  7. The learning regime, the inference procedure and the texts for the semi-supervised baseline were identical to the ones used for our approach; the only difference is that all the documents were modeled as independent.
    Page 8, “Empirical Evaluation”
  8. Additionally, we report the results of the model trained with all the 750 texts labeled (Supervised UB); its scores can be regarded as an upper bound on the results of the semi-supervised models.
    Page 8, “Empirical Evaluation”
  9. Our training strategy results in a substantially more accurate model, outperforming both the supervised and semi-supervised baselines.
    Page 8, “Empirical Evaluation”
  10. The estimation of the model with our approach takes around one hour on a standard desktop PC, which is comparable to the 40 minutes required to train the semi-supervised baseline.
    Page 8, “Empirical Evaluation”
  11. Our approach resulted in an improvement over the scores of both the supervised baseline and of the traditional semi-supervised learning.
    Page 9, “Summary and Future Work”
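
Sentences 3 and 5 above outline the training schedule. The sketch below shows one plausible reading of it as a hard-EM loop: supervised initialisation on the labeled forecasts, followed by a few EM iterations that add the unlabeled groups. The helpers fit (M-step) and infer_group (E-step over one group) are hypothetical stand-ins, and the paper's actual E-step may use soft expectations rather than the hard assignments used here.

def train_semi_supervised(labeled, unlabeled_groups, fit, infer_group, n_em_iters=5):
    # labeled          : list of (text, meaning) pairs with observed semantics
    # unlabeled_groups : lists of texts assumed to share one latent semantic state
    # fit(pairs)       : stand-in for the M-step (parameter estimation from pairs)
    # infer_group(model, texts) : stand-in for the E-step over one group
    model = fit(labeled)                         # supervised initialisation
    for _ in range(n_em_iters):                  # a few additional EM iterations
        pseudo = []
        for group in unlabeled_groups:
            # E-step: jointly infer non-contradictory meanings for the group
            meanings = infer_group(model, group)
            pseudo.extend(zip(group, meanings))
        # M-step: re-estimate parameters from labeled plus inferred pairs
        model = fit(labeled + pseudo)
    return model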


meaning representations

Appears in 10 sentences as: meaning representation (5) meaning representations (6)
In Bootstrapping Semantic Analyzers from Non-Contradictory Texts
  1. The supervision was either given in the form of meaning representations aligned with sentences (Zettlemoyer and Collins, 2005; Ge and Mooney, 2005; Mooney, 2007) or in a somewhat more relaxed form, such as lists of candidate meanings for each sentence (Kate and Mooney, 2007; Chen and Mooney, 2008) or formal representations of the described world state for each text (Liang et al., 2009).
    Page 1, “Introduction”
  2. However, it is important to note that the phrase “wind from west” may still appear in the texts, but in reference to other time periods, underlining the need for modeling alignment between grouped texts and their latent meaning representation.
    Page 2, “Introduction”
  3. In this section we will describe our inference method on a higher conceptual level, not specifying the underlying meaning representation and the probabilistic model.
    Page 3, “Inference with Non-Contradictory Documents”
  4. …regarded as defining the probability distribution of meaning m and its alignment a with the given text w, P(m, a, w) = P(a, w | m) P(m). The semantics m can be represented either as a logical formula (see, e.g., (Poon and Domingos, 2009)) or as a set of field values if database records are used as a meaning representation (Liang et al., 2009).
    Page 3, “Inference with Non-Contradictory Documents”
  5. …meanings (m_1, …, m_K) such that ∧_i m_i is not satisfiable, and models dependencies between components in the composite meaning representation (e.g., argument values of predicates).
    Page 4, “Inference with Non-Contradictory Documents”
  6. …and corresponding meaning representations m* = (m*_1, …, m*_k), where m*_k is the predicted meaning representation of text w_{n_k}.
    Page 4, “Inference with Non-Contradictory Documents”
  7. Then it iteratively predicts meaning representations m̂_j conditioned on the list of semantics m* = (m*_1, …, m*_{j-1}) fixed on the previous stages and does it for all the remaining texts w_j (lines 3-5).
    Page 4, “Inference with Non-Contradictory Documents” (see the sketch after this list)
  8. An important aspect of this algorithm is that unlike usual greedy inference, the remaining (‘future’) texts do affect the choice of meaning representations made on the earlier stages.
    Page 4, “Inference with Non-Contradictory Documents”
  9. …contradiction is trivial: two meaning representations…
    Page 5, “A Model of Semantics”
  10. As soon as the meaning representations m* are inferred, we find ourselves in the setup studied in (Liang et al., 2009): the state s is no longer latent and we can run efficient inference on the E-step.
    Page 7, “A Model of Semantics”
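
Sentences 5-9 above describe the group-level inference in prose. The sketch below illustrates the two ingredients that can be reconstructed from them: a contradiction test for record-style meanings (dicts of field values, one of the representations mentioned in sentence 4) and the sequential fixing of one meaning per text, conditioned on the meanings committed so far. It deliberately omits the paper's key refinement that the remaining 'future' texts can influence earlier decisions, and predict_meaning is a hypothetical stand-in for the model's constrained decoder.

def contradicts(m1, m2):
    # Record-style meanings (field -> value dicts) contradict each other if they
    # assign different values to the same field.  This concrete test is an
    # illustration; the paper keeps the meaning representation abstract.
    return any(field in m2 and m2[field] != value for field, value in m1.items())

def infer_group(texts, predict_meaning):
    # Fix a meaning for each text in turn, conditioning every new prediction on
    # the meanings already committed, so that the composite meaning of the group
    # stays non-contradictory.  predict_meaning(text, fixed) is a hypothetical
    # stand-in for decoding the most probable meaning consistent with `fixed`.
    fixed = []
    for text in texts:
        meaning = predict_meaning(text, fixed)
        fixed.append(meaning)
    return fixed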


latent semantic

Appears in 5 sentences as: latent semantic (5) latent semantics (1)
In Bootstrapping Semantic Analyzers from Non-Contradictory Texts
  1. A simple and efficient inference method recursively induces joint semantic representations for each group and discovers correspondence between lexical entries and latent semantic concepts.
    Page 1, “Abstract”
  2. We assume that each text in a group is independently generated from a full latent semantic state corresponding to the group.
    Page 1, “Introduction”
  3. Unsupervised learning with shared latent semantic representations presents its own challenges, as exact inference requires marginalization over possible assignments of the latent semantic state, consequently introducing nonlocal statistical dependencies between the decisions about the semantic structure of each text.
    Page 2, “Introduction”
  4. Figure 3: The semantics-text correspondence model with K documents sharing the same latent semantic state.
    Page 5, “A Model of Semantics” (see the formula after this list)
  5. However, exact inference for groups of documents with overlapping semantic representation is generally prohibitively expensive, as the shared latent semantics introduces nonlocal dependencies between semantic representations of individual documents.
    Page 9, “Summary and Future Work”
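
Sentences 2 and 4 above state the central independence assumption: given the shared latent semantic state s of a group, the K texts and their alignments are generated independently. In notation assumed here (not quoted from the paper), this reads:

P(s, a_1, w_1, \ldots, a_K, w_K) = P(s) \prod_{k=1}^{K} P(a_k, w_k \mid s)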


semantic parsing

Appears in 5 sentences as: semantic parsing (5)
In Bootstrapping Semantic Analyzers from Non-Contradictory Texts
  1. In recent years, there has been increasing interest in statistical approaches to semantic parsing.
    Page 1, “Introduction”
  2. The alignment a defines how semantics is verbalized in the text w, and it can be represented by a meaning derivation tree in case of full semantic parsing (Poon and Domingos, 2009) or, e.g., by a hierarchical segmentation into utterances along with an utterance-field alignment in a more shallow variation of the problem.
    Page 3, “Inference with Non-Contradictory Documents”
  3. In semantic parsing, we aim to find the most likely underlying semantics and alignment given the text:
    Page 3, “Inference with Non-Contradictory Documents” (the implied formula appears after this list)
  4. This is a weaker form of supervision than the one traditionally considered in supervised semantic parsing, where the alignment is also usually provided in training (Chen and Mooney, 2008; Zettlemoyer and Collins, 2005).
    Page 5, “A Model of Semantics”
  5. …semantic parsing) accuracy is not possible on this dataset, as the data does not contain information about which fields are discussed.
    Page 8, “Empirical Evaluation”
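
Sentence 3 above is cut off just before the corresponding formula. The standard MAP objective it describes, written in notation assumed here rather than quoted from the paper, is:

(\hat{m}, \hat{a}) = \arg\max_{m, a} P(m, a \mid w) = \arg\max_{m, a} P(m, a, w)

The second equality holds because the text w is fixed, so conditioning on it only rescales the objective by a constant.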


model trained

Appears in 4 sentences as: model trained (3) models trained (1)
In Bootstrapping Semantic Analyzers from Non-Contradictory Texts
  1. Similarly, we call the models trained from this data supervised, though full supervision was not available.
    Page 7, “Empirical Evaluation”
  2. Additionally, we report the results of the model trained with all the 750 texts labeled (Supervised UB); its scores can be regarded as an upper bound on the results of the semi-supervised models.
    Page 8, “Empirical Evaluation”
  3. Surprisingly, its precision is higher than that of the model trained on 750 labeled examples, though admittedly it is achieved at a very different recall level.
    Page 8, “Empirical Evaluation”
  4. To confirm that the model trained by our approach indeed assigns new words to correct fields and records, we visualize top words for the field characterizing sky cover (table 2).
    Page 8, “Empirical Evaluation”


model parameters

Appears in 3 sentences as: model parameters (3)
In Bootstrapping Semantic Analyzers from Non-Contradictory Texts
  1. In the supervised case, where a and m are observable, estimation of the generative model parameters is generally straightforward.
    Page 3, “Inference with Non-Contradictory Documents”
  2. We select the model parameters θ by maximizing the marginal likelihood of the data, where the data D is given in the form of groups w = …
    Page 6, “A Model of Semantics” (see the formula after this list)
  3. When estimating the model parameters, we followed the training regime prescribed in (Liang et al., 2009).
    Page 7, “Empirical Evaluation”
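
Sentence 2 above is also truncated before its formula. Maximizing the marginal likelihood of the observed groups of texts, with the meanings and alignments marginalized out, can be written as follows (notation assumed here, not quoted from the paper):

\theta^{*} = \arg\max_{\theta} \sum_{w \in D} \log \sum_{m, a} P(m, a, w \mid \theta)

As the quoted sentences note, computing the inner sum exactly is prohibitively expensive for groups of documents sharing a latent state, so it is approximated during the E-step.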
