Abstract | Compositional question answering begins by mapping questions to logical forms, but training a semantic parser to perform this mapping typically requires the costly annotation of the target logical forms. |
Abstract | On two standard semantic parsing benchmarks (GEO and JOBS), our system obtains the highest published accuracies, despite requiring no annotated logical forms. |
Experiments | (2010) (henceforth, SEMRESP), which also learns a semantic parser from question-answer pairs. |
Experiments | Next, we compared our systems (DCS and DCS+) with state-of-the-art semantic parsers on the full dataset for both GEO and JOBS (see Table 3).
Introduction | Answering these types of complex questions compositionally involves first mapping the questions into logical forms (semantic parsing).
Introduction | Supervised semantic parsers (Zelle and Mooney, 1996; Tang and Mooney, 2001; Ge and Mooney, 2005; Zettlemoyer and Collins, 2005; Kate and Mooney, 2007; Zettlemoyer and Collins, 2007; Wong and Mooney, 2007; Kwiatkowski et al., 2010) rely on manual annotation of logical forms, which is expensive. |
Introduction | On the other hand, existing unsupervised semantic parsers (Poon and Domingos, 2009) do not handle deeper linguistic phenomena such as quantification, negation, and superlatives. |
Semantic Parsing | Model. We now present our discriminative semantic parsing model, which places a log-linear distribution over z ∈ Z_L(x) given an utterance x.
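Semantic Parsing | A log-linear distribution of this kind can be sketched as follows (a generic formulation, not taken verbatim from the source: θ denotes a parameter vector and φ(x, z) a feature map over utterance–logical-form pairs):

```latex
% Log-linear distribution over candidate logical forms z \in Z_L(x)
% for an utterance x, normalized over the candidate set Z_L(x).
p_\theta(z \mid x) \;=\;
  \frac{\exp\{\phi(x, z)^{\top}\theta\}}
       {\sum_{z' \in Z_L(x)} \exp\{\phi(x, z')^{\top}\theta\}}
```

The normalization runs only over the logical forms Z_L(x) that the grammar licenses for x, which is what makes the model discriminative.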
Abstract | We then replace the human semantic role annotators with automatic shallow semantic parsing to further automate the evaluation metric, and show that even the semi-automated evaluation metric achieves a 0.34 correlation coefficient with human adequacy judgment, which is still about 80% as closely correlated as HTER despite an even lower labor cost for the evaluation procedure.
Abstract | Finally, we show that replacing the human semantic role labelers with an automatic shallow semantic parser in our proposed metric yields an approximation that is about 80% as closely correlated with human judgment as HTER, at an even lower cost—and is still far better correlated than n-gram based evaluation metrics. |
Abstract | It is now worth asking a deeper question: can we further reduce the labor cost of MEANT by using automatic shallow semantic parsing instead of humans for semantic role labeling? |