Abstract | We use a Bayesian generative model to capture relevant natural language phenomena and translate the English specification into a specification tree, which is then translated into a C++ input parser. |
Introduction | The general problem of translating natural language specifications into executable code has been around since the field of computer science was founded. |
Introduction | Figure 1: An example of (a) one natural language specification describing program input data; (b) the corresponding specification tree representing the program input structure; and (c) two input examples |
Introduction | Recent advances in this area include the successful translation of natural language commands into database queries (Wong and Mooney, 2007; Zettlemoyer and Collins, 2009; Poon and Domingos, 2009; Liang et al., 2011) and the successful mapping of natural language instructions into Windows command sequences (Branavan et al., 2009; Branavan et al., 2010). |
Model | Finally, it generates natural language feature observations conditioned on the hidden specification trees. |
Model | We define a range of features that capture the correspondence between the input format and its description in natural language.
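Model | As a rough, hedged illustration of this setup (not the paper's actual generative model or feature set), the sketch below pairs a hypothetical specification-tree node with a phrase from the natural language specification and computes a simple word-overlap feature; the SpecNode class and overlap_feature function are invented for illustration.

```python
# A minimal sketch, assuming a simplified tree of input fields; the class and
# the feature below are illustrative only, not the paper's actual model.
from dataclasses import dataclass, field
from typing import List


@dataclass
class SpecNode:
    """One node of a specification tree: a named input field with a type and children."""
    name: str                                   # e.g. "N", the number of test cases
    dtype: str                                  # e.g. "int", "float", "line"
    children: List["SpecNode"] = field(default_factory=list)


def overlap_feature(node: SpecNode, phrase: str) -> float:
    """Toy feature: lexical overlap between a node's name/type and a phrase
    from the natural language specification that may describe it."""
    node_tokens = {node.name.lower(), node.dtype.lower()}
    phrase_tokens = set(phrase.lower().split())
    return len(node_tokens & phrase_tokens) / max(len(phrase_tokens), 1)


n = SpecNode(name="N", dtype="int")
root = SpecNode(name="input", dtype="line", children=[n])
print(overlap_feature(n, "the first line contains an integer n"))  # 1/7
```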
Related Work | NLP in Software Engineering Researchers have recently developed a number of approaches that apply natural language processing techniques to software engineering problems. |
Related Work | This research analyzes natural language in documentation or comments to better understand existing application programs. |
Related Work | Our mechanism, in contrast, automatically generates parser programs from natural language input format descriptions. |
Abstract | This paper further explores linguistic features that explain why certain relations are preserved in English writing and that contribute to related tasks such as native language identification.
Approach | Recently, native language identification has drawn the attention of NLP researchers.
Approach | A shared task on native language identification took place at a NAACL-HLT 2013 workshop.
Discussion | This tendency in the length of noun-noun compounds provides us with a crucial insight for native language identification, which we will return to below.
Experiments | Because some of the writers had more than one native language, we excluded essays that did not meet the following three conditions: (i) the writer has only one native language; (ii) the writer has only one language at home; (iii) the two languages in (i) and (ii) are the same as the native language of the subcorpus to which the essay belongs.
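Experiments | The following is a minimal sketch of how such a filter might be expressed; the Essay fields and the example records are invented for illustration and do not reflect the actual ICLE metadata format.

```python
# Illustrative only: apply the three exclusion conditions described above.
from dataclasses import dataclass
from typing import List


@dataclass
class Essay:
    native_languages: List[str]    # languages the writer reports as native
    home_languages: List[str]      # languages the writer speaks at home
    subcorpus_language: str        # native language of the subcorpus the essay belongs to


def keep(essay: Essay) -> bool:
    """The three conditions from the text: (i) one native language,
    (ii) one language at home, (iii) both equal the subcorpus's native language."""
    return (
        len(essay.native_languages) == 1
        and len(essay.home_languages) == 1
        and essay.native_languages[0] == essay.subcorpus_language
        and essay.home_languages[0] == essay.subcorpus_language
    )


essays = [
    Essay(["French"], ["French"], "French"),            # kept
    Essay(["French", "Dutch"], ["French"], "French"),   # excluded: two native languages
]
print([keep(e) for e in essays])  # [True, False]
```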
Experiments | Table: number of essays and number of tokens per native language.
Implications for Work in Related Domains | (2005) work on native language identification and show that machine learning-based methods are effective. |
Implications for Work in Related Domains | Related to this, other researchers (Koppel and Ordan, 2011; van Halteren, 2008) show that machine learning-based methods can also predict the source language of a given translated text, although it should be emphasized that this is a different task from native language identification because translation is typically performed not by nonnative speakers but by native speakers of the target language.
Implications for Work in Related Domains | The experimental results show that n-grams containing articles are predictive for identifying native languages.
Introduction | This becomes important in native language identification, which is useful for improving grammatical error correction systems (Chodorow et al., 2010) or for providing more targeted feedback to language learners.
Introduction | As shown in Section 6, this paper reveals several crucial findings that contribute to improving native language identification.
Conclusion | In particular, more research is needed to handle more complex matches between database and textual relations, and to handle more complex natural language queries. |
Introduction | Semantic parsing is the task of translating natural language utterances to a formal meaning representation language (Chen et al., 2010; Liang et al., 2009; Clarke et al., 2010; Liang et al., 2011; Artzi and Zettlemoyer, 2011). |
Previous Work | Two existing systems translate between natural language questions and database queries over large-scale databases. |
Previous Work | (2012) report on a system for translating natural language queries to SPARQL queries over the Yago2 (Hoffart et al., 2013) database. |
Previous Work | The manual extraction patterns predefine a link between natural language terms and Yago2 relations. |
Textual Schema Matching | The textual schema matching task is to identify natural language words and phrases that correspond with each relation and entity in a fixed schema for a relational database. |
Textual Schema Matching | The problem would be greatly simplified if M were a 1-1 function, but in practice most database relations can be referred to in many ways by natural language users: for instance, film_actor can be referenced by the English verbs "played," "acted," and "starred," along with their morphological variants.
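Textual Schema Matching | As an illustrative simplification (not the system's actual representation of M), the snippet below stores the many-to-one mapping as a dictionary from a database relation to the phrases that can refer to it; the verb forms beyond the three quoted above are assumed morphological variants.

```python
# Illustrative only: M as a mapping from a database relation to the natural
# language phrases that can refer to it.
M = {
    "film_actor": ["played", "acted", "starred", "plays", "acting", "stars"],
}


def relations_for(phrase, mapping=M):
    """Invert M: return every database relation the phrase can refer to."""
    return [rel for rel, phrases in mapping.items() if phrase in phrases]


print(relations_for("starred"))  # ['film_actor']
```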
Overview of the Approach | Problem: Our goal is to learn a function that will map a natural language question x to a query z over a database D. The database D is a collection of assertions of the form r(e1, e2) where r is a binary relation.
Overview of the Approach | The lexicon L associates natural language patterns to database concepts, thereby defining the space of queries that can be derived from the input question (see Table 2). |
Question Answering Model | To answer questions, we must find the best query for a given natural language question. |
Question Answering Model | To define the space of possible queries, PARALEX uses a lexicon L that encodes mappings from natural language to database concepts (entities, relations, and queries). |
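Question Answering Model | The sketch below is a rough, simplified rendering of the data structures implied by this setup rather than the actual PARALEX implementation; the entities, relations, and lexicon entries are invented examples.

```python
# A rough, simplified sketch of the setup described above, not the actual
# PARALEX system. The database D holds assertions r(e1, e2); the lexicon L maps
# natural language patterns to database concepts. All names are invented.
D = {
    ("directed", "james cameron", "titanic"),
    ("directed", "james cameron", "avatar"),
}

L = {
    "who made": ("relation", "directed"),   # question pattern -> database relation
    "directed": ("relation", "directed"),
    "titanic":  ("entity", "titanic"),      # noun phrase -> database entity
}


def answer(relation, entity):
    """Evaluate the query r(?x, entity) against D and return bindings for ?x."""
    return [e1 for (r, e1, e2) in D if r == relation and e2 == entity]


# "who made titanic": the lexicon maps "who made" to the relation and
# "titanic" to the entity, yielding the query directed(?x, titanic).
rel = L["who made"][1]
ent = L["titanic"][1]
print(answer(rel, ent))  # ['james cameron']
```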
Related Work | Our work builds upon two major threads of research in natural language processing: information extraction (IE), and natural language interfaces to databases (NLIDB). |
Related Work | While much progress has been made in converting text into structured knowledge, there has been little work on answering natural language questions over these databases. |
Related Work | However, we use a paraphrase corpus for extracting lexical items relating natural language patterns to database concepts, as opposed to relationships between pairs of natural language utterances. |
Abstract | Semantic parsing is the problem of deriving a structured meaning representation from a natural language utterance. |
Conclusions | We have presented a semantic parser which uses techniques from machine translation to learn mappings from natural language to variable-free meaning representations. |
Introduction | Semantic parsing (SP) is the problem of transforming a natural language (NL) utterance into a machine-interpretable meaning representation (MR). |
MT-based semantic parsing | This allows a predicate such as cityid, which in some training examples is unary, to align with different natural language strings depending on context.
Related Work | Other work which generalizes from variable-free meaning representations to λ-calculus expressions includes the natural language generation procedure described by Lu and Ng (2011).
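Related Work | For concreteness, the snippet below contrasts an approximate variable-free meaning representation with a λ-calculus style one for the same utterance; the syntax is only loosely GeoQuery-like and is not taken from either paper.

```python
# Approximate illustration only; neither representation is verbatim from the
# papers discussed above.
examples = [
    {
        "utterance": "what is the capital of texas",
        "variable_free_mr": "answer(capital(stateid('texas')))",
        "lambda_calculus_mr": "λx. capital(texas, x)",
    },
]

for ex in examples:
    print(ex["utterance"])
    print("  variable-free:", ex["variable_free_mr"])
    print("  λ-calculus:   ", ex["lambda_calculus_mr"])
```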
Abstract | Finding concepts in natural language utterances is a challenging task, especially given the scarcity of labeled data for learning to resolve semantic ambiguity.
Introduction | Semantic tagging is used in natural language understanding (NLU) to recognize words of semantic importance in an utterance, such as entities. |
Introduction | Our SSL approach uses a probabilistic clustering method tailored for tagging natural language utterances.
Semi-Supervised Semantic Labeling | In (Subramanya et al., 2010), a new SSL method is described for adapting syntactic POS tagging of sentences in newswire articles along with search queries to a target domain of natural language (NL) questions. |
Abstract | Our unsupervised model brings together familiar components in natural language processing (like parsers and topic models) with contextual political information—temporal and dyad dependence—to infer latent event classes. |
Experiments | (This highlights how better natural language processing could help the model, and the dangers of false positives for this type of data analysis, especially in small-sample drilldowns.) |
Related Work | 6.2 Events in Natural Language Processing |
Related Work | Political event extraction from news has also received considerable attention within natural language processing in part due to government-funded challenges such as MUC-3 and MUC-4 (Lehnert, 1994), which focused on the extraction of terrorist events, as well as the more recent ACE program. |
Abstract | Natural language parsing has typically been done with small sets of discrete categories such as NP and VP, but this representation does not capture the full syntactic or semantic richness of linguistic phrases, and attempts to improve on this by lexicalizing phrases or splitting categories only partly address the problem at the cost of huge feature spaces and sparseness.
Introduction | Syntactic parsing is a central task in natural language processing because of its importance in mediating between linguistic expression and meaning. |
Introduction | In many natural language systems, single words and n-grams are usefully described by their distributional similarities (Brown et al., 1992), among many others. |
Introduction | However, their model is lacking in that it cannot represent the recursive structure inherent in natural language.
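Introduction | A toy sketch of what "representing recursive structure" can mean in a distributional setting appears below: phrase vectors are built by recursively composing child vectors over a binary parse. The composition matrix, word vectors, and tiny dimensionality are placeholders, not the paper's trained parameters.

```python
# A toy sketch of recursive vector composition over a binary parse, loosely in
# the spirit of recursive neural models; all parameters are random placeholders.
import numpy as np

rng = np.random.default_rng(0)
dim = 4
word_vec = {w: rng.normal(size=dim) for w in ["the", "cat", "sat"]}
W = rng.normal(size=(dim, 2 * dim))   # composition matrix for a pair of children


def compose(node):
    """node is either a word (str) or a pair (left, right); return its vector."""
    if isinstance(node, str):
        return word_vec[node]
    left, right = node
    child = np.concatenate([compose(left), compose(right)])
    return np.tanh(W @ child)         # parent vector keeps the children's dimensionality


# The recursion mirrors the binary parse ((the cat) sat).
print(compose((("the", "cat"), "sat")))
```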
Abstract | Part-of-speech tagging is a crucial preliminary process in many natural language processing applications. |
Abstract | Because many words in natural languages have more than one part-of-speech tag, resolving part-of-speech ambiguity is an important task. |
Introduction | Part-of-speech (POS) tagging is an important preprocessing step for many natural language processing applications because grammatical rules are not functions of individual words; instead, they are functions of word categories.
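Introduction | The toy example below illustrates this point: the same word form receives different tags depending on context, and the disambiguating rule refers to a word category rather than an individual word. The lexicon and the rule are invented and are not an actual tagger.

```python
# A toy illustration only, not an actual tagger: "book" can be a noun or a
# verb, and the disambiguating rule mentions the category DET, not any word.
lexicon = {"book": {"NOUN", "VERB"}, "a": {"DET"}, "that": {"DET"}, "flight": {"NOUN"}}


def tag(tokens):
    tags = []
    for tok in tokens:
        options = lexicon[tok]
        if len(options) == 1:
            tags.append(next(iter(options)))
        elif tags and tags[-1] == "DET":
            tags.append("NOUN")   # category-level rule: a determiner is followed by a noun
        else:
            tags.append("VERB")   # otherwise read the ambiguous form as a verb
    return tags


print(tag(["book", "a", "flight"]))  # ['VERB', 'DET', 'NOUN']
print(tag(["that", "book"]))         # ['DET', 'NOUN']
```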
Corpus Information | We use as our corpus the 4.5 million word International Corpus of Learner English (ICLE) (Granger et al., 2009), which consists of more than 6000 essays written by university undergraduates from 16 countries and 16 native languages who are learners of English as a Foreign Language.
Corpus Information | Fifteen native languages are represented in the set of essays selected for annotation. |
Introduction | Automated essay scoring, the task of employing computer technology to evaluate and score written text, is one of the most important educational applications of natural language processing (NLP) (see Shermis and Burstein (2003) and Shermis et al. |
Introduction | Recent years have seen increased interest in grounded language acquisition, where the goal is to extract representations of the meaning of natural language tied to the physical world. |
Introduction | The language grounding problem has assumed several guises in the literature such as semantic parsing (Zelle and Mooney, 1996; Zettlemoyer and Collins, 2005; Kate and Mooney, 2007; Lu et al., 2008; Börschinger et al., 2011), mapping natural language instructions to executable actions (Branavan et al., 2009; Tellex et al., 2011), associating simplified language to perceptual data such as images or video (Siskind, 2001; Roy and Pentland, 2002; Gorniak and Roy, 2004; Yu and Ballard, 2007), and learning the meaning of words based on linguistic and perceptual input (Bruni et al., 2012b; Feng and Lapata, 2010; Johns and Jones, 2012; Andrews et al., 2009; Silberer and Lapata, 2012).
Related Work | Since our goal is to develop distributional models that are applicable to many words, it contains a considerably larger number of concepts (i.e., more than 500) and attributes (i.e., 412) based on a detailed taxonomy which we argue is cognitively plausible and beneficial for image and natural language processing tasks. |
Abstract | We address the challenge of generating natural language abstractive summaries for spoken meetings in a domain-independent fashion. |
Introduction | To the best of our knowledge, our system is the first fully automatic system to generate natural language abstracts for spoken meetings.
Surface Realization | In this section, we describe surface realization, which renders the relation instances into natural language abstracts. |
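Surface Realization | As a minimal, hypothetical illustration of template-style rendering (not the paper's realizer), the snippet below turns one relation instance into a sentence; the field names and the template are invented.

```python
# Illustrative only: render an invented relation instance with a fixed template.
def realize(instance):
    return "{subject} {predicate} {object}.".format(**instance).capitalize()


instance = {"subject": "the group", "predicate": "agreed on", "object": "the remote control design"}
print(realize(instance))  # "The group agreed on the remote control design."
```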
Introduction | Open-domain question answering (QA), which fulfills a user’s information need by outputting direct answers to natural language queries, is a challenging but important problem (Etzioni, 2011). |
Introduction | Due to the variety of word choices and inherent ambiguities in natural languages, bag-of-words approaches with simple surface-form word matching tend to produce brittle results with poor prediction accuracy (Bilotti et al., 2007).
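Introduction | A toy example of this brittleness is sketched below: a paraphrased answer scores lower under surface word overlap than an irrelevant sentence that happens to share words with the question. The sentences are invented.

```python
# Toy illustration of why surface word overlap is brittle for answer selection.
def overlap(question, sentence):
    q, s = set(question.lower().split()), set(sentence.lower().split())
    return len(q & s)


question = "who wrote the opera Carmen"
relevant = "Bizet composed Carmen"                        # paraphrase: wrote -> composed
distractor = "the critic wrote that the opera was long"   # shares words, wrong answer

print(overlap(question, relevant))    # 1
print(overlap(question, distractor))  # 3
```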
Related Work | While the task of question answering has a long history dated back to the dawn of artificial intelligence, early systems like STUDENT (Winograd, 1977) and LUNAR (Woods, 1973) are typically designed to demonstrate natural language understanding for a small and specific domain. |