Index of papers in Proc. ACL that mention
  • neural network
Zhu, Xiaodan and Guo, Hongyu and Mohammad, Saif and Kiritchenko, Svetlana
Abstract
This model learns the syntax and semantics of the negator’s argument with a recursive neural network.
Experimental results
Furthermore, modeling the syntax and semantics with the state-of-the-art recursive neural network (models 7 and 8) can dramatically improve the performance over model 6.
Experimental results
Note that the two neural network based models incorporate the syntax and semantics by representing each node with a vector.
Experimental results
Note that this is a special case of what the neural network based models can model.
Introduction
This model learns the syntax and semantics of the negator’s argument with a recursive neural network.
Related work
The more recent work of (Socher et al., 2012; Socher et al., 2013) proposed models based on recursive neural networks that do not rely on any heuristic rules.
Related work
In principle, a neural network is able to fit very complicated functions (Mitchell, 1997), and in this paper we adapt the state-of-the-art approach described in (Socher et al., 2013) to help understand the behavior of negators specifically.
Semantics-enriched modeling
A recursive neural tensor network (RNTN) is a specific form of feed-forward neural network built over a syntactic (phrasal-structure) parse tree to conduct compositional sentiment analysis.
Semantics-enriched modeling
A major difference of RNTN from the conventional recursive neural network (RNN) (Socher et al., 2012) is the use of the tensor V in order to directly capture the multiplicative interaction of two input vectors, although the matrix W implicitly captures the nonlinear interaction between the input vectors.
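As a rough illustration of the tensor-based composition this sentence refers to (a sketch, not the paper's exact formulation; the dimension d, the initialization, and the tanh nonlinearity are assumptions):

    import numpy as np

    d = 10                                        # assumed embedding dimension
    W = np.random.randn(d, 2 * d) * 0.01          # conventional recursive-NN weight matrix
    V = np.random.randn(d, 2 * d, 2 * d) * 0.01   # tensor capturing multiplicative interactions

    def rntn_compose(a, b):
        # Compose two child vectors a, b (each of dimension d) into a parent vector.
        # The tensor term models the multiplicative interaction of the children;
        # the W term is the ordinary additive recursive-NN part.
        c = np.concatenate([a, b])                            # [a; b], shape (2d,)
        tensor_term = np.array([c @ V[k] @ c for k in range(d)])
        return np.tanh(tensor_term + W @ c)

    parent = rntn_compose(np.random.randn(d), np.random.randn(d))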
Semantics-enriched modeling
This is actually an interesting place to extend the current recursive neural network to consider extrinsic knowledge.
neural network is mentioned in 12 sentences in this paper.
Topics mentioned in this paper:
Tang, Duyu and Wei, Furu and Yang, Nan and Zhou, Ming and Liu, Ting and Qin, Bing
Abstract
Specifically, we develop three neural networks to effectively incorporate the supervision from sentiment polarity of text (e.g.
Introduction
To this end, we extend the existing word embedding learning algorithm (Collobert et al., 2011) and develop three neural networks to effectively incorporate the supervision from sentiment polarity of text (e.g.
Introduction
• We develop three neural networks to learn sentiment-specific word embedding (SSWE) from massive distant-supervised tweets without any manual annotations;
Related Work
propose Recursive Neural Network (RNN) (2011b), matrix-vector RNN (2012) and Recursive Neural Tensor Network (RNTN) (2013b) to learn the compositionality of phrases of any length based on the representation of each pair of children recursively.
Related Work
(2011) that follow the probabilistic document model (Blei et al., 2003) and give a sentiment predictor function to each word, we develop neural networks and map each ngram to the sentiment polarity of the sentence.
Related Work
We extend the existing word embedding learning algorithm (Collobert et al., 2011) and develop three neural networks to learn SSWE.
neural network is mentioned in 16 sentences in this paper.
Topics mentioned in this paper:
Huang, Eric and Socher, Richard and Manning, Christopher and Ng, Andrew
Abstract
We present a new neural network architecture which 1) learns word embeddings that better capture the semantics of words by incorporating both local and global document context, and 2) accounts for homonymy and polysemy by learning multiple embeddings per word.
Experiments
We use 10-word windows of text as the local context, 100 hidden units, and no weight regularization for both neural networks.
Global Context-Aware Neural Language Model
In this section, we describe the training objective of our model, followed by a description of the neural network architecture, ending with a brief description of our model’s training method.
Global Context-Aware Neural Language Model
We compute scores g(s, d) and g(s^w, d), where s^w is s with the last word replaced by word w, and g(·, ·) is the scoring function that represents the neural networks used.
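A minimal sketch of the kind of scoring and ranking objective these sentences describe, assuming a one-hidden-layer scorer over the concatenated local window and global document vector; the sizes and the margin-based loss are assumptions, not the paper's exact setup:

    import numpy as np

    rng = np.random.default_rng(0)
    d, h, window = 50, 100, 10                      # assumed sizes (the quoted setup mentions
                                                    # 10-word windows and 100 hidden units)
    W1 = rng.normal(0, 0.01, (h, window * d + d))   # local window + global document vector
    b1 = np.zeros(h)
    w2 = rng.normal(0, 0.01, h)

    def g(window_vectors, doc_vector):
        # Score g(s, d): one hidden layer over the concatenated local context
        # (a window of word vectors) and the global document vector.
        x = np.concatenate([np.concatenate(window_vectors), doc_vector])
        return float(w2 @ np.tanh(W1 @ x + b1))

    def ranking_loss(s_vectors, s_w_vectors, doc_vector):
        # Hinge loss (assumed form): the observed window s should outscore the
        # corrupted window s^w (last word replaced) by a margin of 1.
        return max(0.0, 1.0 - g(s_vectors, doc_vector) + g(s_w_vectors, doc_vector))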
Global Context-Aware Neural Language Model
2.2 Neural Network Architecture
neural network is mentioned in 13 sentences in this paper.
Topics mentioned in this paper:
Pei, Wenzhe and Ge, Tao and Chang, Baobao
Abstract
Recently, neural network models for natural language processing tasks have received increasing attention for their ability to alleviate the burden of manual feature engineering.
Abstract
In this paper, we propose a novel neural network model for Chinese word segmentation called Max-Margin Tensor Neural Network (MMTNN).
Abstract
Experiments on the benchmark dataset show that our model achieves better performance than previous neural network models and that our model can achieve a competitive performance with minimal feature engineering.
Introduction
Recently, neural network models have received increasing attention for their ability to minimize the effort spent on feature engineering.
Introduction
Workable as previous neural network models seem, one limitation worth pointing out is that the tag-tag, tag-character, and character-character interactions are not well modeled.
Introduction
In previous neural network models, however, hardly can such interactional effects be fully captured relying only on the simple transition score and the single nonlinear transformation (See section 2).
neural network is mentioned in 39 sentences in this paper.
Topics mentioned in this paper:
Boros, Tiberiu and Ion, Radu and Tufis, Dan
Abstract
In this paper we present an alternative method to Tiered Tagging, based on local optimizations with Neural Networks, and we show how, by properly encoding the input sequence in a general Neural Network architecture, we achieve results similar to the Tiered Tagging methodology, significantly faster and without requiring the extensive linguistic knowledge implied by the previously mentioned method.
Abstract
In this article, we propose an alternative solution based on local optimizations with feed-forward neural networks.
Abstract
2 Large tagset part-of-speech tagging with feed-forward neural networks
neural network is mentioned in 18 sentences in this paper.
Topics mentioned in this paper:
Ma, Ji and Zhang, Yue and Zhu, Jingbo
Abstract
The representation is integrated as features into a neural network that serves as a scorer for an easy-first POS tagger.
Abstract
Parameters of the neural network are trained using guided learning in the second phase.
Easy-first POS tagging with Neural Network
The neural network proposed in Section 3 is used for POS disambiguation by the easy-first POS tagger.
Easy-first POS tagging with Neural Network
At each step, the algorithm adopts a scorer, the neural network in our case, to assign a score to each possible word-tag pair (w, t), and then selects the highest-scoring one (ŵ, t̂) to tag (i.e., tag ŵ with t̂).
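A minimal sketch of the easy-first loop this sentence describes; the scorer here is a stand-in for the neural network, and the interface score(i, t, tags) is an assumption for illustration:

    def easy_first_tag(words, candidate_tags, score):
        # Greedy easy-first tagging: repeatedly commit to the (word, tag) pair the
        # scorer is most confident about among the still-untagged words.
        # score(i, t, tags) may consult already-assigned neighbouring tags.
        tags = [None] * len(words)
        untagged = set(range(len(words)))
        while untagged:
            _, i, t = max((score(i, t, tags), i, t)
                          for i in untagged for t in candidate_tags)
            tags[i] = t
            untagged.remove(i)
        return tags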
Easy-first POS tagging with Neural Network
While previous work (Shen et al., 2007; Zhang and Clark, 2011; Goldberg and Elhadad, 2010) applies guided learning to train a linear classifier using variants of the perceptron algorithm, we are the first to combine guided learning with a neural network, by using a margin loss and a modified back-propagation algorithm.
Introduction
We integrate the learned encoder with a set of well-established features for POS tagging (Ratnaparkhi, 1996; Collins, 2002) in a single neural network , which is applied as a scorer to an easy-first POS tagger.
Introduction
To our knowledge, we are the first to investigate guided learning for neural networks.
Neural Network for POS Disambiguation
We integrate the learned WRRBM into a neural network , which serves as a scorer for POS disambiguation.
Neural Network for POS Disambiguation
The main challenge to designing the neural network structure is: on the one hand, we hope that the model can take the advantage of information provided by the learned WRRBM, which reflects general properties of web texts, so that the model generalizes well in the web domain; on the other hand, we also hope to improve the model’s discriminative power by utilizing well-established POS tagging features, such as those of Ratnaparkhi (1996).
Neural Network for POS Disambiguation
Our approach is to leverage the two sources of information in one neural network by combining them through a shared output layer, as shown in Figure 1.
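A minimal sketch of combining two information sources through a shared output layer, as the sentence describes; the layer sizes, the tanh nonlinearity, and the dense encoding of the feature input are assumptions:

    import numpy as np

    rng = np.random.default_rng(1)
    d_emb, d_feat, h1, h2 = 100, 500, 50, 50       # assumed sizes

    W_emb = rng.normal(0, 0.01, (h1, d_emb))       # path for the learned word representations
    W_feat = rng.normal(0, 0.01, (h2, d_feat))     # path for the hand-crafted POS-tagging features
    w_out = rng.normal(0, 0.01, h1 + h2)           # shared output layer reading both hidden layers

    def score(representation_input, feature_input):
        # Each source gets its own hidden layer; a single output layer combines them.
        hidden = np.concatenate([np.tanh(W_emb @ representation_input),
                                 np.tanh(W_feat @ feature_input)])
        return float(w_out @ hidden)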
neural network is mentioned in 15 sentences in this paper.
Topics mentioned in this paper:
liu, lemao and Watanabe, Taro and Sumita, Eiichiro and Zhao, Tiejun
Abstract
A neural network is a reasonable method to address these pitfalls.
Abstract
However, modeling SMT with a neural network is not trivial, especially when taking the decoding efficiency into consideration.
Abstract
In this paper, we propose a variant of a neural network, i.e.
Introduction
A neural network (Bishop, 1995) is a reasonable method to overcome the above shortcomings.
Introduction
In the search procedure, frequent computation of the model score is needed for the search heuristic function, which challenges the decoding efficiency of a neural network based translation model.
Introduction
In this paper, we propose a variant of neural networks, i.e.
neural network is mentioned in 34 sentences in this paper.
Topics mentioned in this paper:
Socher, Richard and Bauer, John and Manning, Christopher D. and Andrew Y., Ng
Abstract
Instead, we introduce a Compositional Vector Grammar (CVG), which combines PCFGs with a syntactically untied recursive neural network that learns syntactico-semantic, compositional vector representations.
Introduction
The vectors for nonterminals are computed via a new type of recursive neural network which is conditioned on syntactic categories from a PCFG.
Introduction
1. CVGs combine the advantages of standard probabilistic context-free grammars (PCFGs) with those of recursive neural networks (RNNs).
Introduction
This requires the composition function to be extremely powerful, since it has to combine phrases with different syntactic head words, and it is hard to optimize since the parameters form a very deep neural network.
neural network is mentioned in 18 sentences in this paper.
Topics mentioned in this paper:
Yang, Nan and Liu, Shujie and Li, Mu and Zhou, Ming and Yu, Nenghai
Abstract
In this paper, we explore a novel bilingual word alignment approach based on DNN (Deep Neural Network ), which has been proven to be very effective in various machine learning tasks (Collobert et al., 2011).
DNN structures for NLP
The lookup process is called a lookup layer LT, which is usually the first layer after the input layer in a neural network.
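A minimal sketch of a lookup layer, assuming a toy vocabulary and 50-dimensional embeddings; the table rows are parameters that get updated by back-propagation like any other layer:

    import numpy as np

    vocab = {"the": 0, "cat": 1, "sat": 2}          # toy vocabulary (assumed)
    LT = np.random.randn(len(vocab), 50) * 0.01     # lookup table: one vector per word

    def lookup_layer(tokens):
        # Maps discrete word indices to their embedding vectors; this is
        # typically the first layer after the input layer.
        return LT[[vocab[t] for t in tokens]]       # shape (len(tokens), 50)

    X = lookup_layer(["the", "cat", "sat"])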
DNN structures for NLP
Multilayer neural networks are trained with the standard back propagation algorithm (LeCun, 1985).
DNN structures for NLP
Techniques such as layerwise pre-training (Bengio et al., 2007) and many tricks (LeCun et al., 1998) have been developed to train better neural networks.
Introduction
In recent years, research communities have seen a strong resurgence of interest in modeling with deep (multilayer) neural networks.
Introduction
For speech recognition, (Dahl et al., 2012) proposed a context-dependent neural network with a large vocabulary, which achieved a 16.0% relative error reduction.
Introduction
(Collobert et al., 2011) and (Socher et al., 2011) further apply Recursive Neural Networks to address the structural prediction tasks such as tagging and parsing, and (Socher et al., 2012) explores the compositional aspect of word representations.
Related Work
(Seide et al., 2011) and (Dahl et al., 2012) apply Context-Dependent Deep Neural Network with HMM (CD-DNN-HMM) to speech recognition task, which significantly outperforms traditional models.
Related Work
(Bengio et al., 2006) proposed to use a multilayer neural network for the language modeling task.
neural network is mentioned in 33 sentences in this paper.
Topics mentioned in this paper:
Auli, Michael and Gao, Jianfeng
Abstract
Neural network language models are often trained by optimizing likelihood, but we would prefer to optimize for a task specific metric, such as BLEU in machine translation.
Abstract
We show how a recurrent neural network language model can be optimized towards an expected BLEU loss instead of the usual cross-entropy criterion.
Expected BLEU Training
We integrate the recurrent neural network language model as an additional feature into the standard log-linear framework of translation (Och, 2003).
Expected BLEU Training
We summarize the weights of the recurrent neural network language model as θ = {U, W, V} and add the model as an additional feature to the log-linear translation model using the simplified notation s_θ(w_t) = s(w_t | w_1 ... w_{t-1}, h_{t-1}).
Introduction
In this paper we focus on recurrent neural network architectures which have recently advanced the state of the art in language modeling (Mikolov et al., 2010; Mikolov et al., 2011; Sundermeyer et al., 2013) with several subsequent applications in machine translation (Auli et al., 2013; Kalchbrenner and Blunsom, 2013; Hu et al., 2014).
Introduction
In practice, neural network models for machine translation are usually trained by maximizing the likelihood of the training data, either via a cross-entropy objective (Mikolov et al., 2010; Schwenk
Introduction
Most previous work on neural networks for machine translation is based on a rescoring setup (Arisoy et al., 2012; Mikolov, 2012; Le et al., 2012a; Auli et al., 2013), thereby sidestepping the algorithmic and engineering challenges of direct decoder integration.
Recurrent Neural Network LMs
Our model has a similar structure to the recurrent neural network language model of Mikolov et al.
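A minimal sketch of a Mikolov-style recurrent language model step, with weights θ = {U, W, V} as in the notation above; the sizes, the sigmoid hidden unit, and the one-hot input are assumptions consistent with that family of models:

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    vocab_size, h_size = 1000, 64                     # assumed sizes
    U = np.random.randn(h_size, vocab_size) * 0.01    # input -> hidden
    W = np.random.randn(h_size, h_size) * 0.01        # hidden -> hidden (recurrence)
    V = np.random.randn(vocab_size, h_size) * 0.01    # hidden -> output

    def rnnlm_step(word_index, h_prev):
        # One step: the new hidden state mixes the current word with the previous
        # hidden state; the output layer gives a distribution over the next word.
        x = np.zeros(vocab_size)
        x[word_index] = 1.0
        h = 1.0 / (1.0 + np.exp(-(U @ x + W @ h_prev)))
        return softmax(V @ h), h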
neural network is mentioned in 23 sentences in this paper.
Topics mentioned in this paper:
Cui, Lei and Zhang, Dongdong and Liu, Shujie and Chen, Qiming and Li, Mu and Zhou, Ming and Yang, Muyun
Abstract
In this paper, we propose a novel approach to learning topic representation for parallel data using a neural network architecture, where abundant topical contexts are embedded via topic relevant monolingual data.
Background: Deep Learning
This technique began raising public awareness in the mid-2000s after researchers showed how a multilayer feed-forward neural network can be effectively trained.
Introduction
These topic-related documents are utilized to learn a specific topic representation for each sentence using a neural network based approach.
Introduction
A neural network is an effective technique for learning different levels of data representations.
Introduction
The levels inferred from a neural network correspond to distinct levels of concepts, where high-level representations are obtained from the low-level bag-of-words input.
Topic Similarity Model with Neural Network
In this section, we explain our neural network based topic similarity model in detail, as well as how to incorporate the topic similarity features into SMT decoding procedure.
Topic Similarity Model with Neural Network
Neural Network Training
Topic Similarity Model with Neural Network
Figure 1: Overview of neural network based topic similarity model.
neural network is mentioned in 36 sentences in this paper.
Topics mentioned in this paper:
Devlin, Jacob and Zbib, Rabih and Huang, Zhongqiang and Lamar, Thomas and Schwartz, Richard and Makhoul, John
Abstract
Recent work has shown success in using neural network language models (NNLMs) as features in MT systems.
Abstract
Here, we present a novel formulation for a neural network joint model (NNJM), which augments the NNLM with a source context window.
Introduction
In recent years, neural network models have become increasingly popular in NLP.
Introduction
Initially, these models were primarily used to create n-gram neural network language models (NNLMs) for speech recognition and machine translation (Bengio et al., 2003; Schwenk, 2010).
Introduction
In this paper we use a basic neural network architecture and a lexicalized probability model to create a powerful MT decoding feature.
Neural Network Joint Model (NNJM)
Fortunately, neural network language models are able to elegantly scale up and take advantage of arbitrarily large context sizes.
Neural Network Joint Model (NNJM)
2.1 Neural Network Architecture
Neural Network Joint Model (NNJM)
Our neural network architecture is almost identical to the original feed-forward NNLM architecture described in Bengio et al.
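A minimal sketch of a feed-forward joint model of the kind these sentences describe (an NNLM-style scorer whose input also includes a window of source words); the vocabulary sizes, embedding and hidden dimensions, and window lengths are assumptions:

    import numpy as np

    rng = np.random.default_rng(2)
    v_tgt, v_src, d, h = 5000, 5000, 96, 128          # assumed sizes
    n_hist, n_src = 3, 11                             # target history length, source window length

    E_tgt = rng.normal(0, 0.01, (v_tgt, d))
    E_src = rng.normal(0, 0.01, (v_src, d))
    W1 = rng.normal(0, 0.01, (h, (n_hist + n_src) * d))
    W2 = rng.normal(0, 0.01, (v_tgt, h))

    def joint_model_scores(target_history, source_window):
        # Concatenate embeddings of the target history and of a window of source
        # words around the aligned position, then score every next target word.
        x = np.concatenate([E_tgt[target_history].ravel(),
                            E_src[source_window].ravel()])
        return W2 @ np.tanh(W1 @ x)                   # unnormalised scores over v_tgt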
neural network is mentioned in 33 sentences in this paper.
Topics mentioned in this paper:
Liu, Shujie and Yang, Nan and Li, Mu and Zhou, Ming
Abstract
In this paper, we propose a novel recursive recurrent neural network (R2NN) to model the end-to-end decoding process for statistical machine translation.
Abstract
R2NN is a combination of a recursive neural network and a recurrent neural network, and in turn integrates their respective capabilities: (1) new information can be used to generate the next hidden state, as in recurrent neural networks, so that the language model and translation model can be integrated naturally; (2) a tree structure can be built, as in recursive neural networks, so as to generate the translation candidates in a bottom-up manner.
Introduction
Deep Neural Network (DNN), which essentially is a multilayer neural network, has regained more and more attention in recent years.
Introduction
Recurrent neural networks are leveraged to learn language models, and they keep the history information circulating inside the network for an arbitrarily long time (Mikolov et al., 2010).
Introduction
Recursive neural networks, which have the ability to generate a tree-structured output, are applied to natural language parsing (Socher et al., 2011), and they are extended to recursive neural tensor networks to explore the compositional aspect of semantics (Socher et al., 2013).
neural network is mentioned in 48 sentences in this paper.
Topics mentioned in this paper:
Kalchbrenner, Nal and Grefenstette, Edward and Blunsom, Phil
Abstract
We describe a convolutional architecture dubbed the Dynamic Convolutional Neural Network (DCNN) that we adopt for the semantic modelling of sentences.
Background
Then we describe the operation of one-dimensional convolution and the classical Time-Delay Neural Network (TDNN) (Hinton, 1989; Waibel et al., 1990).
Background
A model that adopts a more general structure provided by an external parse tree is the Recursive Neural Network (RecNN) (Pollack, 1990; Küchler and Goller, 1996; Socher et al., 2011; Hermann and Blunsom, 2013).
Background
The Recurrent Neural Network (RNN) is a special case of the recursive network where the structure that is followed is a simple linear chain (Gers and Schmidhuber, 2001; Mikolov et al., 2011).
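A minimal sketch of the point this sentence makes: applying one recursive-NN composition function along a simple linear chain yields a recurrent network, with the running state acting as the left child at every step (the dimension and nonlinearity are assumptions):

    import numpy as np

    d = 20
    W = np.random.randn(d, 2 * d) * 0.01

    def compose(left, right):
        # Recursive-NN composition of two child vectors into one parent vector.
        return np.tanh(W @ np.concatenate([left, right]))

    def chain_encode(word_vectors):
        # Following a linear chain instead of a parse tree: the recursive
        # composition degenerates into a recurrent update over the sequence.
        state = word_vectors[0]
        for v in word_vectors[1:]:
            state = compose(state, v)
        return state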
Introduction
Figure 1: Subgraph of a feature graph induced over an input sentence in a Dynamic Convolutional Neural Network.
Introduction
A central class of models are those based on neural networks .
Introduction
These range from basic neural bag-of-words or bag-of-n-grams models to the more structured recursive neural networks and to time-delay neural networks based on convolutional operations (Collobert and Weston, 2008; Socher et al., 2011; Kalchbrenner and Blunsom, 2013b).
neural network is mentioned in 17 sentences in this paper.
Topics mentioned in this paper:
Dong, Li and Wei, Furu and Tan, Chuanqi and Tang, Duyu and Zhou, Ming and Xu, Ke
Abstract
We propose Adaptive Recursive Neural Network (AdaRNN) for target-dependent Twitter sentiment classification.
Conclusion
We propose Adaptive Recursive Neural Network (AdaRNN) for the target-dependent Twitter sentiment classification.
Introduction
In this paper, we mainly focus on integrating target information with Recursive Neural Network (RNN) to leverage the ability of deep learning models.
Introduction
We employ a novel adaptive multi-compositionality layer in the recursive neural network, which is named AdaRNN (Dong et al., 2014).
Our Approach
Adaptive Recursive Neural Network is proposed to propagate the sentiments of words to the target node.
Our Approach
In Section 3.2, we propose Adaptive Recursive Neural Network and use it for target-dependent sentiment analysis.
Our Approach
3.2 AdaRNN: Adaptive Recursive Neural Network
RNN: Recursive Neural Network
Figure 1: The composition process for “not very good” in a Recursive Neural Network.
neural network is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Lazaridou, Angeliki and Bruni, Elia and Baroni, Marco
Conclusion
The neural network architecture emerged as the best performing approach, and our qualitative analysis revealed that it induced a categorical organization of concepts.
Conclusion
Given the success of NN, we plan to experiment in the future with more sophisticated neural network architectures inspired by recent work in machine translation (Gao et al., 2013) and multimodal deep learning (Srivastava and Salakhutdinov, 2012).
Experimental Setup
Neural Network (NNet) The last model that we introduce is a neural network with one hidden layer.
Introduction
This is achieved by means of a simple neural network trained to project image-extracted feature vectors to text-based vectors through a hidden layer that can be interpreted as a cross-modal semantic space.
Related Work
(2013) learn to project unsupervised vector-based image representations onto a word-based semantic space using a neural network architecture.
Related Work
(2013) rely on a supervised state-of-the-art method: They feed low-level features to a deep neural network trained on a supervised object recognition task (Krizhevsky et al., 2012).
Results
For the neural network NN, we use prior knowledge
neural network is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Branavan, S.R.K and Silver, David and Barzilay, Regina
Adding Linguistic Knowledge to the Monte-Carlo Framework
As shown in Figure 2, our model is a four layer neural network .
Adding Linguistic Knowledge to the Monte-Carlo Framework
The third feature layer f of the neural network is deterministically computed given the active units yi and zj of the softmax layers, and the values of the input layer.
Adding Linguistic Knowledge to the Monte-Carlo Framework
In the second layer of the neural network, the units zj represent a predicate labeling of every sentence yi ∈ d. However, our intention is to incorporate, into the action-value function Q, information from only the most relevant sentence.
Introduction
We employ a multilayer neural network where the hidden layers represent sentence relevance and predicate parsing decisions.
neural network is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Xu, Liheng and Liu, Kang and Lai, Siwei and Zhao, Jun
Conclusion and Future Work
A semantic similarity graph is built to capture lexical semantic clue, and a convolutional neural network is used to encode contextual semantic clue.
Experiments
FW-5 uses a traditional neural network with a fixed window size of 5 to replace the CNN in CONT, and the candidate term to be classified is placed in the center of the window.
The Proposed Method
Then, a semantic similarity graph is created to capture lexical semantic clue, and a Convolutional Neural Network (CNN) (Collobert et al., 2011) is trained in each bootstrapping iteration to encode contextual semantic clue.
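A minimal sketch of the kind of one-dimensional convolution plus max pooling such a CNN applies to a word-vector sequence; the dimensions, filter width, and pooling choice are assumptions, not the paper's exact configuration:

    import numpy as np

    d, n_filters, width = 50, 30, 3                    # assumed sizes; filter width 3
    F = np.random.randn(n_filters, width * d) * 0.01

    def cnn_encode(word_vectors):
        # Slide each filter over every window of `width` consecutive word vectors,
        # then max-pool over positions to get a fixed-length context vector.
        X = np.asarray(word_vectors)                   # shape (n_words, d)
        windows = np.stack([X[i:i + width].ravel()
                            for i in range(len(X) - width + 1)])
        conv = np.tanh(windows @ F.T)                  # (n_windows, n_filters)
        return conv.max(axis=0)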
The Proposed Method
3.3 Encoding Contextual Semantic Clue Using Convolutional Neural Network
The Proposed Method
3.3.1 The architecture of the Convolutional Neural Network
neural network is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Silberer, Carina and Lapata, Mirella
Autoencoders for Grounded Semantics
Autoencoders An autoencoder is an unsupervised neural network which is trained to reconstruct a given input from its latent representation (Bengio, 2009).
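A minimal sketch of the autoencoder idea in this sentence: an encoder maps the input to a latent code, a decoder reconstructs the input from that code, and training minimizes the reconstruction error (the sizes and the tanh/linear choices are assumptions):

    import numpy as np

    rng = np.random.default_rng(3)
    d_in, d_hid = 100, 30                               # assumed sizes
    W_enc = rng.normal(0, 0.01, (d_hid, d_in)); b_enc = np.zeros(d_hid)
    W_dec = rng.normal(0, 0.01, (d_in, d_hid)); b_dec = np.zeros(d_in)

    def encode(x):
        return np.tanh(W_enc @ x + b_enc)               # latent representation

    def reconstruct(x):
        return W_dec @ encode(x) + b_dec

    def reconstruction_loss(x):
        # Unsupervised objective: make the reconstruction close to the input;
        # the latent code is then used as the learned representation.
        diff = reconstruct(x) - x
        return float(diff @ diff)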
Autoencoders for Grounded Semantics
Stacked Autoencoders Several (denoising) autoencoders can be used as building blocks to form a deep neural network (Bengio et al., 2007; Vincent et al., 2010).
Conclusions
To the best of our knowledge, our model is novel in its use of attribute-based input in a deep neural network.
Experimental Setup
Finally, we also compare to the word embeddings obtained using Mikolov et al.’s (2011) recurrent neural network based language model.
Related Work
A large body of work has focused on projecting words and images into a common space using a variety of deep learning methods ranging from deep and restricted Boltzmann machines (Srivastava and Salakhutdinov, 2012; Feng et al., 2013), to autoencoders (Wu et al., 2013), and recursive neural networks (Socher et al., 2013b).
Related Work
Secondly, our problem setting is different from the former studies, which usually deal with classification tasks and fine-tune the deep neural networks using training data with explicit class labels; in contrast we fine-tune our autoencoders using a semi-supervised criterion.
neural network is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Turian, Joseph and Ratinov, Lev-Arie and Bengio, Yoshua
Distributed representations
Word embeddings are typically induced using neural language models, which use neural networks as the underlying predictive model (Bengio, 2008).
Distributed representations
We predict a score s(x) for x by passing e(x) through a single hidden layer neural network .
Distributed representations
We minimize this loss stochastically over the n-grams in the corpus, doing gradient descent simultaneously over the neural network parameters and the embedding lookup table.
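A minimal sketch of the scoring and training these sentences describe: an n-gram's embeddings are concatenated, passed through a single hidden layer to get s(x), and a ranking-style loss is minimized over both the network weights and the embedding lookup table (the sizes and the exact margin loss are assumptions):

    import numpy as np

    rng = np.random.default_rng(4)
    vocab, d, n, h = 10000, 50, 5, 100                  # assumed: vocab, dim, n-gram size, hidden units
    E = rng.normal(0, 0.01, (vocab, d))                 # embedding lookup table (also trained)
    W1 = rng.normal(0, 0.01, (h, n * d)); b1 = np.zeros(h)
    w2 = rng.normal(0, 0.01, h)

    def s(ngram_indices):
        # Score s(x): embed the n-gram, e(x), then one hidden layer.
        e_x = E[ngram_indices].ravel()
        return float(w2 @ np.tanh(W1 @ e_x + b1))

    def ranking_loss(ngram, corrupted_ngram):
        # Assumed margin form: the observed n-gram should outscore a version
        # with one word replaced at random.
        return max(0.0, 1.0 - s(ngram) + s(corrupted_ngram))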
neural network is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Iyyer, Mohit and Enns, Peter and Boyd-Graber, Jordan and Resnik, Philip
Abstract
Taking inspiration from recent work in sentiment analysis that successfully models the compositional aspect of language, we apply a recursive neural network (RNN) framework to the task of identifying the political position evinced by a sentence.
Conclusion
In this paper we apply recursive neural networks to political ideology detection, a problem where previous work relies heavily on bag-of-words models and hand-designed lexica.
Introduction
Building from those insights, we introduce a recursive neural network (RNN) to detect ideological bias on the sentence level.
Recursive Neural Networks
Recursive neural networks (RNNs) are machine learning models that capture syntactic and semantic composition.
neural network is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Zweig, Geoffrey and Platt, John C. and Meek, Christopher and Burges, Christopher J.C. and Yessenalina, Ainur and Liu, Qiang
Discussion
Secondly, for the Holmes data, we can state that LSA total similarity beats the recurrent neural network, which in turn is better than the baseline n-gram model.
Discussion
It is an interesting research question if this could be done implicitly with a machine learning technique, for example recurrent or recursive neural networks.
Experimental Results 5.1 Data Resources
In this case, the best combination is to blend LSA, the Good-Turing language model, and the recurrent neural network.
Introduction
Also in the language modeling vein, but with potentially global context, we evaluate the use of a recurrent neural network language model.
neural network is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Jeon, Je Hun and Liu, Yang
Previous work
(2004) used a Gaussian mixture model for acoustic-prosodic information and a neural network based syntactic-prosodic model, and achieved a pitch accent detection accuracy of 84% and an IPB detection accuracy of 90% at the word level.
Previous work
The experiments of Ananthakrishnan and Narayanan (2008) with a neural network based acoustic-prosodic model and a factored n-gram syntactic model reported 87% accuracy on accent and break index detection at the syllable level.
Prosodic event detection method
Our previous supervised learning approach (Jeon and Liu, 2009) showed that a combined model using a Neural Network (NN) classifier for acoustic-prosodic evidence and a Support Vector Machine (SVM) classifier for syntactic-prosodic evidence performed better than other classifiers.
neural network is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Lu, Shixiang and Chen, Zhenbiao and Xu, Bo
Introduction
In this paper, we strive to effectively address the above two shortcomings, and systematically explore the possibility of learning new features using deep (multilayer) neural networks (DNN, which is usually referred under the name Deep Learning) for SMT.
Related Work
(2013) presented a joint language and translation model based on a recurrent neural network which predicts target words based on an unbounded history of both source and target words.
Related Work
(2013) went beyond the log-linear model for SMT and proposed a novel additive neural networks based translation model, which overcomes some of the shortcomings of the log-linear model: linearity and the lack of deep interpretation and representation in the features.
neural network is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
He, Zhengyan and Liu, Shujie and Li, Mu and Zhou, Ming and Zhang, Longkai and Wang, Houfeng
Abstract
We propose a novel entity disambiguation model, based on Deep Neural Network (DNN).
Introduction
Deep neural networks (Hinton et al., 2006; Bengio et al., 2007) are built in a hierarchical manner, and allow us to compare context and entity at some higher level of abstraction, while at lower levels general concepts are shared across entities, resulting in compact models.
Learning Representation for Contextual Document
BTS is a variant of the general backpropagation algorithm for structured neural networks.
neural network is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Tamura, Akihiro and Watanabe, Taro and Sumita, Eiichiro
Abstract
This study proposes a word alignment model based on a recurrent neural network (RNN), in which an unlimited alignment history is represented by recurrently connected hidden layers.
Introduction
(2013) adapted the Context-Dependent Deep Neural Network for HMM (CD-DNN-HMM) (Dahl et al., 2012), a type of feed-forward neural network (FFNN)-based model, to
Introduction
Recurrent neural network (RNN)-based models have recently demonstrated state-of-the-art performance that outperformed FFNN-based models for various tasks (Mikolov et al., 2010; Mikolov and Zweig, 2012; Auli et al., 2013; Kalchbrenner and Blunsom, 2013; Sundermeyer et al., 2013).
neural network is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: