Index of papers in Proc. ACL 2014 that mention
  • hidden layer
Devlin, Jacob and Zbib, Rabih and Huang, Zhongqiang and Lamar, Thomas and Schwartz, Richard and Makhoul, John
Introduction
When used in conjunction with a precomputed hidden layer, these techniques speed up NNJM computation by a factor of 10,000x, with only a small reduction in MT accuracy.
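A minimal numpy sketch of the precomputation idea behind that speedup, assuming the usual NNJM layout in which the first hidden layer's pre-activation is a sum of per-position word contributions; the sizes below are scaled-down placeholders, not the paper's configuration. Each (word, position) contribution is cached once, so computing the hidden layer at decode time reduces to a table lookup per context word, a vector sum, and a tanh.

```python
import numpy as np

# Scaled-down placeholder sizes; the paper uses e.g. 512-dimensional hidden layers.
V, D, H, positions = 1000, 64, 256, 7
rng = np.random.default_rng(0)
emb = rng.normal(size=(V, D))              # input word embeddings
W1 = rng.normal(size=(positions * D, H))   # first hidden layer weights
b1 = np.zeros(H)

# Precompute every (word, position) contribution to the hidden pre-activation.
precomp = np.stack(
    [emb @ W1[p * D:(p + 1) * D, :] for p in range(positions)], axis=1
)                                          # shape (V, positions, H)

def hidden_fast(context_word_ids):
    """Hidden layer via cached lookups: a vector sum plus tanh, no matrix product."""
    acc = b1 + sum(precomp[w, p] for p, w in enumerate(context_word_ids))
    return np.tanh(acc)

def hidden_slow(context_word_ids):
    """Reference computation: concatenate embeddings and multiply by W1."""
    x = np.concatenate([emb[w] for w in context_word_ids])
    return np.tanh(x @ W1 + b1)

ctx = rng.integers(0, V, size=positions)
assert np.allclose(hidden_fast(ctx), hidden_slow(ctx))
```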
Neural Network Joint Model (NNJM)
We use two 512-dimensional hidden layers with tanh activation functions.
Neural Network Joint Model (NNJM)
We chose these values for the hidden layer size, vocabulary size, and source window size because they seemed to work best on our data sets — larger sizes did not improve results, while smaller sizes degraded results.
Neural Network Joint Model (NNJM)
2.4 Pre-Computing the Hidden Layer
hidden layer is mentioned in 17 sentences in this paper.
Topics mentioned in this paper:
Tamura, Akihiro and Watanabe, Taro and Sumita, Eiichiro
Abstract
This study proposes a word alignment model based on a recurrent neural network (RNN), in which an unlimited alignment history is represented by recurrently connected hidden layers.
Introduction
An RNN has a hidden layer with recurrent connections that propagates its own previous signals.
RNN-based Alignment Model
The model consists of a lookup layer, a hidden layer, and an output layer, which have weight matrices.
RNN-based Alignment Model
Consecutive l hidden layers can be used: z_l = f(H_l × z_{l-1} + B_{H_l}).
RNN-based Alignment Model
For simplicity, this paper describes the model with one hidden layer.
Related Work
Figure 1 (hidden layer): z_1 = htanh(H × z_0 + B_H)
Related Work
Figure 1 shows the network structure with one hidden layer for computing a lexical translation probability t_lex(f_j, e_{a_j}).
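A small numpy sketch of the one-hidden-layer network in that figure (lookup layer z_0, hidden layer z_1 = htanh(H × z_0 + B_H), output layer); the lookup tables, dimensions, and scalar output are illustrative assumptions rather than the paper's exact setup.

```python
import numpy as np

def htanh(x):
    """Hard tanh used in the figure: clip pre-activations to [-1, 1]."""
    return np.clip(x, -1.0, 1.0)

rng = np.random.default_rng(0)
D, Hdim = 30, 100                       # illustrative embedding / hidden sizes
L_f = rng.normal(size=(500, D))         # lookup table for source words f
L_e = rng.normal(size=(500, D))         # lookup table for target words e
H = rng.normal(size=(Hdim, 2 * D))      # hidden layer weight matrix
B_H = np.zeros(Hdim)
O = rng.normal(size=(1, Hdim))          # output layer weight matrix
B_O = np.zeros(1)

def lexical_score(f_j, e_aj):
    """t_lex(f_j, e_{a_j}): lookup layer z_0 -> hidden layer z_1 -> scalar output."""
    z0 = np.concatenate([L_f[f_j], L_e[e_aj]])   # lookup layer output
    z1 = htanh(H @ z0 + B_H)                     # z_1 = htanh(H x z_0 + B_H)
    return (O @ z1 + B_O).item()

print(lexical_score(f_j=42, e_aj=7))
```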
Related Work
The model consists of a lookup layer, a hidden layer, and an output layer, which have weight matrices.
hidden layer is mentioned in 25 sentences in this paper.
Topics mentioned in this paper:
Auli, Michael and Gao, Jianfeng
Decoder Integration
To solve this problem, we follow previous work on lattice rescoring with recurrent networks that maintained the usual n-gram context but kept a beam of hidden layer configurations at each state (Auli et al., 2013).
Decoder Integration
In fact, to make decoding as efficient as possible, we only keep the single best scoring hidden layer configuration.
Decoder Integration
As future cost estimate we score each phrase in isolation, resetting the hidden layer at the beginning of a phrase.
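A minimal sketch, with assumed types, of the bookkeeping the sentences above describe: hypotheses are recombined on their n-gram context, each surviving state stores only the single best-scoring hidden layer configuration, and phrases scored for future cost estimates start from a freshly reset (zero) hidden layer.

```python
import numpy as np

H = 100                       # hidden layer size used in the experiments
best = {}                     # n-gram context tuple -> (score, hidden layer state)

def recombine(context, score, hidden):
    """Keep only the single best-scoring hidden layer configuration per n-gram state."""
    if context not in best or score > best[context][0]:
        best[context] = (score, hidden)

def reset_hidden():
    """Fresh hidden layer for scoring a phrase in isolation (future cost estimation)."""
    return np.zeros(H)

recombine(("the", "cat"), -4.2, reset_hidden())
```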
Experiments
The hidden layer uses 100 neurons unless otherwise stated.
Experiments
To keep training times manageable, we reduce the hidden layer size to 30 neurons, thereby greatly increasing speed.
Recurrent Neural Network LMs
(2010) which is factored into an input layer, a hidden layer with recurrent connections, and an output layer (Figure 1).
Recurrent Neural Network LMs
The hidden layer state h_t encodes the history of all words observed in the sequence up to time step t. The state of the hidden layer is determined by the input layer and the hidden layer configuration of the previous time step h_{t-1}.
Recurrent Neural Network LMs
The weights of the connections between the layers are summarized in a number of matrices: U represents weights from the input layer to the hidden layer, and W represents connections from the previous hidden layer to the current hidden layer.
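Read as equations, that description corresponds to the standard recurrent update sketched below in numpy; the sizes are toy values, the sigmoid and softmax are assumed nonlinearities, and the output matrix name V is an assumption (the excerpt only names U and W).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

vocab, hidden = 10, 4                              # toy sizes
rng = np.random.default_rng(0)
U = rng.normal(scale=0.1, size=(hidden, vocab))    # input layer -> hidden layer
W = rng.normal(scale=0.1, size=(hidden, hidden))   # previous hidden -> current hidden
V = rng.normal(scale=0.1, size=(vocab, hidden))    # hidden layer -> output layer (assumed name)

def step(word_id, h_prev):
    """h_t is determined by the current input and the previous configuration h_{t-1}."""
    x = np.zeros(vocab)
    x[word_id] = 1.0
    h_t = sigmoid(U @ x + W @ h_prev)
    y_t = softmax(V @ h_t)                         # distribution over the next word
    return h_t, y_t

h = np.zeros(hidden)
for w in [3, 1, 4]:                                # h accumulates the history of the sequence
    h, y = step(w, h)
```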
hidden layer is mentioned in 10 sentences in this paper.
Topics mentioned in this paper:
Lu, Shixiang and Chen, Zhenbiao and Xu, Bo
Abstract
Moreover, to learn high-dimensional feature representations, we introduce a natural horizontal composition of multiple DAEs for learning features with large hidden layers.
Introduction
Moreover, to learn high-dimensional feature representations, we introduce a natural horizontal composition for DAEs (HCDAE) that can create large hidden layer representations simply by horizontally combining two (or more) DAEs (Baldi, 2012), which shows further improvement over a single DAE in our experiments.
Semi-Supervised Deep Auto-encoder Features Learning for SMT
The connection weights W, hidden layer biases c, and visible layer biases b can be learned efficiently using contrastive divergence (Hinton, 2002; Carreira-Perpinan and Hinton, 2005).
Semi-Supervised Deep Auto-encoder Features Learning for SMT
When given a hidden layer h, the factorial conditional distribution of the visible layer v can be estimated by:
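Under the standard RBM assumption, with W, b, and c named as in the sentence above, each visible unit is an independent Bernoulli given h; a minimal numpy sketch (sizes are toy values):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_visible, n_hidden = 16, 8                             # e.g. the 16 features in X; toy hidden size
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))   # connection weights
b = np.zeros(n_visible)                                 # visible layer biases
c = np.zeros(n_hidden)                                  # hidden layer biases

def p_v_given_h(h):
    """Factorial conditional: p(v_i = 1 | h) = sigmoid(b_i + sum_j W_ij h_j)."""
    return sigmoid(b + W @ h)

def p_h_given_v(v):
    """Symmetric conditional used in contrastive divergence updates."""
    return sigmoid(c + W.T @ v)

h = rng.integers(0, 2, size=n_hidden)
print(p_v_given_h(h))
```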
Semi-Supervised Deep Auto-encoder Features Learning for SMT
Moreover, although we have introduced four additional types of phrase features (X_2, X_3, X_4, and X_5), the 16 features in X are still a bottleneck for learning a large hidden-layer feature representation: because they carry limited information, the performance of high-dimensional DAE features learned directly from a single DAE is not very satisfactory.
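A rough numpy illustration of the horizontal composition (HCDAE) idea, under the assumption that it amounts to training two (or more) DAEs on the same input and concatenating their hidden layer codes into one larger representation; the encoder weights and sizes are invented for the sketch.

```python
import numpy as np

def encode(x, W, b):
    """Hidden layer code of one (already trained) denoising auto-encoder."""
    return 1.0 / (1.0 + np.exp(-(W @ x + b)))

rng = np.random.default_rng(0)
n_in = 16                                               # e.g. the 16 phrase features in X
W1, b1 = rng.normal(size=(100, n_in)), np.zeros(100)    # DAE 1 encoder: 16 -> 100
W2, b2 = rng.normal(size=(100, n_in)), np.zeros(100)    # DAE 2 encoder: 16 -> 100

x = rng.random(n_in)
# Horizontal composition: concatenate the two hidden layer codes into one larger code.
h_large = np.concatenate([encode(x, W1, b1), encode(x, W2, b2)])   # 200-dimensional
```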
hidden layer is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Cui, Lei and Zhang, Dongdong and Liu, Shujie and Chen, Qiming and Li, Mu and Zhou, Ming and Yang, Muyun
Experiments
The vocabulary size for the input layer is 100,000, and we choose different lengths for the hidden layer as L = {100, 300, 600, 1000} in the experiments.
Experiments
4.3 Effect of retrieved documents and length of hidden layers
Experiments
We illustrate the relationship among translation accuracy (BLEU), the number of retrieved documents (N) and the length of hidden layers (L) on different testing datasets.
Topic Similarity Model with Neural Network
Assuming that the dimension of g(X) is L, the linear layer forms an L × V matrix W which projects the n-of-V vector to an L-dimensional hidden layer.
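A small numpy sketch of that projection: the n-of-V bag-of-words vector is multiplied by the L × V matrix W to give the L-dimensional hidden layer. The sizes follow the excerpts above (V = 100,000, L = 100); any nonlinearity applied after the linear projection is omitted here.

```python
import numpy as np

V, L = 100000, 100                        # vocabulary size and hidden layer length (from the excerpts)
rng = np.random.default_rng(0)
W = rng.normal(scale=0.01, size=(L, V))   # the linear layer: an L x V matrix

def hidden_layer(word_ids):
    """Project an n-of-V (bag-of-words) vector to the L-dimensional hidden layer."""
    x = np.zeros(V)
    x[list(word_ids)] = 1.0               # n-of-V encoding
    return W @ x                          # g(X); any following nonlinearity is omitted

print(hidden_layer({3, 17, 256}).shape)   # (100,)
```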
Topic Similarity Model with Neural Network
Training neural networks involves many factors such as the learning rate and the length of hidden layers.
hidden layer is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Silberer, Carina and Lapata, Mirella
Autoencoders for Grounded Semantics
To further optimize the parameters of the network, a supervised criterion can be imposed on top of the last hidden layer such as the minimization of a prediction error on a supervised task (Bengio, 2009).
Autoencoders for Grounded Semantics
We first train SAEs with two hidden layers (codings) for each modality separately.
Autoencoders for Grounded Semantics
Then, we join these two SAEs by feeding their respective second coding simultaneously to another autoencoder, whose hidden layer thus yields the fused meaning representation.
Experimental Setup
This model has the following architecture: the textual autoencoder (see Figure 1, left-hand side) consists of 700 hidden units which are then mapped to the second hidden layer with 500 units (the corruption parameter was set to v = 0.1); the visual autoencoder (see Figure 1, right-hand side) has 170 and 100 hidden units, in the first and second layer, respectively.
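A schematic numpy sketch of the architecture those sentences describe: two unimodal stacked autoencoders whose second codings feed one fusion autoencoder whose hidden layer is the fused representation. The unit counts 700/500 and 170/100 come from the excerpt; the input sizes, the sigmoid, the fused code size, and the random weights are placeholders rather than the trained model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def layer(x, out_dim, rng):
    """One encoder layer with random placeholder weights."""
    W = rng.normal(scale=0.1, size=(out_dim, x.size))
    return sigmoid(W @ x)

rng = np.random.default_rng(0)
textual = rng.random(2000)                # textual input vector (size is a placeholder)
visual = rng.random(400)                  # visual input vector (size is a placeholder)

# Unimodal stacked autoencoders (encoders only), unit counts as in the excerpt.
t2 = layer(layer(textual, 700, rng), 500, rng)   # textual codings: 700 then 500 units
v2 = layer(layer(visual, 170, rng), 100, rng)    # visual codings: 170 then 100 units

# Fusion autoencoder: both second codings feed one hidden layer, the fused representation.
fused = layer(np.concatenate([t2, v2]), 300, rng)   # 300 is an assumed size
```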
hidden layer is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Lazaridou, Angeliki and Bruni, Elia and Baroni, Marco
Experimental Setup
Neural Network (NNet) The last model that we introduce is a neural network with one hidden layer .
Experimental Setup
where Θ consists of the model weights θ^(1) ∈ R^(d_v × d_h) and θ^(2) ∈ R^(d_h × d_w) that map the input image-based vectors V_s first to the hidden layer and then to the output layer in order to obtain text-based vectors, i.e., Ŵ_s = σ^(2)(σ^(1)(V_s θ^(1)) θ^(2)), where σ^(1) and σ^(2) are
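A minimal numpy sketch of that two-layer projection, Ŵ_s = σ^(2)(σ^(1)(V_s θ^(1)) θ^(2)); the dimensions and the choice of tanh for both nonlinearities are assumptions.

```python
import numpy as np

def sigma(x):
    """Assumed nonlinearity for both layers; the paper's sigma^(1), sigma^(2) may differ."""
    return np.tanh(x)

rng = np.random.default_rng(0)
d_v, d_h, d_w = 300, 120, 200                       # image, hidden, text dimensions (assumed)
theta1 = rng.normal(scale=0.05, size=(d_v, d_h))    # theta^(1): input -> hidden layer
theta2 = rng.normal(scale=0.05, size=(d_h, d_w))    # theta^(2): hidden -> output layer

V_s = rng.random((5, d_v))                          # image-based vectors for 5 concepts
W_s_hat = sigma(sigma(V_s @ theta1) @ theta2)       # predicted text-based vectors
print(W_s_hat.shape)                                # (5, 200)
```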
Introduction
This is achieved by means of a simple neural network trained to project image-extracted feature vectors to text-based vectors through a hidden layer that can be interpreted as a cross-modal semantic space.
Results
In order to gain qualitative insights into the performance of the projection process of NN, we attempt to investigate the role and interpretability of the hidden layer.
Results
The hidden layer acts as a cross-modal concept categorization/organization system.
hidden layer is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Liu, Shujie and Yang, Nan and Li, Mu and Zhou, Ming
Our Model
As shown in Figure 1, the network contains three layers: an input layer, a hidden layer, and an output layer.
Our Model
previous history h_{t-1} to generate the current hidden layer, which is a new history vector h_t.
Phrase Pair Embedding
The neural network is used to reduce the space dimension of sparse features, and the hidden layer of the network is used as the phrase pair embedding.
Phrase Pair Embedding
The length of the hidden layer is empirically set to 20.
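A toy numpy sketch of that second use of the network: the hidden layer over sparse phrase-pair features, with length 20 as stated above, is taken as the phrase pair embedding. The sparse feature space, the tanh, and the weights are invented for illustration.

```python
import numpy as np

n_sparse, emb_len = 5000, 20               # sparse feature space size (assumed); hidden layer length 20
rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.05, size=(emb_len, n_sparse))
b_in = np.zeros(emb_len)

def phrase_pair_embedding(active_feature_ids):
    """Hidden layer activation over the sparse features, used as the phrase pair embedding."""
    x = np.zeros(n_sparse)
    x[list(active_feature_ids)] = 1.0
    return np.tanh(W_in @ x + b_in)        # 20-dimensional embedding

print(phrase_pair_embedding({12, 907, 4431}))
```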
hidden layer is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Ma, Ji and Zhang, Yue and Zhu, Jingbo
Experiments
For each data set, we investigate an extensive set of combinations of hyper-parameters: the n-gram window (l,r) in {(1, 1), (2,1), (1,2), (2,2)}; the hidden layer size in {200, 300, 400}; the learning rate in {0.1, 0.01, 0.001}.
Learning from Web Text
The affine form of E with respect to v and h implies that the visible variables are conditionally independent of each other given the hidden layer units, and vice versa.
Learning from Web Text
For each position j, there is a weight matrix W^(j) ∈ R^(H × D), which is used to model the interaction between the hidden layer and the word projection in position j.
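A numpy sketch of how such position-specific matrices can enter the hidden layer's pre-activation: each position j contributes W^(j) times its word projection. The number of positions, the sizes H and D, and the sigmoid are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

H, D, positions = 64, 50, 5               # hidden units, projection size, window size (assumed)
rng = np.random.default_rng(0)
W = [rng.normal(scale=0.1, size=(H, D)) for _ in range(positions)]   # W^(j) in R^(H x D)
c = np.zeros(H)                           # hidden layer biases

def hidden_given_projections(word_projections):
    """Each position j interacts with the hidden layer through its own matrix W^(j)."""
    pre = c + sum(W[j] @ word_projections[j] for j in range(positions))
    return sigmoid(pre)

projs = rng.random((positions, D))
print(hidden_given_projections(projs))
```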
Neural Network for POS Disambiguation
The web-feature module, shown in the lower left part of Figure 1, consists of an input layer and two hidden layers.
hidden layer is mentioned in 4 sentences in this paper.
Topics mentioned in this paper: