Index of papers in Proc. ACL 2013 that mention
  • hidden layer
Boros, Tiberiu and Ion, Radu and Tufis, Dan
Abstract
In our experiments, we used a fully connected, feed-forward neural network with 3 layers (1 input layer, 1 hidden layer and 1 output layer).
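A minimal sketch of the 3-layer architecture quoted above (1 input layer, 1 hidden layer, 1 output layer). The layer sizes and the sigmoid activation are illustrative assumptions, not values reported in the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class FeedForwardNet:
    """Fully connected feed-forward net: input -> hidden -> output."""
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, size=(n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, size=(n_hidden, n_out))
        self.b2 = np.zeros(n_out)

    def forward(self, x):
        h = sigmoid(x @ self.W1 + self.b1)      # hidden layer activations
        return sigmoid(h @ self.W2 + self.b2)   # output layer activations

# Hypothetical sizes: 50 input features, 30 hidden units, 5 output units.
net = FeedForwardNet(n_in=50, n_hidden=30, n_out=5)
y = net.forward(np.zeros(50))
```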
Abstract
In order to fully characterize our system, we took into account the following parameters: accuracy, runtime speed, training speed, hidden layer configuration and the number of optimal training iterations.
Abstract
For example, the accuracy, the optimal number of training iterations, the training and the runtime speed are all highly dependent on the hidden layer configuration.
hidden layer is mentioned in 20 sentences in this paper.
Yang, Nan and Liu, Shujie and Li, Mu and Zhou, Ming and Yu, Nenghai
DNN structures for NLP
Besides that, neural network training also involves some hyperparameters such as the learning rate and the number of hidden layers.
Experiments and Results
Table 3: Effect of different numbers of hidden layers.
Experiments and Results
Two hidden layers outperform one hidden layer, while three hidden layers do not bring further improvement.
Experiments and Results
6.4.3 Effect of number of hidden layers
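The excerpts above compare networks with one, two, and three hidden layers (two outperform one; three bring no further gain). A minimal sketch of how such a comparison can be parameterized by depth; the layer widths and the tanh activation are placeholder assumptions, not values from the paper.

```python
import numpy as np

def build_layers(n_in, n_out, n_hidden_layers, hidden_size, seed=0):
    """Return a list of (W, b) pairs for a network with the given depth."""
    rng = np.random.default_rng(seed)
    sizes = [n_in] + [hidden_size] * n_hidden_layers + [n_out]
    return [(rng.normal(0.0, 0.1, size=(a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def forward(layers, x):
    h = x
    for W, b in layers[:-1]:
        h = np.tanh(h @ W + b)        # hidden layers
    W, b = layers[-1]
    return h @ W + b                  # linear output layer

for depth in (1, 2, 3):              # the depths compared in Table 3
    layers = build_layers(n_in=100, n_out=1,
                          n_hidden_layers=depth, hidden_size=40)
    score = forward(layers, np.zeros(100))
```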
Training
Tunable parameters in the neural network alignment model include: word embeddings in the lookup table LT, parameters W_l, b_l for the linear transformations in the hidden layers of the neural network, and distortion parameters of the jump distance.
Training
We set word embedding length to 20, window size to 5, and the length of the only hidden layer to 40.
Training
To make our model concrete, there are still hyper-parameters to be determined: the window sizes s_w and t_w, and the length of each hidden layer L_l.
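A minimal sketch of the parameterization quoted in the Training excerpts above: a word-embedding lookup table LT, a linear transformation W_l, b_l feeding a hidden layer, embedding length 20, window size 5, and a single hidden layer of length 40. The vocabulary size, the tanh activation, and the single concatenated window are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, emb_len, window, hidden_len = 10_000, 20, 5, 40

LT = rng.normal(0.0, 0.1, size=(vocab_size, emb_len))          # lookup table
W1 = rng.normal(0.0, 0.1, size=(window * emb_len, hidden_len))  # W_l
b1 = np.zeros(hidden_len)                                       # b_l
W2 = rng.normal(0.0, 0.1, size=(hidden_len, 1))
b2 = np.zeros(1)                                                # scoring layer

def score(word_ids):
    """Score a window of word ids by concatenating their embeddings."""
    x = LT[word_ids].reshape(-1)          # (window * emb_len,)
    h = np.tanh(x @ W1 + b1)              # hidden layer of length 40
    return (h @ W2 + b2)[0]

s = score(np.array([1, 2, 3, 4, 5]))      # a window of 5 (hypothetical) word ids
```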
hidden layer is mentioned in 11 sentences in this paper.
He, Zhengyan and Liu, Shujie and Li, Mu and Zhou, Ming and Zhang, Longkai and Wang, Houfeng
Experiments and Analysis
Training settings: In the pre-training stage, the input layer has 100,000 units and all hidden layers have 1,000 units with the rectifier function max(0, x). Following (Glorot et al., 2011), for the first reconstruction layer we use a sigmoid activation function and a cross-entropy error function.
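A minimal sketch of the pre-training layer described in the excerpt above: a rectifier (max(0, x)) hidden layer whose reconstruction layer uses a sigmoid activation and a cross-entropy error. Sizes are scaled down here for illustration (the paper reports 100,000 input units and 1,000 hidden units), and tied encoder/decoder weights are an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 200, 50                      # illustrative sizes only

W = rng.normal(0.0, 0.01, size=(n_in, n_hidden))
b = np.zeros(n_hidden)
b_rec = np.zeros(n_in)                        # reconstruction bias (tied weights)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reconstruct(x):
    h = np.maximum(0.0, x @ W + b)            # rectifier hidden layer
    return sigmoid(h @ W.T + b_rec)           # sigmoid reconstruction layer

def cross_entropy(x, x_hat, eps=1e-8):
    # Cross-entropy reconstruction error for inputs in [0, 1].
    return -np.sum(x * np.log(x_hat + eps) + (1 - x) * np.log(1 - x_hat + eps))

x = (rng.random(n_in) < 0.05).astype(float)   # a sparse binary input vector
loss = cross_entropy(x, reconstruct(x))
```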
Learning Representation for Contextual Document
In this stage we optimize the learned representation (“hidden layer n” in Fig.
Learning Representation for Contextual Document
The network weights below “hidden layer n” are initialized from the pre-training stage.
Learning Representation for Contextual Document
hidden layer n
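The excerpts above describe initializing the weights below “hidden layer n” from the pre-training stage before the learned representation is optimized. A minimal sketch of that initialization pattern; the layer sizes and the rectifier activation are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for weights obtained during pre-training (layers 1 .. n).
pretrained = [(rng.normal(0.0, 0.01, size=(200, 100)), np.zeros(100)),
              (rng.normal(0.0, 0.01, size=(100, 50)), np.zeros(50))]

# Fine-tuning network: layers below "hidden layer n" start from the
# pre-trained weights rather than from a random initialization.
finetune_layers = [(W.copy(), b.copy()) for W, b in pretrained]

def representation(x):
    h = x
    for W, b in finetune_layers:
        h = np.maximum(0.0, h @ W + b)   # rectifier hidden layers
    return h                             # "hidden layer n": the learned representation

rep = representation(np.zeros(200))
```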
hidden layer is mentioned in 4 sentences in this paper.