Abstract | In our experiments, we used a fully connected, feed-forward neural network with 3 layers (1 input layer, 1 hidden layer, and 1 output layer) |
Abstract | In order to fully characterize our system, we took into account the following parameters: accuracy, runtime speed, training speed, hidden layer configuration and the number of optimal training iterations. |
Abstract | For example, the accuracy, the optimal number of training iterations, the training and the runtime speed are all highly dependent on the hidden layer configuration. |
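The 3-layer architecture described in the abstract excerpts above can be sketched as follows. This is a minimal illustration, not the authors' implementation; the layer sizes and the tanh hidden activation are assumptions, since the excerpts do not specify them.

```python
import numpy as np

# Sketch of a fully connected, feed-forward network with 1 input,
# 1 hidden, and 1 output layer. All sizes below are assumed for
# illustration; the source excerpt does not give them.
rng = np.random.default_rng(0)

n_in, n_hidden, n_out = 4, 8, 3          # assumed layer sizes
W1 = rng.normal(0, 0.1, (n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.1, (n_hidden, n_out))
b2 = np.zeros(n_out)

def forward(x):
    h = np.tanh(x @ W1 + b1)             # hidden layer (assumed tanh)
    return h @ W2 + b2                   # linear output layer

y = forward(rng.normal(size=n_in))
print(y.shape)                           # (3,)
```

The hidden layer configuration (here, `n_hidden`) is exactly the knob the abstract says accuracy, training speed, and runtime speed all depend on.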
DNN structures for NLP | Besides that, neural network training also involves hyperparameters such as the learning rate and the number of hidden layers.
Experiments and Results | Table 3: Effect of different numbers of hidden layers.
Experiments and Results | Two hidden layers outperform one hidden layer, while three hidden layers do not bring further improvement. |
Experiments and Results | 6.4.3 Effect of number of hidden layers |
Training | Tunable parameters in the neural network alignment model include: the word embeddings in the lookup table LT, the parameters Wl, bl for the linear transformations in the hidden layers of the neural network, and the distortion parameters θd of jump distance.
Training | We set word embedding length to 20, window size to 5, and the length of the only hidden layer to 40. |
Training | To make our model concrete, there are still hyper-parameters to be determined: the window size tw and the length of each hidden layer Ll.
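The training settings quoted above (embedding length 20, window size 5, a single hidden layer of length 40) can be instantiated as a sketch. The vocabulary size and the initialization scheme are assumptions not stated in the excerpts, and the scoring function is illustrative only.

```python
import numpy as np

# Hyper-parameters from the excerpt: embedding length 20, window size 5,
# one hidden layer of length 40. vocab_size is an assumed placeholder.
vocab_size = 1000
emb_len, window, hidden_len = 20, 5, 40

rng = np.random.default_rng(0)
LT = rng.normal(0, 0.1, (vocab_size, emb_len))          # lookup table LT
W1 = rng.normal(0, 0.1, (window * emb_len, hidden_len)) # hidden-layer weights
b1 = np.zeros(hidden_len)

def hidden(word_ids):
    """Embed a 5-word window via LT and pass it through the hidden layer."""
    x = LT[word_ids].reshape(-1)          # concatenate the window embeddings
    return np.tanh(x @ W1 + b1)           # hidden-layer output, length 40

h = hidden(np.array([1, 2, 3, 4, 5]))
print(h.shape)                            # (40,)
```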
Experiments and Analysis | Training settings: In the pre-training stage, the input layer has 100,000 units, and all hidden layers have 1,000 units with the rectifier function max(0, x). Following (Glorot et al., 2011), for the first reconstruction layer, we use a sigmoid activation function and the cross-entropy error function.
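The pre-training setup in the previous excerpt can be sketched as a single reconstruction step: rectifier hidden units max(0, x), then a sigmoid reconstruction layer scored with cross-entropy error. The sizes below are scaled-down assumptions for illustration (the excerpt uses 100,000 input and 1,000 hidden units).

```python
import numpy as np

# Assumed small sizes for the sketch; the paper uses 100,000 / 1,000.
rng = np.random.default_rng(0)
n_in, n_hidden = 50, 10

W = rng.normal(0, 0.1, (n_in, n_hidden))
b = np.zeros(n_hidden)
W_rec = rng.normal(0, 0.1, (n_hidden, n_in))
b_rec = np.zeros(n_in)

def relu(z):
    return np.maximum(0.0, z)            # rectifier max(0, x)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def reconstruct(x):
    h = relu(x @ W + b)                  # rectifier hidden layer
    return sigmoid(h @ W_rec + b_rec)    # sigmoid reconstruction layer

def cross_entropy(x, x_hat, eps=1e-9):
    # Cross-entropy error between input x (in [0, 1]) and reconstruction.
    return -np.sum(x * np.log(x_hat + eps) + (1 - x) * np.log(1 - x_hat + eps))

x = rng.random(n_in)                     # inputs assumed to lie in [0, 1]
x_hat = reconstruct(x)
print(cross_entropy(x, x_hat))
```

Training would minimize this error over many inputs; the sketch only shows one forward pass of the first reconstruction layer.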
Learning Representation for Contextual Document | In this stage we optimize the learned representation (“hidden layer n” in Fig.
Learning Representation for Contextual Document | The network weights below “hidden layer n” are initialized from the pre-training stage.