Abstract | Furthermore, our system improves significantly over a baseline system when applied to text from a different domain, and it reduces the sample complexity of sequence labeling. |
Conclusion and Future Work | Our study of smoothing techniques demonstrates that by aggregating information across many unannotated examples, it is possible to find accurate distributional representations that can provide highly informative features to supervised sequence labelers. |
Conclusion and Future Work | These features help improve sequence labeling performance on rare word types, on domains that differ from the training set, and on smaller training sets. |
Experiments | Smoothing can improve the performance of a supervised sequence labeling system on words that are rare or nonexistent in the training data. |
Experiments | A supervised sequence labeler achieves greater accuracy on new domains with smoothing. |
Experiments | A supervised sequence labeler has a better sample complexity with smoothing. |
Introduction | We then compute features of the distributional representations, and provide them as input to our supervised sequence labelers. |
Smoothing Natural Language Sequences | In particular, for every word w_i in a sequence, we provide the sequence labeler with a set of features of the left and right contexts indexed by v ∈ V: F_left^v(w_i) = P(X_{i-1} = v | X_i = w_i) and F_right^v(w_i) = P(X_{i+1} = v | X_i = w_i). For example, the left context for “reformulated” in our example above would contain a nonzero probability for the word “of.” Using the features F_left, a sequence labeler can learn patterns such as: if w_i has a high probability of following “of,” it is a good candidate for the start of a noun phrase. |
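Smoothing Natural Language Sequences | The left- and right-context distributions described above can be estimated from raw token counts. The following is a minimal sketch; the function name `context_features` and the list-of-token-lists corpus layout are illustrative assumptions, not taken from the paper.

```python
from collections import Counter, defaultdict

def context_features(corpus):
    """Estimate P(X_{i-1} = v | X_i = w) and P(X_{i+1} = v | X_i = w)
    from a corpus given as a list of token lists (illustrative layout)."""
    left, right = defaultdict(Counter), defaultdict(Counter)
    for sent in corpus:
        for i, w in enumerate(sent):
            if i > 0:
                left[w][sent[i - 1]] += 1          # count left neighbors of w
            if i < len(sent) - 1:
                right[w][sent[i + 1]] += 1         # count right neighbors of w

    def normalize(table):
        # Turn raw neighbor counts into conditional probabilities.
        return {w: {v: c / sum(ctr.values()) for v, c in ctr.items()}
                for w, ctr in table.items()}

    return normalize(left), normalize(right)
```

With a corpus containing the phrase above, `context_features` would assign “reformulated” a nonzero left-context probability for “of,” as the excerpt describes.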
Smoothing Natural Language Sequences | After experimenting with different choices for the number of dimensions to reduce our vectors to, we choose a value of 10 dimensions as the one that maximizes the performance of our supervised sequence labelers on held-out data. |
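Smoothing Natural Language Sequences | The excerpt does not name the dimensionality-reduction method, so the sketch below uses truncated SVD as one standard choice for compressing distributional context vectors to 10 dimensions; treat the method and function name as assumptions.

```python
import numpy as np

def reduce_dims(vectors, k=10):
    """Project row vectors onto their top-k singular directions
    (truncated SVD), an assumed stand-in for the paper's reduction step."""
    U, s, _ = np.linalg.svd(vectors, full_matrices=False)
    # Scale the left singular vectors by the singular values so the
    # reduced rows preserve the dominant structure of the original matrix.
    return U[:, :k] * s[:k]
```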
Smoothing Natural Language Sequences | The output of this process is an integer (ranging from 1 to S) for every word w_i in the corpus; we include a new boolean feature for each possible value of y_i in our sequence labelers. |
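Smoothing Natural Language Sequences | Turning the integer output into one boolean feature per possible value amounts to a one-hot indicator template. A minimal sketch, with hypothetical names:

```python
def indicator_features(y_i, S):
    """One boolean feature per possible integer value 1..S; exactly one
    feature is True for any given y_i (illustrative feature naming)."""
    return {f"y={s}": (y_i == s) for s in range(1, S + 1)}
```

For S = 5 and y_i = 3, this yields five features of which only `y=3` fires.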
Abstract | While Active Learning (AL) has already been shown to markedly reduce the annotation efforts for many sequence labeling tasks compared to random selection, AL remains unconcerned about the internal structure of the selected sequences (typically, sentences). |
Abstract | We propose a semi-supervised AL approach for sequence labeling where only highly uncertain subsequences are presented to human annotators, while all others in the selected sequences are automatically labeled. |
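Abstract | Selecting only the highly uncertain subsequences for human annotation can be illustrated with a per-token confidence margin; the threshold, the margin definition, and the function name below are illustrative assumptions, not the paper's exact utility function.

```python
def uncertain_positions(margins, threshold=0.2):
    """Return indices of tokens whose confidence margin falls below the
    threshold; margins[i] = P(best label) - P(second-best label) at i.
    These positions would be routed to human annotators, the rest
    labeled automatically by the current model."""
    return [i for i, m in enumerate(margins) if m < threshold]
```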
Active Learning for Sequence Labeling | In the sequence labeling scenario, such an example is a stream of linguistic items; a sentence is usually considered the proper sequence unit. |
Active Learning for Sequence Labeling | In the sequence labeling scenario, an example which, as a whole, has a high utility U(x) can still exhibit subsequences which do not add much to the overall utility and are thus fairly easy for the current model to label correctly. |
Active Learning for Sequence Labeling | There are many more sophisticated utility functions for sequence labeling. |
Conditional Random Fields for Sequence Labeling | Many NLP tasks, such as POS tagging, chunking, or NER, are sequence labeling problems in which a sequence of class labels y = (y_1, …, y_n) is assigned to an input sequence. |
Introduction | When used for sequence labeling tasks such as POS tagging, chunking, or named entity recognition. |
Introduction | Approaches to AL for sequence labeling are usually unconcerned about the internal structure of the selected sequences. |
Introduction | Accordingly, our approach is a combination of AL and self-training to which we will refer as semi-supervised Active Learning (SeSAL) for sequence labeling. |
Introduction | Figure 1: English (a) and Chinese (b) abbreviation generation as a sequential labeling problem. |
Introduction | (2005) formalized the processes of abbreviation generation as a sequence labeling problem. |
Introduction | In order to formalize this task as a sequential labeling problem, we have assumed that the label of a character is determined by the local information of the character and its previous label. |
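Introduction | The assumption that a character's label depends only on local information and the previous label corresponds to a first-order feature template. A minimal sketch, with illustrative feature names and boundary markers:

```python
def char_features(chars, i, prev_label):
    """Local features for character i plus the previous label, reflecting
    the first-order assumption stated above (feature names are assumed)."""
    return {
        "char": chars[i],
        "prev_char": chars[i - 1] if i > 0 else "<s>",
        "next_char": chars[i + 1] if i < len(chars) - 1 else "</s>",
        "prev_label": prev_label,
    }
```

A labeler would score candidate labels for position i from these features alone, then carry the chosen label forward as `prev_label` for position i + 1.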
Results and Discussion | In general, the results indicate that all of the sequential labeling models outperformed the SVM regression model with less training time. In the SVM regression approach, a large number of negative examples are explicitly generated for training, which slowed the process. |
Abstract | Manually annotated corpora are valuable but scarce resources, yet for many annotation tasks such as treebanking and sequence labeling there exist multiple corpora with different and incompatible annotation guidelines or standards. |
Introduction | We envision this technique to be general and widely applicable to many other sequence labeling tasks. |
Segmentation and Tagging as Character Classification | Similar to the situation in other sequence labeling problems, the training procedure is to learn a discriminative model mapping from inputs x ∈ X to outputs y ∈ Y, where X is the set of sentences in the training corpus and Y is the set of corresponding labelled results. |