Headline generation | (1): an MST is extracted from the entity pair e1, e2 (2); nodes are heuristically added to the MST to enforce grammaticality (3); entity types are recombined to generate the final patterns (4). |
Headline generation | COMBINEENTITYTYPES: Finally, a distinct pattern is generated from each possible combination of entity type assignments for the participating entities. |
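The CombineEntityTypes step can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the string template, and the candidate type lists are all hypothetical; the only idea taken from the text is that one distinct pattern is emitted per combination of entity type assignments.

```python
from itertools import product

def combine_entity_types(template, entity_slots):
    """Emit one pattern per combination of entity type assignments.

    template: a string with {0}, {1}, ... placeholders for the entities;
    entity_slots: one list of candidate entity types per participating entity.
    """
    patterns = []
    for assignment in product(*entity_slots):
        # Render each assigned type as a [type] slot in the pattern
        patterns.append(template.format(*("[%s]" % t for t in assignment)))
    return patterns

pats = combine_entity_types(
    "{0} married in {1}",
    [["person"], ["location", "organization"]],
)
# One pattern per type assignment, e.g. "[person] married in [location]"
```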
Headline generation | While in many cases information about entity types would be sufficient to decide on the order of the entities in the generated sentences (e.g., “[person] married in [location]” for the entity set {e_a = “Mr.
Related work | Chambers and Jurafsky (2009) present an unsupervised method for learning narrative schemas from news, i.e., coherent sets of events that involve specific entity types (semantic roles). |
Related work | Like them, we start from the assumptions that 1) utterances involving the same entity types within the same document (in our case, a collection of related documents) are likely describing aspects of the same event, and 2) meaningful representations of the underlying events can be learned by clustering these utterances in a principled way. |
Experiments | Depending on the document category, we found some variation as to which hierarchy was learned in each setting, but we noticed that orderings starting with the right and left gramtypes often produced quite good hierarchies: for instance, right gramtype → left gramtype → same sentence → right named entity type. |
Introduction | The main question we raise is, given a set of indicators (such as grammatical types, distance between two mentions, or named entity types), how to best partition the pool of mention pair examples in order to discriminate coreferential pairs from non-coreferential ones. |
System description | We used classical features that are described in detail in (Bengtson and Roth, 2008) and (Rahman and Ng, 2011): grammatical type and subtype of mentions, string match and substring, apposition and copula, distance (number of separating mentions/sentences/words), gender/number match, synonymy/hypernymy and animacy (using WordNet), family name (based on lists), named entity types, syntactic features (gold parse), and anaphoricity detection. |
System description | As indicators we used: left and right grammatical types and subtypes, entity types, a boolean indicating whether the mentions are in the same sentence, and a very coarse histogram of distance in terms of sentences. |
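Partitioning the pool of mention pairs by an ordered hierarchy of indicators can be sketched as below. This is a minimal sketch under illustrative assumptions: the indicator names, the dict representation of a mention pair, and the recursive bucketing are mine, not the system's actual code; the only idea taken from the text is that each distinct sequence of indicator values defines one subset of the pair pool.

```python
def partition(pairs, indicators):
    """Recursively split mention-pair examples by an ordered list of
    indicator functions; each leaf key is a tuple of indicator values."""
    if not indicators:
        return {(): pairs}
    head, rest = indicators[0], indicators[1:]
    buckets = {}
    for pair in pairs:
        buckets.setdefault(head(pair), []).append(pair)
    leaves = {}
    for value, bucket in buckets.items():
        for key, leaf in partition(bucket, rest).items():
            leaves[(value,) + key] = leaf
    return leaves

# Toy mention pairs; the hierarchy here is right gramtype -> same sentence
pairs = [
    {"right_gram": "pronoun", "same_sent": True},
    {"right_gram": "pronoun", "same_sent": False},
    {"right_gram": "name", "same_sent": True},
]
leaves = partition(pairs, [lambda p: p["right_gram"],
                           lambda p: p["same_sent"]])
# Each leaf would then get its own pairwise coreference classifier
```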
Models | Each mention i has been augmented with a single property node p_i ∈ {1, ..., The unary factors encode prior knowledge about the setting of each p_i; these factors may be hard (“I” will not refer to a plural entity), soft (such as a distribution over named entity types output by an NER tagger), or practically uniform (e.g., the last name “Smith” does not specify a particular gender). |
Models | Suppose that we are using named entity type as an entity-level property. |
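The three kinds of unary prior over an entity-level property can be sketched as simple distributions over property values. This is an illustrative sketch only: the type inventory, function names, and scores are made up, and the real model would use these as factor potentials rather than standalone dicts.

```python
# Illustrative inventory of named entity types (not from the paper)
TYPES = ["PERSON", "ORG", "GPE"]

def hard_factor(value):
    # Hard prior: all mass on one value (e.g. "I" cannot be plural)
    return {t: (1.0 if t == value else 0.0) for t in TYPES}

def soft_factor(ner_scores):
    # Soft prior: a normalized distribution over named entity types,
    # e.g. as output by an NER tagger
    z = sum(ner_scores.values())
    return {t: ner_scores.get(t, 0.0) / z for t in TYPES}

def uniform_factor():
    # Practically uniform prior: the surface form is uninformative
    # (e.g. the last name "Smith" does not specify a gender)
    return {t: 1.0 / len(TYPES) for t in TYPES}
```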
Related Work | Their system could be extended to handle property information as we do, but our system has many other advantages, such as freedom from a pre-specified list of entity types, the ability to use multiple input clusterings, and discriminative projection of clusters. |