Metagraph Feature Format
A metagraph feature file contains metagraph-based feature vectors for nodes and node pairs on a heterogeneous graph. The feature file is part of the input to the semantic proximity search program, which uses metagraph-based features to train a learning-to-rank model for proximity search. In particular, we use anchored metagraphs as features, which we first proposed in our TKDD19 paper. The anchored metagraph concept extends the metagraph concept from our ICDE16 paper.
The file is a tab-delimited text file, as illustrated in the sample below.
The first line is a single integer n representing the number of dimensions of the feature vectors, i.e., the number of anchored metagraphs.
The next n lines describe the anchored metagraphs, one per line. Each anchored metagraph is a feature, summarized by four tab-separated integers in the following form:
[FeatureID] is the ID of a feature, i.e., an anchored metagraph. Note that an anchored metagraph is defined by the triple (metagraph, head, tail), as described by the next three integers;
[MetagraphID] is the ID of the corresponding metagraph in the anchored metagraph;
[HeadColor] is the color of the head anchor in the metagraph;
[TailColor] is the color of the tail anchor in the metagraph.
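As a rough illustration of the header layout described above, the following Python sketch reads the dimension count and the n anchored metagraph lines. The names AnchoredMetagraph and parse_feature_header are hypothetical and not part of the proximity search program, and the sample values are made up. The sketch assumes the four fields appear in the order they are listed above, and it returns the remaining feature vector lines unparsed.

```python
from typing import List, NamedTuple, Tuple


class AnchoredMetagraph(NamedTuple):
    """One feature line: four tab-separated integers (hypothetical type)."""
    feature_id: int
    metagraph_id: int
    head_color: int
    tail_color: int


def parse_feature_header(lines: List[str]) -> Tuple[List[AnchoredMetagraph], List[str]]:
    """Parse the first line (n) and the next n anchored metagraph lines.

    Returns the parsed features and the remaining lines, which hold the
    feature vectors and are left untouched here.
    """
    n = int(lines[0].strip())
    features = []
    for line in lines[1:1 + n]:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) != 4:
            raise ValueError(f"expected 4 tab-separated integers, got: {line!r}")
        features.append(AnchoredMetagraph(*(int(f) for f in fields)))
    if len(features) != n:
        raise ValueError(f"header declares {n} features, found {len(features)}")
    return features, lines[1 + n:]


# Made-up sample: two anchored metagraphs, then one placeholder line
# standing in for the feature vectors that follow the header.
sample = ["2\n", "0\t5\t1\t2\n", "1\t7\t1\t3\n", "placeholder vector line\n"]
features, rest = parse_feature_header(sample)
```

Keeping the header parser separate from the feature vector lines lets a reader check the declared dimension count n against the number of anchored metagraph lines before touching the vectors.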
The remaining lines contain the feature vectors. Each line represents a feature vector of one node or node pair in the following form: