neuralogic.dataset
Available dataset formats
Dataset
(Logic format)
- class Dataset(samples: List[Sample] | Sample | None = None)
Dataset encapsulating (learning) samples in the logic format, allowing users to take full advantage of the PyNeuraLogic library.
- class FileDataset(examples_file: str | None = None, queries_file: str | None = None)
FileDataset represents samples stored in files in the NeuraLogic (logic) format.
- Parameters:
examples_file (Optional[str]) - Path to the examples file. Default: None
queries_file (Optional[str]) - Path to the queries file. Default: None
- class Data(x: Sequence, edge_index: Sequence, y: Sequence | float | int, edge_attr: Sequence | None = None, y_mask: Sequence | None = None)
The Data instance stores information about one specific graph instance.
Example
For example, the directed graph \(G = (V, E)\), where \(E = \{(0, 1), (1, 2), (2, 0)\}\), with node features \(X = \{[0], [1], [0]\}\) and target node labels \(Y = \{0, 1, 0\}\), would be represented as:
    data = Data(
        x=[[0], [1], [0]],
        edge_index=[
            [0, 1, 2],
            [1, 2, 0],
        ],
        y=[0, 1, 0],
    )
- Parameters:
x (Sequence) - Sequence of node features.
edge_index (Sequence) - Edges represented in a graph connectivity format: a matrix [[...src], [...dst]].
y (Union[Sequence, float, int]) - Sequence of labels of all nodes, or a single graph label.
edge_attr (Optional[Sequence]) - Optional sequence of edge features. Default: None
y_mask (Optional[Sequence]) - Optional sequence of node ids to generate queries for. Default: None (all nodes)
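To make the edge_index layout concrete, the following plain-Python sketch unpacks the [[...src], [...dst]] matrix from the example above into (source, destination) pairs and selects labels by a y_mask-style list of node ids. It does not use PyNeuraLogic; the helper names edge_pairs and masked_labels are hypothetical and only illustrate the documented conventions.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def edge_pairs(edge_index: Sequence[Sequence[int]]) -> List[Tuple[int, int]]:
    """Unpack a [[...src], [...dst]] matrix into (src, dst) pairs."""
    sources, destinations = edge_index
    if len(sources) != len(destinations):
        raise ValueError("src and dst rows must have the same length")
    return list(zip(sources, destinations))


def masked_labels(y: Sequence, y_mask: Optional[Sequence[int]] = None) -> Dict[int, object]:
    """Map node id -> label for the ids in y_mask (all nodes when y_mask is None)."""
    node_ids = range(len(y)) if y_mask is None else y_mask
    return {node_id: y[node_id] for node_id in node_ids}


# The directed graph from the example: edges (0, 1), (1, 2), (2, 0).
edges = edge_pairs([[0, 1, 2], [1, 2, 0]])
all_labels = masked_labels([0, 1, 0])                  # no mask: every node
some_labels = masked_labels([0, 1, 0], y_mask=[0, 2])  # only nodes 0 and 2
```

Reading edge_index column by column gives one edge per column, which is why both rows must have the same length.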
- static from_pyg(data) -> List[Data]
Converts a PyTorch Geometric Data instance into a list of PyNeuraLogic Data instances. The conversion supports the train_mask, test_mask and val_mask attributes: for each mask, the conversion yields a new Data instance.
- Parameters:
data - The PyTorch Geometric Data instance.
- Returns:
The list of PyNeuraLogic Data instances.
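The per-mask behaviour can be pictured with the sketch below. It is an illustration only and rests on assumptions the text above does not state: masks are boolean per-node lists, each mask is turned into a list of selected node ids, a graph without masks yields a single instance, and plain dictionaries stand in for both Data types. split_by_masks is a hypothetical helper, not the library's implementation.

```python
from typing import Dict, List

MASK_NAMES = ("train_mask", "test_mask", "val_mask")


def split_by_masks(graph: Dict) -> List[Dict]:
    """Return one dict per mask found in `graph` (assumed behaviour, see lead-in)."""
    # Everything except the masks is shared by all resulting instances.
    base = {key: value for key, value in graph.items() if key not in MASK_NAMES}
    present = [name for name in MASK_NAMES if graph.get(name) is not None]
    if not present:
        # Assumption: without masks, a single instance covers all nodes.
        return [dict(base, y_mask=None)]
    result = []
    for name in present:
        # Assumption: a boolean mask becomes the list of selected node ids.
        node_ids = [i for i, selected in enumerate(graph[name]) if selected]
        result.append(dict(base, y_mask=node_ids))
    return result


graph = {
    "x": [[0], [1], [0]],
    "edge_index": [[0, 1, 2], [1, 2, 0]],
    "y": [0, 1, 0],
    "train_mask": [True, True, False],
    "test_mask": [False, False, True],
}
parts = split_by_masks(graph)  # one entry for train_mask, one for test_mask
```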
- class TensorDataset(data: List[Data], one_hot_encode_labels: bool = False, one_hot_decode_features: bool = False, one_hot_decode_edge_features: bool = False, number_of_classes: int = 1, feature_name: str = 'node_feature', edge_name: str = 'edge', output_name: str = 'predict')
The TensorDataset holds a list of Data instances - a list of graphs represented in a tensor format.
- Parameters:
data (List[Data]) - List of data (graph) instances.
one_hot_encode_labels (bool) - Turn numerical labels into one-hot encoded vectors, e.g., label 2 would be turned into a vector [0, 0, 1, ..., 0] of length number_of_classes. Default: False
one_hot_decode_features (bool) - Turn one-hot encoded feature vectors into a scalar, e.g., feature vector [0, 0, 1] would be turned into a predicate <feature_name>_2. Default: False
one_hot_decode_edge_features (bool) - Turn one-hot encoded edge feature vectors into a scalar, e.g., edge feature vector [0, 0, 1] would be turned into a predicate <edge_name>_2. Default: False
number_of_classes (int) - Specifies the number of classes for converting numerical labels to one-hot encoded vectors. Default: 1
feature_name (str) - Specifies the node feature predicate name used for converting into the logic format. Default: "node_feature"
edge_name (str) - Specifies the edge predicate name used for converting into the logic format. Default: "edge"
output_name (str) - Specifies the output predicate name used for converting into the logic format. Default: "predict"
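The one-hot options above can be illustrated with a small plain-Python sketch. It mirrors only the examples given in the parameter descriptions (label 2 to [0, 0, 1, ..., 0], vector [0, 0, 1] to <feature_name>_2); the function names are hypothetical and this is not how TensorDataset is necessarily implemented.

```python
from typing import List, Sequence


def one_hot_encode_label(label: int, number_of_classes: int) -> List[int]:
    """Turn a numerical label into a one-hot vector of length number_of_classes."""
    if not 0 <= label < number_of_classes:
        raise ValueError("label must be in range(number_of_classes)")
    return [1 if index == label else 0 for index in range(number_of_classes)]


def one_hot_decode_to_predicate(vector: Sequence[int], name: str) -> str:
    """Turn a one-hot vector into a predicate name of the form <name>_<index>."""
    values = list(vector)
    # A valid one-hot vector has exactly one 1 and zeros everywhere else.
    if values.count(1) != 1 or values.count(0) != len(values) - 1:
        raise ValueError("vector is not one-hot encoded")
    return f"{name}_{values.index(1)}"


encoded = one_hot_encode_label(2, number_of_classes=4)
node_predicate = one_hot_decode_to_predicate([0, 0, 1], "node_feature")
edge_predicate = one_hot_decode_to_predicate([0, 0, 1], "edge")
```

Note that with the default number_of_classes of 1, only label 0 fits into the encoded vector in this sketch, so the value presumably needs to be set explicitly when one_hot_encode_labels is enabled.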
- class CSVFile(relation_name: str, csv_source: TextIO | Path, sep=',', value_column: str | int | None = None, default_value: float | int | None = None, value_mapper: Callable | None = None, term_columns: Sequence[str | int] | None = None, header: bool = False, skip_rows: int = 0, n_rows: int | None = None, replace_empty_column: str | float | int = 0)
- class CSVDataset(csv_files: List[CSVFile] | CSVFile, csv_queries: CSVFile | None = None, mode: Mode = Mode.ONE_EXAMPLE)
- class Mode(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)