ablkit.bridge
- class ablkit.bridge.A3BLBridge(model: ABLModel, reasoner: A3BLReasoner, metric_list: List[BaseMetric])[source]
Bases:
SimpleBridge
An ambiguity-aware implementation for bridging machine learning and reasoning parts.
Reference: https://github.com/Hao-Yuan-He/A3BL
- Involves the following five steps:
Predict class probabilities and indices for the given data examples.
Map indices into pseudo-labels.
Enumerate all valid pseudo-labels.
Revise pseudo-labels into a label distribution based on the class probabilities.
Train the model.
- Parameters:
model (ABLModel) – The machine learning model wrapped in ABLModel, used for prediction and training. The wrapped base model should expose extract_features so embeddings are available for the soft-label aggregation.
reasoner (A3BLReasoner) – The reasoning part wrapped in A3BLReasoner, used for pseudo-label enumeration and soft-label aggregation.
metric_list (List[BaseMetric]) – A list of metrics used for evaluating the model’s performance.
- abduce_soft_label(data_examples: ListData) List[List[Any]][source]
Revise the predicted pseudo-labels of the given data examples into soft labels using abduction.
- Parameters:
data_examples (ListData) – Data examples containing predicted pseudo-labels.
- Returns:
A list of abduced soft labels for the given data examples.
- Return type:
List[List[Any]]
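The enumeration and revision steps can be illustrated with a minimal, library-independent sketch. It assumes that a soft label is the per-position marginal over all valid candidates, each weighted by the product of its predicted class probabilities. The helper names (`enumerate_valid`, `soft_labels`), the toy rule (`sum`), and this weighting scheme are hypothetical; the aggregation actually performed by A3BLReasoner may differ, for instance by using embeddings.

```python
from itertools import product

def enumerate_valid(num_symbols, classes, target, rule):
    """Return every pseudo-label tuple whose reasoning result equals target."""
    return [c for c in product(classes, repeat=num_symbols) if rule(c) == target]

def soft_labels(probs, candidates, classes):
    """Turn valid candidates into one label distribution per position.

    Each candidate is weighted by the product of the model's class
    probabilities (an assumed scheme), then weights are marginalised.
    """
    index = {c: i for i, c in enumerate(classes)}
    weights = []
    for cand in candidates:
        w = 1.0
        for pos, label in enumerate(cand):
            w *= probs[pos][index[label]]
        weights.append(w)
    total = sum(weights)
    if total == 0:  # fall back to uniform weights over the candidates
        weights, total = [1.0] * len(candidates), float(len(candidates))
    dist = [[0.0] * len(classes) for _ in probs]
    for cand, w in zip(candidates, weights):
        for pos, label in enumerate(cand):
            dist[pos][index[label]] += w / total
    return dist

classes = [0, 1, 2]
probs = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]  # toy class probabilities
candidates = enumerate_valid(2, classes, target=2, rule=sum)
soft = soft_labels(probs, candidates, classes)
```

Each row of `soft` sums to one, so it can serve as a training target in place of a single hard pseudo-label.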
- train(train_data: ListData | Tuple[List[List[Any]], List[List[Any]] | None, List[Any]], val_data: ListData | Tuple[List[List[Any]], List[List[Any]] | None, List[Any] | None] | None = None, loops: int = 50, segment_size: int | float = 1.0, eval_interval: int = 1, save_interval: int | None = None, save_dir: str | None = None)[source]
A typical training pipeline of Abductive Learning.
- Parameters:
train_data (Union[ListData, Tuple[List[List[Any]], Optional[List[List[Any]]], List[Any]]]) – Training data should be in the form of (X, gt_pseudo_label, Y) or a ListData object with X, gt_pseudo_label and Y attributes. X is a list of sublists representing the input data. gt_pseudo_label is used only to evaluate the performance of the ABLModel, not to train it, and can be None. Y is a list representing the ground truth reasoning result for each sublist in X.
label_data (Union[ListData, Tuple[List[List[Any]], List[List[Any]], List[Any]]], optional) – Labeled data should be in the same format as train_data. The only difference is that the gt_pseudo_label in label_data should not be None and will be utilized to train the model. Defaults to None.
val_data (Union[ListData, Tuple[List[List[Any]], Optional[List[List[Any]]], Optional[List[Any]]]], optional) – Validation data should be in the same format as train_data. Whether gt_pseudo_label and Y may be None depends on the evaluation metrics in self.metric_list. If val_data is None, train_data will be used to validate the model during training. Defaults to None.
loops (int) – The learning part and the reasoning part will be iteratively optimized for loops times. Defaults to 50.
segment_size (Union[int, float]) – Data will be split into segments of this size, and the data in each segment will be used together to train the model. Defaults to 1.0.
eval_interval (int) – The model will be evaluated every eval_interval loops during training. Defaults to 1.
save_interval (int, optional) – The model will be saved every save_interval loops during training. Defaults to None.
save_dir (str, optional) – Directory to save the model. Defaults to None.
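How the two interval parameters interact with loops can be sketched as follows. This is only an illustration under stated assumptions: loops are counted from 1, the final loop always triggers both actions, and a save_interval of None disables saving. The `schedule` function is hypothetical and not part of ablkit.

```python
def schedule(loops, eval_interval=1, save_interval=None):
    """List which 1-based loops trigger evaluation and checkpointing."""
    eval_loops, save_loops = [], []
    for loop in range(1, loops + 1):
        is_last = loop == loops
        if loop % eval_interval == 0 or is_last:
            eval_loops.append(loop)
        if save_interval is not None and (loop % save_interval == 0 or is_last):
            save_loops.append(loop)
    return eval_loops, save_loops

eval_loops, save_loops = schedule(loops=10, eval_interval=3, save_interval=5)
```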
- class ablkit.bridge.BaseBridge(model: ABLModel, reasoner: Reasoner)[source]
Bases:
object
A base class for bridging learning and reasoning parts.
This class provides necessary methods that need to be overridden in subclasses to construct a typical pipeline of Abductive Learning (corresponding to train), which involves the following four methods:
predict: Predict class indices on the given data examples.
idx_to_pseudo_label: Map indices into pseudo-labels.
abduce_pseudo_label: Revise pseudo-labels based on abductive reasoning.
pseudo_label_to_idx: Map revised pseudo-labels back into indices.
- Parameters:
- abstract abduce_pseudo_label(data_examples: ListData) List[List[Any]][source]
Placeholder for revising pseudo-labels based on abductive reasoning.
- filter_pseudo_label(data_examples: ListData) List[List[Any]][source]
Default filter function for pseudo-label.
- abstract idx_to_pseudo_label(data_examples: ListData) List[List[Any]][source]
Placeholder for mapping indices to pseudo-labels.
- abstract predict(data_examples: ListData) Tuple[List[List[Any]], List[List[Any]]][source]
Placeholder for predicting class indices from input.
- abstract pseudo_label_to_idx(data_examples: ListData) List[List[Any]][source]
Placeholder for mapping pseudo-labels to indices.
- abstract test(test_data: ListData | Tuple[List[List[Any]], List[List[Any]] | None, List[Any]]) None[source]
Placeholder for model validation.
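The way the four abstract methods chain together can be sketched with a small stand-in class. `PipelineSketch` and `ToyBridge` are hypothetical and use simplified signatures (plain lists rather than ListData); they only mirror the order of the steps, not the real BaseBridge interface.

```python
from abc import ABC, abstractmethod

class PipelineSketch(ABC):
    """Hypothetical stand-in that mirrors the four abstract steps."""

    @abstractmethod
    def predict(self, examples): ...

    @abstractmethod
    def idx_to_pseudo_label(self, idx): ...

    @abstractmethod
    def abduce_pseudo_label(self, pseudo_labels, target): ...

    @abstractmethod
    def pseudo_label_to_idx(self, pseudo_labels): ...

    def step(self, examples, target):
        """Run the four steps in order and return the revised indices."""
        idx = self.predict(examples)
        labels = self.idx_to_pseudo_label(idx)
        revised = self.abduce_pseudo_label(labels, target)
        return self.pseudo_label_to_idx(revised)

class ToyBridge(PipelineSketch):
    labels = ["a", "b"]

    def predict(self, examples):
        return [0 for _ in examples]  # a weak "model" that always predicts index 0

    def idx_to_pseudo_label(self, idx):
        return [self.labels[i] for i in idx]

    def abduce_pseudo_label(self, pseudo_labels, target):
        # Toy knowledge: the sequence must end with the target symbol.
        return pseudo_labels[:-1] + [target]

    def pseudo_label_to_idx(self, pseudo_labels):
        return [self.labels.index(p) for p in pseudo_labels]

revised_idx = ToyBridge().step(["x1", "x2", "x3"], target="b")
```

The revised indices are what a training step would then consume as targets.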
- class ablkit.bridge.SimpleBridge(model: ABLModel, reasoner: Reasoner, metric_list: List[BaseMetric])[source]
Bases:
BaseBridge
A basic implementation for bridging machine learning and reasoning parts.
This class implements the typical pipeline of Abductive Learning, which involves the following five steps:
Predict class probabilities and indices for the given data examples.
Map indices into pseudo-labels.
Revise pseudo-labels based on abductive reasoning.
Map the revised pseudo-labels to indices.
Train the model.
- Parameters:
model (ABLModel) – The machine learning model wrapped in ABLModel, which is mainly used for prediction and model training.
reasoner (Reasoner) – The reasoning part wrapped in Reasoner, which is used for pseudo-label revision.
metric_list (List[BaseMetric]) – A list of metrics used for evaluating the model’s performance.
- abduce_pseudo_label(data_examples: ListData) List[List[Any]][source]
Revise predicted pseudo-labels of the given data examples using abduction.
- Parameters:
data_examples (ListData) – Data examples containing predicted pseudo-labels.
- Returns:
A list of abduced pseudo-labels for the given data examples.
- Return type:
List[List[Any]]
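As a rough illustration of what revising pseudo-labels by abduction means, the sketch below picks, among all candidates consistent with the reasoning result, the one that differs least from the prediction. The Hamming-distance criterion, the brute-force enumeration, and the names `hamming` and `abduce` are assumptions made for this example, not a description of how Reasoner is implemented.

```python
from itertools import product

def hamming(a, b):
    """Number of positions at which two label sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def abduce(pred, classes, target, rule):
    """Return the consistent candidate closest to the prediction.

    Falls back to the unchanged prediction when no candidate satisfies
    the rule.
    """
    candidates = [
        c for c in product(classes, repeat=len(pred)) if rule(c) == target
    ]
    if not candidates:
        return list(pred)
    return list(min(candidates, key=lambda c: hamming(c, pred)))

# The model predicted [1, 1], but the known reasoning result (the sum) is 8.
revised = abduce(pred=[1, 1], classes=range(10), target=8, rule=sum)
```

Here both `[1, 7]` and `[7, 1]` change a single label; `min` keeps the first one encountered in enumeration order.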
- concat_data_examples(unlabel_data_examples: ListData, label_data_examples: ListData | None) ListData[source]
Concatenate unlabeled and labeled data examples.
abduced_pseudo_label of unlabeled data examples and gt_pseudo_label of labeled data examples will be used to train the model.
- data_preprocess(prefix: str, data: ListData | Tuple[List[List[Any]], List[List[Any]] | None, List[Any]]) ListData[source]
Transform data in the form of (X, gt_pseudo_label, Y) into ListData.
- Parameters:
prefix (str) – A prefix indicating the type of data processing (e.g., ‘train’, ‘test’).
data (Union[ListData, Tuple[List[List[Any]], Optional[List[List[Any]]], List[Any]]]) – Data to be preprocessed. Can be ListData or a tuple of lists.
- Returns:
The preprocessed ListData object.
- Return type:
ListData
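For orientation, the tuple form (X, gt_pseudo_label, Y) might look like the following for a toy addition task. The file names, the task, and the `check_format` helper are hypothetical; the helper only checks that the three parts line up and is not what data_preprocess does internally.

```python
# Each example is a pair of raw inputs; Y holds the reasoning result (the sum).
X = [["img_3.png", "img_5.png"], ["img_2.png", "img_2.png"]]
gt_pseudo_label = [[3, 5], [2, 2]]  # one ground-truth symbol per input
Y = [8, 4]

train_data = (X, gt_pseudo_label, Y)
unlabeled_train_data = (X, None, Y)  # gt_pseudo_label may be None

def check_format(data):
    """Check that the three parts are aligned and return the example count."""
    xs, gts, ys = data
    if len(xs) != len(ys):
        raise ValueError("X and Y must have the same number of examples")
    if gts is not None:
        if len(gts) != len(xs):
            raise ValueError("gt_pseudo_label must have one entry per example")
        for x, gt in zip(xs, gts):
            if len(x) != len(gt):
                raise ValueError("each example needs one pseudo-label per input")
    return len(xs)

num_examples = check_format(train_data)
```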
- idx_to_pseudo_label(data_examples: ListData) List[List[Any]][source]
Map indices of data examples into pseudo-labels.
- Parameters:
data_examples (ListData) – Data examples containing the indices.
- Returns:
A list of pseudo-labels converted from indices.
- Return type:
List[List[Any]]
- predict(data_examples: ListData) Tuple[List[ndarray], List[ndarray]][source]
Predict class indices and probabilities (if predict_proba is implemented in self.model.base_model) on the given data examples.
- Parameters:
data_examples (ListData) – Data examples on which predictions are to be made.
- Returns:
A tuple containing lists of predicted indices and probabilities.
- Return type:
Tuple[List[ndarray], List[ndarray]]
- pseudo_label_to_idx(data_examples: ListData) List[List[Any]][source]
Map pseudo-labels of data examples into indices.
- Parameters:
data_examples (ListData) – Data examples containing pseudo-labels.
- Returns:
A list of indices converted from pseudo-labels.
- Return type:
List[List[Any]]
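The two mapping steps are inverses of each other. A minimal sketch of the round trip, assuming a simple dictionary-based mapping (the dictionaries and function names below are hypothetical and chosen for illustration):

```python
# Hypothetical mapping between class indices and pseudo-label symbols.
idx_to_label = {0: "+", 1: "-", 2: "="}
label_to_idx = {label: idx for idx, label in idx_to_label.items()}

def to_pseudo_labels(pred_idx):
    """Map each example's predicted indices to symbols."""
    return [[idx_to_label[i] for i in example] for example in pred_idx]

def to_indices(pseudo_labels):
    """Map each example's symbols back to class indices."""
    return [[label_to_idx[s] for s in example] for example in pseudo_labels]

pred_idx = [[0, 2], [1, 1, 2]]
pseudo_labels = to_pseudo_labels(pred_idx)
round_trip = to_indices(pseudo_labels)
```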
- supervised_abduce_pseudo_label(data_examples: ListData) List[List[Any]][source]
Revise predicted pseudo-labels of the given data examples using ground truth.
- Parameters:
data_examples (ListData) – Data examples containing predicted pseudo-labels.
- Returns:
A list of ground truth/abduced pseudo-labels for the given data examples.
- Return type:
List[List[Any]]
- test(test_data: ListData | Tuple[List[List[Any]], List[List[Any]] | None, List[Any] | None]) None[source]
Test the model with the given test data.
- Parameters:
test_data (Union[ListData, Tuple[List[List[Any]], Optional[List[List[Any]]], Optional[List[Any]]]]) – Test data should be in the form of (X, gt_pseudo_label, Y) or a ListData object with X, gt_pseudo_label and Y attributes. Whether gt_pseudo_label and Y may be None depends on the evaluation metrics in self.metric_list.
- train(train_data: ListData | Tuple[List[List[Any]], List[List[Any]] | None, List[Any]], label_data: ListData | Tuple[List[List[Any]], List[List[Any]], List[Any]] | None = None, val_data: ListData | Tuple[List[List[Any]], List[List[Any]] | None, List[Any] | None] | None = None, loops: int = 50, segment_size: int | float = 1.0, use_supervised_data: bool = False, eval_interval: int = 1, save_interval: int | None = None, save_dir: str | None = None)[source]
A typical training pipeline of Abductive Learning.
- Parameters:
train_data (Union[ListData, Tuple[List[List[Any]], Optional[List[List[Any]]], List[Any]]]) – Training data should be in the form of (X, gt_pseudo_label, Y) or a ListData object with X, gt_pseudo_label and Y attributes. X is a list of sublists representing the input data. gt_pseudo_label is used only to evaluate the performance of the ABLModel, not to train it, and can be None. Y is a list representing the ground truth reasoning result for each sublist in X.
label_data (Union[ListData, Tuple[List[List[Any]], List[List[Any]], List[Any]]], optional) – Labeled data should be in the same format as train_data. The only difference is that the gt_pseudo_label in label_data should not be None and will be utilized to train the model. Defaults to None.
val_data (Union[ListData, Tuple[List[List[Any]], Optional[List[List[Any]]], Optional[List[Any]]]], optional) – Validation data should be in the same format as train_data. Whether gt_pseudo_label and Y may be None depends on the evaluation metrics in self.metric_list. If val_data is None, train_data will be used to validate the model during training. Defaults to None.
loops (int) – The learning part and the reasoning part will be iteratively optimized for loops times. Defaults to 50.
segment_size (Union[int, float]) – Data will be split into segments of this size, and the data in each segment will be used together to train the model. Defaults to 1.0.
eval_interval (int) – The model will be evaluated every eval_interval loops during training. Defaults to 1.
save_interval (int, optional) – The model will be saved every save_interval loops during training. Defaults to None.
save_dir (str, optional) – Directory to save the model. Defaults to None.
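Because segment_size accepts both int and float, one plausible reading is that an int gives the number of examples per segment while a float in (0, 1] gives a fraction of the training set. The sketch below works under that assumption only; `resolve_segment_size` and `split_segments` are hypothetical helpers, not ablkit functions.

```python
def resolve_segment_size(segment_size, num_examples):
    """Convert segment_size to a number of examples per segment.

    Assumption: an int is an absolute count, a float in (0, 1] is a
    fraction of the dataset.
    """
    if isinstance(segment_size, bool):
        raise TypeError("segment_size must be an int or a float")
    if isinstance(segment_size, int):
        if segment_size <= 0:
            raise ValueError("segment_size must be positive")
        return segment_size
    if isinstance(segment_size, float):
        if not 0 < segment_size <= 1:
            raise ValueError("a float segment_size must lie in (0, 1]")
        return max(1, int(segment_size * num_examples))
    raise TypeError("segment_size must be an int or a float")

def split_segments(data, segment_size):
    """Split data into consecutive segments; the last may be shorter."""
    size = resolve_segment_size(segment_size, len(data))
    return [data[i:i + size] for i in range(0, len(data), size)]

data = list(range(10))
by_count = split_segments(data, 4)
by_fraction = split_segments(data, 0.5)
whole = split_segments(data, 1.0)
```

Under this reading, the default of 1.0 places the whole training set in a single segment per loop.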
- valid(val_data: ListData | Tuple[List[List[Any]], List[List[Any]] | None, List[Any] | None]) None[source]
Validate the model with the given validation data.
- Parameters:
val_data (Union[ListData, Tuple[List[List[Any]], Optional[List[List[Any]]], Optional[List[Any]]]]) – Validation data should be in the form of (X, gt_pseudo_label, Y) or a ListData object with X, gt_pseudo_label and Y attributes. Whether gt_pseudo_label and Y may be None depends on the evaluation metrics in self.metric_list.
- class ablkit.bridge.VerificationBridge(model: ABLModel, reasoner: VerificationReasoner, metric_list: List[BaseMetric])[source]
Bases:
SimpleBridge
Bridge implementing the Verification Learning training loop.
- Parameters:
model (ABLModel) – Wrapped learning model.
reasoner (VerificationReasoner) – Top-K reasoner. The bridge reads reasoner.top_k to decide how many training passes to run per segment.
metric_list (List[BaseMetric]) – Evaluation metrics, identical to SimpleBridge.
- train(train_data: ListData | Tuple[List[List[Any]], List[List[Any]] | None, List[Any]], val_data: ListData | Tuple[List[List[Any]], List[List[Any]] | None, List[Any] | None] | None = None, loops: int = 50, segment_size: int | float = 1.0, eval_interval: int = 1, save_interval: int | None = None, save_dir: str | None = None) None[source]
Verification Learning training loop. For each segment we predict once, enumerate the top-K consistent candidates, then run a model.train pass per candidate.
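The per-segment logic can be sketched in plain Python. The sketch assumes that candidates are ranked by the product of their predicted class probabilities and found by brute-force enumeration; the actual ranking used by VerificationReasoner may differ. `top_k_candidates`, `train_segment`, and the `train_pass` callback are hypothetical names, with the callback standing in for a model.train pass.

```python
from itertools import product

def top_k_candidates(probs, classes, target, rule, k):
    """Return the k most probable label tuples consistent with the rule."""
    scored = []
    for cand in product(range(len(classes)), repeat=len(probs)):
        labels = tuple(classes[i] for i in cand)
        if rule(labels) != target:
            continue
        score = 1.0
        for pos, i in enumerate(cand):
            score *= probs[pos][i]
        scored.append((score, labels))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [labels for _, labels in scored[:k]]

def train_segment(probs, classes, target, rule, k, train_pass):
    """Predict once (probs is given), then run one pass per candidate."""
    candidates = top_k_candidates(probs, classes, target, rule, k)
    for labels in candidates:
        train_pass(labels)  # stand-in for one training pass
    return candidates

calls = []
ranked = train_segment(
    probs=[[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]],
    classes=[0, 1, 2],
    target=2,
    rule=sum,
    k=2,
    train_pass=calls.append,
)
```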