ablkit.reasoning
- class ablkit.reasoning.KBBase(pseudo_label_list: ~typing.List[~typing.Any], max_err: float = 1e-10, use_cache: bool = True, key_func: ~typing.Callable = <function to_hashable>, cache_size: int = 4096)[source]
Bases: ABC
Base class for knowledge base.
- Parameters:
pseudo_label_list (List[Any]) – List of possible pseudo-labels. It’s recommended to arrange the pseudo-labels in this list so that each aligns with its corresponding index in the base model: the first with the 0th index, the second with the 1st, and so forth.
max_err (float, optional) – The upper tolerance limit when comparing the similarity between the reasoning result of pseudo-labels and the ground truth. This is only applicable when the reasoning result is of a numerical type. This is particularly relevant for regression problems where exact matches might not be feasible. Defaults to 1e-10.
use_cache (bool, optional) – Whether to use abl_cache for previously abduced candidates to speed up subsequent operations. Defaults to True.
key_func (Callable, optional) – A function employed for hashing in abl_cache. This is only operational when use_cache is set to True. Defaults to to_hashable.
cache_size (int, optional) – The cache size in abl_cache. This is only operational when use_cache is set to True. Defaults to 4096.
Notes
Users should derive from this base class to build their own knowledge base. For a user-built KB (a derived subclass), the user only needs to provide the pseudo_label_list and override the logic_forward function (specifying how to perform logical reasoning). After that, other operations (e.g. how to perform abductive reasoning) will be automatically set up.
- abduce_candidates(pseudo_label: List[Any], y: Any, x: List[Any], max_revision_num: int, require_more_revision: int) List[List[Any]][source]
Perform abductive reasoning to get a candidate compatible with the knowledge base.
- Parameters:
pseudo_label (List[Any]) – Pseudo-labels of an example (to be revised by abductive reasoning).
y (Any) – Ground truth of the reasoning result for the example.
x (List[Any]) – The example. If the information from the example is not required in the reasoning process, then this parameter will not have any effect.
max_revision_num (int) – The upper limit on the number of revised labels for each example.
require_more_revision (int) – Specifies additional number of revisions permitted beyond the minimum required.
- Returns:
A tuple of two elements. The first element is a list of candidate revisions, i.e. revised pseudo-labels of the example that are compatible with the knowledge base. The second element is a list of reasoning results corresponding to each candidate, i.e., the outcome of the logic_forward function.
- Return type:
Tuple[List[List[Any]], List[Any]]
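The search described above can be pictured with a small brute-force sketch. It is not ablkit's implementation: the functions below are standalone stand-ins that only borrow the documented names, and the knowledge base (two digits whose sum must equal `y`) is an assumed toy example. The sketch tries revision sets of increasing size, records the smallest size that yields a compatible candidate, and then allows `require_more_revision` additional sizes, never exceeding `max_revision_num`.

```python
from itertools import combinations, product
from typing import List, Tuple

PSEUDO_LABEL_LIST = list(range(10))  # assumed label set: digits 0-9


def logic_forward(pseudo_label: List[int]) -> int:
    """Toy deduction rule: the reasoning result is the sum of the digits."""
    return sum(pseudo_label)


def revise_at_idx(pseudo_label: List[int], y: int, revision_idx: List[int]):
    """Try every replacement at the given positions; keep those that yield y."""
    candidates, results = [], []
    for new_labels in product(PSEUDO_LABEL_LIST, repeat=len(revision_idx)):
        candidate = list(pseudo_label)
        for idx, label in zip(revision_idx, new_labels):
            candidate[idx] = label
        result = logic_forward(candidate)
        if result == y and candidate not in candidates:
            candidates.append(candidate)
            results.append(result)
    return candidates, results


def abduce_candidates(
    pseudo_label: List[int], y: int, max_revision_num: int, require_more_revision: int
) -> Tuple[List[List[int]], List[int]]:
    """Search revision sets of growing size, then allow a few extra sizes."""
    found, found_results = [], []
    min_revision = None  # smallest number of revisions that produced a match
    for num in range(len(pseudo_label) + 1):
        if num > max_revision_num:
            break
        if min_revision is not None and num > min_revision + require_more_revision:
            break
        for idx in combinations(range(len(pseudo_label)), num):
            cands, res = revise_at_idx(pseudo_label, y, list(idx))
            for cand, r in zip(cands, res):
                if cand not in found:
                    found.append(cand)
                    found_results.append(r)
        if found and min_revision is None:
            min_revision = num
    return found, found_results


candidates, results = abduce_candidates(
    [1, 1], y=8, max_revision_num=2, require_more_revision=0
)
print(candidates)  # [[7, 1], [1, 7]]
print(results)     # [8, 8]
```

With `require_more_revision=1` the same call also accepts candidates that change both digits, so every pair of digits summing to 8 is returned.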
- abstract logic_forward(pseudo_label: List[Any], x: List[Any] | None = None) Any[source]
How to perform (deductive) logical reasoning, i.e. matching an example’s pseudo-labels to its reasoning result. Users are required to provide this.
- Parameters:
pseudo_label (List[Any]) – Pseudo-labels of an example.
x (List[Any], optional) – The example. If deductive logical reasoning does not require any information from the example, the overridden function provided by the user can omit this parameter.
- Returns:
The reasoning result.
- Return type:
Any
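As a rough illustration of how a numerical reasoning result interacts with the `max_err` tolerance of `KBBase`, the sketch below defines a toy `logic_forward` that ignores `x` and compares its output with the ground truth. The averaging rule, the helper name `is_compatible`, and the use of `<=` for the comparison are assumptions made for this example, not details taken from ablkit.

```python
from typing import Any, List, Optional


def logic_forward(pseudo_label: List[float], x: Optional[List[Any]] = None) -> float:
    """Toy rule: the reasoning result is the mean of the pseudo-labels.

    The example itself (x) is not needed here, so it is simply ignored.
    """
    return sum(pseudo_label) / len(pseudo_label)


def is_compatible(result: float, y: float, max_err: float = 1e-10) -> bool:
    """Hypothetical helper: numerical results match if they differ by at most max_err."""
    return abs(result - y) <= max_err


result = logic_forward([0.1, 0.2, 0.3])
print(is_compatible(result, 0.2))   # the tolerance absorbs floating-point rounding
print(is_compatible(result, 0.25))  # a genuinely different value is rejected
```

A tolerance of this kind matters mainly when the reasoning result is a float, where exact equality with the ground truth may fail because of rounding.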
- revise_at_idx(pseudo_label: List[Any], y: Any, x: List[Any], revision_idx: List[int]) List[List[Any]][source]
Revise the pseudo-labels at specified index positions.
- Parameters:
pseudo_label (List[Any]) – Pseudo-labels of an example (to be revised).
y (Any) – Ground truth of the reasoning result for the example.
x (List[Any]) – The example. If the information from the example is not required in the reasoning process, then this parameter will not have any effect.
revision_idx (List[int]) – A list specifying indices of where revisions should be made to the pseudo-labels.
- Returns:
A tuple of two elements. The first element is a list of candidate revisions, i.e. revised pseudo-labels of the example that are compatible with the knowledge base. The second element is a list of reasoning results corresponding to each candidate, i.e., the outcome of the logic_forward function.
- Return type:
Tuple[List[List[Any]], List[Any]]
- class ablkit.reasoning.GroundKB(pseudo_label_list: List[Any], GKB_len_list: List[int], max_err: float = 1e-10)[source]
Bases: KBBase
Knowledge base with a ground KB (GKB). Ground KB is a knowledge base prebuilt upon class initialization, storing all potential candidates along with their respective reasoning result. Ground KB can accelerate abductive reasoning in abduce_candidates.
- Parameters:
pseudo_label_list (List[Any]) – Refer to class KBBase.
GKB_len_list (List[int]) – List of possible lengths for pseudo-labels of an example.
max_err (float, optional) – Refer to class KBBase.
Notes
Users can also inherit from this class to build their own knowledge base. Similar to KBBase, users are only required to provide the pseudo_label_list and override the logic_forward function. Additionally, users should provide the GKB_len_list. After that, other operations (e.g. auto-construction of GKB, and how to perform abductive reasoning) will be automatically set up.
- abduce_candidates(pseudo_label: List[Any], y: Any, x: List[Any], max_revision_num: int, require_more_revision: int) List[List[Any]][source]
Perform abductive reasoning by directly retrieving compatible candidates from the prebuilt GKB. In this way, the time-consuming exhaustive search can be avoided.
- Parameters:
pseudo_label (List[Any]) – Pseudo-labels of an example (to be revised by abductive reasoning).
y (Any) – Ground truth of the reasoning result for the example.
x (List[Any]) – The example (unused in GroundKB).
max_revision_num (int) – The upper limit on the number of revised labels for each example.
require_more_revision (int) – Specifies additional number of revisions permitted beyond the minimum required.
- Returns:
A tuple of two elements. The first element is a list of candidate revisions, i.e. revised pseudo-labels of the example that are compatible with the knowledge base. The second element is a list of reasoning results corresponding to each candidate, i.e., the outcome of the logic_forward function.
- Return type:
Tuple[List[List[Any]], List[Any]]
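The idea of a prebuilt ground KB can be sketched in plain Python as a lookup table from reasoning results to the label sequences that produce them. Everything below is a toy stand-in under stated assumptions: the label set `[0, 1, 2]`, the sum rule, the helper names `build_gkb` and `hamming`, and the way the revision limits are applied are illustrative choices, not ablkit's code.

```python
from collections import defaultdict
from itertools import product
from typing import Dict, List, Tuple

PSEUDO_LABEL_LIST = [0, 1, 2]  # assumed label set
GKB_LEN_LIST = [2]             # assumed: every example carries two pseudo-labels


def logic_forward(pseudo_label: List[int]) -> int:
    """Toy deduction rule: the reasoning result is the sum of the labels."""
    return sum(pseudo_label)


def build_gkb(labels: List[int], len_list: List[int]) -> Dict[Tuple[int, int], List[List[int]]]:
    """Map (length, reasoning result) to every label sequence that produces it."""
    gkb = defaultdict(list)
    for length in len_list:
        for combo in product(labels, repeat=length):
            gkb[(length, logic_forward(list(combo)))].append(list(combo))
    return gkb


def hamming(a: List[int], b: List[int]) -> int:
    """Number of positions at which two label sequences differ."""
    return sum(p != q for p, q in zip(a, b))


def abduce_candidates(gkb, pseudo_label, y, max_revision_num, require_more_revision):
    """Retrieve compatible candidates by lookup, then filter by revision count."""
    compatible = gkb.get((len(pseudo_label), y), [])
    if not compatible:
        return [], []
    dists = [hamming(pseudo_label, cand) for cand in compatible]
    limit = min(max_revision_num, min(dists) + require_more_revision)
    kept = [cand for cand, d in zip(compatible, dists) if d <= limit]
    return kept, [y] * len(kept)


gkb = build_gkb(PSEUDO_LABEL_LIST, GKB_LEN_LIST)
kept, results = abduce_candidates(gkb, [0, 0], y=2, max_revision_num=2, require_more_revision=0)
print(kept)     # [[0, 2], [2, 0]]
print(results)  # [2, 2]
```

The table is built once, so each later abduction is a dictionary lookup followed by a filter. The price is that the table grows exponentially with the lengths listed in `GKB_len_list`.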
- class ablkit.reasoning.PrologKB(pseudo_label_list: List[Any], pl_file: str)[source]
Bases: KBBase
Knowledge base provided by a Prolog (.pl) file.
- Parameters:
pseudo_label_list (List[Any]) – Refer to class KBBase.
pl_file (str) – Prolog file containing the KB.
Notes
Users can instantiate this class to build their own knowledge base. During the instantiation, users are only required to provide the pseudo_label_list and pl_file. To use the default logic forward and abductive reasoning methods in this class, the Prolog (.pl) file needs to contain a rule strictly formatted as logic_forward(Pseudo_labels, Res)., e.g., logic_forward([A,B], C) :- C is A+B. For specifics, refer to the logic_forward and get_query_string functions in this class. Users are also welcome to override related functions for more flexible support.
- get_query_string(pseudo_label: List[Any], y: Any, x: List[Any], revision_idx: List[int]) str[source]
Get the query to be used for consulting Prolog. This is a default function for demonstration; users can override it to adapt to their own Prolog file. This default returns the query logic_forward([kept_labels, Revise_labels], Res).
- Parameters:
pseudo_label (List[Any]) – Pseudo-labels of an example (to be revised by abductive reasoning).
y (Any) – Ground truth of the reasoning result for the example.
x (List[Any]) – The corresponding input example. If the information from the input is not required in the reasoning process, then this parameter will not have any effect.
revision_idx (List[int]) – A list specifying indices of where revisions should be made to the pseudo-labels.
- Returns:
A string of the query.
- Return type:
str
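A query of the shape described above could be assembled roughly as follows. This is a standalone sketch, not the method's actual body: the variable naming scheme `Rev_<i>` is hypothetical, the `x` argument is dropped for brevity, and substituting the ground truth `y` for `Res` when it is available is an assumption. Positions listed in `revision_idx` become unbound Prolog variables, while the remaining labels are kept as constants.

```python
from typing import Any, List


def get_query_string(pseudo_label: List[Any], y: Any, revision_idx: List[int]) -> str:
    """Build a query whose revised positions are unbound Prolog variables."""
    terms = []
    for i, label in enumerate(pseudo_label):
        # Prolog variables start with an uppercase letter; "Rev_<i>" is a
        # hypothetical naming scheme chosen for this sketch.
        terms.append(f"Rev_{i}" if i in revision_idx else str(label))
    # Assumption: use the known result when given, otherwise leave Res unbound.
    target = "Res" if y is None else str(y)
    return f"logic_forward([{','.join(terms)}], {target})."


print(get_query_string([1, 1], y=8, revision_idx=[0]))
# logic_forward([Rev_0,1], 8).
```

Whether Prolog can actually bind the free variables depends on how the rules in the .pl file are written; a rule built on `is`, for instance, expects its right-hand side to be fully instantiated.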
- logic_forward(pseudo_label: List[Any], x: List[Any] | None = None) Any[source]
Consult Prolog with the query logic_forward(pseudo_labels, Res)., and set the returned Res as the reasoning result. To use this default function, there must be a logic_forward rule in the .pl file to perform reasoning. Otherwise, users should override this function.
- Parameters:
pseudo_label (List[Any]) – Pseudo-labels of an example.
x (List[Any]) – The corresponding input example. If the information from the input is not required in the reasoning process, then this parameter will not have any effect.
- revise_at_idx(pseudo_label: List[Any], y: Any, x: List[Any], revision_idx: List[int]) List[List[Any]][source]
Revise the pseudo-labels at specified index positions by querying Prolog.
- Parameters:
pseudo_label (List[Any]) – Pseudo-labels of an example (to be revised).
y (Any) – Ground truth of the reasoning result for the example.
x (List[Any]) – The corresponding input example. If the information from the input is not required in the reasoning process, then this parameter will not have any effect.
revision_idx (List[int]) – A list specifying indices of where revisions should be made to the pseudo-labels.
- Returns:
A tuple of two elements. The first element is a list of candidate revisions, i.e. revised pseudo-labels of the example that are compatible with the knowledge base. The second element is a list of reasoning results corresponding to each candidate, i.e., the outcome of the logic_forward function.
- Return type:
Tuple[List[List[Any]], List[Any]]
- class ablkit.reasoning.Reasoner(kb: KBBase, dist_func: str | Callable = 'confidence', idx_to_label: dict | None = None, max_revision: int | float = -1, require_more_revision: int = 0, use_zoopt: bool = False)[source]
Bases: object
Reasoner for minimizing the inconsistency between the knowledge base and learning models.
- Parameters:
kb (class KBBase) – The knowledge base to be used for reasoning.
dist_func (Union[str, Callable], optional) – The distance function used to determine the cost list between each candidate and the given prediction. The cost is also referred to as a consistency measure, wherein the candidate with the lowest cost is selected as the final abduced label. It can be either a string representing a predefined distance function or a callable function. The available predefined distance functions: ‘hamming’ | ‘confidence’ | ‘avg_confidence’ | ‘similarity’ | ‘rejection’. ‘hamming’ directly calculates the Hamming distance between the predicted pseudo-label in the data example and each candidate. ‘confidence’ and ‘avg_confidence’ calculate the confidence distance between the predicted probabilities and each candidate, defined as 1 - product and 1 - average of the candidate’s per-symbol probabilities respectively. ‘similarity’ compares candidates against the geometry of the model’s embeddings (requires the base model to expose extract_features; ABLModel then stores the result on data_example.embeddings). ‘rejection’ combines confidence distance with a candidate-complexity penalty, favoring shorter candidates when scores are close. Alternatively, the callable function should have the signature dist_func(data_example, candidates, candidate_idxs, reasoning_results) and must return a cost list. Each element in this cost list should be a numerical value representing the cost for each candidate, and the list should have the same length as candidates. Defaults to ‘confidence’.
idx_to_label (dict, optional) – A mapping from index in the base model to label. If not provided, a default order-based index to label mapping is created. Defaults to None.
max_revision (Union[int, float], optional) – The upper limit on the number of revisions for each data example when performing abductive reasoning. If float, denotes the fraction of the total length that can be revised. A value of -1 implies no restriction on the number of revisions. Defaults to -1.
require_more_revision (int, optional) – Specifies additional number of revisions permitted beyond the minimum required when performing abductive reasoning. Defaults to 0.
use_zoopt (bool, optional) – Whether to use ZOOpt library during abductive reasoning. Defaults to False.
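To make the ‘hamming’, ‘confidence’, and ‘avg_confidence’ definitions concrete, here is a minimal plain-Python sketch of the three formulas as they are described above. The function names and the toy probabilities are assumptions for illustration; this is not the code ablkit runs internally. Candidates are given as class indices, so each symbol's probability can be read directly from the prediction matrix.

```python
from math import prod
from typing import List


def hamming_dist(pred_idx: List[int], candidate_idx: List[int]) -> int:
    """Number of positions where the candidate differs from the prediction."""
    return sum(p != c for p, c in zip(pred_idx, candidate_idx))


def confidence_dist(pred_prob: List[List[float]], candidate_idx: List[int]) -> float:
    """1 - product of the probabilities assigned to the candidate's symbols."""
    return 1 - prod(pred_prob[i][c] for i, c in enumerate(candidate_idx))


def avg_confidence_dist(pred_prob: List[List[float]], candidate_idx: List[int]) -> float:
    """1 - average of the probabilities assigned to the candidate's symbols."""
    probs = [pred_prob[i][c] for i, c in enumerate(candidate_idx)]
    return 1 - sum(probs) / len(probs)


# Toy prediction over two symbols and two classes (assumed values).
pred_prob = [[0.1, 0.9], [0.6, 0.4]]
pred_idx = [1, 0]                   # argmax of each row
candidate_idxs = [[1, 1], [0, 0]]   # candidates expressed as class indices

hamming_costs = [hamming_dist(pred_idx, c) for c in candidate_idxs]
confidence_costs = [confidence_dist(pred_prob, c) for c in candidate_idxs]
avg_confidence_costs = [avg_confidence_dist(pred_prob, c) for c in candidate_idxs]

print(hamming_costs)         # [1, 1]
print(confidence_costs)      # approximately [0.64, 0.94]
print(avg_confidence_costs)  # approximately [0.35, 0.65]
```

In this example the Hamming distance cannot separate the two candidates, whereas both confidence-based distances prefer `[1, 1]`, which is one reason a probability-aware distance is a sensible default.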
- abduce(data_example: ListData) List[Any][source]
Perform abductive reasoning on the given data example.
- Parameters:
data_example (ListData) – Data example.
- Returns:
The revised pseudo-labels of the example obtained through abductive reasoning, which are compatible with the knowledge base.
- Return type:
List[Any]
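Two steps surrounding abduction can be sketched in isolation: turning `max_revision` into a concrete revision count, and picking the lowest-cost candidate. Both helpers below are hypothetical stand-ins rather than ablkit functions, and the rounding applied to a fractional `max_revision` is an assumption.

```python
from typing import Any, List, Union


def resolve_max_revision(max_revision: Union[int, float], symbol_num: int) -> int:
    """Translate max_revision into an absolute number of revisable symbols."""
    if isinstance(max_revision, float):
        if not 0.0 <= max_revision <= 1.0:
            raise ValueError("a float max_revision must lie in [0, 1]")
        # Fraction of the example length; the rounding rule is an assumption.
        return round(symbol_num * max_revision)
    if max_revision == -1:
        return symbol_num  # -1 means no restriction
    if max_revision < 0:
        raise ValueError("an int max_revision must be -1 or non-negative")
    return max_revision


def select_candidate(candidates: List[List[Any]], costs: List[float]) -> List[Any]:
    """Return the candidate with the lowest cost (first one wins on ties)."""
    best = min(range(len(costs)), key=costs.__getitem__)
    return candidates[best]


print(resolve_max_revision(-1, 5))   # 5
print(resolve_max_revision(0.5, 4))  # 2
print(resolve_max_revision(3, 5))    # 3
print(select_candidate([[1, 1], [0, 0]], [0.64, 0.94]))  # [1, 1]
```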
- batch_abduce(data_examples: ListData) List[List[Any]][source]
Perform abductive reasoning on the given prediction data examples. For detailed information, refer to abduce.
- batch_supervised_abduce(data_examples: ListData) List[List[Any]][source]
Perform abductive reasoning on the given prediction data examples, using supervised data when gt_pseudo_label is given.
- zoopt_budget(symbol_num: int) int[source]
Set the budget for ZOOpt optimization. The budget can depend dynamically on the number of symbols considered, as in the default implementation. Alternatively, it can be a fixed value, such as 100.
- Parameters:
symbol_num (int) – The number of symbols to be considered in the ZOOpt optimization process.
- Returns:
The budget for ZOOpt optimization.
- Return type:
int
- zoopt_score(symbol_num: int, data_example: ListData, sol: Solution) int[source]
Set the score for a solution. A lower score means that the ZOOpt library has a higher preference for this solution.
- Parameters:
symbol_num (int) – Number of total symbols.
data_example (ListData) – Data example.
sol (Solution) – The solution for ZOOpt library.
- Returns:
The score for the solution.
- Return type:
int
- class ablkit.reasoning.A3BLReasoner(kb, dist_func='confidence', idx_to_label=None, max_revision: int | float = -1, require_more_revision: int = 0, use_zoopt: bool = False, topK: int = 16, temperature: float = 0.2, multi_label: bool = False)[source]
Bases: Reasoner
Reasoner for minimizing the inconsistency between the knowledge base and learning models.
- Parameters:
kb (class KBBase) – The knowledge base to be used for reasoning.
dist_func (Union[str, Callable], optional) – The distance function used to determine the cost list between each candidate and the given prediction. The cost is also referred to as a consistency measure, wherein the candidate with the lowest cost is selected as the final abduced label. It can be either a string representing a predefined distance function or a callable function. The available predefined distance functions: ‘hamming’ | ‘confidence’ | ‘avg_confidence’ | ‘similarity’ | ‘rejection’. See Reasoner for the full description of each option. Defaults to ‘confidence’.
idx_to_label (dict, optional) – A mapping from index in the base model to label. If not provided, a default order-based index to label mapping is created. Defaults to None.
max_revision (Union[int, float], optional) – The upper limit on the number of revisions for each data example when performing abductive reasoning. If float, denotes the fraction of the total length that can be revised. A value of -1 implies no restriction on the number of revisions. Defaults to -1.
require_more_revision (int, optional) – Specifies additional number of revisions permitted beyond the minimum required when performing abductive reasoning. Defaults to 0.
use_zoopt (bool, optional) – Whether to use ZOOpt library during abductive reasoning. Defaults to False.
topK (int, optional) – Number of top-ranked candidates to keep when forming the soft label. -1 keeps all candidates. Defaults to 16.
temperature (float, optional) – Softmax temperature used when aggregating candidate probabilities into a soft label. Lower values produce sharper distributions. Defaults to 0.2.
multi_label (bool, optional) – Whether the underlying task is multi-label (each symbol is a binary vector rather than a single class index). Defaults to False.
- abduce(data_example: ListData) Tuple[List[Any], List[Any]][source]
Perform abduction and get a soft label distribution aggregated from all valid candidates that satisfy the underlying rules.
- Parameters:
data_example (ListData) – Data example.
- Returns:
soft_label (List[Any]) – Soft label aggregated from the top-k valid candidates.
pseudo_label (List[Any]) – Hard pseudo-label revision (the top-1 candidate) that is consistent with the knowledge base.
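One way to picture the aggregation of valid candidates into a soft label is sketched below. The exact scoring and weighting used by `A3BLReasoner` are not spelled out here, so this example makes its own assumptions: candidates are ranked by the confidence distance (1 - product of per-symbol probabilities), the kept candidates are weighted by a softmax over negative cost divided by `temperature`, and each symbol's soft label is the weighted vote of the candidates. The function name is hypothetical, and at least one candidate is assumed.

```python
from math import exp, prod
from typing import List, Tuple


def soft_label_from_candidates(
    pred_prob: List[List[float]],
    candidates: List[List[int]],
    top_k: int = 16,
    temperature: float = 0.2,
) -> Tuple[List[List[float]], List[int]]:
    """Aggregate valid candidates (class indices) into per-symbol distributions."""
    num_classes = len(pred_prob[0])
    # Assumed score: confidence distance of each candidate.
    costs = [1 - prod(pred_prob[i][c] for i, c in enumerate(cand)) for cand in candidates]
    order = sorted(range(len(candidates)), key=costs.__getitem__)
    if top_k != -1:
        order = order[:top_k]  # -1 keeps every candidate
    # Numerically stable softmax over negative cost / temperature.
    logits = [-costs[k] / temperature for k in order]
    peak = max(logits)
    exps = [exp(v - peak) for v in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Each candidate votes for its class at every symbol position.
    soft = [[0.0] * num_classes for _ in pred_prob]
    for w, k in zip(weights, order):
        for i, c in enumerate(candidates[k]):
            soft[i][c] += w
    return soft, candidates[order[0]]


pred_prob = [[0.1, 0.9], [0.6, 0.4]]
valid_candidates = [[1, 1], [0, 0]]  # assumed to satisfy the knowledge base
soft_label, pseudo_label = soft_label_from_candidates(pred_prob, valid_candidates)
print(pseudo_label)  # [1, 1]
print(soft_label)    # each row is a distribution that favors class 1
```

Lowering `temperature` pushes the weights toward the single best candidate, and with `top_k=1` the soft label collapses to the one-hot encoding of the hard pseudo-label.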
- class ablkit.reasoning.VerificationReasoner(kb: KBBase, top_k: int = 1, max_iter: int = 10000, idx_to_label: dict | None = None)[source]
Bases: object
Reasoner used by VerificationBridge. Rather than picking a single best candidate via a distance function, it enumerates the top top_k label assignments that satisfy the knowledge base, ordered by joint probability. The bridge then trains the model on each of those candidates.
- Parameters:
kb (KBBase) – The knowledge base used to verify candidates. kb.logic_forward must return the reasoning result so it can be compared with each data example’s Y.
top_k (int, optional) – Number of satisfying candidates to enumerate per example. Defaults to 1.
max_iter (int, optional) – Maximum number of enumeration steps per example before giving up and returning the fallback. Defaults to 10000.
idx_to_label (dict, optional) – A mapping from base-model index to pseudo-label. If omitted, a default order-based mapping is built from kb.pseudo_label_list.
- batch_top_k(data_examples) List[List[List[Any]]][source]
Run top_k_candidates() on every example in data_examples. Stores the result on data_examples.top_k_candidates and data_examples.top_k_probs. Returns the list of per-example candidate lists.
- ablkit.reasoning.reasoner.enumerate_label_assignments(pred_prob: ndarray, max_iter: int = 10000) Iterator[Tuple[List[int], float, List[float]]][source]
Yield label-index assignments for a single data example in descending joint-probability order. The walk is a Lawler-style best-first search: each state is the tuple of per-symbol rank indices, and successors are generated by advancing any one symbol to its next-best class.
- Parameters:
pred_prob (np.ndarray) – Per-symbol probability matrix with shape (num_symbols, num_classes).
max_iter (int, optional) – Hard cap on the number of yields. Defaults to 10000.
- Yields:
labels (List[int]) – Class indices for each symbol.
joint_prob (float) – Product of the chosen per-symbol probabilities.
per_symbol_probs (List[float]) – The chosen probability for each symbol.
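The best-first walk described above can be sketched with a priority queue. This is an illustrative reimplementation under assumptions, not the library's source: it takes nested Python lists instead of an `np.ndarray`, and the name `enumerate_assignments` is chosen to keep it distinct from the real function. Each state is a tuple of per-symbol ranks, and advancing one rank can never increase the joint probability, which is what makes popping from the heap produce a descending order.

```python
import heapq
from math import prod
from typing import Iterator, List, Tuple


def enumerate_assignments(
    pred_prob: List[List[float]], max_iter: int = 10000
) -> Iterator[Tuple[List[int], float, List[float]]]:
    """Yield (labels, joint_prob, per_symbol_probs) in descending joint probability."""
    # For each symbol, class indices sorted from most to least probable.
    ranked = [sorted(range(len(row)), key=lambda c, r=row: -r[c]) for row in pred_prob]

    def joint(state: Tuple[int, ...]) -> float:
        return prod(pred_prob[i][ranked[i][rank]] for i, rank in enumerate(state))

    start = tuple(0 for _ in pred_prob)  # every symbol at its best class
    heap = [(-joint(start), start)]      # max-heap via negated probability
    seen = {start}
    yielded = 0
    while heap and yielded < max_iter:
        neg_p, state = heapq.heappop(heap)
        labels = [ranked[i][rank] for i, rank in enumerate(state)]
        probs = [pred_prob[i][c] for i, c in enumerate(labels)]
        yield labels, -neg_p, probs
        yielded += 1
        # Successors: advance exactly one symbol to its next-best class.
        for i, rank in enumerate(state):
            if rank + 1 < len(ranked[i]):
                nxt = state[:i] + (rank + 1,) + state[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(heap, (-joint(nxt), nxt))


pred_prob = [[0.1, 0.9], [0.6, 0.4]]
walk = list(enumerate_assignments(pred_prob))
print([labels for labels, _, _ in walk])
# [[1, 0], [1, 1], [0, 0], [0, 1]]
```

The `seen` set prevents a state from being queued twice when it can be reached by advancing different symbols in a different order.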
- ablkit.reasoning.reasoner.top_k_satisfying(pred_prob: ndarray, predicate: Callable[[List[Any]], bool], top_k: int = 1, max_iter: int = 10000, idx_to_label: dict | None = None) Tuple[List[List[Any]], List[float]][source]
Walk label assignments in descending joint-probability order and return the first top_k that satisfy predicate. If none is found within max_iter iterations, the single highest-probability assignment is returned as a fallback so callers always receive a usable label.
- Parameters:
pred_prob (np.ndarray) – Per-symbol probability matrix with shape (num_symbols, num_classes).
predicate (Callable[[List[Any]], bool]) – Function called on each candidate label sequence; truthy means the candidate is consistent with the knowledge base.
top_k (int, optional) – Maximum number of satisfying candidates to return. Defaults to 1.
max_iter (int, optional) – Hard cap on enumeration steps. Defaults to 10000.
idx_to_label (dict, optional) – Optional mapping from class index to pseudo-label. When omitted, the raw class indices are returned.
- Returns:
candidates (List[List[Any]]) – Label assignments that satisfy predicate (or the fallback).
probs (List[float]) – Joint probability of each returned candidate.
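The contract of this function, including the fallback, can be illustrated with a deliberately simple brute-force stand-in. It scores every assignment up front and sorts, so it is only suitable for tiny inputs, and it swaps the lazy best-first walk for exhaustive enumeration (hence no `max_iter`). The name `top_k_satisfying_bruteforce` and the toy predicate are assumptions for this sketch.

```python
from itertools import product
from math import prod
from typing import Any, Callable, List, Optional, Tuple


def top_k_satisfying_bruteforce(
    pred_prob: List[List[float]],
    predicate: Callable[[List[Any]], bool],
    top_k: int = 1,
    idx_to_label: Optional[dict] = None,
) -> Tuple[List[List[Any]], List[float]]:
    """Return up to top_k predicate-satisfying assignments, most probable first."""
    # Score every possible assignment of class indices.
    scored = []
    for idx in product(*(range(len(row)) for row in pred_prob)):
        joint = prod(pred_prob[i][c] for i, c in enumerate(idx))
        scored.append((joint, list(idx)))
    scored.sort(key=lambda item: -item[0])  # descending joint probability

    def to_labels(idx: List[int]) -> List[Any]:
        return [idx_to_label[c] for c in idx] if idx_to_label else idx

    candidates, probs = [], []
    for joint, idx in scored:
        labels = to_labels(idx)
        if predicate(labels):
            candidates.append(labels)
            probs.append(joint)
            if len(candidates) == top_k:
                break
    if not candidates:
        # Fallback: the single most probable assignment, satisfied or not.
        joint, idx = scored[0]
        return [to_labels(idx)], [joint]
    return candidates, probs


pred_prob = [[0.1, 0.9], [0.6, 0.4]]
cands, probs = top_k_satisfying_bruteforce(pred_prob, lambda labels: sum(labels) == 2)
print(cands)  # [[1, 1]]

fallback, _ = top_k_satisfying_bruteforce(pred_prob, lambda labels: sum(labels) == 5)
print(fallback)  # [[1, 0]]
```

Because the fallback may violate the predicate, callers that need a guaranteed-consistent label should re-check the returned candidate themselves.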