Learn the Basics || Quick Start || Dataset & Data Structure || Learning Part || Reasoning Part || Evaluation Metrics || Bridge

Quick Start

We use the MNIST Addition task as a quick start example. In this task, pairs of MNIST handwritten images and their sums are given, alongwith a domain knowledge base which contains information on how to perform addition operations. Our objective is to input a pair of handwritten images and accurately determine their sum. Refer to the links in each section to dive deeper.

Working with Data

ABLkit requires data in the format of (X, gt_pseudo_label, Y) where X is a list of input examples containing instances, gt_pseudo_label is the ground-truth label of each example in X and Y is the ground-truth reasoning result of each example in X. Note that gt_pseudo_label is only used to evaluate the machine learning model’s performance but not to train it.

In the MNIST Addition task, the data loading looks like

# The 'datasets' module below is located in 'examples/mnist_add/'
from datasets import get_dataset

# train_data and test_data are tuples in the format of (X, gt_pseudo_label, Y)
train_data = get_dataset(train=True)
test_data = get_dataset(train=False)

Building the Learning Part

Learning part is constructed by first defining a base model for machine learning. ABLkit offers considerable flexibility, supporting any base model that conforms to the scikit-learn style (which requires the implementation of fit and predict methods), or a PyTorch-based neural network (which has defined the architecture and implemented forward method). In this example, we build a simple LeNet5 network as the base model.

# The 'models' module below is located in 'examples/mnist_add/'
from models.nn import LeNet5

net = LeNet5(num_classes=10)

To facilitate uniform processing, ABLkit provides the BasicNN class to convert a PyTorch-based neural network into a format compatible with scikit-learn models. To construct a BasicNN instance, aside from the network itself, we also need to define a loss function, an optimizer, and the computing device.

import torch
from ablkit.learning import BasicNN

loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.RMSprop(net.parameters(), lr=0.001)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
base_model = BasicNN(model=net, loss_fn=loss_fn, optimizer=optimizer, device=device)

The base model built above is trained to make predictions on instance-level data (e.g., a single image), while ABL deals with example-level data. To bridge this gap, we wrap the base_model into an instance of ABLModel. This class serves as a unified wrapper for base models, facilitating the learning part to train, test, and predict on example-level data, (e.g., images that comprise an equation).

from ablkit.learning import ABLModel

model = ABLModel(base_model)

Read more about building the learning part.

Building the Reasoning Part

To build the reasoning part, we first define a knowledge base by creating a subclass of KBBase. In the subclass, we initialize the pseudo_label_list parameter and override the logic_forward method, which specifies how to perform (deductive) reasoning that processes pseudo-labels of an example to the corresponding reasoning result. Specifically, for the MNIST Addition task, this logic_forward method is tailored to execute the sum operation.

from ablkit.reasoning import KBBase

class AddKB(KBBase):
   def __init__(self, pseudo_label_list=list(range(10))):
      super().__init__(pseudo_label_list)

   def logic_forward(self, nums):
      return sum(nums)

kb = AddKB()

Next, we create a reasoner by instantiating the class Reasoner, passing the knowledge base as a parameter. Due to the indeterminism of abductive reasoning, there could be multiple candidate pseudo-labels compatible with the knowledge base. In such scenarios, the reasoner can minimize inconsistency and return the pseudo-label with the highest consistency.

from ablkit.reasoning import Reasoner

reasoner = Reasoner(kb)

Read more about building the reasoning part.

Building Evaluation Metrics

ABLkit provides two basic metrics, namely SymbolAccuracy and ReasoningMetric, which are used to evaluate the accuracy of the machine learning model’s predictions and the accuracy of the logic_forward results, respectively.

from ablkit.data.evaluation import ReasoningMetric, SymbolAccuracy

metric_list = [SymbolAccuracy(), ReasoningMetric(kb=kb)]

Read more about building evaluation metrics

Bridging Learning and Reasoning

Now, we use SimpleBridge to combine learning and reasoning in a unified ABL framework.

from ablkit.bridge import SimpleBridge

bridge = SimpleBridge(model, reasoner, metric_list)

Finally, we proceed with training and testing.

bridge.train(train_data, loops=1, segment_size=0.01)
bridge.test(test_data)

Read more about bridging machine learning and reasoning.