BDD-OIA ======= .. raw:: html

For detailed code implementation, please view it on GitHub.

Below shows an implementation of `BDD-OIA `__. The BDD-OIA dataset comprises frames extracted from driving scene videos that are used for autonomous driving predictions. Each frame is annotated with 4 binary action labels (:math:`\textsf{move_forward}`, :math:`\textsf{stop}`, :math:`\textsf{turn_left}`, :math:`\textsf{turn_right}`), as well as 21 intermediate binary concept labels such as :math:`\textsf{red_light}` and :math:`\textsf{road_clear}` that explain those actions. The objective is to predict the possible actions for each frame. During training we use only the action-level supervision together with a knowledge base that captures the relations between concepts and actions, e.g., :math:`\textsf{red_light} \lor \textsf{traffic_sign} \lor \textsf{obstacle} \implies \textsf{stop}`. The training set contains 16,000 frames; the test set contains 4,500. Intuitively, the learning part predicts the 21 binary concept pseudo-labels from each frame, and the reasoning part uses the knowledge base to derive the four action labels from those concepts. When the learning part's predictions conflict with the ground-truth actions, the reasoner revises the concepts via abductive reasoning, and those revised concepts are used to further train the learning part. The dataset was preprocessed by `Marconato et al. (2023) `__ with a pretrained Faster-RCNN on BDD-100k together with the first module of CBM-AUC `(Sawada & Nakamura, 2022) `__, yielding a 2048-dimensional visual feature for each frame. .. code:: python # Import necessary libraries and modules import os.path as osp import numpy as np import torch import torch.nn as nn from torch import optim from ablkit.data.evaluation import SymbolAccuracy from ablkit.learning import MultiLabelABLModel, MultiLabelBasicNN from ablkit.reasoning import KBBase, Reasoner from ablkit.utils import ABLLogger, print_log from bridge import BDDBridge from dataset.data_util import get_dataset from metric import BDDReasoningMetric from models.nn import ConceptNet Working with Data ----------------- First, we load the training, validation, and testing splits: .. code:: python train_data = get_dataset(fname="train.npz", get_pseudo_label=True) val_data = get_dataset(fname="val.npz", get_pseudo_label=True) test_data = get_dataset(fname="test.npz", get_pseudo_label=True) Each split consists of three components (``X``, ``gt_pseudo_label``, and ``Y``) with one entry per frame: - ``X[i]`` is a list with a single ndarray of shape ``(2048,)``, the pre-extracted visual feature for the frame. - ``gt_pseudo_label[i]`` is a list of length 21 holding the binary concept annotations (``red_light``, ``road_clear``, …). - ``Y[i]`` is a tuple of length 4 holding the binary action labels (``move_forward``, ``stop``, ``turn_left``, ``turn_right``). During training only ``X`` and ``Y`` are used; ``gt_pseudo_label`` is held back for evaluation. Building the Learning Part -------------------------- To build the learning part we first construct a PyTorch model, ``ConceptNet``, then wrap it in :class:`~ablkit.learning.MultiLabelBasicNN` to obtain an sklearn-style base model. ``MultiLabelBasicNN`` is a multi-label variant of ``BasicNN``: the output uses sigmoid activations rather than softmax, predictions are binary vectors rather than single class indices, and the dataset is a :class:`~ablkit.learning.MultiLabelClassificationDataset`. The 21 outputs therefore correspond to the 21 binary concept labels. .. code:: python net = ConceptNet() loss_fn = nn.BCEWithLogitsLoss() optimizer = optim.Adam(net.parameters(), lr=0.002) device = torch.device("cuda" if torch.cuda.is_available() else "cpu") scheduler = optim.lr_scheduler.OneCycleLR( optimizer, max_lr=0.002, pct_start=0.15, epochs=2, steps_per_epoch=int(1 / 0.01) + 1, ) base_model = MultiLabelBasicNN( net, loss_fn, optimizer, scheduler=scheduler, device=device, batch_size=32, num_epochs=1, ) ``MultiLabelBasicNN`` operates on a single frame at a time. To work at the example level (a frame together with its label set), we wrap the base model in :class:`~ablkit.learning.MultiLabelABLModel`, an ``ABLModel`` subclass that threshold-binarises the sigmoid probabilities into per-concept 0/1 pseudo-labels. .. code:: python model = MultiLabelABLModel(base_model) Building the Reasoning Part --------------------------- The knowledge base ``BDDKB`` encodes the rules linking the 21 concepts to the 4 actions (e.g., ``red_light`` or ``obstacle`` imply ``stop``; ``green_light`` together with ``road_clear`` implies ``move_forward``). It subclasses ``KBBase``; the ``pseudo_label_list`` parameter is ``[0, 1]`` because each pseudo-label is binary, and the ``logic_forward`` method computes the 4-tuple of action labels from the 21 concept attributes. .. code:: python from reasoning.bddkb import BDDKB kb = BDDKB() Since abductive reasoning is non-deterministic, multiple concept revisions can be consistent with the ground-truth actions. The ``Reasoner`` picks the revision that minimises a user-supplied distance function. For BDD-OIA we provide ``multi_label_confidence_dist``, which sums ``-log(p)`` over the concept-by-concept probabilities so that revisions consistent with the learning part's per-concept confidence are preferred: .. code:: python def multi_label_confidence_dist(data_example, candidates, candidates_idxs, reasoning_results): pred_prob = data_example.pred_prob.T # nc x 1 pred_prob = np.concatenate([1 - pred_prob, pred_prob], axis=1) # nc x 2 cols = np.arange(len(candidates_idxs[0]))[None, :] corr_prob = pred_prob[cols, candidates_idxs] costs = -np.sum(np.log(corr_prob + 1e-6), axis=1) return costs reasoner = Reasoner( kb, dist_func=multi_label_confidence_dist, max_revision=3, require_more_revision=3, ) ``max_revision`` and ``require_more_revision`` cap how many concept flips the reasoner explores when searching for a consistent revision. Building Evaluation Metrics --------------------------- We track two metrics. ``SymbolAccuracy`` measures how often the predicted concepts match the ground-truth concepts, and ``BDDReasoningMetric`` measures the per-action accuracy after passing the predicted concepts through ``logic_forward``. .. code:: python metric_list = [ SymbolAccuracy(prefix="bdd_oia"), BDDReasoningMetric(kb=kb, prefix="bdd_oia"), ] Bridging Learning and Reasoning ------------------------------- Finally we bridge the learning and reasoning parts via ``BDDBridge``, a thin subclass of ``SimpleBridge`` that handles the multi-label-specific shape of ``pred_idx`` (a ``[1, nc]`` ndarray per example). .. code:: python bridge = BDDBridge(model, reasoner, metric_list) Training and testing reuse the standard ``SimpleBridge`` interface: .. code:: python print_log("Abductive Learning on the BDD_OIA example.", logger="current") log_dir = ABLLogger.get_current_instance().log_dir weights_dir = osp.join(log_dir, "weights") bridge.train( train_data, loops=2, segment_size=0.01, save_interval=1, save_dir=weights_dir, ) bridge.test(test_data)