BDD-OIA
=======

For detailed code implementation, please view it on GitHub.

Below is an implementation of BDD-OIA.
The BDD-OIA dataset comprises frames extracted from driving scene videos
that are used for autonomous driving predictions. Each frame is
annotated with 4 binary action labels (:math:`\textsf{move_forward}`,
:math:`\textsf{stop}`, :math:`\textsf{turn_left}`, :math:`\textsf{turn_right}`),
as well as 21 intermediate binary concept labels such as
:math:`\textsf{red_light}` and :math:`\textsf{road_clear}` that explain those
actions.
The objective is to predict the possible actions for each frame.
During training we use only the action-level supervision together with
a knowledge base that captures the relations between concepts and
actions, e.g.,
:math:`\textsf{red_light} \lor \textsf{traffic_sign} \lor \textsf{obstacle} \implies \textsf{stop}`.
The training set contains 16,000 frames; the test set contains 4,500.
Intuitively, the learning part predicts the 21 binary concept
pseudo-labels from each frame, and the reasoning part uses the
knowledge base to derive the four action labels from those concepts.
When the actions derived from the predicted concepts conflict with the
ground-truth actions, the reasoner revises the concepts via abductive
reasoning, and the revised concepts are then used as training targets
for the learning part.
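
The revise-on-conflict step can be illustrated with a toy, self-contained
sketch. It uses only the single rule shown above, restricted to its three
concepts, and treats that rule as the only way to derive ``stop`` (a
simplifying assumption). The names ``derive_stop`` and ``abduce`` are
hypothetical and are not part of ABLkit, whose ``Reasoner`` searches over
all 21 concepts.

.. code:: python

    from itertools import combinations

    # Concept order assumed here: (red_light, traffic_sign, obstacle)

    def derive_stop(concepts):
        """Toy forward reasoning: any active concept implies stop."""
        return int(any(concepts))

    def abduce(pred, target_stop):
        """Return the consistent revisions of `pred` that flip the fewest concepts."""
        for n_flips in range(len(pred) + 1):
            candidates = []
            for idxs in combinations(range(len(pred)), n_flips):
                revised = list(pred)
                for i in idxs:
                    revised[i] = 1 - revised[i]
                if derive_stop(revised) == target_stop:
                    candidates.append(tuple(revised))
            if candidates:
                return candidates
        return []

    pred = (0, 0, 0)          # the learner sees no reason to stop
    print(derive_stop(pred))  # 0, which conflicts with a ground-truth stop = 1
    print(abduce(pred, 1))    # three one-flip revisions, one per concept

Several revisions are equally small here, which is why a distance
function is needed to choose among them.
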

The dataset was preprocessed by Marconato et al. (2023)
with a pretrained Faster-RCNN on BDD-100k together with the first
module of CBM-AUC (Sawada & Nakamura, 2022),
yielding a 2048-dimensional visual feature for each frame.

.. code:: python

    # Import necessary libraries and modules
    import os.path as osp

    import numpy as np
    import torch
    import torch.nn as nn
    from torch import optim

    from ablkit.data.evaluation import SymbolAccuracy
    from ablkit.learning import MultiLabelABLModel, MultiLabelBasicNN
    from ablkit.reasoning import KBBase, Reasoner
    from ablkit.utils import ABLLogger, print_log

    from bridge import BDDBridge
    from dataset.data_util import get_dataset
    from metric import BDDReasoningMetric
    from models.nn import ConceptNet

Working with Data
-----------------
First, we load the training, validation, and testing splits:

.. code:: python

    train_data = get_dataset(fname="train.npz", get_pseudo_label=True)
    val_data = get_dataset(fname="val.npz", get_pseudo_label=True)
    test_data = get_dataset(fname="test.npz", get_pseudo_label=True)

Each split consists of three components (``X``, ``gt_pseudo_label``,
and ``Y``) with one entry per frame:

- ``X[i]`` is a list with a single ndarray of shape ``(2048,)``, the
  pre-extracted visual feature for the frame.
- ``gt_pseudo_label[i]`` is a list of length 21 holding the binary
  concept annotations (``red_light``, ``road_clear``, …).
- ``Y[i]`` is a tuple of length 4 holding the binary action labels
  (``move_forward``, ``stop``, ``turn_left``, ``turn_right``).

During training only ``X`` and ``Y`` are used; ``gt_pseudo_label`` is
held back for evaluation.
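
To make the layout concrete, the following sketch builds a tiny
synthetic split with the shapes described above. The values are random
placeholders, not BDD-OIA data, and packing the three components into a
tuple in this order is an assumption made for illustration; real splits
come from ``get_dataset``.

.. code:: python

    import numpy as np

    rng = np.random.default_rng(0)
    n_frames = 3  # tiny stand-in for a real split

    # Synthetic placeholders only; real values come from get_dataset.
    X = [[rng.standard_normal(2048).astype(np.float32)] for _ in range(n_frames)]
    gt_pseudo_label = [rng.integers(0, 2, size=21).tolist() for _ in range(n_frames)]
    Y = [tuple(rng.integers(0, 2, size=4).tolist()) for _ in range(n_frames)]
    toy_split = (X, gt_pseudo_label, Y)

    print(X[0][0].shape)            # (2048,)
    print(len(gt_pseudo_label[0]))  # 21
    print(len(Y[0]))                # 4
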
Building the Learning Part
--------------------------
To build the learning part we first construct a PyTorch model,
``ConceptNet``, then wrap it in
:class:`~ablkit.learning.MultiLabelBasicNN` to obtain an sklearn-style
base model. ``MultiLabelBasicNN`` is a multi-label variant of
``BasicNN``: the output uses sigmoid activations rather than softmax,
predictions are binary vectors rather than single class indices, and
the dataset is a
:class:`~ablkit.learning.MultiLabelClassificationDataset`. The 21
outputs therefore correspond to the 21 binary concept labels.

.. code:: python

    net = ConceptNet()
    # Binary cross-entropy on logits: one independent sigmoid per concept
    loss_fn = nn.BCEWithLogitsLoss()
    optimizer = optim.Adam(net.parameters(), lr=0.002)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    # epochs and steps_per_epoch follow the loops=2 and segment_size=0.01
    # values passed to bridge.train later in this example
    scheduler = optim.lr_scheduler.OneCycleLR(
        optimizer,
        max_lr=0.002,
        pct_start=0.15,
        epochs=2,
        steps_per_epoch=int(1 / 0.01) + 1,
    )

    base_model = MultiLabelBasicNN(
        net,
        loss_fn,
        optimizer,
        scheduler=scheduler,
        device=device,
        batch_size=32,
        num_epochs=1,
    )

``MultiLabelBasicNN`` operates on a single frame at a time. To work at
the example level (a frame together with its label set), we wrap the
base model in :class:`~ablkit.learning.MultiLabelABLModel`, an
``ABLModel`` subclass that threshold-binarises the sigmoid
probabilities into per-concept 0/1 pseudo-labels.

.. code:: python

    model = MultiLabelABLModel(base_model)

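
The thresholding step can be sketched in a few lines of NumPy. The
helper ``binarise``, the 0.5 threshold, and the ``>=`` comparison are
assumptions made for illustration; the cut-off actually used inside
``MultiLabelABLModel`` is not stated here.

.. code:: python

    import numpy as np

    def binarise(probs, threshold=0.5):
        """Map per-concept probabilities to 0/1 pseudo-labels."""
        return (np.asarray(probs) >= threshold).astype(int)

    logits = np.array([[2.0, -1.0, 0.3, -3.0]])  # one frame, four concepts for brevity
    probs = 1.0 / (1.0 + np.exp(-logits))        # element-wise sigmoid
    print(binarise(probs))                       # [[1 0 1 0]]

Unlike an argmax over softmax outputs, each concept is decided
independently, so any number of concepts can be active for a frame.
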
Building the Reasoning Part
---------------------------
The knowledge base ``BDDKB`` encodes the rules linking the 21
concepts to the 4 actions (e.g., ``red_light`` or ``obstacle`` imply
``stop``; ``green_light`` together with ``road_clear`` implies
``move_forward``). It subclasses ``KBBase``; the ``pseudo_label_list``
parameter is ``[0, 1]`` because each pseudo-label is binary, and the
``logic_forward`` method computes the 4-tuple of action labels from
the 21 concept attributes.

.. code:: python

    from reasoning.bddkb import BDDKB

    kb = BDDKB()

Since abductive reasoning is non-deterministic, multiple concept
revisions can be consistent with the ground-truth actions. The
``Reasoner`` picks the revision that minimises a user-supplied
distance function. For BDD-OIA we provide
``multi_label_confidence_dist``, which sums ``-log(p)`` over the
concept-by-concept probabilities so that revisions consistent with the
learning part's per-concept confidence are preferred:

.. code:: python

    def multi_label_confidence_dist(data_example, candidates, candidates_idxs, reasoning_results):
        """Return one cost per candidate: the summed -log probability of its concept values."""
        pred_prob = data_example.pred_prob.T  # nc x 1
        pred_prob = np.concatenate([1 - pred_prob, pred_prob], axis=1)  # nc x 2
        cols = np.arange(len(candidates_idxs[0]))[None, :]
        # Probability assigned to each candidate's value for each concept
        corr_prob = pred_prob[cols, candidates_idxs]  # n_candidates x nc
        # The small constant avoids log(0)
        costs = -np.sum(np.log(corr_prob + 1e-6), axis=1)
        return costs

    reasoner = Reasoner(
        kb,
        dist_func=multi_label_confidence_dist,
        max_revision=3,
        require_more_revision=3,
    )

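
To see how this distance ranks candidates, the sketch below repeats the
function so that it runs on its own and applies it to a made-up
three-concept example. The ``SimpleNamespace`` object is only a stand-in
for the data example that ABLkit passes in, and the probabilities are
invented for illustration.

.. code:: python

    from types import SimpleNamespace

    import numpy as np

    def multi_label_confidence_dist(data_example, candidates, candidates_idxs, reasoning_results):
        pred_prob = data_example.pred_prob.T                            # nc x 1
        pred_prob = np.concatenate([1 - pred_prob, pred_prob], axis=1)  # nc x 2
        cols = np.arange(len(candidates_idxs[0]))[None, :]
        corr_prob = pred_prob[cols, candidates_idxs]                    # n_candidates x nc
        return -np.sum(np.log(corr_prob + 1e-6), axis=1)

    # Stand-in example: P(concept = 1) for three concepts of one frame
    example = SimpleNamespace(pred_prob=np.array([[0.9, 0.2, 0.6]]))
    candidates_idxs = [[1, 0, 1], [0, 1, 1]]

    costs = multi_label_confidence_dist(example, candidates_idxs, candidates_idxs, None)
    print(costs.argmin())  # 0

The first candidate agrees with the confident predictions (0.9 for the
first concept, 0.2 for the second), so it receives the lower cost and
would be the one selected.
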
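``max_revision`` caps how many concepts the reasoner may revise for a
single example. ``require_more_revision`` widens the search: besides the
candidates that need the fewest revisions, the reasoner also considers
consistent candidates that use up to that many additional revisions.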
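
The interaction of the two parameters can be sketched as a filter on
Hamming distance. This reflects one reading of their semantics; the
helper ``select_candidates`` is hypothetical, and ABLkit's own search
may differ in detail.

.. code:: python

    def hamming(a, b):
        """Number of positions at which two label vectors differ."""
        return sum(x != y for x, y in zip(a, b))

    def select_candidates(pred, consistent, max_revision, require_more_revision):
        """Keep consistent label vectors that fit within the revision budget."""
        dists = [hamming(pred, c) for c in consistent]
        feasible = [d for d in dists if d <= max_revision]
        if not feasible:
            return []
        # Allow a few revisions beyond the minimum, but never beyond the cap
        limit = min(min(feasible) + require_more_revision, max_revision)
        return [c for c, d in zip(consistent, dists) if d <= limit]

    pred = (0, 0, 0, 0)
    consistent = [(1, 0, 0, 0), (1, 1, 0, 0), (1, 1, 1, 1)]
    print(select_candidates(pred, consistent, max_revision=3, require_more_revision=0))
    # [(1, 0, 0, 0)]
    print(select_candidates(pred, consistent, max_revision=3, require_more_revision=3))
    # [(1, 0, 0, 0), (1, 1, 0, 0)]

The four-flip candidate is excluded in both calls because it exceeds
``max_revision``.
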
Building Evaluation Metrics
---------------------------
We track two metrics. ``SymbolAccuracy`` measures how often the
predicted concepts match the ground-truth concepts, and
``BDDReasoningMetric`` measures the per-action accuracy after
passing the predicted concepts through ``logic_forward``.

.. code:: python

    metric_list = [
        SymbolAccuracy(prefix="bdd_oia"),
        BDDReasoningMetric(kb=kb, prefix="bdd_oia"),
    ]

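
As a rough illustration of what the two metrics measure, the sketch
below computes an element-wise concept accuracy and a per-action
accuracy on made-up labels. Both helper functions are hypothetical; how
``SymbolAccuracy`` and ``BDDReasoningMetric`` aggregate their results
may differ.

.. code:: python

    import numpy as np

    def concept_accuracy(pred_concepts, gt_concepts):
        """Fraction of individual concept labels predicted correctly."""
        return float((np.asarray(pred_concepts) == np.asarray(gt_concepts)).mean())

    def per_action_accuracy(pred_actions, gt_actions):
        """Accuracy of each action column, in the order given."""
        return (np.asarray(pred_actions) == np.asarray(gt_actions)).mean(axis=0)

    # Two frames, three concepts each (shortened from 21 for readability)
    pred_c = [[1, 0, 1], [0, 0, 1]]
    gt_c = [[1, 0, 0], [0, 0, 1]]
    print(concept_accuracy(pred_c, gt_c))     # 5 of 6 labels match

    # Two frames, four actions each
    pred_a = [(1, 0, 0, 0), (0, 1, 0, 1)]
    gt_a = [(1, 0, 0, 0), (0, 1, 1, 1)]
    print(per_action_accuracy(pred_a, gt_a))  # third action is right on 1 of 2 frames
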
Bridging Learning and Reasoning
-------------------------------
Finally we bridge the learning and reasoning parts via ``BDDBridge``,
a thin subclass of ``SimpleBridge`` that handles the
multi-label-specific shape of ``pred_idx`` (a ``[1, nc]`` ndarray per
example).

.. code:: python

    bridge = BDDBridge(model, reasoner, metric_list)

Training and testing reuse the standard ``SimpleBridge`` interface:

.. code:: python

    print_log("Abductive Learning on the BDD_OIA example.", logger="current")
    log_dir = ABLLogger.get_current_instance().log_dir
    weights_dir = osp.join(log_dir, "weights")

    bridge.train(
        train_data,
        loops=2,
        segment_size=0.01,
        save_interval=1,
        save_dir=weights_dir,
    )
    bridge.test(test_data)

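
The relation between ``segment_size`` and the number of training
segments per loop can be worked out with a short calculation. The
sketch assumes that a fractional ``segment_size`` is converted to an
example count by truncation and that a final partial segment is kept;
``num_segments`` is a hypothetical helper, not an ABLkit function.

.. code:: python

    def num_segments(n_examples, segment_size):
        """Segments per loop when `segment_size` is a fraction of the data."""
        per_segment = max(1, int(n_examples * segment_size))
        return (n_examples - 1) // per_segment + 1

    n = num_segments(16_000, 0.01)
    print(n)  # 100

Under these assumptions, 16,000 training frames give 100 segments of 160
frames per loop, which fits within the ``int(1 / 0.01) + 1 = 101`` steps
per epoch given to the learning-rate scheduler.
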