License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.CP.2023.33
URN: urn:nbn:de:0030-drops-190700
URL: https://drops.dagstuhl.de/opus/volltexte/2023/19070/


Shati, Pouya ; Cohen, Eldan ; McIlraith, Sheila

SAT-Based Learning of Compact Binary Decision Diagrams for Classification

pdf-format:
LIPIcs-CP-2023-33.pdf (2 MB)


Abstract

Decision trees are a popular classification model in machine learning due to their interpretability and performance. However, the number of splits in decision trees grows exponentially with their depth, which can incur a higher computational cost, increase data fragmentation, hinder interpretability, and restrict their applicability to memory-constrained hardware. In contrast, binary decision diagrams (BDDs) utilize the same split across each level, leading to a linear number of splits in total. Recent work has considered optimal BDDs as compact and accurate classification models, but has only focused on binary datasets and has not explicitly optimized the compactness of the resulting diagrams. In this work, we present a SAT-based encoding for a multi-terminal variant of BDDs (MTBDDs) that incorporates a state-of-the-art direct encoding of numerical features. We then develop and evaluate different approaches to explicitly optimize the compactness of the diagrams. In one family of approaches, we learn a tree BDD first and model the size of the diagram the tree will be reduced to as a secondary objective, in a one-stage or two-stage optimization scheme. Alternatively, we directly learn diagrams that support multi-dimensional splits for improved expressiveness. Our experiments show that direct encoding of numerical features leads to better performance. Furthermore, we show that exact optimization of size leads to more compact solutions while maintaining higher accuracy. Finally, our experiments show that multi-dimensional splits are a viable approach to achieving higher expressiveness with a lower computational cost.
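
The contrast between exponential and linear split counts, and the idea of reducing a tree BDD to a smaller diagram, can be illustrated with a minimal Python sketch. It is not the paper's SAT encoding. All function names are hypothetical, the tree BDD is assumed to be stored as one (feature, threshold) split per level plus a list of leaf labels, and the reduction uses the standard BDD rules (merge identical sub-diagrams, skip a node whose two children coincide), which may differ from the rules used in the paper.

```python
def tree_splits(depth):
    """Split nodes in a complete binary decision tree of the given depth."""
    return 2 ** depth - 1


def bdd_splits(depth):
    """Distinct splits when every level shares a single split."""
    return depth


def classify(splits, leaves, x):
    """Evaluate a tree BDD on sample x.

    splits: one (feature_index, threshold) pair per level (assumed layout).
    leaves: 2 ** len(splits) labels, ordered left to right.
    """
    index = 0
    for feature, threshold in splits:
        # Go right (bit 1) when the feature value exceeds the threshold.
        index = 2 * index + (1 if x[feature] > threshold else 0)
    return leaves[index]


def reduced_node_count(leaves):
    """Count split nodes left after applying the standard reduction rules."""
    level = list(leaves)  # entries are terminal labels or node tuples
    nodes = set()
    depth = 0
    while len(level) > 1:
        next_level = []
        for lo, hi in zip(level[0::2], level[1::2]):
            if lo == hi:
                # Both branches agree, so the test at this node is redundant.
                next_level.append(lo)
            else:
                # Tag with the level so nodes testing different splits
                # are never merged; identical tuples merge via the set.
                node = (depth, lo, hi)
                nodes.add(node)
                next_level.append(node)
        level = next_level
        depth += 1
    return len(nodes)


splits = [(0, 0.5), (1, 2.0)]
leaves = ["a", "a", "a", "b"]
label = classify(splits, leaves, (0.7, 3.1))
```

In this sketch a depth-3 tree has 7 split nodes while a BDD of the same depth uses 3 distinct splits, and the two-level example with leaves a, a, a, b reduces from 3 split nodes to 2 because the left subtree always predicts the same label.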

BibTeX - Entry

@InProceedings{shati_et_al:LIPIcs.CP.2023.33,
  author =	{Shati, Pouya and Cohen, Eldan and McIlraith, Sheila},
  title =	{{SAT-Based Learning of Compact Binary Decision Diagrams for Classification}},
  booktitle =	{29th International Conference on Principles and Practice of Constraint Programming (CP 2023)},
  pages =	{33:1--33:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-300-3},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{280},
  editor =	{Yap, Roland H. C.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2023/19070},
  URN =		{urn:nbn:de:0030-drops-190700},
  doi =		{10.4230/LIPIcs.CP.2023.33},
  annote =	{Keywords: Binary Decision Diagram, Classification, Compactness, Numeric Data, MaxSAT}
}

Keywords: Binary Decision Diagram, Classification, Compactness, Numeric Data, MaxSAT
Collection: 29th International Conference on Principles and Practice of Constraint Programming (CP 2023)
Issue Date: 2023
Date of publication: 22.09.2023
Supplementary Material: Software (Source code): https://github.com/PouyaShati/BDD

