ModalDecisionTrees.jl offers a set of symbolic machine learning algorithms that extend classical decision tree learning algorithms and natively handle time series and image data. Modal Decision Trees leverage modal logics to perform a primitive but powerful form of entity-relation reasoning; this lets them capture temporal and spatial patterns and work directly on data such as multivariate time series and images, with no need for feature extraction.
Symbolic learning provides transparent (or interpretable) models, and is becoming increasingly popular as AI permeates more and more aspects of our lives while raising ethical concerns. Symbolic modeling, mainly based on decision trees and rule-based models, has largely been studied with either propositional or first-order logic as the underlying logical formalism. These logics are two extremes in terms of expressive power and computational tractability: on the one hand, propositional logic can only express a simple form of reasoning, which makes classical decision trees easy to learn but unable to deal with non-tabular data; on the other hand, first-order logic can express complex sentences in terms of entities and relations, but at a higher computational cost. A middle point between the two has been overlooked: modal logic.
ModalDecisionTrees.jl offers a set of symbolic machine learning methods based on extensions of classical decision tree learning algorithms (CART and C4.5) that leverage modal logics to perform a rather simple (but powerful) form of entity-relation reasoning. This allows "Modal Decision Trees" (MDTs) to capture temporal, spatial, and spatio-temporal patterns, and makes them suitable for dealing natively (that is, with no need for feature extraction) with data such as multivariate time series and images.
To fix ideas, consider the case of time-series classification. Classical trees can only make decisions based on scalar values, so they can handle time series only after each series has been flattened into a set of scalar descriptors (a feature extraction step). A modal time-series classification rule, by contrast, can speak in terms of temporal patterns such as "there exists an interval in the time series where variable i has a certain property, containing another interval where variable j has another property".
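To make the interval pattern above concrete, here is a minimal Python sketch that checks it by brute force on a toy multivariate time series. It is only an illustration of the idea, not the package's implementation or API: the function names (`intervals`, `strictly_contains`, `holds_everywhere`, `exists_nested_pattern`), the variable names, and the data are all hypothetical, and "has a certain property" is assumed here to mean "a predicate holds at every point of the interval".

```python
from itertools import combinations

# Hypothetical helpers for illustration only; none of these names come from
# ModalDecisionTrees.jl.

def intervals(n):
    """All half-open intervals (a, b), 0 <= a < b <= n, over a series of n points."""
    return list(combinations(range(n + 1), 2))

def strictly_contains(outer, inner):
    """True if `inner` lies within `outer` and is a different interval."""
    return outer[0] <= inner[0] and inner[1] <= outer[1] and outer != inner

def holds_everywhere(values, interval, predicate):
    """Local property: `predicate` is true at every point of the interval."""
    a, b = interval
    return all(predicate(v) for v in values[a:b])

def exists_nested_pattern(series, var_i, prop_i, var_j, prop_j):
    """Search for an interval where `prop_i` always holds on `var_i`, containing
    another interval where `prop_j` always holds on `var_j`.
    Returns the first (outer, inner) pair found, or None."""
    n = len(series[var_i])
    for outer in intervals(n):
        if not holds_everywhere(series[var_i], outer, prop_i):
            continue
        for inner in intervals(n):
            if strictly_contains(outer, inner) and holds_everywhere(
                series[var_j], inner, prop_j
            ):
                return outer, inner
    return None

# Toy sample with two variables observed at six time points (made-up values).
toy = {
    "temperature": [36.5, 38.2, 38.9, 39.1, 38.4, 36.8],
    "heart_rate": [70, 72, 110, 115, 75, 71],
}

match = exists_nested_pattern(
    toy,
    "temperature", lambda v: v > 38.0,  # property of variable i
    "heart_rate", lambda v: v > 100,    # property of variable j
)
print(match)
```

A classical tree would need such a pattern to be precomputed as a scalar feature, whereas in this sketch the rule is evaluated directly on the raw series. The exhaustive search over all interval pairs is chosen for readability, not efficiency.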
Modal logic can express the existence of entities (for example, a time interval or an image region) with given properties. Properties can be local, such as the value of a variable always being lower than a certain threshold within a time interval, or relational, such as one entity being contained in, or overlapping with, another. This reasoning involves an intermediate step in which data samples are represented as graphs (Kripke structures, in logical jargon) that encode entities, their local properties, and their relations. Rules and patterns can, of course, be as complex as the reality they are trying to capture; however, they can always be straightforwardly translated into natural language, which is the essence of the transparency of these models and the main reason one may want to use this package.
MDTs have been shown to outperform classical decision trees, and often to perform comparably to functional gradient-based methods (e.g., neural networks), in tasks such as multivariate time-series classification (e.g., COVID-19 diagnosis from audio recordings of coughs and breaths) and image classification (e.g., land cover classification). Although this package is still in its infancy, ModalDecisionTrees.jl can be used with the Machine Learning in Julia (MLJ) framework, and provides:
Package available at: https://github.com/giopaglia/ModalDecisionTrees.jl