The existing literature on deep learning for tabular data proposes a wide range of novel architectures and reports competitive results on various datasets. However, the proposed models are usually not properly compared to each other, and existing works often use different benchmarks and experiment protocols. As a result, it is unclear for both researchers and practitioners which models perform best. Additionally, the field still lacks effective baselines, that is, easy-to-use models that provide competitive performance across different problems. In this work, we perform an overview of the main families of DL architectures for tabular data and raise the bar of baselines in tabular DL by identifying two simple and powerful deep architectures. The first one is a ResNet-like architecture, which turns out to be a strong baseline that is often missing in prior works. The second model is our simple adaptation of the Transformer architecture for tabular data, which outperforms other solutions on most tasks. Both models are compared to many existing architectures on a diverse set of tasks under the same training and tuning protocols. We also compare the best DL models with Gradient Boosted Decision Trees and conclude that there is still no universally superior solution.

1 Introduction

Due to the tremendous success of deep learning on such data domains as images, audio, and texts (Goodfellow et al., 2016), there has been a lot of research interest in extending this success to problems with data stored in tabular format. In these problems, data points are represented as vectors of heterogeneous features, which is typical for industrial applications and ML competitions, where neural networks have a strong non-deep competitor in the form of GBDT (Chen and Guestrin, 2016; Prokhorenkova et al., 2018; Ke et al., 2017). Along with potentially higher performance, using deep learning for tabular data is appealing as it would allow constructing multi-modal pipelines for problems where only one part of the input is tabular, and the other parts include images, audio, and other DL-friendly data. Such pipelines can then be trained end-to-end by gradient optimization for all modalities. For these reasons, a large number of DL solutions were recently proposed, and new models continue to emerge (Klambauer et al., 2017; Popov et al., 2020; Arik and Pfister, 2020; Song et al., 2019; Wang et al., 2017, 2020a; Badirli et al., 2020; Hazimeh et al., 2020; Huang et al., 2020a).

The described problems impede the research process and make the observations from the papers not conclusive enough. Therefore, we believe it is timely to review the recent developments in the field and raise the bar of baselines in tabular DL. We start with a hypothesis that well-studied DL architecture blocks may be underexplored in the context of tabular data and may be used to design better baselines. Thus, we take inspiration from well-known, battle-tested architectures from other fields and obtain two simple models for tabular data. The first one is a ResNet-like architecture (He et al., 2015b) and the second one is FT-Transformer, our simple adaptation of the Transformer architecture (Vaswani et al., 2017) for tabular data. Then, we compare these models with many existing solutions on a diverse set of tasks under the same protocols of training and hyperparameter tuning. First, we reveal that none of the considered DL models can consistently outperform the ResNet-like model. Given its simplicity, it can serve as a strong baseline for future work.
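To make the ResNet-like baseline concrete, here is a minimal NumPy sketch of a residual block for tabular inputs, assuming a normalize → linear → ReLU → linear branch added back to the input. The function name, weight shapes, and simplified per-batch normalization are illustrative choices, not the authors' exact implementation (which also includes dropout and a learned normalization).

```python
import numpy as np

def resnet_block(x, W1, b1, W2, b2, eps=1e-5):
    """One residual block for tabular features.

    Illustrative shapes: x is (batch, d), W1 is (d, d_hidden),
    W2 is (d_hidden, d), so the residual sum is well-defined.
    """
    # Normalize each feature over the batch (a simplified BatchNorm
    # without learned scale/shift).
    h = (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)
    # Two-layer MLP branch with a ReLU nonlinearity.
    h = np.maximum(h @ W1 + b1, 0.0)
    h = h @ W2 + b2
    # Residual connection: output keeps the input dimension,
    # so blocks can be stacked freely.
    return x + h

# Usage: stack a few blocks, then attach a prediction head.
rng = np.random.default_rng(0)
x = rng.normal(size=(32, 8))                      # batch of 32, 8 features
W1, b1 = rng.normal(size=(8, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 8)), np.zeros(8)
out = resnet_block(x, W1, b1, W2, b2)
print(out.shape)  # (32, 8)
```

The key design point carried over from vision ResNets is that the skip connection keeps optimization easy even as depth grows, which is part of what makes this architecture a strong, simple baseline.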