The Sparse Grid technology is one of the most ambitious approaches to solving classification and regression problems.
It is the first universal multivariate method that scales linearly with the number of data points and can therefore be applied to huge data sets. The main idea is to solve classification and regression problems directly via their operator equations, usually in the form of differential equations, by discretizing the feature space. This approach has proved itself for physical problems over decades, usually in the form of the Finite Element Method, but until now it has failed in data mining because its computational complexity increases exponentially with the number of dimensions (the "curse of dimensionality").
Sparse Grids allow, for the first time, a discretization of high-dimensional function spaces, and have been used to solve high-dimensional integral and differential equations since the late 1990s. Mathematically, Sparse Grid functions are high-dimensional wavelets over a hierarchy of anisotropic grids. Adapting Sparse Grids to classification and regression makes it possible, for the first time, to apply highly nonlinear classification and regression methods to large data sets. It thus represents a qualitative improvement over conventional methods such as neural networks, Bayesian networks, and SVMs.
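The point of the hierarchical construction is how few grid points a sparse grid needs compared to a full grid of the same level. A minimal sketch (the function names and the convention of levels starting at 1 with boundary points omitted are assumptions for illustration, not the vendor's implementation):

```python
from itertools import product

def sparse_grid_points(n, d):
    """Count points of a regular sparse grid of level n in d dimensions.

    The grid is the union of hierarchical increment grids indexed by
    multi-indices l = (l_1, ..., l_d) with |l|_1 <= n + d - 1; the
    increment for 1-D level l_i contributes 2**(l_i - 1) points.
    """
    total = 0
    for l in product(range(1, n + 1), repeat=d):
        if sum(l) <= n + d - 1:
            pts = 1
            for li in l:
                pts *= 2 ** (li - 1)  # points in this anisotropic subgrid
            total += pts
    return total

def full_grid_points(n, d):
    """Count interior points of the corresponding full grid: (2**n - 1)**d."""
    return (2 ** n - 1) ** d
```

For level 5 in two dimensions this gives 129 sparse-grid points versus 961 full-grid points; as the dimension grows, the full grid explodes exponentially while the sparse grid grows far more slowly, which is exactly what makes the discretization feasible in high dimensions.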
In recent years, Sparse Grids have been used successfully in the prudsys DISCOVERER classification tool. So far, however, the maximum number of dimensions was bounded at about 20. After years of fundamental research, a convergent combination of the Sparse Grid technique with dimension-based adaptivity has been achieved for regression problems. Here, adaptive error estimators drive automatic grid refinement along the required dimensions.
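The combination technique underlying such approaches has a classical, non-adaptive form in which the sparse grid solution is assembled from anisotropic full-grid solutions with fixed coefficients. The sketch below computes those coefficients; the function name is hypothetical, and it shows only the standard formula, not the dimension-adaptive refinement with error estimators described above:

```python
from itertools import product
from math import comb

def combination_coefficients(n, d):
    """Coefficients of the classical combination technique at level n in d dims.

    Each anisotropic full grid with level multi-index l, |l|_1 = n + d - 1 - q
    for q = 0, ..., d-1, enters with coefficient (-1)**q * C(d-1, q).
    Returns a dict mapping each multi-index l to its coefficient.
    """
    coeffs = {}
    for q in range(d):
        s = n + d - 1 - q          # target |l|_1 on this diagonal
        c = (-1) ** q * comb(d - 1, q)
        for l in product(range(1, s + 1), repeat=d):
            if sum(l) == s:
                coeffs[l] = c
    return coeffs
```

A quick sanity check: the coefficients always sum to 1, so the combined solution reproduces constants; for n = 3, d = 2 the diagonal |l|_1 = 4 enters with +1 and the diagonal |l|_1 = 3 with -1.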
Sparse Grids now reach 30 to 50 dimensions for millions of data points.
The adaptive Sparse Grid method will be included in the new version of the prudsys DISCOVERER next year.