Making accurate forecasts or predictions is a challenging task in the big data era, particularly for datasets that involve high-dimensional variables but only short-term time series, and such datasets are omnipresent in many fields. In this work, a model-free framework named “randomly distributed embedding” (RDE) is proposed to accurately predict future dynamics from such short-term but high-dimensional data. The RDE framework derives distribution information from the interactions among high-dimensional variables to compensate for the lack of time points in real applications. Instead of roughly predicting a single trial of future values, the framework achieves accurate prediction by using this distribution information.
Future state prediction for nonlinear dynamical systems is a challenging task, particularly when only a few time series samples of high-dimensional variables are available from real-world systems. In this work, we propose a model-free framework, named randomly distributed embedding (RDE), to achieve accurate future state prediction based on short-term high-dimensional data. Specifically, from the observed data of high-dimensional variables, the RDE framework randomly generates a sufficient number of low-dimensional “nondelay embeddings” and maps each of them to a “delay embedding,” which is constructed from the data of the target variable to be predicted. Each of these mappings can serve as a low-dimensional weak predictor of the future state, and together the mappings generate a distribution of predicted future states. This distribution patches together, in an unbiased or biased manner, the pieces of association information from the various embeddings into the whole dynamics of the target variable; once processed by an appropriate estimation strategy, it yields a stronger predictor that is more reliable and robust. By applying the RDE framework to data from both representative models and real-world systems, we reveal that high dimensionality is no longer an obstacle but a source of information crucial to accurate prediction from short-term data, even under noise deterioration.
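The idea of many random low-dimensional weak predictors combined into a distribution of predicted states can be illustrated with a minimal Python sketch. This is not the authors' implementation, and several choices are assumptions made only for illustration: each weak predictor is a least-squares affine map from a random tuple of variables at time t to the target at time t + 1 (a stand-in for whatever fitting method is actually used), the aggregation strategy is the median of the predicted distribution, and the names `rde_predict` and `coupled_logistic`, along with all parameter values, are hypothetical. The toy data come from a ring of coupled logistic maps chosen only to provide a short, high-dimensional series.

```python
import numpy as np


def coupled_logistic(n_vars=20, n_steps=30, eps=0.2, seed=0):
    """Toy data (assumption): a ring of diffusively coupled logistic maps."""
    rng = np.random.default_rng(seed)
    X = np.empty((n_steps, n_vars))
    X[0] = rng.uniform(0.1, 0.9, n_vars)
    for t in range(n_steps - 1):
        f = 3.9 * X[t] * (1.0 - X[t])
        # Each unit mixes its own update with those of its two ring neighbours.
        X[t + 1] = (1 - eps) * f + 0.5 * eps * (np.roll(f, 1) + np.roll(f, -1))
    return X


def rde_predict(X, target, embed_dim=3, n_embeddings=200, seed=0):
    """One-step-ahead prediction of X[:, target] from short, wide data.

    X has shape (m, n): m time points, n variables.
    Returns (estimate, preds); preds holds one prediction per random embedding.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    y_next = X[1:, target]  # target value one step ahead, length m - 1
    preds = np.empty(n_embeddings)
    for i in range(n_embeddings):
        # Random nondelay embedding: a random tuple of embed_dim variables.
        idx = rng.choice(n, size=embed_dim, replace=False)
        A = np.column_stack([X[:, idx], np.ones(m)])  # affine design matrix
        # Weak predictor (assumption): least-squares affine map x(t) -> y(t+1).
        coef, *_ = np.linalg.lstsq(A[:-1], y_next, rcond=None)
        # Apply the fitted map to the last observed state.
        preds[i] = A[-1] @ coef
    # Aggregation (assumption): median of the distribution of predictions.
    return float(np.median(preds)), preds


if __name__ == "__main__":
    data = coupled_logistic(n_vars=20, n_steps=31, seed=1)
    train, truth = data[:-1], data[-1, 0]  # hold out the final time point
    estimate, preds = rde_predict(train, target=0)
    print("aggregated prediction:", estimate)
    print("held-out true value:  ", truth)
    print("spread of weak predictions (std):", preds.std())
```

The spread of `preds` gives a rough sense of how much the individual weak predictors disagree. A nonlinear fitting method and a weighted or density-based estimate could replace the affine map and the median without changing the overall structure of the sketch.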
Ma, H., Leng, S., Aihara, K., Lin, W., & Chen, L. (2018). Randomly distributed embedding making short-term high-dimensional data predictable. Proceedings of the National Academy of Sciences, 201802987. https://doi.org/10.1073/pnas.1802987115