Nowadays machine learning tasks deal with sheer volumes of data of a possibly incomplete, decentralized, and streaming nature, which necessitates on-the-fly processing for real-time decision making. Conventional inference analytics mine such "Big Data" by leveraging their intrinsic parsimony, e.g., via models that include rank and sparsity regularization or priors. Convex nuclear- and ℓ1-norm surrogates are typically adopted and offer well-documented guarantees in recovering informative low-dimensional structure from high-dimensional data. However, the computational complexity of the resulting algorithms tends to scale poorly due to the nuclear norm's entangled structure, which also hinders streaming and decentralized analytics. To overcome this computational challenge, this chapter discusses a framework that leverages a bilinear characterization of the nuclear norm to bring separability at the expense of nonconvexity. This challenge notwithstanding, under mild conditions stationary points of the nonconvex program provably coincide with the optimum of the convex counterpart. Using this idea along with the theory of alternating minimization, lightweight algorithms with low communication overhead are developed for in-network processing. Provably convergent online subspace trackers that are suitable for streaming analytics are developed as well. Remarkably, even under the constraints imposed by decentralized computing and sequential data acquisition, one can still attain the performance offered by the prohibitively complex batch analytics.
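The bilinear characterization referred to above is the well-known variational identity ‖X‖* = min over factorizations X = LRᵀ of (1/2)(‖L‖F² + ‖R‖F²), whose minimum is attained at the SVD-based factors L = U√S and R = V√S. The sketch below (a minimal NumPy illustration, not code from the chapter) numerically checks this identity on a random low-rank matrix:

```python
import numpy as np

# Illustration of the bilinear characterization of the nuclear norm:
#   ||X||_* = min_{X = L R^T} (1/2) (||L||_F^2 + ||R||_F^2),
# attained at L = U sqrt(S), R = V sqrt(S) from the SVD X = U S V^T.
# The Frobenius-norm terms are separable across columns/rows, which is
# what enables decentralized and streaming algorithms.

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4)) @ rng.standard_normal((4, 5))  # rank <= 4

# Nuclear norm via singular values.
s = np.linalg.svd(X, compute_uv=False)
nuclear_norm = s.sum()

# Bilinear factors built from the SVD attain the minimum exactly.
U, S, Vt = np.linalg.svd(X, full_matrices=False)
L = U * np.sqrt(S)      # scale columns of U by sqrt of singular values
R = Vt.T * np.sqrt(S)   # scale columns of V likewise

bilinear_value = 0.5 * (np.linalg.norm(L, "fro") ** 2
                        + np.linalg.norm(R, "fro") ** 2)

assert np.allclose(X, L @ R.T)                 # valid factorization
assert np.isclose(nuclear_norm, bilinear_value)  # identity holds
```

Because the Frobenius norms decompose across the rows of L and R, each network node or each streamed datum touches only its own factor block, which is the separability the chapter exploits.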
Mardani, M., Mateos, G., & Giannakis, G. B. (2018). Big Data. In Cooperative and Graph Signal Processing: Principles and Applications (pp. 777–797). Elsevier. https://doi.org/10.1016/B978-0-12-813677-5.00030-4