Fine-granularity parallel architectures for the efficient estimation of higher-order statistics (HOS) are systematically derived in this paper. A unified methodology for constructing locally recursive algorithms and space-time linear mapping operators, which lead to highly pipelined architectures consisting of multiple tightly coupled array stages, is discussed first. Then a farm of processors is synthesized that consumes second- and fourth-order moment estimates to produce the fourth-order cumulants. The unified array synthesis methodology allows all valid solutions to be characterized and closed-form expressions to be derived for the permissible linear scheduling functions, thus facilitating the search for a design instance that meets the architect's specified objectives. Achieving minimum latency and an optimal space-time matching between the farm and the moments generator architecture were the two main specifications driving the synthesis. A linear array solution that is simpler to interface with the moments generator, at the expense of some added control complexity, is also derived. As a result, a two-stage integrated VLSI architecture that can accept data samples from the host and compute in real time all non-redundant moment and cumulant terms up to the fourth order is now possible.
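To make the moments-to-cumulants step concrete, the following is a minimal sequential sketch of what the processor farm computes for one lag triple. It is not the parallel architecture itself. It assumes a real, zero-mean, stationary sequence and uses the standard textbook relation between the fourth-order cumulant and the second- and fourth-order moments; the abstract does not state the formula, and the function names `moment2`, `moment4`, and `cumulant4` are hypothetical.

```python
def moment2(x, t):
    """Biased sample estimate of m2(t) = E[x(n) x(n+t)] (assumes zero mean)."""
    n = len(x)
    t = abs(t)  # m2 is symmetric in its lag for a real stationary sequence
    return sum(x[i] * x[i + t] for i in range(n - t)) / n


def moment4(x, t1, t2, t3):
    """Biased sample estimate of m4(t1,t2,t3) = E[x(n) x(n+t1) x(n+t2) x(n+t3)]."""
    n = len(x)
    # Restrict n to the range where all four shifted indices are valid.
    lo = max(0, -t1, -t2, -t3)
    hi = min(n, n - t1, n - t2, n - t3)
    return sum(x[i] * x[i + t1] * x[i + t2] * x[i + t3]
               for i in range(lo, hi)) / n


def cumulant4(x, t1, t2, t3):
    """Fourth-order cumulant of a zero-mean sequence from its moments:
    c4 = m4(t1,t2,t3) - m2(t1) m2(t3-t2) - m2(t2) m2(t3-t1) - m2(t3) m2(t2-t1).
    """
    return (moment4(x, t1, t2, t3)
            - moment2(x, t1) * moment2(x, t3 - t2)
            - moment2(x, t2) * moment2(x, t3 - t1)
            - moment2(x, t3) * moment2(x, t2 - t1))


if __name__ == "__main__":
    samples = [1.0, -1.0, 1.0, -1.0]  # toy zero-mean sequence
    print(cumulant4(samples, 0, 0, 0))
```

Each cumulant term needs one fourth-order moment and six second-order moments, which suggests why matching the farm's schedule to the order in which the moments generator emits its outputs is a natural design objective.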
Manolakos, E. S., & Stellakis, H. M. (2000). Systematic synthesis of parallel architectures for the computation of higher order cumulants. Parallel Computing, 26(5), 655–676. https://doi.org/10.1016/S0167-8191(99)00125-8