Bayesian entropy estimators for spike trains

  • Park I
  • Archer E
  • Pillow J

This article is free to access.


Information-theoretic quantities have played a central role in neuroscience for quantifying neural codes [1]. Entropy and mutual information can be used to measure the maximum encoding capacity of a neuron, to quantify noise, spatial and temporal functional dependence, and learning, and to provide fundamental limits for neural coding. Unfortunately, estimating entropy or mutual information is notoriously difficult, especially when the number of observations N is less than the number of possible symbols K [2]. For neural spike trains, this is often the case due to the combinatorial nature of the symbols: for n simultaneously recorded neurons on m time bins, the number of possible symbols is K = 2^{n+m}. The question is therefore how to extrapolate from a severely under-sampled distribution. Here we describe two recent advances in Bayesian entropy estimation for spike trains. Our approach follows that of Nemenman et al. [2], who formulated a Bayesian entropy estimator using a mixture-of-Dirichlet prior over the space of discrete distributions on K bins. We extend this approach to formulate two Bayesian estimators with different strategies for dealing with severe under-sampling. For the first estimator, we design a novel mixture prior over countable distributions using the Pitman-Yor (PY) process [3]. The PY process is useful when the number of parameters is unknown a priori, and as a result it finds many applications in Bayesian nonparametrics. The PY process can model the heavy, power-law distributed tails that often occur in neural data. To reduce the bias of the estimator, we analytically derive a set of mixing weights so that the resulting improper prior over entropy is approximately flat. We consider the posterior over entropy given a dataset (which contains some observed number of words but an unknown number of unobserved words), and show that the posterior mean can be efficiently computed via a simple numerical integral.
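As a rough illustration of the Dirichlet building block behind the mixture prior described above, the sketch below compares the plug-in entropy estimate with the posterior mean of entropy under a single symmetric Dirichlet prior on K bins. It is not the mixture-of-Dirichlet estimator of [2], nor the PY estimator: the concentration alpha is held fixed rather than integrated over, and the function names, the toy word list, and the values of K and alpha are assumptions made for illustration only.

```python
import math
from collections import Counter


def digamma(x):
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:  # shift x upward until the asymptotic series is accurate
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def plugin_entropy(counts):
    """Maximum-likelihood ("plug-in") entropy estimate in nats."""
    n = sum(counts)
    return -sum(c / n * math.log(c / n) for c in counts if c > 0)


def dirichlet_posterior_mean_entropy(counts, num_symbols, alpha):
    """Posterior mean of entropy (nats) under a symmetric Dirichlet(alpha) prior.

    `counts` lists the counts of observed symbols only; the remaining
    num_symbols - len(counts) symbols are treated as having zero count.
    """
    n = sum(counts)
    total = n + num_symbols * alpha
    unseen = num_symbols - len(counts)
    acc = sum((c + alpha) * digamma(c + alpha + 1.0) for c in counts)
    acc += unseen * alpha * digamma(alpha + 1.0)
    return digamma(total + 1.0) - acc / total


# Toy data (hypothetical): 3-bin binary words, so K = 8 possible symbols.
words = ["000", "000", "010", "000", "100", "000", "010", "000"]
counts = list(Counter(words).values())
print("plug-in:", plugin_entropy(counts))
print("Dirichlet posterior mean:", dirichlet_posterior_mean_entropy(counts, 8, 0.5))
```

With few samples, the posterior mean depends strongly on the choice of alpha, which is the motivation for mixing over the concentration parameter rather than fixing it.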
The second estimator incorporates prior knowledge about spike trains. We use a simple Bernoulli process as a parametric model of the spike trains, and use a Dirichlet process to allow arbitrary deviations from the Bernoulli process. Under this model, very sparse spike trains are a priori orders of magnitude more likely than those with many spikes. Both estimators are computationally efficient and statistically consistent. We applied these estimators to spike trains from the early visual system to quantify neural coding characteristics.
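To make the sparsity claim concrete, the following sketch evaluates a Bernoulli base measure over binary words and the standard Dirichlet-process posterior predictive built on it. It only illustrates the prior structure and is not the entropy estimator itself; the spike probability p, the concentration alpha, the toy observations, and the function names are all hypothetical.

```python
from collections import Counter
from itertools import product


def bernoulli_word_prob(word, p):
    """Prior probability of a binary word when each bin spikes independently with probability p."""
    spikes = sum(word)
    return p ** spikes * (1.0 - p) ** (len(word) - spikes)


def dp_predictive(word, counts, total, alpha, p):
    """Posterior predictive probability of `word` under a Dirichlet process
    with concentration alpha and the Bernoulli model as base measure."""
    return (counts.get(word, 0) + alpha * bernoulli_word_prob(word, p)) / (total + alpha)


# Toy data (hypothetical): four observed 4-bin words.
observed = [(0, 0, 0, 0), (0, 0, 0, 0), (0, 1, 0, 0), (0, 0, 0, 0)]
counts = Counter(observed)
n_obs = len(observed)
p, alpha = 0.05, 2.0  # assumed values, chosen only for illustration

silent, dense = (0, 0, 0, 0), (1, 1, 1, 1)
# Ratio of prior probabilities: the silent word versus the all-spike word.
print(bernoulli_word_prob(silent, p) / bernoulli_word_prob(dense, p))

# Unobserved words keep nonzero probability, and the predictive sums to one.
print(dp_predictive(dense, counts, n_obs, alpha, p))
print(sum(dp_predictive(w, counts, n_obs, alpha, p) for w in product((0, 1), repeat=4)))
```

Because the observed counts are added to the base measure, the data can pull the predictive distribution arbitrarily far from the Bernoulli model as the sample grows.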




Park, I. M., Archer, E., & Pillow, J. (2013). Bayesian entropy estimators for spike trains. BMC Neuroscience, 14(S1).
