Papers in this group

  1. Problems that involve interacting with humans, such as natural language understanding, have not proven to be solvable by concise, neat formulas like F = ma. Instead, the best approach appears to be to embrace the complexity of the domain and address…
  2. We demonstrate the utility of massively parallel computational infrastructure for statistical computing using the MapReduce paradigm for R. This framework allows users to write computations in a high-level language that are then broken up and…
  3. The use of simulation for high-dimensional intractable computations has revolutionized applied mathematics. Designing, improving and understanding the new tools leads to (and leans on) fascinating mathematics, from representation theory through…
  4. Many recent statistical applications involve inference under complex models, where it is computationally prohibitive to calculate likelihoods but possible to simulate data. Approximate Bayesian Computation (ABC) is devoted to these complex models…
  5. Identifying interesting relationships between pairs of variables in large data sets is increasingly important. Here, we present a measure of dependence for two-variable relationships: the maximal information coefficient (MIC). MIC captures a wide…
  6. This concise book introduces you to several strategies for using R to analyze large datasets. You’ll learn the basics of Snow, Multicore, Parallel, and some Hadoop-related tools, including how to find them, how to use them, when they work well, and…
  7. Overconfidence may be advantageous because it serves to increase ambition, morale, resolve, persistence or the credibility of bluffing, generating a self-fulfilling prophecy in which exaggerated confidence actually increases the probability of…
  8. We discuss and compare measures of accuracy of univariate time series forecasts. The methods used in the M-competition and the M3-competition, and many of the measures recommended by previous authors on this topic, are found to be degenerate in…
  9. Reliable and accurate prediction of time series over large future horizons has become the new frontier of the forecasting discipline. Current approaches to long-term time series forecasting rely either on iterated predictors, direct predictors or,…
  10. We give an overview of some of the software tools available in R, either as built-in functions or contributed packages, for the analysis of state space models. Several illustrative examples are included, covering constant and time-varying models…
  11. genoud is an R function that combines evolutionary algorithm methods with a derivative-based (quasi-Newton) method to solve difficult optimization problems. genoud may also be used for optimization problems for which derivatives do not exist. genoud…
  12. Computational techniques based on simulation have now become an essential part of the statistician's toolbox. It is thus crucial to provide statisticians with a practical understanding of those methods, and there is no better way to develop…
  13. This tutorial gives a practical introduction to creating R packages. We discuss how object oriented programming and S formulas can be used to give R code the usual look and feel, how to start a package from a collection of R functions, and how to…
  14. Text mining has gained great interest both in academic research and in business intelligence applications within the last decade. There is an enormous amount of textual data available in machine-readable format which can be easily accessed via the…
  15. Data mining delivers insights, patterns, and descriptive and predictive models from the large amounts of data available today in many organisations. The data miner draws heavily on methodologies, techniques and algorithms from statistics, machine…
  16. This book is very different from any other publication in the field and it is unique because of its focus on the practical implementation of the simulation and estimation methods presented. The book should be useful to practitioners and students…
  17. Automatic forecasts of large numbers of univariate time series are often needed in business and other contexts. We describe two automatic forecasting algorithms that have been implemented in the forecast package for R. The first is based on…
