software


**SOFTWARE**

Taken from: http://xkcd.com/628/ Uniform distribution: [|Wikipedia article]

=MATLAB=

The course makes heavy use of the MATLAB programming language. The following pages are recommended for students who want to learn the language on their own:


 * The Mathworks. MATLAB 7. Getting started guide. 2008. URL: http://www.mathworks.com/access/helpdesk/help/pdf_doc/matlab/getstart.pdf
 * Videos that teach MATLAB. URL: http://www.mathworks.com/demos/matlab/getting-started-with-matlab-video-tutorial.html
 * MATLAB video demonstrations. URL: http://www.mathworks.com/products/matlab/demos.html
 * Documentation for the MATLAB Statistics Toolbox: http://www.mathworks.com/access/helpdesk/help/toolbox/stats/
 * Distribution functions in MATLAB: http://www.mathworks.com/access/helpdesk/help/toolbox/stats/index.html?/access/helpdesk/help/toolbox/stats/cdf.html
 * MATLAB video tutorials in Spanish: http://matlablatino.blogspot.com/ **(RECOMMENDED!)**

=GNU OCTAVE=

GNU Octave is free software for numerical computation. MATLAB is considered its commercial equivalent, and the two programming languages are quite similar.


 * Available at: http://www.gnu.org/software/octave/
 * Octave Wiki: http://wiki.octave.org/
 * Manual: http://www.network-theory.co.uk/octave/manual/
 * Octave Online: http://hara.mimuw.edu.pl/weboctave/
 * Wikipedia page: http://en.wikipedia.org/wiki/GNU_Octave
 * Octave statistical functions: http://www.gnu.org/software/octave/doc/interpreter/Statistics.htm
 * Another statistics toolbox for Octave: http://laris.fesb.hr/digitalno_vodjenje/octave/doc/octave_26.html

=R-project=

R is a programming language and environment for statistical analysis and graphics. It is free software.

It is available at: http://www.r-project.org/

Wikipedia page: http://en.wikipedia.org/wiki/R_(programming_language)

Documentation in Spanish

 * [|R para Principiantes], the Spanish version of “R for Beginners”, translated by Jorge A. Ahumada (PDF).
 * [|Spanish version of “An Introduction to R”] by Andrés González and Silvia González (PDF).
 * [|Estadística Básica con R y R-Commander] (free book) at knuth.uca.es.
 * [|Gráficos Estadísticos con R] by Juan Carlos Correa and Nelfi González (PDF).
 * [|Cartas sobre Estadística de la Revista Argentina de Bioingeniería] by Marcelo R. Risk (PDF).
 * [|Introducción al uso y programación del sistema estadístico R] by Ramón Díaz-Uriarte, slides prepared for a 16-hour course on R aimed mainly at biologists and bioinformatics specialists (PDF).

=MAXIMA=

Maxima is free software for manipulating symbolic and numerical expressions, including differentiation, integration, Taylor series expansion, ordinary differential equations, systems of linear equations, and vectors, matrices and tensors. Maxima produces high-precision results using exact fractions and arbitrary-precision floating-point arithmetic. These properties make it useful for solving certain problems in solid mechanics. The following pages are recommended for students who want to learn it on their own:


 * [|Official MAXIMA page] (where it can be downloaded)
 * [|MAXIMA documentation]
 * [|Video tutorials by Javier Arántegui] **(RECOMMENDED)**

=MS EXCEL 2007= (in probability and statistics)


 * [|Microsoft help page on the statistical functions of MS EXCEL]
 * [|MS EXCEL 2007 statistical functions organized by topic]
 * [|Using the distribution functions in MS EXCEL 2007]


**WARNING ABOUT THE USE OF MS EXCEL 2007 IN PROBABILITY AND STATISTICS**

MS EXCEL is one of the most widely used numerical tools in the world; however, students of probability and statistics should be aware that **//MS EXCEL is __NOT__ a suitable package for serious statistical/probabilistic calculations when the results matter//**. It is preferable to use software such as [|GNUmeric], [|R] or [|MATLAB] for such calculations.

Links to pages that support this statement:
 * [|MS EXCEL 2000, 2003 and 2007: faults, problems, workarounds and fixes]


 * The journal **Computational Statistics & Data Analysis, Volume 52, Issue 10, Pages 4567-4878 (15 June 2008)** includes a special section on Microsoft Excel 2007, edited by B.D. McCullough. The section contains the following articles:

**B.D. McCullough //Special section on Microsoft Excel 2007// Pages 4568-4569** DOI: []

**B.D. McCullough, David A. Heiser //On the accuracy of statistical procedures in Microsoft Excel 2007// Pages 4570-4578** DOI: [] FROM THE CONCLUSIONS: Since it is generally impossible for the average user to distinguish between the accurate and inaccurate Excel functions, the only safe course for Excel users is to rely on no statistical procedure in Excel unless Microsoft provides tangible evidence of its accuracy consisting of:

1. test data with known input and output;

2. description of the algorithm sufficient to permit a third party to use the test data to reproduce Excel’s output; and

3. a bona fide reference for the algorithm.

If Microsoft does not perform these actions for each statistical procedure in Excel then there are only two safe alternatives for the user who is concerned about the accuracy of his statistical results: the user can perform all these actions himself, or simply not use Excel.
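The first requirement quoted above, test data with known input and output, can be illustrated with a small reference-value check. The following is only a sketch in Python using the standard library: the names `normal_cdf`, `REFERENCE_CASES` and `check_against_reference` are hypothetical, and the reference values are widely tabulated standard normal probabilities rounded to ten digits, not data taken from the cited article.

```python
import math

def normal_cdf(x):
    """Standard normal CDF computed from the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

# (input, expected output) pairs: tabulated standard normal probabilities.
# These values are illustrative test data, not taken from the cited paper.
REFERENCE_CASES = [
    (0.0, 0.5),
    (1.0, 0.8413447461),
    (-1.0, 0.1586552539),
    (1.959963984540054, 0.975),
]

def check_against_reference(func, cases, abs_tol=1e-9):
    """Return the (input, expected, obtained) triples where func disagrees."""
    failures = []
    for x, expected in cases:
        got = func(x)
        if abs(got - expected) > abs_tol:
            failures.append((x, expected, got))
    return failures

failures = check_against_reference(normal_cdf, REFERENCE_CASES)
print("all reference cases pass" if not failures else failures)
```

The same table of cases could be typed into any spreadsheet or statistics package to compare its output against the expected column.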

**A. Talha Yalta //The accuracy of statistical distributions in Microsoft® Excel 2007// Pages 4579-4586** DOI: [] CONCLUSIONS: It is our understanding that the algorithms for the computation of various statistical distributions in Excel 2007 can be inaccurate and/or unstable, and therefore can be unsafe to use. In particular, for the binomial, Poisson, inverse standard normal, inverse beta, inverse student’s t, and inverse F distributions, it is possible to obtain results with zero accurate digits. Our results also show that the alternative Gnumeric and OpenOffice.org Calc programs, which employ dissimilar subroutines for the computation of statistical distributions, provide better accuracy in general in comparison to Excel 2007. In particular, Gnumeric can uniformly return exact values with at least six digits of accuracy for probabilities as small as 10^−300 in all of our tests except one, and this is already fixed within a few weeks after we contacted the developers about the problem. Calc has important numerical difficulties for the computation of the quantiles of various distributions including the inverse chi-square, inverse beta, inverse t, and inverse F distributions. Once notified about the problems, Calc developers expressed their intention to correct these flaws with the upcoming OpenOffice.org version 2.4.

A new trend in computing is Web-based applications, which facilitate the collaborative creation and modification of documents over the Internet in real time. The recently introduced Google Spreadsheet, a part of the Google Docs online service, is one such application competing with the microcomputer software evaluated in this study. Our cursory examination of Google Spreadsheet finds gross errors in the accuracy of the standard normal, binomial, hypergeometric, and Poisson distributions. Consequently, a thorough evaluation of this application is necessary to help researchers and practitioners make the decision whether to move from the PC to the grid.

It is a known fact that Excel is commonly used in a wide range of decision making processes from options trading to research in physical laboratories. Offering statistical functionality in a computer program is a serious matter and it brings important responsibilities to the software vendor. Microsoft has repeatedly shown its lack of interest to this concernment by releasing new versions of Excel without first correcting the problems documented by different authors on various different occasions. Because of Microsoft’s lack of commitment to accuracy, it is now possible to find on the Internet various users’ custom scripts and macros for proper computation of statistical distributions in Excel. It is unclear when, if at all, Microsoft will properly fix Excel’s inaccurate procedures for all of which there are free, well-known, and reliable alternatives. Meanwhile, researchers should continue to avoid using the statistical functions in Excel 2007 for any scientific purpose.
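To see why tail probabilities of distributions such as the Poisson are numerically delicate, the sketch below compares a direct evaluation of the textbook formula with an evaluation in log space. It is written in Python with the standard library only and is an illustration of the general numerical issue, not a reconstruction of the algorithm used by Excel or by any other program mentioned above; the function names and the values `lam = 200` and `k = 400` are hypothetical choices.

```python
import math

def poisson_pmf_naive(k, lam):
    """Textbook formula exp(-lam) * lam**k / k! evaluated directly in floats."""
    return math.exp(-lam) * float(lam) ** k / math.factorial(k)

def poisson_pmf_log(k, lam):
    """Same formula evaluated in log space to avoid overflow of lam**k and k!."""
    log_p = -lam + k * math.log(lam) - math.lgamma(k + 1)
    return math.exp(log_p)

lam, k = 200.0, 400

try:
    naive = poisson_pmf_naive(k, lam)
except OverflowError:
    # 200.0 ** 400 is far outside the double-precision range.
    naive = None

stable = poisson_pmf_log(k, lam)

print("naive: ", naive)
print("stable:", stable)
```

The direct formula fails because an intermediate term overflows even though the final probability is representable; working with logarithms keeps every intermediate quantity in range.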

**B.D. McCullough //Microsoft Excel’s ‘Not The Wichmann–Hill’ random number generators// Pages 4587-4593** DOI: [] FROM THE CONCLUSIONS: Twice Microsoft has attempted to implement the dozen lines of code that define the Wichmann and Hill (1982) RNG, and twice Microsoft has failed, apparently not using standard methods for verifying that an RNG has been correctly implemented. Consequently, users of Excel’s “rand” function have been using random numbers from an unknown and undocumented RNG of unknown period that is not known to pass any standard tests of randomness. Given Microsoft’s first failure to implement the WH RNG correctly, Microsoft should have taken special care to ensure that it was done correctly the second time. Microsoft did not do this. The second failure demonstrates clearly that the first failure (and the second, as well) was not some sort of “unfortunate accident” but, rather, a systemic failure on Microsoft’s part.

Of course, one has to wonder why Microsoft chose an antiquated RNG instead of a modern RNG. In the first place, its period is only 2^43 which, following Ripley’s (1990) suggestion, can only support a couple of hundred thousand calls to the RNG. Further, Microsoft appealed to Marsaglia’s (1996) DIEHARD suite of randomness tests (see Microsoft Knowledge Base Article 828795) instead of the much more stringent tests offered by L’Ecuyer and Simard’s (2007) TESTU01 which has been available since 2002. Perhaps TESTU01 was released too late for Microsoft to make changes in Excel 2003, but certainly it was released in time for Microsoft to make changes in Excel 2007.
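For reference, the "dozen lines of code" mentioned in the excerpt can be sketched as follows. This is a Python sketch of the published Wichmann and Hill (1982) recurrences, assuming the usual constants of Algorithm AS 183; it is not Excel's code, and the generator name `wichmann_hill` and the seed values are illustrative. The seeds are assumed to be integers between 1 and 30000.

```python
def wichmann_hill(seed1, seed2, seed3):
    """Yield uniform floats in [0, 1) from three small multiplicative
    congruential generators combined as in Wichmann and Hill (1982)."""
    ix, iy, iz = seed1, seed2, seed3
    while True:
        # Three generators with prime moduli and different multipliers.
        ix = (171 * ix) % 30269
        iy = (172 * iy) % 30307
        iz = (170 * iz) % 30323
        # Fractional part of the sum of the three scaled states.
        yield (ix / 30269.0 + iy / 30307.0 + iz / 30323.0) % 1.0

gen = wichmann_hill(1, 1, 1)
first = next(gen)
sample = [next(gen) for _ in range(10000)]

print("first value:", first)
print("sample mean:", sum(sample) / len(sample))
```

A correct implementation can be verified in the way the excerpt suggests: start from known seeds and compare the first outputs with values computed by hand or by an independent implementation.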

**Yu-Sung Su //It’s easy to produce chartjunk using Microsoft Excel 2007 but hard to make good graphs// Pages 4594-4601** DOI: [] CONCLUSIONS: A properly-designed chart can help people get the most information out of data and an excellent graphic tool helps people to achieve this task without much labour. The purpose of default settings in a graphic tool is to make it easy to produce good graphics that accord with the principles of statistical graphics. If the defaults do not embody these principles, then the only way to produce good graphics is to be sufficiently familiar with the principles of statistical graphics ([17], [18], [3], [4], [20], [16] and [6]).

This paper has shown that the default chart types in Excel 2007 do not embody these appropriate principles. Instead, these charts create chartjunk that hinder peoples’ ability to comprehend the data. Some users have developed add-ins and instructions to redress the malfunctions of the Excel default chart settings and thereby enable users to produce better graphics ([10], [11] and [19]). Nevertheless, those who want to use Excel are advised to get to know the principles of good graphing well enough so that they know how to choose the appropriate options to override the defaults. Microsoft® should overhaul its graphics engine so that it embodies the principles of statistical graphics and makes it easy for non-experts to produce good graphs.

**John C. Nash //Teaching statistics with Excel 2007 and other spreadsheets// Pages 4602-4606** DOI: [] EXCERPT FROM THE ARTICLE: When one does finally find a tool, there is the concern about whether it does its job properly. McCullough (2004) argues that the sluggish response of Microsoft in fixing errors compromises the software. As McCullough shows, the Gnumeric programmers (who are but a few part-timers) were able to fix in several weeks the same errors that Microsoft was unable and/or unwilling to fix in several years. Excel’s statistical capabilities have been the subject of many complaints about errors and inaccuracies over many years. Finding and properly documenting such errors or weaknesses in scientific and statistical software is a difficult and tedious job, and reporting on them takes many pages of highly technical detail. Our personal and professional gratitude is owed to those who carry out such studies, such as those reported by McCullough (2004), McCullough and Wilson (2005) and McCullough (2008) and the apparently ongoing and large study by David Heiser (http://www.daheiser.info/). From the point of view of accuracy, my opinion is that Excel 2007 provides sufficient accuracy for most of the tasks in elementary statistics courses, but that it is very poor pedagogy to teach students to use a tool that is inadequate for “real-life” use.

CONCLUSIONS: Spreadsheets are powerful tools for end-user computing (see, for instance, the program of the EuSpRIG 2004 conference at http://eusprig.org/eusprig-2004-conference-programme.pdf). However, the developers and vendors have vastly different agendas from statistics teachers, so we should not expect them to be well suited to our needs in the classroom. Most of us need tools that work well and offer clean, unambiguous interfaces. This means that for most spreadsheet applications I will use Gnumeric. However, for statistics, I used to use Minitab. With more students using Macintosh and Linux where a bundled textbook/Minitab package has not been available to us, I now would use the open-source R, either natively or via the Rcmdr interface.

=DECISION MAKING SOFTWARE=

http://www.visionarytools.com/Handbook.htm#