Frequency analysis challenge – a console-based game for R/python

Six months ago we’ve introduced ‚The Proton’ – a console based R game with six data wrangling puzzles. Around 15-30 minutes of fun with data. The game is on CRAN in the package BetaBit.

And just few days ago we’ve added a second game – frequon(). Eight puzzles related with frequency analysis of encoded messages.

It’s much harder than proton.
Expect around two hours of playing with ciphers.
Try it yourself. To get the R version just type

install.packages("BetaBit")
library("BetaBit")
frequon()

You can also try the experimental python version.

pip install --upgrade https://github.com/BetaAndBit/BetaBitPython/archive/master.tar.gz

If you like these games and going to attend useR2016 (June, Stanford, USA) or eRum2016 (October, Poznań, Poland) feel free to ping me (Przemyslaw.Biecek).

All your models belong to us: how to combine package archivist and function trace()

Let’s see how to collect all linear regression models that you will ever create in R.

It’s easy with the trace() function. A really powerful, yet not that popular function, that allows you to inject any R code in any point of a body of any function.
Useful in debugging and have other interesting applications.
Below I will show how to use this function to store a copy of every linear model that is created with lm(). In the same way you may store copies of plots/other models/data frames/anything.

To store a persistent copy of an object one can simply use the save() function. But we are going to use the archivist package instead. It stores objects in a repository and give you some nice features, like searching within repository, sharing the repository with other users, checking session info for a particular object or restoring packages to versions consistent with a selected object.

To use archivist with the trace() function you just need to call two lines. First one will create an empty repo, and the second will execute ‘saveToLocalRepo()’ at the end of each call to the lm() function.

Now, at the end of every lm() function the fitted model will be stored in the repository.
Let’s see this in action.

All models are stored as rda files in a disk based repository.
You can load them to R with the asearch() function.
Let’s get all lm objects, apply the AIC function to each of them and sort along AIC.

The aread() function will download the selected model.

Now you can just create model after model and if needed they all can be restored.

Read more about the archivist here: http://pbiecek.github.io/archivist/.

Call for Papers: eRum 2016 (European R users meeting)

6_edited

The European R users meeting (eRum) is an international conference that aims at integrating users of the R language. eRum 2016 will be held on October 13 and 14, 2016, in Poznan, Poland at the Poznan University of Economics and Business. We already confirm the following invited speakers: Rasmus Bååth, Romain Francois, Ulrike Grömping, Matthias Templ, Heather Turner, Przemysław Biecek, Marek Gągolewski, Jakub Glinka, Katarzyna Kopczewska and Katarzyna Stąpor.

We would like to bring together participants from around the world. It will be a good chance to exchange experiences, broaden knowledge of R and collaborate. The conference will cover topics including:

• Bayesian Statistics,
• Bioinformatics,
• Economics, Finance and Insurance,
• High Performance Computing,
• Reproducible Research,
• Industrial Applications,
• Statistical Learning with Big Data,
• Spatial Statistics,
• Teaching,
• Visualization & Graphics,
• and many more.

We invite you to participate in eRum 2016:
(1) with a regular oral presentation,
(2) with a lightning talk,
(3) with a poster presentation,
(4) or without a presentation or poster.

Due to limited space at the conference venue, the organizers have set a limit for the number of participants at 250 and persons with regular/lighting talks/posters will be considered first and those attending without a presentation or poster will be handled on a first-come, first-served basis.

Please make your submission online at http://erum.ue.poznan.pl/#register. The submission deadline is June 15, 2016. Submitters will be notified via email by July 1, 2016 of acceptance. Additional details will be announced via the eRum conference website.

Why should you backup your R objects?

There is a saying that there are two groups of people: those who are already doing backups and those who will. So, how this is linked with reproducible research and R?

If your work is to analyze data then you often face a need to restore/recreate/update results that you have generated some time ago.
You may think ,,I have a knitr reports for everything!”. That’s great! It will save you a lot of troubles. But to have 100% of warranty for exactly same results you need to have exactly the same environment and same versions of packages.

Do you know how many R packages have been updated during last 12 months?

I took list of top 20 R packages from here, scrap dates of their current and older CRAN releases from here and generate a plot with dates of submissions to CRAN sorted along date of last submission.

Czytaj dalej Why should you backup your R objects?