All your models belong to us: how to combine the archivist package and the trace() function

Let’s see how to collect all linear regression models that you will ever create in R.

It’s easy with the trace() function: a really powerful, yet not that popular function that lets you inject any R code at any point in the body of any function.
It is useful in debugging and has other interesting applications.
Below I will show how to use this function to store a copy of every linear model that is created with lm(). In the same way you may store copies of plots, other models, data frames, or anything else.
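Before bringing in any extra package, here is a minimal base-R sketch of the idea. It relies on the fact that lm() keeps the fitted model in a local variable called z; the environment name model_store and the model_1, model_2 naming scheme are made up for this illustration.

```r
# Hypothetical in-memory store for the collected models (not part of any package).
model_store <- new.env()

# Inject code that runs when lm() exits. Inside lm() the fitted model lives in
# the local variable `z`, so the injected expression can copy it to the store.
trace(lm, exit = quote(
  assign(paste0("model_", length(ls(model_store)) + 1), z, envir = model_store)
), print = FALSE)

fit1 <- lm(Sepal.Length ~ Petal.Length, data = iris)
fit2 <- lm(Sepal.Length ~ Petal.Width, data = iris)

# Remove the injected code and restore the original lm().
untrace(lm)

# Names of the models collected while tracing was active.
ls(model_store)
```

The copies disappear with the R session, which is why a persistent store is the natural next step.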

To store a persistent copy of an object one can simply use the save() function. But we are going to use the archivist package instead. It stores objects in a repository and gives you some nice features, like searching within the repository, sharing the repository with other users, checking the session info for a particular object, or restoring packages to versions consistent with a selected object.

To use archivist with the trace() function you just need two lines of code. The first creates an empty repo, and the second executes saveToLocalRepo() at the end of each call to the lm() function.
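A sketch of what these two lines might look like, assuming archivist's createLocalRepo() and saveToLocalRepo() and relying on lm() holding its result in a local variable z. The repository location under tempdir() is an assumption made for this example only; any directory would do.

```r
library(archivist)

# Assumption: keep the repository in a temporary directory for this sketch.
repo <- file.path(tempdir(), "allModels")

# Line 1: create an empty repository and make it the default one.
createLocalRepo(repoDir = repo, default = TRUE)

# Line 2: when lm() exits, archive the fitted model (the local variable `z`).
trace(lm, exit = bquote(saveToLocalRepo(z, repoDir = .(repo))), print = FALSE)

# Every lm() call made from now on leaves a copy in the repository.
m1 <- lm(Sepal.Length ~ Petal.Length, data = iris)

# Switch the tracing off again when it is no longer wanted.
untrace(lm)
```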

Now, at the end of every lm() call, the fitted model will be stored in the repository.
Let’s see this in action.

All models are stored as rda files in a disk-based repository.
You can load them into R with the asearch() function.
Let’s get all lm objects, apply the AIC function to each of them, and sort them by AIC.

The aread() function will load the selected model.
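The search and restore steps could be sketched as follows. To keep the example self-contained, two models are saved directly with saveToLocalRepo() instead of through trace(); the repository path under tempdir() is again an assumption, and the "class:lm" tag is the one archivist is expected to attach to lm objects.

```r
library(archivist)

# Assumption: a fresh default repository in a temporary directory.
repo <- file.path(tempdir(), "modelsToSearch")
createLocalRepo(repoDir = repo, default = TRUE)

# Archive two different linear models.
m1 <- lm(Sepal.Length ~ Petal.Length, data = iris)
m2 <- lm(Sepal.Length ~ Petal.Width, data = iris)
saveToLocalRepo(m1, repoDir = repo)
saveToLocalRepo(m2, repoDir = repo)

# Load every artifact tagged as an lm object from the default repository,
# compute AIC for each one and sort the values in increasing order.
models <- asearch("class:lm")
aics <- sort(sapply(models, AIC))
aics

# The names of `aics` are the hashes of the models, so the model with the
# lowest AIC can be restored by its hash.
best <- aread(names(aics)[1])
```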

Now you can just create model after model, and if needed they can all be restored.

Read more about archivist here: http://pbiecek.github.io/archivist/.

Call for Papers: eRum 2016 (European R users meeting)

The European R users meeting (eRum) is an international conference that aims at integrating users of the R language. eRum 2016 will be held on October 13 and 14, 2016, in Poznan, Poland, at the Poznan University of Economics and Business. The following invited speakers are already confirmed: Rasmus Bååth, Romain Francois, Ulrike Grömping, Matthias Templ, Heather Turner, Przemysław Biecek, Marek Gągolewski, Jakub Glinka, Katarzyna Kopczewska and Katarzyna Stąpor.

We would like to bring together participants from around the world. It will be a good chance to exchange experiences, broaden knowledge of R and collaborate. The conference will cover topics including:

• Bayesian Statistics,
• Bioinformatics,
• Economics, Finance and Insurance,
• High Performance Computing,
• Reproducible Research,
• Industrial Applications,
• Statistical Learning with Big Data,
• Spatial Statistics,
• Teaching,
• Visualization & Graphics,
• and many more.

We invite you to participate in eRum 2016:
(1) with a regular oral presentation,
(2) with a lightning talk,
(3) with a poster presentation,
(4) or without a presentation or poster.

Due to limited space at the conference venue, the organizers have set a limit of 250 participants. Persons with regular talks, lightning talks or posters will be considered first; those attending without a presentation or poster will be handled on a first-come, first-served basis.

Please make your submission online at http://erum.ue.poznan.pl/#register. The submission deadline is June 15, 2016. Submitters will be notified of acceptance via email by July 1, 2016. Additional details will be announced via the eRum conference website.

European R users meeting / meeting of R heroes / Poznań 12-14.10.2016

The European R users meeting (eRum 2016) will take place between October 12th and 14th.

We have already confirmed great invited speakers such as Rasmus Bååth, Romain François, Ulrike Grömping, Matthias Templ, and Heather Turner, as well as a strong representation from Poland: Przemysław Biecek (omg, it’s me!), Marek Gągolewski, Jakub Glinka, Katarzyna Kopczewska, and Katarzyna Stąpor. We are planning a meeting of more than 200 useRs from all across Europe working in different areas of industry, academia, and government.

On behalf of the organising committee, chaired by Maciej Beręsewicz, we want to invite you to be a part of this historic meeting by proposing a workshop, submitting a regular or lightning talk, presenting a poster, or just attending the activities we are preparing for the meeting.

You will find more details about the registration process on the website www.erum.ue.poznan.pl.

If you have any questions, do not hesitate to ask at erum@konf.ue.poznan.pl.

See you in Poznań.

Why should you backup your R objects?

There is a saying that there are two groups of people: those who are already doing backups and those who will. So how is this linked to reproducible research and R?

If your work is to analyze data, then you often need to restore, recreate or update results that you generated some time ago.
You may think “I have knitr reports for everything!”. That’s great! It will save you a lot of trouble. But to be 100% sure of getting exactly the same results you need exactly the same environment and the same versions of packages.
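As a small illustration of what “the same environment” means in practice, here is a base-R sketch that stores an object together with the R version and the versions of all loaded packages. The list layout (object, r_version, packages) is just an assumption for this example, not a standard format.

```r
# An object worth keeping.
fit <- lm(Sepal.Length ~ Petal.Length, data = iris)

# Bundle the object with a description of the environment that produced it.
snapshot <- list(
  object    = fit,
  r_version = R.version.string,
  packages  = vapply(loadedNamespaces(),
                     function(p) as.character(packageVersion(p)),
                     character(1))
)

# Store the bundle on disk and read it back.
path <- tempfile(fileext = ".rds")
saveRDS(snapshot, path)
restored <- readRDS(path)

# The recorded versions can later be compared with the current session.
restored$r_version
restored$packages[["stats"]]
```

This only records the versions; restoring them is a separate task.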

Do you know how many R packages have been updated during the last 12 months?

I took the list of the top 20 R packages from here, scraped the dates of their current and older CRAN releases from here, and generated a plot with the dates of submissions to CRAN, sorted by the date of the last submission.

Continue reading: Why should you backup your R objects?

geom_christmas_tree(): a new geom for ggplot2 v2.0

Version 2.0 of the ggplot2 package (on GitHub) has a very nice mechanism for adding new geoms and stats (more about it here).
Christmas is coming, so maybe you would like to make your plots more tree’ish?
Below you will find a definition of the geom_christmas_tree() geom. It supports the following aesthetics: size (number of segments), fill, color, x and y.

With the mpg data you can plot a colourful forest.

Continue reading: geom_christmas_tree(): a new geom for ggplot2 v2.0

Hack the Proton. A data-crunching game from the Beta and Bit series

I’ve prepared a short console-based, data-driven R game named “The Proton Game”. The goal of the player is to infiltrate Slawomir Pietraszko’s account on a Proton server. To do this, you have to solve four data-based puzzles.

The game can be played by beginners as well as heavy users of R. A survey of people who completed the beta version shows that the game gives around 15 minutes of fun to people experienced in R and up to around 60 minutes to people who are just starting to program and use R. More details about the beta-version results are presented in the plot at the bottom.

Continue reading: Hack the Proton. A data-crunching game from the Beta and Bit series

R in Insurance – the November meetup of the Warsaw R User Group

Inspired by the “R in Insurance” conference held in Amsterdam, we would like to dedicate the November meetup of the Warsaw R Users Group to insurance. The presentations will cover the practical aspects of insurance and, more specifically, the applications of R in insurance.
Join us on Thursday, November 26, 2015, 6:00 PM, Koszykowa 75, Warsaw, Room 329 MINI PW. The meetup will be held in English.

Agenda
18.00-18.05 Welcome
18.05-18.40 „Experience vs. Data” Markus Gesmann (Lloyd’s, London)
18.40-19.00 Pizza break
19.00-19.35 „Non life insurance in R” Emilia Kalarus (Triple A – Risk Finance)
19.35-20.10 „Stochastic mortality modelling” Adam Wróbel (NN)
20.15 – Afterparty

This time our agenda is quite tight, since we have 3 very interesting presentations. We invite R programmers, data analysts as well as actuaries and risk professionals.

Continue reading: R in Insurance – the November meetup of the Warsaw R User Group

Warsaw R-Users Group Meeting #12

After the summer holidays we are back with two talks:
6pm-6:30 – Adolfo Álvarez, PhD
“5 lessons I have learned at Analyx”.
7pm-7:30 – Piotr Migdał, PhD
“Jupyter – the environment for learning and doing data analysis”.

See you tomorrow (22/10/2015) at 6 pm, Department of Mathematics, Warsaw University of Technology, Koszykowa 75 room 329.
You will find more details here (meetup).
You will find more materials here (github).

Warsaw Meetings of R Users / Warszawskie Spotkania Entuzjastów R

With the summer holiday season coming to an end, we are back with Warsaw Meetings of R Users (Warszawskie Spotkania Entuzjastów R).

Three meetings ahead:

  • September 26th (this Saturday): let’s start with a data-hack-day (DHD). Having data from the Polish Sejm (votes and transcripts), we are going to prepare some nice summaries of the last term. With elections ahead, it is a good time for such statistics. MaszPrawoWiedzieć will support us in this effort. Be prepared for a lot of data cleaning and nice data exploration.
  • October 22nd (Thursday): we will be talking about R and education. Two excellent speakers are in the roster: Adolfo Álvarez (Advanced Customer Analyst at Analyx) and dr hab. Michał Ramsza (SGH).
  • November 26th (Thursday): the topic for this meeting is “R in insurance”. One of our special guests is Markus Gesmann (Lloyd’s, London). More to come.

You will find more information on our meetup page: http://www.meetup.com/Spotkania-Entuzjastow-R-Warsaw-R-Users-Group-Meetup/.

Thanks go to our partners and sponsors: Revolution Analytics/Microsoft, MINI PW, WLOG Solutions and SmarterPoland.

Incredible Adventures of Beta and Bit


I am working on a project that introduces data-driven reasoning (and of course R) to secondary schools through the fictional adventures of two teenagers, Beta and Bit.

Beta is a level-headed girl who has a passion for maths, logic and the art of deduction.
Bit is a hot-headed hacker and self-educated robotic engineer.

The first story from the series, called Pietraszko’s Cave, is available at this website (in English, Polish and Russian).

In each story of the series, strange adventures introduce Beta and Bit to concepts like randomness, probability distributions, correlation, linear regression and hypothesis testing, as well as some tools used by data analysts (so-called data scientists nowadays).

Continue reading: Incredible Adventures of Beta and Bit