Over 50 practical recipes for data analysis with R in one book

Ah, writing a blog post! This is a pleasure I had almost forgotten, as you can guess from the date of my last post: it was around January... You may be wondering: what have you been doing all this time? Well, quite a lot indeed:
  • changed my job (I am now working at Intesa Sanpaolo Banking Group on Basel III statistical models)
  • became a dad for the third time (and if you are wondering, it’s a boy!)
  • fixed some issues with the updateR package
  • and wrote a book!
I hope this pretty long list will help you forgive my long silence. I am actually pretty proud of all of them, but let’s talk about the book now. I think it is a useful contribution to the R community. But first of all, the title:

RStudio for R Statistical Computing Cookbook


How to add a live chat to your Shiny app

As I am currently working on a Fraud Analytics web application based on Shiny (currently in beta, more on this blog later), I found myself asking: wouldn’t it be great to offer live chat support to my web application’s visitors?
It would indeed!
an ancient example of chatting – Camera degli Sposi, Andrea Mantegna 1465 -1474
But how to do it?
Unfortunately, searching on Google didn’t return any useful results.
Therefore I had to figure it out by myself.


How to list files and folders within a folder (basic file app)


I know, we are not talking about analytics and no, this is not going to establish me as a great data scientist… By the way: have you ever wondered how to list all files and folders within a root folder just by hitting a button?

I have looked for something like that quite a few times, for instance when asked to write an index of all the working papers pertaining to a specific audit (yes, I am an auditor, sorry about that): a really time-consuming activity that adds little value.
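To give an idea of what such an index could look like, here is a minimal sketch in R. It is not the file app this post describes: the helper name index_folder and the demo folder are hypothetical, made up only for illustration.

```r
# Hypothetical helper: index every file and folder under a root folder
index_folder <- function(root) {
  paths <- list.files(root, recursive = TRUE, include.dirs = TRUE,
                      full.names = TRUE)
  data.frame(
    path      = paths,
    is_folder = file.info(paths)$isdir,  # TRUE for folders, FALSE for files
    stringsAsFactors = FALSE
  )
}

# Demo on a throwaway folder (hypothetical names)
root <- file.path(tempdir(), "audit_demo")
dir.create(file.path(root, "papers"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(root, "papers", "wp1.txt"))

index <- index_folder(root)
print(index)
```

If you need the index outside R, write.csv(index, "index.csv", row.names = FALSE) should save it as a file you can open in a spreadsheet.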

How to use GitHub with RStudio: step-by-step tutorial


I wanted to push to my GitHub repository directly from my RStudio project, avoiding that annoying “copy & paste” job. Since using version control is one of the Best Practices for Scientific Computing, I struggled with this problem for a while. Now that I have managed to solve it, I think you may find the detailed tutorial that follows useful. I am not going to explain why you should use GitHub with your RStudio project, but if you are asking yourself that question, you may find a Stack Overflow discussion on the topic useful.

0. Download the latest Git version

You can download the latest Git version from the Git website.
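Once Git is installed, you may want to check that R can actually see it. The snippet below is only a sketch of one possible check, assuming Git was added to your PATH during installation; it is not part of the original steps.

```r
# Look for the git executable on the PATH
git_path <- Sys.which("git")   # empty string when Git is not found

if (nzchar(git_path)) {
  message("Git found at: ", git_path)
  system2("git", "--version")  # prints the installed version
} else {
  message("Git not found on the PATH: install it before going on.")
}
```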


Download data to Excel from the web


This simple tutorial will show you how to download data into an Excel spreadsheet by creating a web query.

Download data into Excel

1. Select the “data” tab.
2. Select “from web”.
3. Input the desired web URL.
4. Click the “go” button.
5. Select the data you want to download.
6. Click the “import” button.

Refresh downloaded data

1. Select the “data” tab.
2. Select “connections”.
3. Select your connection.
4. Click the “refresh” button.


How to Put Equations into Evernote


THIS IS AN OUTDATED VERSION OF THE POST. YOU CAN FIND THE UPDATED AND MAINTAINED ONE AT http://www.andreacirillo.com/2014/09/11/equations-in-evernote/ 

Problem
If you need to put some math into your Evernote notes and you have a Mac, there is a very simple way to solve the problem.
Solution
It is called Grapher, a built-in application for visualising math.
Tutorial
Here is a simple tutorial:
1. Find Grapher among your applications. You can either search for it with Spotlight or use Launchpad.
2. Write the equation you would like to put into your Evernote note.
3. Copy the equation as TIFF.
4. Paste the equation into Evernote.

And that’s it!
I think this trick is very useful when you have some “heavy” equations that would not be clear enough if they were written as plain text.
After sharing the post on Google+, I received the good advice to use Daum Equation Editor, which is specifically aimed at writing equations.
I think Grapher’s advantage is that it is a built-in application; nevertheless, I’m really grateful to Roberta Normano for the tip.
Other tips are welcome!

Code snippet: subsetting data frame in R by vector


Problem:

You have to subset a data frame, using an exact match against the contents of a vector as the criterion.

For instance:

You have a dataset with some attributes, and a vector holding some values of one of those attributes. You want to filter the dataset based on the values in the vector.

Example: sales records, where each record is a deal.

The vector is a list of selected customers you are interested in.

Is it possible to build this kind of filter?

Solution

Of course it is!

You just have to use the %in% operator.

Let’s see how to do it in the short tutorial below.

Tutorial

Suppose you have a sales data frame object, with one row per deal and a customer column.

Suppose you want to extract sales to Francesca, Tommaso and Gianna.

First, you have to assign those names to a vector.

vector = c("Francesca","Tommaso","Gianna")

Then, you can write the filtering statement, using the %in% operator.

query_result = sales[sales$customer %in% vector,]

and that’s it!

The meaning of the %in% operator is exactly what you would guess:

“select only values present IN the specified group”.
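To try the filter end to end, here is a small self-contained sketch. The sales data below is invented for illustration (the names Luca and Marta and all the amounts are assumptions, not the data from the post), and the variable is called selected rather than vector just to avoid clashing with the base R function of the same name.

```r
# Hypothetical sales records: one row per deal
sales <- data.frame(
  customer = c("Francesca", "Luca", "Tommaso", "Gianna", "Marta", "Francesca"),
  amount   = c(120, 80, 200, 50, 75, 310),
  stringsAsFactors = FALSE
)

# Customers we are interested in
selected <- c("Francesca", "Tommaso", "Gianna")

# %in% returns one TRUE/FALSE per element of sales$customer
keep <- sales$customer %in% selected

query_result <- sales[keep, ]   # deals with the selected customers
other_deals  <- sales[!keep, ]  # the complementary filter

print(query_result)
```

Negating the logical vector with ! gives you the opposite selection, which may come in handy when you want everything except a list of customers.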

Full code is available as an R workbook for quick reference:

filter_code.R

Let me know if you use any other method to obtain the same result.

Finally, if you enjoyed the tutorial, you can find more tutorials on the Tutorial page (quite obvious, isn’t it?).

Saturation with Parallel Computation in R


I have just saturated my whole PC:

the 4 GB of RAM is full

and so is the CPU (i7-4770 @ 3.4 GHz)

Parallel Computation in R

What is my secret?

The doParallel package for R on Mac.

The package lets you run some very useful parallel computations, giving you the possibility to use the full potential of your CPU.

As a matter of fact, by default R uses just one of the cores you have on your PC.

With parallel computation, to put it simply, you can take your job, divide it into some smaller jobs, solve them and then put the results together in one new R object.
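The divide, solve and combine idea can be sketched with the parallel package that ships with R. Note that this swaps in base parallel instead of doParallel, so that the example runs without installing anything; the job itself (doubling the numbers 1 to 8) and the two-worker cluster are assumptions chosen for illustration.

```r
library(parallel)  # ships with R, no installation needed

# Hypothetical job: double each number in 1:8
job <- 1:8

cl <- makeCluster(2)                    # two worker processes
chunks <- splitIndices(length(job), 2)  # divide the job into two smaller jobs

# Solve each smaller job on a worker
partial <- parLapply(cl, chunks, function(idx, values) {
  data.frame(value = values[idx], doubled = values[idx] * 2)
}, values = job)

stopCluster(cl)                         # release the workers

# Put the pieces together in one new R object
result <- do.call(rbind, partial)
print(result)
```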

Tutorial (More or Less)

There are some useful tutorials on the web (try to google it), but let me introduce the stuff in a really basic way so that you can immediately try it out:

install.packages("doParallel")
library(doParallel) # also loads foreach and parallel
cl = makeCluster(2) # if you want to use all your fire power, put the number of your cores
registerDoParallel(cl) # tell foreach to run on this cluster
parallelization = function(x){
  n = 10 # example value: put here the number of repetitions you need
  # '.combine = rbind' tells R to put the results together with rbind; you can use cbind as well
  # %dopar% is what actually runs the iterations in parallel
  foreach(i = 1:n, .combine = rbind) %dopar% {
    x*2
  }
}
# when you have finished calling parallelization(), release the workers with stopCluster(cl)

Final Warnings

Just a tip: DON’T MAKE THE RESULT BE AN OBJECT!

Sorry about the capital letters, but I was stuck on this error for quite a long time…

Finally, are you using Windows? Instead of doParallel, you can obtain the same result with the doSNOW package.

Comments are welcome.