📊 streamline your analyses linking R to sas and more: the workfloweR experiment

We all know R is the first choice for statistical analysis and data visualisation, but what about big data munging? The tidyverse (or should we say hadleyverse 😏) has been doing a lot in this field; nevertheless, this kind of activity is often handled in some other language. Moreover, sometimes you receive as input pieces of analysis performed in other languages or, even worse, pieces of databases packed in proprietary formats (like .dta, .xpt and others). So let’s assume you are an R enthusiast like I am, and you do all of your work in R, reporting included: wouldn’t it be great to have some nitty-gritty way to merge all these languages into a streamlined workflow?

ggplot2 themes examples

This short post is exactly what it seems: a showcase of all the themes available within the ggplot2 package. I was putting together such a list for myself (you know that feeling… “what would it look like with this theme? let’s try this one…”) and in the end I thought it could be useful for my readers. At the very least, this post will save you the time of trying all the different themes just to get a sense of what they look like.
Enjoy!
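
If you want to reproduce this kind of showcase yourself, here is a minimal sketch. It assumes the ggplot2 package is installed and uses the built-in mtcars data as a stand-in for your own; the list of themes reflects the complete themes I know ship with ggplot2, and your installed version may offer more.

```r
library(ggplot2)

# Complete themes shipped with ggplot2 (the set may differ in your version)
themes <- list(
  theme_grey     = theme_grey(),
  theme_bw       = theme_bw(),
  theme_linedraw = theme_linedraw(),
  theme_light    = theme_light(),
  theme_dark     = theme_dark(),
  theme_minimal  = theme_minimal(),
  theme_classic  = theme_classic(),
  theme_void     = theme_void()
)

# One base plot, reused for every theme (mtcars is only example data)
base_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# Build one titled plot per theme
plots <- lapply(names(themes), function(theme_name) {
  base_plot + themes[[theme_name]] + ggtitle(theme_name)
})
names(plots) <- names(themes)
```

Printing any element, for instance `plots[["theme_minimal"]]`, draws the same scatterplot with that theme applied.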

Euro 2016 analytics: Who’s playing the toughest game?

I am really enjoying the UEFA Euro 2016 football competition, not least because our national team has done pretty well so far. That’s why, after browsing the statistics section of the official EURO 2016 website for a while, I decided to do some analysis on the data they share (as of the 21st of June).

Just to be clear from the beginning: we are not talking about anything too rigorous, just some interesting questions with related answers gathered mainly through data visualisation.

We can divide the following analyses into two main parts: a first part where we analyse the distribution of fouls and their impact on match outcomes, and a second part where ball possession is analysed, once again looking at the relationship between this stat and match outcomes. Let’s start with fouls then.

Which team committed the greatest number of fouls?

Here we are with the first question. And here is the answer:

[figure: fouls]

Over 50 practical recipes for data analysis with R in one book

Ah, writing a blog post! This is a pleasure I was forgetting, and you can guess it by looking at the publication date of my last post: it was around January... You may be wondering: what have you been doing all this time? Well, quite a lot indeed:
  • changed my job (I am now working @ Intesa Sanpaolo Banking Group on Basel III statistical models)
  • became a dad for the third time (and if you are wondering, it’s a boy!)
  • fixed some issues with the updateR package
  • and I wrote a book!
I hope this pretty long list will help you forgive me for my long silence. I am actually pretty proud of all of them, but let’s talk about the book now. I think it is a useful contribution to the R community. But first of all, the title:

RStudio for R Statistical Computing Cookbook

ramazon: Deploy your Shiny App on AWS with a Function

THIS IS AN OUTDATED VERSION OF THE POST. YOU CAN FIND THE UPDATED AND MAINTAINED ONE AT http://www.andreacirillo.com/2015/08/18/deploy-your-shiny-app-on-aws-with-a-function/

Because Afraus attracted a good deal of interest, last month I exceeded the shinyapps.io free plan limits.

That made me move my Shiny app to an Amazon AWS instance.

Well, it was not so straightforward: even though there are plenty of tutorials around the web, every one seems to miss a part: upgrading the R version, removing the shiny-server examples… And even with all the info, it is still quite a long, error-prone process.

All this pain is removed by ramazon, an R package that I developed to take care of everything needed to deploy a Shiny app on an AWS instance. An early disclaimer for Windows users: only Apple OS X is supported at the moment.

Introducing Afraus: an Unsupervised Fraud Detection Algorithm

The latest Report to the Nations published by ACFE stated that, on average, fraud accounts for nearly 5% of companies’ revenues.

ACFE Infographic: typical organization loses 5% of its revenues to fraud
Projecting this figure onto the whole world’s GDP, the “fraud-country” would produce a GDP roughly 3 times greater than Canada’s.

How to add a live chat to your Shiny app

As I am currently working on a Fraud Analytics Web Application based on Shiny (currently in beta, more on this blog later), I found myself asking: wouldn’t it be great to offer live chat support to my Web Application visitors?
It would indeed!
an ancient example of chatting – Camera degli Sposi, Andrea Mantegna 1465 -1474
But how to do it?
Unfortunately, searching on Google didn’t turn up anything useful.
Therefore I had to find it out by myself.

Catching Fraud with Benford’s law (and another Shiny App)

In the 1930s Frank Benford observed that ‘1’ was more frequent than other digits as the first digit in his own book of logarithm tables.

Decades later, we can use this curious finding to look for fraud in populations of data.

Just give the Shiny app a try

What does ‘Benford’s Law’ stand for?

Around 1938 Frank Benford, a physicist at the General Electric research laboratories, observed that logarithmic tables were more worn on the first pages: was this by chance, or due to an actual prevalence of numbers with 1 as first digit?
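
To make the idea concrete, here is a minimal sketch in base R. It relies on the usual statement of Benford’s law, where a first digit d is expected with probability log10(1 + 1/d). The helper `first_digit()` and the sample data (powers of 2, a sequence known to follow the law closely) are my own illustrative choices, not the code behind the Shiny app.

```r
# Expected share of each first digit according to Benford's law
benford_expected <- log10(1 + 1 / (1:9))
names(benford_expected) <- 1:9

# Hypothetical helper: first significant digit of each non-zero number,
# read from its scientific-notation representation (e.g. "1.23e+02" -> 1)
first_digit <- function(x) {
  x <- abs(x[is.finite(x) & x != 0])
  as.integer(substr(sprintf("%.15e", x), 1, 1))
}

# Illustrative data: the first 100 powers of 2
sample_data <- 2^(1:100)
observed <- table(factor(first_digit(sample_data), levels = 1:9)) /
  length(sample_data)

# Expected and observed shares side by side
comparison <- round(rbind(expected = benford_expected,
                          observed = as.numeric(observed)), 3)
print(comparison)
```

When checking real data for fraud, large gaps between the observed and expected rows are the signal worth a closer look.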

Network Visualisation With R

The main reason why

After all, I am still an Internal Auditor. Therefore I often face one of the typical internal auditor’s problems: understanding the links between people and companies, in order to discover hidden communities that could expose the company to unknown risks.

the solution: linker

In order to address this problem I am developing Linker, a lean Shiny app that takes 1 to 1 links as input and gives a network map as output:
the Linker
click the picture to reach the app
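
To give a feel for the kind of input involved, here is a minimal base R sketch that takes a table of 1 to 1 links and groups the nodes into connected communities. It is only an illustration under my own assumptions: the names in `links` are made up, `find_communities()` is a hypothetical helper, and this is not how Linker itself is implemented.

```r
# Hypothetical 1-to-1 links between people and companies
links <- data.frame(
  from = c("Anna", "Bruno", "Carla", "Dario"),
  to   = c("Acme", "Acme",  "Beta",  "Beta"),
  stringsAsFactors = FALSE
)

# Group nodes into connected components by propagating the lowest label
find_communities <- function(links) {
  from <- as.character(links$from)
  to   <- as.character(links$to)
  nodes <- unique(c(from, to))
  group <- setNames(seq_along(nodes), nodes)  # each node starts alone
  repeat {
    changed <- FALSE
    for (i in seq_along(from)) {
      low <- min(group[[from[i]]], group[[to[i]]])
      if (group[[from[i]]] != low || group[[to[i]]] != low) {
        group[[from[i]]] <- low
        group[[to[i]]]   <- low
        changed <- TRUE
      }
    }
    if (!changed) break  # stop once a full pass changes nothing
  }
  split(names(group), group)
}

communities <- find_communities(links)
print(communities)
```

Each element of the result is one community: a set of people and companies that are linked to each other, directly or through intermediaries.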

excel right() function in R

As part of the Excel functions in R, I have developed this custom function, reproducing the Excel right() function in the R language. Feel free to copy and use it.

right <- function(string, char) {
  # return the last `char` characters of `string`, like Excel's RIGHT()
  substr(string, nchar(string) - (char - 1), nchar(string))
}
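
A quick usage sketch, with the function repeated so the snippet runs on its own; the sample strings are just illustrative:

```r
# Same function as above, repeated so this snippet is self-contained
right <- function(string, char) {
  substr(string, nchar(string) - (char - 1), nchar(string))
}

# Last 4 characters of a single string
last_four <- right("workflow", 4)

# substr() and nchar() are vectorised, so a character vector works too
last_two <- right(c("ggplot2", "shiny"), 2)

print(last_four)  # "flow"
print(last_two)   # "t2" "ny"
```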

You can find other functions in the Excel functions in R post.