Category: analytics
Catching Fraud with Benford’s law (and another Shiny App)
In the early 1900s Frank Benford observed that ‘1’ was more frequent as a first digit in his own book of logarithm tables.
Decades later, we can use this curious finding to look for fraud in populations of data.
Just give the Shiny app a try.
What does ‘Benford’s Law’ stand for?
Around 1938 Frank Benford, a physicist at the General Electric research laboratories, observed that logarithmic tables were more worn on the first pages: was this down to chance, or due to an actual prevalence of numbers with 1 as their first digit?
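As a rough illustration of the idea, Benford's law gives the expected share of each leading digit d as log10(1 + 1/d). The R sketch below is not the code behind the Shiny app: the helper names benford_expected() and first_digit(), and the powers-of-two sample, are assumptions made only for this example.

```r
# Expected first-digit probabilities under Benford's law: P(d) = log10(1 + 1/d)
benford_expected <- function() {
  digits <- 1:9
  setNames(log10(1 + 1 / digits), digits)
}

# First significant digit of each finite, non-zero number.
# Scientific notation puts that digit first, which avoids floating-point division issues.
first_digit <- function(x) {
  x <- abs(x[is.finite(x) & x != 0])
  as.integer(substr(sprintf("%.15e", x), 1, 1))
}

# Illustrative sample (an assumption, not real accounting data): the first 100 powers of 2
sample_values <- 2^(1:100)
observed <- table(factor(first_digit(sample_values), levels = 1:9)) / length(sample_values)

# Side-by-side comparison of observed and expected shares
comparison <- round(rbind(observed = as.numeric(observed),
                          expected = benford_expected()), 3)
print(comparison)
```

On real data, a large gap between the two rows is the kind of signal that would justify a closer look; it is not proof of fraud by itself.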
How to use Github with Rstudio : step-by-step tutorial
Pushing to my GitHub repository directly from the RStudio project, avoiding that annoying “copy & paste” job: since it is one of the Best Practices for Scientific Computing, I have been struggling with this problem for a while. Now that I have managed to solve it, I think you may find the detailed tutorial that follows useful. I am not going to explain why you should use GitHub with your RStudio project, but if you are asking yourself that question, you may find a Stack Overflow discussion on the topic useful.
0. Download the latest Git version
You can download the latest Git version from the Git website.
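Once Git is installed, a quick check from the R console can confirm that it is visible to R. This is a minimal sketch and not part of the original tutorial steps; it assumes the git executable is on the PATH.

```r
# Look for a git executable on the PATH; Sys.which() returns "" when none is found
git_path <- Sys.which("git")

if (nzchar(git_path)) {
  # Ask git to report its version
  system2("git", "--version")
} else {
  message("git was not found on the PATH; install it before continuing")
}
```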
Network Visualisation With R
The main reason why
the solution: linker
Querying Google With R
Best Practices for Scientific Computing
I reproduce below the principles from the amazing paper Best Practices for Scientific Computing, published in 2012 by a group of US and UK professors. The main purpose of the paper is to “teach” good programming habits, shared by professional developers, to people who weren’t born developers and became developers only for professional purposes.
Scientists spend an increasing amount of time building and using software. However, most scientists are never taught how to do this efficiently.
Best Practices for Scientific Computing
- Write programs for people, not computers.
  - a program should not require its readers to hold more than a handful of facts in memory at once
  - names should be consistent, distinctive and meaningful
  - code style and formatting should be consistent
  - all aspects of software development should be broken down into tasks roughly an hour long
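To make the point about names concrete, here is a small R sketch. It is not taken from the paper; the function and argument names are hypothetical and chosen only for illustration.

```r
# Hard to read: cryptic names force the reader to remember what each one means
f <- function(x, y) {
  x * (1 + y)
}

# Easier to read: the names say what the values are
price_with_tax <- function(net_price, tax_rate) {
  net_price * (1 + tax_rate)
}

# Both functions compute the same thing; only the second explains itself
price_with_tax(net_price = 100, tax_rate = 0.22)
```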
excel right() function in R
As part of the Excel functions in R, I have developed this custom function, reproducing the Excel right() function in the R language. Feel free to copy and use it.
# Return the last `char` characters of `string`, like Excel's RIGHT()
right = function(string, char) {
  substr(string, nchar(string) - (char - 1), nchar(string))
}
You can find other functions in the Excel functions in R post.
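A short usage sketch for right(), with the function repeated so the example runs on its own. The sample strings are made up for illustration.

```r
# Same definition as in the post: last `char` characters of `string`
right = function(string, char) {
  substr(string, nchar(string) - (char - 1), nchar(string))
}

right("Benford", 4)                 # "ford"

# substr() and nchar() are vectorised, so a character vector works too
right(c("2015-01", "2016-12"), 2)   # "01" "12"
```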
excel left() function in R
As part of the Excel functions in R, I have developed this custom function, emulating the Excel left() function in the R language. Feel free to copy and use it.
# Return the first `char` characters of `string`, like Excel's LEFT()
left = function(string, char) {
  substr(string, 1, char)
}
You can find other functions in the Excel functions in R post.
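A short usage sketch for left(), with the function repeated so the example runs on its own. The sample strings are made up for illustration.

```r
# Same definition as in the post: first `char` characters of `string`
left = function(string, char) {
  substr(string, 1, char)
}

left("Benford", 3)                  # "Ben"

# substr() is vectorised, so a character vector works too
left(c("2015-01", "2016-12"), 4)    # "2015" "2016"
```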
excel functions in R
Excel Functions in R:
- find()
- vlookup()
Mining Twitter with R
A great tutorial on text mining with Twitter, by Paeng Angnakoon.