Following the post about the %in% operator, I received this tweet:
I had a look at the code kindly provided by Ben and then asked myself:
I know dplyr is a really nice package, but which snippet is faster?
To answer the question, I put the two snippets into two functions:
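library(dplyr)  # provides filter()

# Ben's snippet: keep the rows whose value in `column` appears in `vector`
dplyr_snippet <- function(object, column, vector) {
  filter(object, object[, column] %in% vector)
}

# AC's snippet: the same selection with base R logical indexing
Rbase_snippet <- function(object, column, vector) {
  object[object[, column] %in% vector, ]
}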
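Before timing the two functions, it is worth checking that they select the same rows. The following is a minimal sketch of such a check, not part of the original comparison: the contents of vec are a hypothetical choice made only for illustration, and the row names of the base R result are reset because the two approaches may label rows differently.

```r
library(dplyr)

dplyr_snippet <- function(object, column, vector) {
  filter(object, object[, column] %in% vector)
}

Rbase_snippet <- function(object, column, vector) {
  object[object[, column] %in% vector, ]
}

# Hypothetical lookup vector, assumed here for illustration only
vec <- c("setosa", "virginica")

base_result  <- Rbase_snippet(iris, 5, vec)
dplyr_result <- dplyr_snippet(iris, 5, vec)

# Base R keeps the original row names, so reset them before comparing
rownames(base_result) <- NULL

same_rows <- isTRUE(all.equal(base_result, dplyr_result,
                              check.attributes = FALSE))
print(same_rows)
print(nrow(base_result))
```

With this particular vec, both functions should return the 100 rows of iris that belong to the two chosen species.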
Then, thanks to the great microbenchmark package, I compared the two functions by timing 100,000 executions of each.
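library(microbenchmark)
library(ggplot2)  # provides autoplot() and labs()

# vec holds the values to match against column 5 of iris; it must be defined beforehand
comparison <- microbenchmark(Rbase_snippet(iris, 5, vec),
                             dplyr_snippet(iris, 5, vec),
                             times = 100000)
# plot the output
autoplot(comparison) +
  labs(title = "comparison between dplyr_snippet and Rbase_snippet", y = "snippet")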
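The plot gives a visual impression; the timings can also be read as numbers. Here is a minimal sketch, assuming a hypothetical vec and a much smaller number of repetitions (1,000 rather than 100,000) so that it runs quickly. The figures it prints will differ from machine to machine and are not the results reported in this post.

```r
library(dplyr)
library(microbenchmark)

dplyr_snippet <- function(object, column, vector) {
  filter(object, object[, column] %in% vector)
}

Rbase_snippet <- function(object, column, vector) {
  object[object[, column] %in% vector, ]
}

# Hypothetical lookup vector, assumed here for illustration only
vec <- c("setosa", "virginica")

# Fewer repetitions than in the post, to keep the run short
quick_comparison <- microbenchmark(Rbase_snippet(iris, 5, vec),
                                   dplyr_snippet(iris, 5, vec),
                                   times = 1000)

# One row per expression, with min, median, mean, etc. in microseconds
timing_summary <- summary(quick_comparison, unit = "us")
print(timing_summary[, c("expr", "median", "mean", "neval")])
```

The median is usually the more robust column to compare, since a few slow runs (for example during garbage collection) can inflate the mean.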
And that was the result:
Base R seems to be the winner, even if only by a handful of microseconds…
Nevertheless, I am really grateful to Ben: it was great fun!