Language Identification Package

Many types of publicly available text, particularly social media, contain a mix of languages. Before (or instead of) running machine translation, it is often enough simply to identify which language a piece of text is written in. The cldr package lets R users call the language identification algorithm used in Google Chrome. It is by far the fastest language identification package that I have found for R.

Installing cldr

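cldr is no longer on CRAN, but the archived source files are still available. Because this is a source install, you will need a working compiler toolchain (for example, Rtools on Windows). To install cldr, run the following code: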

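# Install from the CRAN archive
url <- "http://cran.us.r-project.org/src/contrib/Archive/cldr/cldr_1.1.0.tar.gz"
pkgFile <- "cldr_1.1.0.tar.gz"
download.file(url = url, destfile = pkgFile)
# repos = NULL tells R to install from the local file rather than a repository
install.packages(pkgs = pkgFile, type = "source", repos = NULL)
# Remove the downloaded tarball once the install has finished
unlink(pkgFile)
# Alternatively: devtools::install_version("cldr", version = "1.1.0")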

# Usage: load the package and run its bundled demo
library(cldr)
demo(cldr)
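
Beyond the demo, here is a minimal sketch of how you might label a handful of strings yourself. It assumes that cldr exposes a detectLanguage() function that takes a character vector and returns a data frame with detectedLanguage and isReliable columns; check ?detectLanguage on your install before relying on those names. The identify_languages() wrapper and the sample sentences are my own illustration and are not part of the package.

```r
# Hypothetical helper (not part of cldr): label each text with its detected language.
identify_languages <- function(texts) {
  out <- data.frame(
    text = texts,
    language = rep(NA_character_, length(texts)),
    reliable = rep(NA, length(texts)),
    stringsAsFactors = FALSE
  )
  # Fall back to NA columns when cldr is not installed.
  if (!requireNamespace("cldr", quietly = TRUE)) {
    return(out)
  }
  # Assumption: detectLanguage() returns one row per input, with
  # `detectedLanguage` and `isReliable` columns.
  res <- cldr::detectLanguage(texts)
  out$language <- as.character(res$detectedLanguage)
  out$reliable <- as.logical(res$isReliable)
  out
}

samples <- c(
  "This is a short sentence written in English.",
  "Bonjour tout le monde, comment allez-vous aujourd'hui ?",
  "Dies ist ein kurzer Satz in deutscher Sprache."
)
result <- identify_languages(samples)
print(result)
```

Keeping the reliability flag next to the label makes it easy to filter out short or ambiguous texts, where a detector of this kind is most likely to guess wrong.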
Written on February 2, 2017