Connect to Elastic Cloud with R Client
The elastic package (part of the rOpenSci project) is a general-purpose R interface to Elasticsearch.
Scott Chamberlain (NA). elastic: General Purpose Interface to 'Elasticsearch'. R package version 0.8.0.9100. https://github.com/ropensci/elastic
After installing the package by following its instructions, load it as follows.
library("elastic")
Set up Elasticsearch Server
Go to Elastic Cloud and create a cluster. (A 14-day free trial is available.)
Note that we need the HTTPS endpoint from the Overview page.
Connect to the cluster
To connect with authentication, specify the following connection parameters:
- es_host: The endpoint of the cluster, without a scheme prefix such as 'http' (e.g., xxxxx.us-east-1.aws.found.io).
- es_path: In this case we can just leave it blank.
- es_user: The user name for cluster authentication. The default user name on Elastic Cloud is elastic.
- es_pwd: The password for cluster authentication. You can find it on the Security page.
- es_port: The port number of the cluster on the server. The default port on Elastic Cloud is 9243.
- es_transport_schema: The transport protocol; use https here.
connect(es_host = "aea56252e39a17de2c3f908d64a82ad9.us-east-1.aws.found.io", es_path = "", es_user="elastic", es_pwd = "g8QHIaXkRPqLEKvdyEiCrKV1", es_port = 9243, es_transport_schema = "https")
transport: https
host: aea56252e39a17de2c3f908d64a82ad9.us-east-1.aws.found.io
port: 9243
path: NULL
username: elastic
password: (secret)
errors: simple
headers (names): NULL
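Hard-coding the password in a script makes it easy to leak. As a minimal sketch, the same parameters can be read from environment variables instead. The variable names ES_HOST, ES_USER, ES_PWD and ES_PORT are hypothetical (they are not defined by the package), and the argument names assume the connect() signature used above, which matches the package version cited here and may differ in other releases.

```r
# Hypothetical environment variable names; set them in ~/.Renviron or in the shell.
es_args <- list(
  es_host = Sys.getenv("ES_HOST"),
  es_path = "",
  es_user = Sys.getenv("ES_USER", unset = "elastic"),
  es_pwd  = Sys.getenv("ES_PWD"),
  es_port = as.integer(Sys.getenv("ES_PORT", unset = "9243")),
  es_transport_schema = "https"
)

# Connect only when the package is installed and the credentials are present.
if (requireNamespace("elastic", quietly = TRUE) &&
    nzchar(es_args$es_host) && nzchar(es_args$es_pwd)) {
  do.call(elastic::connect, es_args)
}
```

If the variables are not set, the block does nothing, so it is safe to run before the credentials are configured.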
Upload Data
First, we need to load some data.
The Public Library of Science (PLOS) dataset included in the elastic package contains metadata for PLOS scholarly articles.
plosdat <- system.file("examples", "plos_data.json", package = "elastic")
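Note that system.file returns only the path to the JSON file, not its contents. A quick way to check what that file contains, sketched here with base R only, is to print its first two lines (truncated to 80 characters for readability). The guard skips the check when the package is not installed, in which case system.file returns an empty string.

```r
# Locate the example file shipped with the elastic package.
plosdat <- system.file("examples", "plos_data.json", package = "elastic")

# system.file() returns "" when the package or file cannot be found.
if (nzchar(plosdat)) {
  first_lines <- readLines(plosdat, n = 2)
  cat(substr(first_lines, 1, 80), sep = "\n")
}
```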
Then upload the data with the function docs_bulk, passing it plosdat.
invisible(docs_bulk(plosdat))
Manipulate the uploaded data
Search the plos index, limiting the output to 1 result.
Search(index = "plos", size = 1)$hits$hits
[[1]]
[[1]]$`_index`
[1] "plos"
[[1]]$`_type`
[1] "article"
[[1]]$`_id`
[1] "0"
[[1]]$`_score`
[1] 1
[[1]]$`_source`
[[1]]$`_source`$id
[1] "10.1371/journal.pone.0007737"
[[1]]$`_source`$title
[1] "Phospholipase C-β4 Is Essential for the Progression of the Normal Sleep Sequence and Ultradian Body Temperature Rhythms in Mice"
Search the plos index and the article document type. Query for antibody, limiting the output to 1 result.
Search(index = "plos", type = "article", q = "antibody", size = 1)$hits$hits
[[1]]
[[1]]$`_index`
[1] "plos"
[[1]]$`_type`
[1] "article"
[[1]]$`_id`
[1] "568"
[[1]]$`_score`
[1] 4.165291
[[1]]$`_source`
[[1]]$`_source`$id
[1] "10.1371/journal.pone.0085002"
[[1]]$`_source`$title
[1] "Evaluation of 131I-Anti-Angiotensin II Type 1 Receptor Monoclonal Antibody as a Reporter for Hepatocellular Carcinoma"
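The nested lists printed above can be awkward to read. The sketch below shows one way to flatten hits into a data frame using base R. To stay runnable without a cluster, it uses a hand-built list named hits (an illustrative name) that copies the structure and values of the two results shown above; in practice you would assign the output of Search(...)$hits$hits to it instead.

```r
# Mock of Search(...)$hits$hits, limited to the fields printed above.
hits <- list(
  list(`_index` = "plos", `_type` = "article", `_id` = "0", `_score` = 1,
       `_source` = list(
         id = "10.1371/journal.pone.0007737",
         title = "Phospholipase C-β4 Is Essential for the Progression of the Normal Sleep Sequence and Ultradian Body Temperature Rhythms in Mice")),
  list(`_index` = "plos", `_type` = "article", `_id` = "568", `_score` = 4.165291,
       `_source` = list(
         id = "10.1371/journal.pone.0085002",
         title = "Evaluation of 131I-Anti-Angiotensin II Type 1 Receptor Monoclonal Antibody as a Reporter for Hepatocellular Carcinoma"))
)

# Pull one field per hit; vapply enforces the expected type and length.
summary_df <- data.frame(
  es_id = vapply(hits, function(h) h[["_id"]], character(1)),
  score = vapply(hits, function(h) h[["_score"]], numeric(1)),
  doi   = vapply(hits, function(h) h[["_source"]][["id"]], character(1)),
  title = vapply(hits, function(h) h[["_source"]][["title"]], character(1)),
  stringsAsFactors = FALSE
)

print(summary_df[, c("es_id", "score", "doi")])
```

vapply is used rather than sapply so that a hit with a missing field raises an error instead of silently producing a list column.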
For more details and examples of using the elastic package, please refer to its tutorial.