Elasticsearch Architecture
Elasticsearch is distributed, which means that indices can be divided into shards and
each shard can have zero or more replicas.
Each node hosts one or more shards and acts as a coordinator to delegate operations to the correct shard(s).
Rebalancing and routing are done automatically.
Elasticsearch itself is the data store: documents are saved as JSON in indices, not in a separate database.
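As a concrete example of shards and replicas, an index can be created with explicit counts; the index name "my-index" and the numbers below are illustrative, not defaults:
# Create an index with 3 primary shards and 1 replica per shard
curl -X PUT -H "Content-Type: application/json" 'http://localhost:9200/my-index' -d '
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}'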
Elasticsearch Node
Elasticsearch master node
controls the Elasticsearch cluster, processing one cluster state at a time and
broadcasting the state to all other nodes.
The master node is in charge of all cluster-wide operations,
including the creation and deletion of indices.
Elasticsearch data node
contains data and the inverted index. This is the default configuration for nodes.
Elasticsearch client node
serves as a load balancer that routes incoming requests to the appropriate cluster nodes.
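A rough sketch of how these roles can be assigned in elasticsearch.yml on Elasticsearch 7.9+ (the node.roles values below are illustrative; a node with no node.roles line takes all default roles, which is the default configuration mentioned above):
# Dedicated master-eligible node
node.roles: [ master ]
# Data node
node.roles: [ data ]
# Coordinating-only ("client") node: an empty role list
node.roles: [ ]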
Installing Java and Elasticsearch on Ubuntu 20.04
apt update -y
apt-get install -y wget perl   # shasum is provided by the perl package
apt install default-jdk -y
java -version
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.6-amd64.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.6-amd64.deb.sha512
shasum -a 512 -c elasticsearch-7.17.6-amd64.deb.sha512
dpkg -i elasticsearch-7.17.6-amd64.deb
systemctl daemon-reload
systemctl start elasticsearch
systemctl enable elasticsearch
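An optional sanity check after the install; Elasticsearch can take a while to come up, so the log tail helps if the service is not active yet:
systemctl status elasticsearch --no-pager
journalctl -u elasticsearch -n 50 --no-pager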
Firewall Settings
# ufw
ufw allow from x.x.x.x to any port 9200
ufw enable
ufw status
# Test
curl 'http://localhost:9200'
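A healthy node answers with a small JSON banner similar to the following; the name and version values are placeholders and will differ per installation:
{
  "name" : "node-1",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "...",
  "version" : {
    "number" : "7.17.6",
    ...
  },
  "tagline" : "You Know, for Search"
}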
# Ports
9200 is for the REST API (HTTP). # http.port
9300 is for node-to-node communication (TCP). # transport.port
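To confirm both ports are actually listening (assumes the ss tool from iproute2, which ships with Ubuntu):
ss -tlnp | grep -E ':9200|:9300'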
Settings
Config File: /etc/elasticsearch/elasticsearch.yml # Debian
# Elasticsearch listens for traffic from everywhere on port 9200
network.host: 0.0.0.0
http.port: 9200
discovery.type
discovery.type: single-node
Specifies whether Elasticsearch should form a multiple-node cluster.
Defaults to multi-node, which means that Elasticsearch discovers other nodes when forming a cluster and
allows other nodes to join the cluster later.
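Putting these settings together, a minimal /etc/elasticsearch/elasticsearch.yml for a single-node setup might look like the sketch below; cluster.name and node.name are arbitrary labels, not defaults:
cluster.name: my-cluster        # arbitrary label
node.name: node-1               # arbitrary label
network.host: 0.0.0.0           # listen on all interfaces
http.port: 9200                 # REST API port
discovery.type: single-node     # skip discovery and form a one-node cluster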
MAX_LOCKED_MEMORY
# Lock the memory on startup:
/etc/elasticsearch/elasticsearch.yml
bootstrap.memory_lock: true
# Configuration for the Debian (*.deb) install
systemctl edit elasticsearch
[Service]
LimitMEMLOCK=infinity
LimitMEMLOCK sets the maximum locked-memory size.
Set it to infinity if you use the bootstrap.memory_lock option in elasticsearch.yml.
systemctl daemon-reload
systemctl restart elasticsearch
Checking
curl -s http://localhost:9200/_nodes?pretty | grep mlockall
"mlockall" : true
Heap size settings
By default, Elasticsearch automatically sets the JVM heap size based on a node’s roles and total memory.
To override the default heap size, set the minimum and maximum heap size settings, Xms and Xmx.
The minimum and maximum values must be the same.
* Set Xms and Xmx to no more than 50% of your total memory.
Why:
- Elasticsearch requires memory for purposes other than the JVM heap.
- For example, Elasticsearch uses off-heap buffers for efficient network communication and relies on the operating system's filesystem cache for efficient access to files.
- The JVM itself also requires some memory.
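To override the automatic heap sizing on a Debian package install, the supported place is a custom file under /etc/elasticsearch/jvm.options.d/; the 4 GB value below is a sketch assuming a machine with 8 GB of RAM (the 50% rule above):
# /etc/elasticsearch/jvm.options.d/heap.options   (file name is arbitrary)
-Xms4g
-Xmx4g
# then: systemctl restart elasticsearch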
Status
# Check License
curl -s http://localhost:9200/_license | jq
{
  "license" : {
    "status" : "active",
    "uid" : "UUID",
    "type" : "basic",
    ...
  }
}
# Get cluster status
curl -s http://localhost:9200/_cluster/health | jq
{ "cluster_name" : "elasticsearch", "status" : "yellow", "timed_out" : false, "number_of_nodes" : 1, "number_of_data_nodes" : 1, "active_primary_shards" : 11, ... }
- red means that at least one primary shard is not allocated in the cluster,
- yellow means that all primary shards are allocated but at least one replica is not, and
- green means that all primary and replica shards are allocated.
"pretty" in the above request => It enables human-readable format
# Get node status
curl -s http://localhost:9200/_nodes
{ "_nodes" : { "total" : 1, "successful" : 1, "failed" : 0 }, "cluster_name" : "elasticsearch", "nodes" : { ... } }
P.S.
- GET /_nodes/<node_id>
- GET /_nodes/<node_id>/<metric>
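For example, _local selects the node handling the request and jvm restricts the output to a single metric; both are standard parameters of the nodes info API:
curl -s 'http://localhost:9200/_nodes/_local/jvm?pretty'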
# Get index statistics (docs, store size, indexing, search, etc.)
curl -s http://localhost:9200/_stats
Using Elasticsearch
Elasticsearch exposes a RESTful API that maps HTTP methods to the CRUD operations: create, read, update, and delete.
You can add your first entry like so:
curl -X POST -H "Content-Type: application/json" 'http://localhost:9200/tutorial/_doc/1' -d '{ "message": "Hello World!" }'
You can retrieve this first entry with an HTTP GET request.
curl -X GET 'http://localhost:9200/tutorial/_doc/1'
To modify an existing entry, you can use an HTTP PUT request.
curl -X PUT -H "Content-Type: application/json" 'localhost:9200/tutorial/_doc/1?pretty' -d '
{
"message": "Hello, People!"
}'
"pretty" in the above request. => It enables human-readable format so that you can write each data field on a new row.