Building a fast text suggester using Node.js, Koa2 and Elasticsearch
In this blog post, I’ll show you how to create a fast text suggester using the Koa2 framework for Node.js, backed by Elasticsearch. Koa2 is a minimalist framework developed by the team behind Express.js and ships with no built-in middleware. Elasticsearch is a distributed search engine that comes with handy features for full-text search.
To start with, I’ll use koa-api-generator to bootstrap the application. Start by running the following commands:
npm i -g koa-api-generator
koa-api node-search-engine && cd node-search-engine
npm install
Now I’ll install the elasticsearch package to connect to Elasticsearch:
npm i elasticsearch --save
I’ll assume that you have installed and configured Elasticsearch on your machine. If not, follow the installation guide here to set it up. Next, we need to create an index and define a mapping in Elasticsearch so that it knows how to map the data that will be stored in it and can serve results faster. To do so, I’ll use Elasticsearch’s HTTP API by issuing the following curl request from the command line:
curl -X PUT \
  http://localhost:9200/persons/ \
  -H 'content-type: application/json' \
  -d '{
    "mappings": {
      "person": {
        "properties": {
          "name": {
            "type": "text"
          },
          "suggest": {
            "type": "completion"
          }
        }
      }
    }
  }'
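For readers who would rather keep the index definition next to the application code, here is a minimal sketch of the same request body as a plain JavaScript object. The helper name buildPersonsIndexBody is hypothetical and not part of the original project. It declares the name field as text, the non-deprecated replacement for string since Elasticsearch 5; I'm assuming a 5.x cluster here.

```javascript
// Sketch only: the "persons" index definition as a plain object.
// buildPersonsIndexBody is a hypothetical helper, not part of the original project.
function buildPersonsIndexBody() {
  return {
    mappings: {
      person: {
        properties: {
          // Full-text field holding the person's name ("text" replaces "string" in ES 5+).
          name: { type: "text" },
          // Completion field that backs the suggester.
          suggest: { type: "completion" }
        }
      }
    }
  };
}

// Serializing the object yields the JSON payload sent by the curl request.
console.log(JSON.stringify(buildPersonsIndexBody(), null, 2));
```

With the elasticsearch package, an object like this could presumably be passed as the body of the client's indices.create call, but the curl request is all this post relies on.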
This request creates an index named persons with a type person that has a name property. The suggest field will be used for completion, hence its type is completion. Now I will create a module, elasticService.js, which can be imported wherever I want to interact with Elasticsearch.
import elasticsearch from "elasticsearch";

const elasticClient = new elasticsearch.Client({
  host: "localhost:9200",
  log: "info"
});

const elasticService = {};
const index = "persons";
const type = "person";

/**
 * Add a person
 */
elasticService.addPerson = name => {
  return elasticClient.index({
    index,
    type,
    body: {
      name,
      suggest: name.split(" ")
    }
  });
};

/**
 * Fetch suggestions
 */
elasticService.getSuggestions = input => {
  return elasticClient.suggest({
    index,
    body: {
      persons: {
        prefix: input,
        completion: {
          field: "suggest",
          fuzzy: true,
          size: 10
        }
      }
    }
  });
};

export default elasticService;
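To make the two request bodies built by this module easier to inspect without a running cluster, here is a small sketch that constructs them as plain objects. The helper names toSuggestInputs and buildSuggestBody are hypothetical; the field names and options are the ones used in the module. As an assumption on my part, the splitter also tolerates repeated or surrounding whitespace, which a plain split(" ") would turn into empty strings.

```javascript
// Hypothetical helpers mirroring the bodies built in elasticService.js.

// Split a full name into completion inputs. Unlike split(" "), this ignores
// leading, trailing and repeated whitespace instead of emitting empty strings.
function toSuggestInputs(name) {
  return name.trim().split(/\s+/).filter(Boolean);
}

// Build the suggest request body. "persons" is the name chosen for the
// suggestion and is also the key under which results come back.
function buildSuggestBody(input, size = 10) {
  return {
    persons: {
      prefix: input,
      completion: { field: "suggest", fuzzy: true, size }
    }
  };
}

console.log(toSuggestInputs("Pranav Prakash")); // the array ["Pranav", "Prakash"]
console.log(JSON.stringify(buildSuggestBody("ranbir")));
```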
The addPerson function simply adds the person’s name to the persons index and populates the suggest field with the array of strings obtained by splitting the name. For example, if name is Pranav Prakash, the suggest field will contain ["Pranav", "Prakash"]. getSuggestions uses the Completion Suggester API to return suggestions. I have set fuzzy to true, which returns results even if there is a typo in the search input. The size parameter simply limits the number of suggestions, in this case to 10. Now that the module is complete, I will integrate it with the Koa routes. I’ve populated routes.js with two routes as below:
import Router from "koa-router";
import elasticService from "./elasticService";

const router = new Router();

/**
 * Get suggestions
 */
router.get("/suggest/:term", async ctx => {
  ctx.body = await elasticService.getSuggestions(ctx.params.term);
});

/**
 * Create person
 */
router.post("/", async ctx => {
  ctx.body = await elasticService.addPerson(ctx.request.body.name);
});

export default router;
These routes are quite simple: the get("/suggest/:term") route returns the suggestions and the post("/") route adds a person to Elasticsearch. Now we’ll start the server using the following command:
npm run dev
Now I need to add some data to Elasticsearch so that I can run searches against it. I’ll do that by issuing the following POST requests from the command line:
curl -X POST \
  http://localhost:8080/ \
  -H 'content-type: application/json' \
  -d '{
    "name" : "Ranbir Kapoor"
  }'

curl -X POST \
  http://localhost:8080/ \
  -H 'content-type: application/json' \
  -d '{
    "name" : "Ranvir Singh"
  }'

curl -X POST \
  http://localhost:8080/ \
  -H 'content-type: application/json' \
  -d '{
    "name" : "Ranveer Singh"
  }'
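The POST route hands the name field of the request body straight to addPerson, so a request without a name would fail when the name is split. As a hedged sketch of one way to guard against that, the hypothetical helper below (parsePersonName is my own name, not part of the original routes) checks the parsed body before anything is sent to Elasticsearch.

```javascript
// Hypothetical validator for the parsed POST body; not part of the original routes.
// Returns { ok: true, name } for a usable name, or { ok: false, error } otherwise.
function parsePersonName(body) {
  const name = body && typeof body.name === "string" ? body.name.trim() : "";
  if (name.length === 0) {
    return { ok: false, error: "name must be a non-empty string" };
  }
  return { ok: true, name };
}

console.log(parsePersonName({ name: "Ranbir Kapoor" }));
console.log(parsePersonName({}));
```

In the route, a failed check could set a 400 status on ctx and return the error message instead of calling addPerson.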
Now I’ll query the application using the command below:
curl -X GET \
http://localhost:8080/suggest/ranbir
And voila, I’m presented with the results below:
{
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "persons": [
    {
      "text": "ranbir",
      "offset": 0,
      "length": 6,
      "options": [
        {
          "text": "Ranbir",
          "_index": "persons",
          "_type": "person",
          "_id": "AV7Sns-uOy-E67GQK4As",
          "_score": 5,
          "_source": {
            "name": "Ranbir Kapoor",
            "suggest": [
              "Ranbir",
              "Kapoor"
            ]
          }
        },
        {
          "text": "Ranvir",
          "_index": "persons",
          "_type": "person",
          "_id": "AV7SnraQOy-E67GQK4Ar",
          "_score": 3,
          "_source": {
            "name": "Ranvir Singh",
            "suggest": [
              "Ranvir",
              "Singh"
            ]
          }
        }
      ]
    }
  ]
}
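A client usually only needs the matched names rather than the whole response. Here is a minimal sketch of pulling them out of a response shaped like the one above; extractNames is a hypothetical helper, and the sample object is a trimmed copy of that response.

```javascript
// Hypothetical helper: collect the stored names from a suggest response.
// suggestionName is the key used in the request body ("persons" in this post).
function extractNames(response, suggestionName = "persons") {
  const entries = response[suggestionName] || [];
  const names = [];
  for (const entry of entries) {
    for (const option of entry.options || []) {
      if (option._source && option._source.name) {
        names.push(option._source.name);
      }
    }
  }
  return names;
}

// Trimmed copy of the response shown above.
const sample = {
  persons: [
    {
      text: "ranbir",
      options: [
        { text: "Ranbir", _score: 5, _source: { name: "Ranbir Kapoor" } },
        { text: "Ranvir", _score: 3, _source: { name: "Ranvir Singh" } }
      ]
    }
  ]
};

console.log(extractNames(sample)); // the array ["Ranbir Kapoor", "Ranvir Singh"]
```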
Notice how fuzzy searching has returned results for both Ranbir and Ranvir. Pretty neat, eh? The only difference is in their score, which Elasticsearch calculates internally based on how well the query term matches the current result.
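Fuzzy matching is usually described in terms of edit distance: the number of single-character insertions, deletions or substitutions needed to turn one string into another. The sketch below is a standalone illustration of that idea, not Elasticsearch's internal implementation; editDistance is a hypothetical helper. It shows that ranvir is one edit away from the query ranbir, while ranveer, the third name indexed earlier, is three edits away.

```javascript
// Classic Levenshtein distance using two rolling rows.
// Illustration only; Elasticsearch uses its own automaton-based matching.
function editDistance(a, b) {
  // prev[j] holds the distance between the first i-1 chars of a and the first j chars of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

console.log(editDistance("ranbir", "ranvir"));  // 1
console.log(editDistance("ranbir", "ranveer")); // 3
```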
The complete source code for this project is available on GitHub here. This post just scratches the surface of what can be done with Elasticsearch. You can dive deeper by specifying a weight for suggestions, trying out a different analyzer and search_analyzer, increasing or decreasing fuzziness, or using regex queries, depending on your use case.
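As a starting point for two of those options, here is a hedged sketch of how the request bodies might change. It assumes the completion field accepts an object with input and weight, and that fuzzy accepts an options object with a fuzziness value, as described in the Completion Suggester documentation. The helper names are hypothetical.

```javascript
// Hypothetical variants of the bodies built in elasticService.js.

// Document body with a weighted completion field. Suggestions with a
// higher weight are ranked above those with a lower one.
function buildWeightedPerson(name, weight) {
  return {
    name,
    suggest: { input: name.split(" "), weight }
  };
}

// Suggest body with an explicit fuzziness instead of fuzzy: true.
function buildFuzzySuggestBody(input, fuzziness = 1, size = 10) {
  return {
    persons: {
      prefix: input,
      completion: { field: "suggest", fuzzy: { fuzziness }, size }
    }
  };
}

console.log(JSON.stringify(buildWeightedPerson("Ranbir Kapoor", 10)));
console.log(JSON.stringify(buildFuzzySuggestBody("ranbir", 2)));
```

These objects would replace the body values passed to the index and suggest calls in the service module; the weight and fuzziness values shown are arbitrary examples.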