😓 Leader speechless: you actually used LIKE to implement keyword search for articles?

Coke and his team have been working on an article community platform recently. Because the team is short-staffed, both the front end and the back end are written by front-end developers, and the back end is implemented with Nest.

One day, Coke received a requirement: search for articles by keyword. The user enters a keyword, which is fuzzy-matched against the following fields in the article table:

title : Title
content : Content
introduction : Introduction

Coke thought to himself, “That’s simple enough, just a LIKE fuzzy query.”

Initial fuzzy search implementation

In less than ten minutes, Coke had finished writing the back-end code:

  async searchArticle(params: SearchArticleDto) {
    const { keyword, pageNo, pageSize } = params;
    const paginationService = new PaginationService<ArticleEntity>(
      this.articleRepository,
    );
    const res = await paginationService.paginate({
      page: pageNo,
      pageSize,
      options: {
        where: [
          { content: Like(`%${keyword}%`) },
          { title: Like(`%${keyword}%`) },
          { introduction: Like(`%${keyword}%`) },
        ],
        select: ['id', 'categoryId', 'introduction', 'title', 'creatorName'],
      },
    });
    return res;
  }

Suppose we enter the parameter:

{
    "keyword": "nest",
    "pageNo": 1,
    "pageSize": 10
}

Then the SQL statement actually executed is:

SELECT id,title,categoryId,creatorName,introduction from articles WHERE content LIKE '%nest%' OR title LIKE '%nest%' OR introduction LIKE '%nest%' LIMIT 10 OFFSET 0

Then he told the leader he was done, thinking to himself: last time you said I was slow for taking five days on a task; this time I finished in ten minutes, so you have nothing to complain about.

The leader took a look at the code Coke had written and said with a sigh:

First of all, I agree that using LIKE is the easiest way to meet this requirement. However, the drawbacks of implementing it this way are quite obvious:

A LIKE pattern with a leading wildcard, such as '%nest%', cannot use an ordinary index, and we are not going to index those fields anyway. As the amount of data grows, this implementation will become very inefficient.
This implementation can be case-sensitive, depending on the database and column collation. If the user searches for react but everything stored in the database is React , nothing will be found.
There is no word segmentation logic. For example, given a whole sentence as input, some search engines split it into words such as “I”, “like”, “use”, “MeiliSearch”, “do”, and “full-text search”, and search for each of them.
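
The second and third points can be seen with plain string operations. The sketch below is only an illustration, assuming a case-sensitive collation on the database side; containsExact, tokenize, and matchesAnyWord are hypothetical helpers, and the simple split stands in for the far more capable tokenizers that real search engines use.

```typescript
// Case-sensitive substring match, similar to LIKE '%keyword%' under a
// case-sensitive collation.
function containsExact(text: string, keyword: string): boolean {
  return text.includes(keyword);
}

// Naive word segmentation: lowercase, then split on non-alphanumeric characters.
function tokenize(input: string): string[] {
  return input
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((word) => word.length > 0);
}

// True when at least one word of the query appears among the words of the text.
function matchesAnyWord(text: string, query: string): boolean {
  const words = new Set(tokenize(text));
  return tokenize(query).some((word) => words.has(word));
}

const sampleTitle = 'Getting started with React';
console.log(containsExact(sampleTitle, 'react')); // false
console.log(matchesAnyWord(sampleTitle, 'react tutorial')); // true
```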

It looks like you have never used a search engine, right? There are many great search engines on the market, such as Elasticsearch (ES), which is very powerful. But for a small team like ours, it is not a good fit.

First, it takes up a lot of memory and CPU resources; second, it is hard to get started with; third, some of its word segmentation features require additional tuning.

You should learn about MeiliSearch, which is what we will use for this search function. It is very fast and light on resources, and it is simple to deploy and easy to configure.

After the chat, Coke started looking up information about MeiliSearch and got ready to deploy and use it.

MeiliSearch

We can install MeiliSearch on Docker by following the steps below:

Pull the MeiliSearch image: First, use the following command to pull the official MeiliSearch image from Docker Hub .
```
docker pull getmeili/meilisearch
```
Create and run the MeiliSearch container: Then, use the following command to create and run a MeiliSearch container in Docker .
```
docker run -d --rm \
-p 7700:7700 \
-e MEILI_MASTER_KEY=my_custom_master_key \
getmeili/meilisearch
```
In this command:

docker run : This command runs a container in Docker.
-d : This option tells Docker to run the container in the background.
--rm : This option tells Docker to automatically delete the container when it stops running.
-p 7700:7700 : This option maps the container’s port to the host. The MeiliSearch server runs on port 7700 inside the container, and the host can access MeiliSearch via localhost:7700 .
-e MEILI_MASTER_KEY=my_custom_master_key : This option sets the master key for MeiliSearch , which is used to authenticate API requests.

Access the MeiliSearch console: After completing the steps above, open the MeiliSearch web console at http://localhost:7700 .

MeiliSearch First Experience

Here’s a look at some of the core concepts in MeiliSearch :

Index: An index is a logical unit used to organize and store data in MeiliSearch . Each index is a separate collection of data that contains a set of documents that can be searched, filtered, sorted, and so on. In MeiliSearch , the index is the basic unit of search.
Document: A document is the actual data object stored in MeiliSearch . Each document is a structured data record, typically in JSON format, made up of multiple fields, each with a field name and a corresponding value. For example, in an index named "books" , each document may represent a book, with fields such as the book title, author, and publication date.

That is, we need to create an articles index, and then add documents in the following format to it:

{
    id:1,
    title:'title',
    content:'content',
    introduction:'introduction'
}

To create the index, send a POST request to the /indexes route; this creates an index named articles .
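
A minimal sketch of that request, assuming the container started earlier (http://localhost:7700 with the master key my_custom_master_key) and a MeiliSearch version that authenticates with an Authorization: Bearer header; older versions may expect a different header. buildCreateIndexRequest is a hypothetical helper that only assembles the request so it can be inspected before it is sent.

```typescript
interface HttpRequest {
  url: string;
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: assembles the POST /indexes request for a new index.
function buildCreateIndexRequest(
  host: string,
  apiKey: string,
  uid: string,
): HttpRequest {
  return {
    url: `${host}/indexes`,
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${apiKey}`, // assumption: Bearer auth is accepted
    },
    // primaryKey names the field that uniquely identifies each document.
    body: JSON.stringify({ uid, primaryKey: 'id' }),
  };
}

const createReq = buildCreateIndexRequest(
  'http://localhost:7700',
  'my_custom_master_key',
  'articles',
);
console.log(createReq.method, createReq.url, createReq.body);
```

In a runtime that provides fetch, the result could be sent with fetch(createReq.url, createReq).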

After building the index, we don’t want the id field to be searchable. We can restrict the searchable fields as follows:

Send a request to /indexes/articles/settings with the parameters:

{
    "searchableAttributes": [
        "title",
        "content",
        "introduction"
    ]
}

This indicates that only the title , content , and introduction fields can be searched.
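
The same payload can be derived from the full field list instead of being typed by hand. This is a sketch under assumptions: buildSearchableSettings is a hypothetical helper, and the HTTP method is deliberately left out because it differs between MeiliSearch versions.

```typescript
interface SettingsRequest {
  path: string;
  body: { searchableAttributes: string[] };
}

// Hypothetical helper: builds the settings payload, leaving out excluded fields.
function buildSearchableSettings(
  index: string,
  fields: string[],
  exclude: string[],
): SettingsRequest {
  const searchableAttributes = fields.filter((field) => !exclude.includes(field));
  return { path: `/indexes/${index}/settings`, body: { searchableAttributes } };
}

const settings = buildSearchableSettings(
  'articles',
  ['id', 'title', 'content', 'introduction'],
  ['id'],
);
console.log(settings.path); // /indexes/articles/settings
console.log(JSON.stringify(settings.body));
// {"searchableAttributes":["title","content","introduction"]}
```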

Then let’s try it by pushing an article into the articles index.

Send a POST request to the /indexes/:index/documents route to push a piece of test data into MeiliSearch :

[
    {
        "id": "1",
        "title": "Test Title",
        "content": "Test Content",
        "introduction": "Test Description"
    }
]

The data we’ve pushed can be seen in the MeiliSearch web UI.

The same data can also be retrieved by sending a search request.

Explain the parameters:

q : Content of the query
attributesToRetrieve : Fields to be returned
limit : Number of articles per page
offset : Offset
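
As a small sketch of how these four parameters fit together, the helper below turns a keyword and 1-based page information into a search request body. buildSearchBody is a hypothetical name, and the returned fields are the ones used later in this article.

```typescript
interface SearchBody {
  q: string;
  attributesToRetrieve: string[];
  limit: number;
  offset: number;
}

// Hypothetical helper: converts 1-based page info into the parameters listed above.
function buildSearchBody(
  keyword: string,
  pageNo: number,
  pageSize: number,
): SearchBody {
  return {
    q: keyword,
    attributesToRetrieve: ['id', 'title', 'introduction'],
    limit: pageSize,
    offset: (pageNo - 1) * pageSize, // page 1 -> 0, page 2 -> pageSize, ...
  };
}

const secondPage = buildSearchBody('nest', 2, 10);
console.log(secondPage.limit, secondPage.offset); // 10 10
```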

Data Push

After roughly understanding the usage of MeiliSearch , you can write a test interface or script to synchronize the database data to MeiliSearch .

First, let’s integrate MeiliSearch into our project by installing the client library: npm install meilisearch

Then wrap a service as follows:

import { Injectable } from '@nestjs/common';
import { ConfigService } from '@nestjs/config';
import { MeiliSearch, SearchParams } from 'meilisearch';
@Injectable()
export class MeiliSearchService {
  private readonly client: MeiliSearch;
  constructor() {
    const configService = new ConfigService();
    const host = configService.get<string>(
      'MEILI_HOST',
      'http://localhost:7700',
    );
    const apiKey = configService.get<string>('MEILI_MASTER_KEY', 'master_key');
    this.client = new MeiliSearch({
      host,
      apiKey,
    });
  }

  async search(indexName: string, query: string, options: SearchParams) {
    return await this.client.index(indexName).search(query, options);
  }
  async addDocument(indexName: string, documents: Record<string, any>[]) {
    return await this.client.index(indexName).addDocuments(documents);
  }
}

Here we instantiate a MeiliSearch client based on the configuration, and then wrap two simple methods for searching and adding documents.

Then we write an interface to synchronize all the data from the database to MeiliSearch .

 async pushAllArticles() {
    const list = await this.articleRepository.find({
      where: {
        status: 1,
        isDeleted: 0,
      },
    });
    await this.meiliSearchService.addDocument(ARTICLE_INDEX, list);
  }

Here the articles that have been published (status is 1) and not deleted (isDeleted is 0) are queried and pushed to MeiliSearch . However, the call reports an error.

The error says that a request header is missing, whose value should be the master key or some other key. I printed out the MeiliSearch instance and found that it puts the master key in the Authorization field.

It may be that the MeiliSearch version in my Docker installation doesn’t quite match the version of the JS SDK, but the SDK does provide a way to inject request headers. The client can be instantiated as follows:

this.client = new MeiliSearch({
  host,
  apiKey,
  requestConfig: {
    headers: {
      'X-MEILI-API-KEY': apiKey,
      'Content-Type': 'application/json',
    },
  },
});

Then the push succeeds.

Once the full data set has been pushed, we still need to push article data into MeiliSearch whenever an article is published.

  async pushAllArticles() {
    const list = await this.articleRepository.find({
      where: {
        status: 1,
      },
    });
    await this.meiliSearchService.addDocument(ARTICLE_INDEX, list);
  }

The implementation is much the same, so I won’t go into it here.
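
As a hedged sketch of what pushing a single article on publish could look like, the helper below decides whether an article belongs in the index and reduces it to the fields of the document format shown earlier. ArticleLike, SearchDocument, and toSearchDocument are hypothetical names; the status and isDeleted checks mirror the full synchronization query.

```typescript
// Hypothetical shape of an article row; the real ArticleEntity may have more fields.
interface ArticleLike {
  id: number;
  title: string;
  content: string;
  introduction: string;
  status: number; // 1 = published
  isDeleted: number; // 0 = not deleted
}

interface SearchDocument {
  id: number;
  title: string;
  content: string;
  introduction: string;
}

// Returns the document to push, or null when the article should not be indexed.
function toSearchDocument(article: ArticleLike): SearchDocument | null {
  if (article.status !== 1 || article.isDeleted !== 0) {
    return null;
  }
  const { id, title, content, introduction } = article;
  return { id, title, content, introduction };
}

// Assumed usage inside the publish flow, with the service defined earlier:
//   const doc = toSearchDocument(article);
//   if (doc) await this.meiliSearchService.addDocument(ARTICLE_INDEX, [doc]);
```

Because adding a document whose id already exists replaces the stored one, pushing an edited article again updates it in place.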

Search Interface Implementation

After the data has been pushed, we can rewrite the search function:

  async searchArticle(params: SearchArticleDto) {
    const { keyword, pageNo, pageSize } = params;
    // Page numbers start at 1, so page N begins at offset (N - 1) * pageSize.
    const res: any = await this.meiliSearchService.search(
      ARTICLE_INDEX,
      keyword,
      {
        attributesToRetrieve: ['id', 'title', 'introduction'],
        limit: pageSize,
        offset: (pageNo - 1) * pageSize,
      },
    );
    const hits = res.hits;
    const nbHits = res.nbHits;
    const totalPage = Math.ceil(nbHits / pageSize);
    return {
      list: hits,
      total: nbHits,
      pageSize: pageSize,
      currentPage: pageNo,
      totalPage,
      // Use >= so that an empty result (totalPage === 0) also counts as the end.
      isEnd: pageNo >= totalPage,
    };
  }

Explanation of the code above:

attributesToRetrieve : The fields MeiliSearch should return
hits : The documents matched on the current page
nbHits : Total number of matching articles
Based on the paging information and keyword passed by the front end, we call MeiliSearch to run the search, then assemble the paginated response and return it to the front end.
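
The paging arithmetic can be checked in isolation. The sketch below uses a hypothetical toPage helper with the same field names as the response; it compares with >= rather than strict equality so that an empty result set also counts as the last page.

```typescript
interface Page<T> {
  list: T[];
  total: number;
  pageSize: number;
  currentPage: number;
  totalPage: number;
  isEnd: boolean;
}

// Hypothetical helper: wraps one page of hits in the paging structure.
function toPage<T>(
  hits: T[],
  total: number,
  pageNo: number,
  pageSize: number,
): Page<T> {
  const totalPage = Math.ceil(total / pageSize);
  return {
    list: hits,
    total,
    pageSize,
    currentPage: pageNo,
    totalPage,
    isEnd: pageNo >= totalPage, // also true for an empty result set
  };
}

// 25 matches at 10 per page give 3 pages; page 3 holds the last 5 and is the end.
const lastPage = toPage(['a', 'b', 'c', 'd', 'e'], 25, 3, 10);
console.log(lastPage.totalPage, lastPage.isEnd); // 3 true
console.log(toPage([], 0, 1, 10).isEnd); // true
```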

Keyword highlighting

The call to search can also take an attributesToHighlight field, which highlights keywords in the results. For example, to highlight keywords in the title and introduction, I can set:

attributesToHighlight: ['title', 'introduction']

As you can see in the result, the matched text has been wrapped in an em tag, which the front end can use to style the matches differently.
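
On the consuming side, the highlighted versions arrive in a _formatted object on each hit. The sketch below shows one way the front end might pick the text to display; ArticleHit and displayTitle are hypothetical names, and the sample hit is made up for illustration.

```typescript
// Assumed shape of a hit when attributesToHighlight is set: the _formatted
// object holds the highlighted versions of the requested fields.
interface ArticleHit {
  id: number;
  title: string;
  introduction: string;
  _formatted?: { title?: string; introduction?: string };
}

// Prefers the highlighted title and falls back to the plain one.
function displayTitle(hit: ArticleHit): string {
  return hit._formatted?.title ?? hit.title;
}

const sampleHit: ArticleHit = {
  id: 1,
  title: 'Getting started with nest',
  introduction: 'A short intro',
  _formatted: { title: 'Getting started with <em>nest</em>' },
};
console.log(displayTitle(sampleHit)); // Getting started with <em>nest</em>
```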


By hbb
