Implementation: developing an accurate search engine and introducing multiple improvements
N-iX has helped the client develop two services for the new search engine: the Search service and the Configuration service.
The Search service (on both the client’s website and the mobile application) uses the main search index and two additional indexes stored in Elasticsearch. The main search index helps users quickly find information on the website or in the mobile app. It maps search queries to documents or URLs that might appear in the results.
The first additional index supports the autocomplete feature. When a user does not select one of the offered autocomplete options, the system queries this index, which returns metadata describing the filters to apply in the request to the main search index.
The second additional index identifies various parameters in search requests, such as brands, categories, or gender. It uses exact matching and word stemming to recognize different forms of the same word even if spelling errors are present. This index helps configure exact filters for the search, improving its accuracy.
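To illustrate how recognized parameters can become exact filters, here is a minimal sketch in Python. It is not the client's implementation: the `KNOWN_TERMS` table is a hypothetical stand-in for the second additional index, and the field names (`title`, `brand`, `category`, `gender`) are assumptions. Only the Elasticsearch query DSL shape (`bool`, `must`, `filter`, `term`, `match`) is standard.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical lookup table standing in for the second additional index.
# Singular and plural forms are listed explicitly to mimic stemming.
KNOWN_TERMS: Dict[str, Tuple[str, str]] = {
    "nike": ("brand", "Nike"),
    "adidas": ("brand", "Adidas"),
    "sneaker": ("category", "Sneakers"),
    "sneakers": ("category", "Sneakers"),
    "women": ("gender", "female"),
    "men": ("gender", "male"),
}


def extract_filters(query: str) -> Tuple[Dict[str, str], str]:
    """Split a raw query into recognized filters and leftover free text."""
    filters: Dict[str, str] = {}
    leftover: List[str] = []
    for token in query.lower().split():
        if token in KNOWN_TERMS:
            field, value = KNOWN_TERMS[token]
            filters[field] = value
        else:
            leftover.append(token)
    return filters, " ".join(leftover)


def build_main_query(query: str) -> Dict[str, Any]:
    """Build a request body for the main index with exact-match filters."""
    filters, text = extract_filters(query)
    # Fall back to match_all when every token was consumed as a filter.
    must = [{"match": {"title": text}}] if text else [{"match_all": {}}]
    return {
        "query": {
            "bool": {
                "must": must,
                "filter": [{"term": {f: v}} for f, v in filters.items()],
            }
        }
    }
```

With this sketch, a query such as "nike running sneakers" would search the free text "running" while filtering on brand and category, which is the effect the paragraph above describes.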
The Configuration service is a separate platform that includes a list of synonyms, URL redirections based on search requests, and mappings of search phrases to specific products, brands, and categories. It helps the search engine better identify products that users search for. N-iX experts have also developed an intuitive UI for the Configuration service.
Since the new search engine was released to production with the two services, N-iX has been continuously monitoring the search results, comparing them with those of the previous search solution, and implementing improvements.
Additionally, N-iX has implemented a testing script that sends up to 400 of the most popular search requests and gathers the results in an HTML report. The report helps analyze and compare the quality of search results from the new system, competitors, and experimental versions of the search. It also highlights the top brands and top categories for each search request.
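A comparison report of this kind could be sketched as follows. This is an illustration under assumptions rather than the actual script: each search system is represented by a hypothetical callable that takes a query and returns a list of result titles, and the report is reduced to a single HTML table.

```python
import html
from typing import Callable, Dict, List

MAX_QUERIES = 400  # upper bound on requests per run, as stated in the text


def build_report(
    queries: List[str],
    engines: Dict[str, Callable[[str], List[str]]],
    top_n: int = 3,
) -> str:
    """Run each query against every engine and render a side-by-side table."""
    names = list(engines)
    header = "".join(f"<th>{html.escape(name)}</th>" for name in names)
    rows = []
    for query in queries[:MAX_QUERIES]:
        cells = []
        for name in names:
            results = engines[name](query)[:top_n]
            # Escape titles so product names cannot break the markup.
            joined = "<br>".join(html.escape(title) for title in results)
            cells.append(f"<td>{joined}</td>")
        rows.append(f"<tr><td>{html.escape(query)}</td>{''.join(cells)}</tr>")
    return (
        "<table><tr><th>Query</th>" + header + "</tr>" + "".join(rows) + "</table>"
    )
```

In practice, the callables would wrap HTTP requests to the new search, the previous solution, or an experimental version, and the returned string would be written to a file.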
We have also used the CLIP neural network that unites text and image domains to ensure that all products are placed in corresponding categories with 100% accuracy.
Our engineers have used a Spark pipeline to gather data from several databases and build a new version of the search index in Elasticsearch. A Lambda function, launched every 5 minutes, updates information about product availability in warehouses and for pre-orders. This improves the customer experience, since unavailable products are removed from the results and not shown on the website.
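The periodic availability update can be pictured with the sketch below. It makes several assumptions that are not in the source: the index name `products`, the `available` and `preorder` fields, and the idea of flagging products rather than deleting documents. The output follows the newline-delimited format accepted by the Elasticsearch `_bulk` endpoint, where each `update` action line is followed by a partial `doc`.

```python
import json
from typing import Dict, Iterable

INDEX_NAME = "products"  # hypothetical index name


def build_bulk_body(stock: Dict[str, int], preorder_ids: Iterable[str] = ()) -> str:
    """Turn warehouse stock levels into a bulk partial-update payload."""
    preorder = set(preorder_ids)
    lines = []
    for product_id, quantity in stock.items():
        is_preorder = product_id in preorder
        # A product stays visible if it is in stock or open for pre-order.
        available = quantity > 0 or is_preorder
        lines.append(json.dumps({"update": {"_index": INDEX_NAME, "_id": product_id}}))
        lines.append(json.dumps({"doc": {"available": available, "preorder": is_preorder}}))
    if not lines:
        return ""
    return "\n".join(lines) + "\n"  # bulk payloads must end with a newline
```

A scheduled function would fetch current stock, call `build_bulk_body`, and send the result to the `_bulk` endpoint; the search query would then filter on `available` so that out-of-stock items do not appear.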
We have used a Redis cache to save filters and search requests that users have made previously and to store all search engine configurations. Repeated requests are served from the cache instead of being sent to Elasticsearch and its search indexes again, which speeds up search.
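The caching approach described above resembles the common cache-aside pattern, sketched here under assumptions. The key scheme, the 300-second TTL, and the `InMemoryCache` class are hypothetical; the class only mirrors the `get` and `setex` method names of the redis-py client so that the example runs without a Redis server.

```python
import hashlib
import json
from typing import Any, Callable, Dict, Optional


class InMemoryCache:
    """Stand-in exposing get/setex, the same method names redis-py uses."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def setex(self, key: str, ttl_seconds: int, value: str) -> None:
        self._data[key] = value  # expiry is ignored in this stand-in


def cache_key(query: str, filters: Dict[str, str]) -> str:
    """Derive a stable key from the normalized query and its filters."""
    payload = json.dumps({"q": query.strip().lower(), "f": filters}, sort_keys=True)
    return "search:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_search(
    cache: Any,
    query: str,
    filters: Dict[str, str],
    search_fn: Callable[[str, Dict[str, str]], Any],
    ttl_seconds: int = 300,
) -> Any:
    """Return a cached result if present; otherwise search and store it."""
    key = cache_key(query, filters)
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)
    result = search_fn(query, filters)
    cache.setex(key, ttl_seconds, json.dumps(result))
    return result
```

Normalizing the query before hashing means that "Red Shoes" and " red shoes " share one cache entry, so only the first of the two requests reaches the search backend.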
Finally, we have established automated testing as a service. To ensure that the system meets its non-functional performance requirements (such as throughput, latency, and memory usage), we have introduced performance engineering and created a set of basic test scenarios covering load, stress, web, and custom tests. We have also set up the supporting infrastructure by running the tests in Docker and storing the code in a GitHub repository. Furthermore, our team has configured report generation in Docker after each automated test run and publishes the results in Confluence, which provides quick and valuable insights into the system's performance. To accelerate the solution's time-to-market, we have also set up a CI/CD pipeline that launches the automated tests.