Insight About Kngine Architecture - Part 2
In our previous post, we discussed the challenges that we are facing, our goals, and how we are indexing the data. Today we will cover the following topics:
- Knowledge-Based Information Retrieval
- Storage and Processing Systems
Knowledge-Based Information Retrieval:
As we mentioned before, the first step toward providing meaningful search results is to understand both the document and the query. Unlike traditional search engines, Kngine tries to understand each document's content and index it in a well-structured knowledge base called 'Live Objects', which allows us to perform what we call 'Knowledge-Based Information Retrieval'.
First, the Query Layer analyzes the query. Kngine then tries to identify the concept through the Live Objects Identity Indexes, which are inverted indexes we use to recognize concepts and to handle words that have multiple meanings. For example, when a user searches for ‘Cairo’, Kngine will find several concepts, such as ‘Cairo, Egypt’, ‘Cairo, Illinois’, and ‘Cairo (software)’. These concepts are sorted inside the index by rank. The identity indexes also allow Kngine to support multiple languages, because every entry in the identity indexes refers to a concept rather than to a word in one particular language.
After identifying the concepts, Kngine fetches the concept information from Kngine Live Objects according to the concept type. What gets loaded depends on the analysis performed by the Query Layer. For example, 'Obama Books' returns Obama's information plus a list of all books written by Obama, while 'Obama' returns only Obama's information.
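To make the identity lookup more concrete, here is a minimal Python sketch of an inverted index that maps a term to a ranked list of concepts. It is only an illustration under our own assumptions: the `IdentityIndex` class, its method names, and the rank values are invented for this example and are not Kngine's actual code or data.

```python
from collections import defaultdict


class IdentityIndex:
    """Toy inverted index: term -> concepts, highest rank first (hypothetical)."""

    def __init__(self):
        # term -> list of (rank, concept_id) postings
        self._postings = defaultdict(list)

    def add(self, term, concept_id, rank):
        key = term.lower()
        self._postings[key].append((rank, concept_id))
        # Keep each posting list sorted by rank, best first.
        self._postings[key].sort(key=lambda posting: -posting[0])

    def lookup(self, term):
        # Return concept ids only; unknown terms give an empty list.
        return [cid for _, cid in self._postings.get(term.lower(), [])]


index = IdentityIndex()
# Rank values below are made up for illustration.
index.add("Cairo", "Cairo, Egypt", rank=0.9)
index.add("Cairo", "Cairo (software)", rank=0.2)
index.add("Cairo", "Cairo, Illinois", rank=0.4)
# A term in another language can point at the same concept.
index.add("القاهرة", "Cairo, Egypt", rank=0.9)

candidates = index.lookup("cairo")
```

Because entries point to concepts instead of strings, the Arabic term and the English term resolve to the same 'Cairo, Egypt' entry in this sketch.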
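The 'Obama' versus 'Obama Books' behavior can be sketched as follows. This is a simplified illustration, not the real Query Layer: the `IDENTITY`, `RELATIONS`, and `LIVE_OBJECTS` tables, the `analyze` and `fetch` functions, and the sample data are all assumptions made for the example.

```python
# Hypothetical stand-ins for the identity index and the Live Objects store.
IDENTITY = {"obama": "Barack Obama"}
RELATIONS = {"books"}
LIVE_OBJECTS = {
    "Barack Obama": {
        "type": "person",
        "info": {"name": "Barack Obama"},
        "books": ["Dreams from My Father", "The Audacity of Hope"],
    }
}


def analyze(query):
    """Split a query into one concept and the relations it asks for."""
    concept, relations = None, []
    for token in query.lower().split():
        if token in RELATIONS:
            relations.append(token)
        elif concept is None and token in IDENTITY:
            concept = IDENTITY[token]
    return concept, relations


def fetch(query):
    """Load the base information, plus only the relations the query named."""
    concept, relations = analyze(query)
    if concept is None:
        return None
    live_object = LIVE_OBJECTS[concept]
    result = {"concept": concept, "info": live_object["info"]}
    for relation in relations:
        if relation in live_object:
            result[relation] = live_object[relation]
    return result


plain = fetch("Obama")
with_books = fetch("Obama Books")
```

In this sketch `plain` carries only the base information, while `with_books` also carries the list of books, because the analysis step detected the extra 'books' relation.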
Storage and Processing Systems:
We built all of the storage and processing systems we use inside Kngine ourselves. Our storage system, the 'Kngine Storage System', is a rich set of disk-based data structures that serves a variety of use cases, such as:
- Inverted Index.
- Key-Value Storage System (Vina).
- Reliable Queue/Stack.
The main component of our storage system is Vina. The design of Vina, our key-value storage system, is similar to Redis, but Vina doesn't support a real delete operation. Vina's performance is also very competitive with other key-value storage systems such as Redis and Tokyo Tyrant. We use Vina in many areas, such as storing Kngine Live Objects and data processing.
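One common way for a disk-based key-value store to work without a real delete is to append a tombstone record instead of erasing data. The Python sketch below shows that idea only; we are not saying Vina is implemented this way. The `AppendOnlyStore` class, the JSON-lines file format, and the key names are all assumptions made for illustration.

```python
import json
import os
import tempfile


class AppendOnlyStore:
    """Toy key-value store backed by an append-only log (hypothetical)."""

    def __init__(self, path):
        self._index = {}  # key -> latest live value
        if os.path.exists(path):
            # Rebuild the in-memory index by replaying the log.
            with open(path, "r", encoding="utf-8") as log:
                for line in log:
                    self._apply(json.loads(line))
        self._log = open(path, "a", encoding="utf-8")

    def _apply(self, record):
        if record["deleted"]:
            self._index.pop(record["key"], None)
        else:
            self._index[record["key"]] = record["value"]

    def _append(self, record):
        self._log.write(json.dumps(record) + "\n")
        self._log.flush()
        self._apply(record)

    def put(self, key, value):
        self._append({"key": key, "value": value, "deleted": False})

    def get(self, key, default=None):
        return self._index.get(key, default)

    def delete(self, key):
        # Logical delete: a tombstone is appended, older records stay on disk.
        self._append({"key": key, "value": None, "deleted": True})

    def close(self):
        self._log.close()


path = os.path.join(tempfile.mkdtemp(), "store_sketch.log")
store = AppendOnlyStore(path)
store.put("live_object:cairo_egypt", {"type": "city"})
store.delete("live_object:cairo_egypt")
store.close()
```

After the delete, the key is no longer visible to readers, but both the original record and the tombstone remain in the log file. Reclaiming that space would need a separate compaction step, which this sketch leaves out.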
Built on top of our storage system, we have a small data processing framework that takes care of parallelizing and monitoring tasks. The framework is actor-based, similar to Microsoft Dryad, but it can also perform small Map-Reduce tasks.
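As a rough illustration of how a small Map-Reduce task can run on top of actors, here is a word count written with thread-based actors in Python. It is a sketch under our own assumptions and not Kngine's framework: the `Actor`, `Mapper`, and `Reducer` classes and the `word_count` function are invented for this example, and task monitoring is left out.

```python
import queue
import threading
from collections import Counter


class Actor(threading.Thread):
    """Minimal actor: a thread that handles messages from its own mailbox."""

    _STOP = object()  # sentinel message that ends the actor's loop

    def __init__(self):
        super().__init__(daemon=True)
        self.mailbox = queue.Queue()

    def send(self, message):
        self.mailbox.put(message)

    def stop(self):
        self.mailbox.put(self._STOP)

    def run(self):
        while True:
            message = self.mailbox.get()
            if message is self._STOP:
                break
            self.receive(message)

    def receive(self, message):
        raise NotImplementedError


class Reducer(Actor):
    """Merges partial counts sent by the mappers."""

    def __init__(self):
        super().__init__()
        self.totals = Counter()

    def receive(self, partial):
        self.totals.update(partial)


class Mapper(Actor):
    """Counts words in one document and forwards the partial result."""

    def __init__(self, reducer):
        super().__init__()
        self.reducer = reducer

    def receive(self, document):
        self.reducer.send(Counter(document.lower().split()))


def word_count(documents, workers=2):
    reducer = Reducer()
    mappers = [Mapper(reducer) for _ in range(workers)]
    reducer.start()
    for mapper in mappers:
        mapper.start()
    # Distribute documents round-robin across the mapper actors.
    for i, document in enumerate(documents):
        mappers[i % workers].send(document)
    # Mailboxes are FIFO, so each mapper finishes its work before stopping.
    for mapper in mappers:
        mapper.stop()
    for mapper in mappers:
        mapper.join()
    # All partial results are queued by now; let the reducer drain and stop.
    reducer.stop()
    reducer.join()
    return dict(reducer.totals)


counts = word_count(["cairo egypt", "Cairo Illinois"])
```

Each actor owns its state and only communicates through messages, so no locks are needed around the counters in this sketch.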