Insight About Kngine Architecture - Part 1
- Introduction.
- Motivation.
- Goals.
- Indexing & Building Universal Knowledge-Base.
- Knowledge-Based IR.
- Storage System.
- System Architecture.
Search engines today do a good job of helping people navigate the Web and find information, but they don’t do a very good job of enabling people to use the information they find. So I would like to group today's challenges into two primary categories: Information and Interface.
Information:
Despite the weak economy, a total of 3,892,179,868,480,350,000,000 (that's 3 sextillion, 892 quintillion, 179 quadrillion, 868 trillion, 480 billion, 350 million) new digital information bits were created in 2008 alone [1].
The information available on the Web today is more than abundant. Using it to generate search results is a great challenge, because we often need different kinds of results: a simple answer, structured information, a list of things, comparison tables, sentences with complete meaning, or the ability to navigate the information.
Interface:
Even if we knew exactly how to collect, organize, and index such information in a smart way that lets us dynamically adapt the result types to the queries, we would still face another challenge: how to present such an amount of information in a relatively good way, and how to let people use it through a small and easy interface.
Motivation:
Our motivation follows from the challenges above.
We still spend too much time reaching the information that we are looking for, even when it is the answer to a simple question. Today, to answer a simple question you must search, browse, and pore through pages.
Search engines don’t understand the indexed documents or the queries. They still struggle with inverted-index technology. We believe that we have already reached the limits of the current technology and need to think about a new beginning.
Goals:
Kngine aims to organize humanity's systematic knowledge and experiences and make them accessible to everyone. We aim to collect and organize all objective data and make it easy to access. Our goal is to build a Web 3.0 search engine on the advances in Web search engines, the Semantic Web, and data representation technologies: a new form of Web search engine that will unleash a revolution of new possibilities.
Indexing & Building Universal Knowledge-Base:
In order to provide meaningful search results, we must understand the indexed documents and the query. Instead of indexing the Web in inverted-index fashion, we chose to index the Web in a different form that allows us to unlock the meaning: a Knowledge-Base, or Semantic Network.
But it is incredibly hard to build self-supervised machine-learning indexers. So instead, we decided to build a bootstrap Knowledge-Base that we can use to index the whole Web. How?
- Build an extensible Knowledge-Base.
- The indexers try to understand the documents and extract information by consulting the Knowledge-Base.
- The extracted information is added to the Knowledge-Base.
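The three steps above can be sketched as a small loop. This is only an illustration under stated assumptions, not Kngine's actual indexer: the `phrases` table (surface phrases that express each property), the `extract_facts` and `index_document` helpers, and the sample document are all hypothetical, and simple pattern matching stands in for real document understanding.

```python
import re

# Step 1: a small seed Knowledge-Base (hypothetical data).
# Facts are (subject, property, value) triples.
facts = {
    ("Linus Torvalds", "Innovator", "Linux"),
    ("California", "State of", "USA"),
}
# Assumption: the Knowledge-Base also records how each property is
# phrased in plain text, so the indexer can consult it.
phrases = {"Innovator": "created", "State of": "is a state of"}


def extract_facts(document, phrases):
    """Step 2: scan sentences for '<subject> <phrase> <value>' patterns."""
    found = set()
    for sentence in re.split(r"[.!?]\s*", document):
        for prop, phrase in phrases.items():
            pattern = rf"(.+?) {re.escape(phrase)} (.+)"
            match = re.fullmatch(pattern, sentence.strip())
            if match:
                found.add((match.group(1), prop, match.group(2)))
    return found


def index_document(document, facts, phrases):
    """Step 3: add newly extracted facts to the Knowledge-Base."""
    new_facts = extract_facts(document, phrases) - facts
    facts |= new_facts  # updates the shared set in place
    return new_facts


doc = "Texas is a state of USA. Guido van Rossum created Python. The weather is nice."
added = index_document(doc, facts, phrases)
for fact in sorted(added):
    print(fact)
```

Each indexed document can enlarge the Knowledge-Base, which in turn gives the indexers more to consult on the next document. Indexing the same document twice adds nothing new.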
Our Knowledge-Base, called Live Objects, is a huge graph containing objects of life. Every node (concept) has properties and relations with other concepts. These nodes represent live objects such as people, companies, books, etc. Live Objects follows the open-world assumption. There are a few ways to represent the relations between concepts; mainly, we represent a relation inside the property value, for example: <Linus Torvalds><Innovator><Linux> and <California><State of><USA>.
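A minimal sketch of such a graph, assuming a plain in-memory structure: the `LiveObjects` class and its `add`, `values`, and `holds` methods are hypothetical names chosen for illustration, not Kngine's storage format. It shows a relation stored inside a property value, and one way to read the open-world assumption (a fact that is not stored is unknown rather than false).

```python
from collections import defaultdict


class LiveObjects:
    """Toy graph: each node (concept) maps property names to sets of values."""

    def __init__(self):
        self._nodes = defaultdict(lambda: defaultdict(set))

    def add(self, subject, prop, value):
        # The relation lives inside the property value: the value is
        # simply the name of another node in the graph.
        self._nodes[subject][prop].add(value)
        self._nodes[value]  # touching the key creates the target node if missing

    def nodes(self):
        return set(self._nodes)

    def values(self, subject, prop):
        if subject not in self._nodes:
            return set()
        return set(self._nodes[subject].get(prop, ()))

    def holds(self, subject, prop, value):
        # Open-world assumption: a missing fact is unknown (None), not false.
        return True if value in self.values(subject, prop) else None


kb = LiveObjects()
kb.add("Linus Torvalds", "Innovator", "Linux")
kb.add("California", "State of", "USA")

print(kb.values("Linus Torvalds", "Innovator"))      # {'Linux'}
print(kb.holds("California", "State of", "USA"))     # True
print(kb.holds("California", "State of", "Canada"))  # None
```

Returning None instead of False keeps "not stated" distinct from "known to be false", which matters once indexers keep adding facts over time.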
Concept information is organized into domains, and every domain has a set of properties. For example, the ‘Abraham Lincoln’ concept exists in the following domains: Book Author, Politician, U.S. President, Military Person, Person, and Deceased Person.
As of today, March 21, 2010, Live Objects contains 1.3 billion pieces of information about more than 9 million concepts.
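As a rough sketch of this organization, a concept can be mapped to its domains, and each domain to its property set. The domain names come from the example above; the property names under each domain and the `properties_for` helper are assumptions made up for illustration.

```python
# Hypothetical property sets; only the domain names come from the text.
domains = {
    "Person": {"Date of birth", "Place of birth"},
    "Deceased Person": {"Date of death"},
    "Politician": {"Party"},
    "U.S. President": {"Presidency number"},
    "Book Author": {"Works written"},
    "Military Person": {"Service"},
}

# A concept can exist in several domains at once.
concept_domains = {
    "Abraham Lincoln": [
        "Book Author",
        "Politician",
        "U.S. President",
        "Military Person",
        "Person",
        "Deceased Person",
    ],
}


def properties_for(concept):
    """Collect every property the concept can carry, across all its domains."""
    props = set()
    for domain in concept_domains.get(concept, []):
        props |= domains[domain]
    return props


print(sorted(properties_for("Abraham Lincoln")))
```

Keeping properties on the domain rather than on each concept means a new concept gets its expected properties just by being placed in the right domains.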
[1] SDSC - Fusing High-Performance Data with High-Performance Computing Will Speed Research (http://www.sdsc.edu/News%20Items/PR022410_hpd.html)