Components of a search engine:
1. Searcher
Its function is to roam the Internet, discovering and collecting information;
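The searcher discovers new pages by following the links on pages it has already visited. That discovery step can be sketched with a small link extractor; the HTML snippet and URLs are illustrative, and a real crawler would add page fetching, a queue of URLs still to visit, and politeness delays:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collects the href targets of <a> tags, resolved against the page's URL."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))

def extract_links(html, base_url):
    """Return a page's outgoing links; a crawler feeds these
    back into its queue of pages still to visit."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links
```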
2. Indexer
Its function is to understand the information collected by the searcher, extract index terms from it to represent each document, and generate the index table of the document library;
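A minimal sketch of what the indexer produces, assuming the index table is a simple inverted index mapping each term to the documents containing it (the tokenizer and document ids are illustrative):

```python
import re
from collections import defaultdict

def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """docs maps doc_id -> text; the result maps each term to
    {doc_id: number of occurrences}, i.e. an inverted index."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index
```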
3. Retriever
Its function is to quickly look up documents in the index database according to the user's query, evaluate the relevance of each document to the query, sort the results for output, and return information that reasonably satisfies the user's query requirements;
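The retriever's scoring and sorting step can be sketched as follows, assuming an inverted index that maps each term to {doc_id: term frequency}; summed term frequency is a deliberately crude stand-in for real relevance evaluation:

```python
import re

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def search(query, index):
    """Score each document by the summed frequency of the query
    terms it contains, then return doc ids sorted best-first
    (ties broken by doc id for a stable ordering)."""
    scores = {}
    for term in tokenize(query):
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] = scores.get(doc_id, 0) + tf
    return sorted(scores, key=lambda d: (-scores[d], d))
```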
4. User interface
Its function is to accept the user's query, display the query results, and provide personalized query options.
5. Robots
The robots protocol: through a robots.txt file, a website tells search engines which pages may be crawled and which may not. The robots protocol is a generally accepted ethical convention among Internet websites. Its purpose is to protect website data and sensitive information, and to ensure that users' personal information and privacy are not violated. Because it is a convention rather than an enforceable order, search engines must abide by it voluntarily.
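Python's standard library ships a parser for this protocol; the sketch below parses an illustrative robots.txt (the rules, bot name, and URLs are assumptions) and shows how a well-behaved crawler would check a URL before fetching it:

```python
from urllib.robotparser import RobotFileParser

# An illustrative robots.txt: everything under /private/ is off limits.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# A compliant crawler checks every URL before requesting it.
allowed = rp.can_fetch("MyBot", "https://example.com/public.html")        # True
blocked = rp.can_fetch("MyBot", "https://example.com/private/data.html")  # False
```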
Additional information:
How an Internet search engine indexes pages:
In the simplest case, a search engine would only need to store each word together with the address where it was found.
In practice, this would limit the engine's usefulness: it would be impossible to tell whether a word is central to a page or merely mentioned in passing, whether it appears once or many times, or whether the page links to other pages containing the keyword.
In other words, there would be no way to build a ranking and put the most useful web pages at the top of the result list.
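One way past this limitation is to store richer index entries than word-plus-address. The sketch below weights a term more heavily when it appears in a page's title, so the ranker can prefer pages that are about a term over pages that merely mention it; the weight of 5 and the sample documents are assumptions for illustration:

```python
import re

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

TITLE_WEIGHT = 5  # assumed boost for title occurrences

def index_doc(doc_id, title, body, index):
    """Record not just that a term occurs in a document but how
    prominently: a title hit counts TITLE_WEIGHT, a body hit counts 1."""
    for term in tokenize(title):
        entry = index.setdefault(term, {})
        entry[doc_id] = entry.get(doc_id, 0) + TITLE_WEIGHT
    for term in tokenize(body):
        entry = index.setdefault(term, {})
        entry[doc_id] = entry.get(doc_id, 0) + 1
    return index

index = {}
index_doc(1, "Search engines", "How engines rank pages", index)
index_doc(2, "Cooking", "engines mentioned once", index)
```

With these scores, document 1 outranks document 2 for the query "engines" even though both pages contain the word.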
Reference: Baidu Baike, "Web search engine"