The development of search engines

1. The history of Excite can be traced back to February 1993. Six Stanford University students had the idea of analyzing the relationships between words in order to search the large amount of information on the Internet more effectively. By mid-1993 the project was fully funded, and they also released a version of the search software for webmasters to use on their own websites, later called Excite for Web Servers.

Note: Excite later became known for concept-based search. In May 2002, Excite, which had been acquired by InfoSpace, shut down its own search engine and switched to the meta-search engine Dogpile.

2. In April 1994, two Stanford University doctoral students, Jerry Yang (Yang Zhiyuan) and David Filo, co-founded Yahoo!. As the number of visits and links grew, the Yahoo! directory began to support simple database search. Because Yahoo!'s data was entered manually, it cannot really be classified as a search engine; in fact, it was just a searchable directory. Because every website listed in Yahoo! came with a brief description, search efficiency improved noticeably.

Note: Yahoo! later used AltaVista, Inktomi and Google in turn to provide its search engine service.

Yahoo! almost became synonymous with the Internet in the 1990s.

3. In 1995, a new form of search engine appeared: the meta-search engine. The user submits a search request only once; the meta-search engine converts it and submits it to several pre-selected independent search engines, then collects and processes all the results they return before passing them back to the user.

The first meta-search engine was MetaCrawler, developed by Erik Selberg, a graduate student at the University of Washington, and Oren Etzioni. Meta-search looks good in concept, but the search results have always been unsatisfactory, so no meta-search engine has ever held a strong position.
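The dispatch-and-merge flow described above can be sketched in a few lines. This is only an illustration: the two engine functions are hypothetical stand-ins that return fixed result lists, and the reciprocal-rank scoring used to merge them is an assumption, not the method MetaCrawler actually used.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for independent search engines. Each maps a query
# to a ranked list of URLs; a real engine would be queried over the network.
def engine_a(query: str) -> List[str]:
    return ["a.example/1", "shared.example/x", "a.example/2"]

def engine_b(query: str) -> List[str]:
    return ["shared.example/x", "b.example/1"]

def meta_search(query: str, engines: List[Callable[[str], List[str]]]) -> List[str]:
    """Send one query to every engine and merge the ranked results."""
    scores: Dict[str, float] = {}
    for engine in engines:
        for rank, url in enumerate(engine(query)):
            # Assumed merge rule: a URL earns 1/(rank+1) from each engine
            # that returns it, so results found by several engines rise.
            scores[url] = scores.get(url, 0.0) + 1.0 / (rank + 1)
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda url: (-scores[url], url))

merged = meta_search("search engine history", [engine_a, engine_b])
```

Duplicates are collapsed into one entry, which is the "collection and processing" step; the quality of the merged list still depends entirely on the underlying engines.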

4. Intelligent retrieval: Word-segmentation, synonym and homophone dictionaries are used to improve retrieval quality and can further assist queries at the knowledge or concept level. By processing queries against subject dictionaries, broader-term and narrower-term dictionaries, and related-term dictionaries, the system forms a knowledge system or concept network, gives users intelligent hints, and ultimately helps them obtain the best retrieval results.

Example:

(1) A query for "computer" also retrieves information indexed under synonyms of "computer";

(2) The query scope can be narrowed to "microcomputer" and "server", or expanded to "information technology" or to related terms such as "electronic technology", "software" and "computer application";

(3) Ambiguous queries are also handled, for example whether "Apple" refers to a fruit or a computer brand, or how to tell "Chinese people" apart from "People's Republic of China". Such cases are processed by combining techniques such as an ambiguity knowledge base, full-text indexing, analysis of the user's retrieval context, and user relevance feedback, so that the most needed information is returned to the user efficiently and accurately.
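The dictionary-based expansion in examples (1) and (2) might look roughly like the sketch below. The three dictionaries and their entries are made up for illustration (the synonym "PC" in particular is an assumption); a real system would load much larger, curated vocabularies.

```python
from typing import Dict, List

# Hypothetical toy dictionaries; entries are illustrative only.
SYNONYMS: Dict[str, List[str]] = {"computer": ["PC"]}
NARROWER: Dict[str, List[str]] = {"computer": ["microcomputer", "server"]}
BROADER: Dict[str, List[str]] = {"computer": ["information technology"]}

def expand_query(term: str, narrow: bool = False, widen: bool = False) -> List[str]:
    """Return the term plus its synonyms, optionally narrowed or widened."""
    terms = [term] + SYNONYMS.get(term, [])
    if narrow:
        terms += NARROWER.get(term, [])  # more specific concepts
    if widen:
        terms += BROADER.get(term, [])   # more general concepts
    return terms

default_terms = expand_query("computer")
narrowed_terms = expand_query("computer", narrow=True)
widened_terms = expand_query("computer", widen=True)
```

The expanded term list would then be handed to the ordinary retrieval step; resolving ambiguity as in example (3) needs context and feedback signals that this sketch does not model.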

5. Personalization is an important feature and one of the inevitable trends in the future development of search engines. One approach is to gather personal information through the search engine's community products (that is, services offered to registered users) and then bring these personal factors into retrieval over the engine's basic index, so that different individuals get different results. Yahoo!'s My Web beta, launched in October 2004, A9 that November, and Google Search History in 2005 all followed basically this path: analyze a specific user's search needs to define a limited scope, extend that scope to similar websites on the Internet, and return results according to the range of the user's needs. The other approach targets the general public, as in Google's personalized search, Yahoo! Mindset, or Vivisimo, which is well known for front-end clustering of results. However, whichever method is used, whether Google actively choosing the search scope or Yahoo! and Vivisimo letting users regroup the information they need within the results, it remains an experiment or an idea and will not become a mainstream search product in the short term.

6. A global grid: Because there is no unified standard for organizing information resources on the network, these disordered resources are difficult to search, exchange, share, or develop in depth, and so they form information islands. Grid technology aims to eliminate information islands and achieve a comprehensive connection of all resources on the Internet.

Global Information Grid (GIG)

The word "robot" has a special meaning for programmers. A computer robot is an automated program that can perform a task repeatedly at a speed no human can match. Because robot programs dedicated to retrieving information crawl across the network like spiders, a search engine's robot program is called a spider.

In 1993 Matthew Gray developed the World Wide Web Wanderer, the first "robot" program to measure the size of the World Wide Web by following the links between HTML pages. At first it was used only to count the number of servers on the Internet; later it was also able to capture web addresses (URLs).
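To make the idea of following links between HTML pages concrete, here is a minimal breadth-first spider sketch. It is not the Wanderer's actual code: the `PAGES` dictionary is a hypothetical in-memory stand-in for the web, and the `fetch` callable is an assumption so that the example runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from typing import Callable, List, Optional

class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""
    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# Hypothetical in-memory "web": URL -> HTML source.
PAGES = {
    "http://a.example/": '<a href="http://b.example/">B</a> <a href="http://c.example/">C</a>',
    "http://b.example/": '<a href="http://a.example/">A</a>',
    "http://c.example/": "<p>no links here</p>",
}

def crawl(start: str, fetch: Callable[[str], Optional[str]]) -> List[str]:
    """Visit pages breadth-first from start, following each link once."""
    seen = {start}
    visited: List[str] = []
    queue = deque([start])
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page could not be retrieved
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:  # avoid loops between pages
                seen.add(link)
                queue.append(link)
    return visited

visited_urls = crawl("http://a.example/", PAGES.get)
```

Counting the distinct hosts in `visited_urls` would give the kind of server count described above; a production spider would also need politeness delays, robots.txt handling, and relative-URL resolution.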

In April 1994, two Stanford University doctoral students, Jerry Yang (Yang Zhiyuan) and David Filo, co-founded Yahoo!. As the number of visits and links grew, the Yahoo! directory began to support simple database search. Because Yahoo!'s data was entered manually, it cannot really be classified as a search engine; in fact, it was just a searchable directory. Yahoo! acquired Inktomi on December 23, 2002, Overture (which included Fast and AltaVista) on July 14, 2003, and the company 3721 in November 2003.

In early 1994, Brian Pinkerton, a student at the University of Washington, started his small project, WebCrawler. When it launched on April 20, 1994, WebCrawler contained content from only 6,000 servers. WebCrawler was the first full-text search engine on the Internet, supporting searches over every word in a document. Before it, users could search only by URL and summary, where the summary usually came from manual annotation or from a program that automatically extracted the first 100 words of the text.
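Full-text search over every word in a document is commonly built on an inverted index. The sketch below shows that general technique under simple assumptions (lowercased word tokens, all query words must match); it is not a description of how WebCrawler was actually implemented, and the sample documents are invented.

```python
import re
from collections import defaultdict
from typing import Dict, Set

def tokenize(text: str) -> list:
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())

def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map every word to the set of document ids that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return the documents that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical sample documents.
docs = {
    "doc1": "Spiders crawl the web and index every word",
    "doc2": "A directory lists web sites by hand",
    "doc3": "Every spider follows links",
}
index = build_index(docs)
hits = search(index, "web word")
```

Because every word is indexed, a term buried deep in a page is just as findable as one in its title, which is what distinguishes this from searching URLs and summaries only.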

In July 1994, Michael Mauldin of Carnegie Mellon University connected John Leavitt's spider program to his own indexing program and created Lycos. In addition to relevance ranking, Lycos provided prefix matching and word-proximity restrictions. Lycos was the first to show automatically generated page summaries in search results, and its greatest advantage was a data volume far exceeding that of other search engines.

At the end of 1994, Infoseek made its official debut. Its friendly interface and large number of additional features made it, like Lycos, an important representative of search engines.

In 1995, a new form of search engine appeared: the meta-search engine. The user submits a search request only once; the meta-search engine converts it and submits it to several pre-selected independent search engines, then collects and processes all the results they return before passing them back to the user. The first meta-search engine was MetaCrawler, developed by Erik Selberg, a graduate student at the University of Washington, and Oren Etzioni.

On September 26, 1995, Eric Brewer, an assistant professor at the University of California, Berkeley, and Paul Gauthier, a doctoral student, created Inktomi. On May 20, 1996, the Inktomi company was founded, and the powerful HotBot made its appearance. It claimed to be able to crawl and index more than 10 million pages a day, giving it far more fresh content than other search engines. HotBot also used cookies to store users' personal search preferences.

In December 1995, DEC officially released AltaVista. AltaVista was the first search engine to support natural-language search and the first to implement advanced search syntax (such as AND, OR and NOT). Users could use AltaVista to search newsgroups and retrieve articles, and could also search for words in image names, titles, Java applets and ActiveX objects. AltaVista also claimed to be the first search engine that let users submit URLs to, or delete them from, its web index, with changes going live within 24 hours. One of AltaVista's most interesting new features was the ability to find all websites that link to a given URL. AltaVista also made many innovations in its user interface. It placed "tips" below the search box to help users phrase their searches better. These tips were updated frequently, so that after a few searches users would see many interesting features they might never have known about. This set of features was gradually adopted by other search engines. In 1997, AltaVista released LiveTopics, a graphical presentation system that helped users find what they wanted among thousands of search results.

In August 1997, the Northern Light search engine made its official debut. It was once one of the search engines with the largest databases. It used no stop words, and it offered excellent current news, a special collection of more than 7,100 publications, and a good advanced search syntax. It was the first to support simple automatic classification of search results.

Before October 1998, Google was just BackRub, a small project at Stanford University. In 1995, doctoral student Larry Page began to study search engine design, and the google.com domain name was registered on September 15, 1997. At the end of 1997, with the participation of Sergey Brin, Scott Hassan and Alan Steremberg, BackRub began to offer a demo. In February 1999, Google completed the transition from alpha to beta. Google treats September 27, 1998 as its birthday. Google judges the importance of web pages on the basis of PageRank, which greatly improves the relevance of search results. Google's geek culture and its "Don't be evil" motto have earned it a strong reputation and brand image. In April 2006, Google announced its Chinese name, "谷歌", the first name Google adopted in a non-English-speaking country.
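The idea behind PageRank, that a page is important if important pages link to it, can be illustrated with a short power-iteration sketch. This is a simplified textbook formulation, not Google's production algorithm; the damping factor of 0.85, the iteration count, and the tiny link graph are assumptions chosen for the example.

```python
from typing import Dict, List

def pagerank(links: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Estimate page importance from the link graph by power iteration.

    links maps each page to the pages it links to; every link target
    must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of the total rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its damped rank evenly among its links.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical three-page web: a links to b and c, b links to c, c links to a.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
```

In this graph, page c is linked from both a and b, so it ends up ranked above b, which is linked only from a. The ranks always sum to 1, so they can be read as the share of attention each page receives.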

Fast (AllTheWeb) was founded in 1997 as a by-product of academic research at the Norwegian University of Science and Technology (NTNU). In May 1999 it released its own search engine, AllTheWeb. Fast's goal was to be the largest and fastest search engine in the world, and in recent years it has come close. Fast (AllTheWeb) can automatically classify web pages according to ODP, supports Flash and PDF search and multilingual search, and also provides news, picture, video, MP3 and FTP search, along with extremely powerful advanced search features. (On February 25, 2003, Fast's Internet search division was acquired by Overture.)

In August 1996, Sohu was founded to build a classified directory of Chinese websites. It was once known by the slogan "Consult a map when you go out; consult Sohu when you go online". With the rapid growth in the number of websites, this kind of manually edited directory became impractical. In August 2004, Sohu launched Sogou, a search website with its own domain name, calling itself a "third-generation search engine".

Openfind was founded in January 1998, with technology from the GAIS laboratory led by Professor Wu Sheng of National Chung Cheng University in Taiwan. At first, Openfind was only a Chinese search engine. At its peak, it provided Chinese search for three well-known portals: Sina, Kimo and Yahoo!. After 2000, however, its market was gradually taken by Baidu and Google. In June 2002, Openfind released a new beta of its search engine based on the GAIS30 Project, launched PolyRank™, announced that it had indexed 3.5 billion web pages, and began to enter the English search field.

In January 2000, two Peking University alumni, Li Yanhong (Robin Li), the patent holder for hyperlink analysis and a former senior engineer at Infoseek, and his friend Xu Yong (a postdoctoral fellow at the University of California, Berkeley), founded Baidu in Zhongguancun, Beijing. The beta version of the Baidu search engine was released in August 2001 (before that, Baidu had only provided search for portals such as Sohu, Sina and Tom), and the Baidu search engine was officially released on October 22, 2001, focusing on Chinese search.

Other features of the Baidu search engine include Baidu Snapshot, web page preview and preview of all pages, related search terms, typo correction hints, MP3 search and Flash search. After the Blitzen project was launched in March 2002, technical upgrades accelerated noticeably. A series of products followed, including Post Bar, Know, Map, Sinology, Encyclopedia, Document, Video and Blog, all of which were well received by users. On August 5, 2005, Baidu was listed on NASDAQ under the ticker BIDU with an issue price of US$27.00. It opened at US$66.00 and closed at US$122.54, a gain of 353.85%, setting a record for the largest first-day gain among new US listings in the previous five years.

On December 23, 2003, the former HC Search began operating independently, and China Search (Zhongsou) was established. In February 2004, China Search released the desktop search engine Internet Pig 1.0. In March 2006, Zhongsou renamed Internet Pig to IG (Internet Gateway).

In June 2005, Sina officially launched its self-developed search engine "Aiwen". Since 2007, Sina Aiwen has used the Google search engine.

On July 1, 2007, Netease fully adopted its independently developed Youdao search technology and merged its former comprehensive search and web search. Netease search now offers web search, picture search and blog search. Web search uses Netease's self-developed natural language processing, distributed storage and computing technology. Picture search was the first to offer advanced search by camera brand, model and even season. Compared with similar products, blog search has the advantages of comprehensive crawling and timely updates, and it provides innovative features such as "article preview" and "blog archive".