How Octopus Collector Collects HowNet Data
Octopus Collector offers intelligent identification and file download functions, which make it well suited to collecting data from HowNet. The general steps are as follows:

1. Open Octopus Collector and create a new collection task.
2. In the task settings, enter the HowNet address (www.cnki.net) as the starting URL for collection.
3. Configure the collection rules. You can use intelligent identification to let Octopus detect the data structure of HowNet pages automatically, or you can set the rules manually.
4. If you set the rules manually, select the data elements on the page, such as title, author, and abstract, and define a collection rule for each so that the required data is captured correctly.
5. Set the page-turning rules. HowNet search results may be spread across multiple pages, so Octopus Collector needs to be set to turn pages automatically to obtain more data.
6. Run the collection task. After confirming that the settings are correct, start the task and let Octopus begin collecting data from HowNet.
7. Wait for the collection to finish. Octopus captures the data on each page according to the rules you set and saves it locally or exports it to a designated database.

You can then use other data analysis tools to analyze and process the data.

Octopus is widely used in scientific research and training at universities and has become a long-term partner of hundreds of universities at home and abroad. To learn more about how Octopus is used in university research, please visit the official website.
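As a rough illustration of the final step above, processing the collected data with another tool, the sketch below uses Python's standard library to clean and summarize an export. It is not part of Octopus Collector. It assumes the data was saved locally as CSV with columns named title, author, and abstract (the fields mentioned in step 4), and that multiple authors are separated by semicolons. The column names, the separator, the helper names load_records and count_authors, and the sample rows are all invented for illustration; adjust them to match the real export.

```python
import csv
import io
from collections import Counter


def load_records(csv_text):
    """Parse exported CSV text into a list of dicts, skipping rows with
    an empty or already-seen title (hypothetical column names)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    records = []
    for row in reader:
        title = (row.get("title") or "").strip()
        if not title or title in seen:
            continue
        seen.add(title)
        records.append({
            "title": title,
            "author": (row.get("author") or "").strip(),
            "abstract": (row.get("abstract") or "").strip(),
        })
    return records


def count_authors(records, sep=";"):
    """Count how many records each author appears in.
    The separator is an assumption about the export format."""
    counts = Counter()
    for rec in records:
        for name in rec["author"].split(sep):
            name = name.strip()
            if name:
                counts[name] += 1
    return counts


# Invented sample rows standing in for a real export file.
sample = """title,author,abstract
Paper A,Zhang San; Li Si,First abstract
Paper B,Li Si,Second abstract
Paper A,Zhang San; Li Si,First abstract
"""

records = load_records(sample)
author_counts = count_authors(records)
print(len(records), "unique records")
print(author_counts.most_common(1))
```

For a real export, the sample string could be replaced by the contents of the saved file, for example by reading it with open(path, encoding="utf-8", newline="").read(), where the path and encoding depend on how the task was exported.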