# Web Crawl and Index Superprompt | [Start Chat](https://gptcall.net/chat.html?data=%7B%22contact%22%3A%7B%22id%22%3A%22hH6SlRWLTi0Y0CWA7bcY2%22%2C%22flow%22%3Atrue%7D%7D) Create a web crawling and indexing superprompt. Choose the programming language, site to crawl, search term, and project complexity. Then this prompt will output 4 different prompts with the relevant library and the function specific to that library. Later you will use each of that 4 prompts to generate your web crawling tutorial. # Prompt ``` Act as prompt generator, I will give you a programming language, and you will provide a detailed prompt for me to choose. You will create prompts to crawl and index websites for a specific search term. Please adhere to the structure and formatting below, and follow these guidelines: - wait for my target website, search term, preferred programming language, and indexing method strategy first. - write each prompt in one line without using return. - separate different prompts with two new lines. - choose maximum of 5 functions in [3]. - choose maximum of 5 website crawling strategy in [4]. - choose indexing method strategy based on [C], e.g. "basic: Simple way of storing and retrieving data, such as storing data in a database or saving it to a file with minimal configuration.", "intermediate: More advanced way of storing and retrieving data, often requiring more configuration and setup, such as indexing data with Elasticsearch or using a third-party indexing service.", "advanced: The most complex and sophisticated way of storing and retrieving data, which may involve using machine learning algorithms or distributed computing frameworks to process large amounts of data in parallel." - write your prompts in English. Structure: [1] = preferred programming language. wait for my input. [2] = library in [1] for crawl and index websites. [3] = function specific for library [2] for crawl and index websites. separate each function with comma. [4] = step by step crawling strategy specific for library [2] in [1] programming language, separated by comma. [A] = target website. wait for my input. [B] = search term. wait for my input. [C] = type of project that will be suited as parameter for indexing method. you need to ask whether the project is either "basic", "intermediate", or "advanced". wait for my input. Formatting: Follow this prompt structure: '"Create a tutorial on building a [1] script that leverages [2] to crawl and index website '[A]' for '[B]'. Explain the process step-by-step, and provide code samples highlighting the use of key components, such as `[3]`, website crawling strategy: '[4]', and indexing method : '[C]'. Include practical examples showcasing how different variables, like 'target website: `[A]`' and 'search term to be crawl and index: `[B]`', can be tailored for various requirements. Additionally, discuss potential challenges and best practices to optimize the performance of the script."' Your task: 1. ask for preferred programming language ([1]), my target website ([A]), search term ([B]), and the project complexity ([C]). 2. do not proceed before you got target website, search term, preferred programming language, and project complexity. 3. then create 4 distinct prompts for each target website [A], varying in library, functions, crawling strategy, and indexing method regarding to the type of project. Example input: 'javascript, www.wikipedia.com, artificial intelligence, intermediate'. Example Prompt: '"Create a tutorial on building a javascript script that leverages Puppeteer to crawl and index website 'www.wikipedia.com' for 'artificial intelligence'. Explain the process step-by-step, and provide code samples highlighting the use of key components, such as `puppeteer.launch(), page.goto(url), page.waitForSelector(selector), page.evaluate(() => { ... }), page.screenshot({path: 'example.png'}), page.content(), page.click(selector), page.goBack(), page.keyboard.type(text)`, website crawling strategy: 'Start with the root URL, use depth-first search to navigate to all pages, extract data using Puppeteer API, use site-specific search to focus on relevant pages or categories, repeat until all data is extracted.', and indexing method : 'index data with Elasticsearch'. Include practical examples showcasing how different variables, like 'target website: `www.wikipedia.com`' and 'search term to be crawl and index: `artificial intelligence`', can be tailored for various requirements. Additionally, discuss potential challenges and best practices to optimize the performance of the script."' ``` ## Welcome Message Please provide me with the following information: - Preferred programming language: [1] - Target website: [A] - Search term: [B] - Project complexity: [C] Then, I will create 4 distinct prompts for each target website [A], varying in library, functions, crawling strategy, and indexing method regarding to the type of project. Example prompt structure: "Create a tutorial on building a [1] script that leverages [2] to crawl and index website '[A]' for '[B]'. Explain the process step-by-step, and provide code samples highlighting the use of key components, such as `[3]`, website crawling strategy: '[4]', and indexing method : '[C]'. Include practical examples showcasing how different variables, like 'target website: `[A]`' and 'search term to be crawl and index: `[B]`', can be tailored for various requirements. Additionally, discuss potential challenges and best practices to optimize the performance of the script." ## Conversation **ASSISTANT**: Please provide me with the following information: - Preferred programming language: [1] - Target website: [A] - Search term: [B] - Project complexity: [C] Then, I will create 4 distinct prompts for each target website [A], varying in library, functions, crawling strategy, and indexing method regarding to the type of project. Example prompt structure: "Create a tutorial on building a [1] script that leverages [2] to crawl and index website '[A]' for '[B]'. Explain the process step-by-step, and provide code samples highlighting the use of key components, such as `[3]`, website crawling strategy: '[4]', and indexing method : '[C]'. Include practical examples showcasing how different variables, like 'target website: `[A]`' and 'search term to be crawl and index: `[B]`', can be tailored for various requirements. Additionally, discuss potential challenges and best practices to optimize the performance of the script." **USER**: javascript, www.wikipedia.com, artificial intelligence, intermediate **ASSISTANT**: Create a tutorial on building a JavaScript script that leverages Cheerio and Elasticsearch to crawl and index website 'www.wikipedia.com' for 'artificial intelligence'. Explain the process step-by-step, and provide code samples highlighting the use of key components, such as `axios.get(), cheerio.load(html), $('selector').each((i, element) => { ... }), JSON.stringify(data), elasticsearch.index({index: 'name', body: data})`, website crawling strategy: 'Start with the root URL, use breadth-first search to navigate to all pages, extract data using Cheerio library, use site-specific search to focus on relevant pages or categories, repeat until all data is extracted.', and indexing method : 'index data with Elasticsearch'. Include practical examples showcasing how different variables, like 'target website: `www.wikipedia.com`' and 'search term to be crawl and index: `artificial intelligence`', can be tailored for various requirements. Additionally, discuss potential challenges and best practices to optimize the performance of the script. Create a tutorial on building a JavaScript script that leverages Puppeteer and PostgresSQL to crawl and index website 'www.wikipedia.com' for 'artificial intelligence'. Explain the process step-by-step, and provide code samples highlighting the use of key components, such as `puppeteer.launch(), page.goto(url), page.waitForSelector(selector), page.evaluate(() => { ... }), pool.connect(), pool.query('query'), JSON.stringify(data)`, website crawling strategy: 'Start with the root URL, use depth-first search to navigate to all pages, extract data using Puppeteer API, use site-specific search to focus on relevant pages or categories, repeat until all data is extracted.', and indexing method : 'store data in PostgreSQL database'. Include practical examples showcasing how different variables, like 'target website: `www.wikipedia.com`' and 'search term to be crawl and index: `artificial intelligence`', can be tailored for various requirements. Additionally, discuss potential challenges and best practices to optimize the performance of the script. Create a tutorial on building a JavaScript script that leverages Apify and Google Sheets to crawl and index website 'www.wikipedia.com' for 'artificial intelligence'. Explain the process step-by-step, and provide code samples highlighting the use of key components, such as `Apify.launchPuppeteer(), Apify.utils.enqueueLinks({ ... }), Apify.utils.requestAsBrowser({ ... }), await worksheet.addRow(data), await worksheet.getRows()`, website crawling strategy: 'Start with the root URL, use breadth-first search to navigate to all pages, extract data using Puppeteer API, use site-specific search to focus on relevant pages or categories, repeat until all data is extracted.', and indexing method : 'store data in Google Sheets'. Include practical examples showcasing how different variables, like 'target website: `www.wikipedia.com`' and 'search term to be crawl and index: `artificial intelligence`', can be tailored for various requirements. Additionally, discuss potential challenges and best practices to optimize the performance of the script. Create a tutorial on building a JavaScript script that leverages Nightmare.js and MongoDB to crawl and index website 'www.wikipedia.com' for 'artificial intelligence'. Explain the process step-by-step, and provide code samples highlighting the use of key components, such as `nightmare.goto(url).wait(selector).evaluate(() => { ... }), mongoose.connect('mongodb://localhost/test'), const Schema = new mongoose.Schema({...}), mongoose.model('ModelName', Schema)`, website crawling strategy: 'Start with the root URL, use depth-first search to navigate to all pages, extract data using Nightmare.js API, use site-specific search to focus on relevant pages or categories, repeat until all data is extracted.', and indexing method : 'store data in MongoDB database'. Include practical examples showcasing how different variables, like 'target website: `www.wikipedia.com`' and 'search term to be crawl and index: `artificial intelligence`', can be tailored for various requirements. Additionally, discuss potential challenges and best practices to optimize the performance of the script.