I have an async function that reads a list of websites from a CSV file.
const fs = require('fs');
const readline = require('readline');

async function readCSV() {
  const fileStream = fs.createReadStream('./topm.csv');
  const rl = readline.createInterface({
    input: fileStream,
    crlfDelay: Infinity
  });
  // Read the file line by line; the site name is in the second column.
  for await (const line of rl) {
    const currentline = line.split(",");
    const res_server_http = await check_page("http://www." + currentline[1]);
  }
}
Each time I read a site, I call the check_page function, which performs some operations. I wait for each call to finish before moving on to the next site.
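To make the "one site at a time" behaviour concrete, here is a minimal sketch. The body of check_page, the events log, and the checkAll helper are hypothetical stand-ins for illustration only (the real check_page is not shown in this post); the point is simply that awaiting inside the loop keeps the calls from overlapping.

```javascript
// Hypothetical log used only to make the call order visible.
const events = [];

// Hypothetical stand-in for check_page: pretends to do some async work.
async function check_page(web_page) {
  events.push(`start ${web_page}`);
  await new Promise((resolve) => setTimeout(resolve, 10));
  events.push(`end ${web_page}`);
}

// Processes CSV-like lines one by one; the site name is the second column.
async function checkAll(lines) {
  for (const line of lines) {
    const currentline = line.split(",");
    // The loop does not advance until this call has finished.
    await check_page("http://www." + currentline[1]);
  }
  return events;
}
```

Calling checkAll with two lines should record each site's "start" and "end" entries back to back, with no interleaving between sites.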
async function check_page(web_page) {
  // do some operations....
}
Up to this point it works correctly, but now I have to integrate my code with a web crawler.
Inside the readCSV function I have to call the crawler for every site that I read, and for each one I should call the check_page function.
I have now edited readCSV in this way:
const fileStream = fs.createReadStream('./topm.csv');
const rl = readline.createInterface({
  input: fileStream,
  crlfDelay: Infinity
});
for await (const line of rl) {
  var currentline = line.split(",");
  await (new Promise(resolve => {
    new Crawler().configure({depth: 2})
      .crawl(site, async (page) => {
        //console.log(page.url);
        var res_server_http = await check_page("http://www." + currentline[1])
        // Resolve here
        resolve();
      });
  }));
}
I'm using this package for the web crawler: https://www.npmjs.com/package/js-crawler
This function no longer works because it is not async. How can I change my code?
I now get this error:
(node:907) UnhandledPromiseRejectionWarning: ReferenceError: site is not defined
at /Users/francesco/Desktop/tesi/crawler.js:55:14
at new Promise (<anonymous>)
at readCSV (/Users/francesco/Desktop/tesi/crawler.js:53:12)
at processTicksAndRejections (internal/process/task_queues.js:97:5)
(node:907) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag --unhandled-rejections=strict
(see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 2)
(node:907) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
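The ReferenceError in the trace says that site is used in the crawl call without ever being declared, and the warnings appear because nothing catches the resulting rejection. The sketch below shows one possible shape for a fix, under stated assumptions: FakeCrawler is a hypothetical stand-in so the example runs without network access, and its four-argument crawl(url, onSuccess, onFailure, onAllFinished) form is an assumption about how js-crawler's callbacks are laid out and should be checked against the package README. crawlSite and processLines are illustrative names, not part of any library.

```javascript
// Hypothetical stand-in for the js-crawler Crawler class. It only imitates a
// callback-style crawl() and reports two fake pages per site.
class FakeCrawler {
  configure(options) {
    this.options = options;
    return this;
  }
  crawl(url, onSuccess, onFailure, onAllFinished) {
    const urls = [url, url + "/about"];
    setTimeout(() => {
      urls.forEach((u) => onSuccess({ url: u }));
      onAllFinished(urls);
    }, 10);
  }
}

// Wraps one callback-based crawl in a promise that resolves with the URLs of
// all crawled pages once the crawler reports that it has finished.
function crawlSite(site, CrawlerClass) {
  return new Promise((resolve) => {
    const pages = [];
    new CrawlerClass().configure({ depth: 2 }).crawl(
      site,
      (page) => pages.push(page.url),
      (page) => console.error("Failed to load " + page.url),
      () => resolve(pages)
    );
  });
}

// Crawls the sites one at a time and checks every crawled page in order.
async function processLines(lines, check_page, CrawlerClass = FakeCrawler) {
  const checked = [];
  for (const line of lines) {
    const currentline = line.split(",");
    // Declare site before using it; this is what the ReferenceError was about.
    const site = "http://www." + currentline[1];
    try {
      const pages = await crawlSite(site, CrawlerClass);
      for (const url of pages) {
        await check_page(url);
        checked.push(url);
      }
    } catch (err) {
      // Handling the error here avoids an unhandled promise rejection.
      console.error("Skipping " + site + ": " + err.message);
    }
  }
  return checked;
}
```

In this sketch the promise resolves only when the whole crawl of a site is done, rather than after the first page, so the loop still moves to the next CSV line only after the current site has been fully processed. With the real package, the Crawler class would be passed in place of FakeCrawler, provided its callbacks match the assumed form.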