
Tackling net::ERR_NAME_NOT_RESOLVED and timeout error on browser object creation when using Puppeteer

+1
−0

I used to work with PhantomJS and want to move to Puppeteer. I started with this code:

"use strict";
const puppeteer = require("puppeteer");

const capture = async () => {
	let browser;
	try {
		console.time('myTimer');
		console.info("Opening browser ...");
		const browser = await puppeteer.launch({
			headless:true,'ignoreHTTPSErrors':true,devtools:false
		});
		const page = await browser.newPage();

		page.on('console', (msg) => console.log('DEBUG:', msg.text()) );
		console.info("browser instance created.");

		await page.evaluate(() => console.log(`URL="${location.href}".`));
		await page.goto("https://www.chunkbase.com/apps/seed-map");
	//	await page.goto("https://checkip.amazonaws.com/");
		await page.evaluate(() => console.log(`URL="${location.href}".`));

		console.info("getting webpage dimensions.");
		const getDimensions = await page.evaluate(() => {
			return {
				pxH: document.documentElement.clientHeight,
				pxW: document.documentElement.clientWidth,
				scale: window.devicePixelRatio
			};
		});
		console.info("Dimensions: ", getDimensions);

		console.info("capturing.");
		await page.screenshot({ path:"D:/screenshot.png", type:"png", fullPage:false });
		console.timeEnd('myTimer');
	} catch (err) {
		console.warn("Could not create a browser instance: ", err);
		return;
	} finally {
		await browser.close();
	}
};
capture();

Output:

Opening browser ...
browser instance created.
DEBUG: URL="about:blank".
DEBUG: [.WebGL-000022F60095A300]GL Driver Message (OpenGL, Performance, GL_CLOSE_PATH_NV, High): GPU stall due to ReadPixels
DEBUG: fun-hooks: referenced 'adpod' but it was never created
DEBUG: %cVideoManagerComponent::noStickyPlaylistOrSekindo  color: #999; font-weight: bold; JSHandle@object
DEBUG: %cBaseDynamicAdsInjector::_logDensityInfo  color: #999; font-weight: bold; JSHandle@object
DEBUG: fun-hooks: referenced 'checkAdUnitSetup' but it was never created
DEBUG: fun-hooks: referenced 'checkAdUnitSetup' but it was never created
DEBUG: fun-hooks: referenced 'checkAdUnitSetup' but it was never created
DEBUG: Failed to load resource: net::ERR_NAME_NOT_RESOLVED
DEBUG: Failed to load resource: net::ERR_NAME_NOT_RESOLVED
DEBUG: Failed to load resource: net::ERR_NAME_NOT_RESOLVED
DEBUG: Failed to load resource: net::ERR_NAME_NOT_RESOLVED
DEBUG: Failed to load resource: net::ERR_NAME_NOT_RESOLVED
DEBUG: Failed to load resource: net::ERR_NAME_NOT_RESOLVED
DEBUG: Failed to load resource: net::ERR_NAME_NOT_RESOLVED
DEBUG: Failed to load resource: net::ERR_NAME_NOT_RESOLVED
DEBUG: Failed to load resource: net::ERR_NAME_NOT_RESOLVED
DEBUG: Failed to load resource: net::ERR_NAME_NOT_RESOLVED
DEBUG: Failed to load resource: net::ERR_NAME_NOT_RESOLVED
DEBUG: Failed to load resource: net::ERR_NAME_NOT_RESOLVED
DEBUG: Failed to load resource: net::ERR_NAME_NOT_RESOLVED
DEBUG: Failed to load resource: net::ERR_NAME_NOT_RESOLVED
Could not create a browser instance:  TimeoutError: Navigation timeout of 30000 ms exceeded
    at ~\node_modules\puppeteer\lib\cjs\puppeteer\common\LifecycleWatcher.js:108:111
~\scripts\chunkbase.js:51
                await browser.close();
                              ^
TypeError: Cannot read properties of undefined (reading 'close')

Notes:

  1. With the current page.goto URL, Puppeteer:
    1. Produces a lot of console output.
    2. Times out after 30 seconds.
    3. Skips the page dimensions.
    4. Never takes the screen capture.
    5. If I set headless to false, the page renders for the most part, but some parts don't load.
  2. If I swap the URLs in page.goto (using the commented-out one instead), Puppeteer:
    1. Works fast, without a ton of DEBUG lines in the output.
    2. Calculates the page dimensions.
    3. Takes the screen capture.

Questions:

  1. What's the proper way to initialize browser to correct the .close() error within the try-catch?
  2. How can I troubleshoot this further? Especially where these net::ERR_NAME_NOT_RESOLVED errors come from (is it DNS-related)?

Open to all sorts of suggestions, on-line reading material, etc.

Thanks.


1 comment thread

Both of these URLs produce screen captures for me, with the small modification `browser = ` rather th... (1 comment)

1 answer

+0
−0

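To fix TypeError: Cannot read properties of undefined (reading 'close'), remove the const in the following line. The const declares a new, block-scoped browser inside the try block that shadows the let browser declared outside it, so the outer variable is never assigned. Without the const, the outer browser is assigned in the common case when puppeteer.launch() doesn't throw: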

- const browser = await puppeteer.launch({
+ browser = await puppeteer.launch({

But if puppeteer.launch() throws, the outer browser is still undefined, so the finally block can still throw. Handle this in your finally block by using optional chaining:

- await browser.close();
+ await browser?.close();
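
Putting the two changes above together, the launch-and-cleanup pattern can be sketched as a small helper. This is only an illustration: withBrowser, launcher and task are hypothetical names, and launcher stands in for the puppeteer module (anything with an async launch() method) so the sketch does not depend on a particular install.

```javascript
// Sketch only: `launcher` stands in for the puppeteer module; `withBrowser`
// and `task` are hypothetical names, not part of Puppeteer.
async function withBrowser(launcher, task) {
	let browser; // declared outside try so the finally block can see it
	try {
		// Assign, don't redeclare: `const browser = ...` here would shadow
		// the outer variable and leave it undefined.
		browser = await launcher.launch({ headless: true });
		return await task(browser);
	} finally {
		// browser is still undefined if launch() threw, so guard the call.
		await browser?.close();
	}
}
```

It would be called as withBrowser(require("puppeteer"), async (browser) => { ... }). Because there is no catch here, any error from launch() or from the task reaches the caller unchanged, and the browser is closed either way.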

Also, as you're seeing, the

console.warn("Could not create a browser instance: ", err);

log is misleading, since you're clearly able to create a browser instance. This catch block runs if anything throws inside the try block, and launching the browser is only the first line of that block. I would make it:

- console.warn("Could not create a browser instance: ", err);
+ console.error(err);

As for your other question, the following listener forwards all logs from the website console into your Node console:

page.on('console', (msg) => console.log('DEBUG:', msg.text()) );

The output you're seeing is mostly harmless noise logged to a greater or lesser extent from site to site. I generally wouldn't add this listener since most large sites spam the console with random resource loading failures like this.
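
If you do want to know which hosts the net::ERR_NAME_NOT_RESOLVED lines refer to (the error means the host name lookup failed, so it is DNS-related), one option is to listen for the page's requestfailed event, which carries the request URL, rather than reading the console text. A minimal sketch, assuming page is a Puppeteer Page; logFailedRequests and log are hypothetical names:

```javascript
// Sketch only: `logFailedRequests` and `log` are hypothetical names. `page`
// is expected to be a Puppeteer Page, which emits "requestfailed" events.
function logFailedRequests(page, log = console.warn) {
	page.on("requestfailed", (request) => {
		// failure() returns null or an object whose errorText is a string
		// such as "net::ERR_NAME_NOT_RESOLVED".
		const reason = request.failure()?.errorText ?? "unknown";
		let host;
		try {
			host = new URL(request.url()).hostname;
		} catch {
			host = request.url(); // not a parseable URL, log it as-is
		}
		log(`FAILED ${reason} ${host}`);
	});
}
```

Call logFailedRequests(page) before page.goto(). If the failing hosts are all third-party ones and the page itself renders, the failures are probably unrelated to the navigation timeout.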

Generally speaking, I recommend the following goto:

await page.goto(url, {waitUntil: "domcontentloaded"});

which is least likely to get stuck and throw a loading timeout. If you're taking a screenshot, wait for all images to load manually before doing so ("networkidle0" can get stuck, as can the default "load" predicate).
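
One way to do the manual wait, sketched under the assumption that page is a Puppeteer Page: navigate with "domcontentloaded", then poll in the page until every img element reports complete. gotoAndWaitForImages is a hypothetical helper name and the 30 second default timeout is an arbitrary choice.

```javascript
// Sketch only: `gotoAndWaitForImages` is a hypothetical helper. `page` is
// expected to be a Puppeteer Page; the default timeout is an arbitrary choice.
async function gotoAndWaitForImages(page, url, timeout = 30000) {
	// Resolves once the HTML is parsed, without waiting for every subresource.
	await page.goto(url, { waitUntil: "domcontentloaded" });
	// The predicate runs in the page: true once every <img> currently in the
	// document has finished loading (img.complete is also true for images
	// that failed to load, so a broken image does not block the wait).
	await page.waitForFunction(
		() => Array.from(document.images).every((img) => img.complete),
		{ timeout }
	);
}
```

This only covers images present when the check runs. Lazy-loaded images, or content that scripts draw later, need their own condition, for example waiting for a specific selector.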

Furthermore, many sites take anti-bot measures, and headless mode makes bots much easier to detect. The default user agent basically says "I am a bot", so you can start by setting it to a normal mobile browser user agent:

const ua =
  "Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Mobile Safari/537.36";
await page.setUserAgent(ua);

If that doesn't work, you can try the Puppeteer stealth plugin, other fingerprint-obfuscating measures, or headful mode.

Note that browser automation has few silver bullets, so a script that can automate one site may not work on another. You'll generally need to adapt your code a bit to determine when a particular page is fully loaded or otherwise in an actionable state, depending on what you want to accomplish on it.
