node.js - Chrome automatically changed the DOM, or it is different from what cheerio gets


I'm writing a web scraping application using cheerio.js. Things were going fine until I noticed that cheerio's $('tbody tr') returns nothing, while when I open the same website in Chrome, jQuery's $('tbody tr') returns the rows in the table body. In the body that cheerio sees, there is no tbody. The structure is <table><theader></theader><tr></tr><tr></tr></table>. Did Chrome make the change? Or did cheerio parse the HTML response incorrectly?

The following 3 HTML code snippets look the same when rendered by a browser, yet their original source code is different.

  1. No thead and no tbody in the source code

    <table><tr><td>row1</td></tr><tr><td>row2</td></tr></table>

  2. No tbody in the source code

    <table><thead></thead><tr><td>row1</td></tr><tr><td>row2</td></tr></table>

  3. A tbody but no thead in the source code

    <table><tbody><tr><td>row1</td></tr><tr><td>row2</td></tr></tbody></table>

According to w3schools.com, browsers can use the thead, tbody, and tfoot elements to enable scrolling of the table body independently of the header and footer.

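Browsers can optimize, normalize, or modify the DOM before using it for display, as long as the resulting DOM renders as intended. In particular, when rows sit directly inside a table, the browser's HTML parser wraps them in an implicit tbody, which is why $('tbody tr') finds rows in Chrome even though the source has no tbody.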
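To make the normalization step concrete, here is a rough sketch of the idea in plain JavaScript. It is a simplified model, not the browser's actual parsing algorithm: nodes are plain `{ tag, children }` objects, and `normalizeTable` and `row` are hypothetical helpers invented for this illustration.

```javascript
// Simplified model only: real browsers follow the HTML parsing algorithm,
// which handles many more cases than this sketch.
function normalizeTable(table) {
  const children = [];
  let implicitBody = null;
  for (const child of table.children) {
    if (child.tag === 'tr') {
      // A <tr> placed directly under <table> is wrapped in an implicit <tbody>.
      if (!implicitBody) {
        implicitBody = { tag: 'tbody', children: [] };
        children.push(implicitBody);
      }
      implicitBody.children.push(child);
    } else {
      // <thead>, <tbody>, <tfoot> are kept as they are and end the implicit run.
      implicitBody = null;
      children.push(child);
    }
  }
  return { tag: 'table', children };
}

// Hypothetical helper that builds a one-cell row.
const row = (text) => ({ tag: 'tr', children: [{ tag: 'td', children: [text] }] });

// Snippet 2 from above: a <thead>, then rows directly under <table>.
const source = {
  tag: 'table',
  children: [{ tag: 'thead', children: [] }, row('row1'), row('row2')],
};
const normalized = normalizeTable(source);

console.log(normalized.children.map((c) => c.tag)); // [ 'thead', 'tbody' ]
```

After this step the tree for snippet 2 has the same shape a `tbody tr` selector expects, even though the source never contained a tbody.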

In this case, cheerio's parser reads the source code (the result of the Node.js request) as-is and creates an in-memory DOM representation that you can traverse and modify later.

jQuery, on the other hand, when run in the browser, reads the normalized and optimized DOM that the browser produced after parsing and processing the HTML.

While the two DOMs may be different, they look the same when presented to the user. This is not a bug but a feature.

