How to crawl specific ASP.NET pages using Python?


I want to crawl an ASP.NET website, but the URLs of its pages are the same. How can I crawl specific pages using Python?

Here is the website I want to crawl: http://www.fveconstruction.ch/index.htm

(I am using BeautifulSoup, urllib, and Python 3.)

What information should I use to distinguish one page from another?

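If the target website is a single-page application, it can't be crawled in the usual way. As a workaround, you can watch which requests (GET, POST, etc.) are sent when you navigate through the website manually, and have your crawler issue those same requests. Or teach your crawler to execute JavaScript, or at least the JavaScript that is on the target website.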
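The request-replay workaround could be sketched roughly as follows. This is a minimal sketch, not a definitive crawler: the helper names (`build_request`, `fetch`, `matching_links`), the example URL, and the form field names are all hypothetical, and the real values have to be copied from what the browser's network tab shows for the target site. It uses only the standard library so it runs without extra packages; BeautifulSoup could replace the small link parser.

```python
import re
import urllib.parse
import urllib.request
from html.parser import HTMLParser


def build_request(url, method="GET", form_fields=None, headers=None):
    """Recreate a request observed in the browser's network tab (not sent yet)."""
    data = None
    if form_fields is not None:
        # Form fields travel URL-encoded in the request body.
        data = urllib.parse.urlencode(form_fields).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers or {}, method=method)


def fetch(request, timeout=10):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class LinkCollector(HTMLParser):
    """Collect absolute URLs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page they came from.
                self.links.append(urllib.parse.urljoin(self.base_url, href))


def matching_links(html, base_url, pattern):
    """Return the links in html whose absolute URL matches the regex pattern."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return [link for link in collector.links if re.search(pattern, link)]
```

To use it, build a request from the method, URL, and form fields seen in the network tab, pass it to `fetch`, and then hand the returned HTML to `matching_links` with a pattern that identifies the pages of interest (for example a query-string parameter, if the site uses one).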

It's the website that needs to change in order to be crawlable: it needs to provide a reasonable non-AJAX version of every page that needs to be indexed, or links to every page that needs to be indexed. Or it can use pushState in AngularJS.

