How to crawl specific ASP.NET pages using Python?
I want to crawl an ASP.NET website, but the URLs of its pages are all the same. How can I crawl specific pages using Python?
Here is the website I want to crawl: http://www.fveconstruction.ch/index.htm
(I am using BeautifulSoup, urllib, and Python 3.)
What information should I use to distinguish one page from another?
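If the target website is a single-page application, it can't be crawled by simply fetching URLs, because the content is loaded by JavaScript and the address never changes. As a workaround, you can see where the requests (GET, POST, etc.) go when you manually navigate through the website, for example in the network tab of your browser's developer tools, and ask your crawler to use those. Or teach your crawler to execute JavaScript, at least what's on the target website.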
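A minimal sketch of the "replay the requests" workaround, assuming the site is an ASP.NET Web Forms application that navigates through postbacks: the browser re-submits the page's hidden fields (such as `__VIEWSTATE`) and names the clicked control in `__EVENTTARGET`. Whether this particular site works that way is an assumption you would need to confirm in the network tab. The helper names (`HiddenFieldParser`, `build_postback`, `fetch`) and the control name in the usage comment are hypothetical. The standard-library `html.parser` is used here to keep the example dependency-free; BeautifulSoup could collect the hidden inputs just as well.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request, urlopen


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of every <input type="hidden"> element."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            self.fields[attributes["name"]] = attributes.get("value") or ""


def build_postback(page_url, html, event_target, event_argument=""):
    """Build the POST request a browser would send when a control is clicked.

    Assumption: the page is a Web Forms page whose state lives in hidden
    inputs and whose clicked control is identified by __EVENTTARGET.
    """
    parser = HiddenFieldParser()
    parser.feed(html)
    form = dict(parser.fields)
    form["__EVENTTARGET"] = event_target
    form["__EVENTARGUMENT"] = event_argument
    body = urlencode(form).encode("utf-8")
    # Supplying a request body makes urllib send a POST instead of a GET.
    return Request(page_url, data=body)


def fetch(request_or_url):
    """Download a page and return its decoded text."""
    with urlopen(request_or_url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Usage (not run here; the control name is a made-up placeholder):
#   first_page = fetch(page_url)
#   request = build_postback(page_url, first_page, "ctl00$Menu$Link2")
#   second_page = fetch(request)
```

With this approach the thing that distinguishes one page from another is not the URL but the form data sent with the request, so the crawler has to keep track of the `__EVENTTARGET` value (and the current hidden fields) for each page it wants.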
Strictly speaking, it's the website that needs to change to be crawlable: it needs to provide a reasonable non-AJAX version of every page that needs to be indexed, or links to every page that needs to be indexed. Or use pushState in AngularJS.