php - Screen Scraping


Hi, I'm trying to implement a screen scraping scenario on my website and have the following set up so far. What I'm ultimately trying to do is replace every link in the $results variable that contains "ResultsDetails.aspx?" with "result-scrape-detail/" and then output the result again. Can someone point me in the right direction?

    <?php
    $url = "http://mysite:90/test/label/baggage/resultindex.spand";
    $raw = file_get_contents($url);

    // Strip tabs, newlines, double spaces, and other control characters
    $newlines = array("\t", "\n", "\r", "\x20\x20", "\0", "\x0B");
    $content = str_replace($newlines, "", html_entity_decode($raw));

    // Cut out the part of the page from the pageback div to the closing body tag
    $start = strpos($content, "<div id='pageback'");
    $end = strpos($content, '</body>', $start) + 6;
    $results = substr($content, $start, $end - $start);

    $pattern = 'ResultsDetails.aspx?';
    $replacement = 'result-scrape-detail/';
    // Attempted replacement: the pattern has no regex delimiters and the return value is not stored
    preg_replace($pattern, $replacement, $results);

    echo $results;

Use a DOM tool such as PHP Simple HTML DOM Parser.

With it you can find all the links you are looking for using a jQuery-like selector syntax.

    // Create a DOM object from the HTML source
    $dom = file_get_html('http://www.domain.com/path/to/page');

    // Find all matching links
    foreach ($dom->find('a[href^=ResultsDetails.aspx]') as $node) {
        // Change the href attribute value
        $node->href = 'result-scrape-detail/';
    }

    // Output the modified DOM
    echo $dom->outertext;
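
Note that the snippet above overwrites the whole href, so anything after "ResultsDetails.aspx?" is lost. If the query string needs to survive, something along these lines might work. This is only a sketch using PHP's built-in DOMDocument rather than the library above; the function name `rewriteResultLinks` and the sample markup are made up for illustration, and it assumes the DOM extension is enabled.

```php
<?php
// Hypothetical helper (not from the original post): for every <a> whose href
// starts with $oldPrefix, swap that prefix for $newPrefix and keep the rest.
function rewriteResultLinks(string $html, string $oldPrefix, string $newPrefix): string
{
    $dom = new DOMDocument();

    // Scraped pages are rarely valid HTML, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('a') as $link) {
        $href = $link->getAttribute('href');
        if (strpos($href, $oldPrefix) === 0) {
            // Keep whatever followed the old prefix (e.g. the query parameters).
            $link->setAttribute('href', $newPrefix . substr($href, strlen($oldPrefix)));
        }
    }

    // saveHTML() returns a full document, including the wrapper tags libxml adds.
    return $dom->saveHTML();
}

// Made-up sample markup standing in for the scraped page.
$sample = '<div id="pageback">'
    . '<a href="ResultsDetails.aspx?id=7">Details</a> '
    . '<a href="Other.aspx">Other</a>'
    . '</div>';

$rewritten = rewriteResultLinks($sample, 'ResultsDetails.aspx?', 'result-scrape-detail/');
echo $rewritten;
```

For a one-off replacement on a string you already trust, a plain `str_replace('ResultsDetails.aspx?', 'result-scrape-detail/', $results)` with the return value assigned back may be enough; the DOM route is mainly useful when only links should be touched.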
