Simple HTML Dom Scraping Google Result



I need to scrape the short piece of text that Google returns as part of the "Knowledge Graph" result (the panel usually shown on the right-hand side), which it takes from Wikipedia, so that I can convert the plain text to a voice answer. Using Simple HTML DOM I have no problem scraping this kind of information from Bing or Ask, but on Google I can't get at the DIV (and SPAN) in which the result is nested. The simple function is below:



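// assumes the Simple HTML DOM library file sits next to this script
include 'simple_html_dom.php';

// URL-encode the question so spaces and special characters are safe in the query string
$question = urlencode($_GET['question'] ?? '');
$address = 'http://ift.tt/1kVmble'.$question;
$ret = scraping_Google($address);

function scraping_Google($url) {
    // create HTML DOM; file_get_html() returns false when the page cannot be loaded
    $html = file_get_html($url);
    if (!$html) {
        return null;
    }

    // get the Knowledge Graph description; find() returns null when nothing matches
    $node = $html->find('div.kno-rdesc', 0);
    $ret = $node ? $node->plaintext : null;

    // clean up memory
    $html->clear();
    unset($html);

    return $ret;
}

echo $ret;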


div.kno-rdesc is where the content is nested (I found it easily using the code inspector in Chrome), yet I can't parse this small piece of information out of the page. Can anybody help? Cheers!
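
One thing that may be worth checking, as an assumption rather than a confirmed diagnosis: Chrome's inspector shows the DOM after scripts have run, so the HTML a script downloads does not necessarily contain the same div. The sketch below shows a lookup that returns null instead of failing when the element is absent. It uses PHP's built-in DOMDocument and DOMXPath rather than Simple HTML DOM so that it runs without extra files. The helper name first_div_text_by_class and the sample markup are made up for illustration and are not real Google output.

```php
<?php
// Hypothetical helper: returns the trimmed text of the first <div> carrying
// the given class, or null when the markup has no such element.
function first_div_text_by_class(string $html, string $class): ?string
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally so imperfect markup does not emit notices.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Match the class as a whole token inside the class attribute.
    $query = sprintf(
        '//div[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );
    $nodes = $xpath->query($query);

    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    return trim($nodes->item(0)->textContent);
}

// Made-up sample markup (an assumption, not what Google actually serves).
$sample = '<html><body><div class="kno-rdesc"><span>Example description.</span></div></body></html>';

// The first call finds the div; the second shows the null result for a page without it.
var_dump(first_div_text_by_class($sample, 'kno-rdesc'));
var_dump(first_div_text_by_class('<html><body><p>nothing here</p></body></html>', 'kno-rdesc'));
```

Running the same helper on the HTML string the script actually receives would show whether the div is present there at all, before any time is spent on the selector.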

