Sunday, 17 August 2014

Jsoup parsing - Java



Hey, so I'm doing a project in Java in which I parse data from IMDb and Rotten Tomatoes using Jsoup.


I've gotten most of the info I need, but there is some data I don't know how to get.


For example, for the movie at http://ift.tt/Wh40gq I need to get the number next to User Ratings: 313,393 (I need just the number), but using something like



Elements links19 = doc5.select("p[class=critic_stats]");


gets me the text of all the <p class="critic_stats"> elements:


"Average Rating: 5.8/10 Reviews Counted: 123 Fresh: 80 | Rotten: 43 Average Rating: 5.3/10 Critic Reviews: 23 Fresh: 13 | Rotten: 10 liked it Average Rating: 3.7/5 User Ratings: 313,393"
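One thing that might work for the number, as a minimal sketch: take the combined text that the select above returns and pull out the digits that follow "User Ratings:" with a regular expression. The class and method names here (UserRatingsExtractor, extractUserRatings) are made up for illustration, and the label is assumed to appear exactly as in the output above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UserRatingsExtractor {
    // The label, optional whitespace, then digits with optional thousands separators.
    private static final Pattern USER_RATINGS =
            Pattern.compile("User Ratings:\\s*(\\d[\\d,]*)");

    /** Returns the user ratings count, or -1 if the label is not found in the text. */
    public static long extractUserRatings(String criticStatsText) {
        Matcher m = USER_RATINGS.matcher(criticStatsText);
        if (!m.find()) {
            return -1;
        }
        // Drop the commas before parsing, e.g. "313,393" becomes 313393.
        return Long.parseLong(m.group(1).replace(",", ""));
    }

    public static void main(String[] args) {
        String text = "Average Rating: 5.8/10 Reviews Counted: 123 Fresh: 80 | Rotten: 43 "
                + "Average Rating: 5.3/10 Critic Reviews: 23 Fresh: 13 | Rotten: 10 "
                + "liked it Average Rating: 3.7/5 User Ratings: 313,393";
        System.out.println(extractUserRatings(text)); // prints 313393
    }
}
```

With Jsoup this would presumably be called as extractUserRatings(links19.text()), since Elements.text() returns the combined text of all matched elements.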


Also, for the same movie on IMDb, http://ift.tt/uRf2zz, I'm trying to get:


Country: USA | Bulgaria
Language: English
Release Date: 17 August 2012 (USA)
Sound Mix: Dolby Digital | Datasat
Color: Color
Aspect Ratio: 2.35 : 1


Again, I only need the parts in bold, but everything is in



<div class="txt-block">
<h4 class="inline">


and I don't know if there's any way to get specific data based on, for example,



<h4 class="inline">Sound Mix:</h4>

<a ... itemprop='url'>Dolby Digital</a>
<a ... itemprop='url'>Datasat</a>


Any ideas on how to get those? I'm not sure what they are: children, attributes?
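For reference, here is a rough sketch of the kind of thing I'm after, assuming the values are <a itemprop='url'> links sitting next to the h4 inside the same div.txt-block. It relies on Jsoup's :containsOwn(text) pseudo-selector and Elements.eachText() (which needs Jsoup 1.10.3 or later on the classpath). The class and method names (TxtBlockExtractor, valuesFor) and the sample HTML with its href="#" placeholders are made up for illustration and are not the real IMDb markup.

```java
import java.util.Collections;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class TxtBlockExtractor {
    /**
     * Returns the link texts from the txt-block whose h4 label contains the given
     * text, or an empty list if no such block exists. The label is pasted into the
     * selector as-is, so it should not contain parentheses.
     */
    public static List<String> valuesFor(Document doc, String label) {
        // :containsOwn matches an element whose own text contains the label (case-insensitive).
        Element heading = doc
                .select("div.txt-block > h4.inline:containsOwn(" + label + ")")
                .first();
        if (heading == null) {
            return Collections.emptyList();
        }
        // The values are siblings of the h4, so search inside the parent div.
        return heading.parent().select("a[itemprop=url]").eachText();
    }

    public static void main(String[] args) {
        // Made-up sample markup modelled on the fragment above.
        String html = "<div class=\"txt-block\">"
                + "<h4 class=\"inline\">Sound Mix:</h4> "
                + "<a href=\"#\" itemprop='url'>Dolby Digital</a> | "
                + "<a href=\"#\" itemprop='url'>Datasat</a>"
                + "</div>";
        Document doc = Jsoup.parse(html);
        System.out.println(valuesFor(doc, "Sound Mix")); // prints [Dolby Digital, Datasat]
    }
}
```

For fields whose value might be plain text rather than links (possibly Release Date or Aspect Ratio, though I haven't checked the markup), heading.parent().ownText() may be the call to try instead, since it returns only the text directly inside the div.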

