
How do I get jsoup to work?

Developer https://www.devze.com 2023-04-09 11:38 Source: Internet

I've been going through this part of the jsoup cookbook to get some information from a div:

http://jsoup.org/cookbook/extracting-data/dom-navigation

Document doc = Jsoup.connect(path).get();
Element cat = doc.getElementById("category_1");
Elements links = cat.getElementsByTag("a");
for (Element link : links) 
{
    rstring += link.attr("href");
    rstring += link.text() + "\n";
}

The code I wrote above does not work, and I've been working on this for hours.

I can get some of what I want with different jsoup functions, but I need to get the links in this particular section so I can populate an array of certain things for my Android app.

I'm attempting to parse http://android.myfewclicks.com for testing while building an app for my real site.

Any assistance at all would be wonderful. jsoup just won't cooperate.

    <table class="table_list">
        <tbody class="header" id="category_1">
            <tr>
                <td colspan="4">
                    <div class="cat_bar">
                        <h3 class="catbg">
                            <a class="collapse" href="http://android.myfewclicks.com/index.php?action=collapse;c=1;sa=collapse;c707bdb315=de9d7f201a0964cbab3d56e683507ad7#c1"><img src="http://android.myfewclicks.com/Themes/default/images/collapse.gif" alt="-" /></a>
                            <a class="unreadlink" href="http://android.myfewclicks.com/index.php?action=unread;c=1">Unread Posts</a>
                            <a id="c1"></a><a href="http://android.myfewclicks.com/index.php?action=collapse;c=1;sa=collapse;c707bdb315=de9d7f201a0964cbab3d56e683507ad7#c1">Category A</a>
                        </h3>
                    </div>
                </td>
            </tr>
        </tbody>

On my test forum there are four categories. The three links inside this particular part are one set of the four. If I can figure out how to parse these out adequately, I should be able to make a big leap on my app. But jsoup isn't behaving the way I think it should, or I'm missing something crucial.


You apparently need to log in first in order to get the links with an href. When I open the site in my browser while not logged in, I see:

<tbody class="header" id="category_1">
    <tr>
        <td colspan="4">
            <div class="cat_bar">
                <h3 class="catbg">
                    <a id="c1"></a>Category A
                </h3>
            </div>
        </td>
    </tr>
</tbody>

I can get the links as follows:

Document document = Jsoup.connect("http://android.myfewclicks.com/").get();
Elements category1links = document.select("#category_1 a");

for (Element category1link : category1links) {
    System.out.println(category1link);
}

Which prints

<a id="c1"></a>

Note that there's no href or text!
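To see the difference without hitting the site, here is a minimal offline sketch. It parses two trimmed-down HTML strings modelled on the markup quoted above (they are abbreviations I made up for illustration, not the full pages) and selects only anchors that actually carry an href, using the `a[href]` attribute selector. The class and method names are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class CategoryLinks {

    // Returns "href text" for every anchor under #category_1 that has an href attribute.
    static List<String> hrefLinks(String html) {
        Document doc = Jsoup.parse(html);
        List<String> result = new ArrayList<>();
        for (Element link : doc.select("#category_1 a[href]")) {
            result.add(link.attr("href") + " " + link.text());
        }
        return result;
    }

    public static void main(String[] args) {
        // Abbreviated version of what an anonymous visitor receives: only a bare named anchor.
        String loggedOut = "<table><tbody id=\"category_1\"><tr><td>"
                + "<h3><a id=\"c1\"></a>Category A</h3></td></tr></tbody></table>";

        // Abbreviated version of the markup from the question: anchors with href values.
        String loggedIn = "<table><tbody id=\"category_1\"><tr><td><h3>"
                + "<a class=\"unreadlink\" href=\"http://android.myfewclicks.com/index.php?action=unread;c=1\">Unread Posts</a>"
                + "<a id=\"c1\"></a></h3></td></tr></tbody></table>";

        System.out.println(hrefLinks(loggedOut)); // no anchors with href, so an empty list
        System.out.println(hrefLinks(loggedIn));  // one entry for the "Unread Posts" link
    }
}
```

If the first list is empty for the live page as well, the selector is not the problem: the server simply did not send those links.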

Jsoup does not log in for you automatically, nor does it take over the cookies of whatever browser is installed on your machine. You need to log in and maintain the session cookie yourself. See also "Sending POST request with username and password and save session cookie" for an example.
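As a rough sketch of what "log in and maintain the session cookie yourself" could look like with jsoup: post the credentials, keep the cookies from the response, and send them along with the next request. The login URL is taken from the command line, and the form field names `username` and `password` are assumptions; check the forum's actual login form for its action URL and input names (many forums also expect hidden token fields, which this sketch ignores).

```java
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.io.IOException;
import java.util.Map;

public class LoggedInScrape {

    // Hypothetical form field names; replace them with the names used by the real login form.
    static final String USER_FIELD = "username";
    static final String PASSWORD_FIELD = "password";

    // Posts the credentials and returns the cookies the server set (including the session cookie).
    static Map<String, String> login(String loginUrl, String user, String password) throws IOException {
        Connection.Response response = Jsoup.connect(loginUrl)
                .data(USER_FIELD, user)
                .data(PASSWORD_FIELD, password)
                .method(Connection.Method.POST)
                .execute();
        return response.cookies();
    }

    // Fetches a page while sending the previously obtained cookies.
    static Document fetch(String url, Map<String, String> cookies) throws IOException {
        return Jsoup.connect(url).cookies(cookies).get();
    }

    // Builds one "href text" line per link under #category_1 that has an href.
    static String describe(Document doc) {
        StringBuilder sb = new StringBuilder();
        for (Element link : doc.select("#category_1 a[href]")) {
            sb.append(link.attr("href")).append(' ').append(link.text()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Usage (all three arguments are supplied by you): <loginUrl> <user> <password>
        Map<String, String> cookies = login(args[0], args[1], args[2]);
        Document doc = fetch("http://android.myfewclicks.com/", cookies);
        System.out.print(describe(doc));
    }
}
```

On Android, run the network calls off the main thread, and reuse the cookie map for every later request so the session stays alive.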
