BeautifulSoup counting tags without parsing deep inside them

I thought about the following while writing an answer to this question.

Suppose I have a deeply nested XML file like this (but much more nested and much longer):


<section name="1">
    <subsection name="foo">
        <subsubsection name="bar">
            <deeper name="hey">
                <much_deeper name="yo">
                    <li>Some content</li>
                </much_deeper>
            </deeper>
        </subsubsection>
    </subsection>
</section>
<section name="2">
    ... and so forth
</section>

The problem with len(soup.find_all("section")) is that every element BeautifulSoup returns "carries" its whole subtree with it, even though I only need the count.

So, two questions:

Is there a way to make BeautifulSoup count the section tags without returning their whole content?

Would that be more efficient, or is it the same process internally?
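
To make the idea concrete, here is a minimal sketch of the kind of counting I have in mind. It does not use BeautifulSoup at all: it swaps in the standard library's html.parser.HTMLParser, which reports each opening tag as an event and never builds a tree. The names TagCounter and count_tags and the shortened sample are made up for illustration, and I have not measured whether this is actually faster on a large file.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Count opening tags with a given name without building a tree.

    Hypothetical helper for illustration; not part of BeautifulSoup.
    """

    def __init__(self, tag_name):
        super().__init__()
        self.tag_name = tag_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # Called once per opening tag; only the counter is kept in memory.
        if tag == self.tag_name:
            self.count += 1


def count_tags(markup, tag_name):
    """Return how many opening tags named tag_name appear in markup."""
    parser = TagCounter(tag_name)
    parser.feed(markup)
    parser.close()
    return parser.count


# Shortened version of the sample above (assumed data, two sections).
sample = """
<section name="1">
    <subsection name="foo">
        <li>Some content</li>
    </subsection>
</section>
<section name="2">
</section>
"""

print(count_tags(sample, "section"))
```

One caveat: HTMLParser lowercases tag names, so this only suits documents where tag case does not matter. For strictly well-formed XML with a single root element, an event-based XML parser would be the closer match.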
