pysseract.ResultIterator

class pysseract.ResultIterator

An iterator over the recognition results for an image at a chosen page hierarchy level. If you are familiar with C/C++ iterators, the methods of this class should look familiar. The returned iterator must be deleted after use, so ideally it should be used in a with block. The iterator points to data held within the Pysseract instance that spawned it, so it can only be used while that instance still exists and has not been subjected to a call to Init, SetImage, Recognize, Clear, End, or anything else that changes the internal PAGE_RES.

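import pysseract

t: pysseract.Pysseract  # placeholder: an existing instance with an image already set
resIter = t.GetIterator()
LEVEL = pysseract.PageIteratorLevel.TEXTLINE
while True:
    # Read the current text line before advancing the iterator
    box = resIter.BoundingBox(LEVEL)
    text = resIter.GetUTF8Text(LEVEL)
    # Next() returns False once the end of the page is reached
    if not resIter.Next(LEVEL):
        break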

For more examples, please consult https://github.com/tesseract-ocr/tesseract/wiki/APIExample

__init__()

Initialize self. See help(type(self)) for accurate signature.

Methods

ResultIterator.Begin(self)

Moves the iterator to the start of the page to begin an iteration.

ResultIterator.BlanksBeforeWord(self)

Returns whether there are any blank spaces before the start of the current text object

ResultIterator.BoundingBox(self, pageIterLv)

Returns the bounding box of the current item at the specified page hierarchy level.

ResultIterator.Confidence(self, pageIterLv)

Returns the confidence level expressed by the model for the current object at the specified page hierarchy level.
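
As an illustration of how Confidence might be combined with GetUTF8Text and Next, the sketch below keeps only the objects whose confidence reaches a threshold. The helper name confident_texts and the default threshold of 60.0 are hypothetical, and the threshold assumes confidence is reported as a percentage between 0 and 100, as in Tesseract's C++ API; check the values your build returns before relying on it.

```python
def confident_texts(res_iter, level, min_confidence=60.0):
    """Collect the text of every object at `level` whose confidence is high enough.

    `res_iter` is any object exposing Confidence, GetUTF8Text and Next with the
    signatures listed on this page. `min_confidence` is an arbitrary example
    value that assumes a 0-100 scale.
    """
    kept = []
    while True:
        # Inspect the current object before advancing.
        if res_iter.Confidence(level) >= min_confidence:
            kept.append(res_iter.GetUTF8Text(level))
        # Next() returns False once the end of the page is reached.
        if not res_iter.Next(level):
            break
    return kept
```

With a real iterator this would be called as confident_texts(t.GetIterator(), pysseract.PageIteratorLevel.TEXTLINE), where t is a Pysseract instance that already holds an image.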

ResultIterator.Empty(self, pageIterLv)

Returns a boolean flag indicating whether the iterator is empty at the given PageIteratorLevel

ResultIterator.GetBestLSTMSymbolChoices(self)

Returns the LSTM choices for every LSTM timestep for the current word.

ResultIterator.GetUTF8Text(self, pageIterLv)

Returns the text of the current object at the specified page hierarchy level in UTF-8 format

ResultIterator.IsAtBeginningOf(self, pageIterLv)

Returns whether the iterator is at the logical beginning of the given level.
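
One possible use of IsAtBeginningOf is to regroup a word-by-word walk into lines. The sketch below is a minimal illustration, not part of pysseract: the helper name words_by_line is hypothetical, and both levels are passed in by the caller because this page only confirms PageIteratorLevel.TEXTLINE (a word-level constant is assumed to exist alongside it).

```python
def words_by_line(res_iter, word_level, line_level):
    """Walk the page word by word and group the words into lines.

    A new group is started whenever the iterator reports that it sits at the
    beginning of an object at `line_level`.
    """
    lines = []
    while True:
        # Start a new line at a line boundary (or for the very first word).
        if res_iter.IsAtBeginningOf(line_level) or not lines:
            lines.append([])
        lines[-1].append(res_iter.GetUTF8Text(word_level))
        # Next() returns False once the end of the page is reached.
        if not res_iter.Next(word_level):
            break
    return lines
```

The result is a list of lists, one inner list of word strings per detected line.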

ResultIterator.IsAtFinalElement(self, …)

Implements PageIterator’s IsAtFinalElement correctly in a bidirectional (BiDi) text context.

ResultIterator.Next(self, pageIterLv)

Moves to the start of the next object at the given level in the page hierarchy, in the appropriate reading order, and returns False if the end of the page was reached.
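
Because Next reports the end of the page through its return value, the read-then-advance loop can be wrapped in a Python generator. The following is a minimal sketch under that assumption; iter_level is a hypothetical helper, not a pysseract function, and it calls Begin first so that every run starts from the top of the page.

```python
def iter_level(res_iter, level):
    """Yield (text, bounding_box) for each object at `level` on the page.

    `res_iter` is any object exposing Begin, GetUTF8Text, BoundingBox and Next
    with the signatures listed on this page.
    """
    # Rewind so the walk always covers the whole page.
    res_iter.Begin()
    while True:
        # Read the current object before advancing.
        yield res_iter.GetUTF8Text(level), res_iter.BoundingBox(level)
        # Next() returns False once the end of the page is reached.
        if not res_iter.Next(level):
            break
```

A caller could then write for text, box in iter_level(resIter, LEVEL): and process each item, keeping in mind that the iterator is only valid while the Pysseract instance that produced it is unchanged.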

ResultIterator.ParagraphIsLtr(self)

Returns whether the current paragraph’s dominant reading direction is left-to-right (as opposed to right-to-left).