org.faceless.pdf2
Class PageExtractor.Text

java.lang.Object
  extended by org.faceless.pdf2.PageExtractor.Text
Enclosing class:
PageExtractor

public abstract static class PageExtractor.Text
extends Object

Represents a piece of text extracted by a PageExtractor. Each Text object has a location on the page, a font size, a font name, a color and the text itself.

Since:
2.6.2

Constructor Summary
PageExtractor.Text()
           
 
Method Summary
 AnnotationMarkup createAnnotationMarkup(String type)
          Create a new AnnotationMarkup of the specified type to cover this text.
 float getAngle()
          Return the angle of rotation of this text on the page, in degrees clockwise from 12 o'clock.
abstract  Paint getColor()
          Return the color of this text.
 float[] getCorners()
          Return the four corners (x1,y1) (x2,y2) (x3,y3) (x4,y4) of the quadrilateral that encompasses the text, specified clockwise from bottom left.
abstract  String getFontName()
          Return the font name of this text.
abstract  float getFontSize()
          Return the font size of this Text in points.
 float getLength()
          Return the length of this Text in points.
abstract  float getOffset(int pos)
          Given the index of a letter in the text, return the start position of that letter.
abstract  PDFPage getPage()
          Return the PDFPage this text was found on - simply the page the parent PageExtractor was created from.
abstract  String getText()
          Return the text.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PageExtractor.Text

public PageExtractor.Text()
Method Detail

getLength

public float getLength()
Return the length of this Text in points. This method measures the baseline of the text, so for rotated text the value will always be positive regardless of the angle.

Returns:
the length of the text in points at its baseline
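
As a cross-check, the same length can be approximated from the array returned by getCorners(), whose baseline runs from (x1,y1) to (x4,y4). The sketch below is a hypothetical helper, not part of org.faceless.pdf2. It assumes the array is laid out as [x1, y1, x2, y2, x3, y3, x4, y4] and that the baseline edge of the quadrilateral coincides with the baseline that getLength() measures. Because it is a Euclidean distance, the result is never negative, whatever the rotation.

```java
/** Hypothetical helper for illustration; not part of org.faceless.pdf2. */
final class BaselineLength {
    private BaselineLength() {}

    /**
     * Returns the distance in points between (x1,y1) and (x4,y4).
     * Assumes the layout [x1, y1, x2, y2, x3, y3, x4, y4].
     */
    static float fromCorners(float[] corners) {
        if (corners == null || corners.length != 8) {
            throw new IllegalArgumentException("expected 8 values: x1,y1 .. x4,y4");
        }
        double dx = corners[6] - corners[0]; // x4 - x1
        double dy = corners[7] - corners[1]; // y4 - y1
        return (float) Math.hypot(dx, dy);
    }
}
```

With a real Text object this would be called as BaselineLength.fromCorners(text.getCorners()).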

getCorners

public final float[] getCorners()
Return the four corners (x1,y1) (x2,y2) (x3,y3) (x4,y4) of the quadrilateral that encompasses the text, specified clockwise from bottom left. The text baseline runs from (x1,y1) to (x4,y4).
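
For rotated text the quadrilateral is not axis-aligned, so callers that need a plain rectangle (for hit testing or cropping, say) have to derive one. The following is a minimal sketch of that step, assuming the returned array is laid out as [x1, y1, x2, y2, x3, y3, x4, y4]. The class and method names are hypothetical and not part of org.faceless.pdf2.

```java
/** Hypothetical helper for illustration; not part of org.faceless.pdf2. */
final class CornerBounds {
    private CornerBounds() {}

    /**
     * Returns {minX, minY, maxX, maxY} for the quadrilateral.
     * Assumes the layout [x1, y1, x2, y2, x3, y3, x4, y4].
     */
    static float[] boundingBox(float[] corners) {
        if (corners == null || corners.length != 8) {
            throw new IllegalArgumentException("expected 8 values: x1,y1 .. x4,y4");
        }
        float minX = corners[0], maxX = corners[0];
        float minY = corners[1], maxY = corners[1];
        // Walk the remaining three (x, y) pairs.
        for (int i = 2; i < corners.length; i += 2) {
            minX = Math.min(minX, corners[i]);
            maxX = Math.max(maxX, corners[i]);
            minY = Math.min(minY, corners[i + 1]);
            maxY = Math.max(maxY, corners[i + 1]);
        }
        return new float[] { minX, minY, maxX, maxY };
    }
}
```

With a real Text object this would be called as CornerBounds.boundingBox(text.getCorners()).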


createAnnotationMarkup

public AnnotationMarkup createAnnotationMarkup(String type)
Create a new AnnotationMarkup of the specified type to cover this text. The annotation is not added to the page.

Parameters:
type - the type of markup - "Highlight", "Underline" etc.
Since:
2.8

getAngle

public float getAngle()
Return the angle of rotation of this text on the page, in degrees clockwise from 12 o'clock. Most text is not rotated and so will return 0.

Returns:
the angle of the text

getFontSize

public abstract float getFontSize()
Return the font size of this Text in points.


getOffset

public abstract float getOffset(int pos)
Given the index of a letter in the text, return the start position of that letter. Because text may not be on a horizontal line, this value is returned as a float in the range 0 to 1 (0 being the start of the text, 1 being the end). For the common case where text is horizontal, you can calculate its start position like so:
 float left = text.getCorners()[0] + (text.getOffset(pos) * text.getLength());
 

Parameters:
pos - the index of the letter in the Text to retrieve the position for, in the range 0 to getText().length() - 1
Since:
2.6.12
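
When the text is not horizontal, the fraction returned by getOffset can be applied to both coordinates of the baseline instead of only to x. The sketch below interpolates between (x1,y1) and (x4,y4), the baseline endpoints described under getCorners. It is a hypothetical helper, not part of org.faceless.pdf2, and it assumes the corner array is laid out as [x1, y1, x2, y2, x3, y3, x4, y4] and that letters are spaced along a straight baseline.

```java
/** Hypothetical helper for illustration; not part of org.faceless.pdf2. */
final class BaselinePoint {
    private BaselinePoint() {}

    /**
     * Returns {x, y} of the point that lies the given fraction of the way
     * along the baseline from (x1,y1) to (x4,y4).
     *
     * @param corners eight values, assumed to be x1,y1 .. x4,y4
     * @param offset  fraction in the range 0 to 1, as returned by getOffset
     */
    static float[] at(float[] corners, float offset) {
        if (corners == null || corners.length != 8) {
            throw new IllegalArgumentException("expected 8 values: x1,y1 .. x4,y4");
        }
        if (offset < 0f || offset > 1f) {
            throw new IllegalArgumentException("offset must be in the range 0 to 1");
        }
        float x = corners[0] + offset * (corners[6] - corners[0]);
        float y = corners[1] + offset * (corners[7] - corners[1]);
        return new float[] { x, y };
    }
}
```

With a real Text object this would be called as BaselinePoint.at(text.getCorners(), text.getOffset(pos)). For horizontal text the x value matches the one-line formula shown for getOffset.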

getPage

public abstract PDFPage getPage()
Return the PDFPage this text was found on - simply the page the parent PageExtractor was created from.

Since:
2.6.12

getColor

public abstract Paint getColor()
Return the color of this text.

Returns:
the color

getFontName

public abstract String getFontName()
Return the font name of this text.

Returns:
the font name

getText

public abstract String getText()
Return the text.

Returns:
the text


Copyright © 2001-2004 Big Faceless Organization