C#: Getting the coordinates of a string using ITextExtractionStrategy and LocationTextExtractionStrategy in iTextSharp
I have a PDF file that I am reading into a string using ITextExtractionStrategy. From that string I take a substring such as My name is XYZ, and I need the rectangular coordinates of that substring in the PDF file, but I have not been able to get them. From searching I learned about LocationTextExtractionStrategy, but I cannot work out how to use it to get the coordinates.
Here is the code:
ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
string currentText = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);
currentText = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(currentText)));
text.Append(currentText);
string getcoordinate="My name is XYZ";
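How can I get the rectangular coordinates of this substring using iTextSharp?
Please help.
Here is a very simple version of an implementation.
Before implementing it, it is important to know that PDFs have no concept of "words", "paragraphs", "sentences", and so on. Also, text within a PDF is not necessarily laid out left to right and top to bottom, and this has nothing to do with non-LTR languages. The phrase "Hello World" could be written into the PDF as: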
Draw H at (10, 10)
Draw ell at (20, 10)
Draw rld at (90, 10)
Draw o Wo at (50, 20)
It could also be written as:
Draw Hello World at (10,10)
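The ITextExtractionStrategy interface that you need to implement has a method called RenderText that is called once for every chunk of text within the PDF. Notice that I said "chunk" and not "word". In the first example above the method would be called four times for those two words. In the second example it would be called once for those two words. This is the most important part to understand: PDFs don't have words, and because of this, iTextSharp doesn't have words either. The "word" part is entirely up to you to solve.
Also, as I said above, PDFs don't have paragraphs. The reason to be aware of this is that PDFs cannot wrap text onto a new line. Any time you see something that looks like a paragraph return, you are actually seeing a brand-new text drawing command whose coordinates differ from the previous line's.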
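To make the "chunk" point concrete, here is a minimal sketch that uses no iTextSharp types at all. The tuples are hypothetical stand-ins for the four drawing commands in the first "Hello World" example, and ordering by X alone is an assumption made purely for illustration. It shows that no single chunk contains "World", while the chunks joined in left-to-right order do.

```csharp
using System;
using System.Linq;

public static class ChunkOrderDemo
{
    // Hypothetical stand-ins for the four text-drawing commands shown above:
    // the text of each chunk plus the position it is drawn at.
    public static readonly (string Text, float X, float Y)[] Chunks =
    {
        ("H", 10f, 10f),
        ("ell", 20f, 10f),
        ("rld", 90f, 10f),
        ("o Wo", 50f, 20f),
    };

    // True if any single chunk contains the search text.
    public static bool FoundInSingleChunk(string search)
    {
        return Chunks.Any(c => c.Text.Contains(search));
    }

    // Joins the chunks in left-to-right order (X only, an assumption for illustration).
    public static string JoinByX()
    {
        return string.Concat(Chunks.OrderBy(c => c.X).Select(c => c.Text));
    }
}
```

ChunkOrderDemo.FoundInSingleChunk("World") returns false, while ChunkOrderDemo.JoinByX() returns "Hello World". A per-chunk search therefore misses text that the reader can plainly see on the page.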
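The code below is a very simple implementation. For it I subclass LocationTextExtractionStrategy, which already implements ITextExtractionStrategy. On each call to RenderText() I find the rectangle of the current chunk (using the start of its descent line and the end of its ascent line) and store it for later. I use this simple helper class to store these chunks and rectangles: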
//Helper class that stores our rectangle and text
public class RectAndText {
public iTextSharp.text.Rectangle Rect;
public String Text;
public RectAndText(iTextSharp.text.Rectangle rect, String text) {
this.Rect = rect;
this.Text = text;
}
}
Here is the subclass:
public class MyLocationTextExtractionStrategy : LocationTextExtractionStrategy {
//Hold each coordinate
public List<RectAndText> myPoints = new List<RectAndText>();
//Automatically called for each chunk of text in the PDF
public override void RenderText(TextRenderInfo renderInfo) {
base.RenderText(renderInfo);
//Get the bounding box for the chunk of text
var bottomLeft = renderInfo.GetDescentLine().GetStartPoint();
var topRight = renderInfo.GetAscentLine().GetEndPoint();
//Create a rectangle from it
var rect = new iTextSharp.text.Rectangle(
bottomLeft[Vector.I1],
bottomLeft[Vector.I2],
topRight[Vector.I1],
topRight[Vector.I2]
);
//Add this to our main collection
this.myPoints.Add(new RectAndText(rect, renderInfo.GetText()));
}
}
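I cannot stress enough that the above does not take "words" into account; that will be up to you. The TextRenderInfo object passed into RenderText has a method called GetCharacterRenderInfos() that you might be able to use to get more information. You might also want to use GetBaseline() instead of GetDescentLine() if you don't care about descenders in the font.
EDIT
(I had a great lunch so I'm feeling a little more helpful.)
Here is an updated version of MyLocationTextExtractionStrategy that does what my comments below describe: it takes a string to search for and searches each chunk for that string. For all the reasons listed above, this will not work in some/many/most/all cases. If the substring occurs more than once in a single chunk, it will also return only the first instance. Ligatures and diacritics could also mess with this.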
public class MyLocationTextExtractionStrategy : LocationTextExtractionStrategy {
//Hold each coordinate
public List<RectAndText> myPoints = new List<RectAndText>();
//The string that we're searching for
public String TextToSearchFor { get; set; }
//How to compare strings
public System.Globalization.CompareOptions CompareOptions { get; set; }
public MyLocationTextExtractionStrategy(String textToSearchFor, System.Globalization.CompareOptions compareOptions = System.Globalization.CompareOptions.None) {
this.TextToSearchFor = textToSearchFor;
this.CompareOptions = compareOptions;
}
//Automatically called for each chunk of text in the PDF
public override void RenderText(TextRenderInfo renderInfo) {
base.RenderText(renderInfo);
//See if the current chunk contains the text
var startPosition = System.Globalization.CultureInfo.CurrentCulture.CompareInfo.IndexOf(renderInfo.GetText(), this.TextToSearchFor, this.CompareOptions);
//If not found bail
if (startPosition < 0) {
return;
}
//Grab the individual characters
var chars = renderInfo.GetCharacterRenderInfos().Skip(startPosition).Take(this.TextToSearchFor.Length).ToList();
//Grab the first and last character
var firstChar = chars.First();
var lastChar = chars.Last();
//Get the bounding box for the chunk of text
var bottomLeft = firstChar.GetDescentLine().GetStartPoint();
var topRight = lastChar.GetAscentLine().GetEndPoint();
//Create a rectangle from it
var rect = new iTextSharp.text.Rectangle(
bottomLeft[Vector.I1],
bottomLeft[Vector.I2],
topRight[Vector.I1],
topRight[Vector.I2]
);
//Add this to our main collection
this.myPoints.Add(new RectAndText(rect, this.TextToSearchFor));
}
}
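The updated strategy above reports only the first match inside a chunk. One way to lift that limit, sketched here under stated assumptions, is to collect every start index first and then build one rectangle per index. AllIndexesOf is a hypothetical helper, not part of iTextSharp, and it assumes that matches do not overlap and that a match is as long as the search string, which culture-sensitive options can violate.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class OccurrenceFinder
{
    // Hypothetical helper: returns the start index of every non-overlapping
    // occurrence of `search` inside `chunkText`.
    public static List<int> AllIndexesOf(
        string chunkText,
        string search,
        CompareOptions options = CompareOptions.None)
    {
        var result = new List<int>();
        if (string.IsNullOrEmpty(chunkText) || string.IsNullOrEmpty(search))
        {
            return result;
        }

        CompareInfo compareInfo = CultureInfo.CurrentCulture.CompareInfo;
        int start = 0;
        while (start < chunkText.Length)
        {
            int index = compareInfo.IndexOf(chunkText, search, start, options);
            if (index < 0)
            {
                break;
            }
            result.Add(index);
            // Assumption: the matched text is as long as the search string.
            start = index + search.Length;
        }
        return result;
    }
}
```

Each returned index could then take the place of startPosition in the Skip(...).Take(...) call, adding one RectAndText per occurrence instead of returning after the first.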
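This is an old question, but I am leaving my answer here because I could not find a correct answer on the web.
As Chris Haas explained, dealing with words is not easy because iText deals with chunks. The code Chris posted failed in most of my tests because a word is often split across different chunks (he warns about that in his post).
To solve the problem, I used the following strategy:
Split chunks into characters (actually, one TextRenderInfo object per character)
Group the characters by line. This is not straightforward, because you have to deal with chunk alignment.
Search each line for the word you need to find
I leave the code here. I tested it with several documents and it works pretty well, but it could fail in some scenarios because this chunk -> words transformation is a bit tricky.
Hope it helps someone.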
class LocationTextExtractionStrategyEx : LocationTextExtractionStrategy
{
private List<LocationTextExtractionStrategyEx.ExtendedTextChunk> m_DocChunks = new List<ExtendedTextChunk>();
private List<LocationTextExtractionStrategyEx.LineInfo> m_LinesTextInfo = new List<LineInfo>();
public List<SearchResult> m_SearchResultsList = new List<SearchResult>();
private String m_SearchText;
public const float PDF_PX_TO_MM = 0.3528f;
public float m_PageSizeY;
public LocationTextExtractionStrategyEx(String sSearchText, float fPageSizeY)
: base()
{
this.m_SearchText = sSearchText;
this.m_PageSizeY = fPageSizeY;
}
private void searchText()
{
foreach (LineInfo aLineInfo in m_LinesTextInfo)
{
int iIndex = aLineInfo.m_Text.IndexOf(m_SearchText);
if (iIndex != -1)
{
TextRenderInfo aFirstLetter = aLineInfo.m_LineCharsList.ElementAt(iIndex);
SearchResult aSearchResult = new SearchResult(aFirstLetter, m_PageSizeY);
this.m_SearchResultsList.Add(aSearchResult);
}
}
}
private void groupChunksbyLine()
{
LocationTextExtractionStrategyEx.ExtendedTextChunk textChunk1 = null;
LocationTextExtractionStrategyEx.LineInfo textInfo = null;
foreach (LocationTextExtractionStrategyEx.ExtendedTextChunk textChunk2 in this.m_DocChunks)
{
if (textChunk1 == null)
{
textInfo = new LocationTextExtractionStrategyEx.LineInfo(textChunk2);
this.m_LinesTextInfo.Add(textInfo);
}
else if (textChunk2.sameLine(textChunk1))
{
textInfo.appendText(textChunk2);
}
else
{
textInfo = new LocationTextExtractionStrategyEx.LineInfo(textChunk2);
this.m_LinesTextInfo.Add(textInfo);
}
textChunk1 = textChunk2;
}
}
public override string GetResultantText()
{
groupChunksbyLine();
searchText();
//In this case the return value is not useful
return "";
}
public override void RenderText(TextRenderInfo renderInfo)
{
LineSegment baseline = renderInfo.GetBaseline();
//Create ExtendedChunk
ExtendedTextChunk aExtendedChunk = new ExtendedTextChunk(renderInfo.GetText(), baseline.GetStartPoint(), baseline.GetEndPoint(), renderInfo.GetSingleSpaceWidth(), renderInfo.GetCharacterRenderInfos().ToList());
this.m_DocChunks.Add(aExtendedChunk);
}
public class ExtendedTextChunk
{
public string m_text;
private Vector m_startLocation;
private Vector m_endLocation;
private Vector m_orientationVector;
private int m_orientationMagnitude;
private int m_distPerpendicular;
private float m_charSpaceWidth;
public List<TextRenderInfo> m_ChunkChars;
public ExtendedTextChunk(string txt, Vector startLoc, Vector endLoc, float charSpaceWidth,List<TextRenderInfo> chunkChars)
{
this.m_text = txt;
this.m_startLocation = startLoc;
this.m_endLocation = endLoc;
this.m_charSpaceWidth = charSpaceWidth;
this.m_orientationVector = this.m_endLocation.Subtract(this.m_startLocation).Normalize();
this.m_orientationMagnitude = (int)(Math.Atan2((double)this.m_orientationVector[1], (double)this.m_orientationVector[0]) * 1000.0);
this.m_distPerpendicular = (int)this.m_startLocation.Subtract(new Vector(0.0f, 0.0f, 1f)).Cross(this.m_orientationVector)[2];
this.m_ChunkChars = chunkChars;
}
public bool sameLine(LocationTextExtractionStrategyEx.ExtendedTextChunk textChunkToCompare)
{
return this.m_orientationMagnitude == textChunkToCompare.m_orientationMagnitude && this.m_distPerpendicular == textChunkToCompare.m_distPerpendicular;
}
}
public class SearchResult
{
public int iPosX;
public int iPosY;
public SearchResult(TextRenderInfo aCharcter, float fPageSizeY)
{
//Get position of upperLeft coordinate
Vector vTopLeft = aCharcter.GetAscentLine().GetStartPoint();
//PosX
float fPosX = vTopLeft[Vector.I1];
//PosY
float fPosY = vTopLeft[Vector.I2];
//Transform to mm and get y from top of page
iPosX = Convert.ToInt32(fPosX * PDF_PX_TO_MM);
iPosY = Convert.ToInt32((fPageSizeY - fPosY) * PDF_PX_TO_MM);
}
}
public class LineInfo
{
public string m_Text;
public List<TextRenderInfo> m_LineCharsList;
public LineInfo(LocationTextExtractionStrategyEx.ExtendedTextChunk initialTextChunk)
{
this.m_Text = initialTextChunk.m_text;
this.m_LineCharsList = initialTextChunk.m_ChunkChars;
}
public void appendText(LocationTextExtractionStrategyEx.ExtendedTextChunk additionalTextChunk)
{
m_LineCharsList.AddRange(additionalTextChunk.m_ChunkChars);
this.m_Text += additionalTextChunk.m_text;
}
}
}
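The sameLine check above compares truncated integer values for exact equality, so characters whose baselines differ by a fraction of a point may end up on separate lines. The following reduced model of the same three steps (characters, lines, search) groups by a tolerance instead. It is only a sketch: CharBox is a hypothetical stand-in for a per-character TextRenderInfo, the 0.5 point tolerance is an arbitrary assumption, and rotated text is not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LineSearchSketch
{
    // Hypothetical stand-in for one character: the glyph and the start
    // point of its baseline, in PDF points.
    public struct CharBox
    {
        public char Glyph;
        public float X;
        public float Y;

        public CharBox(char glyph, float x, float y)
        {
            Glyph = glyph;
            X = x;
            Y = y;
        }
    }

    // Groups characters into lines by baseline Y (within `tolerance` of the
    // line's first character), orders each line by X, and returns the first
    // character of the first match on every line.
    public static List<CharBox> Find(IEnumerable<CharBox> chars, string search, float tolerance = 0.5f)
    {
        var lines = new List<List<CharBox>>();
        foreach (CharBox c in chars.OrderByDescending(ch => ch.Y))
        {
            List<CharBox> last = lines.LastOrDefault();
            if (last != null && Math.Abs(last[0].Y - c.Y) <= tolerance)
            {
                last.Add(c);
            }
            else
            {
                lines.Add(new List<CharBox> { c });
            }
        }

        var hits = new List<CharBox>();
        foreach (List<CharBox> line in lines)
        {
            List<CharBox> ordered = line.OrderBy(ch => ch.X).ToList();
            string text = new string(ordered.Select(ch => ch.Glyph).ToArray());
            int index = text.IndexOf(search, StringComparison.Ordinal);
            if (index >= 0)
            {
                hits.Add(ordered[index]);
            }
        }
        return hits;
    }
}
```

Because the characters are sorted before grouping, the order in which the drawing commands arrive no longer matters, which is the same reason the class above defers its work until GetResultantText().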
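For the updated MyLocationTextExtractionStrategy above, the constructor takes the text to search for as its one required parameter:
var t = new MyLocationTextExtractionStrategy("sample");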
using System.Collections.Generic;
using iTextSharp.text.pdf.parser;
namespace Logic
{
public class LocationTextExtractionStrategyWithPosition : LocationTextExtractionStrategy
{
private readonly List<TextChunk> locationalResult = new List<TextChunk>();
private readonly ITextChunkLocationStrategy tclStrat;
public LocationTextExtractionStrategyWithPosition() : this(new TextChunkLocationStrategyDefaultImp()) {
}
/**
* Creates a new text extraction renderer, with a custom strategy for
* creating new TextChunkLocation objects based on the input of the
* TextRenderInfo.
* @param strat the custom strategy
*/
public LocationTextExtractionStrategyWithPosition(ITextChunkLocationStrategy strat)
{
tclStrat = strat;
}
private bool StartsWithSpace(string str)
{
if (str.Length == 0) return false;
return str[0] == ' ';
}
private bool EndsWithSpace(string str)
{
if (str.Length == 0) return false;
return str[str.Length - 1] == ' ';
}
/**
* Filters the provided list with the provided filter
* @param textChunks a list of all TextChunks that this strategy found during processing
* @param filter the filter to apply. If null, filtering will be skipped.
* @return the filtered list
* @since 5.3.3
*/
private List<TextChunk> filterTextChunks(List<TextChunk> textChunks, ITextChunkFilter filter)
{
if (filter == null)
{
return textChunks;
}
var filtered = new List<TextChunk>();
foreach (var textChunk in textChunks)
{
if (filter.Accept(textChunk))
{
filtered.Add(textChunk);
}
}
return filtered;
}
public override void RenderText(TextRenderInfo renderInfo)
{
LineSegment segment = renderInfo.GetBaseline();
if (renderInfo.GetRise() != 0)
{ // remove the rise from the baseline - we do this because the text from a super/subscript render operations should probably be considered as part of the baseline of the text the super/sub is relative to
Matrix riseOffsetTransform = new Matrix(0, -renderInfo.GetRise());
segment = segment.TransformBy(riseOffsetTransform);
}
TextChunk tc = new TextChunk(renderInfo.GetText(), tclStrat.CreateLocation(renderInfo, segment));
locationalResult.Add(tc);
}
public IList<TextLocation> GetLocations()
{
var filteredTextChunks = filterTextChunks(locationalResult, null);
filteredTextChunks.Sort();
TextChunk lastChunk = null;
var textLocations = new List<TextLocation>();
foreach (var chunk in filteredTextChunks)
{
if (lastChunk == null)
{
//initial
textLocations.Add(new TextLocation
{
Text = chunk.Text,
X = iTextSharp.text.Utilities.PointsToMillimeters(chunk.Location.StartLocation[0]),
Y = iTextSharp.text.Utilities.PointsToMillimeters(chunk.Location.StartLocation[1])
});
}
else
{
if (chunk.SameLine(lastChunk))
{
var text = "";
// we only insert a blank space if the trailing character of the previous string wasn't a space, and the leading character of the current string isn't a space
if (IsChunkAtWordBoundary(chunk, lastChunk) && !StartsWithSpace(chunk.Text) && !EndsWithSpace(lastChunk.Text))
text += ' ';
text += chunk.Text;
textLocations[textLocations.Count - 1].Text += text;
}
else
{
textLocations.Add(new TextLocation
{
Text = chunk.Text,
X = iTextSharp.text.Utilities.PointsToMillimeters(chunk.Location.StartLocation[0]),
Y = iTextSharp.text.Utilities.PointsToMillimeters(chunk.Location.StartLocation[1])
});
}
}
lastChunk = chunk;
}
//now find the location(s) with the given texts
return textLocations;
}
}
public class TextLocation
{
public float X { get; set; }
public float Y { get; set; }
public string Text { get; set; }
}
}
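//Declared outside the using block so the result is still in scope afterwards
IList<TextLocation> res;
using (var reader = new PdfReader(inputPdf))
{
var parser = new PdfReaderContentParser(reader);
var strategy = parser.ProcessContent(pageNumber, new LocationTextExtractionStrategyWithPosition());
res = strategy.GetLocations();
//No explicit reader.Close() is needed; the using block disposes the reader
}
//Keep only the lines containing the search text, ordered from the top of the page down
var searchResult = res.Where(p => p.Text.Contains(searchText)).OrderBy(p => p.Y).Reverse().ToList();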
inputPdf is a byte[] containing the PDF data
pageNumber is the page you want to search
searchText is the string you are looking for
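The answers above report positions in millimetres, one through the constant 0.3528f and one through Utilities.PointsToMillimeters. Both come from the same arithmetic: a PDF point is 1/72 inch and an inch is 25.4 mm, so one point is 25.4 / 72, roughly 0.3528 mm. The sketch below spells that out, together with the Y flip needed when a position should be measured from the top of the page instead of the bottom. PdfUnits and its members are hypothetical names, and the flip assumes an unrotated page whose lower-left corner is at the origin.

```csharp
using System;

public static class PdfUnits
{
    // 1 point = 1/72 inch and 1 inch = 25.4 mm.
    public const float MillimetersPerPoint = 25.4f / 72f;

    // Converts a length in PDF points to millimetres.
    public static float PointsToMillimeters(float points)
    {
        return points * MillimetersPerPoint;
    }

    // Converts a Y coordinate measured up from the bottom edge of the page
    // into one measured down from the top edge (both in points).
    public static float FlipY(float yFromBottom, float pageHeight)
    {
        return pageHeight - yFromBottom;
    }
}
```

For example, a point 720 pt above the bottom of a 792 pt tall page lies 72 pt, about 25.4 mm, below the top edge.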
Class TextExtractor
Inherits LocationTextExtractionStrategy
Implements iTextSharp.text.pdf.parser.ITextExtractionStrategy
Public oPoints As IList(Of RectAndText) = New List(Of RectAndText)
Public Overrides Sub RenderText(renderInfo As TextRenderInfo) 'Implements IRenderListener.RenderText
MyBase.RenderText(renderInfo)
Dim bottomLeft As Vector = renderInfo.GetDescentLine().GetStartPoint()
Dim topRight As Vector = renderInfo.GetAscentLine().GetEndPoint() 'GetBaseline
Dim rect As Rectangle = New Rectangle(bottomLeft(Vector.I1), bottomLeft(Vector.I2), topRight(Vector.I1), topRight(Vector.I2))
oPoints.Add(New RectAndText(rect, renderInfo.GetText()))
End Sub
Private Function GetLines() As Dictionary(Of Single, ArrayList)
Dim oLines As New Dictionary(Of Single, ArrayList)
For Each p As RectAndText In oPoints
Dim iBottom = p.Rect.Bottom
If oLines.ContainsKey(iBottom) = False Then
oLines(iBottom) = New ArrayList()
End If
oLines(iBottom).Add(p)
Next
Return oLines
End Function
Public Function Find(ByVal sFind As String) As iTextSharp.text.Rectangle
Dim oLines As Dictionary(Of Single, ArrayList) = GetLines()
For Each oEntry As KeyValuePair(Of Single, ArrayList) In oLines
'Dim iBottom As Integer = oEntry.Key
Dim oRectAndTexts As ArrayList = oEntry.Value
Dim sLine As String = ""
For Each p As RectAndText In oRectAndTexts
sLine += p.Text
If sLine.IndexOf(sFind) <> -1 Then
Return p.Rect
End If
Next
Next
Return Nothing
End Function
End Class
Public Class RectAndText
Public Rect As iTextSharp.text.Rectangle
Public Text As String
Public Sub New(ByVal rect As iTextSharp.text.Rectangle, ByVal text As String)
Me.Rect = rect
Me.Text = text
End Sub
End Class
Sub EncryptPdf(ByVal sInFilePath As String, ByVal sOutFilePath As String)
Dim oPdfReader As iTextSharp.text.pdf.PdfReader = New iTextSharp.text.pdf.PdfReader(sInFilePath)
Dim oPdfDoc As New iTextSharp.text.Document()
Dim oPdfWriter As PdfWriter = PdfWriter.GetInstance(oPdfDoc, New FileStream(sOutFilePath, FileMode.Create))
'oPdfWriter.SetEncryption(PdfWriter.STRENGTH40BITS, sPassword, sPassword, PdfWriter.AllowCopy)
oPdfDoc.Open()
oPdfDoc.SetPageSize(iTextSharp.text.PageSize.LEDGER.Rotate())
Dim oDirectContent As iTextSharp.text.pdf.PdfContentByte = oPdfWriter.DirectContent
Dim iNumberOfPages As Integer = oPdfReader.NumberOfPages
Dim iPage As Integer = 0
Dim iBottomMargin As Integer = txtBottomMargin.Text '10
Dim iLeftMargin As Integer = txtLeftMargin.Text '500
Dim iWidth As Integer = txtWidth.Text '120
Dim iHeight As Integer = txtHeight.Text '780
Dim oStrategy As New parser.SimpleTextExtractionStrategy()
Do While (iPage < iNumberOfPages)
iPage += 1
oPdfDoc.SetPageSize(oPdfReader.GetPageSizeWithRotation(iPage))
oPdfDoc.NewPage()
Dim oPdfImportedPage As iTextSharp.text.pdf.PdfImportedPage =
oPdfWriter.GetImportedPage(oPdfReader, iPage)
Dim iRotation As Integer = oPdfReader.GetPageRotation(iPage)
If (iRotation = 90) Or (iRotation = 270) Then
oDirectContent.AddTemplate(oPdfImportedPage, 0, -1.0F, 1.0F,
0, 0, oPdfReader.GetPageSizeWithRotation(iPage).Height)
Else
oDirectContent.AddTemplate(oPdfImportedPage, 1.0F, 0, 0, 1.0F, 0, 0)
End If
'Dim sPageText As String = parser.PdfTextExtractor.GetTextFromPage(oPdfReader, iPage, oStrategy)
'sPageText = System.Text.Encoding.UTF8.GetString(System.Text.ASCIIEncoding.Convert(System.Text.Encoding.Default, System.Text.Encoding.UTF8, System.Text.Encoding.Default.GetBytes(sPageText)))
'If txtFind.Text = "" OrElse sPageText.IndexOf(txtFind.Text) <> -1 Then
Dim oTextExtractor As New TextExtractor()
PdfTextExtractor.GetTextFromPage(oPdfReader, iPage, oTextExtractor) 'Initialize oTextExtractor
Dim oRect As iTextSharp.text.Rectangle = oTextExtractor.Find(txtFind.Text)
If oRect IsNot Nothing Then
Dim iX As Integer = oRect.Left + oRect.Width + iLeftMargin 'Move right
Dim iY As Integer = oRect.Bottom - iBottomMargin 'Move down
Dim field As PdfFormField = PdfFormField.CreateSignature(oPdfWriter)
field.SetWidget(New Rectangle(iX, iY, iX + iWidth, iY + iHeight), PdfAnnotation.HIGHLIGHT_OUTLINE)
field.FieldName = "myEmptySignatureField" & iPage
oPdfWriter.AddAnnotation(field)
End If
Loop
oPdfDoc.Close()
End Sub