C#: Getting the coordinates of a string using ITextExtractionStrategy and LocationTextExtractionStrategy in iTextSharp


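I have a PDF file that I am reading into a string using ITextExtractionStrategy. From that string I take a substring such as "My name is XYZ", and I need the rectangular coordinates of that substring in the PDF file, but I cannot work out how to get them. From googling I learned about LocationTextExtractionStrategy, but I do not understand how to use it to get the coordinates.

Here is the code: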

ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
string currentText = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);
currentText = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(currentText)));
text.Append(currentText);

string getcoordinate="My name is XYZ";
How can I get the rectangular coordinates of this substring using iTextSharp?

Please help.

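Here is a very simple version of an implementation.

Before implementing it, it is important to know that PDFs have no concept of "words", "paragraphs", "sentences", and so on. Also, text within a PDF is not necessarily laid out left to right and top to bottom, and this has nothing to do with non-LTR languages. The phrase "Hello World" could be written into a PDF as: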

Draw H at (10, 10)
Draw ell at (20, 10)
Draw rld at (90, 10)
Draw o Wo at (50, 20)
It could also be written as:

Draw Hello World at (10,10)
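The ITextExtractionStrategy interface that you need to implement has a method called RenderText, which is called once for every chunk of text in the PDF. Notice that I said "chunk" and not "word". In the first example above, the method would be called four times for those two words. In the second example, it would be called once for those two words. This is the most important part to understand: PDFs don't have words, and because of this, iTextSharp doesn't have words either. The "word" part is entirely up to you to solve.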
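To make the chunk idea concrete, here is a minimal sketch that models the four drawing commands from the first example as plain objects, without using iTextSharp at all. The names ChunkOrderDemo, Chunk, InDrawOrder and LeftToRight are hypothetical and exist only for this illustration; sorting by x alone is a simplification that happens to be enough for this one-line example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkOrderDemo
{
    // Hypothetical stand-in for one text-drawing command: a string and where it is drawn.
    public sealed class Chunk
    {
        public readonly string Text;
        public readonly float X;
        public readonly float Y;
        public Chunk(string text, float x, float y) { Text = text; X = x; Y = y; }
    }

    // The four drawing commands from the first example, in the order they appear in the file.
    public static List<Chunk> HelloWorldChunks()
    {
        return new List<Chunk>
        {
            new Chunk("H", 10, 10),
            new Chunk("ell", 20, 10),
            new Chunk("rld", 90, 10),
            new Chunk("o Wo", 50, 20),
        };
    }

    // What you get by appending chunks in the order a render callback would receive them.
    public static string InDrawOrder(IEnumerable<Chunk> chunks)
    {
        return string.Concat(chunks.Select(c => c.Text));
    }

    // Simplification: order by x only and ignore y.
    public static string LeftToRight(IEnumerable<Chunk> chunks)
    {
        return string.Concat(chunks.OrderBy(c => c.X).Select(c => c.Text));
    }
}
```

Appending the chunks in draw order gives "Hellrldo Wo", while ordering them by position first gives "Hello World". Any real implementation has to do some form of this reordering itself, and also has to decide where one word ends and the next begins.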

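Along the same lines, as I said above, PDFs don't have paragraphs. The reason to be aware of this is that PDFs cannot wrap text onto a new line. Any time you see something that looks like a paragraph return, you are actually seeing a brand new text-drawing command whose coordinates differ from those of the previous line.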

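The code below is a very simple implementation. For it, I subclass LocationTextExtractionStrategy, which already implements ITextExtractionStrategy. On each call to RenderText() I find the rectangle of the current chunk and store it for later. I use this simple helper class to store the chunks and rectangles: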

//Helper class that stores our rectangle and text
public class RectAndText {
    public iTextSharp.text.Rectangle Rect;
    public String Text;
    public RectAndText(iTextSharp.text.Rectangle rect, String text) {
        this.Rect = rect;
        this.Text = text;
    }
}
And here is the subclass:

using System.Collections.Generic;
using iTextSharp.text.pdf.parser;

public class MyLocationTextExtractionStrategy : LocationTextExtractionStrategy {
    //Hold each coordinate
    public List<RectAndText> myPoints = new List<RectAndText>();

    //Automatically called for each chunk of text in the PDF
    public override void RenderText(TextRenderInfo renderInfo) {
        base.RenderText(renderInfo);

        //Get the bounding box for the chunk of text
        var bottomLeft = renderInfo.GetDescentLine().GetStartPoint();
        var topRight = renderInfo.GetAscentLine().GetEndPoint();

        //Create a rectangle from it
        var rect = new iTextSharp.text.Rectangle(
                                                bottomLeft[Vector.I1],
                                                bottomLeft[Vector.I2],
                                                topRight[Vector.I1],
                                                topRight[Vector.I2]
                                                );

        //Add this to our main collection
        this.myPoints.Add(new RectAndText(rect, renderInfo.GetText()));
    }
}
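I cannot stress enough that the above does not take "words" into account; that is up to you. The TextRenderInfo object that gets passed into RenderText has a method called GetCharacterRenderInfos() that you might be able to use to get more information. You might also want to use GetBaseline() instead of GetDescentLine() if you don't care about descenders in the font.

EDIT

(I had a really great lunch, so I'm feeling a little more helpful.)

Here is an updated version of MyLocationTextExtractionStrategy that does what my comments below describe: it takes a string to search for and searches each chunk for that string. For all the reasons listed above, this will not work in some/many/most/all cases. If the substring occurs multiple times in a single chunk, it will also only return the first instance. Ligatures and diacritics could also mess with this.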

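using System;
using System.Collections.Generic;
using System.Linq;
using iTextSharp.text.pdf.parser;

public class MyLocationTextExtractionStrategy : LocationTextExtractionStrategy {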
    //Hold each coordinate
    public List<RectAndText> myPoints = new List<RectAndText>();

    //The string that we're searching for
    public String TextToSearchFor { get; set; }

    //How to compare strings
    public System.Globalization.CompareOptions CompareOptions { get; set; }

    public MyLocationTextExtractionStrategy(String textToSearchFor, System.Globalization.CompareOptions compareOptions = System.Globalization.CompareOptions.None) {
        this.TextToSearchFor = textToSearchFor;
        this.CompareOptions = compareOptions;
    }

    //Automatically called for each chunk of text in the PDF
    public override void RenderText(TextRenderInfo renderInfo) {
        base.RenderText(renderInfo);

        //See if the current chunk contains the text
        var startPosition = System.Globalization.CultureInfo.CurrentCulture.CompareInfo.IndexOf(renderInfo.GetText(), this.TextToSearchFor, this.CompareOptions);

        //If not found bail
        if (startPosition < 0) {
            return;
        }

        //Grab the individual characters
        var chars = renderInfo.GetCharacterRenderInfos().Skip(startPosition).Take(this.TextToSearchFor.Length).ToList();

        //Grab the first and last character
        var firstChar = chars.First();
        var lastChar = chars.Last();


        //Get the bounding box for the chunk of text
        var bottomLeft = firstChar.GetDescentLine().GetStartPoint();
        var topRight = lastChar.GetAscentLine().GetEndPoint();

        //Create a rectangle from it
        var rect = new iTextSharp.text.Rectangle(
                                                bottomLeft[Vector.I1],
                                                bottomLeft[Vector.I2],
                                                topRight[Vector.I1],
                                                topRight[Vector.I2]
                                                );

        //Add this to our main collection
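        this.myPoints.Add(new RectAndText(rect, this.TextToSearchFor));
    }
}

The constructor now takes one required parameter, the text to search for:

var t = new MyLocationTextExtractionStrategy("sample");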
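The per-chunk limitation mentioned above can be shown without iTextSharp. The following is a minimal sketch, assuming each chunk is modelled as a plain string; ChunkSearchDemo and its two methods are hypothetical names for this illustration and are not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkSearchDemo
{
    // Mirrors the per-chunk test: every chunk is searched in isolation,
    // and a chunk counts at most once no matter how often the needle occurs in it.
    public static int CountChunksContaining(IEnumerable<string> chunks, string needle)
    {
        return chunks.Count(c => c.IndexOf(needle, StringComparison.Ordinal) >= 0);
    }

    // Searches the text formed by joining the chunks in the given order.
    public static bool JoinedTextContains(IEnumerable<string> chunks, string needle)
    {
        return string.Concat(chunks).IndexOf(needle, StringComparison.Ordinal) >= 0;
    }
}
```

With the chunks "My na", "me is " and "XYZ", CountChunksContaining finds nothing for "My name is XYZ" even though JoinedTextContains reports a match. With the single chunk "XYZ XYZ", searching for "XYZ" counts one chunk, which corresponds to recording only the first occurrence.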

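This is an old question, but I am leaving my answer here because I could not find a correct one on the web.

As Chris Haas explained, dealing with words is not easy because iText works with chunks. The code Chris posted failed in most of my tests because a word is usually split across several chunks (he warns about this in his post).

To solve the problem, I used this strategy:

  • Split the chunks into characters (in fact, one TextRenderInfo object per character)
  • Group the characters by line. This is not straightforward, because you have to deal with chunk alignment
  • Search each line for the word you need to find

Here is the code. I tested it with several documents and it works fairly well, but it may fail in some cases because this chunk -> words conversion is a bit tricky.

Hope this helps someone.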

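    using System;
    using System.Collections.Generic;
    using System.Linq;
    using iTextSharp.text.pdf.parser;

    class LocationTextExtractionStrategyEx : LocationTextExtractionStrategy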
    {
        private List<LocationTextExtractionStrategyEx.ExtendedTextChunk> m_DocChunks = new List<ExtendedTextChunk>();
        private List<LocationTextExtractionStrategyEx.LineInfo> m_LinesTextInfo = new List<LineInfo>();
        public List<SearchResult> m_SearchResultsList = new List<SearchResult>();
        private String m_SearchText;
        public const float PDF_PX_TO_MM = 0.3528f;
        public float m_PageSizeY;
    
    
        public LocationTextExtractionStrategyEx(String sSearchText, float fPageSizeY)
            : base()
        {
            this.m_SearchText = sSearchText;
            this.m_PageSizeY = fPageSizeY;
        }
    
        private void searchText()
        {
            foreach (LineInfo aLineInfo in m_LinesTextInfo)
            {
                int iIndex = aLineInfo.m_Text.IndexOf(m_SearchText);
                if (iIndex != -1)
                {
                    TextRenderInfo aFirstLetter = aLineInfo.m_LineCharsList.ElementAt(iIndex);
                    SearchResult aSearchResult = new SearchResult(aFirstLetter, m_PageSizeY);
                    this.m_SearchResultsList.Add(aSearchResult);
                }
            }
        }
    
        private void groupChunksbyLine()
        {                     
            LocationTextExtractionStrategyEx.ExtendedTextChunk textChunk1 = null;
            LocationTextExtractionStrategyEx.LineInfo textInfo = null;
            foreach (LocationTextExtractionStrategyEx.ExtendedTextChunk textChunk2 in this.m_DocChunks)
            {
                if (textChunk1 == null)
                {                    
                    textInfo = new LocationTextExtractionStrategyEx.LineInfo(textChunk2);
                    this.m_LinesTextInfo.Add(textInfo);
                }
                else if (textChunk2.sameLine(textChunk1))
                {                      
                    textInfo.appendText(textChunk2);
                }
                else
                {                                        
                    textInfo = new LocationTextExtractionStrategyEx.LineInfo(textChunk2);
                    this.m_LinesTextInfo.Add(textInfo);
                }
                textChunk1 = textChunk2;
            }
        }
    
        public override string GetResultantText()
        {
            groupChunksbyLine();
            searchText();
            //In this case the return value is not useful
            return "";
        }
    
        public override void RenderText(TextRenderInfo renderInfo)
        {
            LineSegment baseline = renderInfo.GetBaseline();
            //Create ExtendedChunk
            ExtendedTextChunk aExtendedChunk = new ExtendedTextChunk(renderInfo.GetText(), baseline.GetStartPoint(), baseline.GetEndPoint(), renderInfo.GetSingleSpaceWidth(), renderInfo.GetCharacterRenderInfos().ToList());
            this.m_DocChunks.Add(aExtendedChunk);
        }
    
        public class ExtendedTextChunk
        {
            public string m_text;
            private Vector m_startLocation;
            private Vector m_endLocation;
            private Vector m_orientationVector;
            private int m_orientationMagnitude;
            private int m_distPerpendicular;           
            private float m_charSpaceWidth;           
            public List<TextRenderInfo> m_ChunkChars;
    
    
            public ExtendedTextChunk(string txt, Vector startLoc, Vector endLoc, float charSpaceWidth,List<TextRenderInfo> chunkChars)
            {
                this.m_text = txt;
                this.m_startLocation = startLoc;
                this.m_endLocation = endLoc;
                this.m_charSpaceWidth = charSpaceWidth;                
                this.m_orientationVector = this.m_endLocation.Subtract(this.m_startLocation).Normalize();
                this.m_orientationMagnitude = (int)(Math.Atan2((double)this.m_orientationVector[1], (double)this.m_orientationVector[0]) * 1000.0);
                this.m_distPerpendicular = (int)this.m_startLocation.Subtract(new Vector(0.0f, 0.0f, 1f)).Cross(this.m_orientationVector)[2];                
                this.m_ChunkChars = chunkChars;
    
            }
    
    
            public bool sameLine(LocationTextExtractionStrategyEx.ExtendedTextChunk textChunkToCompare)
            {
                return this.m_orientationMagnitude == textChunkToCompare.m_orientationMagnitude && this.m_distPerpendicular == textChunkToCompare.m_distPerpendicular;
            }
    
    
        }
    
        public class SearchResult
        {
            public int iPosX;
            public int iPosY;
    
            public SearchResult(TextRenderInfo aCharcter, float fPageSizeY)
            {
                //Get position of upperLeft coordinate
                Vector vTopLeft = aCharcter.GetAscentLine().GetStartPoint();
                //PosX
                float fPosX = vTopLeft[Vector.I1]; 
                //PosY
                float fPosY = vTopLeft[Vector.I2];
                //Transform to mm and get y from top of page
                iPosX = Convert.ToInt32(fPosX * PDF_PX_TO_MM);
                iPosY = Convert.ToInt32((fPageSizeY - fPosY) * PDF_PX_TO_MM);
            }
        }
    
        public class LineInfo
        {            
            public string m_Text;
            public List<TextRenderInfo> m_LineCharsList;
    
            public LineInfo(LocationTextExtractionStrategyEx.ExtendedTextChunk initialTextChunk)
            {                
                this.m_Text = initialTextChunk.m_text;
                this.m_LineCharsList = initialTextChunk.m_ChunkChars;
            }
    
            public void appendText(LocationTextExtractionStrategyEx.ExtendedTextChunk additionalTextChunk)
            {
                m_LineCharsList.AddRange(additionalTextChunk.m_ChunkChars);
                this.m_Text += additionalTextChunk.m_text;
            }
        }
    }
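The "group characters by line, then search each line" idea can be sketched independently of iTextSharp. This is not the sameLine test used in the class above, which compares truncated integer values; the sketch swaps in a simple tolerance on the baseline y coordinate and assumes horizontal text. LineSearchDemo, Glyph, GroupByLine and FindFirst are hypothetical names for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LineSearchDemo
{
    // Hypothetical stand-in for one character and the start of its baseline, in PDF points.
    public sealed class Glyph
    {
        public readonly char Char;
        public readonly float X;
        public readonly float Y;
        public Glyph(char c, float x, float y) { Char = c; X = x; Y = y; }
    }

    // Groups glyphs whose y lies within `tolerance` points of the first glyph of a line,
    // then orders each line left to right. Lines are returned top to bottom
    // (PDF y grows upward, so larger y comes first).
    public static List<List<Glyph>> GroupByLine(IEnumerable<Glyph> glyphs, float tolerance)
    {
        var lines = new List<List<Glyph>>();
        foreach (var glyph in glyphs.OrderByDescending(item => item.Y))
        {
            var current = lines.LastOrDefault();
            if (current != null && Math.Abs(current[0].Y - glyph.Y) <= tolerance)
                current.Add(glyph);
            else
                lines.Add(new List<Glyph> { glyph });
        }
        return lines.Select(line => line.OrderBy(item => item.X).ToList()).ToList();
    }

    // Returns the glyph that starts the first match of `needle`, or null if no line contains it.
    public static Glyph FindFirst(IEnumerable<Glyph> glyphs, string needle, float tolerance)
    {
        foreach (var line in GroupByLine(glyphs, tolerance))
        {
            var text = new string(line.Select(item => item.Char).ToArray());
            int index = text.IndexOf(needle, StringComparison.Ordinal);
            if (index >= 0) return line[index];
        }
        return null;
    }
}
```

Because each glyph stands for exactly one character, the index returned by IndexOf can be used directly to look up the glyph that starts the match, which is the same assumption searchText() relies on. A search string that spans two lines is still not found.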
    
    using System.Collections.Generic;
    using iTextSharp.text.pdf.parser;
    
    namespace Logic
    {
        public class LocationTextExtractionStrategyWithPosition : LocationTextExtractionStrategy
        {
            private readonly List<TextChunk> locationalResult = new List<TextChunk>();
    
            private readonly ITextChunkLocationStrategy tclStrat;
    
            public LocationTextExtractionStrategyWithPosition() : this(new TextChunkLocationStrategyDefaultImp()) {
            }
    
            /**
             * Creates a new text extraction renderer, with a custom strategy for
             * creating new TextChunkLocation objects based on the input of the
             * TextRenderInfo.
             * @param strat the custom strategy
             */
            public LocationTextExtractionStrategyWithPosition(ITextChunkLocationStrategy strat)
            {
                tclStrat = strat;
            }
    
    
            private bool StartsWithSpace(string str)
            {
                if (str.Length == 0) return false;
                return str[0] == ' ';
            }
    
    
            private bool EndsWithSpace(string str)
            {
                if (str.Length == 0) return false;
                return str[str.Length - 1] == ' ';
            }
    
            /**
             * Filters the provided list with the provided filter
             * @param textChunks a list of all TextChunks that this strategy found during processing
             * @param filter the filter to apply.  If null, filtering will be skipped.
             * @return the filtered list
             * @since 5.3.3
             */
    
            private List<TextChunk> filterTextChunks(List<TextChunk> textChunks, ITextChunkFilter filter)
            {
                if (filter == null)
                {
                    return textChunks;
                }
    
                var filtered = new List<TextChunk>();
    
                foreach (var textChunk in textChunks)
                {
                    if (filter.Accept(textChunk))
                    {
                        filtered.Add(textChunk);
                    }
                }
    
                return filtered;
            }
    
            public override void RenderText(TextRenderInfo renderInfo)
            {
                LineSegment segment = renderInfo.GetBaseline();
                if (renderInfo.GetRise() != 0)
                { // remove the rise from the baseline - we do this because the text from a super/subscript render operations should probably be considered as part of the baseline of the text the super/sub is relative to 
                    Matrix riseOffsetTransform = new Matrix(0, -renderInfo.GetRise());
                    segment = segment.TransformBy(riseOffsetTransform);
                }
                TextChunk tc = new TextChunk(renderInfo.GetText(), tclStrat.CreateLocation(renderInfo, segment));
                locationalResult.Add(tc);
            }
    
    
            public IList<TextLocation> GetLocations()
            {
    
                var filteredTextChunks = filterTextChunks(locationalResult, null);
                filteredTextChunks.Sort();
    
                TextChunk lastChunk = null;
    
                 var textLocations = new List<TextLocation>();
    
                foreach (var chunk in filteredTextChunks)
                {
    
                    if (lastChunk == null)
                    {
                        //initial
                        textLocations.Add(new TextLocation
                        {
                            Text = chunk.Text,
                            X = iTextSharp.text.Utilities.PointsToMillimeters(chunk.Location.StartLocation[0]),
                            Y = iTextSharp.text.Utilities.PointsToMillimeters(chunk.Location.StartLocation[1])
                        });
    
                    }
                    else
                    {
                        if (chunk.SameLine(lastChunk))
                        {
                            var text = "";
                            // we only insert a blank space if the trailing character of the previous string wasn't a space, and the leading character of the current string isn't a space
                            if (IsChunkAtWordBoundary(chunk, lastChunk) && !StartsWithSpace(chunk.Text) && !EndsWithSpace(lastChunk.Text))
                                text += ' ';
    
                            text += chunk.Text;
    
                            textLocations[textLocations.Count - 1].Text += text;
    
                        }
                        else
                        {
    
                            textLocations.Add(new TextLocation
                            {
                                Text = chunk.Text,
                                X = iTextSharp.text.Utilities.PointsToMillimeters(chunk.Location.StartLocation[0]),
                                Y = iTextSharp.text.Utilities.PointsToMillimeters(chunk.Location.StartLocation[1])
                            });
                        }
                    }
                    lastChunk = chunk;
                }
    
                //now find the location(s) with the given texts
                return textLocations;
    
            }
    
        }
    
        public class TextLocation
        {
            public float X { get; set; }
            public float Y { get; set; }
    
            public string Text { get; set; }
        }
    }
    
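    // Requires: using System.Linq; using iTextSharp.text.pdf; using iTextSharp.text.pdf.parser;
    // Declared outside the using block so the result is still in scope afterwards
    IList<TextLocation> res;

    using (var reader = new PdfReader(inputPdf))
    {
        var parser = new PdfReaderContentParser(reader);

        var strategy = parser.ProcessContent(pageNumber, new LocationTextExtractionStrategyWithPosition());

        res = strategy.GetLocations();
    }

    // Keep only the lines that contain the search text, ordered from highest Y to lowest
    var searchResult = res.Where(p => p.Text.Contains(searchText)).OrderBy(p => p.Y).Reverse().ToList();
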
    inputPdf is a byte[] containing the PDF data.

    pageNumber is the page you want to search in.

    searchText is the string you are looking for.
    
    Class TextExtractor
        Inherits LocationTextExtractionStrategy
        Implements iTextSharp.text.pdf.parser.ITextExtractionStrategy
        Public oPoints As IList(Of RectAndText) = New List(Of RectAndText)
        Public Overrides Sub RenderText(renderInfo As TextRenderInfo) 'Implements IRenderListener.RenderText
            MyBase.RenderText(renderInfo)
    
            Dim bottomLeft As Vector = renderInfo.GetDescentLine().GetStartPoint()
            Dim topRight As Vector = renderInfo.GetAscentLine().GetEndPoint() 'GetBaseline
    
            Dim rect As Rectangle = New Rectangle(bottomLeft(Vector.I1), bottomLeft(Vector.I2), topRight(Vector.I1), topRight(Vector.I2))
            oPoints.Add(New RectAndText(rect, renderInfo.GetText()))
        End Sub
    
        Private Function GetLines() As Dictionary(Of Single, ArrayList)
            Dim oLines As New Dictionary(Of Single, ArrayList)
            For Each p As RectAndText In oPoints
                Dim iBottom = p.Rect.Bottom
    
                If oLines.ContainsKey(iBottom) = False Then
                    oLines(iBottom) = New ArrayList()
                End If
    
                oLines(iBottom).Add(p)
            Next
    
            Return oLines
        End Function
    
        Public Function Find(ByVal sFind As String) As iTextSharp.text.Rectangle
            Dim oLines As Dictionary(Of Single, ArrayList) = GetLines()
    
            For Each oEntry As KeyValuePair(Of Single, ArrayList) In oLines
                'Dim iBottom As Integer = oEntry.Key
                Dim oRectAndTexts As ArrayList = oEntry.Value
                Dim sLine As String = ""
                For Each p As RectAndText In oRectAndTexts
                    sLine += p.Text
                    If sLine.IndexOf(sFind) <> -1 Then
                        Return p.Rect
                    End If
                Next
            Next
    
            Return Nothing
        End Function
    
    End Class
    
    Public Class RectAndText
        Public Rect As iTextSharp.text.Rectangle
        Public Text As String
        Public Sub New(ByVal rect As iTextSharp.text.Rectangle, ByVal text As String)
            Me.Rect = rect
            Me.Text = text
        End Sub
    End Class
    
    Sub EncryptPdf(ByVal sInFilePath As String, ByVal sOutFilePath As String)
    
            Dim oPdfReader As iTextSharp.text.pdf.PdfReader = New iTextSharp.text.pdf.PdfReader(sInFilePath)
            Dim oPdfDoc As New iTextSharp.text.Document()
            Dim oPdfWriter As PdfWriter = PdfWriter.GetInstance(oPdfDoc, New FileStream(sOutFilePath, FileMode.Create))
            'oPdfWriter.SetEncryption(PdfWriter.STRENGTH40BITS, sPassword, sPassword, PdfWriter.AllowCopy)
            oPdfDoc.Open()
    
            oPdfDoc.SetPageSize(iTextSharp.text.PageSize.LEDGER.Rotate())
    
            Dim oDirectContent As iTextSharp.text.pdf.PdfContentByte = oPdfWriter.DirectContent
            Dim iNumberOfPages As Integer = oPdfReader.NumberOfPages
            Dim iPage As Integer = 0
    
            Dim iBottomMargin As Integer = txtBottomMargin.Text '10
            Dim iLeftMargin As Integer = txtLeftMargin.Text '500
            Dim iWidth As Integer = txtWidth.Text '120
            Dim iHeight As Integer = txtHeight.Text '780
    
            Dim oStrategy As New parser.SimpleTextExtractionStrategy()
    
    
            Do While (iPage < iNumberOfPages)
                iPage += 1
                oPdfDoc.SetPageSize(oPdfReader.GetPageSizeWithRotation(iPage))
                oPdfDoc.NewPage()
    
                Dim oPdfImportedPage As iTextSharp.text.pdf.PdfImportedPage =
                oPdfWriter.GetImportedPage(oPdfReader, iPage)
                Dim iRotation As Integer = oPdfReader.GetPageRotation(iPage)
                If (iRotation = 90) Or (iRotation = 270) Then
                    oDirectContent.AddTemplate(oPdfImportedPage, 0, -1.0F, 1.0F,
                     0, 0, oPdfReader.GetPageSizeWithRotation(iPage).Height)
                Else
                    oDirectContent.AddTemplate(oPdfImportedPage, 1.0F, 0, 0, 1.0F, 0, 0)
                End If
    
                'Dim sPageText As String = parser.PdfTextExtractor.GetTextFromPage(oPdfReader, iPage, oStrategy)
                'sPageText = System.Text.Encoding.UTF8.GetString(System.Text.ASCIIEncoding.Convert(System.Text.Encoding.Default, System.Text.Encoding.UTF8, System.Text.Encoding.Default.GetBytes(sPageText)))
                'If txtFind.Text = "" OrElse sPageText.IndexOf(txtFind.Text) <> -1 Then
    
                Dim oTextExtractor As New TextExtractor()
                PdfTextExtractor.GetTextFromPage(oPdfReader, iPage, oTextExtractor) 'Initialize oTextExtractor
    
                Dim oRect As iTextSharp.text.Rectangle = oTextExtractor.Find(txtFind.Text)
                If oRect IsNot Nothing Then
                    Dim iX As Integer = oRect.Left + oRect.Width + iLeftMargin 'Move right
                    Dim iY As Integer = oRect.Bottom - iBottomMargin 'Move down
    
                    Dim field As PdfFormField = PdfFormField.CreateSignature(oPdfWriter)
                    field.SetWidget(New Rectangle(iX, iY, iX + iWidth, iY + iHeight), PdfAnnotation.HIGHLIGHT_OUTLINE)
                    field.FieldName = "myEmptySignatureField" & iPage
                    oPdfWriter.AddAnnotation(field)
                End If
    
            Loop
    
            oPdfDoc.Close()
    
        End Sub