Parse PDF for Form Titles

Developer https://www.devze.com 2023-04-11 09:54 Source: web
I want to parse a PDF for form field names and types. Is it possible? When I tried one PDF, I got some strange characters, e.g.:

...

?õ»â¢_¸ðO´×¢É]Ì|BQÔQClã(¢dVò¶~?ýg?þª í

pÅ2ÞÎÉÍ??Ú?wȳ.?d;k)*lÙ´¸(ò!ú©=ià??d?éPض2Èåäý?»p?nÜÈûÏ??M

õl:`Þ°Ã3£BíTCy5 ?ð?tN¿7fDõK

±¦?i¹vü~»X?s÷A~Ôê±4?ÕµX±¤?

...

Where could the problem be? I used the tool at http://support.persits.com/pdf/demo_formfields.asp with the PDF https://www.drsr.sk//priznania/dpfoa2010.pdf

I want to write a parser for iOS. Thanks for any answers.


For PDF parsing on iOS, use the Quartz API.

For an example of an app which makes use of this API, see this reader.

To extract the specific information you're interested in, you will need to read the PDF document structure specification and figure out which dictionaries it's in (or, if you're lucky, find some sample code).
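To make "which dictionaries" a bit more concrete: in the PDF specification, interactive form fields hang off the document catalog's AcroForm dictionary. Each field dictionary carries a partial name under /T and a type under /FT (Tx, Btn, Ch or Sig), and child fields listed under /Kids may inherit /FT from their /Parent. A field's fully qualified name is its ancestors' partial names joined with periods. The sketch below only models those rules in portable C. The Field struct and both function names are hypothetical illustrations, not Quartz types.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory model of a PDF field dictionary (not a Quartz type). */
typedef struct Field {
    const char *partial_name;   /* value of /T, or NULL if absent */
    const char *type;           /* value of /FT (e.g. "Tx"), or NULL if inherited */
    const struct Field *parent; /* value of /Parent, or NULL for a root field */
} Field;

/* Writes the fully qualified name (ancestor partial names joined by '.')
   into buf, truncating if buf is too small. Returns the resulting length. */
size_t field_full_name(const Field *f, char *buf, size_t cap)
{
    if (cap == 0) return 0;
    buf[0] = '\0';
    if (f == NULL) return 0;

    /* Emit the ancestors first, then append this field's own part. */
    size_t len = field_full_name(f->parent, buf, cap);
    if (f->partial_name == NULL) return len;   /* kid without its own /T */
    if (len > 0 && len + 1 < cap) {
        buf[len++] = '.';
        buf[len] = '\0';
    }
    snprintf(buf + len, cap - len, "%s", f->partial_name);
    return strlen(buf);
}

/* Returns the nearest /FT found on the field or one of its ancestors,
   or NULL if none of them defines it. */
const char *field_effective_type(const Field *f)
{
    for (; f != NULL; f = f->parent) {
        if (f->type != NULL) return f->type;
    }
    return NULL;
}
```

For a root field named "address" of type Tx with a kid named "city" that has no /FT of its own, field_full_name yields "address.city" and field_effective_type yields "Tx".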


OK, so I looked into the reference and found something. I was able to open the PDF and get a CGPDFDictionaryRef, but I am stuck at that point. This is my code:

// Locate the bundled PDF and open it.
CFURLRef pdfURL = CFBundleCopyResourceURL(CFBundleGetMainBundle(), CFSTR("simple_form.pdf"), NULL, NULL);
CGPDFDocumentRef myDocument = CGPDFDocumentCreateWithURL(pdfURL);
if (pdfURL) CFRelease(pdfURL); // we own the copied URL; CFRelease(NULL) would crash

size_t numOfPages = CGPDFDocumentGetNumberOfPages(myDocument);
for (size_t k = 1; k <= numOfPages; k++) { // PDF page numbers start at 1
  CGPDFPageRef myPage = CGPDFDocumentGetPage(myDocument, k);
  CGPDFDictionaryRef ref = CGPDFPageGetDictionary(myPage); //what at this point?
  // No CGPDFPageRelease here: CGPDFDocumentGetPage does not transfer ownership.
}
CGPDFDocumentRelease(myDocument); // balances CGPDFDocumentCreateWithURL

I'd like to have something similar to Figure 14-1 here
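One possible next step, offered as a sketch rather than a tested solution: per the PDF specification, form fields are listed at document level under the catalog's AcroForm dictionary, while a page dictionary only references their widget annotations through /Annots. So it may be easier to start from CGPDFDocumentGetCatalog than from the page dictionary. The names print_form_fields, print_fields and field_type_label are hypothetical. The depth limit of 32 is an arbitrary guard against malformed files, and fields without their own /T are skipped. The Quartz part is wrapped in an __APPLE__ guard because CoreGraphics exists only on Apple platforms.

```c
#include <stdio.h>
#include <string.h>

/* Maps a /FT name from the PDF specification to a readable label. */
const char *field_type_label(const char *ft)
{
    if (ft == NULL) return "unknown (possibly inherited)";
    if (strcmp(ft, "Tx") == 0) return "text";
    if (strcmp(ft, "Btn") == 0) return "button";
    if (strcmp(ft, "Ch") == 0) return "choice";
    if (strcmp(ft, "Sig") == 0) return "signature";
    return "unknown";
}

#ifdef __APPLE__
#include <CoreGraphics/CoreGraphics.h>

/* Recursively prints the /T name and /FT type of each field in the array. */
static void print_fields(CGPDFArrayRef fields, int depth)
{
    if (depth > 32) return; /* arbitrary guard against cyclic /Kids */

    size_t count = CGPDFArrayGetCount(fields);
    for (size_t i = 0; i < count; i++) {
        CGPDFDictionaryRef field = NULL;
        if (!CGPDFArrayGetDictionary(fields, i, &field)) continue;

        const char *ft = NULL;          /* stays NULL when /FT is inherited */
        CGPDFDictionaryGetName(field, "FT", &ft);

        CGPDFStringRef title = NULL;
        if (CGPDFDictionaryGetString(field, "T", &title)) {
            char name[256] = "";
            CFStringRef text = CGPDFStringCopyTextString(title);
            if (text != NULL) {
                CFStringGetCString(text, name, sizeof name, kCFStringEncodingUTF8);
                CFRelease(text);
            }
            printf("%*s%s: %s\n", depth * 2, "", name, field_type_label(ft));
        }

        CGPDFArrayRef kids = NULL;
        if (CGPDFDictionaryGetArray(field, "Kids", &kids)) {
            print_fields(kids, depth + 1);
        }
    }
}

/* Walks catalog -> AcroForm -> Fields and prints what it finds. */
void print_form_fields(CGPDFDocumentRef doc)
{
    CGPDFDictionaryRef catalog = CGPDFDocumentGetCatalog(doc);
    CGPDFDictionaryRef acroForm = NULL;
    CGPDFArrayRef fields = NULL;

    if (catalog != NULL &&
        CGPDFDictionaryGetDictionary(catalog, "AcroForm", &acroForm) &&
        CGPDFDictionaryGetArray(acroForm, "Fields", &fields)) {
        print_fields(fields, 0);
    } else {
        printf("No AcroForm fields found\n");
    }
}
#endif /* __APPLE__ */
```

With the code from the question, this would be called as print_form_fields(myDocument) before the document is released. The dictionaries and arrays returned by the CGPDF getters are owned by the document, so only the copied CFString is released here.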
