How to programmatically detect an XFA (Adobe XML Forms Architecture) dynamic PDF

Developers https://www.devze.com 2023-04-12 07:46 Source: Internet
I have a system that converts PDF to TIF. It's a program written in C# that uses iTextSharp to get metadata about the PDF and pdf2tif (http://pdftotif.sourceforge.net/) to convert the file. I've noticed that a number of PDFs do not convert correctly. In Acrobat and Foxit they open as multi-page forms, but in any other viewer (Ghostscript...) they open as one-page documents with the message:

"To view the full contents of this document, you need a later version of the PDF viewer. You can upgrade to the latest version of Adobe Reader from "www.adobe.com/products/acrobat/readstep2.html" For further support, go to http://www.adobe.com/support/products/acrreader.html"

Some googling told me that these are XFA dynamic PDFs. Is there any way I can programmatically detect that, so I can handle these PDFs differently?


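The iText API is a good start: its XfaForm class reports whether a PDF contains an XFA form.

In iTextSharp you read the XfaPresent property instead of calling the isXfaPresent() method, as you would in the Java version. (If you've done a moderate amount of work with iTextSharp you probably already know this.)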

Anyway, here's a simple example using an HTTP Handler:

<%@ WebHandler Language="C#" Class="iTextXfa" %>
using System;
using System.Web;
using iTextSharp.text;  
using iTextSharp.text.pdf;

public class iTextXfa : IHttpHandler {
  public void ProcessRequest (HttpContext context) {
    HttpServerUtility Server = context.Server;
    string[] testFiles = { 
      Server.MapPath("./non-XFA.pdf"), Server.MapPath("./XFA.pdf") 
    };
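    foreach (string file in testFiles) {
      // Open the PDF and let XfaForm look for an XFA form in it.
      PdfReader reader = new PdfReader(file);
      try {
        XfaForm xfa = new XfaForm(reader);
        context.Response.Write(string.Format(
          "<p>File: {0} is XFA: {1}</p>",
          file,
          xfa.XfaPresent ? "YES" : "NO"
        ));
      } finally {
        reader.Close();  // release the file handle held by the reader
      }
    }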
  }
  public bool IsReusable { get { return false; } }
}


Command line approach:

strings document.pdf | grep XFA

If you get a line or two of output, you're probably working with an XFA PDF. Treat this as a heuristic, since it matches any string in the file that contains "XFA":

<</Names[(!ADBE::0100_VersChkStrings) 364 0 R(!ADBE::0100_VersChkVars) 365 0 R(!ADBE::0200_VersChkCode_XFACheck) 366 0 R]>>
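For sorting a batch of files before conversion, the same idea can be wrapped in a small shell function. This is only a sketch under some assumptions: the function name `has_xfa_marker` is made up for illustration, it searches for the literal `/XFA` key rather than any occurrence of "XFA", and it relies on the `-a` option found in GNU and BSD grep. It will miss PDFs whose form dictionary is stored in a compressed object stream, so a negative result is not conclusive.

```shell
#!/bin/sh
# Heuristic check: does the raw file contain the literal /XFA key?
# Exit status 0 means a match was found, 1 means no match.
# (has_xfa_marker is a hypothetical helper name, not part of any tool.)
has_xfa_marker() {
  # -a: treat binary data as text; -q: print nothing, set exit status only
  grep -aq '/XFA' "$1"
}

# Usage: label every PDF in the current directory.
for f in ./*.pdf; do
  [ -e "$f" ] || continue   # the glob matched nothing
  if has_xfa_marker "$f"; then
    echo "XFA candidate: $f"
  else
    echo "no marker:     $f"
  fi
done
```

Files flagged as candidates can then be confirmed with the iTextSharp check before deciding how to convert them.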