
How to use Embedded Equations in Java Apache POI library?

Source: https://www.devze.com, 2023-03-22 19:43 (reposted from the web)

I am trying to use Apache POI to extract embedded equations and text from a .doc MS Word file into a .ppt MS PowerPoint file. I have successfully extracted the text, but how do I extract the embedded equations?

The embedded equations come out like this if I extract them only as text:

!!EMBED Equation.3


This may not help you with the binary .doc format, but for the newer .docx format, I was able to get to the equation, which is embedded as an OLE document, using the following code:

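 // Needs: java.io.FileInputStream, java.io.InputStream,
 // org.apache.poi.xwpf.usermodel.XWPFDocument, org.apache.poi.openxml4j.opc.PackagePart,
 // org.apache.poi.poifs.filesystem.POIFSFileSystem, org.apache.poi.util.IOUtils
 InputStream in = new FileInputStream(f);
 XWPFDocument doc = new XWPFDocument(in);
 // getAllEmbedds() is the older POI name; newer releases call it getAllEmbeddedParts()
 for (PackagePart p : doc.getAllEmbedds()) {
   // Each embedded equation is stored as an OLE2 compound document
   POIFSFileSystem poifs = new POIFSFileSystem(p.getInputStream());
   // The "Equation Native" stream holds the MathType data
   byte[] oleData = IOUtils.toByteArray(
              poifs.createDocumentInputStream("Equation Native"));
 }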

You can then extract the MathType data from that stream and hand it to an MTEF parser.
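As a rough sketch of that extraction step: the "Equation Native" stream is usually described as a small EQNOLEFILEHDR header followed by the MTEF bytes. The layout assumed here (a 2-byte little-endian header length at offset 0, normally 28, and a 4-byte MTEF length at offset 8) comes from the published MathType notes, not from the answer above, and `MtefExtractor` and `extractMtef` are hypothetical names. Check the layout against your own files before relying on it.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

final class MtefExtractor {
    // Assumed size of the EQNOLEFILEHDR header that precedes the MTEF data.
    private static final int MIN_HEADER_SIZE = 28;

    /** Returns the bytes that follow the header (hypothetical helper). */
    static byte[] extractMtef(byte[] equationNative) {
        if (equationNative.length < MIN_HEADER_SIZE) {
            throw new IllegalArgumentException("Stream too short for an EQNOLEFILEHDR");
        }
        ByteBuffer buf = ByteBuffer.wrap(equationNative).order(ByteOrder.LITTLE_ENDIAN);
        int headerSize = buf.getShort(0) & 0xFFFF;    // cbHdr: header length in bytes
        long declared = buf.getInt(8) & 0xFFFFFFFFL;  // cbObject: MTEF length in bytes
        if (headerSize < MIN_HEADER_SIZE || headerSize > equationNative.length) {
            throw new IllegalArgumentException("Unexpected header size: " + headerSize);
        }
        // Use the declared length only as far as the bytes actually read allow.
        long available = equationNative.length - headerSize;
        int length = (int) Math.min(declared, available);
        return Arrays.copyOfRange(equationNative, headerSize, headerSize + length);
    }
}
```

Inside the loop from the answer, this would be called as `byte[] mtef = MtefExtractor.extractMtef(oleData);` and the result passed to whichever MTEF parser you use.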

If you don't need the MathType data, there is also a placeholder image (in WMF format) that just renders the equation.
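One way to get at those placeholder images without going through POI is to read the .docx as a plain zip archive. The sketch below assumes the images sit under `word/media/` with a `.wmf` extension, which is where Word normally stores them but is not guaranteed. `WmfPlaceholderReader` and `readWmfImages` are hypothetical names, and the code does not link each image back to a particular equation.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

final class WmfPlaceholderReader {
    /** Maps each word/media/*.wmf entry name to its raw bytes (hypothetical helper). */
    static Map<String, byte[]> readWmfImages(InputStream docx) throws IOException {
        Map<String, byte[]> images = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                // Assumed location and extension of the placeholder images.
                if (!name.startsWith("word/media/") || !name.toLowerCase().endsWith(".wmf")) {
                    continue;
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int n;
                while ((n = zip.read(chunk)) != -1) {
                    out.write(chunk, 0, n);
                }
                images.put(name, out.toByteArray());
            }
        }
        return images;
    }
}
```

Calling `WmfPlaceholderReader.readWmfImages(new FileInputStream(f))` returns the images in archive order; converting WMF to a format PowerPoint or a browser can display is a separate step.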
