Why does PPTX download as ZIP file from a website?

Developer https://www.devze.com 2023-03-06 19:11 Source: Internet
I know the root cause of the problem of downloading (say) a PPTX from a web site and having it download as a ZIP (an Office 2007 file is a ZIP archive with a different extension), and I know how to fix it on the web server (add MIME types).

But I'm interested in understanding why this is happening and the mechanics of the process being carried out by the web server and web browser. I'm aware that HTTP traffic can be transparently compressed and decompressed (gzip) to improve performance, so I'm guessing that this could also be part of the problem.

For example, one assumes the file name and path are passed back to the browser over HTTP. Is it the web server that's renaming the extension, or the web browser?

A little flow diagram would be ideal.


Apologies for answering this very old thread, but hopefully this is useful information.

The reason that pptx (or docx) files are renamed to zip is a combination of actions by both the web server and the browser. Most probably, the web server has not been configured to handle pptx files, so it sends them with Content-Type: text/plain. Some browsers (e.g. Chrome and Firefox) may say "ok, I believe you", and simply save the file under the name it was given. Other browsers (e.g. MSIE) may say "I'll just check that", and they inspect the file contents, which indicate a ZIP file. So, if MSIE has an option somewhere for "do not check MIME types when downloading files", then that is what you need on the browser side.

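Another solution lies with the web server, which really needs to send the proper Content-Type for the file. For .pptx the registered type is application/vnd.openxmlformats-officedocument.presentationml.presentation, rather than the unregistered application/mspowerpoint. If you have suitable access to an Apache web server, you just need to add a line to the .htaccess file saying AddType application/vnd.openxmlformats-officedocument.presentationml.presentation .pptx, which makes the server send a Content-Type header that MSIE will interpret correctly.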
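If the files are served by application code rather than by a static web server, the same fix can be sketched in Python. This is only an illustration under assumptions: download_headers is a hypothetical helper, not part of any framework, and the Content-Disposition header is an extra that the answer above does not mention.

```python
import mimetypes

# Registered media type for PowerPoint 2007+ (.pptx) files.
PPTX_TYPE = (
    "application/vnd.openxmlformats-officedocument.presentationml.presentation"
)

# Register the mapping explicitly so the lookup does not depend on the
# MIME tables that happen to be installed on the host.
mimetypes.add_type(PPTX_TYPE, ".pptx")


def download_headers(filename: str) -> dict[str, str]:
    """Build response headers for a file download (hypothetical helper)."""
    content_type, _encoding = mimetypes.guess_type(filename)
    return {
        # Fall back to a generic binary type when the extension is unknown.
        "Content-Type": content_type or "application/octet-stream",
        # Suggest the original name, extension included, to the browser.
        # Assumes the name contains no double quotes.
        "Content-Disposition": f'attachment; filename="{filename}"',
    }
```

With an explicit type in the response, a browser has no reason to fall back on inspecting the file contents.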


1) It's probable the web browser is using magic numbers to identify the type of file, based on the first few bytes of the file (typically a header of some sort for binary files).

As you are aware, Office 2007 files are packaged as ZIP archives, so the browser (when it doesn't have any MIME information to help) starts downloading the file, sees the ZIP header, and saves it (or prompts you to save it) as a zip file.

This seems like strange behaviour for the browser to me. I would have expected it to keep the file name (and extension) as provided by the server, but that may vary between browsers and with exactly what MIME type is provided (or not provided).
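The magic-number check described in point 1 can be sketched roughly as follows. This is not the algorithm of any particular browser: the function names and the set of "untrusted" types are assumptions made for illustration. The only hard fact used is that a ZIP archive (and therefore a .pptx file) normally starts with the bytes PK\x03\x04.

```python
import os
from typing import Optional

# Local file header signature that normally starts a ZIP archive.
ZIP_SIGNATURE = b"PK\x03\x04"

# Declared types that this sketch treats as "no useful MIME information".
# The exact set is an assumption, not taken from any browser.
UNTRUSTED_TYPES = {None, "", "text/plain", "application/octet-stream"}


def looks_like_zip(first_bytes: bytes) -> bool:
    """Return True when the data begins with the ZIP magic number."""
    return first_bytes.startswith(ZIP_SIGNATURE)


def sniffed_save_name(
    filename: str, declared_type: Optional[str], first_bytes: bytes
) -> str:
    """Pick the name a sniffing client might save the download under."""
    if declared_type in UNTRUSTED_TYPES and looks_like_zip(first_bytes):
        # The content wins over the extension the server supplied.
        stem, _ext = os.path.splitext(filename)
        return stem + ".zip"
    # A specific declared type is trusted, so the name is kept as sent.
    return filename
```

For example, sniffed_save_name("deck.pptx", "text/plain", b"PK\x03\x04...") gives "deck.zip", while the same call with the registered pptx type keeps "deck.pptx".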

2) Alternatively, the server may be doing the same thing, when it doesn't have a MIME type associated with a particular file extension. It might check the start of the file and find that it looks like a zip file, so will serve the file back to the client with a zip MIME type.

You could rule out the server doing any MIME type guessing by inspecting the HTTP response or raw packets (either server or client side) with something like Wireshark.
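As a lighter-weight alternative to a packet capture, the response headers can be read directly. The sketch below uses Python's urllib; fetch_headers and diagnose are hypothetical names, the URL you pass in is your own, and the wording of the verdicts is just one way to read the result.

```python
import urllib.request

# Types a server might send if it (or its configuration) decided "ZIP".
ZIP_TYPES = {"application/zip", "application/x-zip-compressed"}
# Types that carry no real information about an Office document.
GENERIC_TYPES = {"", "text/plain", "application/octet-stream"}


def fetch_headers(url: str) -> dict[str, str]:
    """Send a HEAD request and return the response headers, keys lowercased."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return {name.lower(): value for name, value in response.headers.items()}


def diagnose(headers: dict[str, str]) -> str:
    """Say which side is the likelier source of the .zip rename."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    declared = headers.get("content-type", "").split(";")[0].strip().lower()
    if declared in ZIP_TYPES:
        return "server: it declared a ZIP type"
    if declared in GENERIC_TYPES:
        return "browser: the server sent no useful type, so any rename is client-side"
    return "neither: the server declared a specific type"
```

Calling diagnose(fetch_headers(url)) against the download URL shows whether the server is already labelling the file as a ZIP before the browser gets involved.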

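3) Gzipping won't be the problem. HTTP compression is applied to the response body in transit (signalled by the Content-Encoding header) and is undone by the browser before the file is saved, so it is unrelated to MIME types.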
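A small round trip illustrates why compression in transit cannot change what the browser ends up with. The payload here is a made-up stand-in for the start of a .pptx file, and transfer is a hypothetical helper that mimics a server compressing and a client decompressing.

```python
import gzip

# Stand-in for the first bytes of a .pptx file: ZIP signature plus padding.
body = b"PK\x03\x04" + b"\x00" * 64


def transfer(payload: bytes) -> tuple[bytes, bytes]:
    """Compress as a server would, then decompress as a client would."""
    on_the_wire = gzip.compress(payload)       # Content-Encoding: gzip
    received = gzip.decompress(on_the_wire)    # done before saving or sniffing
    return on_the_wire, received


wire, received = transfer(body)
# The bytes on the wire carry the gzip signature, but the client never
# saves those; it saves the decoded body, which is identical to the original.
assert wire[:2] == b"\x1f\x8b"
assert received == body
```

Any content sniffing therefore sees the original ZIP signature of the Office file, whether or not gzip was used on the connection.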


The best explanation I've found -- both as to why this happens and how to fix it -- is http://blogs.msdn.com/b/asiatech/archive/2012/03/28/office-documents-will-be-recognized-as-zip-file-when-downloading-from-ie.aspx.
