How can ant compile-and-jar byte-identical jar files, i.e. so MD5 matches unless .java (and thus .class) changes?

Developer https://www.devze.com 2023-04-01 03:51 Source: web

Summary

How can you make ant repeatedly generate byte-identical jar files from the same .class files?

Background

Our build process does the following:

  1. gets web-services-definition (wsdl) files from another application's source repository
  2. runs wsdl2java to generate .java file for use by web-service clients (i.e. our app)
  3. compiles the java files
  4. generates a .jar file from the compiler output
  5. checks the 'artifact' jar file into source control

Note: We do this last step so developers have access to this jar file w/o building it themselves. We use a special 'derived' directory to distinguish source from artifacts.

The problem

We cannot get ant to generate byte-identical .jar files, even if the source files have not changed, i.e. each build generates a slightly different jar (with a different MD5).

I checked the internet and found this question from some 5 years back:

If I compile some code and create a jar and related md5 file using ANT the checksum in the md5 file is different everytime even though the code hasn't changed. Any idea's why this is so how it can circumvented ? I suspect there is some timestamp information coming in somewhere.

http://www.velocityreviews.com/forums/t150783-creating-new-jar-same-code-different-md5.html

Per the responses, I've attempted the following:

  1. setting the timestamp to '0' on all .class files before jarring
  2. specifying a manifest file and also setting the timestamp to 0 for this manifest

[Note: this second step seems ineffective. See below]

After each build, the .jar file still has a different MD5 sum.

CSI: Jar file

I've unjarred and examined the "different" jars. Their contents and timestamps match, with one exception: the timestamp of META-INF/MANIFEST.MF differs.
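This kind of inspection can also be done without unpacking. Below is a minimal sketch, assuming a JDK is available on the build machine; `JarEntryLister` is a hypothetical helper name, not part of Ant. It prints each entry's name, stored timestamp, size and CRC-32, so the output for two builds can be compared with a plain text diff.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical diagnostic helper (not part of Ant): one line per jar entry.
final class JarEntryLister {

    // Returns "name time=<millis> size=<bytes> crc=<hex>" for every entry in the jar.
    static List<String> describe(String jarPath) throws IOException {
        List<String> lines = new ArrayList<>();
        try (ZipFile zip = new ZipFile(jarPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry e = entries.nextElement();
                lines.add(String.format("%-50s time=%d size=%d crc=%08x",
                        e.getName(), e.getTime(), e.getSize(), e.getCrc()));
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        for (String jar : args) {
            System.out.println("== " + jar);
            describe(jar).forEach(System.out::println);
        }
    }
}
```

Running it on the jars from two consecutive builds and diffing the output should show which entries differ only in their timestamp.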

Code

   <!-- touch classes and manifest to set consistent timestamp across builds -->
   <touch millis="0">
    <fileset dir="${mycompany.ws.classes.dir}"/>
   </touch>
   <touch millis="0" file="mymanifest.mf"/>

   <jar destfile="${derived.lib.dir}/mycompanyws.jar"
        manifest="mymanifest.mf"
        basedir="${mycompany.ws.classes.dir}"
        includes="**/com/mycompany/**,**/org/apache/xml/**" 
    />

Other Options

We could use fancier ant programming to only check in the .jar file if the .java files have changed.


I have been facing a similar, though slightly different, problem, and decided to share it here as it relates to the topic of the question. To produce two byte-identical digitally signed JAR files at different times, one has to take the following points into consideration:

  • Timestamps: **/*.class files have to have the same timestamp (java.util.zip.ZipEntry.setTime(long)). In addition, the META-INF/MANIFEST.MF file and the certificate files (*.RSA, *.DSA, and *.SF) are added to the JAR file with a "now" timestamp. So even if you decide not to compile the classes and use the ones already compiled (i.e. the ones with the original JAR's timestamp), your resulting JAR will still differ at the byte level.
  • MANIFEST.MF Entries Ordering: Note that the key-value pairs in the MANIFEST.MF file are represented as a java.util.HashMap which "does not guarantee that the order will remain constant over time.". So you may run into another binary difference when signing the JAR files using JDK v5 and JDK v6 jarsigner tool as the order of the MANIFEST.MF entries may change (http://stackoverflow.com/questions/1879897/order-of-items-in-a-hashmap-differ-when-the-same-program-is-run-in-jvm5-vs-jvm6).

So basically there are two levels to the problem. Firstly, the JAR/ZIP tool packages the files with their file-system timestamps and thus creates binary-different JAR files for the same set of Java classes that are binary equal but were compiled at different times. Secondly, the JAR signer tool modifies the META-INF/MANIFEST.MF file and appends more files to the JAR archive (certificates and class file checksums).

The solution may be a custom JAR signer that sets the timestamps of all the JAR file entries to a constant time and orders the MANIFEST.MF entries (e.g. alphabetically). So far, to my knowledge, this is the only way to produce two byte-identical digitally signed JAR files at different points in time.
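The timestamp half of that idea can be sketched as a post-processing step. This is only an illustration under stated assumptions, not a signer: `JarNormalizer` and `FIXED_TIME` are hypothetical names, Java 9+ is assumed (for `ZipEntry.setTimeLocal` and `InputStream.transferTo`), and the code does not reorder attributes inside MANIFEST.MF or touch signatures. It rewrites a jar so that every entry carries the same constant timestamp and the entries appear in a fixed order.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical post-processing step: same contents in, byte-stable jar out.
final class JarNormalizer {

    // Arbitrary constant (an assumption). setTimeLocal stores it as a plain DOS
    // timestamp, so the result does not depend on the build machine's timezone.
    static final LocalDateTime FIXED_TIME = LocalDateTime.of(2000, 1, 1, 0, 0, 0);

    // Keep META-INF/ and the manifest at the front, where jar readers expect them.
    static int rank(String name) {
        if (name.equals("META-INF/")) return 0;
        if (name.equals("META-INF/MANIFEST.MF")) return 1;
        return 2;
    }

    // Copies every entry of 'in' to 'out' (a different file), sorted and re-stamped.
    static void normalize(Path in, Path out) throws IOException {
        try (ZipFile zip = new ZipFile(in.toFile());
             ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(out))) {
            List<ZipEntry> entries = new ArrayList<>(Collections.list(zip.entries()));
            entries.sort(Comparator.comparingInt((ZipEntry e) -> rank(e.getName()))
                    .thenComparing(ZipEntry::getName));
            for (ZipEntry e : entries) {
                // Fresh entry: only the name is carried over, no original metadata.
                ZipEntry copy = new ZipEntry(e.getName());
                copy.setTimeLocal(FIXED_TIME);
                zos.putNextEntry(copy);
                if (!e.isDirectory()) {
                    try (InputStream is = zip.getInputStream(e)) {
                        is.transferTo(zos);
                    }
                }
                zos.closeEntry();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: JarNormalizer <input.jar> <output.jar>");
            System.exit(2);
        }
        normalize(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

The output should be stable for identical inputs on the same JDK. A different JDK or zlib version may compress differently, so byte equality across toolchains is not guaranteed by this sketch.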


Since a jar is just a zip file incognito, you could try using the zip task to add the manifest file under META-INF/ by hand. Hopefully that circumvents any internal magic associated with handling the manifest by the jar task.

Just a side note: since it sounds like having equal MD5s is critical, I would recommend adding a sanity test as part of the build, such as compiling some special "dummy" code that never changes into a jar and checking that the jar's MD5 equals the expected one. This will safeguard the build against unexpected changes (e.g. after an upgrade to ant, JRE, or OS, a timezone change, etc.).
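A sanity check along those lines could look roughly like this. It is a minimal sketch: `JarMd5Check` is a hypothetical name, and how the expected checksum is stored and passed in (here as a command-line argument) is an assumption.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical build guard: fails when the reference jar's MD5 is not the expected one.
final class JarMd5Check {

    // Lower-case hex MD5 of the given bytes.
    static String md5Hex(byte[] data) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("MD5").digest(data);
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: JarMd5Check <dummy.jar> <expected-md5-hex>");
            System.exit(2);
        }
        String actual = md5Hex(Files.readAllBytes(Paths.get(args[0])));
        if (!actual.equalsIgnoreCase(args[1])) {
            System.err.println("MD5 mismatch: expected " + args[1] + " but got " + actual);
            System.exit(1); // non-zero exit lets the calling build step fail
        }
        System.out.println("MD5 OK: " + actual);
    }
}
```

The non-zero exit code is what makes this usable from a build script: a step that runs the class can be configured to stop the build on failure.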


Had this same problem, landed on this page. The answer above by Jiri Patera was very helpful in understanding why I could not get the md5sums of what I expected to be two identical files to be the same after unsigning and resigning the jar files.

This is the solution I used instead:

jar -tvf $JARFILE | grep -v META-INF | perl -p -e's/^\s+(\d+).*\s+([\w]+)/$1 $2/g' | md5sum

It doesn't give 100% certainty that the jars are equivalent but it gives a fairly reliable indication.

It takes a listing of all the files in the jar file minus the META-INF files, parses out the file size and file name, and then runs the text of file sizes plus file names through md5sum.
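The same idea can be sketched in Java for build machines without perl or md5sum. This is a variation rather than a translation of the pipeline above: `JarFingerprint` is a hypothetical name, and in addition to name and size it includes each entry's CRC-32 (which the shell version does not), so a changed file of the same length is also detected. Entries under META-INF/ are skipped and timestamps are ignored.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: a digest over "name size crc" lines, ignoring META-INF and timestamps.
final class JarFingerprint {

    static String fingerprint(String jarPath) throws IOException, NoSuchAlgorithmException {
        List<String> lines = new ArrayList<>();
        try (ZipFile zip = new ZipFile(jarPath)) {
            for (ZipEntry e : Collections.list(zip.entries())) {
                if (e.getName().startsWith("META-INF/")) {
                    continue; // manifest and signature files are expected to vary
                }
                lines.add(e.getName() + " " + e.getSize() + " " + Long.toHexString(e.getCrc()));
            }
        }
        Collections.sort(lines); // make the result independent of entry order
        MessageDigest md = MessageDigest.getInstance("MD5");
        for (String line : lines) {
            md.update((line + "\n").getBytes(StandardCharsets.UTF_8));
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        for (String jar : args) {
            System.out.println(fingerprint(jar) + "  " + jar);
        }
    }
}
```

As with the shell version, equal fingerprints are a strong indication rather than proof that two jars are equivalent, since CRC-32 is not a cryptographic hash.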
