44

I am building the Javadoc for a module with 2,509 classes. This currently takes 7 minutes, or about 6 files per second.

I have tried

mvn -T 1C install

However, javadoc only uses one CPU. Is there a way to use more cores and/or speed it up?

I am using Oracle JDK 8 update 112. My dev machine has 16 cores and 128 GB of memory.

Running Flight Recorder, I can see that there is only one thread doing the work, main:

[Image: screenshot of the Flight Recorder report]

For those who are interested, I've used the following options:

<plugin>
    <artifactId>maven-javadoc-plugin</artifactId>
    <configuration>
        <additionalJOptions>
            <additionalJOption>-J-XX:+UnlockCommercialFeatures</additionalJOption>
            <additionalJOption>-J-XX:+FlightRecorder</additionalJOption>
            <additionalJOption>-J-XX:StartFlightRecording=name=test,filename=/tmp/myrecording-50.jfr,dumponexit=true</additionalJOption>
            <additionalJOption>-J-XX:FlightRecorderOptions=loglevel=debug</additionalJOption>
        </additionalJOptions>
    </configuration>
</plugin>

NOTE: One workaround is to do:

-Dmaven.javadoc.skip=true
Mikhail Kholodkov
Peter Lawrey
  • Profile the javadoc process. I would assume it's probably IO bound. So you could load the source onto a ramdisk or ssd. – Elliott Frisch Dec 16 '16 at 17:22
  • @ElliottFrisch A good thought, the disk is 3% busy, but the CPU is almost exactly 100% (one cpu). I can profile it with Flight Recorder though, will update. – Peter Lawrey Dec 16 '16 at 17:23
  • CPU might be in IO wait and 100%. – Elliott Frisch Dec 16 '16 at 17:24
  • On this machine `3.7% us, 0.2 sy, 0.0 ni, 96.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st.` – Peter Lawrey Dec 16 '16 at 17:40
  • Perhaps doxygen is multithreaded and compatible with javadoc syntax ? – Marged Dec 16 '16 at 17:58
  • What jdk are you running? Have you measured the time for running javadoc directly? – pvg Dec 16 '16 at 17:59
  • @pvg I am using OracleJDK 8 update 112. I am running the javadoc from maven but don't expect it to be faster without it. I have added a screen shot of the flight recorder report. – Peter Lawrey Dec 16 '16 at 18:06
  • Can you get the actual javadoc invocation? I just tried it on the 2k classes in the `java` package. Took 35 seconds so something seems off about your times. – pvg Dec 16 '16 at 18:37
  • 2
    The Oracle `javac` compiler is not multithreaded, but the Eclipse compiler is. Can the Eclipse compiler perhaps generate javadoc too? – Andreas Dec 16 '16 at 18:48
  • Does `mvn -T 16 install` behave differently? – Elliott Frisch Dec 16 '16 at 21:40
  • @ElliottFrisch tried it and the difference was a second (possibly random variation) – Peter Lawrey Dec 17 '16 at 11:54
  • 1
    I'm thinking the `-T` controls how maven starts `javac` compiler processes, `javadoc` is a standalone tool. There are very few [options](http://docs.oracle.com/javase/8/docs/technotes/tools/windows/javadoc.html#CHDFDACB) documented, for example [`-verbose`](http://docs.oracle.com/javase/8/docs/technotes/tools/windows/javadoc.html#CHDGHFJJ) will tell you how long it's spending on each file. – Elliott Frisch Dec 18 '16 at 23:40
  • 4
    You might be triggering a JavaDoc bug. Most of the time is spent in HashMap.put() and ClassMember.isEqual() which could indicate a poor hash code algorithm which leads to too many conflicts. – Eduard Wirch Dec 20 '16 at 20:38
  • 1
    It's not necessarily related, but you invoke `javadoc` through `mvn`, so a maven speed-up may be worth a shot, i.e., `export MAVEN_OPTS="-client -XX:+TieredCompilation -XX:TieredStopAtLevel=1 -Xverify:none"` (cf. [this blog](http://blog2.vorburger.ch/2016/06/improve-maven-build-speed-with-q.html)). I don't have too much hope about that, but who knows? – D. Kovács Jan 05 '17 at 07:59
  • What version of maven are you using? Did you set any MAVEN_OPTS? What version of maven-javadoc-plugin are you using? – pringi Jan 06 '17 at 15:19
  • @pringi I haven't set any OPTS, the process is long running (minutes) so I am not sure this will help. I am using version 2.10.3 of the plugin. – Peter Lawrey Jan 06 '17 at 15:28
  • Maven runs on Java, so setting MAVEN_OPTS will help the JVM (e.g. -Xms256m -Xmx512m). What version of Maven are you using? – pringi Jan 06 '17 at 15:55
  • 1
    See this link if it helps : https://issues.apache.org/jira/browse/LUCENE-5282 – Sanjit Kumar Mishra Jan 17 '17 at 13:43
  • 1
    interesting. is it safe to assume that you are invoking `mvn javadoc:javadoc`? – Eugene Oct 06 '18 at 12:56
  • @Eugene correct. – Peter Lawrey Oct 06 '18 at 17:06
  • 1
    @PeterLawrey that is very interesting, I've tried running that command for sources of `JMH` - 3618 of classes, around 12 seconds. I'm running `3.0.1` version of the plugin. – Eugene Oct 07 '18 at 12:38
  • @Eugene I will try updating the plugin. I suspect the problem is the number of relationships between classes. – Peter Lawrey Oct 07 '18 at 20:08
  • @PeterLawrey for the record, I've also tried around 5 other projects I have from `openjdk` and around 10 of our internal ones - some modules, all well above 2k classes... it's most probably the data itself in your project that triggers a weird path. Please post back with results once you do – Eugene Oct 07 '18 at 20:10
  • @PeterLawrey is it possible to generate javadocs individually for each submodule/subpackage and then assemble these parts? – Andrew Tobilko Oct 09 '18 at 11:19

6 Answers

6

Running Maven with -T1C will cause Maven to try to build modules in parallel, so if you have a multi-module project, at best it will build each module's javadoc in parallel (if the dependency graph between your modules allows it).

The javadoc process itself is single-threaded, so you won't be able to use multiple cores to generate the javadoc of one single module.

However, since you have many classes (and possibly many @link tags or similar?), the javadoc process might benefit from a larger heap. Have you looked into GC activity? Try adding this to your configuration and see if it helps:

<additionalJOption>-J-Xms2g</additionalJOption>
<additionalJOption>-J-Xmx2g</additionalJOption>
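
To judge whether these flags change anything, it can help to see what heap a JVM on the same machine gets by default. The following is only a minimal sketch: the class name HeapDefaultsSketch is made up for illustration, and it reports on its own JVM, not on the javadoc process, so treat the numbers as an indication of the defaults rather than a measurement of javadoc.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper for illustration; not part of javadoc or Maven. */
class HeapDefaultsSketch {

    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Summarises the heap currently committed and the maximum the JVM may grow to. */
    static String describeHeap() {
        Runtime rt = Runtime.getRuntime();
        return "committed=" + toMiB(rt.totalMemory()) + " MiB, max=" + toMiB(rt.maxMemory()) + " MiB";
    }

    public static void main(String[] args) {
        System.out.println(describeHeap());
        // Collection counts and times so far, per collector, for this JVM only.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```

Running it once without heap flags and once with -Xms2g -Xmx2g should show how far the starting heap is from the value suggested above.
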
lbndev
  • I could check the memory size is not limited. The default should be 32 GB on this machine. – Peter Lawrey Feb 02 '17 at 09:23
  • @PeterLawrey the problem may not be the limits, but the starting size. JVM will extend memory a little only after every full GC, so a reasonable amount of Xms will let JVM avoid too much GC before it extends the memory enough for your workload – Tair Oct 09 '18 at 17:25
5

@lbndev is right, at least with the default Doclet (com.sun.tools.doclets.formats.html.HtmlDoclet) that is supplied with Javadoc. A look through the source confirms the single-threaded implementation:

(Those links are to JDK 8 source. With JDK 11 the classes have moved, but the basic for loops in HtmlDoclet and AbstractDoclet are still there.)

Some sample-based profiling confirmed that these methods are the bottleneck: [Image: Javadoc profiling]

This won't be what you're hoping to hear, but there appears to be no option for multi-threading in the current standard Javadoc, at least within a single Maven module.

generateClassFiles() etc. would lend themselves well to a bit of multithreading, though this would probably need to be a change in the JDK. As mentioned below, AbstractDoclet.isValidDoclet() even actively blocks subclassing of HtmlDoclet. Trying to reimplement some of those loops as a third party would mean pulling in a lot of other code.
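
To make the idea concrete, here is a minimal sketch of what "a bit of multithreading" over a per-class loop could look like. It is not the JDK's code: ParallelPageSketch, renderPage and renderAll are hypothetical names, and renderPage is a trivial stand-in for the real per-class work. It also assumes each page can be rendered without touching shared mutable state, which is exactly the property the real doclet does not have today.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

/** Hypothetical sketch; none of these names come from the JDK doclet code. */
class ParallelPageSketch {

    /** Stand-in for the per-class work; assumed to be free of shared mutable state. */
    static String renderPage(String className) {
        return "<html><body><h1>" + className + "</h1></body></html>";
    }

    /** Renders one page per class on a fixed-size pool and returns className -> HTML. */
    static Map<String, String> renderAll(List<String> classNames, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, String> pages = new ConcurrentHashMap<>();
            // Submit every class first, then wait, so the tasks actually overlap.
            List<CompletableFuture<Void>> tasks = classNames.stream()
                    .map(name -> CompletableFuture.runAsync(
                            () -> pages.put(name, renderPage(name)), pool))
                    .collect(Collectors.toList());
            // join() rethrows a task failure as an unchecked CompletionException.
            tasks.forEach(CompletableFuture::join);
            return pages;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("com.example.Foo", "com.example.Bar", "com.example.Baz");
        Map<String, String> pages = renderAll(names, Runtime.getRuntime().availableProcessors());
        System.out.println(pages.size() + " pages rendered");
    }
}
```

The loop itself is the easy part; the hard part in the real doclet is the state that renderPage pretends not to need.
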

A scan around other Doclet implementations (e.g. javadown) only found a similar implementation style around the package and class drilldown. It's possible others on this thread will know more.

Thinking a bit more widely, there might be room for tuning around DocFileFactory. It's clearly marked as an internal class (not even public in the package), but it does abstract the writing of the (HTML) files. It seems possible that an alternative version of this could buffer the HTML in memory, or stream directly to a zip file, to improve the IO performance. But anyone doing this would also need to accept the risk of change in the JDK tools.
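
As a rough illustration of the "stream directly to a zip file" idea, the sketch below writes already rendered pages into a single in-memory zip instead of thousands of small files. It does not touch DocFileFactory or any other JDK internal: ZipPageSinkSketch and zipPages are hypothetical names, and the pages are plain strings supplied by the caller.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Hypothetical sketch; it does not plug into the real javadoc tool. */
class ZipPageSinkSketch {

    /** Writes each (relative path -> HTML) pair as one zip entry and returns the zip bytes. */
    static byte[] zipPages(Map<String, String> pagesByPath) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (Map.Entry<String, String> page : pagesByPath.entrySet()) {
                zip.putNextEntry(new ZipEntry(page.getKey()));
                zip.write(page.getValue().getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The zip stream is closed here, so the central directory has been written.
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        Map<String, String> pages = new LinkedHashMap<>();
        pages.put("com/example/Foo.html", "<html><body>Foo</body></html>");
        pages.put("com/example/Bar.html", "<html><body>Bar</body></html>");
        System.out.println("zip size in bytes: " + zipPages(pages).length);
    }
}
```

Given that the question reports the disk as only 3% busy, this would likely help less than parallel generation in that particular case.
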

df778899
  • Hmmm, a crafty subclass of `HtmlDoclet` could override such as `generateClassFiles()` and introduce an executor. It's going to need to be JDK specific though. Can I check which JDK is the target now? – df778899 Oct 06 '18 at 22:37
  • Good luck with that, with all the singletons, stateful classes *and* javac. It's futile. – rustyx Oct 07 '18 at 08:51
  • Well ... the idea was to jump in later with a new doclet implementation - as the `-doclet` parameter. Looks like the implementers of `HtmlDoclet` saw this possibility and locked it down though. [AbstractDoclet.isValidDoclet()](https://github.com/JetBrains/jdk8u_langtools/blob/master/src/share/classes/com/sun/tools/doclets/internal/toolkit/AbstractDoclet.java#L52) checks the fully qualified classname of the subclass is `com.sun.tools.doclets.formats.html.HtmlDoclet`. It's private, and called from an internal method, so there would be lots to reimplement. Similar in JDK 11. – df778899 Oct 07 '18 at 15:18
0

javadoc, and the standard doclet, are currently fundamentally single-threaded.

It is "on the radar" to improve this, primarily by generating pages in parallel, but this means retrofitting MT-safeness to various shared data structures.

0

You can have Maven use multiple threads per core across all cores.

For example:

mvn -T 4C install # will use 4 threads per available CPU core

You can change 4 above to whatever number you want. You have a machine with lots of resources. Try 8 or 16.

Also, have you tried using javadoc-no-fork? This ensures javadoc is not triggered a second time: https://maven.apache.org/plugins/maven-javadoc-plugin/examples/javadoc-nofork.html

Arun Avanathan
0

Maven customization is a way to speed up javadoc generation.

Another approach would be to change the doclet used for generating the javadoc. The Maven Javadoc plugin allows you to change the doclet used to generate the javadoc:

https://maven.apache.org/plugins/maven-javadoc-plugin/examples/alternate-doclet.html

I found the following commercial doclet (I'm not affiliated with them in any way), which claims to be faster than the traditional javadoc. It offers a free/trial/commercial license. If you're really eager to speed up your javadoc build, it may be worth checking whether it justifies the price:

http://www.filigris.com/docflex-javadoc

Maybe open-source alternatives exist on the internet...

Mumrah81
-3

Use doxygen instead of the regular mvn build that you are using now.

gyurix