I need to resolve a large number (hundreds of thousands) of domains to IP addresses in Java. While InetAddress.getByName() is feasible for small numbers, it is far too slow for large quantities (probably because it sends only one request at a time to the DNS server and waits for the response before moving on to the next one).
Is there a more efficient way (such as sending them to the DNS server in bulk) that would cut down the time required to resolve a large number of domains?
At fmucar's request I'm adding the code used to try a more multi-threaded approach:
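    // Shared, thread-safe set that collects the resolved IP addresses.
    Set<String> ips = Collections.synchronizedSet(new HashSet<String>());
    int i = 0;
    // Split the domains into batches of 5 hosts each.
    List<Set<String>> sets = new ArrayList<Set<String>>();
    for (String host : domains) {
        if (i++ % 5 == 0) {
            sets.add(new HashSet<String>());
        }
        Set<String> ipset = sets.get(sets.size() - 1);
        ipset.add(host);
    }
    // Start one thread per batch; each DomainResolver is handed the shared result set and its batch.
    for (Set<String> ipset : sets) {
        Thread t = new Thread(new DomainResolver(ips, ipset));
        t.start();
    }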
At 250 domains per thread we peaked at around 700 results per minute, which, while better than before (<300), was still not that great when hundreds of thousands need resolving. Lowering it to only 5 per thread sped this up to several thousand per minute. This obviously creates an insane number of threads though, so I'm presently investigating doing the resolving in C to make use of http://www.chiark.greenend.org.uk/~ian/adns/
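
One way to keep the throughput of small batches without starting a thread per batch might be a fixed-size thread pool that takes one task per domain. The following is only a minimal sketch, assuming Java 8 or later; PooledResolver, resolveAll and poolSize are made-up names for illustration, and the right pool size is a guess that would need measuring. It still makes blocking InetAddress.getByName() calls, so it is not true bulk or asynchronous DNS in the way adns is.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper class; not part of the code shown in the question.
class PooledResolver {

    // Resolves every host using at most poolSize worker threads.
    // poolSize is an assumed tuning knob, not a measured value.
    static Set<String> resolveAll(Collection<String> domains, int poolSize) {
        // Thread-safe set shared by all tasks.
        final Set<String> ips = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            // One small task per domain; the pool bounds the thread count.
            List<Callable<Void>> tasks = new ArrayList<>();
            for (final String host : domains) {
                tasks.add(() -> {
                    try {
                        ips.add(InetAddress.getByName(host).getHostAddress());
                    } catch (UnknownHostException e) {
                        // Hosts that do not resolve are skipped.
                    }
                    return null;
                });
            }
            // Blocks until every task has finished.
            pool.invokeAll(tasks);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and return what was collected so far.
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return ips;
    }
}
```

Calling PooledResolver.resolveAll(domains, 200) would then use 200 threads in total instead of one thread per 5 domains; the 200 is an arbitrary starting point, not a recommendation.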