Daniel Einspanjer's journal

Data warehousing, ETL, BI, and general hackery

Performance improvements at the cost of complexity
I discovered something that I feel is a bit of a bug in the Sun Java implementation. 

If you pass in a string to the method InetAddress.getByName(), it does a bunch of testing to see if it is a domain name or a literal IP address.
If it is an IPv4 address, it then uses String.split() to separate the four parts. String.split() treats its delimiter as a regular expression, so it does its work through the regex engine.

That means that if you are looking up hundreds or millions of addresses in a tight loop (as I've been doing), the JVM creates and compiles a new regex object on every one of those calls, in addition to allocating a String array and four String objects per call.
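To make the cost concrete, here is a minimal sketch of split-based parsing next to a variant that compiles the pattern once and reuses it. The class and method names (SplitParse, parseWithSplit, parseWithPrecompiledPattern) are mine, for illustration only; this is not the JDK's internal code, and how much work split() really does per call depends on the runtime version.

```java
import java.util.regex.Pattern;

class SplitParse {
    // Compiled once and shared, instead of once per call.
    private static final Pattern DOT = Pattern.compile("\\.");

    // The delimiter passed to String.split() is a regular expression.
    static long parseWithSplit(String ip) {
        return combine(ip.split("\\."));
    }

    // Same result, but the Pattern object is reused across calls.
    static long parseWithPrecompiledPattern(String ip) {
        return combine(DOT.split(ip));
    }

    // Folds four decimal octets into one long, most significant first.
    private static long combine(String[] parts) {
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected four octets");
        }
        long result = 0;
        for (String part : parts) {
            int value = Integer.parseInt(part);
            if (value < 0 || value > 255) {
                throw new IllegalArgumentException("octet out of range: " + part);
            }
            result = (result << 8) | value;
        }
        return result;
    }
}
```

Either way, both variants still allocate a String array and four String objects per address, which is the remaining cost the later methods go after.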

So at first, I just worked around it by taking substrings at the dot positions instead of splitting. That gave me roughly a 100x performance improvement. But then I realized I was still generating four String objects for every call.
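A rough sketch of what the substring approach can look like follows. It is not the code from my test program; the class name SubstringParse is made up for this example, and it assumes the dots are located with String.indexOf() and each piece is converted with Integer.parseInt().

```java
class SubstringParse {
    static long parse(String ip) {
        // Locate the three dots without involving the regex engine.
        int d1 = ip.indexOf('.');
        int d2 = ip.indexOf('.', d1 + 1);
        int d3 = ip.indexOf('.', d2 + 1);
        if (d1 < 0 || d2 < 0 || d3 < 0) {
            throw new IllegalArgumentException("expected three dots: " + ip);
        }
        // Each substring() call still allocates a new String object.
        long a = octet(ip.substring(0, d1));
        long b = octet(ip.substring(d1 + 1, d2));
        long c = octet(ip.substring(d2 + 1, d3));
        long d = octet(ip.substring(d3 + 1));
        return (a << 24) | (b << 16) | (c << 8) | d;
    }

    // Converts one decimal octet and checks that it fits in a byte.
    private static int octet(String s) {
        int value = Integer.parseInt(s);
        if (value < 0 || value > 255) {
            throw new IllegalArgumentException("octet out of range: " + s);
        }
        return value;
    }
}
```

The four substring() calls are the four String objects per address mentioned above.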

So I came up with this mapping method and it runs about 1000x faster with a near constant minimal memory footprint.

I pre-calculate a multidimensional array of shorts covering the numbers 0 to 255, where each dimension is indexed by one digit of the number: the literal character value minus 48 (the character code of '0').

With that array available, at run time, I can do a simple lookup of the short value and then do the math to get the long representation of the IP address.  I'm still generating a couple of references and a few intermediate int values, but the JIT optimizer can make quick work of that.
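Here is a minimal sketch of the idea, not the code from the linked test program. The layout is an assumption on my part: a 10x10x10 table indexed by hundreds, tens, and ones digit, with shorter octets treated as zero-padded and -1 marking values above 255. The names OctetLookup, TABLE, and digit are likewise just for illustration.

```java
class OctetLookup {
    // TABLE[h][t][o] is the value of the octet with digits h, t, o,
    // or -1 when that value would exceed 255.
    private static final short[][][] TABLE = new short[10][10][10];

    static {
        for (int h = 0; h < 10; h++) {
            for (int t = 0; t < 10; t++) {
                for (int o = 0; o < 10; o++) {
                    int value = h * 100 + t * 10 + o;
                    TABLE[h][t][o] = (short) (value <= 255 ? value : -1);
                }
            }
        }
    }

    static long parse(String ip) {
        long result = 0;
        int start = 0;
        for (int part = 0; part < 4; part++) {
            // The last octet runs to the end of the string.
            int end = (part < 3) ? ip.indexOf('.', start) : ip.length();
            int len = end - start;
            if (end < 0 || len < 1 || len > 3) {
                throw new IllegalArgumentException(ip);
            }
            // Missing leading digits are treated as zero.
            int h = (len == 3) ? digit(ip, start) : 0;
            int t = (len >= 2) ? digit(ip, end - 2) : 0;
            int o = digit(ip, end - 1);
            short value = TABLE[h][t][o];
            if (value < 0) {
                throw new IllegalArgumentException(ip);
            }
            result = (result << 8) | value;
            start = end + 1;
        }
        return result;
    }

    // Maps a digit character to its table index (character value minus 48).
    private static int digit(String s, int i) {
        int d = s.charAt(i) - 48;
        if (d < 0 || d > 9) {
            throw new IllegalArgumentException(s);
        }
        return d;
    }
}
```

No String objects are created per address here; the only per-call state is a handful of ints and the running long.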

Linked is the test program I created to play with the different methods:  InetAddressParse test


Why that complex?

Wouldn't something like this give approximately the same performance?

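int length = ip.length();
long res = 0;
int block = 0;
for (int i = 0; i < length; i++) {
    char c = ip.charAt(i);
    if (c == '.') {
        // The shift needs parentheses: '+' binds tighter than '<<'.
        res = (res << 8) + block;
        block = 0;
    } else {
        // Accumulate the decimal digits of the current octet.
        block = block * 10 + (c - '0');
    }
}
// The last octet has no trailing '.', so fold it in here.
res = (res << 8) + block;
return res;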

Re: Why that complex?

I have to agree that this code makes a lot of sense. There were a couple of bugs in it, and unfortunately, there were a couple of bugs in my method as well. Once those were cleaned up, this one outperformed my method3 very slightly because it avoids the String.indexOf() call.
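For anyone who wants to try the single-pass loop on its own, here is a self-contained sketch with the shift parenthesized and the final octet folded in after the loop. The input validation (digit check, octet range, dot count) is an addition of mine for illustration, and the class name SinglePassParse is made up.

```java
class SinglePassParse {
    static long parse(String ip) {
        long res = 0;
        int block = 0;   // value of the octet being read
        int digits = 0;  // digits seen in the current octet
        int dots = 0;    // separators seen so far
        for (int i = 0; i < ip.length(); i++) {
            char c = ip.charAt(i);
            if (c == '.') {
                if (digits == 0 || ++dots > 3) {
                    throw new IllegalArgumentException(ip);
                }
                res = (res << 8) + block;
                block = 0;
                digits = 0;
            } else if (c >= '0' && c <= '9') {
                block = block * 10 + (c - '0');
                if (++digits > 3 || block > 255) {
                    throw new IllegalArgumentException(ip);
                }
            } else {
                throw new IllegalArgumentException(ip);
            }
        }
        if (dots != 3 || digits == 0) {
            throw new IllegalArgumentException(ip);
        }
        // Fold in the last octet, which has no trailing dot.
        return (res << 8) + block;
    }
}
```

The extra checks cost a few comparisons per character, so timings with them in place may differ from the numbers discussed here.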
