Friday, November 7, 2008

Vector v. ArrayList

Vector has always been by favorite Java List implementation. With it's auto-resizing and easy iteration features it's always been my first choice for storing data in list form. (As an aside, the new version of Sun's Java which requires Vector to be parameterized seems broken to me. But I digress.) Another feature of Vector that I always regarded as a bonus is its built-in synchronization. However, I've begun to rethink that.

Only recently, I've begun to do much concurrent programming, and usually using my MapReduce class. (I have a new, much improved version of that, which I will blog about soon.) MapReduce uses Vector but not in a way that requires synchronization. In fact, I can't think of a single place where I'm using Vector that does.

Does this matter? I think so. Being synchronized, every addition to or removal from a Vector requires a lock to be obtained and then released. Not very noticeable overhead in a small program, but lately I've been trying to shave milliseconds off the running time in each module of a program that runs for over two hours.

Enter ArrayList. I never used ArrayList before but apparently it is basically equivalent to Vector except without synchronization. So I ran a quick test of adding and then removing a million Integer objects from both Vector and ArrayList. (Care had to be taken to avoid including garbage collection time in the results.) Guess what? ArrayList ran in about 3/4 of the time.

I think I have a new favorite Java List implementation.

No comments: