243

I'm sure there's a good reason, but could someone please explain why the java.util.Set interface lacks get(int Index), or any similar get() method?

It seems that sets are great for putting things into, but I can't find an elegant way of retrieving a single item from it.

If I know I want the first item, I can use set.iterator().next(), but otherwise it seems I have to cast to an Array to retrieve an item at a specific index?

What are the appropriate ways of retrieving data from a set? (other than using an iterator)

I'm sure the fact that it's excluded from the API means there's a good reason for not doing this -- could someone please enlighten me?

EDIT: Some extremely great answers here, and a few saying "more context". The specific scenario was a dbUnit test, where I could reasonably assert that the returned set from a query had only 1 item, and I was trying to access that item.

However, the question is more valid without the scenario, as it remains more focussed:

What's the difference between a Set and a List?

Thanks to all for the fantastic answers below.

DontDivideByZero
Marty Pitt
  • 1
    Why would you get an element from a set by index? Are you trying to use a set as a sorted array? – MSN Apr 20 '09 at 19:41
  • The particular instance here is a dbUnit test against a Set returned from a Hibernate call. In my test, it's reasonable to assume (because I assert it) that the object returned is in a specific order, because of the IDataSet I used to set it up. It's a non-typical case, but it led to my curiosity about the API. – Marty Pitt Apr 20 '09 at 19:48
  • 1
    Adding things in a specific order doesn't mean they'll stay that way, unless you're using a custom Set implementation. – Michael Myers Apr 20 '09 at 19:55
  • 1
    "If I know I want the first item, I can use set.iterator().next()" - This line doesn't actually make sense. You are really saying "If I know I want the first item, by the implementation's definition of the first item, then I can...". Set itself is unordered, so indexed access doesn't make sense. Now if there were an ArrayListSet, that would make more sense (just cast to "List" and be happy). Perhaps you could give more context for the question? – jsight Apr 20 '09 at 20:25
  • Set isn't unordered! Certain implementations of it are, but some implementations are explicitly ordered in a particular way. – reinierpost Feb 02 '15 at 17:01

18 Answers18

179

Because sets have no ordering. Some implementations do (particularly those implementing the java.util.SortedSet interface), but that is not a general property of sets.

If you're trying to use sets this way, you should consider using a list instead.
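One way to see what "no ordering" means in practice is to compare how lists and sets define equality. The following is only a minimal sketch; the class name `OrderingDemo` and the helper `sameAsSets` are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

final class OrderingDemo {
    // True when both inputs hold the same elements, regardless of their order.
    static boolean sameAsSets(List<String> a, List<String> b) {
        return new HashSet<>(a).equals(new HashSet<>(b));
    }

    public static void main(String[] args) {
        List<String> ab = Arrays.asList("a", "b");
        List<String> ba = Arrays.asList("b", "a");
        System.out.println(ab.equals(ba));       // false: lists compare position by position
        System.out.println(sameAsSets(ab, ba));  // true: sets compare membership only
    }
}
```

Because position is not part of a set's contract, an index would have nothing stable to refer to.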

Michael Myers
  • I'd revise that last comment to "you should use a list instead" – matt b Apr 20 '09 at 19:27
  • 10
    @matt b: No, I think he should consider it. Thinking is good. ;) – Michael Myers Apr 20 '09 at 19:28
  • 22
    "Consider" is the correct phrasing. There are two possible problems (a) He is using a set when he should be using something else, or (b) He is trying to do things with Sets that they don't support but that he could do a different way. It is good to *consider* which of these is the case. – kenj0418 Apr 20 '09 at 20:20
  • 6
    Maybe the simpler answer is to use a sorted set (I assume uniqueness played a part in choosing a set). But I have a question, though: since SortedSet is ordered, why is there no get method in the API? – uncaught_exceptions Jan 19 '11 at 09:50
  • 1
    @Michael, it's a bad reason. He doesn't want the first element from the set, he wants an arbitrary element from the set. It has nothing to do with the order of the set. – Elazar Leibovich Sep 04 '11 at 12:29
  • The problem with this answer is that it misses the underlying question. There are situations where you are building up a (for want of a better term) bag of values and (a) want to only have unique values and (b) there is a semantic difference between the bag winding up with a single entry vs multiple entries. So you can check for the 'single entry' condition but now you have to use an iterator or some other construct to get at the (now known to be) unique entry. – Jay Feb 10 '12 at 16:47
  • @Jay: That's right, and for that case you can use a SortedSet as Jonik suggested. – Michael Myers Feb 10 '12 at 17:01
  • There are three "popular" implementations of Set in Java: 1) HashSet, 2) TreeSet, and 3) LinkedHashSet. Two of the three provide ordering capability, so I would say that ordering is a general property of sets. – HDave Aug 03 '12 at 00:42
  • 6
    @HDave: No, the fact that multiple implementations of a data structure share a property does not make it a property of the data structure itself. Two of the three commonly used implementations of List (ArrayList and Vector) are random-access, but that does not make random access a property of Lists. – Michael Myers Aug 05 '12 at 04:29
  • 1
    I'm in a situation where I use a `Set` because I want no duplicate elements in my collection. For this I cannot use a `List`. I still need to get "any" element out of this `Set`, in order to sample from it. Is the best option for me to simply copy all the `Set` elements over to a `List` before I use it? – Roger Dec 16 '13 at 08:19
  • @Roger: Yes, your options (if you don't want to iterate through the whole array each time) basically are converting to a List and converting to an array. If you want a single random element, then [this answer](http://stackoverflow.com/a/124693/13531) is better. – Michael Myers Dec 19 '13 at 15:39
  • 1
    "you should consider using a list instead." Not that simple if I need something without duplicates. – Vituel Mar 23 '14 at 14:16
  • 1
    I am so sorry, but which implementation provides that? All I can see is `TreeSet` and `ConcurrentSkipListSet`, neither of which has a `get` method. – Sarp Kaya Jul 08 '15 at 13:52
  • Why do you require an order for index-based access? This answer is no answer at all. E.g. Maps always return sets for their keySet and entrySet, and still there is no explicit reason not to be able to access those by index. – MushyPeas Feb 19 '16 at 09:36
  • You are talking about get(index) by index. What about a get(Object) by equality? – Kumar Manish Apr 04 '17 at 16:46
  • @KumarManish: I assume you're talking about objects whose .equals() method may return true even if the objects are not identical in every way (otherwise Set.contains() is all that you need). In that case, you may consider using a HashMap instead. – Michael Myers Apr 04 '17 at 17:55
  • "Because sets have no ordering" does not answer the question... A Set avoids duplicates, so why does it have to be unordered? – Lucke Jun 19 '19 at 10:57
77

Actually this is a recurring question when writing JavaEE applications which use Object-Relational Mapping (for example with Hibernate); and from all the people who replied here, Andreas Petersson is the only one who understood the real issue and offered the correct answer to it: Java is missing a UniqueList! (or you can also call it OrderedSet, or IndexedSet).

waxwing mentioned this use case (in which you need ordered AND unique data) and suggested the SortedSet, but this is not what Marty Pitt really needed.

This "IndexedSet" is NOT the same as a SortedSet - in a SortedSet the elements are sorted by using a Comparator (or using their "natural" ordering).

But instead it is closer to a LinkedHashSet (which others also suggested), or even more so to a (likewise nonexistent) "ArrayListSet", because it guarantees that the elements are returned in the same order as they were inserted.

But the LinkedHashSet is an implementation, not an interface! What is needed is an IndexedSet (or ListSet, or OrderedSet, or UniqueList) interface! This will allow the programmer to specify that he needs a collection of elements that have a specific order and without duplicates, and then instantiate it with any implementation (for example an implementation provided by Hibernate).

Since JDK is open-source, maybe this interface will be finally included in Java 7...
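To make the "IndexedSet"/"UniqueList" idea concrete, here is a rough sketch of one possible implementation. It is not a JDK or Hibernate type; the class name `UniqueListSketch` and its methods are hypothetical, and it only covers adding and positional reads.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class, not part of the JDK: keeps insertion order and rejects duplicates.
final class UniqueListSketch<E> {
    private final List<E> order = new ArrayList<>();  // gives positional access
    private final Set<E> members = new HashSet<>();   // gives a fast duplicate check

    // Appends the element unless an equal one is already present.
    boolean add(E e) {
        if (!members.add(e)) {
            return false;
        }
        order.add(e);
        return true;
    }

    // Returns the element at the given insertion position.
    E get(int index) {
        return order.get(index);
    }

    int size() {
        return order.size();
    }
}
```

The price of this design is that every element is referenced twice, once per backing collection.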

Sorin Postelnicu
  • 3
    Great answer as far as it goes, but what do we do in the meantime? – HDave Aug 03 '12 at 00:44
  • Sure it is. I used List for many-to-many and one-to-many ORM mappings in Hibernate before. I ran into trouble (or a defect): when a left join query involved more than 3 related entities, an exception was thrown. Look here for more details (http://jroller.com/eyallupu/entry/hibernate_exception_simultaneously_fetch_multiple). To work around this problem, using Set as the ORM mapping collection is necessary. But honestly, Set is not convenient to access in code, nor when you need an ordered collection. What we really need is an "IndexedSet" like Sorin Postelnicu said: SORT and UNIQUE – horaceman Oct 21 '12 at 14:14
  • 3
    Apache Commons Collections has [`ListOrderedSet`](https://commons.apache.org/proper/commons-collections/apidocs/org/apache/commons/collections4/set/ListOrderedSet.html) which is what the OP needed 7 years ago (and I needed today). – Paul Jun 08 '16 at 20:31
  • @Paul: That is indeed something which looks really good. Unfortunately it still has 3 drawbacks: 1) It is a class, not an interface. 2) It's not in the JDK. 3) It's not what Hibernate queries are returning. – Sorin Postelnicu Jun 10 '16 at 13:27
  • Yeah, but other than those 3 major drawbacks it's perfect! :) In retrospect I should have posted my comment to the question and not your answer - I keyed off `What is needed is an IndexedSet (or ListSet, or OrderedSet, or UniqueList)...` and ignored `...interface`. Sorry about that! – Paul Jun 10 '16 at 14:45
29

Just adding one point that was not mentioned in mmyers' answer.

If I know I want the first item, I can use set.iterator().next(), but otherwise it seems I have to cast to an Array to retrieve an item at a specific index?

What are the appropriate ways of retrieving data from a set? (other than using an iterator)

You should also familiarise yourself with the SortedSet interface (whose most common implementation is TreeSet).

A SortedSet is a Set (i.e. elements are unique) that is kept ordered by the natural ordering of the elements or using some Comparator. You can easily access the first and last items using first() and last() methods. A SortedSet comes in handy every once in a while, when you need to keep your collection both duplicate-free and ordered in a certain way.

Edit: If you need a Set whose elements are kept in insertion-order (much like a List), take a look at LinkedHashSet.
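As a small illustration of the two suggestions above, the sketch below uses `TreeSet.first()` for the sorted case and relies on `LinkedHashSet` iterating in insertion order for the other. The class and method names are made up for this example, and both helpers throw `NoSuchElementException` on an empty set.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

final class OrderedSetsDemo {
    // Smallest element according to the natural ordering.
    static <T extends Comparable<T>> T smallest(Set<T> values) {
        SortedSet<T> sorted = new TreeSet<>(values);
        return sorted.first();
    }

    // First element that was inserted; LinkedHashSet iterates in insertion order.
    static <T> T firstInserted(LinkedHashSet<T> values) {
        return values.iterator().next();
    }

    public static void main(String[] args) {
        LinkedHashSet<String> names =
                new LinkedHashSet<>(Arrays.asList("pear", "apple", "fig"));
        System.out.println(smallest(names));       // apple
        System.out.println(firstInserted(names));  // pear
    }
}
```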

Community
Jonik
25

This kind of leads to the question when you should use a set and when you should use a list. Usually, the advice goes:

  1. If you need ordered data, use a List
  2. If you need unique data, use a Set
  3. If you need both, use either: a SortedSet (for data ordered by comparator) or an OrderedSet/UniqueList (for data ordered by insertion). Unfortunately the Java API does not yet have OrderedSet/UniqueList.

A fourth case that appears often is that you need neither. In this case you see some programmers go with lists and some with sets. Personally I find it very harmful to see set as a list without ordering - because it is really a whole other beast. Unless you need stuff like set uniqueness or set equality, always favor lists.

Sorin Postelnicu
waxwing
17

I'm not sure if anybody has spelled it out exactly this way, but you need to understand the following:

There is no "first" element in a set.

Because, as others have said, sets have no ordering. A set is a mathematical concept that specifically does not include ordering.

Of course, your computer can't really keep a list of stuff that's not ordered in memory. It has to have some ordering. Internally it's an array or a linked list or something. But you don't really know what it is, and it doesn't really have a first element; the element that comes out "first" comes out that way by chance, and might not be first next time. Even if you took steps to "guarantee" a particular first element, it's still coming out by chance, because you just happened to get it right for one particular implementation of a Set; a different implementation might not work that way with what you did. And, in fact, you may not know the implementation you're using as well as you think you do.

People run into this ALL. THE. TIME. with RDBMS systems and don't understand. An RDBMS query returns a set of records. This is the same type of set from mathematics: an unordered collection of items, only in this case the items are records. An RDBMS query result has no guaranteed order at all unless you use the ORDER BY clause, but all the time people assume it does and then trip themselves up some day when the shape of their data or code changes slightly and triggers the query optimizer to work a different way and suddenly the results don't come out in the order they expect. These are typically the people who didn't pay attention in database class (or when reading the documentation or tutorials) when it was explained to them, up front, that query results do not have a guaranteed ordering.

skiphoppy
  • Heh, and of course the ordering usually changes right after the code goes into production, when it's too slow, so they add an index to speed up the query. Now the code runs fast, but gives the wrong answers. And nobody notices for three or four days...if you're lucky. If you're not lucky, nobody notices for a month... – TMN Apr 21 '09 at 00:36
  • I don't think he missed that (maybe he was sloppy with the notation). He doesn't want the first element from the set, he wants an arbitrary element from the set. You can give him an arbitrary element since `Set` is `Iterable`. – Elazar Leibovich Sep 04 '11 at 12:31
  • You are talking about get(index) by index. What about a get(Object) by equality? – Kumar Manish Apr 04 '17 at 16:45
10

Some data structures are missing from the standard Java collections:

Bag (like a set, but can contain elements multiple times)

UniqueList (ordered list, can contain each element only once)

It seems you would need a UniqueList in this case.

If you need flexible data structures, you might be interested in Google Collections.

Andreas Petersson
7

That's true: the elements of a Set are not ordered, by definition of the Set collection. So they can't be accessed by index.

But why don't we have a get(Object) method that takes, instead of an index, an object that is equal to the one we are looking for? That way, we could access the data of the element inside the Set just by knowing the attributes used by its equals method.
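Until such a method exists, one possible workaround is to key a `HashMap` on the element itself, so a lookup with an equal object returns the stored instance. The sketch below is only an illustration; `CanonicalPool`, `intern` and `get` are hypothetical names, not JDK API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: a set-like pool that can hand back the stored instance.
final class CanonicalPool<E> {
    private final Map<E, E> entries = new HashMap<>();

    // Stores the element unless an equal one is already present;
    // returns the instance that is held by the pool afterwards.
    E intern(E e) {
        E existing = entries.putIfAbsent(e, e);
        return existing != null ? existing : e;
    }

    // Returns the stored instance equal to the probe, or null if there is none.
    E get(E probe) {
        return entries.get(probe);
    }
}
```

As with any hash-based collection, this only behaves well if `equals` and `hashCode` are consistent and the fields they use do not change while the object is stored.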

walls
7

If you are going to do lots of random accesses by index in a set, you can get an array view of its elements:

Object[] arrayView = mySet.toArray();
//do whatever you need with arrayView[i]

There are two main drawbacks though:

  1. It's not memory efficient, as an array for the whole set needs to be created.
  2. If the set is modified, the view becomes obsolete.
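The second drawback follows from `toArray()` returning a copy rather than a live view. A minimal sketch of that behaviour, with a made-up class name and helper:

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class SnapshotDemo {
    // Copies the set into an array; later changes to the set are not reflected.
    static Object[] snapshot(Set<?> set) {
        return set.toArray();
    }

    public static void main(String[] args) {
        Set<String> set = new LinkedHashSet<>();
        set.add("a");
        Object[] copy = snapshot(set);
        set.add("b");
        System.out.println(copy.length);  // 1: the array still has the old contents
        System.out.println(set.size());   // 2
    }
}
```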
fortran
5

That is because Set only guarantees uniqueness, but says nothing about the optimal access or usage patterns. I.e., a Set can be backed by a List or a Map, each of which has very different retrieval characteristics.

jsight
5

The only reason I can think of for using a numerical index in a set would be for iteration. For that, use

for(A a : set) { 
   visit(a); 
}
Hugo
3

I ran into situations where I actually wanted a SortedSet with access via index (I concur with other posters that accessing an unsorted Set with an index makes no sense). An example would be a tree where I wanted the children to be sorted and duplicate children were not allowed.

I needed the access via index to display them and the set attributes came in handy to efficiently eliminate duplicates.

Finding no suitable collection in java.util or google collections, I found it straightforward to implement it myself. The basic idea is to wrap a SortedSet and create a List when access via index is required (and forget the list when the SortedSet is changed). This does of course only work efficiently when changing the wrapped SortedSet and accessing the list is separated in the lifetime of the Collection. Otherwise it behaves like a list which is sorted often, i.e. too slow.

With large numbers of children, this improved performance a lot over a list I kept sorted via Collections.sort.
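The original implementation is not shown, so the following is only a guess at its shape: a wrapper around a `TreeSet` that builds a list lazily for index access and drops it whenever the set changes. The class name and the restriction to naturally ordered elements are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical wrapper, assuming the elements have a natural ordering.
final class IndexedSortedSetSketch<E extends Comparable<E>> {
    private final SortedSet<E> sorted = new TreeSet<>();
    private List<E> indexCache;  // built lazily, dropped on every change

    boolean add(E e) {
        boolean changed = sorted.add(e);
        if (changed) {
            indexCache = null;  // forget the list when the set changes
        }
        return changed;
    }

    boolean remove(E e) {
        boolean changed = sorted.remove(e);
        if (changed) {
            indexCache = null;
        }
        return changed;
    }

    // O(n) on the first call after a change, O(1) on later calls.
    E get(int index) {
        if (indexCache == null) {
            indexCache = new ArrayList<>(sorted);
        }
        return indexCache.get(index);
    }

    int size() {
        return sorted.size();
    }
}
```

As the answer notes, this pays off only when bursts of modification and bursts of indexed reads are kept apart; alternating them rebuilds the list on every read.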

buchweizen
2

Please note only 2 basic data structure can be accessed via index.

  • An array can be accessed via index with O(1) time complexity for the get(int index) operation.
  • A linked list can also be accessed via index, but with O(n) time complexity for the get(int index) operation.

In Java, ArrayList is implemented using an array.

A Set, by contrast, is usually implemented with a hash table (HashMap) or a balanced tree, so that it can quickly detect whether an element exists and add elements that do not. A well-implemented hash-based Set achieves an expected O(1) contains operation. In Java, HashSet is the most commonly used implementation of Set; it is implemented by calling the HashMap API, and HashMap is implemented using separate chaining with linked lists (a combination of an array and linked lists).

Since a Set can be implemented with different data structures, there is no get(int index) method for it.

coderz
  • Finger trees (See Haskell's `Data.Sequence.lookup` function) also allow accessing via index (`O(1)` near the ends `O(log n)` near the middle, more accurately `O(min(log(k), log(n-k)))`), also binary trees do as well (See Haskell's `Data.Set.lookupIndex` function). So your initial assertion that "Please note only 2 basic data structure can be accessed via index" is not correct. – semicolon Dec 19 '16 at 08:36
1

The reason why the Set interface doesn't have a get-by-index call, or even something more basic such as first() or last(), is that it is an ambiguous operation, and therefore a potentially dangerous one. If a method returns a Set and you call, say, a first() method on it, what is the expected result, given that a generic Set makes no guarantees about ordering? The resulting object could very well vary between calls of the method, or it might not and lull you into a false sense of security, until the library you're using changes the implementation underneath and you find that all your code breaks for no apparent reason.

The suggestions about workarounds listed here are good. If you need indexed access, use a list. Be careful with using iterators or toArray with a generic Set, because a) there is no guarantee on the ordering and b) there is no guarantee that the ordering will not change with subsequent invocations or with different underlying implementations. If you need something in between, a SortedSet or a LinkedHashSet is what you want.

// I do wish the Set interface had a get-random-element though.

Dan
1

java.util.Set is a collection of unordered items. A get(int index) method on Set wouldn't make any sense, because a Set doesn't have an index, so you could only guess which value you would get.

If you really want this, write a method that gets a random element from the Set.
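Such a method could look roughly like the sketch below, which walks the iterator a random number of steps. The class and method names are made up for illustration, and each call costs O(n).

```java
import java.util.Random;
import java.util.Set;

final class RandomPick {
    // Skips a random number of elements and returns the next one.
    static <T> T randomElement(Set<T> set, Random random) {
        if (set.isEmpty()) {
            throw new IllegalArgumentException("set is empty");
        }
        int skip = random.nextInt(set.size());
        for (T element : set) {
            if (skip-- == 0) {
                return element;
            }
        }
        // Only reachable if the set shrank while we were iterating.
        throw new IllegalStateException("set changed during iteration");
    }
}
```

If many samples are needed from a set that does not change, copying it into a list once and indexing into that list is likely cheaper.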

Dil.
0

If you don't mind the set being sorted, then you may be interested in taking a look at the indexed-tree-map project.

The enhanced TreeSet/TreeMap provides access to elements by index or getting the index of an element. And the implementation is based on updating node weights in the RB tree. So no iteration or backing up by a list here.

0

Set is an interface, and some of its implementation classes are HashSet, TreeSet and LinkedHashSet. HashSet uses a HashMap under the hood to store values (TreeSet is backed by a TreeMap instead). Because HashMap does not preserve order, it is not possible to get a value by index.

You must now be wondering how a Set can use a HashMap, since HashMap stores key-value pairs but the Set does not. Valid question. When you add an element to a HashSet, internally it maintains a HashMap where the key is the element you want to add and the value is a dummy constant. Below is the internal implementation of the add function. Hence, all the keys in the HashMap will have the same constant value.

// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();

public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}
magnonyms
  • *All `Set`s implementations are using `HashMap` under the hood to store values* can you substantiate that claim for `TreeSet`? – greybeard Mar 21 '20 at 05:27
  • 1
    `the keys in the HashMap will have the same constant value` *the keys in the `HashMap` will* map to *one and the same immutable `Object`* – greybeard Mar 21 '20 at 05:33
  • @greybeard http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/8ed8e2b4b90e/src/share/classes/java/util/TreeSet.java#l254 – magnonyms Mar 21 '20 at 05:44
0

You can do new ArrayList<T>(set).get(index)
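Wrapped in helpers, the one-liner above might look like the following sketch. `SetAccess`, `elementAt` and `only` are hypothetical names; `only` targets the scenario from the question, where a result is asserted to contain exactly one item.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Set;

final class SetAccess {
    // Copies the set into a list and reads one position; O(n) time and memory per call.
    // Which element sits at a given index depends on the set's iteration order.
    static <T> T elementAt(Set<T> set, int index) {
        return new ArrayList<T>(set).get(index);
    }

    // Returns the single element of a one-element collection, or fails loudly.
    static <T> T only(Collection<T> items) {
        if (items.size() != 1) {
            throw new IllegalStateException(
                    "expected exactly 1 element but found " + items.size());
        }
        return items.iterator().next();
    }
}
```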

Janus Troelsen
  • This returns a List of Sets and get(index) returns a Set. Rather, I used: `new ArrayList(t).get(0)` I think there is valid opposition to the idea of getting a particular element from a Set by an index. But it would be nice if Set had an only() member function that, for Sets of size 1, provided easy access to the only element in the Set. This would save the aforementioned `new ArrayList` or `for (Foo foo : foos) { return foo; }` – Doug Moscrop Apr 12 '12 at 15:48
-3

To get an element from a Set, I use the following:

import java.util.Set;
import java.util.TreeSet;

// Returns the instance stored in the set that equals the given element,
// or null if the set contains no such element.
public static <T> T getElement(Set<T> set, T element) {
    if (set instanceof TreeSet<?>) {
        // floor() finds the greatest element <= the argument without a full scan
        T floor = ((TreeSet<T>) set).floor(element);
        return (floor != null && floor.equals(element)) ? floor : null;
    }
    // Any other Set: fall back to a linear scan
    for (T current : set) {
        if (current.equals(element)) {
            return current;
        }
    }
    return null;
}
lala
  • the function isn't what the question asked for. we need the index, not the value. what is your function doing anyway? looks like it just returns the element if it was equal to a element within. what does this do that contains() doesn't? – Janus Troelsen Feb 02 '12 at 22:52
  • Where is the `T` defined? Why `if (true)`? – quantum Nov 16 '12 at 21:24