How to tell if two arrays are permutations of each other (without the ability to sort them)

Go To StackoverFlow.com

1

If I have two different arrays and all I can do is check whether two elements in the arrays are equal (in other words, there is no comparison function (beyond equals) for the elements to sort them), is there any efficient way to check whether one array is a permutation of the other?

2012-04-04 01:43
by Tim
Would a O ( n ^2) algorithm suffice - Words Like Jared 2012-04-04 01:46
What's the requirement for the space complexity of the solution - Kiril 2012-04-04 01:48
@WordsLikeJared: Yes, that should be good. I can't really see this having a linear-time solution - Tim 2012-04-04 01:50
Well other things like n log(n) may be possible, though - Words Like Jared 2012-04-04 01:57
@WordsLikeJared: True - Tim 2012-04-04 02:14
Wait, so you're happy with an O(n^2) solution, but not with sorting? Why - Nick Johnson 2012-04-05 06:07
@NickJohnson: because the elements in my array don't necessarily have anything they can be compared by (except for equality). it's not necessarily going to be numbers, for example - Tim 2012-04-05 17:06
@Tim Fair enough. In that case, the hashtable solution should work fine, right - Nick Johnson 2012-04-06 00:15


4

Words Like Jared's brute force solution should work, but it is O(n^2).

If the elements are hashable, you can achieve O(n).

def isPermutation(A, B):
    """
    Computes if A and B are permutations of each other.
    This implementation correctly handles duplicate elements.
    """
    # make sure the lists are of equal length
    if len(A) != len(B):
        return False

    # keep track of how many times each element occurs.
    counts = {}
    for a in A:
        if a in counts: counts[a] = counts[a] + 1
        else: counts[a] = 1

    # if some element in B occurs too many times, not a permutation
    for b in B:
        if b in counts:
            if counts[b] == 0: return False
            else: counts[b] = counts[b] - 1
        else: return False

    # None of the elements in B were found too many times, and the lists are
    # the same length, they are a permutation
    return True

Depending on how the dictionary is implemented (as a hashset vs a treeset), this will take either O(n) for hashset or O(n log n) for treeset.

2012-04-04 02:13
by Nicu Stiurca
Search in a hash is not O(1), as far as I know. If that's the case, you have not gotten O(n) - Words Like Jared 2012-04-04 02:21
Well technically search is O(1+n/b) where n is the number of elements and b is the number of bins in the table. Most reasonable implementations of hash sets though choose b > n, so on average search is constant time because collisions are rare (given a good hash). Insertion is always O(1) though - Nicu Stiurca 2012-04-04 02:33
Well if this question allows for hashing, and if you specify average- and not worst-case, then I will agree that your algorithm runs, on average, O(n), but has not improved the theoretical worst-case of O(n^2), if Wiki is correct and hashes have a worst case of O(n) for search (given a O(n) space implementation). Now if you're willing to do the analysis of choosing a hash size function (# of bins) that has a space complexity >> O(n) (i.e. take into account initializing your hash memory into your algorithm), I'll read. I'm interested. : - Words Like Jared 2012-04-04 02:57
Related: <code>collections.Counter</code>. You're basically implementing its constructor along the way. : - Xavier Ho 2012-04-04 03:14
@Words Like Jared: Yes, worst case is still O(n^2), but you have to be very unlucky and/or have a very bad hash - Nicu Stiurca 2012-04-04 17:20
@XavierHo: Thanks, I always forget where to find Python's Counter functionality when I need it. Regardless, I think my solution illustrates the algorithm more clearly since this question is about an algorithm rather than specific implementations - Nicu Stiurca 2012-04-04 17:28
In practice, this solution is O(n) - there's really no point in worrying about pathological worst cases where every element hashes to the same value - Nick Johnson 2012-04-05 06:08


3

This implementation might be wrong, but the general idea should be correct. I am just starting python, so this may also be an unconventional or non-pythonic style.

def isPermutation(list1, list2):
    # make sure the lists are of equal length
    if (len(list1) is not len(list2)):
        return False

    # keep track of what we've used
    used = [False] * len(list1)

    for i in range(len(list1)):
        found = False

        for j in range(len(list1)):
            if (list1[i] is list2[j] and not used[j]):
                found = True
                used[j] = True
                break

        if (not found):
            return False

    return True
2012-04-04 01:56
by Words Like Jared
If it is python, its as simple as set(A)==set(B)Preet Kukreti 2012-04-04 01:58
I assume set() returns a mathematical set; i.e. no duplicates. I don't think OP said that there would always be no duplicates. That comparison you provided returning true would be a necessary, but insufficient condition. Furthermore, this was an algorithms question, not a code question. The code demonstrates the algorithm - Words Like Jared 2012-04-04 02:01
@Jared thanks for pointing that out. For some reason I just assumed the inputs would be unique - Preet Kukreti 2012-04-04 02:15


0

If hashing for your object type makes sense, you could use a temp hash set to insert all the items from array A. Then while iterating over array B, make sure each item is already in the temp hash set.

This should be faster ( O(n) ) than a naive nested O(n^2) loop. (Except for small or trivial data sets, where a simpler naive algo might outperform it)

Note that this will take O(n) extra memory, and that this approach will only work if you dont have duplicates (or dont want to count them as part of the comparison)

2012-04-04 01:49
by Preet Kukreti


0

Assuming the two arrays are equal length and an element could be appearing the arrays more than once, you could create another array of the same length of type boolean initialized to false.

Then iterate though one of the arrays and for each element check whether that element appears in the other array at a poistion where the corresponding boolean is false -- if it does, set the corresponding boolean to true. If all elements in the first array can be accounted for this way, the two arrays are equal, otherwise not (and you found at least one difference).

The memory requirement is O(n), the time complexity is O(n^2)

2012-04-04 02:02
by Attila


0

Because I can't comment yet (not enough rep), I'll just mention this here as a reply to one of the other answers: Python's set() doesn't work all that well with objects.

If you have objects that are hashable, this code should work for you:

def perm(a, b):
        dicta = {}; dictb = {}
    for i in a:
            if i in dicta: dicta[i] += 1
            else: dict[i] = 1
    for i in b:
        if i in dictb: dictb[i] += 1
        else: dict[i] = 1
    return dicta == dictb

Construct a hashmap of objects in a and the number of times they occur. For each element in b, if the element is not in the hashmap or the occurrences don't match, it is not a permutation. Otherwise, it is a permutation.

>>> perm([1,2,4], [1,4,2])
True
>>> perm([1,2,3,2,1], [1,2,1,2,3])
True
>>> perm([1,2,4], [1,2,2])
False
2012-04-04 02:10
by fwenom
Dictionaries are set-like structures. I think your solution presumes that there are no duplicates in the arrays - Words Like Jared 2012-04-04 02:13