Very hard sorting algorithm problem - O(n) time - Time complextiy_问答_开发者

Since the problem is long i can not describe it开发者_运维技巧 at title.

Imagine that we have 2 unsorted integer arrays. Both array lenght is n and they are containing interegers between 0 - n^765 (n power 765 maximum) .

I want to compare both arrays and find out whether they contain any same integer value or not with in O(n) time complexity.

no duplicates are possible in the same array

Any help and idea is appreciated.

What you want is impossible. Each element will be stored in up to log(n^765) bits, which is O(log n). So simply reading the contents of both arrays will take O(n*logn).

If you have a constant upper bound on the value of each element, You can solve this in O(n) average time by storing the elements of one array in a hash table, and then checking if the elements of the other array are contained in it.

Edit:

The solution you may be looking for is to use radix sort to sort your data, after which you can easily check for duplicate elements. You would look at your numbers in base n, and do 765 passes over your data. Each pass would use a bucket sort or counting sort to sort by a single digit (in base n). This process would take O(n) time in the worst case (assuming a constant upper bound on element size). Note that I doubt anyone would ever choose this over a hash table in practice.

By assuming multiplication and division is O(1):

Think about numbers, you can write them as:

Number(i) = A0 * n^765 + A1 * n^764 + .... + A764 * n + A765.

for coding number to this format, you should just do Number / n^i, Number % n^i, if you precompute, n^1, n^2, n^3, ... it can be done in O(n * 765)=> O(n) for all numbers. precomputation of n^i, can be done in O(i) since i at most is 765 it's O(1) for all items.

Now you can write Numbers(i) as array: Nembers(i) = (A0, A1, ..., A765) and know you can radix sort items :

first compare all A765, then ...., All of Ai's are in the range 0..n so for comparing Ai's you can use Counting sort (Counting sort is O(n)), so your radix sort is O(n * 765) which is O(n).

After radix sort you have two sorted array and you can simply find one similar item in O(n) or use merge algorithm (like merge sort) to find most possible similarity (not just one).

for generalization if the size of input items is O(n^C) it can be sorted in O(n) (C is fix number). but because the overhead of this way of sortings are big, prefer to using quicksort and similar algorithms. Simple sample of this question can be found in Introduction to Algorithm book, which asks if the numbers are in range (0..n^2) how to sort them in O(n).

Edit: for clarifying how you can find similar items in 2-sorted lists:

You have 2 sorted list, for example in merge sort how do you can merge two sorted list to one list? you will move from start of list 1, and list 2, and move your head pointer of list1 while head(list(1)) > head(list(2)), and after that do this for list2 and ..., so if there is a similar item your algorithm will stop (before reach the end of lists), or in the end of two lists your algorithm will stop.

it's as easy as bellow:

public int FindSimilarityInSortedLists(List<int> list1, List<int> list2)
{
   int i = 0;
   int j = 0;


   while (i < list1.Count && j < list2.Count)
   {
      if (list1[i] == list2[j])
         return list1[i];

      if (list1[i] < list2[j])
         i++;
      else
         j++;    
   }
   return -1; // not found
}

If memory was unlimited you could simply create a hashtable with the integers as keys and the values the number of times they are found. Then to do your "fast" look up you simple query for an integer, discover if its contained within the hash table, and if found check that the value is 1 or 2. That would take O(n) to load and O(1) to query.

I do not think you can do it O(n). You should check n values whether they are in the other array. This means you have n comparing operations at least if the other array has just 1 element. But as you have n element it the other array as well, you can do it just O(n*n)