How can I find the "least common part" among the paths on the same logical drive?_问答_开发者

My program has several paths to be watched,开发者_如何转开发 such as

C:\XML

C:\MyProg\Raw

C:\MyProg\Subset\MTSAT

C:\MyProg\Subset\GOESW

D:\Dataset\Composite

D:\Dataset\Global

E:\Dataset\Mosaic

I want to add 4 paths, namely C:\XML, C:\MyProg, D:\Dataset and E:\Dataset, to my CFolderWatch class instance for the purpose of folder watching insetad of all 7 above-mentioned paths as long as its "Include Subdirectory" switch is set to TRUE. Suppose all paths of being watched have been added to a vector container.

Therefore, my question is: How can I find the "least common part" among the paths on the same logical drive? Thank you in advance!

Detailed explanation to my question: 1. I got some user-defined directories. 2. I want to these directories to be watched. 3. Before watching, I want to do some preparatory job, for example, find the common part among the path(s) on the same logical drive to avoid possibly adding so many paths to my watching class. For instance, if there are 3 paths on the logical drive C: as follows: C:\test\Data\R1, C:\test\Data\R2, and C:\test\Data\R3, the common path is "C:\test\Data". So, we should add "C:\test\Data" to the watching module, not the three paths. What I mean the common path here is that it has at least one level of directory. If one path has no common path with the others, just returns unchanged. 4. First thing first, the algorithm should handle different logical drive(s). That is to say, all paths must be classified on the basis of their respective drive letter. Then, find the common path among the passed paths on the same logical drive letter.

There's no algorithm; you're using inconsistent logic.

Consider just the set

C:\XML
C:\MyProg\Raw
C:\MyProg\Subset\MTSAT
C:\MyProg\Subset\GOESW

There are at least 4 different solutions:

C:\
C:\XML and C:\MyProg\
C:\XML , C:\MyProg\Raw and C:\MyProg\Subset
C:\XML , C:\MyProg\Raw, C:\MyProg\Subset\MTSAT and C:\MyProg\Subset\GOESW

You fail to explain why 2. is the correct solution. It is possible to write an algorithm that finds solution N-1 given solution N as input, so you can get from the input (4) to (1) in three steps. But we don't understand why we should stop at (2).

A two-step algorithm:

Partition your directories by drive letter and first directory level.
For each partition, keep the longest prefix.

The longest prefix per partition can be easily obtained by counting how many prefix characters each directory has in common with its next sibling, and keeping the minimum. Using mismatch, for example.

As JB said I think this is a two step solution and I have tried some basic coding here in C#. If you want it in java I suppose you must be knowing how to use it :)

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;

namespace So_Try1
{
      class Program
{
    public static int nthOccurrence(String str, char c, int n)
    {
        int pos = str.IndexOf(c, 0);
        while (n-- > 0 && pos != -1)
            pos = str.IndexOf(c, pos + 1);
        return pos;
    }

    static void Main(string[] args)
    {
        List<String> pathString = new List<string>();
        pathString.Add("C:\\XML");
    pathString.Add("C:\\MyProg\\Raw");
    pathString.Add("C:\\MyProg\\Subset\\MTSAT");
    pathString.Add("C:\\MyProg\\Subset\\GOESW");
        pathString.Add("D:\\Folder2\\Mosaic");
        pathString.Add("D:\\Folder2\\Mosaic\\SubFolder");
        pathString.Add("D:\\Dataset\\Composite");
        pathString.Add("D:\\Dataset\\Global");
        pathString.Add("F:\\Folder1\\Mosaic");
        pathString.Add("H:\\Folder2\\Mosaic");
        pathString.Add("D:\\Folder2\\Mosaic");
        pathString.Add("D:\\Folder2\\Mosaic\\SubFolder");
        pathString.Add("E:\\Dataset\\Mosaic"); 
        Dictionary<String, int> PathDict = new Dictionary<string,int>();

        foreach (String str in pathString)
        {
            int count = 0;
            foreach (char c in str)
                if (c == '\\') count++;
            while (count > 0)
            {
                int index = nthOccurrence(str, '\\', count);
                String tempPath;
                if (index < 0)
                {
                    //Console.WriteLine(str);
                    tempPath = str;
                }
                else
                {
                    //Console.WriteLine(str.Substring(0,index));
                    tempPath = str.Substring(0, index);
                }

                if (PathDict.ContainsKey(tempPath))
                {
                    PathDict[tempPath]++;
                }
                else
                {
                    foreach (var keys in PathDict.Keys)
                    {
                        if (tempPath.IndexOf(keys) > 0)
                        {
                            PathDict[keys]++;
                        }
                    }
                    PathDict.Add(tempPath, 1);
                }
                count--;
            }
        }
        foreach(var keyValue in PathDict){
            if(keyValue.Value > 1)
                Console.WriteLine(keyValue.Key);
            /*Console.WriteLine(keyValue.Key + " - " + keyValue.Value);*/
        }               
    }
}
}

This is a very simple program that assumes that you have your paths in a string and checking for duplicates. If for a large no.of paths you must adopt a much more efficient solution.