开发者

How to do run length encoding?

开发者 https://www.devze.com 2022-12-22 13:12 出处:网络
I have a long string for example it could be \"aaaaaabbccc\". Need to represent it as \"a6b2c3\". What\'s the best way to do this? I could do this in linear time by comparing characters and incrementi

I have a long string for example it could be "aaaaaabbccc". Need to represent it as "a6b2c3". What's the best way to do this? I could do this in linear time by comparing characters and incrementing counts and then replacing the counts in the array, using two indexes in one pass. Can you guys think of a better way than t开发者_运维百科his? Are any of the encoding techniques going to work here?


The common solution for this is RLE - Run-length encoding, the Wikipedia article has sample implementation code.


I don't think there is a faster way to solve it.

Informally you can think that a sub-linear complexity implies to do less comparisons that the number of the characters in the string you want to compress. But with a number of comparisons such small you can't be sure of some character, you can't know what they contain because you don't have enough information.. this means that you can't obtain a loseless compression.


I think you're asking, "Is there a better than linear way to do run-length encoding"? If so, the answer is no.


I have a implemented an encoding for bytes though. Hope it helps.

 public byte[] Encode(byte[] original)
            {
                // TODO: Write your encoder here
                if (original==null || original.Count() == 0) // Check for invalid inputs
                    return new byte[0];

                var encodedBytes = new List<byte>();         // Byte list to be returned
                byte run = 0x01;

                for (int i = 1; i < original.Length; i++)
                {
                    if (original[i] == original[i - 1])     // Keep counting the occurences till this condition is true
                        run++;
                    else                                    // Once false,  
                    {
                        encodedBytes.Add(run);              // add the total occurences followed by the 
                        encodedBytes.Add(original[i - 1]);  // actual element to the Byte List 
                        run = 0x01;                         // Reset the Occurence Counter  
                    }
                    if (i == original.Length - 1)          
                    {
                        encodedBytes.Add(run);
                        encodedBytes.Add(original[i]);
                    }
                }

               return  encodedBytes.Count()==0 ? new byte[0] : encodedBytes.ToArray<byte>();
            }

var a = new byte[]{0x01, 0x02, 0x03, 0x04};
var b = new byte[]{0x01, 0x01, 0x01, 0x02, 0x01, 0x03, 0x01, 0x04};
var EncodedA =  Encode(a);
var isAEqualB = EncodedA.SequenceEqual(b); should return true
0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号