Sunday, 17 June 2012

Storing two integers in one integer: Encoding and decoding

Intro:

 

Let's start with what a basic understanding of how integers are stored in the computer ....

Whatever programming language you use, in almost all cases,  there is a range of the value you can store in an integer variable. But, in practical cases, this range is quite high and it is not reached with optimized calculations.
And the most interesting fact is that, irrespective of the value of the integer, it is internally converted into a binary number of a fixed length (32 or 64, depending on the system and mechanism of conversion). In case of relatively small integers, their binary equivalents are much smaller in length than that fixed length and hence, these binary equivalents are filled with zeroes in the left to make them of that fixed length (filling with zeroes is done to keep the decimal equivalent  same). Now this binary number, always being of a fixed length, consumes the same amount of memory space.

This is where the process described in this post will come to use...

This process takes benefit of the fact (described earlier) that small intgers are also converted into a binary of larger length (than optimally required to store it). We will try to convert two integers into one bigger integer which retains the both numbers in the order supplied (referred to as encoding, I really like that term :D) and later recover both the integers from that bigger integer in the order supplied (no prizes for guessing.. referred to as decoding).

Just for those interested, the approach I describe in the later section, was originally inspired by the use of the escape character, \  in JavaScript.

 

Encoding logic:


Note: I will be using zero as the "separating" number in this example and referring to the two original numbers as numb1 and numb2.

So, the rules for encoding goes below:

1. Whenever one of the digits in numb1 or numb2 is zero, one extra zero is added before (or after, it doesn't really matter) that original zero. But, what is important is that this newly added zeroes should be just left as they are and should NOT be subjected to the previous part of this rule, as that will cause an infinite loop.
Also, this step should be done only once for the whole number.
In short, whenever there is a zero in the original number, they become a pair of zeroes in the modified number.
For e.g. after applying this rule on 1203, it becomes 12003 and similarly, applying on 1002 makes it becomes100002.

2. To get the final encoded number, the modified first number is to written side by side to the of the modified second number and a single zero is to be inserted between those two. (The whole thing can be done more easily in the form of a string but remember, no spaces anywhere.)
For e.g. if numb1=202 and numb2=0, the encoded number becomes 2002000. (The bold zero is the single zero inserted between the two modified numbers 2002 and 00).
Similarly, if numb1=120 and numb2=5, the encoded number is 120005.

 This is how we get the encoded number and to be true, this final encoded number makes little sense to anyone. So,we need to decode it now to get back those original integers.

Decoding logic:

 

So, you think it'll be quite easy right?

Well, I can't disagree more.

Let's deal with the most easy case first and the harder next:

1. In the encoded number, there are no 3 successive zeroes and there is one and only one single zero.

A:  This is actually the case where numb1 and/or numb2 contains non-zero digit in the terminal position (the front terminal position doesn't count because a number can't start with zero. Even if you write 012, it is considered as 12 by the compiler).

This case can be decoded by the following logic:
  • 1.1 The encoded number is separated into two separate numbers at the position where a single zero is found, such that this zero is not considered within any of the separated numbers. The respective position (which number was in the front and which was in the back in the encoded number) of these two separated numbers must be remembered for the later step.
  • 1.2 Now, with both of these separated numbers separately, whenever there are two zeroes in a number side-by-side, one of them is removed. This process will leave us with the two original numbers.
  • 1.3 To get the original numbers in the order specified, the logic is that, whichever number was in the front in the encoded number leads to numb1 (after the previous two steps) and the other one is numb2.
  • Example: If the encoded number was 12001023, then the numb1 and numb2 decoded in this process should be respectively 1201 and 23.
2. There is no single zero in the encoded number or there are more than 2 successive zeroes.
A: This is actually the case where numb1 and/or numb2 contains zero in the terminal position (again, only the last position i.e. digit's position counts). For eg., numb1=20 and numb2=0.
So, in this case, numb1  and/or numb2 can be equal to zero.

Logic for decoding:
  • 2.1 When there are more than two successive zeroes in the encoded number, moving from left to right, the first two are considered as a pair and one of them is removed and this process continues for the rest of the digits of the number. 
  •     2.1.1 If a single zero is left out after completing this process and it is not in the terminal position, then apply logic 1.1, 1.2 and 1.3 to get numb1 and numb2.
  •     2.1.2 If a single zero is left out after completing this process (logic 2.1) and it is in the terminal position, then numb2=0 and numb1 is equal to this processed number.
  •     2.1.3 If no zero is left out after this process (logic 2.1), then this processed number is equal to numb2 and numb1 is equal to zero.
  • 2.2 When there are no zeroes, then numb1=0 and numb2=encoded number.
 

Conclusion:

 

That is pretty much the end of encoding-decoding logic, but there are some more things to say. 

First of all it's better to use the number with the least number of total appearance in numb1 and numb2 as the "separating" digit instead of zero.

And secondly, this whole process will work fine only when you are working with relatively smaller numbers, because else the encoded number may get bigger than the highest possible value that can be stored successfully in integer variables.

And thirdly, as you might have guessed, I have made a small app for this with Gamemaker : GML link and Exe link

So, till the next time, good bye.

4 comments:

  1. It seems that everybody is into this kind of stuff lately. Don’t really understand it though but thanks for trying to explain it. Appreciate you shedding light into this matter. Keep up your work.
    oklahoma city roofing

    ReplyDelete
  2. Thanks for your comment, but I am not really sure what you mean by "everyone".
    Can you mention where you have found similar stuff?

    ReplyDelete
  3. The comment by that guy was most likely generated by some spam generator. Those spam generator are getting smarter by day(using nlp probably), and are successfull too, I checked out his oklahoma city roofing link before coming to this conclusion, so clearly it worked on me. Anyways nice post though :)

    ReplyDelete