Labels

Thursday 17 October 2013

Bit Shift and ... in java

source: http://rodrigosasaki.com/2013/06/06/bit-shift-and-bitwise-operators/



Bit Shift and Bitwise Operators

Hello, everyone! :)
Today we’re gonna talk a little about Java’s Bit Shift and Bitwise operators, that allow us to work directly with binary values, which means that we’re working directly with bits
What are these ‘bits’?
Bit is the abbreviation to (BInary digiT), which is the smallest unit where we can store information. And as it name says, this unit can only store 2 different values, which are and 1.

Bitwise

When working with these bits we can perform some operations that are called bitwise.
And for each of these operations we have a correspondent Java Operator. So I’ll explain the operation itself, describe how it works and then tell you the corresponding operator. Shall we?
NOT 
The first operation is the easiest to understand. It’s called NOT. This is an unary operator, which means that it only has one operand, so it’s result depends entirely on a single value.
What this operation does is invert (negate) the value of a bit, and it’s quite simple, if it’s a 0 the result will be 1, and if it is an 1 the result will be 0.
The Java operator that performs this operation is the ~
We can do a quick test, when we run this code:
public static void main(String[] args){
    System.out.println(Integer.toBinaryString(Integer.MAX_VALUE));
}
The printed value is: 01111111111111111111111111111111
The value you get may not have that left-hand zero, but it exists nonetheless and it’s quite important, we’ll see more about him later.
Now we can invert the printed value like this:
public static void main(String[] args){
    System.out.println(Integer.toBinaryString(~Integer.MAX_VALUE));
}
And now the printed value is:  10000000000000000000000000000000.
As we can see, all the 1s became 0s and the initial 0 became 1.

AND

This operation receives 2 values and return a result that will vary depending on these operands. We follow the basic rule:
The result will be 1 if, and only if, both operands are 1.
We can use the following table as reference:
Based on that we can see that if either of the operands are 0, the result will also be 0.
The Java operator that performs an AND is the & and we can see that in the following example:
public static void main(String[] args) {
    System.out.println(0 & 0); // Prints 0
    System.out.println(0 & 1); // Prints 0
    System.out.println(1 & 0); // Prints 0
    System.out.println(1 & 1); // Prints 1
}

OR

This operation as well as the AND and the XOR (that we’ll talk about soon) receives  2 values. What changes is the condition that says if the result is 1 or 0. In this case our rule is this:
The result will be 1 whenever either of the operands is 1.
So what we end up with is this:
We can see that in fact if either of the operands is 1 (which means the same is true if they both are), that will also be the result.
The Java operator responsible for performing an OR operation is the | and we can see him in action below:
public static void main(String[] args){
    System.out.println(0 | 0); // Prints 0
    System.out.println(0 | 1); // Prints 1
    System.out.println(1 | 0); // Prints 1
    System.out.println(1 | 1); // Prints 1
}

XOR

The XOR usually is one of the operations that cause most confusion, because it’s usage is not as intuitive as the ones explained before, even though it’s extensively used in criptography and some other algorithms. It’s basic rule is this:
The result will be 1 if, and only if, one of the operands is 1.
Pay a lot of attention to the sentence above, it basically says that it will only reaturn a 1 when both operands have different values. When they both have the same value (even if both values are 1), the result will be a 0.

Check the cheatsheet:
And there’s also a Java operator for this operation, that is the ^
public static void main(String[] args){
    System.out.println(0 ^ 0); // Prints 0
    System.out.println(0 ^ 1); // Prints 1
    System.out.println(1 ^ 0); // Prints 1
    System.out.println(1 ^ 1); // Prints 0
}
You can play around with them and check all the different results, but remember that numbers in Java are not in binary form by default, so you may have to tinker with them a little to get the results you want.
Example: I want to know the result of an AND operation between the numbers 11010100 and 10101111, here’s the sample code for that particular operation, in version 6 of Java:
public static void main(String[] args){
    int n1 = Integer.valueOf("11010100", 2);
    int n2 = Integer.valueOf("10101111", 2);
    System.out.println(Integer.toBinaryString(n1 & n2));
}
For the Java 7 version we have a syntactic sugar that allows us to do this in a more direct way. We can say that a number is in base 2 by simply adding 0b ad a prefix to our binary pattern, Like this:
public static void main(String[] args){
    int n = 0b1010;
    System.out.println(n); // Prints 10
}
We can see that the code prints 10, because the binary number 1010 when showed in decimal format is in fact 10.
So the same example as before, rewritten for Java 7 is like this:
public static void main(String[] args){
    int n1 = 0b11010100;
    int n2 = 0b10101111;
    System.out.println(Integer.toBinaryString(n1 & n2));
}
The method Integer.toBinaryString still is necessary, because if we don’t use it, the value will be printed in base 10.
PS: If you’d like to read a bit more about number literals, you can read this.

Bit Shifting

Now that you already became a bitwise machine, it’s time we give you a new tool to play with. Let’s talk a little about Bit Shifting
If you think about the word Shift you’ll see that it’s a synonym to move, or dislocate. And that’s exactly what we’re about to do, dislocate bits. Fun, right?
In some moments of the explanation you may wonder: “But why am I going to want to shift a bit pattern? Where is this used in practice?”
Well, bit shifting is a very common technique applied into some more complex algorithms. As examples we have some mathematical operations (depending on your environment), CG, hash generation and so on.
To shift the bits the Java language offers urs 3 operators, and I’ll explain all of them here.
For all the operators we follow the same basic rule. The pattern to be shifted is given by the left-hand operand, and the number of bits that we’re gonna shift the pattern is given by the right-hand operand.
The first of these operators is the << that we can intuitively assume that it’ll be used to shift bits to the left, and that’s in fact what it does.
Let’s see an example:
You have the pattern 01101001 and you want to shift it one bit to the left and then see the result, so you just do it like this:
public static void main(String[] args){
    System.out.println(Integer.toBinaryString(0b01101001 << 1));
}
And when you print it you see the result 11010010, and you can see that it’s the same pattern, only shifted 1 bit to the left.
    01101001
 <<        1
    11010010
As you may have noticed, in case you want to shift it 3 bits, just send 3 as the second operand.
     01101001
<<          3
   1101001000
After that we have the operators that shift bits to the right, and there are 2 of those (>> and >>>), but to understand the difference between them, we need to understand some basics of how numbers work internally.
The problem here is related to the sign of the number, whether it’s positive or negative. To represent a negative number in base 2, we have a control bit, that is the bit that stays most to the left (remember the left bit I said was important?), so if a binary number starts with 1, we’re talking about a negative number, if it starts with a 0, we’re talking about a positive one.
An int in Java ranges from the pattern:
10000000000000000000000000000000 (in decimal it's the number -2147483648)
To the pattern:
01111111111111111111111111111111 (in decimal it's the number 2147483647)
We can see that the highest number that can be represented by an int starts with a zero, which we know now that this specific zero tells us it’s a positive number.
And the lowest number that can be represented by an int starts with a 1 followed by zeros.
So let’s get back to the operators. What’s the difference between them?
The first operator (>>) is used to shift a bit pattern to the right preserving the sign bit, which means that everything else gets shifted, but the control bit stays put. If it was a 0 it remains as 0, and if it was a 0 it remains as 1.

Now the second operator (>>>) does not preserve the sign bit, everything gets shifted, no matter where it is. And a 0 is put at the left-most bit of the pattern.
Check out the examples:
    10000000000000000000000000000111
>>                                 2
    11100000000000000000000000000001

    10000000000000000000000000000111
>>>                                2
    00100000000000000000000000000001
On the first operator (>>) we can see that the 1 is preserved at the left, whilst in the second that value is shifted.
Now that’s play with it a little. In a forum I saw an user that wanted to store the information of a Date (day, month year), in a 3-byte array. Now how would he do that?
With a 3-byte array we have 24 bits to work with
00000000 00000000 00000000
  byte1    byte2    byte3
And the information will be divided like this:
  • The first 5 bits of byte1 will store the day
  • The last 3 bits of byte1 and the first bit of byte2 will store the month
  • The rest of the bits will store the year
 day month     year
00000000 00000000 00000000
  byte1    byte2    byte3
All of this takes a little bit of Bit Shifting and Bitwise operations, but as a final result, we get a class like this:
public class Date{
    byte[] d;
    public Date(){
        d = new byte[3];
    }
    public Data(int day, int month, int year){
        this();
        setDay(day);
        setMonth(month);
        setYear(year);
    }
    public void setDay(int day){
        d[0] = (byte) (day << 3);
    }
    public int getDay(){
        return d[0] >>> 3 & 0x1F;
    }
    public void setMonth(int month){
        d[0] = (byte) (d[0] | month >>> 1);
        d[1] = (byte) (month << 7);
    }
    public int getMonth(){
        return (d[0] & 0x7) << 1 | d[1] >>> 7 & 0x1;
    }
    public void setYear(int year){
        d[1] = (byte) (d[1] | year >>> 8);
        d[2] = (byte) year;
    }
    public int getYear(){
        return (d[1] << 8 & 0x7FFF) | d[2] & 0xFF;
    }
}
Now I recommend you all play a little with this class, run tests to try to understand what happens and why it happens, so the concepts can really sink in. 
That’s all for today. See you next time :)

No comments:

Post a Comment