My Technical Blog: May 2010

Tuesday, May 18, 2010

The Java IAQ: Infrequently Answered Questions

Q: What is an Infrequently Answered Question?

A question is infrequently answered either because few people know the answer or because it is about an obscure, subtle point (but a point that may be crucial to you). I thought I had invented the term, but it also shows up at the very informative About.com Urban Legends site. There are lots of Java FAQs around, but this is the only Java IAQ. (There are a few Infrequently Asked Questions lists, including a satirical one on C.)

Q: The code in a `finally` clause will never fail to execute, right?

Well, hardly ever. But here's an example where the finally code will not execute, regardless of the value of the boolean choice:

try {
  if (choice) {
    while (true) ;
  } else {
    System.exit(1);
  }
} finally {
  code.to.cleanup();
}

Q: Within a method `m` in a class `C`, isn't `this.getClass()` always `C`?

No. It's possible that for some object x that is an instance of some subclass C1 of C, either there is no C1.m() method, or some method on x called super.m(). In either case, this.getClass() is C1, not C, within the body of C.m(). If C is final, then you're ok.
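Both cases can be seen in a minimal sketch. The class names Base, Derived, and Derived2 and the method whoAmI are hypothetical stand-ins for C, C1, and m:

```java
// Hypothetical names chosen for illustration; Base plays the role of C.
class Base {
    // Reports the run-time class of the receiver, which need not be Base.
    String whoAmI() {
        return this.getClass().getSimpleName();
    }
}

// Case 1: the subclass does not define whoAmI(), so Base.whoAmI() runs.
class Derived extends Base {
}

// Case 2: the subclass overrides whoAmI() and forwards to super.
class Derived2 extends Base {
    @Override
    String whoAmI() {
        return "via super: " + super.whoAmI();
    }
}

class GetClassDemo {
    public static void main(String[] args) {
        System.out.println(new Base().whoAmI());
        System.out.println(new Derived().whoAmI());
        System.out.println(new Derived2().whoAmI());
    }
}
```

In both subclass cases the body of Base.whoAmI() is executing, yet this.getClass() names the subclass.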

Q: I defined an `equals` method, but `Hashtable` ignores it. Why?

equals methods are surprisingly hard to get right. Here are the places to look first for a problem:

You defined the wrong equals method. For example, you wrote:

public class C {
  public boolean equals(C that) { return id(this) == id(that); }
}

But in order for table.get(c) to work you need to make the equals method take an Object as the argument, not a C:

public class C {
  public boolean equals(Object that) { 
    return (that instanceof C) && id(this) == id((C)that); 
  } 
}

Why? The code for Hashtable.get looks something like this:

public class Hashtable {
  public Object get(Object key) {
    Object entry;
    ...
    if (entry.equals(key)) ...
  }
}

Now the method invoked by entry.equals(key) depends upon the actual run-time type of the object referenced by entry, and the declared, compile-time type of the variable key. So when you as a user call table.get(new C(...)), this looks in class C for the equals method with argument of type Object. If you happen to have defined an equals method with argument of type C, that's irrelevant. It ignores that method, and looks for a method with signature equals(Object), eventually finding Object.equals(Object). If you want to override a method, you need to match argument types exactly. In some cases, you may want to have two methods, so that you don't pay the overhead of casting when you know you have an object of the right class:

public class C {
  public boolean equals(Object that) {
    return (this == that) 
            || ((that instanceof C) && this.equals((C)that)); 
  }

  public boolean equals(C that) { 
    return id(this) == id(that); // Or whatever is appropriate for class C
  } 
}
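To watch the overloading pitfall happen, here is a small sketch. The class names OverloadedKey and OverriddenKey are made up for the example, and it uses a generic Hashtable where the code above uses the raw type. Both classes define hashCode so that the only difference between them is the signature of equals:

```java
import java.util.Hashtable;

// Hypothetical key class that only OVERLOADS equals (parameter type is not Object).
class OverloadedKey {
    final int id;
    OverloadedKey(int id) { this.id = id; }

    public boolean equals(OverloadedKey that) {
        return that != null && this.id == that.id;
    }

    @Override public int hashCode() { return id; }
}

// Hypothetical key class that properly OVERRIDES equals(Object).
class OverriddenKey {
    final int id;
    OverriddenKey(int id) { this.id = id; }

    @Override public boolean equals(Object that) {
        return (that instanceof OverriddenKey) && this.id == ((OverriddenKey) that).id;
    }

    @Override public int hashCode() { return id; }
}

class EqualsDemo {
    // Looks up an equal-by-id but distinct key object; the table calls equals(Object).
    static String lookupOverloaded() {
        Hashtable<Object, String> table = new Hashtable<>();
        table.put(new OverloadedKey(1), "found");
        return table.get(new OverloadedKey(1));
    }

    static String lookupOverridden() {
        Hashtable<Object, String> table = new Hashtable<>();
        table.put(new OverriddenKey(1), "found");
        return table.get(new OverriddenKey(1));
    }

    public static void main(String[] args) {
        System.out.println("overloaded equals: " + lookupOverloaded());
        System.out.println("overridden equals: " + lookupOverridden());
    }
}
```

The first lookup comes back null because the table falls through to Object.equals(Object), which compares identity; the second finds the entry.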

You didn't properly implement equals as an equality predicate: equals must be symmetric, transitive, and reflexive. Symmetric means a.equals(b) must have the same value as b.equals(a). (This is the one most people mess up.) Transitive means that if a.equals(b) and b.equals(c) then a.equals(c) must be true. Reflexive means that a.equals(a) must be true, and is the reason for the (this == that) test above. (It's also often good practice to include this test for efficiency reasons: testing for == is faster than looking at all the slots of an object, and it partially breaks the recursion problem on objects that might have circular pointer chains.)
You forgot the hashCode method. Anytime you define an equals method, you should also define a hashCode method. You must make sure that two equal objects have the same hashCode, and if you want better hashtable performance, you should try to make most non-equal objects have different hashCodes. Some classes cache the hash code in a private slot of an object, so that it need be computed only once. If that is the case then you will probably save time in equals if you include a line that says if (this.hashSlot != that.hashSlot) return false.

You didn't handle inheritance properly. First of all, consider if two objects of different class can be equal. Before you say "NO! Of course not!" consider a class Rectangle with width and height fields, and a Box class, which has the above two fields plus depth. Is a Box with depth == 0 equal to the equivalent Rectangle? You might want to say yes. If you are dealing with a non-final class, then it is possible that your class might be subclassed, and you will want to be a good citizen with respect to your subclass. In particular, you will want to allow an extender of your class C to use your C.equals method using super as follows:

public class C2 extends C {

  int newField = 0;

  public boolean equals(Object that) {
    if (this == that) return true;
    else if (!(that instanceof C2)) return false;
    else return this.newField == ((C2)that).newField && super.equals(that);
  }

}

To allow this to work, you have to be careful about how you treat classes in your definition of C.equals. For example, check for that instanceof C rather than that.getClass() == C.class. See the previous IAQ question to learn why. Use this.getClass() == that.getClass() if you are sure that two objects must be of the same class to be considered equal.

You didn't handle circular references properly. Consider:

public class LinkedList {

  Object contents;
  LinkedList next = null;

  public boolean equals(Object that) {
    return (this == that) 
      || ((that instanceof LinkedList) && this.equals((LinkedList)that)); 
  }

  public boolean equals(LinkedList that) { // Buggy!
   return Util.equals(this.contents, that.contents) &&
          Util.equals(this.next, that.next); 
  }

}

Here I have assumed there is a Util class with:

public static boolean equals(Object x, Object y) {
    return (x == y) || (x != null && x.equals(y));
  }

I wish this method were in Object; without it you always have to throw in tests against null. Anyway, the LinkedList.equals method will never return if asked to compare two LinkedLists with circular references in them (a pointer from one element of the linked list back to another element). See the description of the Common Lisp function list-length for an explanation of how to handle this problem in linear time with only two words of extra storage. (I don't give the answer here in case you want to try to figure it out for yourself first.)

Q: I tried to forward a method to super, but it occasionally doesn't work. Why?

This is the code in question, simplified for this example:

/** A version of Hashtable that lets you do
 * table.put("dog", "canine");, and then have
 * table.get("dogs") return "canine". **/

public class HashtableWithPlurals extends Hashtable {

  /** Make the table map both key and key + "s" to value. **/
  public Object put(Object key, Object value) {
    super.put(key + "s", value);
    return super.put(key, value);
  }
}

You need to be careful when passing to super that you fully understand what the super method does. In this case, the contract for Hashtable.put is that it will record a mapping between the key and the value in the table. However, if the hashtable gets too full, then Hashtable.put will allocate a larger array for the table, copy all the old objects over, and then recursively re-call table.put(key, value). Now, because Java resolves methods based on the runtime type of the target, in our example this recursive call within the code for Hashtable will go to HashtableWithPlurals.put(key, value), and the net result is that occasionally (when the size of the table overflows at just the wrong time), you will get an entry for "dogss" as well as for "dogs" and "dog". Now, does it state anywhere in the documentation for put that doing this recursive call is a possibility? No. In cases like this, it sure helps to have source code access to the JDK.

Q: Why does my Properties object ignore the defaults when I do a `get`?

You shouldn't do a get on a Properties object; you should do a getProperty instead. Many people assume that the only difference is that getProperty has a declared return type of String, while get is declared to return an Object. But actually there is a bigger difference: getProperty looks at the defaults. get is inherited from Hashtable, and it ignores the default, thereby doing exactly what is documented in the Hashtable class, but probably not what you expect. Other methods that are inherited from Hashtable (like isEmpty and toString) will also ignore defaults. Example code:

Properties defaults = new Properties();
defaults.put("color", "black");

Properties props = new Properties(defaults);

System.out.println(props.get("color") + ", " +
                   props.getProperty("color"));
// This prints "null, black"

Is this justified by the documentation? Maybe. The documentation in Hashtable talks about entries in the table, and the behavior of Properties is consistent if you assume that defaults are not entries in the table. If for some reason you thought defaults were entries (as you might be led to believe by the behavior of getProperty) then you will be confused.
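The same split shows up in the other inherited methods. The sketch below is only an illustration; the class name PropertiesDemo and its helper withDefaults are made up. It checks a few Hashtable methods against getProperty, plus stringPropertyNames, which is a Properties method that does consult the defaults:

```java
import java.util.Properties;

class PropertiesDemo {
    // Builds an otherwise empty Properties whose defaults contain color=black.
    static Properties withDefaults() {
        Properties defaults = new Properties();
        defaults.setProperty("color", "black");
        return new Properties(defaults);
    }

    public static void main(String[] args) {
        Properties props = withDefaults();
        // Inherited from Hashtable: these see only the table's own entries.
        System.out.println("get:         " + props.get("color"));
        System.out.println("isEmpty:     " + props.isEmpty());
        System.out.println("containsKey: " + props.containsKey("color"));
        // Defined by Properties: these consult the defaults as well.
        System.out.println("getProperty: " + props.getProperty("color"));
        System.out.println("names:       " + props.stringPropertyNames());
    }
}
```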

Q: Inheritance seems error-prone. How can I guard against these errors?

The previous two questions show that a programmer needs to be very careful when extending a class, and sometimes just in using a class that extends another class. Problems like these two lead John Ousterhout to say "Implementation inheritance causes the same intertwining and brittleness that have been observed when goto statements are overused. As a result, OO systems often suffer from complexity and lack of reuse." (Scripting, IEEE Computer, March 1998) and Edsger Dijkstra to allegedly say "Object-oriented programming is an exceptionally bad idea which could only have originated in California." (from a collection of signature files). I don't think there's a general way to ensure being safe, but there are a few things to be aware of:

Extending a class that you don't have source code for is always risky; the documentation may be incomplete in ways you can't foresee.
Calling super tends to make these unforeseen problems jump out.
You need to pay as much attention to the methods that you don't over-ride as the methods that you do. This is one of the big fallacies of Object-Oriented design using inheritance. It is true that inheritance lets you write less code. But you still have to think about the code you don't write.
You're especially looking for trouble if the subclass changes the contract of any of the methods, or of the class as a whole. It is difficult to tell when a contract is changed, since contracts are informal (there is a formal part in the type signature, but the rest appears only in comments). In the Properties example, it is not clear if a contract is being broken, because it is not clear if the defaults are to be considered "entries" in the table or not.

Q: What are some alternatives to inheritance?

Delegation is an alternative to inheritance. Delegation means that you include an instance of another class as an instance variable, and forward messages to the instance. It is often safer than inheritance because it forces you to think about each message you forward, because the instance is of a known class, rather than a new class, and because it doesn't force you to accept all the methods of the super class: you can provide only the methods that really make sense. On the other hand, it makes you write more code, and it is harder to re-use (because it is not a subclass). For the HashtableWithPlurals example, delegation would give you this (note: as of JDK 1.2, Dictionary is considered obsolete; use Map instead):

/** A version of Hashtable that lets you do
 * table.put("dog", "canine");, and then have
 * table.get("dogs") return "canine". **/

public class HashtableWithPlurals extends Dictionary {

  Hashtable table = new Hashtable();

  /** Make the table map both key and key + "s" to value. **/
  public Object put(Object key, Object value) {
    table.put(key + "s", value);
    return table.put(key, value);
  }

  ... // Need to implement other methods as well
}
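For comparison, here is a complete, runnable sketch of the same delegation idea on top of java.util.Map. The class name PluralMap and its choice of exposing only put, get, and size are assumptions made for this example, not part of the code above:

```java
import java.util.HashMap;
import java.util.Map;

// Delegation sketch: wraps a map instead of extending one.
class PluralMap {
    private final Map<String, String> table = new HashMap<>();

    /** Map both key and key + "s" to value; returns the old value for key, if any. */
    String put(String key, String value) {
        table.put(key + "s", value);
        return table.put(key, value);
    }

    String get(String key) {
        return table.get(key);
    }

    int size() {
        return table.size();
    }
}

class PluralMapDemo {
    public static void main(String[] args) {
        PluralMap map = new PluralMap();
        map.put("dog", "canine");
        System.out.println(map.get("dogs"));   // the plural maps to the same value
        System.out.println(map.get("dogss"));  // no accidental double plural
    }
}
```

Because the inner map only ever sees calls made explicitly by PluralMap, nothing the map does internally while growing can re-enter PluralMap.put, so the "dogss" surprise cannot occur.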

The Properties example, if you wanted to enforce the interpretation that default values are entries, would be better done with delegation. Why was it done with inheritance, then? Because the Java implementation team was rushed, and took the course that required writing less code.

Q: Why are there no global variables in Java?

Global variables are considered bad form for a variety of reasons:

Adding state variables breaks referential transparency (you no longer can understand a statement or expression on its own: you need to understand it in the context of the settings of the global variables).
State variables lessen the cohesion of a program: you need to know more to understand how something works. A major point of Object-Oriented programming is to break up global state into more easily understood collections of local state.
When you add one variable, you limit the use of your program to one instance. What you thought was global, someone else might think of as local: they may want to run two copies of your program at once.

For these reasons, Java decided to ban global variables.

Q: I still miss global variables. What can I do instead?

That depends on what you want to do. In each case, you need to decide two things: how many copies of this so-called global variable do I need? And where would be a convenient place to put it? Here are some common solutions:

If you really want only one copy per each time a user invokes Java by starting up a Java virtual machine, then you probably want a static variable. For example, you have a MainWindow class in your application, and you want to count the number of windows that the user has opened, and initiate the "Really quit?" dialog when the user has closed the last one. For that, you want:

// One variable per class (per JVM)
public class MainWindow {
  static int numWindows = 0;
  ...
  // when opening: MainWindow.numWindows++;
  // when closing: MainWindow.numWindows--;
}

In many cases, you really want an instance variable. For example, suppose you wrote a web browser and wanted to have the history list as a global variable. In Java, it would make more sense to have the history list be an instance variable in the Browser class. Then a user could run two copies of the browser at once, in the same JVM, without having them step on each other.

// One variable per instance
public class Browser {
  HistoryList history = new HistoryList();
  ...
  // Make entries in this.history
}

Now suppose that you have completed the design and most of the implementation of your browser, and you discover that, deep down in the details of, say, the Cookie class, inside the Http class, you want to display an error message. But you don't know where to display the message. You could easily add an instance variable to the Browser class to hold the display stream or frame, but you haven't passed the current instance of the browser down into the methods in the Cookie class. You don't want to change the signatures of many methods to pass the browser along. You can't use a static variable, because there might be multiple browsers running. However, if you can guarantee that there will be only one browser running per thread (even if each browser may have multiple threads) then there is a good solution: store a table of thread-to-browser mappings as a static variable in the Browser class, and look up the right browser (and hence display) to use via the current thread:

// One "variable" per thread
public class Browser {
  static Hashtable browsers = new Hashtable();

  public Browser() { // Constructor
    browsers.put(Thread.currentThread(), this);
  }
  ...
  public void reportError(String message) {
    Thread t = Thread.currentThread();
    ((Browser)Browser.browsers.get(t))
      .show(message);
  }
}

Finally, suppose you want the value of a global variable to persist between invocations of the JVM, or to be shared among multiple JVMs in a network of machines. Then you probably should use a database which you access through JDBC, or you should serialize data and write it to a file.
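As a runnable illustration of the one-per-thread case, the sketch below swaps the hand-rolled thread-to-browser Hashtable for java.lang.ThreadLocal, which keeps one slot per thread. Note the simplifying assumption: only the thread that constructs a Browser is associated with it, so a browser with several threads would need more work. The Cookie.parse method and the BrowserDemo class are invented for the example:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class Browser {
    // One slot per thread; stands in for the static thread-to-browser table.
    private static final ThreadLocal<Browser> CURRENT = new ThreadLocal<>();

    private final String name;
    final List<String> shown = new ArrayList<>();

    Browser(String name) {
        this.name = name;
        CURRENT.set(this);  // associate this browser with the constructing thread
    }

    void show(String message) {
        shown.add(name + ": " + message);
    }

    // Callable from deep inside helper code that was never handed a Browser.
    static void reportError(String message) {
        Browser b = CURRENT.get();
        if (b != null) b.show(message);
    }
}

class Cookie {
    // Hypothetical helper: it has no Browser parameter, yet can report errors.
    static void parse(String raw) {
        if (raw.isEmpty()) Browser.reportError("empty cookie");
    }
}

class BrowserDemo {
    // Runs two browsers on two threads and returns every message shown, sorted.
    static List<String> run() {
        List<String> all = Collections.synchronizedList(new ArrayList<>());
        Thread[] threads = new Thread[2];
        for (int i = 0; i < threads.length; i++) {
            String name = "browser" + i;
            threads[i] = new Thread(() -> {
                Browser b = new Browser(name);
                Cookie.parse("");
                all.addAll(b.shown);
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        List<String> sorted = new ArrayList<>(all);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Each error lands in the browser that belongs to the reporting thread, without Cookie.parse ever taking a Browser argument.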

Q: Can I write `sin(x)` instead of `Math.sin(x)`?

Short answer: Before Java 1.5, no. As of Java 1.5, yes, using static imports; you can now write import static java.lang.Math.* and then use sin(x) with impunity. But note the warning from Sun: "So when should you use static import? Very sparingly!"

Here are some of the options that could be used before Java 1.5:

If you only want a few methods, you can put in calls to them within your own class:

public static double sin(double x) { return Math.sin(x); }
public static double cos(double x) { return Math.cos(x); }
...
sin(x)

Static methods take a target (thing to the left of the dot) that is either a class name, or is an object whose value is ignored, but must be declared to be of the right class. So you could save three characters per call by doing:

// Can't instantiate Math, so it must be null.
Math m = null;
...
m.sin(x)

java.lang.Math is a final class, so you can't inherit from it, but if you have your own set of static methods that you would like to share among many of your own classes, then you can package them up and inherit them:

public abstract class MyStaticMethods {
  public static double mysin(double x) { ... }
}

public class MyClass1 extends MyStaticMethods {
  ... mysin(x)
}

Peter van der Linden, author of Just Java, recommends against both of the last two practices in his FAQ. I agree with him that Math m = null is a bad idea in most cases, but I'm not convinced that the MyStaticMethods approach demonstrates "very poor OOP style to use inheritance to obtain a trivial name abbreviation (rather than to express a type hierarchy)." First of all, trivial is in the eye of the beholder; the abbreviation may be substantial. (See an example of how I used this approach to what I thought was good effect.) Second, it is rather presumptuous to say that this is very bad OOP style. You could make a case that it is bad Java style, but in languages with multiple inheritance, this idiom would be more acceptable.
Another way of looking at it is that features of Java (and any language) necessarily involve trade-offs, and conflate many issues. I agree it is bad to use inheritance in such a way that you mislead the user into thinking that MyClass1 is inheriting behavior from MyStaticMethods, and it is bad to prohibit MyClass1 from extending whatever other class it really wants to extend. But in Java the class is also the unit of encapsulation, compilation (mostly), and name scope. The MyStaticMethods approach scores negative points on the type hierarchy front, but positive points on the name scope front. If you say that the type hierarchy view is more important, I won't argue with you. But I will argue if you think of a class as doing only one thing, rather than many things at once, and if you think of style guides as absolute rather than as trade-offs.
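For the Java 1.5 route mentioned in the short answer, a minimal sketch looks like this. It imports individual names rather than Math.*, in keeping with the "very sparingly" advice; the class and method names are made up for the example:

```java
import static java.lang.Math.PI;
import static java.lang.Math.cos;
import static java.lang.Math.sin;

class StaticImportDemo {
    // sin and cos resolve to java.lang.Math without the "Math." prefix.
    static double sumOfSquares(double x) {
        return sin(x) * sin(x) + cos(x) * cos(x);
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(PI / 3));
    }
}
```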

Q: Is null an Object?

Absolutely not. By that, I mean (null instanceof Object) is false. Some other things you should know about null:

You can't call a method on null: x.m() is an error when x is null and m is a non-static method. (When m is a static method it is fine, because it is the class of x that matters; the value is ignored.)
There is only one null, not one for each class. Thus, ((Object) (String) null == (Object) (Hashtable) null), for example. (The casts to Object are there only because the compiler refuses a direct == between the unrelated types String and Hashtable.)
It is ok to pass null as an argument to a method, as long as the method is expecting it. Some methods do; some do not. So, for example, System.out.println((String) null) is ok, but string.compareTo(null) is not. (The cast is needed because a bare println(null) is ambiguous between the String and char[] overloads and does not compile.) For methods you write, your javadoc comments should say whether null is ok, unless it is obvious.
In JDK 1.1 to 1.1.5, passing null as the literal argument to a constructor of an anonymous inner class (e.g., new SomeClass(null) { ...}) caused a compiler error. It's ok to pass an expression whose value is null, or to pass a coerced null, like new SomeClass((String) null) { ...}
There are at least three different meanings that null is commonly used to express:
- Uninitialized. A variable or slot that hasn't yet been assigned its real value.
- Non-existent/not applicable. For example, terminal nodes in a binary tree might be represented by a regular node with null child pointers.
- Empty. For example, you might use null to represent the empty tree. Note that this is subtly different from the previous case, although some people make the mistake of confusing the two cases. The difference is whether null is an acceptable tree node, or whether it is a signal to not treat the value as a tree node. Compare the following three implementations of binary tree nodes with an in-order print method:

// null means not applicable
// There is no empty tree.

class Node {
  Object data;
  Node left, right;

  void print() {
    if (left != null)
      left.print();
    System.out.println(data);
    if (right != null)
      right.print();
  }
}

// null means empty tree
// Note static, non-static methods

class Node {
  Object data;
  Node left, right;

  static void print(Node node) {
    if (node != null) node.print();
  }

  void print() {
    print(left);
    System.out.println(data);
    print(right);
  }
}

// Separate class for Empty
// null is never used

interface Node { void print(); }

class DataNode implements Node {
  Object data;
  Node left, right; // each must refer to a DataNode or an EmptyNode, never null

  public void print() {
    left.print();
    System.out.println(data);
    right.print();
  }
}

class EmptyNode implements Node {
  public void print() { }
}
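The first two points in the list above can be checked directly. This is just a sketch; the class name NullDemo and its methods are invented for the purpose:

```java
class NullDemo {
    static String greet() {
        return "static call";
    }

    // instanceof is always false for null, whatever the type on the right.
    static boolean nullIsInstanceOfObject() {
        Object x = null;
        return x instanceof Object;
    }

    // A static method invoked through a null reference: only the declared
    // type of d matters, so this runs NullDemo.greet() without an exception.
    static String staticCallThroughNull() {
        NullDemo d = null;
        return d.greet();
    }

    // A non-static method invoked on null fails at run time.
    static boolean instanceCallThrows() {
        String s = null;
        try {
            s.length();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(nullIsInstanceOfObject());
        System.out.println(staticCallThroughNull());
        System.out.println(instanceCallThrows());
    }
}
```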

Q: How big is an Object? Why is there no `sizeof`?

C has a sizeof operator, and it needs to have one, because the user has to manage calls to malloc, and because the size of primitive types (like long) is not standardized. Java doesn't need a sizeof, but it would still have been a convenient aid. Since it's not there, you can do this:

static Runtime runtime = Runtime.getRuntime();
...
long start, end;
Object obj;
runtime.gc();
start = runtime.freeMemory();
obj = new Object(); // Or whatever you want to look at
end = runtime.freeMemory();
System.out.println("That took " + (start-end) + " bytes.");

This method is not foolproof, because a garbage collection could occur in the middle of the code you are instrumenting, throwing off the byte count. Also, if you are using a just-in-time compiler, some bytes may come from generating code.
You might be surprised to find that an Object takes 16 bytes, or 4 words, in the Sun JDK VM. This breaks down as follows: There is a two-word header, where one word is a pointer to the object's class, and the other points to the instance variables. Even though Object has no instance variables, Java still allocates one word for the variables. Finally, there is a "handle", which is another pointer to the two-word header. Sun says that this extra level of indirection makes garbage collection simpler. (There have been high performance Lisp and Smalltalk garbage collectors that do not use the extra level for at least 15 years. I have heard but have not confirmed that the Microsoft JVM does not have the extra level of indirection.)
An empty new String() takes 40 bytes, or 10 words: 3 words of pointer overhead, 3 words for the instance variables (the start index, end index, and character array), and 4 words for the empty char array. Creating a substring of an existing string takes "only" 6 words, because the char array is shared. Putting an Integer key and Integer value into a Hashtable takes 64 bytes (in addition to the four bytes that were pre-allocated in the Hashtable array): I'll let you work out why.

Q: In what order is initialization code executed? What should I put where?

Instance variable initialization code can go in three places within a class:

In an instance variable initializer for a class (or a superclass).

class C {
    String var = "val";

In a constructor for a class (or a superclass).

public C() { var = "val"; }

In an object initializer block. This is new in Java 1.1; it's just like a static initializer block but without the keyword static.

{ var = "val"; }
}

The order of evaluation (ignoring out of memory problems) when you say new C() is:

Call a constructor for C's superclass (unless C is Object, in which case it has no superclass). It will always be the no-argument constructor, unless the programmer explicitly coded super(...) as the very first statement of the constructor.
Once the super constructor has returned, execute any instance variable initializers and object initializer blocks in textual (left-to-right) order. Don't be confused by the fact that javadoc and javap use alphabetical ordering; that's not important here.
Now execute the remainder of the body for the constructor. This can set instance variables or do anything else.

In general, you have a lot of freedom to choose any of these three forms. My recommendation is to use instance variable initializers in cases where there is a variable that takes the same value regardless of which constructor is used. Use object initializer blocks only when initialization is complex (e.g. it requires a loop) and you don't want to repeat it in multiple constructors. Use a constructor for the rest.

Here's another example:

Program:

class A {
    String a1 = ABC.echo(" 1: a1");
    String a2 = ABC.echo(" 2: a2");
    public A() {ABC.echo(" 3: A()");}
}

class B extends A {
    String b1 = ABC.echo(" 4: b1");
    String b2;
    public B() { 
        ABC.echo(" 5: B()"); 
        b1 = ABC.echo(" 6: b1 reset"); 
        a2 = ABC.echo(" 7: a2 reset"); 
    }
}

class C extends B {
    String c1; 
    { c1 = ABC.echo(" 8: c1"); }
    String c2;
    String c3 = ABC.echo(" 9: c3");

    public C() { 
        ABC.echo("10: C()"); 
        c2 = ABC.echo("11: c2");
        b2 = ABC.echo("12: b2");
    }
}

public class ABC {
    static String echo(String arg) {
        System.out.println(arg);
        return arg;
    }

    public static void main(String[] args) { 
        new C(); 
    }
}

Output:

 1: a1
 2: a2
 3: A()
 4: b1
 5: B()
 6: b1 reset
 7: a2 reset
 8: c1
 9: c3
10: C()
11: c2
12: b2
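One consequence of this ordering is worth a separate sketch: since the superclass constructor runs before the subclass's instance variable initializers, an overridden method called from the superclass constructor sees the subclass fields at their default values. The classes Parent and Child below are hypothetical and exist only to show this:

```java
class Parent {
    final String seenDuringConstruction;

    Parent() {
        // Child's initializers have not run yet when this executes.
        seenDuringConstruction = describe();
    }

    String describe() {
        return "parent";
    }
}

class Child extends Parent {
    String label = "initialized";  // assigned only after Parent() returns

    @Override
    String describe() {
        return "label=" + label;
    }
}

class InitOrderDemo {
    public static void main(String[] args) {
        Child c = new Child();
        System.out.println(c.seenDuringConstruction);  // value seen inside Parent()
        System.out.println(c.describe());              // value after construction
    }
}
```

The first line printed shows label=null and the second label=initialized, which is exactly what steps one and two of the evaluation order predict.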

Q: What about class initialization?

It is important to distinguish class initialization from instance creation. An instance is created when you call a constructor with new. A class C is initialized the first time it is actively used. At that time, the initialization code for the class is run, in textual order. There are two kinds of class initialization code: static initializer blocks (static { ... }), and class variable initializers (static String var = ...). Active use is defined as the first time you do any one of the following:

Create an instance of C by calling a constructor;
Call a static method that is defined in C (not inherited);
Assign or access a static variable that is declared (not inherited) in C. It does not count if the static variable is initialized with a constant expression (one involving only primitive operators (like + or ||), literals, and static final variables), because these are initialized at compile time.

Here is an example:

Program:

class A {
    static String a1 = ABC.echo(" 1: a1");
    static String a2 = ABC.echo(" 2: a2");
}

class B extends A {
    static String b1 = ABC.echo(" 3: b1");
    static String b2;
    static { 
        ABC.echo(" 4: B()"); 
        b1 = ABC.echo(" 5: b1 reset"); 
        a2 = ABC.echo(" 6: a2 reset"); 
    }
}

class C extends B {
    static String c1; 
    static { c1 = ABC.echo(" 7: c1"); }
    static String c2;
    static String c3 = ABC.echo(" 8: c3");

    static { 
        ABC.echo(" 9: C()"); 
        c2 = ABC.echo("10: c2");
        b2 = ABC.echo("11: b2");
    }
}

public class ABC {
    static String echo(String arg) {
        System.out.println(arg);
        return arg;
    }

    public static void main(String[] args) { 
        new C(); 
    }
}

Output:

 1: a1
 2: a2
 3: b1
 4: B()
 5: b1 reset
 6: a2 reset
 7: c1
 8: c3
 9: C()
10: c2
11: b2
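The exception for constant expressions in the third rule is easy to miss, so here is a small sketch of it. The names InitLog, Lazy, and ClassInitDemo are invented; the trace is computed once and cached because a class is initialized only once:

```java
import java.util.ArrayList;
import java.util.List;

class InitLog {
    static final List<String> events = new ArrayList<>();
}

class Lazy {
    static final int CONSTANT = 42;  // constant expression: inlined at compile time
    static int counter = 0;          // ordinary static variable

    static {
        InitLog.events.add("Lazy initialized");
    }
}

class ClassInitDemo {
    // Cached so that repeated reads report the same first-use behavior.
    static final List<String> TRACE = trace();

    private static List<String> trace() {
        List<String> out = new ArrayList<>();
        int c = Lazy.CONSTANT;  // not an active use of Lazy
        out.add("read constant " + c + ": " + InitLog.events.size() + " event(s)");
        Lazy.counter++;         // active use: triggers Lazy's initialization
        out.add("touched counter: " + InitLog.events.size() + " event(s)");
        return out;
    }

    public static void main(String[] args) {
        for (String line : TRACE) System.out.println(line);
    }
}
```

Reading CONSTANT leaves the log empty; touching counter is what finally runs the static block.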

Q: I have a class with six instance variables, each of which could be initialized or not. Should I write 64 constructors?

Of course you don't need (2⁶) constructors. Let's say you have a class C defined as follows:

public class C { int a,b,c,d,e,f; }

Here are some things you can do for constructors:

Guess at what combinations of variables will likely be wanted, and provide constructors for those combinations. Pro: That's how it's usually done. Con: Difficult to guess correctly; lots of redundant code to write.
Define setters that can be cascaded because they return this. That is, define a setter for each instance variable, then use them after a call to the default constructor:
public C setA(int val) { a = val; return this; }
...
new C().setA(1).setC(3).setE(5);
Pro: This is a reasonably simple and efficient approach. A similar idea is discussed by Bjarne Stroustrup on page 156 of The Design and Evolution of C++. Con: You need to write all the little setters, they aren't JavaBean-compliant (since they return this, not void), and they don't work if there are interactions between two values.
Use the default constructor for an anonymous sub-class with a non-static initializer:
new C() {{ a = 1; c = 3; e = 5; }}
Pro: Very concise; no mess with setters. Con: The instance variables can't be private, you have the overhead of a sub-class, your object won't actually have C as its class (although it will still be an instanceof C), it only works if you have accessible instance variables, and many people, including experienced Java programmers, won't understand it. Actually, it's quite simple: You are defining a new, unnamed (anonymous) subclass of C, with no new methods or variables, but with an initialization block that initializes a, c, and e. Along with defining this class, you are also making an instance. When I showed this to Guy Steele, he said "heh, heh! That's pretty cute, all right, but I'm not sure I would advocate widespread use..." As usual, Guy is right. (By the way, you can also use this to create and initialize a vector. You know how great it is to create and initialize, say, a String array with new String[] {"one", "two", "three"}. Now with inner classes you can do the same thing for a vector, where previously you thought you'd have to use assignment statements: new Vector(3) {{add("one"); add("two"); add("three");}}.)

You can switch to a language that directly supports this idiom. For example, C++ has optional arguments. So you can do this:

class C {
public: C(int a=1, int b=2, int c=3, int d=4, int e=5);
};
...
new C(10); // Construct an instance with defaults for b,c,d,e

Common Lisp and Python have keyword arguments as well as optional arguments, so you can do this:

C(a=10, c=30, e=50)            # Construct an instance; use defaults for b and d.
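Back in Java, the two less conventional options from the list (cascaded setters and the anonymous subclass with an initializer block) can be compared side by side. This sketch uses a cut-down, hypothetical class named Options with three fields instead of six:

```java
class Options {
    int a, b, c;  // not private, so the anonymous subclass below can assign them

    // Cascadable setters: each returns this.
    Options setA(int val) { a = val; return this; }
    Options setB(int val) { b = val; return this; }
    Options setC(int val) { c = val; return this; }
}

class ConstructorOptionsDemo {
    static Options viaSetters() {
        return new Options().setA(1).setC(3);
    }

    // Anonymous subclass of Options whose initializer block sets a and c.
    static Options viaInitializerBlock() {
        return new Options() {{ a = 1; c = 3; }};
    }

    public static void main(String[] args) {
        Options s = viaSetters();
        Options i = viaInitializerBlock();
        System.out.println(s.a + " " + s.b + " " + s.c);
        System.out.println(i.a + " " + i.b + " " + i.c);
        System.out.println(s.getClass() == Options.class);  // exactly Options
        System.out.println(i.getClass() == Options.class);  // an anonymous subclass
    }
}
```

Both objects end up with the same field values, but only the first one has Options as its exact class, which matters for any equals method that compares getClass() results.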

Q: When should I use constructors, and when should I use other methods?

The glib answer is to use constructors when you want a new object; that's what the keyword new is for. The infrequent answer is that constructors are often over-used, both in when they are called and in how much they have to do. Here are some points to consider:

Modifiers: As we saw in the previous question, one can go overboard in providing too many constructors. It is usually better to minimize the number of constructors, and then provide modifier methods that do the rest of the initialization. If the modifiers return this, then you can create a useful object in one expression; if not, you will need to use a series of statements. Modifiers are good because often the changes you want to make during construction are also changes you will want to make later, so why duplicate code between constructors and methods?
Factories: Often you want to create something that is an instance of some class or interface, but you either don't care exactly which subclass to create, or you want to defer that decision to runtime. For example, if you are writing a calculator applet, you might wish that you could call new Number(string), and have this return a Double if string is in floating point format, or a Long if string is in integer format. But you can't do that for two reasons: Number is an abstract class, so you can't invoke its constructor directly, and any call to a constructor must return a new instance of that class directly, not of a subclass. A method which returns objects like a constructor but that has more freedom in how the object is made (and what type it is) is called a factory. Java has no built-in support or conventions for factories, but you will want to invent conventions for using them in your code.

Caching and Recycling: A constructor must create a new object. But creating a new object is a fairly expensive operation. Just as in the real world, you can avoid costly garbage collection by recycling. For example, new Boolean(x) creates a new Boolean, but you should almost always use instead (x ? Boolean.TRUE : Boolean.FALSE), which recycles an existing value rather than wastefully creating a new one. Java would have been better off if it advertised a method that did just this, rather than advertising the constructor. Boolean is just one example; you should also consider recycling of other immutable classes, including Character, Integer, and perhaps many of your own classes. Below is an example of a recycling factory for Numbers. If I had my choice, I would call this Number.make, but of course I can't add methods to the Number class, so it will have to go somewhere else.

public Number numberFactory(String str) throws NumberFormatException {
  try {
    long l = Long.parseLong(str);
    if (l >= 0 && l < cachedLongs.length) {
      int i = (int)l;
      if (cachedLongs[i] != null) return cachedLongs[i];
      else return cachedLongs[i] = new Long(str);
    } else {
      return new Long(l);
    }
  } catch (NumberFormatException e) {
    double d = Double.parseDouble(str);
    return d == 0.0 ? ZERO : d == 1.0 ? ONE : new Double(d);
  }
}

private Long[] cachedLongs = new Long[100];
private Double ZERO = new Double(0.0);
private Double ONE = new Double(1.0);

We see that new is a useful convention, but that factories and recycling are also useful. Java chose to support only new because it is the simplest possibility, and the Java philosophy is to keep the language itself as simple as possible. But that doesn't mean your class libraries need to stick to the lowest denominator. (And it shouldn't have meant that the built-in libraries stuck to it, but alas, they did.)
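The "Modifiers" point above can be sketched in a few lines. The Circle class and its color and filled modifiers are hypothetical names made up for this illustration; the only idea taken from the text is that each modifier returns this, so one constructor plus chained calls replaces a family of constructors.

```java
class ModifierDemo {
    public static void main(String[] args) {
        // One constructor, then chained modifiers, all in a single expression.
        Circle c = new Circle(2.0).color("red").filled(true);
        System.out.println(c);   // prints Circle(radius=2.0, color=red, filled=true)
    }
}

// Hypothetical example class, not from the original text.
class Circle {
    private final double radius;
    private String color = "black";   // default
    private boolean filled = false;   // default

    Circle(double radius) { this.radius = radius; }

    // Each modifier returns this so that calls can be chained.
    Circle color(String color) { this.color = color; return this; }
    Circle filled(boolean filled) { this.filled = filled; return this; }

    @Override
    public String toString() {
        return "Circle(radius=" + radius + ", color=" + color + ", filled=" + filled + ")";
    }
}
```

The same modifiers remain available after construction, which is the code-sharing benefit the text mentions.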

Q: Will I get killed by the overhead of object creation and GC?

Suppose the application has to do with manipulating lots of 3D geometric points. The obvious Java way to do it is to have a class Point with doubles for x,y,z coordinates. But allocating and garbage collecting lots of points can indeed cause a performance problem. You can help by managing your own storage in a resource pool. Instead of allocating each point when you need it, you can allocate a large array of Points at the start of the program. The array (wrapped in a class) acts as a factory for Points, but it is a socially-conscious recycling factory. The method call pool.point(x,y,z) takes the first unused Point in the array, sets its 3 fields to the specified values, and marks it as used. Now you as a programmer are responsible for returning Points to the pool once they are no longer needed. There are several ways to do this. The simplest is when you know you will be allocating Points in blocks that are used for a while, and then discarded. Then you do int pos = pool.mark() to mark the current position of the pool. When you are done with the section of code, you call pool.restore(pos) to set the mark back to the position. If there are a few Points that you would like to keep, just allocate them from a different pool. The resource pool saves you from garbage collection costs (if you have a good model of when your objects will be freed) but you still have the initial object creation costs. You can get around that by going "back to Fortran": using arrays of x,y and z coordinates rather than individual point objects. You have a class of Points but no class for an individual point. Consider this resource pool class:

public class PointPool {
  /** Allocate a pool of n Points. **/
  public PointPool(int n) {
    x = new double[n];
    y = new double[n];
    z = new double[n];
    next = 0;
  }
  public double x[], y[], z[];

  /** Initialize the next point, represented as an integer index. **/
  int point(double x1, double y1, double z1) { 
    x[next] = x1; y[next] = y1; z[next] = z1;
    return next++; 
  }

  /** Initialize the next point, initialized to zeros. **/
  int point() { return point(0.0, 0.0, 0.0); }

  /** Initialize the next point as a copy of a point in some pool. **/
  int point(PointPool pool, int p) {
    return point(pool.x[p], pool.y[p], pool.z[p]);
  }

  public int next;
}

You would use this class as follows:

PointPool pool = new PointPool(1000000);
PointPool results = new PointPool(100);
...
int pos = pool.next;
doComplexCalculation(...);
pool.next = pos;

...

void doComplexCalculation(...) {
  ...
  int p1 = pool.point(x, y, z);
  int p2 = pool.point(p, q, r);
  double diff = pool.x[p1] - pool.x[p2];
  ...
  int p_final = results.point(pool,p1);
  ...
}

Allocating a million points took half a second for the PointPool approach, and 6 seconds for the straightforward approach that allocates a million instances of a Point class, so that's a 12-fold speedup.
Wouldn't it be nice if you could declare p1, p2 and p_final as Point rather than int? In C or C++, you could just do typedef int Point, but Java doesn't allow that. If you're adventurous, you can set up make files to run your files through the C preprocessor before the Java compiler, and then you can do #define Point int.
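For comparison, here is a minimal sketch of the object-based pool described at the start of this answer, the one with pool.point(x,y,z), pool.mark() and pool.restore(pos). The class names Point3 and ObjectPointPool and the used() helper are assumptions made for illustration; the text names only the three methods.

```java
class PoolDemo {
    public static void main(String[] args) {
        ObjectPointPool pool = new ObjectPointPool(4);
        pool.point(1, 2, 3);              // a point we keep for the whole run
        int pos = pool.mark();            // remember where the temporary block starts
        pool.point(4, 5, 6);              // temporaries used by some calculation
        pool.point(7, 8, 9);
        pool.restore(pos);                // discard the temporaries in one step
        System.out.println(pool.used());  // prints 1
    }
}

// Hypothetical names; a sketch of the idea, not a tuned implementation.
class Point3 { double x, y, z; }

class ObjectPointPool {
    private final Point3[] points;
    private int next = 0;

    /** Allocate all n Points up front so none are created later. */
    ObjectPointPool(int n) {
        points = new Point3[n];
        for (int i = 0; i < n; i++) points[i] = new Point3();
    }

    /** Hand out the first unused Point, overwriting its fields. */
    Point3 point(double x, double y, double z) {
        Point3 p = points[next++];   // ArrayIndexOutOfBoundsException if the pool is exhausted
        p.x = x; p.y = y; p.z = z;
        return p;
    }

    int mark() { return next; }
    void restore(int pos) { next = pos; }
    int used() { return next; }
}
```

After restore(pos), any reference to a discarded Point is stale: the next call to point() will overwrite it. That is the bookkeeping burden the text warns about.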

Q: I have a complex expression inside a loop. For efficiency, I'd like the computation to be done only once. But for readability, I want it to stay inside the loop where it is used. What can I do?

Let's assume an example where match is a regular expression pattern match routine, and compile compiles a string into a finite state machine that can be used by match:

for(;;) {
  ...
  String str = ...
  match(str, compile("a*b*c*"));
  ...
}

Since Java has no macros, and little control over time of execution, your choices are limited here. One possibility, although not very pretty, is to use an inner interface with a variable initializer:

for(;;) {
  ...
  String str = ...
  interface P1 {FSA f = compile("a*b*c*");}
  match(str, P1.f);
  ...
}

The value for P1.f gets initialized on the first use of P1, and is not changed, since variables in interfaces are implicitly static final. If you don't like that, you can switch to a language that gives you better control. In Common Lisp, the character sequence #. means to evaluate the following expression at read (compile) time, not run time. So you could write:

(loop
  ...
  (match str #.(compile "a*b*c*"))
  ...)
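Back in Java, the interface-constant trick can be checked with a small runnable sketch. It makes two substitutions that are not in the text above: java.util.regex.Pattern stands in for the FSA type, and the interface is declared as a member of the enclosing class rather than inside the loop body, since interface declarations inside a method are not accepted by every Java version. The compileCount counter exists only to show that the expression is evaluated once.

```java
import java.util.regex.Pattern;

class CompileOnceDemo {
    static int compileCount = 0;   // instrumentation for the demo only

    // Stand-in for the text's compile(): counts how often it is called.
    static Pattern compile(String regex) {
        compileCount++;
        return Pattern.compile(regex);
    }

    // Fields of an interface are implicitly static final; the initializer
    // runs once, when P1 is first used.
    interface P1 { Pattern F = compile("a*b*c*"); }

    static int countMatches(String[] inputs) {
        int matches = 0;
        for (String str : inputs) {
            if (P1.F.matcher(str).matches()) matches++;
        }
        return matches;
    }

    public static void main(String[] args) {
        int n = countMatches(new String[] {"aabbc", "abc", "cba"});
        System.out.println(n + " matches, compiled " + compileCount + " time(s)");
    }
}
```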

Q: What other operations are surprisingly slow?

Where do I begin? Here are a few that are most useful to know about. I wrote a timing utility that runs snippets of code in a loop, reporting the results in terms of thousands of iterations per second (K/sec) and microseconds per iteration (uSecs). Timing was done on a Sparc 20 with the JDK 1.1.4 JIT compiler. I note the following:

These were all done in 1998. Compilers have changed since then.
Counting down (i.e. for (int i=n; i>0; i--)) is twice as fast as counting up: my machine can count down to 144 million in a second, but up to only 72 million.
Calling Math.max(a,b) is 7 times slower than (a > b) ? a : b. This is the cost of a method call.
Arrays are 15 to 30 times faster than Vectors. Hashtables are 2/3 as fast as Vectors.

Using bitset.get(i) is 60 times slower than bits & 1 << i. This is the cost of a synchronized method call, mostly. Of course, if you want more than 64 bits, you can't use my bit-twiddling example. Here's a chart of times for getting and setting elements of various data structures:

K/sec     uSecs          Code           Operation 
=========  ======= ====================  ===========
  147,058    0.007 a = a & 0x100;        get element of int bits
      314    3.180 bitset.get(3);        get element of Bitset
   20,000    0.050 obj = objs[1];        get element of Array
    5,263    0.190 str.charAt(5);        get element of String
      361    2.770 buf.charAt(5);        get element of StringBuffer
      337    2.960 objs2.elementAt(1);   get element of Vector
      241    4.140 hash.get("a");        get element of Hashtable

      336    2.970 bitset.set(3);        set element of Bitset
    5,555    0.180 objs[1] = obj;        set element of Array
      355    2.810 buf.setCharAt(5,' ')  set element of StringBuffer
      308    3.240 objs2.setElementAt(1  set element of Vector
      237    4.210 hash.put("a", obj);   set element of Hashtable
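The first two rows of the chart compare a bit mask against a BitSet. A small sketch shows that the two give the same answers for up to 64 bits; the testBit helper is a hypothetical name, and the code says nothing about speed on a current JVM.

```java
import java.util.BitSet;

class BitTwiddleDemo {
    /** True if bit i (0..63) of bits is set. */
    static boolean testBit(long bits, int i) {
        // << binds tighter than &, so this is bits & (1L << i).
        return (bits & 1L << i) != 0;
    }

    public static void main(String[] args) {
        long bits = (1L << 1) | (1L << 3);   // bits 1 and 3 set
        BitSet bitset = new BitSet(64);
        bitset.set(1);
        bitset.set(3);
        for (int i = 0; i < 64; i++) {
            if (testBit(bits, i) != bitset.get(i)) {
                throw new AssertionError("mismatch at bit " + i);
            }
        }
        System.out.println("long mask and BitSet agree on all 64 positions");
    }
}
```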

Java compilers are very poor at lifting constant expressions out of loops. The C/Java for loop is a bad abstraction, because it encourages re-computation of the end value in the most typical case. So for(int i=0; i<str.length(); i++) is three times slower than int len = str.length(); for(int i=0; i<len; i++).
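The two loop shapes look like this side by side. The method names are made up for the example, and the sketch shows only that the hoisted form computes the same result; the timing ratio quoted above comes from the 1998 measurements and may not hold on a current JVM.

```java
class LoopHoistDemo {
    /** Re-evaluates str.length() on every iteration. */
    static int countSpacesRecomputed(String str) {
        int count = 0;
        for (int i = 0; i < str.length(); i++) {
            if (str.charAt(i) == ' ') count++;
        }
        return count;
    }

    /** Evaluates str.length() once, before the loop starts. */
    static int countSpacesHoisted(String str) {
        int count = 0;
        int len = str.length();
        for (int i = 0; i < len; i++) {
            if (str.charAt(i) == ' ') count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String s = "a b c";
        System.out.println(countSpacesRecomputed(s) == countSpacesHoisted(s));   // prints true
    }
}
```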

Q: Can I get good advice from books on Java?

There are a lot of Java books out there, falling into three classes:
Bad. Most Java books are written by people who couldn't get a job as a Java programmer (since programming almost always pays more than book writing; I know because I've done both). These books are full of errors, bad advice, and bad programs. These books are dangerous to the beginner, but are easily recognized and rejected by a programmer with even a little experience in another language.
Excellent. There are a small number of excellent Java books. I like the official specification and the books by Arnold and Gosling, Marty Hall, and Peter van der Linden. For reference I like the Java in a Nutshell series and the online references at Sun. (I copy the javadoc APIs and the language specification and its amendments to my local disk and bookmark them in my browser so I'll always have fast access.)
Iffy. In between these two extremes is a collection of sloppy writing by people who should know better, but either haven't taken the time to really understand how Java works, or are just rushing to get something published fast. One such example of half-truths is Edward Yourdon's Java and the new Internet programming paradigm from Rise and Resurrection of the American Programmer[footnote on Yourdon]. Here's what Yourdon says about how different Java is:

"Functions have been eliminated" It's true that there is no "function" keyword in Java. Java calls them methods (and Perl calls them subroutines, and Scheme calls them procedures, but you wouldn't say these languages have eliminated functions). One could reasonably say that there are no global functions in Java. But I think it would be more precise to say that there are functions with global extent; it's just that they must be defined within a class, and are called "static method C.f" instead of "function f".
"Automatic coercions of data types have been eliminated" It's true that there are limits in the coercions that are made, but they are far from eliminated. You can still say (1.0 + 2) and 2 will be automatically coerced to a double. Or you can say ("one" + 2) and 2 will be coerced to a string.
"Pointers and pointer arithmetic have been eliminated" It's true that explicit pointer arithmetic has been eliminated (and good riddance). But pointers remain; in fact, every reference to an object is a pointer. (That's why we have NullPointerException.) It is impossible to be a competent Java programmer without understanding this. Every Java programmer needs to know that when you do:
```
int[] a = {0, 1, 2};
int[] b = a;
b[0] = 99;
```
then a[0] is 99 because a and b are pointers (or references) to the same object.
"Because structures are gone, and arrays and strings are represented as objects, the need for pointers has largely disappeared." This is also misleading. First of all, structures aren't gone, they're just renamed "classes". What is gone is programmer control over whether structure/class instances are allocated on the heap or on the stack. In Java all objects are allocated on the heap. That is why there is no need for syntactic markers (such as *) for pointers--if it references an object in Java, it's a pointer. Yourdon is correct in saying that having pointers to the middle of a string or array is considered good idiomatic usage in C and assembly language (and by some people in C++), but it is neither supported nor missed in other languages.
Yourdon also includes a number of minor typos, like saying that arrays have a length() method (instead of a length field) and that modifiable strings are represented by StringClass (instead of StringBuffer). These are annoying, but not as harmful as the more basic half-truths.
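The first two rebuttals above are easy to check. In this sketch the class name CoercionDemo and the square method are hypothetical; the two expressions are the ones quoted in the text.

```java
class CoercionDemo {
    // A static method behaves like a function with global extent:
    // any code that can see the class can call CoercionDemo.square(n).
    static int square(int n) { return n * n; }

    public static void main(String[] args) {
        double sum = 1.0 + 2;      // the int 2 is coerced to the double 2.0
        String text = "one" + 2;   // the int 2 is coerced to the string "2"
        System.out.println(sum);                      // prints 3.0
        System.out.println(text);                     // prints one2
        System.out.println(CoercionDemo.square(4));   // prints 16
    }
}
```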

http://norvig.com/java-iaq.html

Monday, May 17, 2010

Core Java Interview Question 1

Q. Different ways to create objects in Java

ANS : There are the following ways to create objects in Java:

1. Using new keyword
This is the most common way to create an object in java. I read somewhere that almost 99% of objects are created in this way.

MyObject object = new MyObject();

2. Using Class.forName()
If we know the name of the class & if it has a public default constructor we can create an object in this way.

MyObject object = (MyObject) Class.forName("subin.rnd.MyObject").newInstance();

Class.forName().newInstance() is using the reflection API to create an object.

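3. Using clone()
The clone() method can be used to create a copy of an existing object, provided the class implements Cloneable and makes clone() accessible. Object.clone() returns Object, so the result needs a cast unless the class overrides clone() with a more specific return type.

MyObject anotherObject = new MyObject();
MyObject object = (MyObject) anotherObject.clone();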

4. Using object deserialization
Object deserialization is nothing but creating an object from its serialized form.

ObjectInputStream inStream = new ObjectInputStream(anInputStream );
MyObject object = (MyObject) inStream.readObject();

5. Using classloader

this.getClass().getClassLoader().loadClass("com.amar.myobject").newInstance();

6. Using factory methods

Ex:- NumberFormat obj=NumberFormat.getInstance();

Now you know how to create an object. But it's advisable to create objects only when it is necessary to do so.
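Three of the less common routes listed above (reflection, clone() and deserialization) can be tried in one self-contained sketch. The Widget class and the helper method names are hypothetical. The sketch uses getDeclaredConstructor().newInstance() instead of Class.newInstance(), and an in-memory byte array stands in for a real input stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ObjectCreationDemo {
    /** Reflection: look the class up by name and call its no-arg constructor. */
    static Widget viaReflection() {
        try {
            Class<?> cls = Class.forName(Widget.class.getName());
            return (Widget) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Deserialization: write the object to bytes, then read a new copy back. */
    static Widget viaSerialization(Widget original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Widget) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Widget a = new Widget();           // 1. new
        a.size = 42;
        Widget b = viaReflection();        // 2. reflection: fresh object with default size
        Widget c = a.clone();              // 3. clone: copy of a
        Widget d = viaSerialization(a);    // 4. deserialization: copy of a
        System.out.println(b.size + " " + c.size + " " + d.size);   // prints 7 42 42
    }
}

// Hypothetical example class.
class Widget implements Cloneable, Serializable {
    private static final long serialVersionUID = 1L;
    int size = 7;

    public Widget() { }

    @Override
    public Widget clone() {
        try {
            return (Widget) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);   // cannot happen: Widget is Cloneable
        }
    }
}
```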

Q.What happens to the static fields of a class during serialization? Are these fields serialized as a part of each serialized object ?
Ans : No. Static fields belong to the class rather than to any one instance, so default serialization does not write them; only the non-static, non-transient fields of each object are serialized. After deserialization, a static field simply has whatever value the class currently holds in the receiving JVM, and that single value is still shared by all instances.
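How static fields behave under default serialization is easy to check with a small experiment. The Counter class and the toBytes and fromBytes helpers are hypothetical names. The static field is changed between writing and reading, and the instance that is read back does not bring the old static value with it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class StaticSerializationDemo {
    static byte[] toBytes(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Counter.shared = 1;
        Counter c = new Counter();
        c.own = 10;
        byte[] data = toBytes(c);       // serialized while shared == 1

        Counter.shared = 99;            // change the static afterwards
        Counter copy = (Counter) fromBytes(data);

        System.out.println(copy.own);         // prints 10: instance state was restored
        System.out.println(Counter.shared);   // prints 99: the static was not overwritten
    }
}

// Hypothetical example class.
class Counter implements Serializable {
    private static final long serialVersionUID = 1L;
    static int shared;   // class state, not part of the serialized form
    int own;             // instance state, serialized
}
```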

Q.Objects are passed by value or by reference ?
Ans : Java only supports pass by value. With objects, the object reference itself is passed by value, so the original reference and the parameter copy both refer to the same object.
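A short sketch makes the distinction concrete; the method names are made up for the example. Reassigning the parameter changes only the callee's copy of the reference, while mutating through the parameter changes the one shared object.

```java
class PassByValueDemo {
    /** Rebinds the local copy of the reference; the caller never sees this. */
    static void reassign(StringBuilder sb) {
        sb = new StringBuilder("replaced");
    }

    /** Mutates the object that both references point to. */
    static void mutate(StringBuilder sb) {
        sb.append(" world");
    }

    public static void main(String[] args) {
        StringBuilder text = new StringBuilder("hello");
        reassign(text);
        System.out.println(text);   // prints hello
        mutate(text);
        System.out.println(text);   // prints hello world
    }
}
```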

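Q. What if I do not provide the String array as the argument to the main method?

Ans : The program compiles, but it cannot be run as an application. Older JVMs throw the runtime error “NoSuchMethodError”; newer Java launchers instead print an error saying the main method was not found.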

Q.What if I write static public void instead of public static void?
Ans : Program compiles and runs properly.

Q.What if the static modifier is removed from the signature of the main method?
Ans : The program compiles, but at runtime older JVMs throw “NoSuchMethodError”; newer Java launchers instead report that the main method is not static.

Q. What if the main method is declared as private?
Ans : The program compiles properly but at runtime it will give “Main method not public.” message.

Q.Which package is always imported by default?
Ans : The java.lang package is always imported by default.

Q.Is “abc” a primitive value?
Ans :The String literal “abc” is not a primitive value. It is a String object.

Q.What modifiers can be used with a local inner class?
Ans : A local inner class may be final or abstract.

Q.Which class should you use to obtain design information about an object?
Ans : The Class class is used to obtain information about an object’s design.

Q.What is the purpose of the System class?
Ans : The purpose of the System class is to provide access to system resources, for example the standard output stream used by System.out.println.

Q.For which statements does it make sense to use a label?
Ans : The only statements for which it makes sense to use a label are those statements that can enclose a break or continue statement.
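The typical case is a nested loop that needs to leave the outer loop from inside the inner one. In this sketch the label search and the find method are names chosen for the example.

```java
import java.util.Arrays;

class LabelDemo {
    /** Returns {row, col} of the first occurrence of target, or null if absent. */
    static int[] find(int[][] grid, int target) {
        int[] found = null;
        search:   // label on the outer loop
        for (int r = 0; r < grid.length; r++) {
            for (int c = 0; c < grid[r].length; c++) {
                if (grid[r][c] == target) {
                    found = new int[] {r, c};
                    break search;   // leaves both loops at once
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        int[][] grid = {{1, 2}, {3, 4}};
        System.out.println(Arrays.toString(find(grid, 3)));   // prints [1, 0]
    }
}
```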

Q.Does a class inherit the constructors of its superclass?
Ans : No. A class does not inherit constructors from any of its superclasses.

Q.Can a Byte object be cast to a double value?
Ans : No, an object cannot be cast to a primitive value.

Abstract Class Vs Interface

When to use an Abstract Class and an Interface

When to prefer an interface

Let's say you have an interface for a Director and another interface for an Actor.
public interface Actor{ Performance say(Line l); }
public interface Director{ Movie direct(boolean goodmovie); }
In reality, there are Actors who are also Directors. If we are using interfaces rather than abstract classes, we can implement both Actor and Director. We could even define an ActorDirector interface that extends both like this:
public interface ActorDirector extends Actor, Director{ ... }
We could achieve the same thing using abstract classes. Unfortunately the alternative would require up to 2^n (where n is the number of attributes) possible combinations in order to support all possibilities.
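A runnable sketch of the Actor/Director idea follows. To keep it self-contained, the Performance, Line and Movie types from the declarations above are replaced with plain String, and the Auteur class is a hypothetical implementer.

```java
class ActorDirectorDemo {
    public static void main(String[] args) {
        ActorDirector person = new Auteur();
        // The same object can be used wherever an Actor or a Director is expected.
        Actor actor = person;
        Director director = person;
        System.out.println(actor.say("Action!"));    // prints "Action!" with the quotes
        System.out.println(director.direct(true));   // prints a good movie
    }
}

// Simplified signatures: String stands in for Performance, Line and Movie.
interface Actor { String say(String line); }
interface Director { String direct(boolean goodMovie); }
interface ActorDirector extends Actor, Director { }

// Hypothetical class implementing both roles through the combined interface.
class Auteur implements ActorDirector {
    public String say(String line) { return "\"" + line + "\""; }
    public String direct(boolean goodMovie) { return goodMovie ? "a good movie" : "a flop"; }
}
```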

When to prefer an Abstract class
Abstract classes allow you to provide default functionality for the subclasses. Common knowledge at this point. Why is this extremely important though? If you plan on updating this base class throughout the life of your program, it is best to allow that base class to be an abstract class. Why? Because you can make a change to it and all of the inheriting classes will now have this new functionality. If the base class will be changing often and an interface was used instead of an abstract class, we are going to run into problems. Once an interface is changed, any class that implements it will be broken. Now if it's just you working on the project, that’s no big deal. However, once your interface is published to the client, that interface needs to be locked down. At that point, you will be breaking the client's code.
Speaking from personal experience, frameworks are a good place to see when and where to use both an abstract class and an interface. Another general rule is if you are creating something that provides common functionality to unrelated classes, use an interface. If you are creating something for objects that are closely related in a hierarchy, use an abstract class. An example of the interface case would be something like a business rules engine. This engine would take in multiple BusinessRules, perhaps as classes. Each one of these classes will have an analyze function on it.
public interface BusinessRule{ Boolean analyze(Object o); }
This can be used ANYWHERE. It can be used to verify the state of your application. Verify data is correct. Verify that the user is logged in. Each one of these classes just needs to implement the analyze function, which will be different for each rule.
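As a rough sketch of such an engine, the rule classes and the allPass helper below are hypothetical. Only the BusinessRule interface and its analyze method come from the text.

```java
import java.util.Arrays;
import java.util.List;

class RuleEngineDemo {
    /** Runs every rule against the object; true only if all of them pass. */
    static boolean allPass(List<BusinessRule> rules, Object o) {
        for (BusinessRule rule : rules) {
            if (!rule.analyze(o)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<BusinessRule> rules =
            Arrays.<BusinessRule>asList(new NotNullRule(), new NonEmptyStringRule());
        System.out.println(allPass(rules, "alice"));   // prints true
        System.out.println(allPass(rules, ""));        // prints false
    }
}

interface BusinessRule { Boolean analyze(Object o); }

// Two unrelated, hypothetical rules that share nothing but the interface.
class NotNullRule implements BusinessRule {
    public Boolean analyze(Object o) { return o != null; }
}

class NonEmptyStringRule implements BusinessRule {
    public Boolean analyze(Object o) {
        return o instanceof String && !((String) o).isEmpty();
    }
}
```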
Whereas if we were creating a generic List object, the use of abstract classes would be better. Every single List object is going to display the data in a list in some form or another. The base functionality would be to have it go through its dataprovider and build that list. If we want to change that List object, we just extend it, override our build list function, change what we want and call super.buildList();
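A minimal sketch of that List example follows. The class names AbstractListView, PlainListView and NumberedListView are assumptions, as is the choice of a List of String for the data provider. The base class supplies the default buildList(), and one subclass overrides it and reuses the default through super.buildList().

```java
import java.util.Arrays;
import java.util.List;

class ListViewDemo {
    public static void main(String[] args) {
        List<String> data = Arrays.asList("apples", "pears");
        System.out.print(new PlainListView(data).buildList());
        System.out.print(new NumberedListView(data).buildList());
    }
}

// Hypothetical base class holding the shared default behaviour.
abstract class AbstractListView {
    protected final List<String> dataProvider;

    AbstractListView(List<String> dataProvider) { this.dataProvider = dataProvider; }

    /** Default behaviour: one item per line. */
    String buildList() {
        StringBuilder sb = new StringBuilder();
        for (String item : dataProvider) sb.append(item).append('\n');
        return sb.toString();
    }
}

/** Uses the inherited default unchanged. */
class PlainListView extends AbstractListView {
    PlainListView(List<String> dataProvider) { super(dataProvider); }
}

/** Overrides buildList() but builds on the default through super. */
class NumberedListView extends AbstractListView {
    NumberedListView(List<String> dataProvider) { super(dataProvider); }

    @Override
    String buildList() {
        String[] lines = super.buildList().split("\n");
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            sb.append(i + 1).append(". ").append(lines[i]).append('\n');
        }
        return sb.toString();
    }
}
```

A later change to AbstractListView.buildList() reaches both subclasses without touching them, which is the maintenance argument made above.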
Almost everyone knows that an interface just defines a list of functions and that an abstract class has the option of providing default functionality. The snags come when you ask ‘why would I use one over the other?’. Abstract classes and interfaces are some of the most important fundamentals of object oriented programming. Just knowing the differences between the two is not enough. When you can look at a situation and make a strong recommendation, you will know you have a much stronger knowledge of object oriented programming. Also it helps during interviews.

Q.Can Abstract Class have constructors? Can interfaces have constructors?
An abstract class can have a constructor, but you cannot invoke it directly to create an object, since an abstract class cannot be instantiated. The constructor runs when you create an instance of a concrete subclass that extends the abstract class. Interfaces cannot have constructors.
Q.If interface & abstract class have same methods and those methods contain no implementation, which one would you prefer?
Ideally one should go for an interface, because a class can extend only one class. Implementing an interface leaves the class free to extend some other useful class, which extending an abstract class would not.

Q.What is the difference between Abstract class and Interface
1. An abstract class is a class declared abstract. It usually contains one or more abstract methods, which have to be implemented by subclasses, but it may also contain concrete methods, or even no abstract methods at all. A Java interface can contain only method declarations and public static final constants; it doesn’t contain their implementation. The classes which implement the interface must provide the method definition for all the methods present.
2. An abstract class definition begins with the keyword “abstract” followed by the class definition. An interface definition begins with the keyword “interface”.
3. Abstract classes are useful when some general methods should be implemented in the base class and specialized behavior should be implemented by subclasses. Interfaces are useful when all of the methods need to be implemented by the implementing classes.
4. All variables in an interface are by default public static final, while an abstract class can have instance variables.
5. An interface is also useful when a class needs to extend another class apart from the abstract class. Java does not allow multiple inheritance of classes, so a class can extend only one (abstract) class, but it can implement one or more interfaces.
6. An interface can only have public members whereas an abstract class can contain private as well as protected members.
7. A class implementing an interface must implement all of the methods defined in the interface (or itself be declared abstract), while a class extending an abstract class need only implement the abstract methods and inherits the rest.
8. The problem with an interface is that if you want to add a new feature (method) to its contract, then you MUST implement that method in all of the classes which implement that interface. With an abstract class, however, the method can simply be implemented in the abstract class and then called by its subclasses.
9. Interface method calls can be slower, since they may require an extra indirection to find the corresponding method in the actual class; abstract class method calls are typically faster.
10. Interfaces are often used to describe the peripheral abilities of a class, and not its central identity. E.g. an Automobile class might implement the Recyclable interface, which could apply to many otherwise totally unrelated objects.
There is little difference between a fully abstract class (all methods declared as abstract and all fields public static final) and an interface, apart from the single-inheritance limit on classes. If the various objects are all of-a-kind, and share a common state and behavior, then tend towards a common base class. If all they share is a set of method signatures, then tend towards an interface.
Similarities: Neither abstract classes nor interfaces can be instantiated.
Q.What does it mean that a method or class is abstract?
An abstract class cannot be instantiated. Only its subclasses can be instantiated. A class that has one or more abstract methods must be declared abstract. A subclass that does not provide an implementation for its inherited abstract methods must also be declared abstract. You indicate that a class is abstract with the abstract keyword like this:
public abstract class AbstractClass
Abstract classes may contain abstract methods. A method declared abstract is not actually implemented in the class. It exists only to be overridden in subclasses. Abstract methods may only be included in abstract classes. However, an abstract class is not required to have any abstract methods, though most of them do. Each subclass of an abstract class must override the abstract methods of its superclasses or itself be declared abstract. Only the method’s prototype is provided in the class definition. Also, a final method cannot be abstract and vice versa. Methods specified in an interface are implicitly abstract; they have no body.
Q.What is a Cloneable interface and how many methods does it contain?
Interface methods are implicitly abstract and public. Interfaces with empty bodies are called marker interfaces; they mark a class as having a certain property or behavior. Examples: java.lang.Cloneable, java.io.Serializable, java.util.EventListener. Cloneable is one of these marker interfaces, so it contains no methods. An interface body can contain constant declarations, method prototype declarations, nested class declarations, and nested interface declarations.
Interfaces provide support for multiple inheritance in Java. A class that implements an interface is bound to implement all the methods defined in that interface.
Q.Can you make an instance of an abstract class?
Abstract classes can contain abstract and concrete methods. Abstract classes cannot be instantiated directly, i.e. we cannot call the constructor of an abstract class directly, nor can we create an instance of an abstract class by using “Class.forName().newInstance()” (here we get java.lang.InstantiationException). However, if we create an instance of a class that extends an abstract class, the constructors of both classes run: the subclass constructor implicitly calls the constructor of the abstract class. Any class that contains an abstract method must be declared “abstract”, and abstract methods get their definitions only in child classes. Overriding and customizing the abstract methods in more than one subclass gives us polymorphism, and through inheritance we give bodies to the abstract methods. Basically an abstract class serves as a template. An abstract class must be extended/subclassed for it to be used. A class may be declared abstract even if it has no abstract methods; this prevents it from being instantiated. An abstract class is a class that provides some general functionality but leaves specific implementation to its inheriting classes.
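Both halves of this answer can be observed in a short sketch. The Shape and Square classes and the reflectiveAttempt helper are hypothetical. The reflective call uses getDeclaredConstructor().newInstance(), which, like Class.newInstance(), fails with InstantiationException for an abstract class.

```java
class AbstractDemo {
    /** Tries to instantiate the abstract class reflectively and reports what happened. */
    static String reflectiveAttempt() {
        try {
            Shape.class.getDeclaredConstructor().newInstance();
            return "created";
        } catch (InstantiationException e) {
            return "InstantiationException";
        } catch (ReflectiveOperationException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(reflectiveAttempt());   // prints InstantiationException
        Square s = new Square(3);
        System.out.println(s.log);                 // prints Shape constructor ran
        System.out.println(s.area());              // prints 9.0
    }
}

// Hypothetical abstract class with a constructor of its own.
abstract class Shape {
    final String log;
    Shape() { log = "Shape constructor ran"; }
    abstract double area();
}

class Square extends Shape {
    final double side;
    Square(double side) { this.side = side; }   // implicit super() runs Shape's constructor first
    double area() { return side * side; }
}
```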
Q.What is meant by “Abstract Interface”?
Firstly, an interface is abstract. That means you cannot have any implementation in an interface. All the methods declared in an interface are abstract methods, that is, method signatures only.
Q.How to define an Interface?
In Java, an interface defines methods but does not implement them. An interface can include constants. A class that implements the interface is bound to implement all the methods defined in it.