If you have ever needed to work with Bayesian networks and conditional probabilities, you may have searched around for some libraries you can use. This article describes how to use two libraries with Java API support – Netica and JavaBayes – to set up a simple Bayesian network and calculate inferences.

Netica is a commercial product with support for multiple programming languages and has demo versions available for download. Its API seems very straightforward to use but its Java implementation relies on native libraries which can cause issues depending on which platform you are on. Some platforms (e.g. Linux) seem to be more regularly updated than others.

JavaBayes is an open source application/library written in pure Java but has not been updated for a long time (source code was written for Java 1.0.2 – yikes!). Its API is also not as straightforward to use. For this exercise, the JavaBayes source was modified to live in its own package and to compile with Java 6. The modified source can be found here.

Both have applications/user interfaces that you can use to manually create/view a network and run queries on it but this article will focus on just the API capabilities for generating a network and calculating inference. The sample network that we will be creating is the DogProblem network described here. All source code/supporting libraries can be found here.

To create a simple network, the basic steps are to:

- Create the nodes in the network with their links
- Assign probabilities
- Provide the evidence/observations for the nodes with known values
- Run the inference calculations in order to obtain the probability of the node that you are interested in.

The Netica implementation is very straightforward but the JavaBayes implementation is more complicated since its API is used more by its user interface than as a standalone API. To compensate for this, some helper methods were created (shown at the very end).

To create nodes, there is some small setup to create the network itself and then for each node we give the name of the node, a string value for when the node is in its ‘true’ state, and a string value for when it is in its ‘false’ state. After the nodes are created, the nodes that affect the probabilities of other nodes are linked to form a directed graph.

Creating nodes with Netica:

    // Setup
    Node.setConstructorClass("norsys.neticaEx.aliases.Node");
    new Environ(null);
    Net net = new Net();
    net.setName("Demo");

    // Setup nodes with states
    Node hearBark = new Node("hearBark", "hearBark, quiet", net);
    Node dogOut = new Node("dogOut", "dogOut, dogIn", net);
    Node bowelProblem = new Node("bowelProblem", "bowelProblem, noBowelProblem", net);
    Node familyOut = new Node("familyOut", "familyOut, familyIn", net);
    Node lightOn = new Node("lightOn", "lightOn, lightOff", net);

    // Add links child.addLink(parent) where parent usually causes child to happen
    lightOn.addLink(familyOut);
    dogOut.addLink(familyOut);
    dogOut.addLink(bowelProblem);
    hearBark.addLink(dogOut);

Creating nodes with JavaBayes:

    InferenceGraph ig = new InferenceGraph();

    // Setup nodes with states
    InferenceGraphNode hearBark = createNode(ig, "hearBark", "hearBark", "quiet");
    InferenceGraphNode dogOut = createNode(ig, "dogOut", "dogOut", "dogIn");
    InferenceGraphNode bowelProblem = createNode(ig, "bowelProblem", "bowelProblem", "noBowelProblem");
    InferenceGraphNode familyOut = createNode(ig, "familyOut", "familyOut", "familyIn");
    InferenceGraphNode lightOn = createNode(ig, "lightOn", "lightOn", "lightOff");

    // Add links (parent, child) where parent usually causes child to happen
    ig.create_arc(familyOut, lightOn);
    ig.create_arc(familyOut, dogOut);
    ig.create_arc(bowelProblem, dogOut);
    ig.create_arc(dogOut, hearBark);

Once we have the nodes, we set up the various probabilities. Most probabilities are conditional on the values of connected nodes, but nodes with no parents (called ‘leaf’ nodes in this article, although strictly speaking they are the roots of the graph) are given unconditional probabilities. A node that is conditional on one other node is pretty straightforward. For example, the probability that you will hear a bark given that the dog is out is 0.70. Since the hearBark node only has two values, this means that the probability that you will not hear a bark given that the dog is out is 0.30. In the same way, the probability that you will hear a bark given that the dog is not out is 0.01 (perhaps some frogs make noises that sound like dog barks) and the probability that you will not hear a bark is 0.99. A node that is conditional on multiple nodes is a little more complicated since one has to account for each combination of parent values (so two parents with two values each mean four different combinations).
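To make the "four combinations" concrete, here is a minimal plain-Java sketch of the dogOut table using the same numbers that are assigned in the library code. The class and method names are made up for illustration and belong to neither Netica nor JavaBayes; the only point is that each parent combination gets its own row and each row must sum to 1.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: a plain-Java view of the dogOut table, not part of either library. */
final class DogOutTableSketch {

    /** Rows keyed by parent combination; each row is {P(dogOut), P(dogIn)}. */
    static Map<String, double[]> dogOutTable() {
        Map<String, double[]> table = new LinkedHashMap<>();
        table.put("familyOut,bowelProblem", new double[] {0.99, 0.01});
        table.put("familyOut,noBowelProblem", new double[] {0.90, 0.10});
        table.put("familyIn,bowelProblem", new double[] {0.97, 0.03});
        table.put("familyIn,noBowelProblem", new double[] {0.30, 0.70});
        return table;
    }

    /** Returns true when every row is a valid distribution (entries sum to 1). */
    static boolean rowsSumToOne(Map<String, double[]> table) {
        for (double[] row : table.values()) {
            double sum = 0.0;
            for (double p : row) {
                sum += p;
            }
            if (Math.abs(sum - 1.0) > 1e-9) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, double[]> table = dogOutTable();
        System.out.println("rows: " + table.size() + ", valid: " + rowsSumToOne(table));
    }
}
```

A check like this is cheap to run before handing the numbers to either library, since a row that does not sum to 1 is an easy mistake to make when typing tables by hand.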

Assigning probabilities with Netica:

    // Setup conditional probabilities
    hearBark.setCPTable("dogOut", .70, .30);
    hearBark.setCPTable("dogIn", .01, .99);
    dogOut.setCPTable("familyOut", "bowelProblem", .99, .01);
    dogOut.setCPTable("familyOut", "noBowelProblem", .90, .10);
    dogOut.setCPTable("familyIn", "bowelProblem", .97, .03);
    dogOut.setCPTable("familyIn", "noBowelProblem", .30, .70);
    lightOn.setCPTable("familyOut", .60, .40);
    lightOn.setCPTable("familyIn", .05, .95);

    // Setup 'leaf' probabilities
    familyOut.setCPTable(.15, .85);
    bowelProblem.setCPTable(.01, .99);

    net.compile();

Assigning probabilities with JavaBayes:

    // Setup conditional probabilities
    setProbabilityValues(hearBark, "dogOut", .70, .30);
    setProbabilityValues(hearBark, "dogIn", .01, .99);
    setProbabilityValues(dogOut, "familyOut", "bowelProblem", .99, .01);
    setProbabilityValues(dogOut, "familyOut", "noBowelProblem", .90, .10);
    setProbabilityValues(dogOut, "familyIn", "bowelProblem", .97, .03);
    setProbabilityValues(dogOut, "familyIn", "noBowelProblem", .30, .70);
    setProbabilityValues(lightOn, "familyOut", .60, .40);
    setProbabilityValues(lightOn, "familyIn", .05, .95);

    // Setup 'leaf' probabilities
    setProbabilityValues(familyOut, .15, .85);
    setProbabilityValues(bowelProblem, .01, .99);

Now that we have the network set up, we can ask it to provide us with the probability of a node having a certain value just based on the ‘leaf’ node probabilities we provided and the conditional probabilities we gave to the other nodes. So, we ask it to calculate the probability that the light is on. Once we obtain this value, we provide some evidence to the network based on our observations. In this case we say that we heard a bark and that the dog does not have a bowel problem. We then ask the network again to calculate the probability that the light is on.
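For readers who want to see what the two queries amount to, they can be written out by hand. This is a sketch rather than a description of what either library does internally; the symbols f (a state of familyOut) and d (a state of dogOut) are shorthand introduced here. Because lightOn has familyOut as its only parent, the first query is a sum over the two states of familyOut, and the second query only changes how likely each of those states is.

```latex
\begin{align*}
P(\text{lightOn}) &= \sum_{f} P(\text{lightOn} \mid f)\, P(f)
                   = 0.60 \times 0.15 + 0.05 \times 0.85 = 0.1325 \\[4pt]
P(\text{lightOn} \mid \text{hearBark}, \text{noBowelProblem})
                  &= \sum_{f} P(\text{lightOn} \mid f)\, P(f \mid \text{hearBark}, \text{noBowelProblem}) \\[4pt]
P(f \mid \text{hearBark}, \text{noBowelProblem})
                  &\propto P(f) \sum_{d} P(d \mid f, \text{noBowelProblem})\, P(\text{hearBark} \mid d)
\end{align*}
```

Hearing a bark makes it more likely that the dog is out, which in turn makes it more likely that the family is out, and that is what raises the probability of the light being on.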

Providing evidence and calculating belief with Netica:

    // Figure out probability of light being on with no evidence given (just based off of probabilities)
    double belief = lightOn.getBelief("lightOn");
    System.out.println("The probability of the light being on is " + belief);

    // Enter evidence, certain things that we observe
    hearBark.finding().enterState("hearBark");
    bowelProblem.finding().enterState("noBowelProblem");

    // Recalculate probability of light being on given evidence
    belief = lightOn.getBelief("lightOn");
    System.out.println("The probability of the light being on given a bark was heard "
            + "and no bowel problem is " + belief);

Providing evidence and calculating belief with JavaBayes:

    double belief = getBelief(ig, lightOn);
    System.out.println("The probability of the light being on is " + belief);

    // Enter evidence, certain things that we observe
    hearBark.set_observation_value("hearBark");
    bowelProblem.set_observation_value("noBowelProblem");

    // Recalculate probability of light being on given evidence
    belief = getBelief(ig, lightOn);
    System.out.println("The probability of the light being on given a bark was heard "
            + "and no bowel problem is " + belief);

The output of the Netica program when it is run gives:

The probability of the light being on is 0.13249997794628143

The probability of the light being on given a bark was heard and no bowel problem is 0.23651915788650513

and the output of the JavaBayes program gives:

The probability of the light being on is 0.1325

The probability of the light being on given a bark was heard and no bowel problem is 0.23651916875671802

The two libraries agree to a fair number of significant digits.
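As an independent sanity check on these numbers, the network is small enough to evaluate by brute force. The sketch below uses neither Netica nor JavaBayes: it sums the joint probability over all 32 combinations of the five two-valued nodes, keeping only the combinations that match the evidence. The class name, method names, and array layout are made up for this illustration.

```java
/** Illustrative brute-force check of the DogProblem numbers; uses neither library. */
final class DogProblemEnumeration {
    // Throughout, state 0 is the 'true' state and state 1 is the 'false' state.
    static final double[] FAMILY_OUT = {0.15, 0.85};
    static final double[] BOWEL_PROBLEM = {0.01, 0.99};
    // P(lightOn | familyOut state)
    static final double[] LIGHT_ON = {0.60, 0.05};
    // P(dogOut | familyOut state, bowelProblem state)
    static final double[][] DOG_OUT = {{0.99, 0.90}, {0.97, 0.30}};
    // P(hearBark | dogOut state)
    static final double[] HEAR_BARK = {0.70, 0.01};

    /** Probability of a two-valued state: pTrue for state 0, 1 - pTrue for state 1. */
    static double pick(double pTrue, int state) {
        return state == 0 ? pTrue : 1.0 - pTrue;
    }

    /** Joint probability of one full assignment: the product of the five local tables. */
    static double joint(int f, int b, int l, int d, int h) {
        return FAMILY_OUT[f] * BOWEL_PROBLEM[b]
                * pick(LIGHT_ON[f], l)
                * pick(DOG_OUT[f][b], d)
                * pick(HEAR_BARK[d], h);
    }

    /**
     * P(lightOn | evidence). Pass 0 or 1 to fix the state of hearBark or
     * bowelProblem, or a negative value to leave that node unobserved.
     */
    static double lightOnBelief(int hearBarkEvidence, int bowelEvidence) {
        double lightOnMass = 0.0;
        double total = 0.0;
        for (int f = 0; f < 2; f++)
            for (int b = 0; b < 2; b++)
                for (int l = 0; l < 2; l++)
                    for (int d = 0; d < 2; d++)
                        for (int h = 0; h < 2; h++) {
                            if (hearBarkEvidence >= 0 && h != hearBarkEvidence) continue;
                            if (bowelEvidence >= 0 && b != bowelEvidence) continue;
                            double p = joint(f, b, l, d, h);
                            total += p;
                            if (l == 0) lightOnMass += p;
                        }
        return lightOnMass / total;
    }

    public static void main(String[] args) {
        System.out.println("No evidence: " + lightOnBelief(-1, -1));
        System.out.println("Bark heard, no bowel problem: " + lightOnBelief(0, 1));
    }
}
```

Enumeration like this grows exponentially with the number of nodes, so it is only useful for checking toy networks; the libraries exist precisely so that larger networks do not have to be evaluated this way.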

These are the helper methods used with JavaBayes to make its API more similar to Netica’s:

    /**
     * Helper function to create node since not as straightforward with JavaBayes
     * to get a pointer back to the node that is being added
     */
    private static InferenceGraphNode createNode(
            InferenceGraph ig, String name, String trueVariable, String falseVariable) {
        ig.create_node(0, 0);
        InferenceGraphNode node = (InferenceGraphNode) ig.get_nodes().lastElement();
        node.set_name(name);
        ig.change_values(node, new String[] {trueVariable, falseVariable});
        return node;
    }

    /**
     * Sets probabilities for a leaf node
     */
    private static void setProbabilityValues(InferenceGraphNode node,
            double trueValue, double falseValue) {
        node.set_function_values(new double[] {trueValue, falseValue});
    }

    /**
     * Returns the index of the variable for the parent that has the given variable
     */
    private static int getVariableIndex(InferenceGraphNode node, String parentVariable) {
        for (InferenceGraphNode parent : (Vector<InferenceGraphNode>) node.get_parents()) {
            int variableIndex = 0;
            for (String variable : parent.get_values()) {
                if (variable.equals(parentVariable)) {
                    return variableIndex;
                }
                variableIndex++;
            }
        }
        return 0;
    }

    /**
     * Returns the total number of values for the parent that has the given variable
     */
    private static int getTotalValues(InferenceGraphNode node, String parentVariable) {
        for (InferenceGraphNode parent : (Vector<InferenceGraphNode>) node.get_parents()) {
            for (String variable : parent.get_values()) {
                if (variable.equals(parentVariable)) {
                    return parent.get_number_values();
                }
            }
        }
        return 0;
    }

    /**
     * Sets probabilities for a node that has a parent
     */
    private static void setProbabilityValues(InferenceGraphNode node, String parentVariable,
            double trueValue, double falseValue) {
        int variableIndex = getVariableIndex(node, parentVariable);
        int totalValues = getTotalValues(node, parentVariable);
        double[] probabilities = node.get_function_values();
        probabilities[variableIndex] = trueValue;
        probabilities[variableIndex + totalValues] = falseValue;
        node.set_function_values(probabilities);
    }

    /**
     * Sets probabilities for a node that has two parents
     */
    private static void setProbabilityValues(InferenceGraphNode node,
            String firstParentVariable, String secondParentVariable,
            double trueValue, double falseValue) {
        // Position of this parent combination: the first parent's index is
        // weighted by the number of values of the second parent
        int variableIndex = (getVariableIndex(node, firstParentVariable)
                * getTotalValues(node, secondParentVariable))
                + getVariableIndex(node, secondParentVariable);
        // Offset from the 'true' entries to the 'false' entries:
        // one entry per combination of parent values
        int totalValues = getTotalValues(node, firstParentVariable)
                * getTotalValues(node, secondParentVariable);
        double[] probabilities = node.get_function_values();
        probabilities[variableIndex] = trueValue;
        probabilities[variableIndex + totalValues] = falseValue;
        node.set_function_values(probabilities);
    }

    /**
     * Gets the belief/true result from the inference of the given node
     */
    private static double getBelief(InferenceGraph ig, InferenceGraphNode node) {
        QBInference qbi = new QBInference(ig.get_bayes_net(), false);
        qbi.inference(node.get_name());
        return qbi.get_result().get_value(0);
    }
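The index arithmetic in the setProbabilityValues helpers implies a particular layout for the flat array returned by get_function_values: the node's own value acts as the most significant position, followed by each parent in order. That layout is inferred from the helpers here, not taken from JavaBayes documentation, so treat the following as a sketch under that assumption. The class and method names are hypothetical; the method computes the same positions the helpers use, for any number of parents and values.

```java
/** Illustrative only: the flat-array position implied by the helper methods. */
final class FlatTableIndex {

    /**
     * Computes a position in a flat probability table.
     * sizes[0] and indices[0] describe the node itself; the remaining entries
     * describe its parents in order. Earlier entries are more significant.
     */
    static int flatIndex(int[] sizes, int[] indices) {
        int index = 0;
        for (int i = 0; i < sizes.length; i++) {
            index = index * sizes[i] + indices[i];
        }
        return index;
    }

    public static void main(String[] args) {
        // dogOut has two parents, all three variables are two-valued.
        // 'true' entry for (familyOut, noBowelProblem): node 0, parents 0 and 1
        System.out.println(flatIndex(new int[] {2, 2, 2}, new int[] {0, 0, 1}));
        // 'false' entry for the same parent combination: node 1, parents 0 and 1
        System.out.println(flatIndex(new int[] {2, 2, 2}, new int[] {1, 0, 1}));
    }
}
```

For the DogProblem network the two entries for a given parent combination are always four positions apart in the dogOut table and two positions apart in the single-parent tables, which matches the offsets used by the helpers.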

As you can see, with a few helpers added for the JavaBayes implementation, a simple Bayesian network can be created to calculate inference with both libraries. If you are just interested in using the APIs and not the user interfaces, the Netica library is easier to use out of the box and supports multiple languages, but it has a commercial license and relies on native libraries for its Java implementation. To play around with both of these libraries, feel free to run the examples yourself using the source. Note that only the Windows native library is included for Netica, but others are available on the Netica site.