edu.buffalo.nsf.xmlcqa.data.dtd
Class NDFA

java.lang.Object
  extended byedu.buffalo.nsf.xmlcqa.data.dtd.NDFA
All Implemented Interfaces:
Constants
Direct Known Subclasses:
RegExpr

public abstract class NDFA
extends java.lang.Object
implements Constants

A class representing an nondeterminisic finite state automata (NDFA). The number of states of the NDFA can be obtained by getStateCount(). The states are identified by natural numbers starting with 0. The method getFinalStates() indicates the final states. The transition relation is dividad into two parts: epsilon transitions (obtained with getEpsilonDelta()) and transitions facilitated with a tag (obtained wiht getDelta()). Although we distinguish between epsilon and tag transitions, the transitios facilitated with tags should be closed under epsilon transitions (i.e. if there is a transition from a state i to a state j using tag t and there is an epsilon transition from the state j to a state k, then there should be a transition from i to k with the tag t)

An NDFA can be viewed as a directed graph, with its vertices being states and a pair of states being connected with an edge labeled with a tag if there is a transition from the first state to the second when reading the given tag. This graph is called here the transition graph of the given NDFA.

In the context of a DTD, crossing an edge of the transition graph can be charged with a cost corresponding to the minimal size of a document satifying the DTD and whose root is labeled with the tag of the edge (see the method DTD.getMinTreeSize(Tag)). Implementation of this class should provide methos for finding all cheapest paths between every pair of vertices by means of methods: getMinCostDelta(DTD), getFstIndTrDelta(DTD), and getFstTagTrDelta(DTD).
The first method getMinCostDelta(DTD) returns a two dimensional array containing minimal costs of paths in the transition graph between every pair of vertices. The array returned by getFstIndTrDelta(DTD) allows to construct all minimal path between any two vertices. The method getFstTagTrDelta(DTD) provides tags that the edges of the paths are labeled with. And so for a DTD d and two vertices i and j, the minimal cost of getting from i to j is getMinCostDelta(d)[i][j]. The values getFstIndTrDelta(d)[i][j] and getFstTagTrDelta(d)[i][j] consist of eaqual in length vectors of state identifiers and tags respectively. Assuming that the length of those vectors is n, denote the elements of those two vectors by k_1, ..., k_n (the states) and t_1, ..., t_n (the tags). For any m <= n, if i,v_1,...v_s,k_m is a ceapest path from i to k_m, then i,v_1,...,v_s,k_m,j is a minimal path between from i to j, and the edge from k_m to j is labeled with t_m. This resoning captures all cheapest paths from i to j.

Author:
staworko

Field Summary
static RegExpr EPSILON
          A constant representing epsilon regular expression.
 
Fields inherited from interface edu.buffalo.nsf.xmlcqa.Constants
INF
 
Constructor Summary
NDFA()
           
 
Method Summary
abstract  boolean[][][] getDelta()
          Returns the tag controled part of the transition relation.
abstract  boolean[][] getEpsilonDelta()
          Returns the epsilon part of the transition relation.
abstract  boolean[] getFinalStates()
          Returns a boolean vector specifying final states.
abstract  int[][][] getFstIndTrDelta(DTD d)
          Returns a two dimensional array containing vectors of suffixes of the cheapest paths between each pair of vertices.
abstract  Tag[][][] getFstTagTrDelta(DTD d)
          Returns a two dimentional array containing vectors of suffizes of labels of the cheapest paths between each pair of vertices.
abstract  int[][] getMinCostDelta(DTD d)
          Returns an array containing costs of cheapest paths between every pari of states.
abstract  int getStateCount()
          Returns the number of states used by the underlying NDFA.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

EPSILON

public static final RegExpr EPSILON
A constant representing epsilon regular expression. This regular expression defines set consisting of only one word: the empty word.

Constructor Detail

NDFA

public NDFA()
Method Detail

getDelta

public abstract boolean[][][] getDelta()
Returns the tag controled part of the transition relation. getDelta()[t][i][j] is true if and only if there is transition with a tag of identifier equal to t, from the state i to the state j


getEpsilonDelta

public abstract boolean[][] getEpsilonDelta()
Returns the epsilon part of the transition relation. getEpsilonDelta()[i][j] is true if and only if there is an epsilon transition from the state i to the state j.


getFinalStates

public abstract boolean[] getFinalStates()
Returns a boolean vector specifying final states. A state i is final if and only if getFinalStates()[i] is true.


getFstIndTrDelta

public abstract int[][][] getFstIndTrDelta(DTD d)
Returns a two dimensional array containing vectors of suffixes of the cheapest paths between each pair of vertices.


getFstTagTrDelta

public abstract Tag[][][] getFstTagTrDelta(DTD d)
Returns a two dimentional array containing vectors of suffizes of labels of the cheapest paths between each pair of vertices.


getMinCostDelta

public abstract int[][] getMinCostDelta(DTD d)
Returns an array containing costs of cheapest paths between every pari of states. The costs of the edges depend on the values of DTD.getMinTreeSize(Tag).


getStateCount

public abstract int getStateCount()
Returns the number of states used by the underlying NDFA. The states are number with indexes 0 and higher (similar to arrays).