opennlp-dev mailing list archives

From Carlos Scheidecker
Subject Fwd: How to convert the tree string representation into a tree object
Date Wed, 05 Feb 2014 20:28:52 GMT
Hello all,

If you have something like this:

Statement : On Tuesday, John Smith bought a Honda Accord

(TOP (SBAR (IN On) (S (NP (NNP Tuesday,) (NNP John) (NNP Smith)) (VP (VBD
bought) (NP (DT a) (NNP Honda) (NNP Accord))))))

I generated that with the show() method of the Parse class, shown below:

  /** Displays this parse using Penn Treebank-style formatting. */
  public void show() {
    StringBuffer sb = new StringBuffer(text.length()*4);
    show(sb);
    System.out.println(sb);
  }

My question is: How can I translate (TOP (SBAR (IN On) (S (NP (NNP
Tuesday,) (NNP John) (NNP Smith)) (VP (VBD bought) (NP (DT a) (NNP Honda)
(NNP Accord)))))) into a tree object?

I think I would parse it character by character: the first "(" starts a new
node, each nested "(" starts a new child of the current node, and ")" ends
the current node.
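
A minimal sketch of that idea, to make the question concrete. TreeNode and BracketTreeReader are hypothetical names invented for this illustration and are not part of OpenNLP. The sketch assumes every node has the form "(LABEL child...)" where each child is either another bracketed node or a single token, and that tokens never contain a literal parenthesis.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical tree node for illustration; not an OpenNLP class. */
final class TreeNode {
    final String label;                                 // e.g. "NP", "VBD"
    String token;                                       // the word, for nodes that wrap one; else null
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String label) { this.label = label; }
}

/** Hypothetical recursive-descent reader for Penn Treebank-style bracketed strings. */
final class BracketTreeReader {
    private final String s;
    private int pos = 0;

    private BracketTreeReader(String s) { this.s = s; }

    /** Parses a whole bracketed string and returns its root node. */
    static TreeNode parse(String bracketed) {
        BracketTreeReader r = new BracketTreeReader(bracketed);
        r.skipWhitespace();
        TreeNode root = r.readNode();
        r.skipWhitespace();
        if (r.pos != r.s.length()) {
            throw new IllegalArgumentException("Unexpected trailing input at " + r.pos);
        }
        return root;
    }

    // "(" opens a node, each nested "(" adds a child, ")" closes the node.
    private TreeNode readNode() {
        expect('(');
        TreeNode node = new TreeNode(readWord());
        skipWhitespace();
        while (pos < s.length() && s.charAt(pos) != ')') {
            if (s.charAt(pos) == '(') {
                node.children.add(readNode());
            } else {
                node.token = readWord();   // if several bare tokens appear, the last one wins
            }
            skipWhitespace();
        }
        expect(')');
        return node;
    }

    // Reads a label or token: a run of characters up to whitespace or a parenthesis.
    private String readWord() {
        skipWhitespace();
        int start = pos;
        while (pos < s.length() && !Character.isWhitespace(s.charAt(pos))
                && s.charAt(pos) != '(' && s.charAt(pos) != ')') {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a label or token at " + pos);
        }
        return s.substring(start, pos);
    }

    private void expect(char c) {
        if (pos >= s.length() || s.charAt(pos) != c) {
            throw new IllegalArgumentException("Expected '" + c + "' at " + pos);
        }
        pos++;
    }

    private void skipWhitespace() {
        while (pos < s.length() && Character.isWhitespace(s.charAt(pos))) {
            pos++;
        }
    }
}
```

Calling BracketTreeReader.parse(...) on the string above would give a root node labeled TOP with a single SBAR child. Line breaks inside the string are treated like any other whitespace.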

I was thinking of doing something like this, but instead of appending to a
string buffer I would build and return a tree object:

  /**
   * Appends the specified string buffer with a string representation of
   * this parse.
   * @param sb A string buffer into which the parse string can be appended.
   */
  public void show(StringBuffer sb) {
    int start;
    start = span.getStart();
    if (!type.equals(AbstractBottomUpParser.TOK_NODE)) {
      sb.append("(");
      sb.append(type).append(" ");
      //System.out.print(label+" ");
      //System.out.print(head+" ");
      //System.out.print(df.format(prob)+" ");
    }
    for (Iterator<Parse> i = parts.iterator(); i.hasNext();) {
      Parse c = i.next();
      Span s = c.span;
      if (start < s.getStart()) {
        //System.out.println("pre "+start+" "+s.getStart());
        sb.append(encodeToken(text.substring(start, s.getStart())));
      }
      c.show(sb);
      start = s.getEnd();
    }
    if (start < span.getEnd()) {
      sb.append(encodeToken(text.substring(start, span.getEnd())));
    }
    if (!type.equals(AbstractBottomUpParser.TOK_NODE)) {
      sb.append(")");
    }
  }

Is there an existing example that someone has already written and could
point me to?


