nifi-issues mailing list archives

From markap14 <...@git.apache.org>
Subject [GitHub] nifi pull request #1214: NIFI-2876 refactored demarcators into a common abst...
Date Thu, 23 Feb 2017 14:40:32 GMT
Github user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/1214#discussion_r102718539
  
    --- Diff: nifi-commons/nifi-utils/src/main/java/org/apache/nifi/stream/io/util/StreamDemarcator.java ---
    @@ -102,99 +83,53 @@ public StreamDemarcator(InputStream is, byte[] delimiterBytes, int maxDataSize,
          * @throws IOException if unable to read from the stream
          */
         public byte[] nextToken() throws IOException {
    -        byte[] data = null;
    +        byte[] token = null;
             int j = 0;
    -
    -        while (data == null && this.buffer != null) {
    -            if (this.index >= this.readAheadLength) {
    +        nextTokenLoop:
    +        while (token == null && this.bufferLength != -1) {
    +            if (this.index >= this.bufferLength) {
                     this.fill();
                 }
    -            if (this.index >= this.readAheadLength) {
    -                data = this.extractDataToken(0);
    -                this.buffer = null;
    -            } else {
    -                byte byteVal = this.buffer[this.index++];
    -                if (this.delimiterBytes != null && this.delimiterBytes[j] == byteVal) {
    -                    if (++j == this.delimiterBytes.length) {
    -                        data = this.extractDataToken(this.delimiterBytes.length);
    +            if (this.bufferLength != -1) {
    +                byte byteVal;
    +                int i;
    +                for (i = this.index; i < this.bufferLength; i++) {
    +                    byteVal = this.buffer[i];
    +
    +                    boolean delimiterFound = false;
    +                    if (this.delimiterBytes != null && this.delimiterBytes[j] == byteVal) {
    --- End diff --
    
    This seems to be buggy. If this.delimiterBytes[j] == byteVal, we increment j. But if the
    next byte does not match, j has already been incremented and it won't get reset. As a result,
    if we find all bytes of the delimiter in the proper order, we return that token, even if the
    bytes are not contiguous. Please add the following unit test to the test case and you will
    see the failure:
    
    ```
        @Test
        public void testOnPartialMatchThenSubsequentPartialMatch() throws IOException {
            final byte[] inputData = "A Great Big Boy".getBytes(StandardCharsets.UTF_8);
            final byte[] delimBytes = "AB".getBytes(StandardCharsets.UTF_8);
    
            try (final InputStream is = new ByteArrayInputStream(inputData);
                final StreamDemarcator demarcator = new StreamDemarcator(is, delimBytes, 4096)) {
    
                final byte[] bytes = demarcator.nextToken();
                assertArrayEquals(inputData, bytes);
    
                assertNull(demarcator.nextToken());
            }
        }
    
    ```
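
    For reference, here is a minimal sketch of the contiguous-match behavior the test above
    expects. It is not the StreamDemarcator code: the class and method names
    (DelimiterMatchSketch, indexOfDelimiter) are hypothetical, and it scans a plain in-memory
    byte array rather than a refillable stream buffer. The only point is that j is reset, and
    the scan rewound, as soon as a partial match breaks.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of NiFi or of this pull request.
class DelimiterMatchSketch {

    /**
     * Returns the index of the first contiguous occurrence of delimiter in data,
     * or -1 if there is none (or if the delimiter is null or empty).
     */
    public static int indexOfDelimiter(final byte[] data, final byte[] delimiter) {
        if (delimiter == null || delimiter.length == 0) {
            return -1;
        }

        int j = 0; // number of delimiter bytes matched so far
        for (int i = 0; i < data.length; i++) {
            if (data[i] == delimiter[j]) {
                if (++j == delimiter.length) {
                    return i - delimiter.length + 1; // start of the full match
                }
            } else {
                // Partial match broken: rewind to the byte after where the partial
                // match started (the loop's i++ completes the move) and reset j.
                i -= j;
                j = 0;
            }
        }
        return -1;
    }

    public static void main(final String[] args) {
        final byte[] delim = "AB".getBytes(StandardCharsets.UTF_8);
        // 'A' and 'B' both occur, but never next to each other, so there is no match.
        System.out.println(indexOfDelimiter("A Great Big Boy".getBytes(StandardCharsets.UTF_8), delim));
        // Here the delimiter bytes are contiguous.
        System.out.println(indexOfDelimiter("xxAByy".getBytes(StandardCharsets.UTF_8), delim));
    }
}
```

    Rewinding by j keeps overlapping cases correct (for example delimiter "AAB" in "AAAB").
    In the real class the rewind would also have to account for a partial match that spans
    a buffer refill, which this sketch does not attempt.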

