flume-dev mailing list archives

From neerja khattar <neerjakhat...@cloudera.com>
Subject Re: Review Request 47098: FLUME-2620 File channel throws NullPointerException if a header value is null
Date Fri, 13 May 2016 04:43:17 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47098/
-----------------------------------------------------------

(Updated May 13, 2016, 4:43 a.m.)


Review request for Flume.


Changes
-------

Addressed all of the requested changes.


Repository: flume-git


Description
-------

When a header value is null, the file channel throws a NullPointerException and Flume stops
processing further events.
For example:
[{
  "headers" : {
             "timestamp" : "434324343",
             "host" : null
             },
  "body" : "random_body"
  }]
  
  The proposed fix:
  
  1. If a header has a null value in the JSON, Flume replaces it with a replacement
string.
  2. The default replacement string is an empty string.
  3. To override the default, set the "handler.nullReplacementHeader" property in the Flume config.
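
The three rules above can be sketched as follows. This is only an illustration of the replacement behaviour, not the code in the patch: the class name NullHeaderSketch and the helper replaceNullHeaders are hypothetical, and the real change lives in JSONHandler and NullHeaderReplacement.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NullHeaderSketch {

    // Hypothetical default, mirroring rule 2: an empty string.
    static final String DEFAULT_REPLACEMENT = "";

    // Hypothetical helper: returns a copy of the headers in which every
    // null value is replaced by the given replacement string (rule 1).
    static Map<String, String> replaceNullHeaders(Map<String, String> headers,
                                                  String replacement) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : headers.entrySet()) {
            String value = entry.getValue();
            result.put(entry.getKey(), value == null ? replacement : value);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("timestamp", "434324343");
        headers.put("host", null);

        // No replacement configured: fall back to the default (rule 2).
        System.out.println(replaceNullHeaders(headers, DEFAULT_REPLACEMENT));
        // Replacement configured as "abc" (rule 3).
        System.out.println(replaceNullHeaders(headers, "abc"));
    }
}
```

Non-null values pass through unchanged, so the replacement string only matters when a header is actually null.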


Diffs (updated)
-----

  flume-ng-core/src/main/java/org/apache/flume/source/http/HTTPSource.java b520b03 
  flume-ng-core/src/main/java/org/apache/flume/source/http/HTTPSourceConfigurationConstants.java 86caf7d 
  flume-ng-core/src/main/java/org/apache/flume/source/http/JSONHandler.java 197f66a 
  flume-ng-core/src/main/java/org/apache/flume/source/http/NullHeaderReplacement.java PRE-CREATION

  flume-ng-core/src/test/java/org/apache/flume/source/http/TestJSONHandler.java 455781c 

Diff: https://reviews.apache.org/r/47098/diff/


Testing
-------

The following are the test cases:

1. The header has a null value in the JSON and handler.nullReplacementHeader is not set in the Flume config.
The default value (an empty string) replaces the null.

[{
  "headers" : {
             "timestamp" : "434324343",
             "host" : null
             },
  "body" : "random_body"
  }]
  
  Output in HDFS: {timestamp=434324343, host=} random_body
  
  2. The header is not null in the JSON and handler.nullReplacementHeader is not set in the Flume config.
The replacement logic does not come into play.
   
  [{
  "headers" : {
             "timestamp" : "434324343",
             "host" : 1
             },
  "body" : "random_body"
  }]
  Output in HDFS: {timestamp=434324343, host=1} random_body
  
  3. The header has a null value in the JSON and handler.nullReplacementHeader=abc is set in the Flume config.
The null value in the header is replaced by abc.

  
  [{
  "headers" : {
             "timestamp" : "434324343",
             "host" : null
             },
  "body" : "random_body"
  }]
  
 
  Output in HDFS: {timestamp=434324343, host=abc} random_body
  
  4. The header has a null value in the JSON and handler.nullReplacementHeader=1 is set in the Flume config.
The null value in the header is replaced by the string "1".
  
  [{
  "headers" : {
             "timestamp" : "434324343",
             "host" : null
             },
  "body" : "random_body"
  }]
  
 
  Output in HDFS: {timestamp=434324343, host=1} random_body
  
  5. The header is not null in the JSON and handler.nullReplacementHeader is also set in the Flume config.
The replacement logic does not come into play.
   
  [{
  "headers" : {
             "timestamp" : "434324343",
             "host" : 1
             },
  "body" : "random_body"
  }]
  Output in HDFS: {timestamp=434324343, host=1} random_body
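
For the cases where the property is set (for example case 3), the source configuration might look roughly like the fragment below. The agent name a1 and source name r1 are placeholders, and handler.nullReplacementHeader is the property proposed in this review request, so it only applies with the patch in place.

```properties
# Placeholder agent (a1) and source (r1) names.
a1.sources.r1.type = http
a1.sources.r1.handler = org.apache.flume.source.http.JSONHandler
# Property proposed in this review: string used in place of null header values.
a1.sources.r1.handler.nullReplacementHeader = abc
```

Leaving the last line out corresponds to cases 1 and 2, where the default empty string is used.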


File Attachments
----------------

flume-2620
  https://reviews.apache.org/media/uploaded/files/2016/05/09/0eff1d56-caf3-4d36-bb45-6b9e7fd6a1ff__FLUME-2620-1.patch


Thanks,

neerja khattar

