bahir-reviews mailing list archives

From c-w <...@git.apache.org>
Subject [GitHub] bahir pull request #43: [BAHIR-117] Expand filtering options for TwitterInpu...
Date Thu, 18 May 2017 16:20:02 GMT
Github user c-w commented on a diff in the pull request:

    https://github.com/apache/bahir/pull/43#discussion_r117292632
  
    --- Diff: streaming-twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala ---
    @@ -85,10 +85,8 @@ class TwitterReceiver(
             }
           })
     
    -      val query = new FilterQuery
    -      if (filters.size > 0) {
    -        query.track(filters.mkString(","))
    -        newTwitterStream.filter(query)
    +      if (query.isDefined) {
    +        newTwitterStream.filter(query.get)
           } else {
    --- End diff --
    
    As I mentioned in the PR description, hiding the FilterQuery from the user means that we can only filter the Twitter stream via disjunctive keyword queries:
    
    ```scala
    // this will give us any Tweet that contains "foo", "bar" or "baz"
    val tweets = TwitterUtils.createStream(ssc, Seq("foo", "bar", "baz"))
    ```
    
    However, the Twitter stream API also supports many other types of filtering, including:
    - Receive Tweets that are tagged with a particular location (ref: [locations](https://dev.twitter.com/streaming/overview/request-parameters#locations))
    - Receive Tweets created by specific users (ref: [follow](https://dev.twitter.com/streaming/overview/request-parameters#follow))
    - Receive Tweets that match a conjunction of keywords (ref: [track with spaces](https://dev.twitter.com/streaming/overview/request-parameters#track))
    
    Refer to Twitter's [official documentation](https://dev.twitter.com/streaming/overview/request-parameters) for a full list of all supported filters.
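
    To make the difference between disjunctive and conjunctive `track` phrases concrete, here is a minimal, self-contained sketch that models the documented behaviour: commas separate alternative phrases, and spaces inside a phrase require all of its terms. The `TrackSemantics` object and its tokenization are hypothetical and much simpler than Twitter's real matching; nothing here calls the Twitter API.

    ```scala
    // Hypothetical local model of `track` matching; not part of Bahir or twitter4j.
    object TrackSemantics {
      // Lower-case the text and split it into terms on anything that is
      // not a letter or digit (a simplification of Twitter's tokenization).
      private def terms(text: String): Set[String] =
        text.toLowerCase.split("[^\\p{L}\\p{N}]+").filter(_.nonEmpty).toSet

      // True if the tweet matches at least one comma-separated phrase,
      // where a phrase matches only if all of its space-separated terms occur.
      def matchesTrack(track: String, tweet: String): Boolean = {
        val tweetTerms = terms(tweet)
        track.split(",").map(_.trim).filter(_.nonEmpty).exists { phrase =>
          phrase.toLowerCase.split("\\s+").forall(tweetTerms.contains)
        }
      }
    }
    ```

    Under this model, `matchesTrack("foo,bar,baz", "I like bar")` is true because one alternative matches, while `matchesTrack("foo bar", "only foo here")` is false because "bar" is missing.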
    
    By exposing the FilterQuery, we let users take advantage of all of these filters, as well as any future filters that Twitter may introduce.
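
    As a rough illustration of the filters listed above, the sketch below builds three `FilterQuery` values. It assumes twitter4j 4.x, where `track`, `follow` and `locations` accept varargs (older 3.x releases take arrays instead). The object name, user IDs and coordinates are placeholders, and the call that hands a query to `TwitterUtils.createStream` is left out because its signature depends on the final shape of this PR.

    ```scala
    import twitter4j.FilterQuery

    // Illustrative only: the object name and all values below are placeholders.
    object FilterQueryExamples {
      // Conjunction of keywords: a space inside a phrase means "all of these terms".
      val fooAndBar: FilterQuery = new FilterQuery().track("foo bar")

      // Tweets created by specific users, identified by numeric user ID.
      val byUsers: FilterQuery = new FilterQuery().follow(12345L, 67890L)

      // Tweets located inside a bounding box, given as (longitude, latitude)
      // pairs for the south-west and north-east corners.
      val byLocation: FilterQuery =
        new FilterQuery().locations(Array(-122.75, 36.8), Array(-121.75, 37.8))
    }
    ```

    Keeping each filter in its own query avoids relying on how Twitter combines `track`, `follow` and `locations` when they appear in a single request.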


