datafu-dev mailing list archives

From Eyal Allweil via Review Board <nore...@reviews.apache.org>
Subject Re: Review Request 27820: Setup for Macros in DataFu. Basic setup, no automated testing. Need feedback.
Date Thu, 14 Sep 2017 09:20:02 GMT


> On Nov. 14, 2014, 2:40 a.m., Matthew Hayes wrote:
> > datafu-pig/src/main/macros/nlp/tf_idf.pig
> > Lines 72 (patched)
> > <https://reviews.apache.org/r/27820/diff/1/?file=756916#file756916line72>
> >
> >     Shouldn't this be SUM?

As far as I can tell, it's OK that this is COUNT, if we're counting documents. As I understand
TF-IDF, for the IDF part we're dividing by the number of documents, not by actual occurrences.
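
To make the distinction concrete, here is a minimal sketch in Python (not the Pig macro under
review, which isn't reproduced here). The function names and the sample documents are made up
for illustration, and the unsmoothed log(N / df) form of IDF is an assumption; the macro may
use a different variant.

```python
import math
from collections import Counter


def document_frequency(docs):
    """COUNT semantics: number of distinct documents that contain each term."""
    df = Counter()
    for tokens in docs.values():
        # set() ensures a term counts once per document, however often it repeats.
        df.update(set(tokens))
    return df


def tf_idf(docs):
    """docs maps doc_id -> list of tokens; returns {(doc_id, term): score}."""
    n_docs = len(docs)
    df = document_frequency(docs)
    scores = {}
    for doc_id, tokens in docs.items():
        for term, occurrences in Counter(tokens).items():
            tf = occurrences / len(tokens)        # share of this document's tokens
            idf = math.log(n_docs / df[term])     # divides by documents, not occurrences
            scores[(doc_id, term)] = tf * idf
    return scores


# Hypothetical sample data: "pig" occurs four times in total but in only two documents.
sample_docs = {
    "a": ["pig", "pig", "pig", "hadoop"],
    "b": ["pig", "hive"],
}
sample_scores = tf_idf(sample_docs)
```

With this data, SUM over occurrences would give 4 for "pig", while COUNT over documents gives
2, and only the latter is the document frequency that IDF divides by.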


- Eyal


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27820/#review61348
-----------------------------------------------------------


On Nov. 10, 2014, 8:33 p.m., Russell Jurney wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27820/
> -----------------------------------------------------------
> 
> (Updated Nov. 10, 2014, 8:33 p.m.)
> 
> 
> Review request for DataFu, pig, Joseph Adler, Jakob Homan, Matthew Hayes, and Sam Shah.
> 
> 
> Repository: datafu
> 
> 
> Description
> -------
> 
> DATAFU-61 - Add TF-IDF Macro to DataFu
> 
> 
> Diffs
> -----
> 
>   datafu-pig/src/main/macros/nlp/tf_idf.pig PRE-CREATION 
>   datafu-pig/src/test/macros/nlp/test_tf_idf.pig PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/27820/diff/1/
> 
> 
> Testing
> -------
> 
> Works for me, but testing not automated. See https://issues.apache.org/jira/browse/DATAFU-61
> 
> 
> Thanks,
> 
> Russell Jurney
> 
>

