labs-commits mailing list archives

From tomm...@apache.org
Subject svn commit: r1708601 - in /labs/yay/trunk/core/src: main/java/org/apache/yay/core/BackPropagationLearningStrategy.java test/java/org/apache/yay/core/WordVectorsTest.java test/resources/word2vec/sentences.txt
Date Wed, 14 Oct 2015 13:37:08 GMT
Author: tommaso
Date: Wed Oct 14 13:37:08 2015
New Revision: 1708601

URL: http://svn.apache.org/viewvc?rev=1708601&view=rev
Log:
minor improvements

Modified:
    labs/yay/trunk/core/src/main/java/org/apache/yay/core/BackPropagationLearningStrategy.java
    labs/yay/trunk/core/src/test/java/org/apache/yay/core/WordVectorsTest.java
    labs/yay/trunk/core/src/test/resources/word2vec/sentences.txt

Modified: labs/yay/trunk/core/src/main/java/org/apache/yay/core/BackPropagationLearningStrategy.java
URL: http://svn.apache.org/viewvc/labs/yay/trunk/core/src/main/java/org/apache/yay/core/BackPropagationLearningStrategy.java?rev=1708601&r1=1708600&r2=1708601&view=diff
==============================================================================
--- labs/yay/trunk/core/src/main/java/org/apache/yay/core/BackPropagationLearningStrategy.java (original)
+++ labs/yay/trunk/core/src/main/java/org/apache/yay/core/BackPropagationLearningStrategy.java Wed Oct 14 13:37:08 2015
@@ -147,7 +147,11 @@ public class BackPropagationLearningStra
           }
         }
       }
-      updatedParameters[l] = new Array2DRowRealMatrix(updatedWeights);
+      if (updatedParameters[l] != null) {
+        updatedParameters[l].setSubMatrix(updatedWeights, 0, 0);
+      } else {
+        updatedParameters[l] = new Array2DRowRealMatrix(updatedWeights);
+      }
     }
     return updatedParameters;
   }
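
The hunk above replaces an unconditional allocation with an allocate-once-then-reuse pattern: once a slot in updatedParameters holds a matrix, later iterations overwrite its entries in place via setSubMatrix instead of constructing a fresh Array2DRowRealMatrix each time. A minimal standalone sketch of the pattern against Commons Math (class and method names here are illustrative, not from the yay codebase, and it assumes the weight dimensions stay fixed between calls):

import org.apache.commons.math3.linear.Array2DRowRealMatrix;
import org.apache.commons.math3.linear.RealMatrix;

public class ReuseSketch {

  // Stores updatedWeights into slot l, reusing the matrix already there
  // when possible; setSubMatrix overwrites entries in place rather than
  // allocating a new matrix on every call.
  static void store(RealMatrix[] updatedParameters, int l, double[][] updatedWeights) {
    if (updatedParameters[l] != null) {
      // assumes updatedWeights matches the existing dimensions;
      // setSubMatrix throws if it does not fit
      updatedParameters[l].setSubMatrix(updatedWeights, 0, 0);
    } else {
      updatedParameters[l] = new Array2DRowRealMatrix(updatedWeights);
    }
  }

  public static void main(String[] args) {
    RealMatrix[] params = new RealMatrix[1];
    store(params, 0, new double[][] {{1d, 2d}, {3d, 4d}}); // first call allocates
    store(params, 0, new double[][] {{5d, 6d}, {7d, 8d}}); // later calls reuse
    System.out.println(params[0]);
  }
}

In a training loop that runs once per batch, this avoids rebuilding one matrix per layer per iteration, which is presumably the point of the change.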

Modified: labs/yay/trunk/core/src/test/java/org/apache/yay/core/WordVectorsTest.java
URL: http://svn.apache.org/viewvc/labs/yay/trunk/core/src/test/java/org/apache/yay/core/WordVectorsTest.java?rev=1708601&r1=1708600&r2=1708601&view=diff
==============================================================================
--- labs/yay/trunk/core/src/test/java/org/apache/yay/core/WordVectorsTest.java (original)
+++ labs/yay/trunk/core/src/test/java/org/apache/yay/core/WordVectorsTest.java Wed Oct 14 13:37:08 2015
@@ -146,7 +146,6 @@ public class WordVectorsTest {
 
     ObjectOutputStream os = new ObjectOutputStream(new FileOutputStream(new File("target/sg-vectors.bin")));
     MatrixUtils.serializeRealMatrix(vectorsMatrix, os);
-
   }
 
   private String hotDecode(Double[] doubles, List<String> vocabulary) {
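
The test serializes the learned vector matrix with Commons Math's MatrixUtils. A minimal round-trip sketch, assuming commons-math3 and an existing target/ directory; the holder class and its field name are illustrative (deserializeRealMatrix fills a field of the given object by name, via reflection):

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import org.apache.commons.math3.linear.Array2DRowRealMatrix;
import org.apache.commons.math3.linear.MatrixUtils;
import org.apache.commons.math3.linear.RealMatrix;

public class MatrixIoSketch {

  // field populated by MatrixUtils.deserializeRealMatrix, looked up by name
  private RealMatrix vectors;

  public static void main(String[] args) throws IOException, ClassNotFoundException {
    RealMatrix m = new Array2DRowRealMatrix(new double[][] {{1d, 2d}, {3d, 4d}});
    File f = new File("target/sg-vectors.bin");

    // write the matrix to disk, as the test does
    try (ObjectOutputStream os = new ObjectOutputStream(new FileOutputStream(f))) {
      MatrixUtils.serializeRealMatrix(m, os);
    }

    // read it back into the 'vectors' field of a fresh holder
    MatrixIoSketch holder = new MatrixIoSketch();
    try (ObjectInputStream is = new ObjectInputStream(new FileInputStream(f))) {
      MatrixUtils.deserializeRealMatrix(holder, "vectors", is);
    }
    System.out.println(holder.vectors);
  }
}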

Modified: labs/yay/trunk/core/src/test/resources/word2vec/sentences.txt
URL: http://svn.apache.org/viewvc/labs/yay/trunk/core/src/test/resources/word2vec/sentences.txt?rev=1708601&r1=1708600&r2=1708601&view=diff
==============================================================================
--- labs/yay/trunk/core/src/test/resources/word2vec/sentences.txt (original)
+++ labs/yay/trunk/core/src/test/resources/word2vec/sentences.txt Wed Oct 14 13:37:08 2015
@@ -1,8 +1,8 @@
 The word2vec software of Tomas Mikolov and colleagues has gained a lot of traction lately and provides state-of-the-art word embeddings
 The learning models behind the software are described in two research papers
 We found the description of the models in these papers to be somewhat cryptic and hard to follow
-While the motivations and presentation may be obvious to the neural-networks language-modeling crowd we had to struggle quite a bit to figure out the rationale behind the equations
-This note is an attempt to explain the negative sampling equation in “Distributed Representations of Words and Phrases and their Compositionality” by Tomas Mikolov Ilya Sutskever Kai Chen Greg Corrado and Jeffrey Dean
+While the motivations and presentation may be obvious to the neural-networks language-mofdeling crowd we had to struggle quite a bit to figure out the rationale behind the equations
+This note is an attempt to explain the negative sampling equation in Distributed Representations of Words and Phrases and their Compositionality by Tomas Mikolov Ilya Sutskever Kai Chen Greg Corrado and Jeffrey Dean
 The departure point of the paper is the skip-gram model
 In this model we are given a corpus of words w and their contexts c
 We consider the conditional probabilities p(c|w) and given a corpus Text the goal is to set the parameters θ of p(c|w;θ) so as to maximize the corpus probability
@@ -11,7 +11,7 @@ In this paper we present several extensi
 By subsampling of the frequent words we obtain significant speedup and also learn more regular word representations
 We also describe a simple alternative to the hierarchical softmax called negative sampling
 An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases
-For example the meanings of “Canada” and “Air” cannot be easily combined to obtain “Air Canada”
+For example the meanings of Canada and Air cannot be easily combined to obtain Air Canada
 Motivated by this example we present a simple method for finding phrases in text and show that learning good vector representations for millions of phrases is possible
 The similarity metrics used for nearest neighbor evaluations produce a single scalar that quantifies the relatedness of two words
 This simplicity can be problematic since two given words almost always exhibit more intricate relationships than can be captured by a single number
@@ -23,4 +23,15 @@ Unsupervised word representations are ve
 However most of these models are built with only local context and one representation per word
 This is problematic because words are often polysemous and global context can also provide useful information for learning word meanings
 We present a new neural network architecture which 1) learns word embeddings that better capture the semantics of words by incorporating both local and global document context and 2) accounts for homonymy and polysemy by learning multiple embeddings per word
-We introduce a new dataset with human judgments on pairs of words in sentential context and evaluate our model on it showing that our model outperforms competitive baselines and other neural language models
\ No newline at end of file
+We introduce a new dataset with human judgments on pairs of words in sentential context and evaluate our model on it showing that our model outperforms competitive baselines and other neural language models
+Information Retrieval (IR) models need to deal with two difficult issues vocabulary mismatch and term dependencies
+Vocabulary mismatch corresponds to the difficulty of retrieving relevant documents that do not contain exact query terms but semantically related terms
+Term dependencies refers to the need of considering the relationship between the words of the query when estimating the relevance of a document
+A multitude of solutions has been proposed to solve each of these two problems but no principled model solve both
+In parallel in the last few years language models based on neural networks have been used to cope with complex natural language processing tasks like emotion and paraphrase detection
+Although they present good abilities to cope with both term dependencies and vocabulary mismatch problems thanks to the distributed representation of words they are based upon such models could not be used readily in IR where the estimation of one language model per document (or query) is required
+This is both computationally unfeasible and prone to over-fitting
+Based on a recent work that proposed to learn a generic language model that can be modified through a set of document-specific parameters we explore use of new neural network models that are adapted to ad-hoc IR tasks
+Within the language model IR framework we propose and study the use of a generic language model as well as a document-specific language model
+Both can be used as a smoothing component but the latter is more adapted to the document at hand and has the potential of being used as a full document language model
+We experiment with such models and analyze their results on TREC-1 to 8 datasets
\ No newline at end of file
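
For reference, the objective the quoted notes walk through (a standard rendering of the cited papers' formulation, not anything contained in this commit): skip-gram sets the parameters θ to maximize the corpus probability

  \arg\max_{\theta} \prod_{w \in \text{Text}} \prod_{c \in C(w)} p(c \mid w; \theta)

and the negative sampling alternative to the hierarchical softmax scores each (word, context) pair against k sampled negative contexts:

  \log \sigma(v_c^{\top} v_w) + \sum_{i=1}^{k} \mathbb{E}_{c_i \sim P_n(w)} \left[ \log \sigma(-v_{c_i}^{\top} v_w) \right]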




