lucene-java-user mailing list archives

From "Breslin, Dan" <>
Subject Job Opportunity at Amazon (Seattle)
Date Wed, 18 Apr 2007 17:12:44 GMT

Amazon's Darwin team is looking for exceptional software engineers
to develop algorithms and build systems to automatically detect
duplicate products for sale in the catalog. 

Merchants on Amazon provide information about the products they want
to sell. Amazon attempts to match these product data submissions to
items in its catalog so that it can display offers for the same product
on a single page.  Poorly structured or incomplete data makes this
problem very challenging and often results in duplicate products getting
created in the catalog.  These duplicate products are shown in search
results and end up confusing customers, leading to a bad customer
experience. The Darwin team detects these duplicate products in the catalog using an innovative mix of Information Retrieval,
Data Mining and Text Analysis algorithms and human intelligence
harnessed via the Amazon Mechanical Turk. We then automatically merge
products detected as duplicates together, improving customer experience
and the quality of the catalog.

We are a highly motivated, co-operative and fun-loving team that thrives
on solving challenging problems with innovation. As part of this team
you will analyze data, develop new algorithms, and build large-scale
distributed software systems in Java using open source technologies
such as Apache Lucene and JBoss, as well as proprietary technologies.


The ideal candidate will have the following qualifications: 


1. Advanced degree in Computer Science, Math or related field with 2+
   years of experience.

2. Past experience in at least one of the following areas - Search,
   Data Mining, Text Analysis or Machine Learning.

3. Desire to analyze data while developing solutions to

4. Strong desire to build high-performance, highly-available and
   scalable distributed systems.

5. Strong design and coding skills in Java/C++ on Unix

6. Familiarity with Perl and a good understanding of

7. Be highly innovative, flexible and self-directed.

8. Excellent written and verbal communication skills.

Full-time opportunity at Amazon, located in Seattle, WA.
