From: Gustavo Beneitez
Date: Wed, 25 Jul 2018 23:57:26 +0200
Subject: Create a new ACTIVITY_FETCH from a transformation
To: dev@manifoldcf.apache.org

Hi all,

I need to extract and analyse crawled URLs because they may contain certain parameters
such as "?redirectURL=" that could point to new documents to be fetched and indexed.

First I tried to create a transformation connector,

    public class RedirectExtractor extends org.apache.manifoldcf.agents.transformation.BaseTransformationConnector

and add a "RedirectExtractor" transformation step to the fetch process in ManifoldCF, but it only allows me to modify the current document, not to create a new fetch from the extracted parameter.

While investigating the ManifoldCF source code I found something that may come in handy:

    activities.recordActivity(null, ACTIVITY_FETCH, null, urlValue, Integer.toString(-2), "Robots exclusion", null);

from the IProcessActivity interface, which is used by the connectors. I didn't want to create a new connector since it is a bit complex, but do you see an alternative, or is this the only way?

Thanks in advance.
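
For reference, the extraction step itself could look roughly like the sketch below. It only covers pulling the "redirectURL" value out of a crawled URL with the standard Java library; it does not call any ManifoldCF API, and how to hand the resulting URL back to the crawler is exactly the open question above. The class name RedirectParamExtractor and the method extractRedirectTarget are made up for this illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;
import java.util.Optional;

// Hypothetical helper, not part of ManifoldCF: finds the value of the
// "redirectURL" query parameter in a crawled URL, if there is one.
final class RedirectParamExtractor {

    // Parameter name taken from the example in this message.
    private static final String PARAM = "redirectURL";

    private RedirectParamExtractor() {
    }

    // Returns the decoded redirect target, or Optional.empty() when the URL
    // is malformed, has no query string, or has no non-empty redirectURL value.
    static Optional<String> extractRedirectTarget(String url) {
        try {
            String rawQuery = URI.create(url).getRawQuery();
            if (rawQuery == null) {
                return Optional.empty();
            }
            for (String pair : rawQuery.split("&")) {
                // Limit 2 keeps any '=' characters inside the value intact.
                String[] kv = pair.split("=", 2);
                if (kv.length == 2 && PARAM.equals(kv[0]) && !kv[1].isEmpty()) {
                    return Optional.of(URLDecoder.decode(kv[1], "UTF-8"));
                }
            }
            return Optional.empty();
        } catch (IllegalArgumentException | UnsupportedEncodingException e) {
            // Malformed URL or malformed percent-encoding: treat as "no redirect".
            return Optional.empty();
        }
    }
}
```

A call such as RedirectParamExtractor.extractRedirectTarget(documentURI) inside the transformation step would yield the candidate URL, which would still need to be queued for fetching by whatever mechanism turns out to be appropriate.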