From: "Hesham Gneady"
Subject: RE: Wrong space parsed pdf
Date: Thu, 25 Jan 2018 23:03:31 +0200

Excellent!

Best regards,
Hesham

----------------------------------------------------------------------------
Included Message:

On 25.01.2018 at 21:33, Hesham Gneady wrote:
> I have reported this because the PDF appeared normal to me. If there
> is a way to read the text in the PDF in a right way I hope you could
> help me with that.

See this issue:
https://issues.apache.org/jira/browse/PDFBOX-3970

You need to replace LegacyPDFStreamEngine.java with the file from this
issue (start reading at "This seems to be a moving target.") and build.
Then the text of your file is extracted properly.

Tilman