From: Christopher Schultz
To: users@pdfbox.apache.org
Subject: Re: Choosing a font for non-ASCII characters
Date: Wed, 20 Mar 2019 11:00:04 -0400
Message-ID: <66749e3b-8739-7c7c-7776-7d1f6ceae362@christopherschultz.net>
In-Reply-To: <43df15c0-9670-c4a7-c32e-a88c093b8d7e@t-online.de>

Tilman,

On 3/20/19 03:55, Tilman Hausherr wrote:
> On 19.03.2019 at 22:08, Christopher Schultz wrote: Tilman,
>
> On 3/19/19 16:23, Tilman Hausherr wrote:
>>>> On 19.03.2019 at 19:45, Christopher Schultz wrote: Tilman,
>>>>
>>>> So I'm starting to look toward making my code better now that
>>>> it's actually working. Right now, my code looks like this:
>>>>
>>>> if(!isAnsiEncoding(strippedText)) {
>>>>     font = getFullUnicodeFont();
>>>> }
>>>>
>>>> Where one font simply replaces the other for strings that
>>>> aren't available in the built-in font(s).
>>>>
>>>> I'd like to support emoji and stuff like that. I can find a
>>>> font (or fonts) for that, but I think the only way I can do
>>>> that with the existing API is something like this:
>>>>
>>>> Font[] fonts = new Font[] { builtIn, arialUnicode, emoji };
>>>>
>>>> for(Font font : fonts) {
>>>>     try {
>>>>         page.setFont(font);
>>>>         page.showText(text);
>>>>         break; // Stop at the first font that can show the text
>>>>     } catch (IllegalArgumentException iae) {
>>>>         // Try the next font
>>>>     }
>>>> }
>>>>
>>>> That will "work" but it will not work if, for example, I need
>>>> to print text that includes both Chinese characters (from
>>>> arialUnicode font) and also emoji (from the hypothetical
>>>> "emoji" font).
>>>>
>>>> Is there any way to tell PDFBox to "pick the right font (from
>>>> some list) for each character"?
>>>>
>>>>
>>>>> No, that is why I created the EmbeddedMultipleFonts.java
>>>>> example which I mentioned earlier in the thread. That one
>>>>> can switch within strings.
> Right, it basically does the same thing as I have above, but for a
> bunch of increasingly-widening substrings, and it uses exceptions
> for flow control. Yuck.
>
> I'd have to look more into what PDFont.encode does, but I'm
> guessing that it wouldn't be too hard to build methods into the
> PDFont class that look something like this:
>
> /**
>  * Returns true if this PDFont can render the whole string.
>  */
> public boolean canEncode(String s);
>
> /**
>  * Returns the longest String that can be successfully encoded by this
>  * PDFont, beginning at the beginning of {s}. If the whole String {s}
>  * is encodable, then {s} will be returned. If only a part of {s}
>  * is encodable, then the return value of this method will be such that:
>  *
>  *   s.startsWith(getLongestEncodablePrefix(s)) == true
>  *
>  * If the first character of the string is not encodable in this PDFont,
>  * an empty string (or null?) will be returned.
>  */
> public String getLongestEncodablePrefix(String s);
>
>
>> That would just push what you called "Yuck" further downwards, or
>> we would have to maintain code twice, one for checking whether
>> something can be encoded, and one for actually doing it. And this
>> for all the 6, maybe 7 font types. Code reuse?
>> Instead of going forward with your project with the working code
>> provided, you're arguing about design issues.

You are operating under the impression that I haven't already modified
my own code to work. I have.

I'm volunteering to help improve your product. You don't have to get
so upset when someone offers help.

-chris
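
P.S. A minimal sketch of the "pick the right font for each character" idea from the thread, written without exceptions for flow control. It does not use PDFBox at all: `CandidateFont`, `TextRun`, `FontFallback` and `splitIntoRuns` are hypothetical names made up for illustration, and the `canEncode` predicate stands in for whatever per-code-point check a real font could offer. The sketch only shows how a string could be cut into runs, each assigned to the first font in a priority list that covers it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Hypothetical stand-in for a font: a name plus a per-code-point coverage test. */
final class CandidateFont {
    final String name;
    final IntPredicate canEncode; // assumption: a real font could answer this cheaply

    CandidateFont(String name, IntPredicate canEncode) {
        this.name = name;
        this.canEncode = canEncode;
    }
}

/** A stretch of text that can be shown with a single font. */
final class TextRun {
    final String fontName;
    final String text;

    TextRun(String fontName, String text) {
        this.fontName = fontName;
        this.text = text;
    }
}

final class FontFallback {
    /** Returns the name of the first font in the list that covers the code point, or null. */
    static String pickFont(int codePoint, List<CandidateFont> fonts) {
        for (CandidateFont font : fonts) {
            if (font.canEncode.test(codePoint)) {
                return font.name;
            }
        }
        return null;
    }

    /** Splits text into consecutive runs, starting a new run whenever the chosen font changes. */
    static List<TextRun> splitIntoRuns(String text, List<CandidateFont> fonts) {
        List<TextRun> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        String currentFont = null;

        // Walk by code point, not by char, so surrogate pairs (e.g. emoji) stay intact.
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            String chosen = pickFont(codePoint, fonts);
            if (chosen == null) {
                throw new IllegalArgumentException(
                        "No font covers U+" + Integer.toHexString(codePoint).toUpperCase());
            }
            if (current.length() > 0 && !chosen.equals(currentFont)) {
                runs.add(new TextRun(currentFont, current.toString()));
                current.setLength(0);
            }
            currentFont = chosen;
            current.appendCodePoint(codePoint);
            i += Character.charCount(codePoint);
        }
        if (current.length() > 0) {
            runs.add(new TextRun(currentFont, current.toString()));
        }
        return runs;
    }
}
```

With PDFBox, each resulting run would presumably map to one `setFont(font, size)` call followed by one `showText(run.text)` call on the content stream. The open question from the thread remains how a real `PDFont` would answer the coverage test without attempting the encoding.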