From: "Dennis E. Hamilton"
Reply-To: dev@openoffice.apache.org
Date: Sun, 17 Apr 2016 13:34:51 -0700
Subject: RE: svn commit: r1739628 - in /openoffice/trunk/main: connectivity/source/drivers/flat/ETable.cxx tools/source/stream/stream.cxx
In-Reply-To: <20160417164444.2A0863A0099@svn01-us-west.apache.org>

Does the rule about using "" to make a single quote inside a quoted field also apply?

 - Dennis

> -----Original Message-----
> From: damjan@apache.org [mailto:damjan@apache.org]
> Sent: Sunday, April 17, 2016 09:45
> To: commits@openoffice.apache.org
> Subject: svn commit: r1739628 - in /openoffice/trunk/main:
> connectivity/source/drivers/flat/ETable.cxx
> tools/source/stream/stream.cxx
>
> Author: damjan
> Date: Sun Apr 17 16:44:43 2016
> New Revision: 1739628
>
> URL: http://svn.apache.org/viewvc?rev=1739628&view=rev
> Log:
> Make CSV line parsers consistent with CSV field parsers.
>
> Our CSV field parsing algorithm treats fields starting with a quote
> (immediately at the beginning of the row, or after the field delimiter)
> as quoted. A quoted field ends at the corresponding closing quote, and
> any remaining text between the closing quote and the next field
> delimiter or end of line is appended to the text already extracted from
> the field, but not processed further. Any quotes in this extra text are
> taken verbatim - they do not quote anything.
>
> Our CSV line parsers were big hacks - they essentially read and
> concatenate lines until an even number of quote characters is found,
> and then feed this through the CSV field parsers.
>
> This patch rewrites the line parsers to work exactly how the field
> parsers work. Text such as:
>     "another" ",something else
> is now correctly parsed by both Calc and Base as:
>     [another "],[something else]
> instead of breaking all further parsing.
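
One way to check the question above against the rule described in this log is a small standalone sketch of the scanning loop. This is only an illustration under stated assumptions: `endsInsideQuotedField` and `checkExamples` are hypothetical names that do not exist in the OpenOffice tree, the sketch uses `std::string` and single `char` delimiters instead of `String`/`sal_Unicode`, and it leaves out the backslash-escape option that `ReadCsvLine` carries.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not from the OpenOffice sources): returns true when
// `text` ends inside a still-open quoted field, i.e. when a line reader
// following the rule in the log would append the next physical line.
bool endsInsideQuotedField(const std::string& text,
                           char quote = '"', char sep = ',')
{
    bool isQuoted = false;        // currently inside a quoted field
    bool isFieldStarting = true;  // the next character begins a new field
    bool wasQuote = false;        // previous char was an unpaired quote in a quoted field

    for (char c : text)
    {
        if (isQuoted)
        {
            if (c == quote)
                wasQuote = !wasQuote;   // a doubled quote toggles twice
            else if (wasQuote)
            {
                // A lone quote followed by any other character closed the field.
                wasQuote = false;
                isQuoted = false;
                if (c == sep)
                    isFieldStarting = true;
            }
        }
        else if (isFieldStarting)
        {
            isFieldStarting = false;
            if (c == quote)
                isQuoted = true;        // quotes only open a field at its start
            else if (c == sep)
                isFieldStarting = true; // empty field
        }
        else if (c == sep)
            isFieldStarting = true;
    }

    // A lone quote as the very last character also closes the field.
    if (wasQuote)
        isQuoted = false;
    return isQuoted;
}

// Assumed sample inputs, including the one from the commit log.
void checkExamples()
{
    // Three quote characters, yet the row is complete.
    assert(!endsInsideQuotedField(R"("another" ",something else)"));
    // Doubled quotes inside a quoted field do not end the field early.
    assert(!endsInsideQuotedField(R"("a ""b"" c",d)"));
    // Field still open after a doubled quote: a continuation line is needed.
    assert(endsInsideQuotedField(R"("a ""b)"));
}
```

In this sketch a doubled quote inside a quoted field flips `wasQuote` on and off again, so the scanner stays inside the field. That describes only the sketch; whether the field parsers then turn the pair into a single quote character is not something it shows.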
>
> Patch by: me
>
>
> Modified:
>     openoffice/trunk/main/connectivity/source/drivers/flat/ETable.cxx
>     openoffice/trunk/main/tools/source/stream/stream.cxx
>
> Modified: openoffice/trunk/main/connectivity/source/drivers/flat/ETable.cxx
> URL: http://svn.apache.org/viewvc/openoffice/trunk/main/connectivity/source/drivers/flat/ETable.cxx?rev=1739628&r1=1739627&r2=1739628&view=diff
> ==============================================================================
> --- openoffice/trunk/main/connectivity/source/drivers/flat/ETable.cxx (original)
> +++ openoffice/trunk/main/connectivity/source/drivers/flat/ETable.cxx Sun Apr 17 16:44:43 2016
> @@ -907,14 +907,64 @@ sal_Bool OFlatTable::readLine(QuotedToke
>          return sal_False;
>
>      QuotedTokenizedString sLine = line; // check if the string continues on next line
> -    while( (sLine.GetString().GetTokenCount(m_cStringDelimiter) % 2) != 1 )
> +    xub_StrLen nLastOffset = 0;
> +    bool isQuoted = false;
> +    bool isFieldStarting = true;
> +    while (true)
>      {
> -        m_pFileStream->ReadByteStringLine(sLine,nEncoding);
> -        if ( !m_pFileStream->IsEof() )
> +        bool wasQuote = false;
> +        const sal_Unicode *p;
> +        p = sLine.GetString().GetBuffer();
> +        p += nLastOffset;
> +
> +        while (*p)
> +        {
> +            if (isQuoted)
> +            {
> +                if (*p == m_cStringDelimiter)
> +                    wasQuote = !wasQuote;
> +                else
> +                {
> +                    if (wasQuote)
> +                    {
> +                        wasQuote = false;
> +                        isQuoted = false;
> +                        if (*p == m_cFieldDelimiter)
> +                            isFieldStarting = true;
> +                    }
> +                }
> +            }
> +            else
> +            {
> +                if (isFieldStarting)
> +                {
> +                    isFieldStarting = false;
> +                    if (*p == m_cStringDelimiter)
> +                        isQuoted = true;
> +                    else if (*p == m_cFieldDelimiter)
> +                        isFieldStarting = true;
> +                }
> +                else if (*p == m_cFieldDelimiter)
> +                    isFieldStarting = true;
> +            }
> +            ++p;
> +        }
> +
> +        if (wasQuote)
> +            isQuoted = false;
> +
> +        if (isQuoted)
>          {
> -            line.GetString().Append('\n');
> -            line.GetString() += sLine.GetString();
> -            sLine = line;
> +            nLastOffset = sLine.Len();
> +            m_pFileStream->ReadByteStringLine(sLine,nEncoding);
> +            if ( !m_pFileStream->IsEof() )
> +            {
> +                line.GetString().Append('\n');
> +                line.GetString() += sLine.GetString();
> +                sLine = line;
> +            }
> +            else
> +                break;
>          }
>          else
>              break;
>
> Modified: openoffice/trunk/main/tools/source/stream/stream.cxx
> URL: http://svn.apache.org/viewvc/openoffice/trunk/main/tools/source/stream/stream.cxx?rev=1739628&r1=1739627&r2=1739628&view=diff
> ==============================================================================
> --- openoffice/trunk/main/tools/source/stream/stream.cxx (original)
> +++ openoffice/trunk/main/tools/source/stream/stream.cxx Sun Apr 17 16:44:43 2016
> @@ -1128,38 +1128,59 @@ sal_Bool SvStream::ReadCsvLine( String&
>      {
>          const sal_Unicode* pSeps = rFieldSeparators.GetBuffer();
>          xub_StrLen nLastOffset = 0;
> -        xub_StrLen nQuotes = 0;
> +        bool isQuoted = false;
> +        bool isFieldStarting = true;
>          while (!IsEof() && rStr.Len() < STRING_MAXLEN)
>          {
> +            bool wasQuote = false;
>              bool bBackslashEscaped = false;
> -            const sal_Unicode *p, *pStart;
> -            p = pStart = rStr.GetBuffer();
> +            const sal_Unicode *p;
> +            p = rStr.GetBuffer();
>              p += nLastOffset;
>              while (*p)
>              {
> -                if (nQuotes)
> +                if (isQuoted)
>                  {
>                      if (*p == cFieldQuote && !bBackslashEscaped)
> -                        ++nQuotes;
> -                    else if (bAllowBackslashEscape)
> +                        wasQuote = !wasQuote;
> +                    else
>                      {
> -                        if (*p == '\\')
> -                            bBackslashEscaped = !bBackslashEscaped;
> -                        else
> -                            bBackslashEscaped = false;
> +                        if (bAllowBackslashEscape)
> +                        {
> +                            if (*p == '\\')
> +                                bBackslashEscaped = !bBackslashEscaped;
> +                            else
> +                                bBackslashEscaped = false;
> +                        }
> +                        if (wasQuote)
> +                        {
> +                            wasQuote = false;
> +                            isQuoted = false;
> +                            if (lcl_UnicodeStrChr( pSeps, *p ))
> +                                isFieldStarting = true;
> +                        }
>                      }
>                  }
> -                else if (*p == cFieldQuote && (p == pStart ||
> -                            lcl_UnicodeStrChr( pSeps, p[-1])))
> -                    nQuotes = 1;
> -                // A quote character inside a field content does not start
> -                // a quote.
> +                else
> +                {
> +                    if (isFieldStarting)
> +                    {
> +                        isFieldStarting = false;
> +                        if (*p == cFieldQuote)
> +                            isQuoted = true;
> +                        else if (lcl_UnicodeStrChr( pSeps, *p ))
> +                            isFieldStarting = true;
> +                    }
> +                    else if (lcl_UnicodeStrChr( pSeps, *p ))
> +                        isFieldStarting = true;
> +                }
>                  ++p;
>              }
>
> -            if (nQuotes % 2 == 0)
> -                break;
> -            else
> +            if (wasQuote)
> +                isQuoted = false;
> +
> +            if (isQuoted)
>              {
>                  nLastOffset = rStr.Len();
>                  String aNext;
> @@ -1167,6 +1188,8 @@ sal_Bool SvStream::ReadCsvLine( String&
>                  rStr += sal_Unicode(_LF);
>                  rStr += aNext;
>              }
> +            else
> +                break;
>          }
>      }
>      return nError == SVSTREAM_OK;

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@openoffice.apache.org
For additional commands, e-mail: dev-help@openoffice.apache.org