Date: Wed, 11 Mar 2015 14:27:38 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: issues@flink.incubator.apache.org
Reply-To: dev@flink.apache.org
Mailing-List: contact issues-help@flink.apache.org; run by ezmlm
Subject: [jira] [Commented] (FLINK-1512) Add CsvReader for reading into POJOs.

    [ https://issues.apache.org/jira/browse/FLINK-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356930#comment-14356930 ]

ASF GitHub Bot commented on FLINK-1512:
---------------------------------------

Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/426#discussion_r26215940

    --- Diff: flink-java/src/test/java/org/apache/flink/api/java/io/CsvInputFormatTest.java ---
    @@ -684,4 +693,178 @@ private void testRemovingTrailingCR(String lineBreakerInFile, String lineBreaker
             }
         }

    +    @Test
    +    public void testPojoType() throws Exception {
    +        File tempFile = File.createTempFile("CsvReaderPojoType", "tmp");
    +        tempFile.deleteOnExit();
    +        tempFile.setWritable(true);
    +
    +        OutputStreamWriter wrt = new OutputStreamWriter(new FileOutputStream(tempFile));
    +        wrt.write("123,AAA,3.123,BBB\n");
    +        wrt.write("456,BBB,1.123,AAA\n");
    +        wrt.close();
    +
    +        @SuppressWarnings("unchecked")
    +        TypeInformation typeInfo = (TypeInformation) TypeExtractor.createTypeInfo(PojoItem.class);
    +        CsvInputFormat inputFormat = new CsvInputFormat(new Path(tempFile.toURI().toString()), typeInfo);
    +
    +        Configuration parameters = new Configuration();
    +        inputFormat.configure(parameters);
    +
    +        inputFormat.setDelimiter('\n');
    +        inputFormat.setFieldDelimiter(',');
    +
    +        FileInputSplit[] splits = inputFormat.createInputSplits(1);
    +
    +        inputFormat.open(splits[0]);
    +
    +        PojoItem item = new PojoItem();
    +        inputFormat.nextRecord(item);
    +
    +        assertEquals(123, item.field1);
    +        assertEquals("AAA", item.field2);
    +        assertEquals(Double.valueOf(3.123), item.field3);
    +        assertEquals("BBB", item.field4);
    +
    +        inputFormat.nextRecord(item);
    +
    +        assertEquals(456, item.field1);
    +        assertEquals("BBB", item.field2);
    +        assertEquals(Double.valueOf(1.123), item.field3);
    +        assertEquals("AAA", item.field4);
    +    }
    +
    +    @Test
    +    public void testPojoTypeWithPrivateField() throws Exception {
    +        File tempFile = File.createTempFile("CsvReaderPojoType", "tmp");
    +        tempFile.deleteOnExit();
    +        tempFile.setWritable(true);
    +
    +        OutputStreamWriter wrt = new OutputStreamWriter(new FileOutputStream(tempFile));
    +        wrt.write("123,AAA,3.123,BBB\n");
    +        wrt.write("456,BBB,1.123,AAA\n");
    +        wrt.close();
    +
    +        @SuppressWarnings("unchecked")
    +        TypeInformation typeInfo = (TypeInformation) TypeExtractor.createTypeInfo(PrivatePojoItem.class);
    +        CsvInputFormat inputFormat = new CsvInputFormat(new Path(tempFile.toURI().toString()), typeInfo);
    +
    +        Configuration parameters = new Configuration();
    +        inputFormat.configure(parameters);
    +
    +        inputFormat.setDelimiter('\n');
    +        inputFormat.setFieldDelimiter(',');
    +
    +        FileInputSplit[] splits = inputFormat.createInputSplits(1);
    +
    +        inputFormat.open(splits[0]);
    +
    +        PrivatePojoItem item = new PrivatePojoItem();
    +        inputFormat.nextRecord(item);
    +
    +        assertEquals(123, item.field1);
    +        assertEquals("AAA", item.field2);
    +        assertEquals(Double.valueOf(3.123), item.field3);
    +        assertEquals("BBB", item.field4);
    +
    +        inputFormat.nextRecord(item);
    +
    +        assertEquals(456, item.field1);
    +        assertEquals("BBB", item.field2);
    +        assertEquals(Double.valueOf(1.123), item.field3);
    +        assertEquals("AAA", item.field4);
    +    }
    +
    +    @Test
    +    public void testPojoTypeWithMappingInformation() throws Exception {
    +        File tempFile = File.createTempFile("CsvReaderPojoType", "tmp");
    +        tempFile.deleteOnExit();
    +        tempFile.setWritable(true);
    +
    +        OutputStreamWriter wrt = new OutputStreamWriter(new FileOutputStream(tempFile));
    +        wrt.write("123,3.123,AAA,BBB\n");
    +        wrt.write("456,1.123,BBB,AAA\n");
    +        wrt.close();
    +
    +        @SuppressWarnings("unchecked")
    +        TypeInformation typeInfo = (TypeInformation) TypeExtractor.createTypeInfo(PojoItem.class);
    +        CsvInputFormat inputFormat = new CsvInputFormat(new Path(tempFile.toURI().toString()), typeInfo);
    +        inputFormat.setFieldsMap(new String[]{"field1", "field3", "field2", "field4"});
    +
    +        Configuration parameters = new Configuration();
    +        inputFormat.configure(parameters);
    +
    +        inputFormat.setDelimiter('\n');
    +        inputFormat.setFieldDelimiter(',');
    +
    +        FileInputSplit[] splits = inputFormat.createInputSplits(1);
    +
    +        inputFormat.open(splits[0]);
    +
    +        PojoItem item = new PojoItem();
    +        inputFormat.nextRecord(item);
    +
    +        assertEquals(123, item.field1);
    +        assertEquals("AAA", item.field2);
    +        assertEquals(Double.valueOf(3.123), item.field3);
    +        assertEquals("BBB", item.field4);
    +
    +        inputFormat.nextRecord(item);
    +
    +        assertEquals(456, item.field1);
    +        assertEquals("BBB", item.field2);
    +        assertEquals(Double.valueOf(1.123), item.field3);
    +        assertEquals("AAA", item.field4);
    +    }
    +
    --- End diff --

    Can you add a few tests where the CSV file has more fields than the POJO and
    - fill all fields of the POJO with some fields of the file (incl. first and last)
    - fill a subset of the POJO fields with some fields of the file

    Add tests for correct error messages:
    - number of POJO fields != number of selected CSV fields
    - selected POJO field does not exist
    - POJO field type is not a Java Primitive (+String) which the format cannot parse (check available FieldParsers)


> Add CsvReader for reading into POJOs.
> -------------------------------------
>
>                 Key: FLINK-1512
>                 URL: https://issues.apache.org/jira/browse/FLINK-1512
>             Project: Flink
>          Issue Type: New Feature
>          Components: Java API, Scala API
>            Reporter: Robert Metzger
>            Assignee: Chiwan Park
>            Priority: Minor
>              Labels: starter
>
> Currently, the {{CsvReader}} supports only TupleXX types.
> It would be nice if users were also able to read into POJOs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
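
The three error cases requested in the review comment above can be illustrated with a small, Flink-independent sketch. Everything in it is hypothetical: the class PojoFieldMappingCheck, its validate method, the stand-in PojoItem and BadItem classes, and the set of "parseable" types are assumptions made for illustration. They are not taken from the pull request or from Flink's CsvInputFormat, and the real set of supported types depends on the available FieldParsers.

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper; not part of Flink or of pull request #426.
class PojoFieldMappingCheck {

    // Stand-in for the PojoItem used in the diff (its definition is not shown there).
    public static class PojoItem {
        public int field1;
        public String field2;
        public Double field3;
        public String field4;
    }

    // Stand-in POJO with a field type that the assumed parser set cannot handle.
    public static class BadItem {
        public Object payload;
    }

    // Assumed set of parseable types: Java primitives, their wrappers, and String.
    private static final Set<Class<?>> PARSEABLE = new HashSet<Class<?>>(Arrays.<Class<?>>asList(
            byte.class, Byte.class, short.class, Short.class,
            int.class, Integer.class, long.class, Long.class,
            float.class, Float.class, double.class, Double.class,
            boolean.class, Boolean.class, String.class));

    // Checks a POJO field mapping against the number of selected CSV fields.
    static void validate(Class<?> pojoType, String[] pojoFields, int selectedCsvFields) {
        if (pojoFields.length != selectedCsvFields) {
            throw new IllegalArgumentException("Number of POJO fields (" + pojoFields.length
                    + ") != number of selected CSV fields (" + selectedCsvFields + ")");
        }
        for (String name : pojoFields) {
            Field field;
            try {
                // getDeclaredField also finds private fields, but not inherited ones.
                field = pojoType.getDeclaredField(name);
            } catch (NoSuchFieldException e) {
                throw new IllegalArgumentException("POJO field '" + name
                        + "' does not exist in " + pojoType.getSimpleName());
            }
            if (!PARSEABLE.contains(field.getType())) {
                throw new IllegalArgumentException("POJO field '" + name + "' has type "
                        + field.getType().getSimpleName() + ", which cannot be parsed");
            }
        }
    }

    public static void main(String[] args) {
        // A subset of the POJO fields mapped to two selected CSV fields: accepted.
        validate(PojoItem.class, new String[]{"field1", "field3"}, 2);

        // Each of the following mappings is rejected with a descriptive message.
        String[][] badMappings = {{"field1"}, {"field1", "missing"}};
        for (String[] mapping : badMappings) {
            try {
                validate(PojoItem.class, mapping, 2);
            } catch (IllegalArgumentException e) {
                System.out.println(e.getMessage());
            }
        }
        try {
            validate(BadItem.class, new String[]{"payload"}, 1);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A test for each error case would then assert on the exception message rather than only on the exception type, which is what "tests for correct error messages" asks for.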