flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-933) Add an input format to read primitive types directly (not through tuples)
Date Fri, 27 Jun 2014 13:51:26 GMT

    [ https://issues.apache.org/jira/browse/FLINK-933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045948#comment-14045948 ]

ASF GitHub Bot commented on FLINK-933:
--------------------------------------

Github user qmlmoon commented on a diff in the pull request:

    https://github.com/apache/incubator-flink/pull/47#discussion_r14293149
  
    --- Diff: stratosphere-java/src/test/java/eu/stratosphere/api/java/io/PrimitiveInputFormatTest.java ---
    @@ -0,0 +1,170 @@
    +/***********************************************************************************************************************
    + * Copyright (C) 2010-2013 by the Stratosphere project (http://stratosphere.eu)
    + *
    + * Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
    + * the License. You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
    + * an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
    + * specific language governing permissions and limitations under the License.
    + **********************************************************************************************************************/
    +
    +package eu.stratosphere.api.java.io;
    +
    +import static org.junit.Assert.assertEquals;
    +import static org.junit.Assert.assertNull;
    +import static org.junit.Assert.assertTrue;
    +import static org.junit.Assert.fail;
    +
    +import java.io.File;
    +import java.io.FileWriter;
    +import java.io.IOException;
    +
    +import org.apache.log4j.Level;
    +import org.junit.BeforeClass;
    +import org.junit.Test;
    +
    +import eu.stratosphere.configuration.Configuration;
    +import eu.stratosphere.core.fs.FileInputSplit;
    +import eu.stratosphere.core.fs.Path;
    +import eu.stratosphere.util.LogUtils;
    +
    +public class PrimitiveInputFormatTest {
    +
    +	private static final Path PATH = new Path("an/ignored/file/");
    +
    +	@BeforeClass
    +	public static void initialize() {
    +		LogUtils.initializeDefaultConsoleLogger(Level.WARN);
    +	}
    +
    +	@Test
    +	public void testStringInput() {
    +		try {
    +			final String fileContent = "abc|def||";
    +			final FileInputSplit split = createTempFile(fileContent);
    +
    +			final PrimitiveInputFormat<String> format = new PrimitiveInputFormat<String>(PATH, '|', String.class);
    +
    +			final Configuration parameters = new Configuration();
    +			format.configure(parameters);
    +			format.open(split);
    +
    +			String result = null;
    +
    +			result = format.nextRecord(result);
    +			assertEquals("abc", result);
    +
    +			result = format.nextRecord(result);
    +			assertEquals("def", result);
    +
    +			result = format.nextRecord(result);
    +			assertEquals("", result);
    +
    +			result = format.nextRecord(result);
    +			assertNull(result);
    +			assertTrue(format.reachedEnd());
    +		}
    +		catch (Exception ex) {
    +			ex.printStackTrace();
    +			fail("Test failed due to a " + ex.getClass().getName() + ": " + ex.getMessage());
    +		}
    +	}
    +
    +
    +
    +	@Test
    +	public void testIntegerInput() throws IOException {
    +		try {
    +			final String fileContent = "111|222|";
    +			final FileInputSplit split = createTempFile(fileContent);
    +
    +			final PrimitiveInputFormat<Integer> format = new PrimitiveInputFormat<Integer>(PATH,'|', Integer.class);
    +
    +			format.configure(new Configuration());
    +			format.open(split);
    +
    +			Integer result = null;
    +			result = format.nextRecord(result);
    +			assertEquals(Integer.valueOf(111), result);
    +
    +			result = format.nextRecord(result);
    +			assertEquals(Integer.valueOf(222), result);
    +
    +			result = format.nextRecord(result);
    +			assertNull(result);
    +			assertTrue(format.reachedEnd());
    +		}
    +		catch (Exception ex) {
    +			fail("Test failed due to a " + ex.getClass().getName() + ": " + ex.getMessage());
    +		}
    +	}
    +
    +	@Test
    +	public void testDoubleInputLinewise() throws IOException {
    +		try {
    +			final String fileContent = "1.21\n2.23\n";
    +			final FileInputSplit split = createTempFile(fileContent);
    +
    +			final PrimitiveInputFormat<Double> format = new PrimitiveInputFormat<Double>(PATH, Double.class);
    +
    +			format.configure(new Configuration());
    +			format.open(split);
    +
    +			Double result = null;
    +			result = format.nextRecord(result);
    +			assertEquals(Double.valueOf(1.21), result);
    +
    +			result = format.nextRecord(result);
    +			assertEquals(Double.valueOf(2.23), result);
    +
    +			result = format.nextRecord(result);
    +			assertNull(result);
    +			assertTrue(format.reachedEnd());
    +		}
    +		catch (Exception ex) {
    +			fail("Test failed due to a " + ex.getClass().getName() + ": " + ex.getMessage());
    +		}
    +	}
    +
    +	private FileInputSplit createTempFile(String content) throws IOException {
    --- End diff --
    
    Yes, you are right. I also updated this in CsvInputFormatTest in the latest commit.


> Add an input format to read primitive types directly (not through tuples)
> -------------------------------------------------------------------------
>
>                 Key: FLINK-933
>                 URL: https://issues.apache.org/jira/browse/FLINK-933
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Stephan Ewen
>            Assignee: Mingliang Qi
>            Priority: Minor
>              Labels: easyfix, features, starter
>
> Right now, reading primitive types goes either through custom formats (work intensive), or through CSV inputs. The latter return tuples.
> To read a sequence of primitives, you need to go through Tuple1, which is clumsy.
> I suggest adding an input format that reads primitive types line-wise (or otherwise delimited), and also adding a method to the environment for that.
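
As a rough illustration of the record semantics the quoted test appears to expect (each delimiter closes one record, and an empty field between two delimiters is still a record), here is a minimal standalone sketch. It does not use PrimitiveInputFormat or any Stratosphere/Flink class; the names PrimitiveRecords and parse are hypothetical and exist only for this example, and the handling of a trailing unterminated field is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only; not part of the pull request.
public class PrimitiveRecords {

    // Splits content at each delimiter and converts every field with the given parser.
    // Every delimiter terminates one record, so "abc|def||" yields "abc", "def", "".
    static <T> List<T> parse(String content, char delimiter, Function<String, T> parser) {
        List<T> records = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < content.length(); i++) {
            if (content.charAt(i) == delimiter) {
                records.add(parser.apply(content.substring(start, i)));
                start = i + 1;
            }
        }
        // Assumption: a non-empty remainder without a closing delimiter is also a record.
        if (start < content.length()) {
            records.add(parser.apply(content.substring(start)));
        }
        return records;
    }

    public static void main(String[] args) {
        // Same inputs as the three tests in the diff.
        System.out.println(parse("abc|def||", '|', s -> s));              // [abc, def, ]
        System.out.println(parse("111|222|", '|', Integer::valueOf));     // [111, 222]
        System.out.println(parse("1.21\n2.23\n", '\n', Double::valueOf)); // [1.21, 2.23]
    }
}
```

Passing the parser as a function keeps the splitting logic independent of the target type, which is the same separation the issue asks for: one delimited reader, with the primitive type chosen by the caller instead of wrapped in a Tuple1.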



--
This message was sent by Atlassian JIRA
(v6.2#6252)
