camel-users mailing list archives

From Claus Straube <claus.stra...@catify.com>
Subject Re: Camel performance tuning
Date Mon, 12 Nov 2012 13:16:16 GMT
Have you tried a higher completion size? For us 750 was the best.
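
As a rough sketch of what I mean, assuming the rest of your route stays as quoted below: the only change taken from our setup is the completion size. The completionTimeout attribute and its value of 2000 ms are my own assumption, added because, as far as I know, a final group smaller than the completion size is otherwise not flushed once the splitter runs out of lines.

```xml
<!-- Sketch only: completionSize raised to 750; completionTimeout (2000 ms) is an
     assumed value so that the last, partially filled group still gets written. -->
<camel:aggregate strategyRef="aggregatorStrategy"
                 completionSize="750" completionTimeout="2000">
  <camel:correlationExpression>
    <camel:simple>${file:name}</camel:simple>
  </camel:correlationExpression>
  <camel:to
    uri="file:///tmp/paged?charset=utf8&amp;fileName=${file:name.noext}.paged&amp;fileExist=Append" />
</camel:aggregate>
```

The 750 is simply what worked for our data, so it is worth measuring a few values against your own files.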

On 09.11.2012 19:59, Gonzalo Vasquez wrote:
> Ok, I've included an aggregator in the splitter, as follows:
>
> 		<camel:route id="pager" autoStartup="true">
> 			<camel:from
> 				uri="file:///tmp/in?charset=Windows-1252&amp;move=${file:parent}/../paged/${file:name.noext}.paged.ack&amp;preMove=${file:name.noext}-${date:now:yyyyMMddHHmmssSSS}.${file:ext}" />
> 			<camel:log message="Iniciando paging" />
> 			<camel:setHeader headerName="start">
> 				<camel:simple>${date:now:mm}:${date:now:ss}.${date:now:SSS}</camel:simple>
> 			</camel:setHeader>
> 			<camel:split streaming="true" parallelProcessing="false">
> 				<camel:tokenize token="\n" />
> 				<!-- <camel:log message="${property.CamelSplitIndex}" /> -->
> 				<camel:to uri="bean:pager" />
> 				<camel:aggregate strategyRef="aggregatorStrategy">
> 					<camel:correlationExpression>
> 						<camel:simple>${file:name}</camel:simple>
> 					</camel:correlationExpression>
> 					<camel:completionSize>
> 						<camel:constant>250</camel:constant>
> 					</camel:completionSize>
> 					<camel:to
> 						uri="file:///tmp/paged?charset=utf8&amp;fileName=${file:name.noext}.paged&amp;fileExist=Append" />
> 				</camel:aggregate>
> 			</camel:split>
> 			<camel:log
> 				message="Elapsed: ${header.start} - ${date:now:mm}:${date:now:ss}.${date:now:SSS}" />
> 		</camel:route>
>
>
> And the AggregationStrategy:
>
> 	<bean id="aggregatorStrategy" class="cl.altiuz.reports.etl.ConcatAggregationStrategy" />
>
>
> I've also added some headers & logging to calculate elapsed time.
>
> Before adding the aggregator, the elapsed time was about 30 seconds (for the 5MB test file);
> now it is about half that (15 seconds). I can clearly see the improvement, but it is not as
> large as I expected.
>
> Any extra tips? I've included the custom AggregationStrategy I had to create, as all
> I needed was appending/concatenating body contents.
>
>
>
> Gonzalo Vásquez Sáez
> Gerente Investigación y Desarrollo (R&D)
> Altiuz Soluciones Tecnológicas de Negocios Ltda.
> Av. Nueva Tajamar 555 Of. 802, Las Condes
> (56-2) 335 2461
> gvasquez@altiuz.cl
> http://www.altiuz.cl
>   
>
>
>
> On 09-11-2012, at 15:09, Christian Müller <christian.mueller@gmail.com> wrote:
>
>> Using Hypersonic, Hadoop or Mongo for such a use case is "over engineering"
>> the requirement and will end up in a much more complicated solution - IMO.
>>
>> Best,
>> Christian
>>
>> On Fri, Nov 9, 2012 at 6:57 PM, <Ramkumar.Iyer@cognizant.com> wrote:
>>
>>> You may also want to check out Hadoop and MapReduce
>>>
>>>
>>>
>>> http://camel.apache.org/hdfs.html
>>>
>>>
>>>
>>> with respect to point a and b.
>>>
>>>
>>>
>>> You can have an index on the record and the “reduce” job can serialize on
>>> the index.
>>>
>>>
>>>
>>> *From:* Gonzalo Vasquez [mailto:gvasquez@altiuz.cl]
>>> *Sent:* Friday, November 09, 2012 10:16 PM
>>> *To:* users@camel.apache.org
>>> *Subject:* Re: Camel performance tuning
>>>
>>>
>>>
>>> Thanks for your answer, my comments:
>>>
>>>
>>>
>>> a) a 5M file could be loaded into memory, but I have streaming enabled as
>>> file size could be in the range of GB. Notwithstanding, I'll check what
>>> Hypersonic & Mongo are, as I'm not aware of them.
>>>
>>> b) Parallel processing is set to false, because records must preserve
>>> order on the output file
>>>
>>> c) Don't see the point here
>>>
>>> d) See a)
>>>
>>> e) what about async processing? There's no "long running process" here
>>>
>>>
>>>
>>> Thanks again.-
>>>
>>>
>>>
>>> *Gonzalo Vásquez Sáez*
>>>
>>> *Gerente Investigación y Desarrollo (R&D)*
>>> *Altiuz* Soluciones Tecnológicas de Negocios Ltda.
>>> Av. Nueva Tajamar 555 Of. 802, Las Condes
>>> (56-2) 335 2461
>>> *gvasquez@altiuz.cl*
>>>
>>> *http://www.altiuz.cl*
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 09-11-2012, at 13:12, <Ramkumar.Iyer@cognizant.com> wrote:
>>>
>>>
>>>
>>>   I am really new to Camel but here are some options you can try
>>>
>>>
>>>
>>> a)      Can you load the 5 MB file to memory before splitting it ? That
>>> way IO will not be a problem. Probably put it in something like Hypersonic
>>> or Mongo
>>>
>>> b)      Why is parallel processing false? Are the records related to
>>> each other? If they are not, you can take advantage of multiple cores
>>>
>>> c)       Is it possible to first split the files into chunks and then
>>> process the chunks independently?
>>>
>>> d)      Can you write into memory and flush at once ?
>>>
>>> e)      Sync/Asynch : http://camel.apache.org/async.html
>>>
>>>
>>>
>>> *From:* Gonzalo Vasquez [mailto:gvasquez@altiuz.cl]
>>> *Sent:* Friday, November 09, 2012 8:32 PM
>>> *To:* users@camel.apache.org
>>> *Subject:* Camel performance tuning
>>>
>>>
>>>
>>> I'm running a route that basically adds a character per line to a plain
>>> text file, but it's taking too long, and it seems that it's due to some kind
>>> of buffering issue when reading/writing from disk.
>>>
>>>
>>>
>>> I'm processing a 5MB file (attached as DC_FACCL132_0000
>>> MORA_1075_16-10-2012_19-09-47_15.txt.zip), with the corresponding XSL
>>> template (also attached).
>>>
>>>
>>>
>>> It's taking forever to process such a file. I understand I'm tokenizing
>>> on line breaks, which could be the source of the problem as there are many
>>> lines in the file (48198 exactly), but when running jvisualvm (see attached
>>> images/snapshot) I can see the write operation is invoked 20386 times, which
>>> seems unrelated to the line count. Is there an output buffer size that I can
>>> configure? Or something like that?
>>>
>>>
>>>
>>> This is the route:
>>>
>>> <camel:route id="pager" autoStartup="true">
>>>   <camel:from
>>>     uri="file:///tmp/in?charset=Windows-1252&amp;move=${file:parent}/../paged/${file:name.noext}.paged.ack&amp;preMove=${file:name.noext}-${date:now:yyyyMMddHHmmssSSS}.${file:ext}" />
>>>   <camel:split streaming="true" parallelProcessing="false">
>>>     <camel:tokenize token="\n" />
>>>     <camel:to uri="bean:pager" />
>>>     <camel:to
>>>       uri="file:///tmp/paged?charset=utf8&amp;fileName=${file:name.noext}.paged&amp;fileExist=Append" />
>>>   </camel:split>
>>> </camel:route>
>>>
>>>
>>>
>>> This is the referenced bean:
>>>
>>>
>>>
>>> <bean id="pager" class="cl.altiuz.reports.etl.TextProcessor">
>>>   <property name="xsltPath"
>>>     value="/Users/gonzalovasquez/Documents/workspace/altiuz-reports/reports-etl/xsl/pager.xsl" />
>>>   <property name="param" value="C.*PAG.* 1" />
>>> </bean>
>>>
>>>
>>>
>>> The Camel version is 2.10.1, and this happens on both OS X and MS Windows, so I think
>>> it isn't a platform-dependent problem, but a configuration one.
>>>
>>>
>>>
>>> Any ideas? Anything else that I should send?
>>>
>>>
>>>
>>> Thanks!
>>>
>>>
>>>
>>> *Gonzalo Vásquez Sáez*
>>>
>>> *Gerente Investigación y Desarrollo (R&D)*
>>> *Altiuz* Soluciones Tecnológicas de Negocios Ltda.
>>> Av. Nueva Tajamar 555 Of. 802, Las Condes
>>> (56-2) 335 2461
>>> *gvasquez@altiuz.cl*
>>>
>>> *http://www.altiuz.cl*
>>>
>>>
>>>
>>>
>>>
>>>        This e-mail and any files transmitted with it are for the sole use
>>> of the intended recipient(s) and may contain confidential and privileged
>>> information. If you are not the intended recipient(s), please reply to the
>>> sender and destroy all copies of the original message. Any unauthorized
>>> review, use, disclosure, dissemination, forwarding, printing or copying of
>>> this email, and/or any action taken in reliance on the contents of this
>>> e-mail is strictly prohibited and may be unlawful.
>>>
>>>
>>>
>>
>>
>> --

