ignite-user mailing list archives

From diopek <deha.pe...@gmail.com>
Subject Re: Data Loading Performance Issue
Date Tue, 17 Nov 2015 16:36:22 GMT
Hi Denis, please see my answers below:

>>1) How big is the data set you're preloading? 
~10 GB
>>2) What is your grid configuration? How many nodes are there? Attach the
Ignite configuration you use to start the grid nodes. 
1 node/JVM. My Ignite configuration is as follows:
ignite-cache.xml
<http://apache-ignite-users.70518.x6.nabble.com/file/n1988/ignite-cache.xml>  
I also use this bean as a template for my cache configuration:
<bean id="rwaCacheTemplate" class="org.apache.ignite.configuration.CacheConfiguration">
	<property name="atomicityMode" value="ATOMIC" />
	<property name="cacheMode" value="LOCAL" />
	<property name="backups" value="0" />
	<property name="swapEnabled" value="false" />
	<property name="memoryMode" value="ONHEAP_TIERED" />
	<property name="offHeapMaxMemory" value="0" />
</bean>

>>3) You mentioned that you're getting the data from some DB. What kind of
DB are you working with? How do you connect to it from your local machine
and from remote machines? 
Our DB is Oracle 11g, and I use a Spring DataSource, as you can see in the
config file above. Pinging the DB server from my local machine takes ~19 ms;
from the other Linux server it takes ~0.32 ms.

>>4) Preloading on the servers might work slower because the data is
transferred across network while in case of your local laptop the localhost
may be leveraged. 
As I mentioned in the item above, it is ironically the other way around: my
PC-to-DB connection is slower than the server's, since the app server and DB
server are in the same data center.
  
>>5) Measure the time that takes to load the data from DB. Probably this is
the slowest point. 
I measured this: populating a regular Java HashMap over the same database
connection is about 6 times faster than initializing the Ignite cache.
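One way to narrow such a gap down is to time the DB fetch and the insertion phases separately rather than as one block. A minimal pure-JDK sketch of that idea (the `Supplier` and plain `Map` below are hypothetical stand-ins for the JDBC result set and the Ignite cache, not code from this thread):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class LoadTimer {
    // Times the row-source phase (stand-in for the DB fetch) and the
    // insertion phase (stand-in for cache puts) separately, in nanoseconds.
    public static long[] timeLoad(Supplier<String[]> rowSource, Map<Integer, String> target) {
        long t0 = System.nanoTime();
        String[] rows = rowSource.get();        // "DB fetch" phase
        long t1 = System.nanoTime();
        for (int i = 0; i < rows.length; i++)   // "cache put" phase
            target.put(i, rows[i]);
        long t2 = System.nanoTime();
        return new long[] { t1 - t0, t2 - t1 };
    }

    public static void main(String[] args) {
        Map<Integer, String> map = new HashMap<>();
        long[] t = timeLoad(() -> new String[] { "row1", "row2" }, map);
        System.out.printf("fetch: %d ns, insert: %d ns, entries: %d%n",
                t[0], t[1], map.size());
    }
}
```

With the two phases timed independently it becomes clear whether the 6x factor comes from the insertion path or from how the data is read.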

>>6) Attach a snippet of the source code you use for data loading. 
// This snippet is from the client-side function that invokes loadCache
public IgniteCache<Integer, ArrayList<RwaRsExistingDTO>> loadExistingCache(Date asOf, Integer scenId, String existingSql) {
	StringBuffer rep = new StringBuffer("'");
	rep.append(fastDateFormatddMMMyyyy.format(asOf)).append("'");
	existingSql = existingSql.replace("?", rep);
	String extSQL = existingSql.substring(0, existingSql.indexOf("ORDER"));
	StringBuffer cntExtSQL = new StringBuffer(
			"SELECT COUNT(*) FROM ( SELECT /*+ parallel (16) */ DISTINCT GOC, ACCT, SUM_AFFIL_CODE, CURRENCY, FRCST_PROD_ID, COUNT(*) TOT FROM (");
	cntExtSQL.append(extSQL).append(" ) GROUP BY GOC, ACCT, SUM_AFFIL_CODE, CURRENCY, FRCST_PROD_ID ");
	cntExtSQL.append(" ORDER BY GOC, ACCT, SUM_AFFIL_CODE, CURRENCY, FRCST_PROD_ID)");
	logger.debug("cntExtSQL::{}", cntExtSQL);

	final int extStartSize = jdbcTemplate.queryForObject(cntExtSQL.toString(), null, Integer.class);
	logger.info("--->>>extHashBucket::" + extStartSize);
	StopWatch sw1 = new StopWatch();
	stopWatchStart(sw1, "loadExistingCache::createCache");
	CacheConfiguration<Integer, ArrayList<RwaRsExistingDTO>> existingCacheCfg = new CacheConfiguration<>("MY_CACHE");
	existingCacheCfg.setStartSize(extStartSize);
	existingCacheCfg.setCacheMode(CacheMode.LOCAL);
	existingCacheCfg.setCacheStoreFactory(FactoryBuilder.factoryOf(ExistingCacheStore.class));
	existingCacheCfg.setReadThrough(false);
	existingCacheCfg.setWriteThrough(false);

	Ignite ignite = Ignition.ignite();
	IgniteCache<Integer, ArrayList<RwaRsExistingDTO>> cache = ignite.createCache(existingCacheCfg);
	stopWatchEnd(sw1);
	stopWatchStart(sw1, "loadExistingCache::loadCache");
	cache.loadCache(null, asOf, scenId, existingSql);
	stopWatchEnd(sw1);
	return cache;
}
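For bulk preloading like this, a commonly recommended alternative to `loadCache` is `IgniteDataStreamer`, which buffers and batches the puts. This is not code from the thread; it is a sketch that assumes a started Ignite node, the same "MY_CACHE" name and `RwaRsExistingDTO` type, and a hypothetical `groups` iterable carrying the already-consolidated lists:

```java
import java.util.ArrayList;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;

public class StreamerLoad {
    // Streams pre-built entry lists into MY_CACHE. The streamer batches
    // entries internally, which is typically much faster than performing
    // one cache operation per entry during initial loading.
    public static void load(Iterable<ArrayList<RwaRsExistingDTO>> groups) {
        Ignite ignite = Ignition.ignite();
        try (IgniteDataStreamer<Integer, ArrayList<RwaRsExistingDTO>> stmr =
                 ignite.dataStreamer("MY_CACHE")) {
            int key = 0;
            for (ArrayList<RwaRsExistingDTO> group : groups)
                stmr.addData(key++, group);
        } // close() flushes any remaining buffered entries
    }
}
```

The cache must already exist before `dataStreamer()` is called, so this would replace only the `cache.loadCache(...)` step above, not the `createCache(...)` step.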
// This snippet is from another file, MyCacheStore.java
@Override
public void loadCache(final IgniteBiInClosure<Integer, ArrayList<RwaRsExistingDTO>> clo, Object... args) {

	// All three arguments (asOf, scenId, existingSql) are required.
	if (args == null || args.length < 3)
		throw new CacheLoaderException("Expected asOf, scenId and existingSql parameters are not provided.");

	final Date asOf = (Date) args[0];
	final Integer scenId = (Integer) args[1];
	final String existingSql = (String) args[2];

	logger.debug("loadExistingCache::SQL::{}", existingSql);

	ResultSetExtractor<Void> extMapResultSetExtractor = new ResultSetExtractor<Void>() {
		@Override
		public Void extractData(ResultSet rs) throws SQLException, DataAccessException {
			String prevGoc = null, prevAcct = null, prevSac = null, prevCcy = null;
			Integer prevFpId = null;
			ArrayList<RwaRsExistingDTO> currDTOList = null, prevDTOList = null;
			RwaRsExistingDTO dto = null, prevDto = null;
			final AtomicInteger entryCnt = new AtomicInteger(1);
			while (rs.next()) {
				int i = 1;
				dto = new RwaRsExistingDTO();
				dto.setAsOf(asOf);
				dto.setScnId(scenId);

				dto.setGoc(rs.getString(i++));
				dto.setAcct(rs.getString(i++));
				dto.setSumAffilCode(rs.getString(i++));
				dto.setCcyCode(rs.getString(i++));
				dto.setFrcstProdId(rs.getInt(i++));

				dto.setFrsBu(rs.getString(i++));
				dto.setRwaExposureType(rs.getString(i++));
				dto.setRiskAssetClass(rs.getString(i++));
				dto.setRiskSubAssetClass(rs.getString(i++));
				dto.setTreasLiqClass(rs.getString(i++));
				dto.setCounterpartyRating(rs.getString(i++));
				dto.setClearedStatus(rs.getString(i++));
				dto.setMaturityBand(rs.getString(i++));
				dto.setDerivativeType(rs.getString(i++));
				dto.setStartDate(rs.getDate(i++));
				dto.setMaturityDate(rs.getDate(i++));
				dto.setAmount(rs.getDouble(i++));
				dto.setReplenishFlag("N");

				// Rows arrive ordered by the grouping key, so consecutive rows with
				// the same (GOC, ACCT, SUM_AFFIL_CODE, CURRENCY, FRCST_PROD_ID) are
				// accumulated into one list and emitted as a single cache entry.
				if (dto.getGoc().equals(prevGoc) && dto.getAcct().equals(prevAcct)
						&& dto.getSumAffilCode().equals(prevSac) && dto.getCcyCode().equals(prevCcy)
						&& dto.getFrcstProdId().equals(prevFpId)) {
					prevDTOList.add(prevDto);
				} else {
					if (prevDto != null) {
						prevDTOList.add(prevDto);
						clo.apply(entryCnt.incrementAndGet(), prevDTOList);
					}
					currDTOList = new ArrayList<RwaRsExistingDTO>();
				}
				prevDto = dto;
				prevDTOList = currDTOList;
				prevGoc = dto.getGoc();
				prevAcct = dto.getAcct();
				prevSac = dto.getSumAffilCode();
				prevCcy = dto.getCcyCode();
				prevFpId = dto.getFrcstProdId();
			}
			// Flush the last pending group.
			if (prevDto != null) {
				prevDTOList.add(prevDto);
				clo.apply(entryCnt.incrementAndGet(), prevDTOList);
			}
			return null;
		}
	};

	jdbcTemplate.setFetchSize(SQL_FETCH_SIZE);
	jdbcTemplate.query(existingSql, extMapResultSetExtractor);
}
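The extractor above is an instance of a general pattern: grouping consecutive, pre-sorted rows that share a key into one collection per run. A minimal pure-Java sketch of just that pattern (generic types and the string key extraction are illustrative, not the actual DTO or key):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

public class ConsecutiveGrouper {
    // Groups consecutive items that map to the same key into one sub-list,
    // mirroring the prev*/curr bookkeeping of the ResultSetExtractor above
    // but without carrying a one-item delay through prevDto.
    public static <T, K> List<List<T>> group(List<T> rows, Function<T, K> keyFn) {
        List<List<T>> groups = new ArrayList<>();
        K prevKey = null;
        List<T> current = null;
        for (T row : rows) {
            K key = keyFn.apply(row);
            if (current == null || !Objects.equals(key, prevKey)) {
                current = new ArrayList<>();   // key changed: start a new group
                groups.add(current);
            }
            current.add(row);
            prevKey = key;
        }
        return groups;
    }

    public static void main(String[] args) {
        List<List<String>> g = group(
                List.of("a1", "a2", "b1", "c1", "c2"),
                s -> s.substring(0, 1));       // first character as the key
        System.out.println(g); // [[a1, a2], [b1], [c1, c2]]
    }
}
```

Like the extractor, this relies entirely on the input being ordered by the key; an unsorted input would produce one group per key *run*, not per key.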

>>7) What is CPU load during the cache preloading on all the machines? What
kind of VM options do you use (heap size, kind of GC, etc.)? 
As I mentioned, my local machine is less powerful than the UNIX server
(8 CPUs vs. 64 CPUs, 32 GB RAM vs. 1 TB RAM), and its network access to the
DB is slower (19 ms vs. 0.32 ms). Both JVMs are 64-bit JDK 1.8.0_65, with the
heap configured as -Xms2048m -Xmx20480m.

I can summarize my issues as below:

1) Highest priority: ironically, my local machine's batch run time is much
lower, 42 mins (local) vs. 2 hr 56 mins (Linux), both running on a single
node, even though my machine is less powerful and runs other applications at
the same time (Eclipse IDE, SQL Developer, Outlook, Word), while the UNIX box
is a dedicated app server with nothing else running on it.

2) Second, loading data from the DB into a Java collection is in general
faster than loading it into an Ignite cache by a factor of ~6. This is not a
show stopper, but we need to resolve the first problem as the highest
priority. I appreciate your help on this. Thanks

Deha



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Data-Loading-Performance-Issue-tp1958p1988.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
