I am running a bulk import of large columns, averaging about 15KB per
record (web page source).
I have one region server with only 1 region; no splits yet.
I have one other server running the thrift server, and that same server
runs a single-threaded import process.
At the start I see about 60-80 records inserted per 3 secs, as reported by
the master GUI, but once I hit my 64MB memcache limit on the region server
it blocks and flushes the column.
Immediately after that I see an insert rate of about 600-700 per 3 secs,
according to the master GUI, and this lasts until I am done inserting. It
only slows down for more flushes 20-25 secs later and then continues to
speed along.
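
For a rough sense of scale, here is a back-of-the-envelope sketch in Python
of what those rates mean in MB/s and how long each would take to fill the
memcache. It assumes every record is about 15KB, that 1 MB = 1024 KB, and it
ignores row key and per-cell overhead; the function names are just for
illustration.

```python
# Back-of-the-envelope throughput from the numbers reported by the master GUI.
# Assumptions: ~15 KB per record, rates are per 3-second GUI interval,
# 1 MB = 1024 KB, and per-record overhead (row keys, timestamps) is ignored.

RECORD_KB = 15
INTERVAL_SECS = 3
MEMCACHE_LIMIT_MB = 64


def mb_per_sec(records_per_interval):
    """Convert records per GUI interval into MB written per second."""
    return records_per_interval * RECORD_KB / 1024 / INTERVAL_SECS


def secs_to_fill(records_per_interval):
    """Seconds needed to fill the memcache at the given insert rate."""
    return MEMCACHE_LIMIT_MB / mb_per_sec(records_per_interval)


for label, rate in [("slow", 60), ("slow", 80), ("fast", 600), ("fast", 700)]:
    print(f"{label} {rate}/3s: {mb_per_sec(rate):.2f} MB/s, "
          f"{secs_to_fill(rate):.0f} s to fill {MEMCACHE_LIMIT_MB} MB")
```

Under those assumptions the fast rate fills 64MB in roughly 19 to 22 secs,
which is in the same ballpark as the 20-25 secs I see between flushes, so
the fast rate at least looks self-consistent.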
Any idea why it starts slow and jumps to a much higher insert rate after
the memcache flush?
Again, this is all single threaded, so no MR job or anything like that. I
have run this and seen it happen each time, with the flushes happening at
different times in the import and the same results, so that rules out
smaller data in the second half.
So I am wondering if this is something related to the region server or the thrift