Incrementally load HFiles outside of MR/Spark


Incrementally load HFiles outside of MR/Spark

Mike Thomsen
I'm looking into creating HFiles directly from NiFi using the HBase API. It
seems pretty straight forward:

1. Open a HFile.Writer pointing to a file path in HDFS.
2. Write the cells with the HFile API.
3. Call the incremental loader API to have it tell HBase to load the
generated segments.

Is that right? If so, are there any gotchas that I should be aware of?

Thanks,

Mike

Re: Incrementally load HFiles outside of MR/Spark

Ted Yu-3
You can refer to HFilePerformanceEvaluation where creation of Writer is
demonstrated:

      writer = HFile.getWriterFactoryNoCache(conf)
          .withPath(fs, mf)
          .withFileContext(hFileContext)
          .withComparator(CellComparator.getInstance())
          .create();
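Putting the writer together with the incremental loader, the three steps Mike describes could be sketched roughly as follows. This assumes HBase 2.x; the table name, paths, and cell values are hypothetical, and note that cells must be appended in sorted key order and that the loader expects HFiles under a per-column-family subdirectory:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.io.hfile.HFile;
import org.apache.hadoop.hbase.io.hfile.HFileContext;
import org.apache.hadoop.hbase.io.hfile.HFileContextBuilder;
import org.apache.hadoop.hbase.tool.LoadIncrementalHFiles;
import org.apache.hadoop.hbase.util.Bytes;

public class HFileBulkLoadSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    FileSystem fs = FileSystem.get(conf);

    // The loader expects the layout <dir>/<columnFamily>/<hfile>;
    // these paths and the table name are hypothetical.
    Path bulkDir = new Path("/tmp/bulk");
    Path hfilePath = new Path(bulkDir, "cf/hfile-0");
    TableName tableName = TableName.valueOf("t1");

    // Step 1: open an HFile.Writer pointing at a path in HDFS.
    HFileContext ctx = new HFileContextBuilder().build();
    try (HFile.Writer writer = HFile.getWriterFactoryNoCache(conf)
        .withPath(fs, hfilePath)
        .withFileContext(ctx)
        .create()) {
      // Step 2: append cells. They MUST arrive in sorted order
      // (row, family, qualifier), or the writer will reject them.
      writer.append(new KeyValue(Bytes.toBytes("row1"), Bytes.toBytes("cf"),
          Bytes.toBytes("q1"), System.currentTimeMillis(), Bytes.toBytes("v1")));
    }

    // Step 3: hand the directory to the incremental loader, which asks
    // the region servers to adopt the generated HFiles.
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin();
         Table table = conn.getTable(tableName);
         RegionLocator locator = conn.getRegionLocator(tableName)) {
      new LoadIncrementalHFiles(conf).doBulkLoad(bulkDir, admin, table, locator);
    }
  }
}
```

One gotcha worth noting: if a generated HFile spans more than one region boundary, the loader has to split it at load time, so writing one file per target region range is usually preferable.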

Cheers

On Sun, Feb 25, 2018 at 12:42 PM, Mike Thomsen <[hidden email]>
wrote:
