region files

Rajeshkumar J
Hi,

We have the region max file size set to 10 GB. Do all the HFiles of a region
reside on the same region server, or are they distributed?

Thanks

Re: region files

Ted Yu-3
The HFiles of a region are stored on HDFS. By default, HDFS has a replication
factor of 3.
If you're not using the read replica feature, any single region is served by
one region server (however, the data blocks of an HFile may not be on the
same node as the region server).
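
To make "may not be on the same node" concrete, here is a minimal sketch of how one might compute the fraction of a region's HFile data that has a replica on the region server's own host. It is a toy model, not HBase or HDFS code: the `Block` class, the `locality` function, and the host names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

MB = 1024 * 1024


@dataclass(frozen=True)
class Block:
    """Hypothetical model of one HDFS block and the hosts holding its replicas."""
    size_bytes: int
    replica_hosts: frozenset


def locality(blocks, region_server_host):
    """Fraction of bytes that have at least one replica on the given host."""
    total = sum(b.size_bytes for b in blocks)
    if total == 0:
        return 0.0
    local = sum(
        b.size_bytes for b in blocks if region_server_host in b.replica_hosts
    )
    return local / total


# Four blocks, each with three replicas (replication factor 3).
blocks = [
    Block(128 * MB, frozenset({"node1", "node2", "node3"})),
    Block(128 * MB, frozenset({"node2", "node4", "node5"})),
    Block(128 * MB, frozenset({"node1", "node4", "node6"})),
    Block(128 * MB, frozenset({"node3", "node5", "node6"})),
]

# The region is served from node1, which holds replicas of two of four blocks.
print(locality(blocks, "node1"))  # 0.5
```

Reads of the non-local blocks still work in this model; they just go over the network to another DataNode.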

Cheers

(In reply to Rajeshkumar J's message of Thu, May 25, 2017 at 11:45 PM)

Re: region files

Rajeshkumar J
Thanks, Ted. If the data blocks of an HFile may not be on the same node as the
region server, then how is data locality achieved when MapReduce is run over
HBase tables?



(In reply to Ted Yu's message of Fri, May 26, 2017 at 6:15 PM)

Re: region files

Ted Yu-3
Consider running a major compaction, which restores data locality.

Thanks

(In reply to Rajeshkumar J's message of May 26, 2017, at 6:08 AM)

Re: region files

Rajeshkumar J
I have seen in the code that, when the input splits are created, the region
info is sent along with each split. Is there any reason for that, given that
not all of the HFiles are going to be on that server?

(In reply to Ted Yu's message of Fri, May 26, 2017 at 7:06 PM)

Re: region files

Josh Elser-2
The assumption is that one of the three copies of each HDFS block
comprising your HFiles is stored on the local DataNode.
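
Under that assumption, the region location carried by a split acts as a scheduling hint: run the map task on the region server's host, where a copy of the data is expected to be. A rough sketch of the idea follows. `RegionSplit` and `pick_host` are hypothetical names for this illustration and are not the HBase or MapReduce scheduler API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionSplit:
    """Hypothetical input split: a key range plus the host serving that region."""
    start_key: str
    end_key: str
    region_location: str


def pick_host(split, hosts_with_free_slots):
    """Prefer the region server's host; otherwise fall back to any free host."""
    if split.region_location in hosts_with_free_slots:
        return split.region_location
    # Fallback: the task still runs, but its reads are no longer local.
    return min(hosts_with_free_slots)


splits = [
    RegionSplit("a", "m", "node1"),
    RegionSplit("m", "z", "node2"),
]
free_hosts = {"node1", "node3"}

for split in splits:
    host = pick_host(split, free_hosts)
    is_local = host == split.region_location
    print(split.start_key, split.end_key, host, "local" if is_local else "remote")
```

The hint is only a preference. If the preferred host has no free slot, the task runs elsewhere and reads remotely.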

That is what the major compaction process guarantees.
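
A small simulation can illustrate why a rewrite restores locality. It assumes HDFS places the first replica of each block on the writer's own DataNode (when the writer runs on one) and the remaining replicas on other nodes. All function names and host names here are hypothetical, and the model keeps the block count unchanged across compaction purely for simplicity.

```python
import random

REPLICATION = 3


def write_file(num_blocks, writer_host, all_hosts, rng):
    """Model a file write: first replica on the writer, the rest elsewhere."""
    others = [h for h in all_hosts if h != writer_host]
    return [
        {writer_host, *rng.sample(others, REPLICATION - 1)}
        for _ in range(num_blocks)
    ]


def locality(blocks, host):
    """Fraction of blocks with a replica on the given host."""
    return sum(host in replicas for replicas in blocks) / len(blocks)


def major_compact(blocks, region_server_host, all_hosts, rng):
    """Model a major compaction: the current region server rewrites the data."""
    return write_file(len(blocks), region_server_host, all_hosts, rng)


hosts = [f"node{i}" for i in range(1, 9)]
rng = random.Random(0)

# The region is first served by node1, which writes the HFile.
original = write_file(8, "node1", hosts, rng)
print("on node1:", locality(original, "node1"))

# The region moves to node2; locality now depends on where replicas happened to land.
before = locality(original, "node2")
print("after move to node2:", before)

# A major compaction on node2 rewrites the data with a local first replica.
compacted = major_compact(original, "node2", hosts, rng)
print("after major compaction:", locality(compacted, "node2"))
```

In this model, locality on the new host is back to 1.0 after the rewrite, whatever it was right after the move.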

(In reply to Rajeshkumar J's message of 5/26/17 9:59 AM)