Spatial data posting in HBase

cto
Hi,

I am very new to HBase. Could you please let me know how to insert spatial data (latitude/longitude) into HBase using Java?

Re: Spatial data posting in HBase

Ted Yu-3
There are plenty of examples in the unit tests, e.g.:

      Put put = new Put(Bytes.toBytes("row" + String.format("%1$04d", i)));
      put.add(family, null, value);
      table.put(put);

value is a byte[] that can be obtained through Bytes.toBytes().
family is the column family name as a byte[].
table is an HTable.
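
For latitude/longitude specifically, the value passed to the Put could be a small fixed-width byte array. The following is a minimal sketch, assuming both coordinates are stored together as two 8-byte doubles; the class LatLngCodec and its methods are hypothetical helpers, not part of HBase.

```java
import java.nio.ByteBuffer;

// Hypothetical helper (not an HBase class): packs a latitude/longitude pair
// into a 16-byte array that could serve as the `value` of a Put.
final class LatLngCodec {

    // Encodes latitude then longitude as two big-endian doubles.
    static byte[] encode(double lat, double lng) {
        if (!(lat >= -90 && lat <= 90) || !(lng >= -180 && lng <= 180)) {
            throw new IllegalArgumentException("coordinate out of range");
        }
        return ByteBuffer.allocate(2 * Double.BYTES)
                .putDouble(lat)
                .putDouble(lng)
                .array();
    }

    // Reverses encode(): returns {lat, lng}.
    static double[] decode(byte[] value) {
        ByteBuffer buf = ByteBuffer.wrap(value);
        return new double[] { buf.getDouble(), buf.getDouble() };
    }

    public static void main(String[] args) {
        byte[] value = encode(48.8566, 2.3522);
        double[] back = decode(value);
        System.out.println(value.length + " bytes -> " + back[0] + ", " + back[1]);
    }
}
```

Storing both coordinates in one value keeps a point readable with a single cell lookup; whether that suits you depends on how the data will be queried.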

Cheers


On Tue, Sep 24, 2013 at 4:15 AM, cto <[hidden email]> wrote:

> Hi ,
>
> I am very new in HBase. Could you please let me know , how to insert
> spatial
> data (Latitude / Longitude) in HBase using Java .
>
>
>
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/Spatial-data-posting-in-HBase-tp4051123.html
> Sent from the HBase User mailing list archive at Nabble.com.
>

Re: Spatial data posting in HBase

Adrien Mogenet
If you mean "insert and query" spatial data, look at algorithms that work
well with distributed databases: geohash, z-index (Z-order curve), Voronoi
diagrams...

Well, that makes me want to write a blog article about these topics :)
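
As an illustration of the first of these, here is a minimal sketch of the standard geohash encoding: bits alternate between longitude and latitude, each bit halves the remaining interval, and every five bits become one base-32 character. The class name GeohashSketch is an assumption made for this example.

```java
// Minimal geohash encoder, written for illustration only.
final class GeohashSketch {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    // Returns the geohash of (lat, lng) with the requested number of characters.
    static String encode(double lat, double lng, int length) {
        double latLo = -90, latHi = 90, lngLo = -180, lngHi = 180;
        StringBuilder hash = new StringBuilder(length);
        boolean useLng = true; // bits alternate, starting with longitude
        int bits = 0;
        int bitCount = 0;
        while (hash.length() < length) {
            if (useLng) {
                double mid = (lngLo + lngHi) / 2;
                if (lng >= mid) { bits = (bits << 1) | 1; lngLo = mid; }
                else            { bits = bits << 1;       lngHi = mid; }
            } else {
                double mid = (latLo + latHi) / 2;
                if (lat >= mid) { bits = (bits << 1) | 1; latLo = mid; }
                else            { bits = bits << 1;       latHi = mid; }
            }
            useLng = !useLng;
            if (++bitCount == 5) { // five bits make one base-32 character
                hash.append(BASE32.charAt(bits));
                bits = 0;
                bitCount = 0;
            }
        }
        return hash.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(57.64911, 10.40744, 5));
    }
}
```

Because a longer hash always starts with the shorter hash of the same point, nearby points usually share a prefix, which is what makes the hash usable as a leading part of a row key.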


On Tue, Sep 24, 2013 at 3:43 PM, Ted Yu <[hidden email]> wrote:

> There're plenty of examples in unit tests.
> e.g. :
>
>       Put put = new Put(Bytes.toBytes("row" + String.format("%1$04d", i)));
>       put.add(family, null, value);
>       table.put(put);
>
> value can be obtained through Bytes.toBytes().
> table is an HTable.
>
> Cheers
>
>



--
Adrien Mogenet
http://www.borntosegfault.com

Re: Spatial data posting in HBase

Enis Söztutar
You can look at the "HBase in Action" book, which contains a whole chapter
on an example GIS system on HBase.

Enis


On Fri, Oct 4, 2013 at 1:01 AM, Adrien Mogenet <[hidden email]> wrote:

> If you mean "insert and query" spatial data, look at algorithms that are
> "distributed databases compliant" : geohash, z-index, voronoi diagram...
>
> Well, that makes me want to write a blog article about these topics :)
>
>

Re: Spatial data posting in HBase

Michael Segel
In reply to this post by Adrien Mogenet
I don't think you really want to use geo hashes.

Check out InfoQ and search for Boris Lublinsky.

On Oct 4, 2013, at 3:01 AM, Adrien Mogenet <[hidden email]> wrote:

> If you mean "insert and query" spatial data, look at algorithms that are
> "distributed databases compliant" : geohash, z-index, voronoi diagram...
>
> Well, that makes me want to write a blog article about these topics :)
>
>

The opinions expressed here are mine, while they may reflect a cognitive thought, that is purely accidental.
Use at your own risk.
Michael Segel
michael_segel (AT) hotmail.com






Re: Spatial data posting in HBase

Michael Segel
Actually, just to follow up...

Geohash kinda has some issues at the global scale: edge conditions on quadrant boundaries.

You're better off tiling the map, and then within each tile you can filter the list.
I think you'd also want to be a bit more OO than what's in the HBase book.

You will want to create a spatial point object that has x,y and z coordinates instead of storing the x, y, and z coordinates as separate columns.

Depending on the size of the tile, the number of coordinates that you have to filter can become sparse, such that it's actually faster to just have a single thread run through it.
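
The tiling idea can be sketched as follows. This is only an illustration under stated assumptions: fixed-size tiles measured in degrees, a zero-padded "row_col" string as the tile key, and a hypothetical class TileGrid; none of this comes from the HBase book or from InfoQ. It assumes a grid at least three tiles wide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical fixed-size tile grid over the globe (illustration only).
final class TileGrid {
    private final double tileDegrees;
    private final long cols;
    private final long rows;

    TileGrid(double tileDegrees) {
        this.tileDegrees = tileDegrees;
        this.cols = (long) Math.ceil(360.0 / tileDegrees);
        this.rows = (long) Math.ceil(180.0 / tileDegrees);
    }

    // Column and row indexes, clamped so lng = 180 and lat = 90 stay in the last tile.
    long col(double lng) { return Math.min(cols - 1, (long) Math.floor((lng + 180.0) / tileDegrees)); }
    long row(double lat) { return Math.min(rows - 1, (long) Math.floor((lat + 90.0) / tileDegrees)); }

    // Zero-padded so that keys sort in row, then column order.
    String key(long row, long col) { return String.format(Locale.ROOT, "%05d_%05d", row, col); }

    String tileKey(double lat, double lng) { return key(row(lat), col(lng)); }

    // The tile containing the point plus its surrounding tiles.
    List<String> tileAndNeighbors(double lat, double lng) {
        long r = row(lat);
        long c = col(lng);
        List<String> keys = new ArrayList<>();
        for (long dr = -1; dr <= 1; dr++) {
            long nr = r + dr;
            if (nr < 0 || nr >= rows) continue; // no rows beyond the poles
            for (long dc = -1; dc <= 1; dc++) {
                long nc = Math.floorMod(c + dc, cols); // wrap around at +/-180 degrees
                keys.add(key(nr, nc));
            }
        }
        return keys;
    }
}
```

A query would compute the tile keys for the search point, fetch the rows for those tiles, and then filter the returned points in memory.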

Of course YMMV, however from experience... this works the best.

HTH

-Mike

On Oct 7, 2013, at 7:20 PM, Michael Segel <[hidden email]> wrote:

> I don't think you really want to use geo hashes.
>
> Check out InfoQ and Search on Boris Lublinsky.
>

Re: Spatial data posting in HBase

Otis Gospodnetic
In reply to this post by cto
Consider using Solr, which provides a lot of geospatial search support.

Otis
Solr & ElasticSearch Support
http://sematext.com/
On Sep 24, 2013 8:29 AM, "cto" <[hidden email]> wrote:

> Hi ,
>
> I am very new in HBase. Could you please let me know , how to insert
> spatial
> data (Latitude / Longitude) in HBase using Java .
>
>
>
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/Spatial-data-posting-in-HBase-tp4051123.html
> Sent from the HBase User mailing list archive at Nabble.com.
>

Re: Spatial data posting in HBase

Michael Segel-3
And Solr has what to do with storing data in HBase?

I guess it's true… if all you have is a hammer…

The point I was raising was that geohash isn't the most efficient way to go when you look at the problem at a global level…

On Oct 9, 2013, at 9:34 AM, Otis Gospodnetic <[hidden email]> wrote:

> Consider using Solr, which provides a lot of geospatial search support.
>
> Otis
> Solr & ElasticSearch Support
> http://sematext.com/

Re: Spatial data posting in HBase

Otis Gospodnetic
The point is that there are options (multiple different hammers) if
HBase support for geospatial is not there or doesn't meet OP's needs.

Otis
--
Solr & ElasticSearch Support -- http://sematext.com/
Performance Monitoring -- http://sematext.com/spm



On Wed, Oct 9, 2013 at 11:14 AM, Michael Segel
<[hidden email]> wrote:

> And Solr has what to do with storing data in HBase?
>
> I guess its true… if all you have is a hammer…
>
> The point I was raising was that geohash isn't the most efficient way to go when you look at the problem at a global level…
>

Re: Spatial data posting in HBase

Nick Dimiduk
In reply to this post by cto
Hi there,

Just like other data modeling questions in HBase, how to store your spatial
data will depend on how you want to access it. Are you focused on update
performance or read queries? Are you accessing data based on point-in-space
(i.e., match lng,lat within a level of accuracy), spatial extent queries, or
other attribute queries (e.g., street address)? Can you provide us with any
more information?

Thanks,
Nick

*disclaimer* I'm responsible for the chapter on spatial data in HBase in
Action. Any errors in that example are my own ;)


On Tue, Sep 24, 2013 at 4:15 AM, cto <[hidden email]> wrote:

> Hi ,
>
> I am very new in HBase. Could you please let me know , how to insert
> spatial
> data (Latitude / Longitude) in HBase using Java .
>
>
>
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/Spatial-data-posting-in-HBase-tp4051123.html
> Sent from the HBase User mailing list archive at Nabble.com.
>

Re: Spatial data posting in HBase

Michael Segel-3
In reply to this post by Otis Gospodnetic
HBase in Action goes into great depth showing you how you could implement GIS information in HBase.

Unfortunately, there are issues with Geohash and edge conditions which make it difficult to use when you're dealing with data on the edge of a quadrant.

A better way would be to create a point (geospatial point object) and store it in a single column.
(This goes beyond the example in the book.) And then index the data by tiles.


The downside is that you end up creating a lot more data…

Take a look at some of the stuff Boris Lublinsky published on InfoQ. There are also other articles on the net…

On Oct 9, 2013, at 1:35 PM, Otis Gospodnetic <[hidden email]> wrote:

> The point is that there are options (multiple different hammers) if
> HBase support for geospatial is not there or doesn't meet OP's needs.
>
> Otis
> --
> Solr & ElasticSearch Support -- http://sematext.com/
> Performance Monitoring -- http://sematext.com/spm
>
>
>

Re: Spatial data posting in HBase

Adrien Mogenet
Michael, don't you think Geohashes can be satisfying and well-suited for
many cases anyway? Searching in a bounding box or arbitrary polygon is not
that heavy with Geohash, even on edge conditions. The biggest risk IMHO is
having to deal with tons of invalid extra points if the geohash query is
not precise enough and your point distribution is very sparse, so that
many points will be found in a geohash even though they don't match your
query criteria.
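
The extra points mentioned above are usually discarded by a second, exact check on the client. A minimal sketch, assuming the candidates have already been fetched by a geohash prefix scan and that the bounding box does not cross the +/-180 degree meridian; the class BoundingBoxFilter and its Point type are hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Exact post-filter for candidates returned by a coarse geohash lookup.
final class BoundingBoxFilter {

    // Simple immutable point; a real schema would carry more attributes.
    static final class Point {
        final double lat;
        final double lng;

        Point(double lat, double lng) {
            this.lat = lat;
            this.lng = lng;
        }
    }

    // Keeps only the candidates that really lie inside the box (bounds inclusive).
    static List<Point> within(List<Point> candidates,
                              double minLat, double maxLat,
                              double minLng, double maxLng) {
        List<Point> hits = new ArrayList<>();
        for (Point p : candidates) {
            boolean inLat = p.lat >= minLat && p.lat <= maxLat;
            boolean inLng = p.lng >= minLng && p.lng <= maxLng;
            if (inLat && inLng) {
                hits.add(p);
            }
        }
        return hits;
    }
}
```

The coarser the geohash prefix used for the scan, the more candidates this filter has to throw away.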

However, if your query embeds enough bits of precision, Geohashes offer
some nice guarantees for distributed databases and your queries should
remain efficient enough.

Another worst case of course is looking for K-NN, since Geohash is not a
real longest-common-prefix algorithm, but once again, if your point
distribution is approximately well balanced, this doesn't work that badly,
without doing lots of recursive queries or fetching tons of useless data
(but I do agree looking into your tiles would probably be more appropriate
in that case).

I'm planning to write an article on these points, so further technical
arguments are welcome :-}

On Thu, Oct 10, 2013 at 7:51 PM, Michael Segel <[hidden email]> wrote:

> HBase in Action goes through great depth of showing you how you could
> implement GIS information in HBase.
>
> Unfortunately there are issues with Geohash and edge conditions which make
> it difficult to use when you're dealing with data on an edge of a quadrant.
>
> A better way would be to create a point (geospatial point object) and
> store it in a single column.
> (This goes beyond the example of what's in the book. ) And then index the
> data by tiles.
>
>
> The downside is that you end up creating a lot more data…
>
> Take a look at some of the stuff Boris Lublinsky published on InfoQ. There
> are also other articles on the net….
>

Re: Spatial data posting in HBase

Michael Segel
Adrien,

In terms of efficiency...

A general solution that can be applied to all problems in all areas is going to be best.
Geohash gets ugly when you're around the equator.  You can have two points literally a couple of km away that would have two very different geo hashes.
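
The equator effect can be reproduced with a few lines. This is a minimal sketch using the standard geohash bit interleaving; the class GeohashEquatorDemo and the sample coordinates are assumptions chosen for illustration. The two points differ by 0.02 degrees of latitude, which is roughly 2 km.

```java
// Shows that two nearby points on opposite sides of the equator share no geohash prefix.
final class GeohashEquatorDemo {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    // Standard geohash encoding: alternate longitude and latitude bits, 5 bits per character.
    static String encode(double lat, double lng, int length) {
        double[] latRange = { -90, 90 };
        double[] lngRange = { -180, 180 };
        StringBuilder hash = new StringBuilder(length);
        int bits = 0;
        for (int bit = 0; hash.length() < length; bit++) {
            double[] range = (bit % 2 == 0) ? lngRange : latRange;
            double value = (bit % 2 == 0) ? lng : lat;
            double mid = (range[0] + range[1]) / 2;
            bits <<= 1;
            if (value >= mid) { bits |= 1; range[0] = mid; } else { range[1] = mid; }
            if (bit % 5 == 4) { // every fifth bit completes a character
                hash.append(BASE32.charAt(bits));
                bits = 0;
            }
        }
        return hash.toString();
    }

    // Number of leading characters two hashes have in common.
    static int commonPrefixLength(String a, String b) {
        int n = Math.min(a.length(), b.length());
        int i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) i++;
        return i;
    }

    public static void main(String[] args) {
        String north = encode(0.01, 10.0, 6);
        String south = encode(-0.01, 10.0, 6);
        System.out.println(north + " vs " + south
                + ", common prefix length " + commonPrefixLength(north, south));
    }
}
```

A plain prefix scan therefore cannot cover both points; a query near such a boundary has to look at more than one cell.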

So if you tile the globe, depending on the size of the tile, you calculate the tile, its surrounding tiles (if necessary) and then sweep through the data to find your object.

I'm not suggesting that you not use geohash, just that it's not going to be the most efficient.

Note that the downside to tiling is that if you're doing a geospatial index... your data volume explodes because you are storing references to the data at different tile levels.

It's a trade-off.



On Oct 12, 2013, at 2:34 AM, Adrien Mogenet <[hidden email]> wrote:

> Michael, don't you think Geohashes can be satisfying and well-suited for
> many cases anyway? Searching in a bounding box or arbitrary polygon is not
> that heavy with Geohash, even on edge conditions. The biggest risk IMHO is
> to have to deal with tons of invalid extra points if the geohash query is
> not accurate enough and your points distribution is very sparse so that
> many points will be found in a geohash despite they don't respond to your
> query criteria.
>
> However, if your query embeds enough bits of precision, Geohashes offer
> some nice guarantees for distributed databases and your queries should
> remain efficient enough.
>
> Another worst case of course is to look for K-NN since Geohash is not a
> real longest-common-prefix algorithm but once again, if your points
> distribution is approximately well balanced, this works not that bad
> without doing lots of recursive queries or fetching tons of useless data
> (but I do agree looking into your tiles would probably be more appropriate
> in that case).
>
> I'm planning to write an article on that points, so further technical
> arguments are welcome :-}
>

Re: Spatial data posting in HBase

Nick Dimiduk
You can treat a geohash of a fixed precision as a tile and calculate the
neighbors of that tile. This is precisely what I did in the chapter in
HBaseIA. In that way, it's no different than a tile system.
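
A rough sketch of that idea, not the code from the book: decode the fixed-precision geohash to its bounding box, step one cell width or height in each direction from the cell center, and re-encode at the same precision. The class GeohashTiles and its method names are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Treats a fixed-precision geohash as a tile and computes its neighboring tiles.
final class GeohashTiles {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    static String encode(double lat, double lng, int length) {
        double[] latRange = { -90, 90 };
        double[] lngRange = { -180, 180 };
        StringBuilder hash = new StringBuilder(length);
        int bits = 0;
        for (int bit = 0; hash.length() < length; bit++) {
            double[] range = (bit % 2 == 0) ? lngRange : latRange; // longitude first
            double value = (bit % 2 == 0) ? lng : lat;
            double mid = (range[0] + range[1]) / 2;
            bits <<= 1;
            if (value >= mid) { bits |= 1; range[0] = mid; } else { range[1] = mid; }
            if (bit % 5 == 4) { hash.append(BASE32.charAt(bits)); bits = 0; }
        }
        return hash.toString();
    }

    // Returns {latLo, latHi, lngLo, lngHi} of the cell named by the hash.
    static double[] bounds(String hash) {
        double[] latRange = { -90, 90 };
        double[] lngRange = { -180, 180 };
        int bit = 0;
        for (char ch : hash.toCharArray()) {
            int value = BASE32.indexOf(ch);
            if (value < 0) throw new IllegalArgumentException("not a geohash character: " + ch);
            for (int mask = 16; mask > 0; mask >>= 1, bit++) {
                double[] range = (bit % 2 == 0) ? lngRange : latRange;
                double mid = (range[0] + range[1]) / 2;
                if ((value & mask) != 0) range[0] = mid; else range[1] = mid;
            }
        }
        return new double[] { latRange[0], latRange[1], lngRange[0], lngRange[1] };
    }

    // Up to eight surrounding cells at the same precision.
    static List<String> neighbors(String hash) {
        double[] b = bounds(hash);
        double height = b[1] - b[0];
        double width = b[3] - b[2];
        double lat = (b[0] + b[1]) / 2;
        double lng = (b[2] + b[3]) / 2;
        List<String> result = new ArrayList<>();
        for (int dy = -1; dy <= 1; dy++) {
            for (int dx = -1; dx <= 1; dx++) {
                if (dy == 0 && dx == 0) continue;
                double nLat = lat + dy * height;
                if (nLat < -90 || nLat > 90) continue; // nothing beyond the poles
                double nLng = lng + dx * width;
                if (nLng >= 180) nLng -= 360;          // wrap around the globe
                if (nLng < -180) nLng += 360;
                result.add(encode(nLat, nLng, hash.length()));
            }
        }
        return result;
    }
}
```

Because the neighbors are computed from coordinates rather than from string prefixes, a cell just north of the equator gets its southern neighbors even though their hashes start with a different character.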


On Sat, Oct 12, 2013 at 11:33 AM, Michael Segel
<[hidden email]> wrote:

> Adrien,
>
> In terms of efficiency...
>
> A general solution that can be applied to all problems in all areas is
> going to be best.
> Geohash gets ugly when you're around the equator.  You can have two points
> literally a couple of km away that would have two very different geo hashes.
>
> So if you tile the globe, depending on the size of the tile, you calculate
> the tile, its surrounding tiles (if necessary) and then sweep through the
> data to find your object.
>
> I'm not suggesting you not to use geohash, just that its not going to be
> the most efficient.
>
> Note that the the downside to tiling is that if you're doing a geospatial
> index... your data volume explodes because you are storing references to
> the data at different tile levels.
>
> Its a trade off.
>
>
>

Re: Spatial data posting in HBase

Adrien Mogenet
This is also what I had in mind. Computing the neighbors and/or the higher
level of a "tile" is quite an easy bit manipulation. Dealing with equator
corner cases should not be considered an issue.


On Sun, Oct 13, 2013 at 1:16 AM, Nick Dimiduk <[hidden email]> wrote:

> You can treat a geohash of a fixed precision as a tile and calculate the
> neighbors of that tile. This is precisely what I did in the chapter in
> HBaseIA. In that way, it's no different than a tile system.
>
>
> On Sat, Oct 12, 2013 at 11:33 AM, Michael Segel
> <[hidden email]>wrote:
>
> > Adrien,
> >
> > In terms of efficiency...
> >
> > A general solution that can be applied to all problems in all areas is
> > going to be best.
> > Geohash gets ugly when you're around the equator.  You can have two
> points
> > literally a couple of km away that would have two very different geo
> hashes.
> >
> > So if you tile the globe, depending on the size of the tile, you
> calculate
> > the tile, its surrounding tiles (if necessary) and then sweep through the
> > data to find your object.
> >
> > I'm not suggesting you not to use geohash, just that its not going to be
> > the most efficient.
> >
> > Note that the the downside to tiling is that if you're doing a geospatial
> > index... your data volume explodes because you are storing references to
> > the data at different tile levels.
> >
> > Its a trade off.
> >
> >
> >
> > On Oct 12, 2013, at 2:34 AM, Adrien Mogenet <[hidden email]>
> > wrote:
> >
> > > Michael, don't you think Geohashes can be satisfying and well-suited
> for
> > > many cases anyway? Searching in a bounding box or arbitrary polygon is
> > not
> > > that heavy with Geohash, even on edge conditions. The biggest risk IMHO
> > is
> > > to have to deal with tons of invalid extra points if the geohash query
> is
> > > not accurate enough and your points distribution is very sparse so that
> > > many points will be found in a geohash despite they don't respond to
> your
> > > query criteria.
> > >
> > > However, if your query embeds enough bits of precision, Geohashes offer
> > > some nice guarantees for distributed databases and your queries should
> > > remain efficient enough.
> > >
> > > Another worst case of course is to look for K-NN since Geohash is not a
> > > real longest-common-prefix algorithm but once again, if your points
> > > distribution is approximately well balanced, this works not that bad
> > > without doing lots of recursive queries or fetching tons of useless
> data
> > > (but I do agree looking into your tiles would probably be more
> > appropriate
> > > in that case).
> > >
> > > I'm planning to write an article on that points, so further technical
> > > arguments are welcome :-}
> > >
> > > On Thu, Oct 10, 2013 at 7:51 PM, Michael Segel <
> > [hidden email]>wrote:
> > >
> > >> HBase in Action goes through great depth of showing you how you could
> > >> implement GIS information in HBase.
> > >>
> > >> Unfortunately there are issues with Geohash and edge conditions which
> > make
> > >> it difficult to use when you're dealing with data on an edge of a
> > quadrant.
> > >>
> > >> A better way would be to create a point (geospatial point object) and
> > >> store it in a single column.
> > >> (This goes beyond the example of what's in the book. ) And then index
> > the
> > >> data by tiles.
> > >>
> > >>
> > >> The downside is that you end up creating a lot more data…
> > >>
> > >> Take a look at some of the stuff Boris Lublinsky published on InfoQ.
> > There
> > >> are also other articles on the net….
> > >>
> > >> On Oct 9, 2013, at 1:35 PM, Otis Gospodnetic <
> > [hidden email]>
> > >> wrote:
> > >>
> > >>> The point is that there are options (multiple different hammers) if
> > >>> HBase support for geospatial is not there or doesn't meet OP's needs.
> > >>>
> > >>> Otis
> > >>> --
> > >>> Solr & ElasticSearch Support -- http://sematext.com/
> > >>> Performance Monitoring -- http://sematext.com/spm
> > >>>
> > >>>
> > >>>
> > >>> On Wed, Oct 9, 2013 at 11:14 AM, Michael Segel
> > >>> <[hidden email]> wrote:
> > >>>> And Solr has what to do with storing data in HBase?
> > >>>>
> > >>>> I guess its true… if all you have is a hammer…
> > >>>>
> > >>>> The point I was raising was that geohash isn't the most efficient
> way
> > >> to go when you look at the problem at a global level…
> > >>>>
> > >>>> On Oct 9, 2013, at 9:34 AM, Otis Gospodnetic <
> > >> [hidden email]> wrote:
> > >>>>
> > >>>>> Consider using Solr, which provides a lot of geospatial search
> > support.
> > >>>>>
> > >>>>> Otis
> > >>>>> Solr & ElasticSearch Support
> > >>>>> http://sematext.com/
> > >>>>> On Sep 24, 2013 8:29 AM, "cto" <[hidden email]> wrote:
> > >>>>>
> > >>>>>> Hi ,
> > >>>>>>
> > > >>>>>> I am very new to HBase. Could you please let me know how to insert
> > > >>>>>> spatial data (Latitude / Longitude) in HBase using Java?
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> --
> > >>>>>> View this message in context:
> > >>>>>>
> > >>
> >
> http://apache-hbase.679495.n3.nabble.com/Spatial-data-posting-in-HBase-tp4051123.html
> > >>>>>> Sent from the HBase User mailing list archive at Nabble.com.
> > >>>>>>
> > >>>>
> > >>>
> > >>
> > >>
> > >
> > >
> > > --
> > > Adrien Mogenet
> > > http://www.borntosegfault.com
> >
> > The opinions expressed here are mine, while they may reflect a cognitive
> > thought, that is purely accidental.
> > Use at your own risk.
> > Michael Segel
> > michael_segel (AT) hotmail.com
> >
> >
> >
> >
> >
> >
>



--
Adrien Mogenet
http://www.borntosegfault.com

Re: Spatial data posting in HBase

Michael Segel-3
Yes, you can, but you're doing more work to calculate the geohash when you don't have to.
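
A minimal sketch of what skipping the geohash could look like: map latitude/longitude straight onto a fixed grid and use the tile coordinates as a row-key prefix. The class name `TileKey`, the equirectangular grid of 2^zoom tiles per axis, and the key layout are assumptions made for illustration; none of them is specified in this thread.

```java
import java.util.Locale;

// Hypothetical helper: maps a lat/lon pair to a tile on a simple
// equirectangular grid with 2^zoom tiles per axis (an assumed tiling scheme).
final class TileKey {
    private TileKey() {}

    // Column index, counted eastward from longitude -180.
    public static int tileX(double lon, int zoom) {
        int n = 1 << zoom;
        int x = (int) Math.floor((lon + 180.0) / 360.0 * n);
        return Math.min(Math.max(x, 0), n - 1); // clamp lon == 180 into the last column
    }

    // Row index, counted northward from latitude -90.
    public static int tileY(double lat, int zoom) {
        int n = 1 << zoom;
        int y = (int) Math.floor((lat + 90.0) / 180.0 * n);
        return Math.min(Math.max(y, 0), n - 1); // clamp lat == 90 into the last row
    }

    // Zero-padded key so that lexicographic order follows zoom, then x, then y.
    public static String rowKey(double lat, double lon, int zoom) {
        return String.format(Locale.ROOT, "%02d/%06d/%06d",
                zoom, tileX(lon, zoom), tileY(lat, zoom));
    }
}
```

The resulting string could be turned into bytes with Bytes.toBytes() and used as the row key of a Put, as in Ted's earlier example. Whether a string key or a packed binary key is the better choice depends on the table design.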

On Oct 13, 2013, at 5:33 AM, Adrien Mogenet <[hidden email]> wrote:

> This is also what I had in mind. Computing the neighbors and/or the higher
> level of a "tile" is a fairly easy bit manipulation. Dealing with equator
> corner cases should not be considered an issue.
>
>
> On Sun, Oct 13, 2013 at 1:16 AM, Nick Dimiduk <[hidden email]> wrote:
>
>> You can treat a geohash of a fixed precision as a tile and calculate the
>> neighbors of that tile. This is precisely what I did in the chapter in
>> HBaseIA. In that way, it's no different than a tile system.
>>
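
The neighbor and parent-tile calculations mentioned above can be sketched roughly as follows. This assumes tiles are addressed by integer (x, y) coordinates on a grid of 2^zoom tiles per axis, with x wrapping around the antimeridian and y bounded at the poles. The class and method names are hypothetical and are not taken from HBase in Action.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for a grid of 2^zoom x 2^zoom tiles.
final class TileNeighbors {
    private TileNeighbors() {}

    // The enclosing tile one level up: drop the lowest bit of each coordinate.
    public static int[] parent(int x, int y) {
        return new int[] {x >> 1, y >> 1};
    }

    // Up to eight surrounding tiles. Columns wrap east/west; rows beyond
    // the poles are skipped.
    public static List<int[]> neighbors(int x, int y, int zoom) {
        if (zoom < 2) {
            // With fewer than 4 columns, wrapping would produce duplicates.
            throw new IllegalArgumentException("zoom must be >= 2");
        }
        int n = 1 << zoom;
        List<int[]> out = new ArrayList<>();
        for (int dy = -1; dy <= 1; dy++) {
            for (int dx = -1; dx <= 1; dx++) {
                if (dx == 0 && dy == 0) continue;   // skip the tile itself
                int ny = y + dy;
                if (ny < 0 || ny >= n) continue;    // nothing past a pole
                int nx = Math.floorMod(x + dx, n);  // wrap across the antimeridian
                out.add(new int[] {nx, ny});
            }
        }
        return out;
    }
}
```

A query for a point would then scan the point's own tile plus the tiles returned by `neighbors`, and filter the results by actual distance.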
>>
>> On Sat, Oct 12, 2013 at 11:33 AM, Michael Segel <[hidden email]> wrote:
>>
>>> Adrien,
>>>
>>> In terms of efficiency...
>>>
>>> A general solution that can be applied to all problems in all areas is
>>> going to be best.
>>> Geohash gets ugly when you're around the equator. You can have two points
>>> literally a couple of km apart that would have two very different
>>> geohashes.
>>>
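
The equator effect described above can be reproduced with a small encoder. This is a sketch of the standard geohash bit interleaving (longitude bit first, base-32 alphabet without a, i, l, o), written here for illustration rather than taken from any library. The two sample points are made up: they share a longitude and sit roughly 200 m apart on either side of the equator.

```java
// Illustrative geohash encoder; not a library API.
final class GeohashDemo {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    private GeohashDemo() {}

    public static String encode(double lat, double lon, int length) {
        double latLo = -90.0, latHi = 90.0, lonLo = -180.0, lonHi = 180.0;
        StringBuilder sb = new StringBuilder(length);
        boolean lonTurn = true; // bits alternate, starting with longitude
        int bits = 0, value = 0;
        while (sb.length() < length) {
            if (lonTurn) {
                double mid = (lonLo + lonHi) / 2;
                if (lon >= mid) { value = (value << 1) | 1; lonLo = mid; }
                else            { value <<= 1;              lonHi = mid; }
            } else {
                double mid = (latLo + latHi) / 2;
                if (lat >= mid) { value = (value << 1) | 1; latLo = mid; }
                else            { value <<= 1;              latHi = mid; }
            }
            lonTurn = !lonTurn;
            if (++bits == 5) { // every 5 bits become one base-32 character
                sb.append(BASE32.charAt(value));
                bits = 0;
                value = 0;
            }
        }
        return sb.toString();
    }

    // Length of the shared prefix of two hashes.
    public static int commonPrefix(String a, String b) {
        int i = 0;
        while (i < a.length() && i < b.length() && a.charAt(i) == b.charAt(i)) i++;
        return i;
    }

    public static void main(String[] args) {
        String north = encode(0.001, 10.0, 8);
        String south = encode(-0.001, 10.0, 8);
        System.out.println(north + " vs " + south
                + ", shared prefix length = " + commonPrefix(north, south));
    }
}
```

Because the very first latitude bit differs, the two hashes share no prefix at all, so a single prefix scan cannot cover both points. Scanning the neighboring cells as well is the usual workaround.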
>>> So if you tile the globe, depending on the size of the tile, you calculate
>>> the tile, its surrounding tiles (if necessary) and then sweep through the
>>> data to find your object.
>>>
>>> I'm not suggesting that you not use geohash, just that it's not going to be
>>> the most efficient.
>>>
>>> Note that the downside to tiling is that if you're doing a geospatial
>>> index... your data volume explodes because you are storing references to
>>> the data at different tile levels.
>>>
>>> It's a trade-off.
>>>
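
To make the storage cost of a multi-level tile index concrete, here is a rough sketch that produces one index key per tile level for a single point. The equirectangular grid, the key layout, and the names `MultiLevelIndex` and `indexKeys` are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical index-key generator: one reference per zoom level per point.
final class MultiLevelIndex {
    private MultiLevelIndex() {}

    public static List<String> indexKeys(String pointId, double lat, double lon,
                                         int minZoom, int maxZoom) {
        List<String> keys = new ArrayList<>();
        for (int z = minZoom; z <= maxZoom; z++) {
            int n = 1 << z; // tiles per axis at this level
            int x = clamp((int) Math.floor((lon + 180.0) / 360.0 * n), n);
            int y = clamp((int) Math.floor((lat + 90.0) / 180.0 * n), n);
            // Key layout: zoom/x/y/pointId, zero-padded so keys sort by tile.
            keys.add(String.format(Locale.ROOT, "%02d/%06d/%06d/%s", z, x, y, pointId));
        }
        return keys;
    }

    private static int clamp(int v, int n) {
        return Math.min(Math.max(v, 0), n - 1);
    }
}
```

Each point stored once in the data table yields (maxZoom - minZoom + 1) rows in the index table, which is where the extra data volume comes from.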
>>>
>>>
>>> On Oct 12, 2013, at 2:34 AM, Adrien Mogenet <[hidden email]> wrote:
>>>
>>>> Michael, don't you think Geohashes can be satisfactory and well-suited for
>>>> many cases anyway? Searching in a bounding box or arbitrary polygon is not
>>>> that heavy with Geohash, even on edge conditions. The biggest risk IMHO is
>>>> having to deal with tons of invalid extra points if the geohash query is
>>>> not accurate enough and your point distribution is very sparse, so that
>>>> many points will be found in a geohash even though they don't match your
>>>> query criteria.
>>>>
>>>> However, if your query embeds enough bits of precision, Geohashes offer
>>>> some nice guarantees for distributed databases and your queries should
>>>> remain efficient enough.
>>>>
>>>> Another worst case of course is looking for K-NN, since Geohash is not a
>>>> real longest-common-prefix algorithm, but once again, if your point
>>>> distribution is approximately well balanced, this works reasonably well
>>>> without doing lots of recursive queries or fetching tons of useless data
>>>> (but I do agree looking into your tiles would probably be more appropriate
>>>> in that case).
>>>>
>>>> I'm planning to write an article on these points, so further technical
>>>> arguments are welcome :-}
>>>>
>>>>
>>>> --
>>>> Adrien Mogenet
>>>> http://www.borntosegfault.com
>>>
>>> The opinions expressed here are mine, while they may reflect a cognitive
>>> thought, that is purely accidental.
>>> Use at your own risk.
>>> Michael Segel
>>> michael_segel (AT) hotmail.com
>>>
>>>
>>>
>>>
>>>
>>>
>>
>
>
>
> --
> Adrien Mogenet
> http://www.borntosegfault.com

