I downloaded the connectors from https://cloud.google.com/hadoop/datastore-connector, and I'm now trying to add the datastore-connector (and the bigquery-connector too) as a dependency in my pom. I don't know if this is possible: I could not find the right artifactId and groupId.

Is there a Maven repository that contains the datastore-connector?

Furthermore, I am looking for the source of the datastore-connector, but I couldn't find it. According to the notes in CHANGES.txt, it seems to come from:

https://github.com/GoogleCloudPlatform/bigdata-interop

The source should be in the package com.google.cloud.hadoop.io.datastore (src/main/***/com/google/cloud/hadoop/io/datastore/) but it's not there.

In fact, the source of bigquery-connector appears to be on GitHub along with its pom, but is the source of datastore-connector available?

Misha Brukman

2 Answers

The datastore-connector source is not available, nor is there a Maven repo with the artifact. Your best option is to create a local Maven repo in your source tree as described in this helpful article.
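
As a rough sketch of that approach, the pom fragment below declares a file-based repository inside the project and depends on the jar installed there. Since no official coordinates are published, the groupId, version, the `in-project-repo` id, and the `repo` directory name are all made-up placeholders; pick whatever you like, as long as the values match the ones used when installing the jar.

```xml
<!-- Fragment for the project's pom.xml. All coordinates below are
     hypothetical placeholders; the connector has no published ones. -->

<!-- First install the downloaded jar into the in-project repo, e.g.:
     mvn install:install-file -Dfile=path/to/datastore-connector.jar
         -DgroupId=local.google.cloud.hadoop -DartifactId=datastore-connector
         -Dversion=LOCAL -Dpackaging=jar
         -DlocalRepositoryPath=repo
     (run as a single line from the project root; the jar path is an assumption) -->

<repositories>
  <repository>
    <!-- Assumed id and directory name; the directory lives in the source tree. -->
    <id>in-project-repo</id>
    <url>file://${project.basedir}/repo</url>
  </repository>
</repositories>

<dependencies>
  <dependency>
    <!-- Must match the values passed to install:install-file above. -->
    <groupId>local.google.cloud.hadoop</groupId>
    <artifactId>datastore-connector</artifactId>
    <version>LOCAL</version>
  </dependency>
</dependencies>
```

The same steps should work for the bigquery-connector jar if you prefer not to build it from the GitHub source. Committing the `repo` directory keeps the build reproducible for anyone who checks out the project.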

David

What David says in the other answer is correct. To elaborate, under the hood the connector uses the Protocol Buffers SDK and relies on, for example, the QuerySplitter to define splits. In the near future, we will be posting more information to gcp-hadoop-announce with further guidance regarding the future of the Datastore connector for Hadoop.

You may want to familiarize yourself with other Datastore features that may suit your purposes better, including Datastore backup to GCS, and this codelab walking through an AppEngine-friendly approach to extracting data from Datastore and loading it into BigQuery for analysis. You may notice at the top of that page an announcement of trusted-tester availability for direct backend loading of Datastore backups into BigQuery.

Dennis Huo