Questions tagged [hawq]

This tag is for questions about Pivotal HAWQ, a SQL-on-Hadoop implementation.

Pivotal HAWQ supports low-latency analytic SQL queries, coupled with massively parallel machine learning capabilities, to shorten data-driven innovation cycles for the enterprise. HAWQ enables discovery-based analysis of large data sets and rapid, iterative development of data analytics applications that apply deep machine learning. It reads data from and writes data to HDFS natively. With HAWQ, you can interact with petabyte-range data sets. HAWQ provides users with a complete, standards-compliant SQL interface to Hadoop.

127 questions
8 votes, 3 answers

Error: relation does not exist, on Greenplum database

I'm working on PostgreSQL 8.2.15 (Greenplum database 4.2.0 build 1)(HAWQ 1.2.1.0 build 10335). I wrote a function like create or replace function my_function ( ... select exists(select 1 from my_table1 where condition) into result; I tested it…
Clxy
  • 435
  • 5
  • 12
7 votes, 1 answer

Talend greenplumRow error handling

I want to create views in Greenplum HAWQ using a simple Talend job that would basically have a file input containing all the views; then I need to execute the CREATE VIEW script. Since these views (50-60.000) come from an Oracle system I need to…
Balazs Gunics
  • 1,849
  • 2
  • 15
  • 23
4 votes, 3 answers

Why is HDFS not preferred for applications that require low latency?

I am new to Hadoop and HDFS, and it confuses me why HDFS is not preferred for applications that require low latency. In a big data scenario, we would have data spread over different commodity hardware, so accessing the data should be faster.
Nidhi
  • 174
  • 1
  • 1
  • 13
3 votes, 1 answer

psql: database "template0" is not currently accepting connections

We have installed a fresh GPDB database, but connecting to the template0 database fails: [gpadmin@mdw~]$ psql -d template0 psql: FATAL: database "template0" is not currently accepting connections [gpadmin@mdw~]$ We tried to update the FLAG…
NEO
  • 319
  • 7
  • 26
3 votes, 1 answer

When should I use Greenplum Database versus HAWQ?

We have a use case involving retail industry data and are building an EDW. We currently do reporting from HAWQ, but we want to move our MPP database from HAWQ to Greenplum. Basically, we would like to make changes to the current data…
NEO
  • 319
  • 7
  • 26
3 votes, 1 answer

Connecting Spark to HAWQ via JDBC driver

Trying to connect to HAWQ from Spark, using Greenplum's ODBC/JDBC drivers (downloaded from the proper Pivotal page). Using Spark 1.4, here's the sample code written in Python: (All capitals have proper variable assignments) ... from pyspark import…
WaveRider
  • 367
  • 2
  • 10
3 votes, 2 answers

Greenplum, Pivotal HD + Spark, or HAWQ for TBs of Structured Data?

I have TBs of structured data in a Greenplum DB. I need to run what is essentially a MapReduce job on my data. I found myself reimplementing at least the features of MapReduce just so that this data would fit in memory (in a streaming fashion). …
BAR
  • 12,752
  • 18
  • 79
  • 153
2 votes, 3 answers

HAWQ PostgreSQL - Increment row based on previous row

I need to create table2 from table1 by updating the table below:

TABLE1:
ID      Rank  Event
123456  1     178
123456  2
123456  3
123456  4     155
123456  5
123456  6     192
123456  7
356589  1     165
356589  2
356589  3 …
2 votes, 3 answers

Greenplum Database: psql: could not connect to server: No such file or directory

I have been banging my head against the wall for 4 days, but psql will not connect. We have a small Greenplum Database array with a master node. When I try to use the psql utility, I get this error: [gpadmin@master gpseg-1]$…
NEO
  • 319
  • 7
  • 26
2 votes, 1 answer

How do I avoid null date-type columns when migrating from MSSQL to Pivotal HAWQ?

We are trying to pull data from an external source (MSSQL) into Postgres. When I checked, the invoicedate column entries are blank, while MSSQL shows invoicedate values for those same entries. That is, we tried the following query on both…
NEO
  • 319
  • 7
  • 26
2 votes, 0 answers

Setting spark memory allocations for extracting 125 Gb of data...ExecutorLostFailure

I'm trying to pull a 126 Gb table out of HAWQ (PostgreSQL, in this case 8.2) into Spark and it is not working. I can pull smaller tables no problem. For this one I keep getting the error: org.apache.spark.SparkException: Job aborted due to stage…
WaveRider
  • 367
  • 2
  • 10
2 votes, 3 answers

How to load data from Oracle and SQL Server to HAWQ using Spring XD

I have tables in Oracle and SQL Server. I need to load data from both into Pivotal HAWQ using Spring XD, but I couldn't find how to do this in the documentation.
zniv
  • 156
  • 1
  • 1
  • 11
1 vote, 0 answers

Greenplum's pg_dump cannot support lock statement

I'm using pg_dump to back up and recreate a database's structure. To that end I'm calling pg_dump -scf backup.sql for the backup. But it fails with the following error: pg_dump: [archiver (db)] query failed: ERROR: Cannot support lock statement…
Mike S
  • 1,031
  • 1
  • 9
  • 21
1 vote, 1 answer

HAWQ - how to choose a node to install segments

I have a six-node cluster and I want to install the HAWQ database and PXF on it. My cluster looks like this: Node1 - NameNode, ResourceManager, HiveMetastore, HiveClient Node2 - SNameNode, NodeManager Node3 - DataNode, NodeManager Node4 - DataNode,…
Mrgr8m4
  • 417
  • 8
  • 24
1 vote, 1 answer

GPDB: SSH permission denied (public key)

When trying to ssh (as the Greenplum system user) from the master GPDB host to one of the datanodes, I get an error. Environment: GPDB 4.3.10. [gpadmin@mdw ~]$ ssh datanode Permission denied (publickey,gssapi-keyex,gssapi-with-mic). [gpadmin@mdw ~]$ We tried on …